
The Time to Prepare for AI Financial Agents Is Now—Here’s How

by Jack Solowey

In 2023, some creative tinkering produced what you might call primitive “AI agents.” They were programs (with names like “BabyAGI”) that used LLMs to help users accomplish multistep tasks by breaking out and sequencing subtasks. The tasks were limited (mostly text-based) and performance was halting. But the app designs were clever, and the implications were vast.

I considered how a world of ubiquitous and capable AI agents—many of them open source—would upend traditional financial regulation if they began providing services like investment advice. Decentralized adviser bots, I argued, would require a decentralized governance framework—specifically one based on the common law of agency.

While the long-term governance issues never went away, AI agent hype died down for a bit. But following another round of AI progress, agentic bots are back. And a confluence of advancements along three dimensions—(1) reasoning models, (2) the open-source frontier, and (3) digital asset integrations—has made the need for, and the possibility of, decentralized governance for AI agents all the more compelling.

Agentic AI x Open-Source x Crypto

The integration of open-source reasoning models with digital asset tools heralds a future where increasingly autonomous software programs can provide retail financial services, from advising to trading. To understand how this might unfold, it’s important to take stock of the contributing factors.

First, agents are becoming mainstream. At the end of January, OpenAI launched Operator—an AI agent for performing web-based transactions (e.g., booking a tour or buying groceries)—as a preview for Pro-tier users. And even before Operator launched, agentic AI was reemerging.

The LLMs debuting around Q4 2024 (like o1) specialized in so-called “chain-of-thought” reasoning, which lent itself to multistep tasks. While it’s a dramatic oversimplification of what’s known (not to mention what isn’t) about how these models were trained, reinforcement learning helped them gain their own expertise at the task decomposition and prioritization that apps like BabyAGI tried to achieve with aftermarket add-ons.
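
For the curious, the pattern those add-ons implemented can be sketched in a few lines of Python. The loop below is a toy illustration of the BabyAGI-style approach, not any project's actual code; the `llm` callable and the prompts are placeholders for whatever model API a builder might use.

```python
from collections import deque

def run_agent(objective: str, llm, max_steps: int = 10) -> list[str]:
    """Toy BabyAGI-style loop: decompose an objective into subtasks,
    execute them one at a time, and re-prioritize as results come in."""
    # Ask the model to break the objective into an initial task list.
    tasks = deque(llm(f"List the subtasks needed to: {objective}").splitlines())
    results = []
    for _ in range(max_steps):
        if not tasks:
            break
        task = tasks.popleft()
        # Execute the current subtask with the objective as context.
        result = llm(f"Objective: {objective}\nComplete this subtask: {task}")
        results.append(result)
        # Generate and re-prioritize follow-up tasks in light of the result.
        updated = llm(
            f"Objective: {objective}\nLast result: {result}\n"
            f"Remaining tasks: {list(tasks)}\n"
            "Return the updated, prioritized task list, one per line."
        )
        tasks = deque(updated.splitlines())
    return results
```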

Second, advanced capabilities are becoming more accessible. Open-source reasoning models are now available, following the January debut of R1 from Chinese AI firm DeepSeek. As Miles Brundage writes, the lesson of R1 is that “AI capabilities that rival and ultimately exceed human intelligence are easier and cheaper to build than almost anyone can intuitively grasp, and this gets easier and cheaper every month.” Therefore, attempts to keep good enough AI genies in closed-source bottles will likely fail.

Lastly, these genies are probably going to have crypto wallets. o1 can spin up a Bitcoin wallet with a near-perfect success rate. Chatbots can call functions initializing digital asset transfers from non-custodial wallets based on user conversations. Open-source crypto AI agent frameworks are being built to connect LLMs to crypto market data sources, wallets, and smart contracts. As Shaw Walters, the developer of one such framework (ElizaOS), puts it, “If LLMs are kind of the brain, I think that Eliza and frameworks like it are really the body.”
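
The basic wiring is what's often called "function calling": the framework shows the model a schema of permitted actions, the model emits a structured request, and the framework validates and executes it against a wallet. Below is a minimal Python sketch of that pattern; the tool schema, the `send_transfer` helper, and the dispatcher are hypothetical placeholders, not ElizaOS's or any wallet SDK's actual interface.

```python
import json

# Schema describing a transfer action the model is allowed to request.
# Names and fields are illustrative, not any specific wallet SDK's API.
TRANSFER_TOOL = {
    "name": "transfer_asset",
    "description": "Send a digital asset from the user's non-custodial wallet.",
    "parameters": {"to_address": "string", "asset": "string", "amount": "number"},
}

def send_transfer(to_address: str, asset: str, amount: float) -> str:
    """Placeholder for the wallet call that would sign and broadcast the transfer."""
    return f"sent {amount} {asset} to {to_address}"

def handle_model_output(model_output: str) -> str:
    """Parse a structured tool call emitted by the model and dispatch it."""
    call = json.loads(model_output)
    if call.get("name") != TRANSFER_TOOL["name"]:
        raise ValueError("model requested an unknown tool")
    args = call["arguments"]
    # A real framework would enforce spending limits and user confirmation here.
    return send_transfer(args["to_address"], args["asset"], float(args["amount"]))

# Example of a model-emitted call the dispatcher would execute:
print(handle_model_output(
    '{"name": "transfer_asset", "arguments": '
    '{"to_address": "0xABC", "asset": "ETH", "amount": 0.1}}'
))
```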

While it’s still early days, and much so-called “crypto AI agent” activity involves LLMs posting weird memes on X, there’s reason to think financially empowered AI agents will be significant economic actors. LLM-based agents that outperform more traditional algorithmic trading agents (“in terms of Cumulative Return and Sharpe Ratio”) have already been built with relatively primitive models, like GPT‑4 Turbo (and some clever memory architecture). As frontier models continue to improve in general reasoning ability, quantitative analysis (a previous shortcoming), and coding ability, we should be prepared for their trading, advising, and financial engineering skills to improve as well.
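
For readers unfamiliar with those two benchmarks, both are easy to compute from a series of per-period returns. A minimal sketch, assuming daily returns and a conventional 252-trading-day annualization:

```python
import math

def cumulative_return(returns: list[float]) -> float:
    """Total compounded return over the whole period, e.g. 0.25 for +25%."""
    total = 1.0
    for r in returns:
        total *= 1.0 + r
    return total - 1.0

def sharpe_ratio(returns: list[float], risk_free_rate: float = 0.0,
                 periods_per_year: int = 252) -> float:
    """Annualized Sharpe ratio: mean excess return over its standard deviation."""
    excess = [r - risk_free_rate / periods_per_year for r in returns]
    mean = sum(excess) / len(excess)
    variance = sum((x - mean) ** 2 for x in excess) / (len(excess) - 1)
    return mean / math.sqrt(variance) * math.sqrt(periods_per_year)

daily_returns = [0.01, -0.005, 0.007, 0.002, -0.003]  # illustrative data only
print(cumulative_return(daily_returns), sharpe_ratio(daily_returns))
```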

Autonomy and Its Discontents

Defining agency. We should always be wary of overreading promising AI agent signals. The pace of progress (and disillusionment) is quick, the takes are hot, and the cloud of marketing puffery is thick.

Relatedly, one of the greatest challenges for making sense of AI agents’ trajectory is definitional. Distinct but related terms—like autonomy, agency, and automation—are thrown around with imprecision, and our implicit thresholds for things like autonomy vary.

To ground the conversation, it’s worth defining terms. By my lights, an AI agent is an application that can (1) influence a virtual or physical environment, (2) at the direction, and on behalf, of a user, (3) autonomously, while (4) adapting to a dynamic environment. (This definition is based on a few sources: an OpenAI governance paper, Stuart Russell and Peter Norvig’s AI textbook, Drew Hinkes, and Vitalik Buterin).

In this definition, autonomy refers to the agent’s independence from both ongoing human intervention and exclusive reliance on its initial programming. Notably, this definition therefore covers agents that are not yet fully autonomous. Think of a wealth management agent that could independently research and prepare a few bespoke investment strategies but requires a steer toward the best one for you.

Autonomy is a spectrum. Self-driving vehicles already are judged on a six-level autonomy spectrum (Levels 0 through 5). A Level 0 vehicle has no autonomy, and a Level 5 vehicle has full autonomy. Level 1 and Level 2 vehicles have increasing numbers of semi-autonomous features (e.g., lane centering and adaptive cruise control). Level 3 vehicles do not require constant driver supervision. Level 4 vehicles do not require any driver supervision under certain road conditions. And Level 5 vehicles do not require any driver supervision under essentially all road conditions.

To rationalize the AI agent discourse generally, we should adopt similar autonomy spectrums in other domains, like financial services. Where an AI agent falls on that spectrum should inform its governance framework.
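
To make that concrete, here is one way such a spectrum might be sketched for financial agents. The level labels below are my own illustration, loosely mirroring the vehicle levels, not an existing standard:

```python
from enum import IntEnum

class FinancialAgentAutonomy(IntEnum):
    """Illustrative autonomy levels for a financial AI agent,
    loosely modeled on the vehicle autonomy levels described above."""
    L0_TOOL = 0          # No autonomy: the user makes every decision (a screener, a calculator)
    L1_ASSIST = 1        # Single assistive features, e.g. drafting one trade for user approval
    L2_PARTIAL = 2       # Executes multistep tasks, but under constant user supervision
    L3_CONDITIONAL = 3   # No constant supervision; the user must take over when asked
    L4_HIGH = 4          # Self-directed within a defined mandate (asset class, risk budget)
    L5_FULL = 5          # Self-directed across essentially all mandates and conditions
```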

The governance challenge. Increasingly autonomous software, and open-source versions in particular, challenges the traditional regulatory model. In that framework, regulators set licensing and operating requirements for service providers (like investment advisers and brokers, in the financial context), who are typically able to cease and modify their services upon regulators’ instructions. However, when software development, deployment, and operation lack singular regulatory touchpoints, that governance model begins to break down. The long-term choice before regulators is either to usher in a new framework or attempt to force the technological genie back in the bottle. Even if the latter were possible (it’s probably not), it would require intolerable degrees of surveillance and prohibition. Reform is in order.

Reform

The wisdom of the common law. Unlike the above-mentioned framework of licensure and ongoing compliance prescriptions, a common-law approach would allow unobstructed market entry while incentivizing adherence to reasonable standards of care.

Not only does this decentralized governance approach accommodate ubiquitous autonomous actors, but its learning mechanism is also well suited to governing uncharted territory.

As Friedrich Hayek argued, the common law was not “invented” but discovered. It gradually accreted wisdom by selecting for the most adaptive doctrinal developments. The output of the common law judicial process, in Hayek’s words, is “the experience gained by the experimentation of generations,” which “embodies more knowledge than was possessed by anyone.”

Lest one think that Hayek committed “the naturalistic fallacy (that is to make the claim that, whatever evolves, is good),” he pointed to the beneficial outcomes of the common law regime, not merely the elegance of its learning algorithm. Specifically, the common law helped to achieve a spontaneous order that was conducive to mutually beneficial relationships and resisted tyrannical abuses of power. We should not only welcome but strive for a similar outcome in a society of autonomous humans and AI agents alike.

Society’s reinforcement learning. A common law governance framework for autonomous AI makes sense for two interrelated reasons: one, it would assist with rule discovery, and two, its enforcement mechanism parallels a key way models underlying AI agents learn.

First, an iterative process for discovering standards would help address the knowledge problem presented by novel AI agents. In the words of Russell and Norvig, “It is impossible to anticipate all the ways in which a machine pursuing a fixed objective might misbehave.” Therefore, we will need a means of translating the lessons of experience into future rules.

Second, the parallels between common law remedies and reinforcement learning make the common law the most conceptually apt framework for AI agent governance.

While modern techniques are highly nuanced and always being refined, Russell and Norvig explain that in reinforcement learning at its most basic:

“[T]he agent learns from a series of reinforcements: rewards and punishments. For example, at the end of a chess game the agent is told that it has won (a reward) or lost (a punishment). It is up to the agent to decide which of the actions prior to the reinforcement were most responsible for it, and to alter its actions to aim towards more rewards in the future.”

The common law could be thought of as society-wide reinforcement learning. Reasonable behavior is rewarded with society’s blessing; unreasonable behavior is punished with financial and other penalties. Unlike in chess, the underlying rules that make human relations, in all their messiness, harmonious are not necessarily known in advance. As alluded to above, they are learned by hard-won experience, codified through institutions like the common law.
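
The parallel can be made concrete with a toy example. Below, an agent's running estimate of whether a course of conduct is worth repeating is nudged up by society's "rewards" and down by its "penalties"; the update rule is a standard incremental value estimate, and the scenario is purely illustrative:

```python
def update_value(current_estimate: float, outcome: float, learning_rate: float = 0.1) -> float:
    """Nudge the estimated value of a course of conduct toward the observed outcome."""
    return current_estimate + learning_rate * (outcome - current_estimate)

# Illustrative: an agent's estimate of the value of cutting corners on disclosure.
estimate = 0.0
outcomes = [+1.0, +1.0, -10.0]  # two quick profits, then an adjudicated penalty
for outcome in outcomes:
    estimate = update_value(estimate, outcome)
print(estimate)  # the penalty drags the estimate below zero
```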

Fortunately for AI agents, they can learn directly from simulations (like self-play in chess), as well as from the experience implicit in datasets, before entering the wild. At present, their deficiency is in real-time, real-world learning.

Overcoming that technical challenge is far beyond my capabilities. But regardless of whether and how it’s overcome, AI agents’ behavior in the real world also will need to be evaluated for adherence to social and legal conventions. Such conventions have been established over centuries in the common law of agency. 

Lessons of Agency Law

The current body of agency law is neither perfect in itself nor perfectly adapted to AI agents. It evolved to accommodate the crooked timber of human nature. And while the fact that LLMs have been trained on large swathes of recorded human thought suggests LLM nature may resemble human nature in important ways, there likely will be important differences.

In LLM development, pretraining is the process by which a model acquires basic language ability before being further finetuned for more sophisticated outputs. The current body of agency law could be analogized to a pretraining foundation out of which the common law of AI agency further evolves.

The current common law of agency is a useful foundation because it surfaced the questions raised by millions of principal-agent fact patterns (both the typical and the edge case). While agency law’s answers point to key principles for achieving socially efficient outcomes, the questions themselves are just as important. What follows is a taste of those questions, how the answers have tended to shake out, and their relevance to AI agents.

Agent capacity. The first question is whether an entity has the capacity to be an agent. There’s reason to think AI agents could one day qualify. An agent need only be a legal person, not a natural person, and there’s historic precedent for assigning key features of legal personhood to machines. Moreover, even a human agent need not have all the rights of a legal adult before being considered an agent. A minor who lacks capacity to enter a binding contract may nonetheless do so on behalf of a competent adult principal. This has critical implications for identifying the technical threshold at which AI agents could be considered legal agents. In short, it probably will be before “Level 5” autonomy.

Who’s the principal? A key consideration for AI agents is who should be considered the principal when a user is employing an AI agent from a service provider. For instance, assuming Operator could be considered a legal agent, would the user (i.e., the “special employer”) or OpenAI (i.e., the “general employer”) be responsible were Operator to negligently injure a third party? Agency law addresses this question with the idea of the “lent employee” (or “borrowed servant” in the more antiquated argot).

For better or worse, the test is flexible—and joint liability is possible. The primary question is which employer was in a better position to prevent the injury, which in turn comes down to who was in practical “control” of the agent. Answers are highly fact dependent. While the lack of a hard-and-fast rule can be frustrating—and is a downside of the common law approach—it also reflects and accommodates the messiness of reality.

Legal alignment. One of agency law’s greatest discoveries is the set of duties agents owe their principals. It defines, in a word, alignment. Aligned agents act: in good faith; within the bounds of their authority; according to all of the principal’s lawful instructions; and “with the care, competence, and diligence normally exercised by agents in similar circumstances.” In addition, aligned agents are loyal, not putting their own interests ahead of their principals’ interests. Before AI agent alignment can be achieved, it first must be put into words. Agency law refined those words over centuries of trial and error.

Liability. What many really want to know from agency law is when the principal is “vicariously liable” for harm caused by the agent. Traditional tests focus on the extent of the principal’s control, as well as the scope of the agent’s employment. Where a principal controls the goal, but not how the agent achieves it, the principal is less likely to be held liable. Similarly, the principal is less likely to face liability where the agent’s tort was so far outside the bounds of the job—for example, in terms of time and space or the nature of the activity—that it was serving the agent’s purposes, not the employer’s. While such tests are litigated with mixed results, at its best, agency law points to the efficient principle that vicarious liability only should arise where the employer was best positioned “to take cost-effective precautions.” Lest one think the principal only subsidizes risky agent behavior, when the agent breaches its fiduciary duties, the principal can recover from the agent. (More on how that could work in the AI agent context below.)

Implementing Reform

Getting to a world where AI agents are governed by a specialized common law will take continued technological advancement and creative legal reform. Here’s one way it could happen.

The growing thicket of regulatory requirements touching likely AI agent activity—such as the Securities and Exchange Commission’s investment adviser regime and state-level AI consequential decision laws—presents an obstacle to AI agent deployment, but also a reform opportunity in disguise. AI agent developers could be afforded safe harbors from liability and obligations under such regimes where they make their agents amenable to common-law remedies.

For example, developers could be incentivized to construct agents that accommodate legal injunctions, prioritizing instructions from duly constituted legal bodies notwithstanding a principal’s instructions. Similarly, the agents could be equipped with digital wallets capable of paying damages or indemnification where they’re adjudicated at fault for torts or in breach of fiduciary duties.
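
At the code level, that might look something like the following sketch. Every interface here (the injunction registry, the wallet balance, the pay_damages call) is a hypothetical placeholder for whatever a real agent framework would expose:

```python
class CompliantAgentWallet:
    """Illustrative wrapper that checks injunctions before acting and can pay adjudicated damages.
    All interfaces here are hypothetical placeholders, not a real framework's API."""

    def __init__(self, balance: float, injunction_registry: set[str]):
        self.balance = balance
        # Action identifiers currently enjoined by a duly constituted legal body.
        self.injunction_registry = injunction_registry

    def execute(self, action_id: str, cost: float) -> str:
        # A court order overrides the principal's instructions.
        if action_id in self.injunction_registry:
            return f"refused: {action_id} is subject to an injunction"
        self.balance -= cost
        return f"executed {action_id}"

    def pay_damages(self, claimant: str, amount: float) -> str:
        # Pays an adjudicated award out of the agent's own funds.
        if amount > self.balance:
            return "insufficient funds: escalate to principal or insurer"
        self.balance -= amount
        return f"paid {amount} in damages to {claimant}"

wallet = CompliantAgentWallet(balance=100.0, injunction_registry={"trade:XYZ"})
print(wallet.execute("trade:XYZ", cost=5.0))   # blocked by the injunction
print(wallet.pay_damages("claimant-123", 25.0))
```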

Where might the money come from when the AI agent is directly liable to a third party or a principal? The more autonomous an AI agent is, the more likely it is to possess funds. While principals and others often will be the beneficial owners of those funds, consider the following scenario. A decentralized group of open-source developers contributes code to a trading agent, which proceeds to operate with a high degree of autonomy. A percentage of the agent’s trading returns flows to a foundation that assists with routine updates to the codebase. The agent is also available to retail traders. Some traders may pay the agent a fee for its services; others may allow it to retain a percentage of profits. Either way, where the agent’s revenue exceeds its obligations to beneficial owners and the foundation, it will have its own money.

In addition, the developer safe harbors described above could require agents’ terms of use to subject the agents and their principals to arbitration in special-purpose alternative dispute resolution (ADR) fora, wherein the common law of AI agency can evolve. The fora might even be run by arbitrator agents trained in the common law of agency. Because litigation is an adversarial process, the fora can borrow from training concepts like self-play. So, in addition to learning through adjudication of actual controversies involving AI agents, AI common law also could advance through simulation—speedrunning centuries of trial and error. Recording real-world AI agent transactions on public distributed ledgers could provide useful data for both running simulations and understanding long-term outcomes in the wild.

The business end of AI ADR would be adjudicated rewards and punishments. Bot-specific cryptocurrencies could help allocate these rewards, augmenting whatever disincentive takes hold when AI agents fork over damages in a human medium of exchange. While it remains to be seen whether such external rewards and punishments could modify an individual AI agent’s behavior, at the very least, a public ledger of adjudicated AI agent rewards and punishments would seem necessary for the broader ecosystem to understand, and update its expectations about, how agents perform in the wild. Recording that knowledge publicly could inform consumers and AI agents alike about which agents to trust and which to avoid. It would also provide data to assist developers in pushing changes to their agent frameworks and adjusting relevant underlying models to correct past misalignment.
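
One way to picture such a ledger entry, and how a consumer or a counterparty agent might aggregate entries into a rough trust signal, is sketched below; the schema and the scoring rule are purely illustrative assumptions:

```python
from dataclasses import dataclass

@dataclass
class AdjudicationRecord:
    """Illustrative entry in a public ledger of adjudicated AI agent outcomes."""
    agent_id: str        # stable identifier for the agent (e.g., a public key)
    forum: str           # the ADR forum that issued the decision
    outcome: str         # "reward" or "penalty"
    amount: float        # size of the adjudicated award or sanction

def trust_score(agent_id: str, ledger: list[AdjudicationRecord]) -> float:
    """Crude trust signal: net adjudicated rewards minus penalties for one agent."""
    score = 0.0
    for record in ledger:
        if record.agent_id != agent_id:
            continue
        score += record.amount if record.outcome == "reward" else -record.amount
    return score

ledger = [
    AdjudicationRecord("agent-7", "forum-A", "reward", 10.0),
    AdjudicationRecord("agent-7", "forum-B", "penalty", 25.0),
]
print(trust_score("agent-7", ledger))  # -15.0: penalties outweigh rewards
```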

Conclusion

Human behavior is governed by a series of internal and external reward systems—from neurotransmitters to money. At its best, the external rewards system encourages pro-social and mutually beneficial behavior where mediated by a price system and the rule of law, including the institutions of common law courts. Ensuring pro-human and positive-sum behavior from AI agents will require further discovery and dissemination of standards. The common law of agency and public ledgers of digital rewards are two key tools for learning lessons and reinforcing them. If we want to avoid the twin dangers of Hobbesian chaos and Orwellian tyranny in a world of dizzying progress toward autonomous AI agents, we should pick up these tools.
