Leveraging Account Abstraction for Intents

BalloonDogs with ERC-4337

A considerable amount of discussion has taken place regarding the connection between Account Abstraction (AA) and Intents. Is AA essentially synonymous with intents? Or is it the ideal foundation for intents? And what would an appropriate implementation of this look like?

This article will examine the rationale behind posing these inquiries and supply ample information to address them.

However, before delving into that, we believe it would be beneficial to establish an industry context for these questions.

Note: If you’re not familiar with the general concept of AA or intents, it’s better to check out the following short articles first:

AA: Understanding ERC-4337 and the Future of Account Abstraction in Ethereum

Intents: The next blockchain bull run: User intents paving the way for mass adoption

It’s All About Delegation

Throughout this article, we will briefly discuss AA architecture, its distinctive properties, pros and cons, and more. Before that, since intents are an excellent AA use case, it is worth recalling the general idea:

AA suggests using a smart contract account as the primary account instead of an Externally Owned Account (EOA), the key-controlled account type behind wallets such as MetaMask and most cold wallets. Without going into the technical definition and architecture just yet, AA affects a wide range of use cases. Some were previously possible in theory but impractical; others were considered challenging and now become considerably simpler.

AA can be described in many ways and is a topic that has garnered significant attention. The potential impact of AA is undeniably significant, so the expectations surrounding it are high. In this discussion, we will zoom in on one specific aspect of AA: Delegation.

In this case, the delegation will be in the context of execution.

What Does Execution Delegation Mean?

Delegation, in the context of execution, refers to the ability of other entities to use their own judgment to influence how a user's transactions are executed. There are several compelling reasons why a user may want to grant others the authority to act on their behalf and carry out transactions, including technical limitations, availability constraints, knowledge gaps, efficiency and convenience, and risk mitigation.

Delegation involves a trust assumption, which carries some risk. It is essential to pay close attention to the privileges granted. This is precisely why the privileges are the primary factor used to distinguish between various types of delegations:

Full Delegation (no restrictions)

This type of delegation grants 1:1 user access, similar to having the wallet's private key. Full delegation allows others to act on behalf of a user without any restrictions or limitations. However, it is essential to note that full delegation is rare because it is not usually necessary and goes against basic security best practices of minimizing permissions.

Limited Delegation

In limited delegation, at least one parameter is not fully delegated. This means that the user still retains specific actions or decisions, while trusted third parties can perform others. It is safer to have a narrow range of delegation, and multiple parameter limitations are preferred to ensure greater control and security.

Limited delegation can take various forms. It can expire, after which the delegated authority is revoked. It can be limited to specific operations, assets, or protocols, allowing the user to specify the exact scope of authority granted. Additionally, limited delegation can be conditional, meaning certain conditions must be met to execute the delegated actions.

The purpose of limited delegation is to focus on a specific goal or objective while minimizing the risk of unintended operations being carried out. By delegating only particular tasks or functions to trusted third parties, users can maintain a higher level of control and mitigate potential risks associated with full delegation.
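To make the forms of limited delegation more concrete, here is a minimal sketch in Python. It is not taken from any standard or wallet: the Delegation and Action types, their field names, and the addresses are all hypothetical, chosen only to show how expiry, scope, and an optional condition can be checked together.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet, Optional


@dataclass(frozen=True)
class Delegation:
    """Hypothetical description of what a delegate may do (not a real standard)."""
    delegate: str                      # address allowed to act for the user
    expires_at: int                    # unix timestamp after which authority lapses
    allowed_assets: FrozenSet[str]     # scope: which tokens may be touched
    allowed_protocols: FrozenSet[str]  # scope: which contracts may be called
    max_amount: int                    # scope: spending cap per action
    condition: Optional[Callable[[], bool]] = None  # optional extra condition


@dataclass(frozen=True)
class Action:
    sender: str
    asset: str
    protocol: str
    amount: int


def is_authorized(d: Delegation, a: Action, now: int) -> bool:
    """Return True only if every limit in the delegation is respected."""
    if a.sender != d.delegate:
        return False
    if now >= d.expires_at:  # an expired delegation is treated as revoked
        return False
    if a.asset not in d.allowed_assets or a.protocol not in d.allowed_protocols:
        return False
    if a.amount > d.max_amount:
        return False
    if d.condition is not None and not d.condition():
        return False
    return True


delegation = Delegation(
    delegate="0xSolver",
    expires_at=1_700_000_000,
    allowed_assets=frozenset({"USDC"}),
    allowed_protocols=frozenset({"0xDex"}),
    max_amount=1_000,
)
swap = Action(sender="0xSolver", asset="USDC", protocol="0xDex", amount=500)
print(is_authorized(delegation, swap, now=1_699_999_999))  # within all limits
print(is_authorized(delegation, swap, now=1_700_000_001))  # after expiry
```

Each additional limit narrows what a misbehaving delegate could do, which is why multiple parameter limitations are preferred.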

Pre-AA Delegation: Exploring Alternatives

To gain a deeper understanding of the impact of AA, it is essential first to consider the alternative delegation mechanisms that were available before AA and still exist. It is worth noting that AA is not the ultimate delegation solution; in some cases, other options may offer a better solution.

As mentioned earlier, and as discussed further below, any form of delegation carries inherent risks. The impact of the abuse of delegated privileges is directly proportional to the level of privileges granted to a third party. The challenge lies in granting as few privileges as possible while still allowing a third party to act on behalf of the user efficiently.

Pre-sign

Pre-signing means signing a transaction without submitting it directly to the mempool. Instead of being executed immediately, the signed transaction is held by a third party for later conditional execution. This allows for greater flexibility and control over when the transaction is executed. It’s similar to signing a cheque (with the amount and “pay to” fields filled) and giving it to someone else to decide when to deposit it.

In terms of risk, the pre-sign feature poses minimal threat. It enables the execution of precisely what the user defined, and any manipulation would involve either not executing it at all or executing it contrary to the user's defined time or condition. This level of control ensures that users can have confidence in the execution of their transactions.

Figure 1: Pre-sign a transaction without submitting it to the mempool for the best outcome later

Notable examples of pre-sign functionality are:

Gelato, with the pre-sign feature, enables users to automate their transaction execution based on predefined conditions, streamlining their interaction with the blockchain.

GasHawk is a tool for executing transactions with a focus on gas costs. It allows users to pre-sign a transaction and determine the best time to broadcast it.
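The pre-sign flow can be sketched as follows. This is a simplified model written for this article, not the API of Gelato or GasHawk: SignedTx, PendingOrder, and relay are hypothetical names, and the signed transaction is treated as an opaque payload that the third party can only broadcast or withhold.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass(frozen=True)
class SignedTx:
    raw: str  # opaque, already-signed payload; the relayer cannot alter it


@dataclass(frozen=True)
class PendingOrder:
    tx: SignedTx
    condition: Callable[[int], bool]  # user-defined trigger, here on gas price
    deadline: int                     # last block at which broadcasting is acceptable


def relay(order: PendingOrder,
          gas_price_by_block: List[int],
          broadcast: Callable[[SignedTx], None]) -> Optional[int]:
    """Broadcast the pre-signed tx at the first block that meets the condition."""
    for block, gas_price in enumerate(gas_price_by_block):
        if block > order.deadline:
            break  # too late: the only remaining option is not executing at all
        if order.condition(gas_price):
            broadcast(order.tx)
            return block
    return None


sent: List[SignedTx] = []
order = PendingOrder(
    tx=SignedTx(raw="0xf86c..."),           # placeholder bytes, not a real transaction
    condition=lambda gas_price: gas_price <= 25,
    deadline=10,
)
block = relay(order, [50, 40, 20, 10], sent.append)
print(block, len(sent))
```

The relayer's only degrees of freedom are when to broadcast and whether to broadcast at all, which matches the limited risk described above.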

ERC-20 "APPROVE"

The ERC-20 standard allows users to delegate token spending authority to a trusted third party through the approve() function. It enables users to set an address authorized to spend a specified token amount on their behalf. Although this type of delegation is considered very simple, there are some concerns regarding the user experience. First, each delegation requires a separate transaction, as does every change to the approved amount or spender. Moreover, delegation is required per token.

For example, initially approving three different tokens, later changing the amount of each one, and eventually revoking each approval will result in 9 transactions (3 tokens, 3 approve() calls each). Additionally, since it is based on the ERC-20 standard, native tokens, such as ETH, are not supported and require an additional step to wrap the ETH into WETH or a similar format.

The risk associated with ERC-20 approve() falls within the medium to high range. While the impact is limited to the approved amount set by the user, there is no inherent limitation on what a third party can do with that approved amount, and the approval never expires. Therefore, users should exercise caution when granting approval and carefully consider their trust level in the authorized address. The existence of tools like WalletGuard and Revoke, built to deal with this issue, illustrates the point.
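The allowance bookkeeping and the transaction count from the example above can be reproduced with a toy in-memory model. This is Python rather than Solidity, and the Token class is a simplified stand-in for an ERC-20 contract; only approve() and transferFrom() semantics are modelled, and the account names are made up.

```python
from collections import defaultdict


class Token:
    """Toy in-memory model of ERC-20 allowance bookkeeping (not Solidity)."""

    def __init__(self, symbol: str) -> None:
        self.symbol = symbol
        self.balances = defaultdict(int)    # account -> balance
        self.allowances = defaultdict(int)  # (owner, spender) -> remaining amount

    def approve(self, owner: str, spender: str, amount: int) -> None:
        # Overwrites the previous allowance; note there is no expiry field.
        self.allowances[(owner, spender)] = amount

    def transfer_from(self, spender: str, owner: str, to: str, amount: int) -> None:
        if self.allowances[(owner, spender)] < amount:
            raise PermissionError("allowance exceeded")
        if self.balances[owner] < amount:
            raise ValueError("insufficient balance")
        self.allowances[(owner, spender)] -= amount
        self.balances[owner] -= amount
        self.balances[to] += amount  # the spender chooses the recipient freely


tokens = [Token(symbol) for symbol in ("DAI", "USDC", "WETH")]
user_txs = 0
for amount in (100, 50, 0):  # initial approval, changed amount, revocation
    for token in tokens:
        token.approve("user", "spender", amount)
        user_txs += 1  # every approve() call is its own on-chain transaction

print(user_txs)  # 3 tokens, 3 approve() calls each
```

Within the approved amount, transfer_from() places no restriction on the recipient or the purpose, which is the source of the risk described above.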

Figure 2: ERC-20 Approve - Delegate token spending rights to a trusted third party

A practical example of the ERC-20 approve() function in action is CoW Swap. CoW Swap allows users to delegate token spending by setting an amount and an address authorized to spend their tokens. However, users should be mindful of the potential risks of granting such approval and ensure they are comfortable with the level of control given to the authorized address.

Custody

Custody means transferring funds (depositing) to a smart contract owned by a trusted third party. It is essential to mention that by doing so, users delegate full control over the transferred assets to the third party, and getting the control back (withdrawal) depends on the third party's will. This delegation of control can be helpful in particular scenarios where users want to leverage the expertise or capabilities of the third party.

However, it is essential to consider the risk involved in custody. With this approach, token control is no longer in the hands of the user. The third party becomes the custodian of the tokens, which may raise concerns for users who prioritize maintaining full control over their assets.

Figure 3: Custody to the trusted third party with full control over user funds

An example of custody delegation can be found in V1 DEXs. In these exchanges, users deposit their tokens into a smart contract controlled by the exchange. By doing so, they delegate control over their tokens to the exchange, allowing seamless trading and liquidity provision. It is crucial for users to carefully assess the risks and benefits associated with custody before engaging in such arrangements.

In summary, each of these delegation mechanisms offers distinct advantages and risks. Pre-sign functionality provides flexibility in transaction execution, ERC-20 approve() enables trusted third-party token spending, and custody offers convenience at the cost of control.

What Do Intents Really Need?

Intents, in terms of delegation, can be defined as an action specification that seeks the optimal execution, with "optimal" currently referring mostly to cost optimization (future intents may embrace broader objectives, e.g., maximizing utility and enhancing security). The nature of intents makes it challenging to establish a restriction that won't limit the solver. This is primarily because restrictions typically take the form of general, predefined rules that aim to cover a wide range of intents simultaneously.

However, it is crucial to recognize the inherent limitations of such broad restrictions. The broader the space of potential solutions, the more privileges are required, which can introduce security concerns. To address this issue, it is imperative to move towards a more dynamic approach that allows for on-demand restrictions. Simply put, each intent can be restricted separately.

Dynamic Delegation

Imagine that before each intent is posted, the restrictions were reset to allow for solving the upcoming intent and only that. This would significantly reduce the privileges delegated. Still, it would also result in a poor experience (an additional transaction to be signed and confirmed) and increased costs (an extra transaction per intent).

Is there a way to maintain optimal security on-demand without impacting the experience and costs?

Yes! Dynamic delegation is an approach that integrates the restriction into the call to action. By doing so, several important aspects can be optimized. First and foremost, dynamic delegation enhances security by ensuring the restriction is tailored to each transaction, thereby minimizing the privileges granted to third parties. This targeted restriction mitigates potential risks associated with granting excessive access.

Moreover, dynamic delegation dramatically improves the user experience. The process becomes more streamlined and convenient because users no longer need to delegate privileges repeatedly, whether for recurring tasks or one-time operations. Compare this, for example, to CoW Swap, where users must set the amount that may be spent on their behalf. This means users must periodically update or reset the spending limit once a swap is completed. This additional step can be eliminated by incorporating restrictions directly into transactions, resulting in a more efficient user experience.

Regarding costs, the impact of dynamic delegation on transaction fees depends on the specific implementation. While it is theoretically true that more transactions may incur higher fees, the actual cost implications vary based on the context and need to be validated case-by-case. Therefore, it is crucial to consider the specific use case and carefully assess the potential cost implications before implementing dynamic delegation.
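One way to picture "integrating the restriction into the call to action" is a single signed message that carries both the request and its limits. The sketch below is only an illustration under loud assumptions: the Intent fields are hypothetical, and HMAC with a shared demo key stands in for the wallet's real signature scheme, which it is not.

```python
import hashlib
import hmac
import json
from dataclasses import asdict, dataclass, replace

USER_KEY = b"demo-key"  # hypothetical secret; HMAC is a stand-in for a wallet signature


@dataclass(frozen=True)
class Intent:
    sell_token: str
    sell_amount: int     # restriction: never spend more than this
    buy_token: str
    min_buy_amount: int  # restriction: never accept less than this
    valid_until: int     # restriction: expiry


def sign(intent: Intent, key: bytes) -> str:
    """One signature covers the request and its restrictions together."""
    payload = json.dumps(asdict(intent), sort_keys=True).encode()
    return hmac.new(key, payload, hashlib.sha256).hexdigest()


def is_authentic(intent: Intent, signature: str, key: bytes) -> bool:
    return hmac.compare_digest(sign(intent, key), signature)


intent = Intent("ETH", 10**18, "USDT", 1_900_000_000, 1_700_000_000)
signature = sign(intent, USER_KEY)
loosened = replace(intent, min_buy_amount=1)  # a third party tries to relax a limit

print(is_authentic(intent, signature, USER_KEY))    # the signed intent verifies
print(is_authentic(loosened, signature, USER_KEY))  # the loosened copy does not
```

Because the limits are part of what is signed, no separate delegation transaction is needed, and the limits apply to this intent only.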

Now that we clearly understand the general approach, the next challenge is bringing this concept to life. This is where the ERC-4337 standard comes into play.

Figure 4: Predefined Delegation vs Dynamic Delegation

The Power of a Standard (ERC-4337)

Regarding EVM and EVM-compatible chains, it isn't easy to separate AA from ERC-4337. This is mainly because ERC-4337 offers an effective solution to various development challenges.

One of the most significant advantages of implementing a standard like ERC-4337 is the ability to develop once and support multiple AA wallets. This means developers only need to create their solution once, which can be easily integrated with various AA wallets.

Furthermore, the AA interface provided by ERC-4337 opens up a world of possibilities for new developments and features. By adhering to a standard, different development teams can focus on addressing specific challenges independently, creating modular building blocks that users can combine according to their requirements. This promotes efficiency and productivity for development teams and provides users with a high degree of flexibility and customization.

ERC-4337 based Delegation

Validation Abstraction

A "standard" transaction verification lacks flexibility. It only includes transaction data and a signature for verification. Moreover, Ethereum natively supports only a single signature algorithm for transactions (ECDSA).

The key feature of ERC-4337 is the ability to express custom verification logic in code (Solidity, with some opcodes excluded during validation). This enables advanced features such as alternative verification schemes (OAuth-based login, Merkle proofs, zk-proofs) and more complex logic.
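As one illustration of custom verification logic, consider an account that stores only a Merkle root of permitted session keys and accepts a key when it comes with a valid proof. The sketch below is our own example, not part of ERC-4337: the function names are hypothetical, and SHA-256 is used as a stand-in for the keccak256 hash a contract would normally use.

```python
import hashlib
from typing import List


def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()  # stand-in for keccak256


def hash_pair(a: bytes, b: bytes) -> bytes:
    # Sorting the pair means proofs do not need left/right position flags.
    return h(min(a, b) + max(a, b))


def _next_level(level: List[bytes]) -> List[bytes]:
    return [hash_pair(level[i], level[i + 1]) for i in range(0, len(level), 2)]


def merkle_root(leaves: List[bytes]) -> bytes:
    level = [h(leaf) for leaf in leaves]
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # duplicate the last node on odd levels
        level = _next_level(level)
    return level[0]


def merkle_proof(leaves: List[bytes], index: int) -> List[bytes]:
    level = [h(leaf) for leaf in leaves]
    proof = []
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])
        proof.append(level[index ^ 1])  # sibling of the current node
        level = _next_level(level)
        index //= 2
    return proof


def verify(leaf: bytes, proof: List[bytes], root: bytes) -> bool:
    """What the account's validation logic would check for a presented key."""
    node = h(leaf)
    for sibling in proof:
        node = hash_pair(node, sibling)
    return node == root


session_keys = [b"key-a", b"key-b", b"key-c"]  # hypothetical permitted keys
root = merkle_root(session_keys)               # the only value the account stores
proof = merkle_proof(session_keys, 2)
print(verify(b"key-c", proof, root))
print(verify(b"key-x", proof, root))
```

The account keeps a single hash regardless of how many keys are permitted, which is the kind of logic a fixed signature check cannot express.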

Predefined Delegation

A proper implementation of the validateUserOp() function by the wallet, which contains the actual validation logic, can replace any of the previously mentioned delegation alternatives. Such an implementation can, for example, accept signatures from other addresses, with or without fund limitations. It is essential to note that "predefined" does not imply immutability; it simply means that the delegation or restriction setting is a separate process (transaction) that must occur before anyone can act on behalf of the user.

How does it work? Any custom AA validation implementation comprises two corresponding parts: the userOp and the validation function (validateUserOp()). The function expects a specific structure and specific values from the userOp in order to validate it. These can include fund limits, permitted operations, and any of the other restrictions we have already encountered. Is an ERC-4337 implementation always preferred? Not in every case; as noted earlier, other delegation mechanisms may sometimes offer a better solution.
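The two corresponding parts can be sketched as follows. This is a deliberately simplified Python model, not the ERC-4337 interface: the real validateUserOp() is a Solidity function that receives the userOp, its hash, and the missing account funds, and returns packed validation data rather than a boolean. The UserOp fields below are hypothetical, and the signer field stands in for signature recovery.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass(frozen=True)
class UserOp:
    """Hypothetical, reduced userOp; the real structure has many more fields."""
    sender: str  # the smart account the operation is for
    signer: str  # who signed it (stands in for recovering the signer)
    token: str
    amount: int


@dataclass
class Account:
    owner: str
    # delegate -> token -> spending limit, set by an earlier, separate transaction
    limits: Dict[str, Dict[str, int]] = field(default_factory=dict)

    def set_delegate(self, caller: str, delegate: str, token: str, limit: int) -> None:
        if caller != self.owner:
            raise PermissionError("only the owner can delegate")
        self.limits.setdefault(delegate, {})[token] = limit

    def validate_user_op(self, op: UserOp) -> bool:
        """Accept the owner, or a delegate acting within its predefined limit."""
        if op.signer == self.owner:
            return True
        if op.signer not in self.limits:
            return False
        return op.amount <= self.limits[op.signer].get(op.token, 0)


account = Account(owner="0xUser")
account.set_delegate("0xUser", "0xSolver", "USDC", 1_000)  # the predefined step

print(account.validate_user_op(UserOp("0xAcct", "0xSolver", "USDC", 400)))
print(account.validate_user_op(UserOp("0xAcct", "0xSolver", "USDC", 4_000)))
```

The set_delegate() call is the separate, earlier transaction that makes this delegation "predefined".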

On Designing an Optimal Dynamic Delegation

Let's delve deeper into designing an effective and efficient dynamic delegation strategy. With ERC-4337, we can utilize an intent both as a call to action and a set of constraints. This unique feature enables different entities to utilize the same data in diverse ways.

For the solver, the intent functions as a call to action. This aspect remains unchanged, as it represents the user's explicit request to the solver. As defined previously, the solver is responsible for delivering the optimal solution and submitting it.

On the other hand, in the context of smart contract wallet verification, the intent acts as a set of constraints. It establishes the desired outcome, and any solution that fails to meet the intent's criteria will fail the validation process.

Understanding the significance of having a unified intent and delegation is crucial. This approach allows for optimizing the validation process by retaining the restrictions within the userOp for subsequent use and bypassing the delegation phase completely.
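The wallet-side reading of an intent, as a set of constraints on the outcome, might look like the sketch below. The Intent fields and the validate_outcome function are hypothetical and the amounts are made up; a real wallet would perform an equivalent check inside its on-chain validation logic.

```python
from dataclasses import dataclass
from typing import Dict

Balances = Dict[str, int]  # token symbol -> amount held by the user's account


@dataclass(frozen=True)
class Intent:
    sell_token: str
    max_sell: int  # the solver may spend at most this much
    buy_token: str
    min_buy: int   # the user must receive at least this much


def validate_outcome(intent: Intent, before: Balances, after: Balances) -> bool:
    """Accept any execution whose result satisfies the intent, reject the rest."""
    spent = before.get(intent.sell_token, 0) - after.get(intent.sell_token, 0)
    received = after.get(intent.buy_token, 0) - before.get(intent.buy_token, 0)
    if spent > intent.max_sell or received < intent.min_buy:
        return False
    # No asset outside the intent may decrease.
    others = (set(before) | set(after)) - {intent.sell_token, intent.buy_token}
    return all(after.get(t, 0) >= before.get(t, 0) for t in others)


intent = Intent(sell_token="ETH", max_sell=10, buy_token="USDT", min_buy=19_000)
before = {"ETH": 10, "USDT": 0, "DAI": 5}

print(validate_outcome(intent, before, {"ETH": 0, "USDT": 19_500, "DAI": 5}))
print(validate_outcome(intent, before, {"ETH": 0, "USDT": 18_000, "DAI": 5}))
print(validate_outcome(intent, before, {"ETH": 0, "USDT": 19_500, "DAI": 0}))
```

The solver stays free to choose any route, while the wallet only checks the result, so the same intent data serves both roles.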

What Is Still Missing?

The ERC-4337 EntryPoint contract is designed to process each userOp separately (and sequentially), even though it receives a bundle of multiple userOps (here, intents) as input. This is generally a good idea, as it isolates userOps from one another and protects against many security issues.

However, there might be an issue when it comes to solving intents: With isolated intent solving, the optimization options are very limited. Coincidence of Wants (CoW) and other optimization techniques are not available for Solvers because of this limitation.

The coincidence of wants (often known as double coincidence of wants) is an economic phenomenon where two parties each hold an item that the other wants, so they exchange these items directly. Within economics, this has often been presented as the foundation of a bartering economy.

Let's review a simple example of a CoW solution to understand it better. There are two users with opposing intents: user A wants to exchange his ETH for USDT, while user B wants to trade his USDT for the same amount of ETH. The solution would be for user A to transfer his ETH to user B and for user B to transfer his USDT to user A. Now, let's validate each intent separately. User A's intent is to receive USDT for his ETH, but within his own userOp, as executed by the solver, he only gives ETH and receives nothing in return; the USDT arrives only when user B's userOp is processed. This results in a validation failure. The same goes for user B, whose own userOp only sends USDT.

The result is that, since batch solving is not something that can easily be given up, there is no intent-solving solution based on ERC-4337 alone.

How to Solve It?

In other cases, like CoW Swap, there is no such challenge: delegation is defined beforehand and is unrelated to the intent, so on-chain intent validation is unnecessary. On-chain validation is mandatory, however, in decentralized/trustless systems, where the intent itself serves as the delegation. Based on this need, two elements are required for batch-solve validation:

  1. Intent validation should verify the desired outcome and ensure the results align with the user intent.
  2. Single intent verification should consider the entire batch. In other words, the desired outcome should be verified based on the execution of the entire batch rather than just this specific intent.
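The difference between isolated and batch-level validation can be shown with the two-user CoW example. The sketch below is a toy ledger model, not EntryPoint code: all type and function names are hypothetical, and the rate of 2,000 USDT per ETH is an arbitrary assumption used only to have concrete numbers.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

Ledger = Dict[Tuple[str, str], int]  # (account, token) -> balance


@dataclass(frozen=True)
class Intent:
    user: str
    give_token: str
    give_amount: int
    want_token: str
    want_amount: int


@dataclass(frozen=True)
class Transfer:
    sender: str
    receiver: str
    token: str
    amount: int


def apply(ledger: Ledger, t: Transfer) -> Ledger:
    new = dict(ledger)
    new[(t.sender, t.token)] = new.get((t.sender, t.token), 0) - t.amount
    new[(t.receiver, t.token)] = new.get((t.receiver, t.token), 0) + t.amount
    return new


def satisfied(i: Intent, before: Ledger, after: Ledger) -> bool:
    gave = before.get((i.user, i.give_token), 0) - after.get((i.user, i.give_token), 0)
    got = after.get((i.user, i.want_token), 0) - before.get((i.user, i.want_token), 0)
    return gave <= i.give_amount and got >= i.want_amount


Step = Tuple[Intent, Transfer]  # an intent and the transfer executed in its userOp


def validate_isolated(steps: List[Step], start: Ledger) -> List[bool]:
    """Check each intent against the effect of its own userOp only."""
    results, ledger = [], start
    for intent, transfer in steps:
        after = apply(ledger, transfer)
        results.append(satisfied(intent, ledger, after))
        ledger = after
    return results


def validate_batch(steps: List[Step], start: Ledger) -> List[bool]:
    """Check each intent against the state change of the entire batch."""
    end = start
    for _, transfer in steps:
        end = apply(end, transfer)
    return [satisfied(intent, start, end) for intent, _ in steps]


start = {("A", "ETH"): 1, ("B", "USDT"): 2_000}
steps = [
    (Intent("A", "ETH", 1, "USDT", 2_000), Transfer("A", "B", "ETH", 1)),
    (Intent("B", "USDT", 2_000, "ETH", 1), Transfer("B", "A", "USDT", 2_000)),
]
print(validate_isolated(steps, start))  # both intents fail in isolation
print(validate_batch(steps, start))     # both intents pass at batch level
```

The same pair of transfers is rejected when each intent is judged on its own userOp and accepted when the outcome is measured over the whole batch, which is exactly what the second requirement asks for.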

A Bundle as a Batch

With the ERC-4337 design, using a bundle as a batch of intents is natural. Acting as a container for multiple userOps representing intents and their solutions, a bundle offers efficiency, simplicity, and a reliable approach to validating batch state changes. Grouping related intents in a bundle makes the validation process more streamlined and cohesive. Each userOp within the bundle can affect the entire bundle, making the bundle level the most efficient place to implement batch validation. This approach ensures the integrity of the batch-solve process and enhances scalability and performance.

ERC-7521

One existing solution to address this challenge is ERC-7521, proposed by Essential as "General Intents for Smart Contract Wallets". This standard primarily focuses on handling batch intent validation using a bundle. By incorporating ERC-7521, developers can leverage its functionality to validate batch state changes efficiently and ensure the integrity of the intent-based system. However, it is essential to note that adopting ERC-7521 breaks away from the ERC-4337 standard and introduces a separate standard specifically for dealing with intent use cases. Ultimately, the best approach is to adhere to industry standards if it is possible to do so without compromising on functionality or security.

Seeking An Optimal Solver Fee Strategy

A paymaster, in short, is an ERC-4337 mechanism that allows a third party to sponsor the fees of a smart contract account's transactions. However, using paymaster functionality to charge fees may not be necessary for intent solving. Since the solver already defines the set of transactions to execute, it is easy to include the fee payment within that list; this is how it is done where no paymaster option exists. ERC-7521, as an AA-based solution, was designed to exclude the paymaster option completely. A paymaster also appears to be less gas-efficient than the alternatives. On the other hand, separating the fee from the intent-solving process has a few advantages, such as providing a structured approach to handling fees. Ultimately, the most essential factor to consider is gas efficiency, and the most efficient method should be chosen.
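Including the fee in the solver's own transaction list can be pictured with a very small sketch. The Call type, the build_solution helper, and the addresses are hypothetical, and the sketch makes no claim about how any particular solver or standard encodes its calls.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Call:
    target: str       # contract or account being called
    description: str  # human-readable summary, in place of real calldata
    value: int = 0    # amount transferred with the call


def build_solution(swap_calls: List[Call], solver: str, fee: int) -> List[Call]:
    """Return the solver's calls with the fee payment appended as the last call."""
    return [*swap_calls, Call(target=solver, description="solver fee", value=fee)]


swap_calls = [
    Call(target="0xDex", description="swap ETH for USDT"),
    Call(target="0xUser", description="deliver USDT to user"),
]
solution = build_solution(swap_calls, solver="0xSolver", fee=3)
for call in solution:
    print(call.target, call.description, call.value)
```

Since the fee is just another call in the list, it is covered by the same validation as the rest of the solution and needs no separate sponsorship mechanism.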

Conclusion

The ERC-4337 standard, as it strives to do in general, offers a significantly improved user experience and streamlines development for intent-centric solutions. In this particular case, it also enhances security.

Similar to other use cases of ERC-4337, there is a concern regarding higher gas costs. Currently, the exclusive functionality and security outweigh that concern, justifying user adoption, and the same should apply to intent-centric solutions.

In addition, it is possible that with careful optimization, the fee costs will become marginal. A well-thought-out design and implementation can have a significant impact on cost efficiency. Development teams must prioritize this aspect to achieve success and widespread product adoption.

The goal of this post is to encourage any related conversation or feedback. If you have any, please share it with us via Twitter or LinkedIn.