Original title: “Possible futures of the Ethereum protocol, part 5: The Purge”
Original author: Vitalik Buterin
Original translation: Odaily Planet Daily Husband How
Since October 14, Ethereum co-founder Vitalik Buterin has been publishing a series of articles on the possible futures of the Ethereum protocol, from “The Merge”, “The Surge”, “The Scourge”, and “The Verge” to the newly released “The Purge”, laying out his vision for the future development of Ethereum mainnet and how to solve its current problems.
“The Merge”: Explores the need, now that Ethereum has transitioned to PoS, to improve single-slot finality and lower the staking threshold in order to increase participation and speed up transaction confirmation.
“The Surge”: Explores the different strategies for scaling Ethereum, in particular the rollup-centric roadmap, along with the challenges of scalability, decentralization, and security and their solutions.
“The Scourge”: Explores the centralization and value-extraction risks Ethereum faces under PoS, the various mechanisms being developed to improve decentralization and fairness, and reforms of the staking economy to protect users' interests.
“The Verge”: Explores the challenges of stateless verification for Ethereum and their solutions, focusing on how technologies like Verkle trees and STARKs improve the blockchain's decentralization and efficiency.
On October 26, Vitalik Buterin published “The Purge”, in which he discusses the challenge Ethereum faces in reducing protocol complexity and storage requirements over the long term while preserving the chain's permanence and decentralization. Key measures include reducing the storage burden on clients through “history expiry” and “state expiry”, and simplifying the protocol through “feature cleanup”, to secure the network's sustainability and scalability.
The following is the original text, translated by Odaily Planet Daily.
Special thanks to Justin Drake, Tim Beiko, Matt Garnett, Piper Merriam, Marius van der Wijden, and Tomasz Stanczak for their feedback and review.
One of the challenges facing Ethereum is that, by default, any blockchain protocol's bloat and complexity grow over time. This happens in two places:
Historical data: any transaction that ever happened and any account that was ever created at any point in history must be stored forever by all clients and downloaded by any new client fully syncing to the network. This makes client load and sync time grow over time, even if the chain's capacity stays the same.
Protocol features: it is much easier to add a new feature than to remove an old one, so code complexity grows over time.
To sustain Ethereum for the long term, we need strong counter-pressure against both of these trends, reducing complexity and bloat over time. At the same time, we need to preserve one of the key properties that makes blockchains great: permanence. You can put an NFT, a love letter in transaction calldata, or a smart contract holding a million dollars on-chain, go into a cave for ten years, and come out to find it still there, waiting for you to read and interact with. For dapps to feel comfortable fully decentralizing and removing their upgrade keys, they need to be confident that their dependencies will not upgrade in ways that break them, especially L1 itself.
It is absolutely possible to strike a balance between these two needs, and to minimize or reverse bloat, complexity, and decay while maintaining continuity, if we put our minds to it. Living organisms can do this: while most age over time, a lucky few do not. Even social systems can have extremely long lifespans. Ethereum has already succeeded in some cases: proof of work is gone, the SELFDESTRUCT opcode is mostly gone, and beacon chain nodes already store old data only for up to about six months. Figuring out this path for Ethereum in a more generalized way, and moving toward a long-term stable end state, is the ultimate challenge for Ethereum's long-term scalability, technical sustainability, and even security.
The Purge: main objectives
Reduce client storage requirements by reducing or eliminating the need for every node to permanently store all history, and perhaps eventually even state.
Reduce protocol complexity by eliminating unneeded features.
Table of Contents:
History expiry
State expiry
Feature cleanup
History expiry
What problem does it solve?
As of the time of writing, a fully synced Ethereum node requires about 1.1 TB of disk space for the execution client, plus several hundred more gigabytes for the consensus client. The vast majority of this is history: data about historical blocks, transactions, and receipts, most of which are many years old. This means that even if the gas limit never increased at all, a node's size would keep growing by several hundred gigabytes each year.
What is it, and how does it work?
A key simplifying property of the history storage problem is that, because each block points to the previous block via a hash link (and other structures), consensus on the present is enough for consensus on history. As long as the network has consensus on the latest block, any historical block, transaction, or state (account balance, nonce, code, storage) can be provided by any single actor along with a Merkle proof, and the proof lets anyone else verify its correctness. While consensus is an N/2-of-N trust model, history is a 1-of-N trust model.
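To make the hash-linking argument concrete, here is a minimal Python sketch (using SHA-256 and a simplified header in place of Ethereum's RLP-encoded, keccak-256-hashed headers; all names here are illustrative) of how one trusted recent block hash authenticates an entire chain of history served by a single untrusted peer:

```python
import hashlib
from dataclasses import dataclass

@dataclass
class Header:
    parent_hash: bytes
    number: int
    body_root: bytes  # stands in for the commitments to transactions/receipts

def header_hash(h: Header) -> bytes:
    # Simplified stand-in for keccak-256 over the RLP-encoded header.
    return hashlib.sha256(
        h.parent_hash + h.number.to_bytes(8, "big") + h.body_root).digest()

def verify_history(trusted_head: bytes, headers: list) -> bool:
    """Check that `headers` (newest first) hash-link back from a trusted head.

    Any single peer can serve the data; tampering breaks a hash link, which
    is why history is a 1-of-N trust model."""
    expected = trusted_head
    for h in headers:
        if header_hash(h) != expected:
            return False
        expected = h.parent_hash
    return True

genesis = Header(b"\x00" * 32, 0, b"\x11" * 32)
block1 = Header(header_hash(genesis), 1, b"\x22" * 32)
assert verify_history(header_hash(block1), [block1, genesis])
```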
This gives us a lot of options for how to store history. A natural choice is a network where every node stores only a small fraction of the data. This is how BitTorrent has worked for decades: while the network as a whole stores and distributes millions of files, each participant stores and distributes only a few of them. Perhaps counterintuitively, this approach need not even reduce the data's robustness. If, by making node operation cheaper, we could get to a network of 100,000 nodes where each node stores a random 10% of the history, then every piece of data would be replicated 10,000 times, exactly the same replication factor as a 10,000-node network where every node stores everything.
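A quick back-of-the-envelope simulation of that replication claim (the chunk count and seed here are arbitrary; this is only a sanity check, not a protocol design):

```python
import random

NODES, FRACTION, CHUNKS = 100_000, 0.10, 1_000  # illustrative parameters
random.seed(0)

copies = [0] * CHUNKS
for _ in range(NODES):
    # Each node keeps a random 10% of the history chunks.
    for chunk in random.sample(range(CHUNKS), int(CHUNKS * FRACTION)):
        copies[chunk] += 1

# Mean is ~NODES * FRACTION = 10,000 copies per chunk, i.e. the same
# replication factor as 10,000 nodes that each store everything.
print(f"min copies: {min(copies)}, mean copies: {sum(copies) / CHUNKS:.0f}")
```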
Today, Ethereum has already begun moving away from the model where all nodes store all history forever. Consensus blocks (i.e., the parts related to proof-of-stake consensus) are stored for only about six months. Blobs are stored for only about 18 days. EIP-4444 aims to introduce a one-year storage period for historical blocks and receipts. The long-term goal is to have one unified period (perhaps about 18 days) during which every node is responsible for storing everything, and then a peer-to-peer network made up of Ethereum nodes storing older data in a distributed way.
Erasure coding can be used to increase robustness while keeping the replication factor the same. In fact, blobs are already erasure-coded in order to support data availability sampling. The simplest solution may well be to reuse this erasure coding and put execution and consensus block data into blobs as well.
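The recovery property behind erasure coding can be illustrated with a toy Reed-Solomon-style code over a small prime field (real blob encoding works over a much larger field and is paired with KZG commitments; this sketch only shows the any-k-of-n recovery idea):

```python
P = 2**31 - 1  # a small Mersenne prime; real systems use much larger fields

def lagrange_eval(points, x):
    """Evaluate the unique degree-(k-1) polynomial through `points` at x (mod P)."""
    total = 0
    for i, (xi, yi) in enumerate(points):
        num, den = 1, 1
        for j, (xj, _) in enumerate(points):
            if i != j:
                num = num * (x - xj) % P
                den = den * (xi - xj) % P
        total = (total + yi * num * pow(den, P - 2, P)) % P  # Fermat inverse
    return total

def encode(data, n):
    """Treat k data symbols as evaluations at x=0..k-1; extend to n shares."""
    pts = list(zip(range(len(data)), data))
    return [(x, lagrange_eval(pts, x)) for x in range(n)]

def recover(shares, k):
    """Any k surviving (x, y) shares recover the original k data symbols."""
    return [lagrange_eval(shares[:k], x) for x in range(k)]

data = [11, 22, 33, 44]                  # k = 4
shares = encode(data, 8)                 # n = 8: tolerates losing any 4 shares
assert recover(shares[3:7], 4) == data   # recover from an arbitrary 4 shares
```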
What are the connections to existing research?
EIP-4444;
Torrents and EIP-4444;
Portal Network;
Portal Network and EIP-4444;
Distributed storage and retrieval of SSZ objects in Portal;
How to raise the gas limit (Paradigm).
What else needs to be done, and what needs to be weighed?
The main remaining work is building and integrating a concrete distributed solution for storing history; at minimum execution history, but ultimately consensus history and blobs as well. The easiest solutions are (i) simply bringing in an existing torrent library, and (ii) an Ethereum-native solution called the Portal network. Once either of these is in place, we can turn on EIP-4444. EIP-4444 itself does not require a hard fork, but it does need a new network protocol version. For this reason, there is value in enabling it for all clients at the same time; otherwise there is a risk of clients malfunctioning because they connect to other nodes expecting to download full history but do not actually get it.
The main trade-off involves how hard we try to make “ancient” history available. The easiest solution would be to simply stop storing ancient history tomorrow and rely on existing archive nodes and various centralized providers to replicate it. This is easy, but it weakens Ethereum's standing as a place to make permanent records. The harder but safer path is to first build and integrate the torrent network for storing history in a distributed way. Here, “how hard we try” has two dimensions:
How hard do we try to ensure that a maximally large set of nodes really is storing all the data?
How deeply do we integrate historical storage into the protocol?
For (1), an extremely paranoid version would involve proof of custody: effectively requiring every proof-of-stake validator to store some percentage of the history, and periodically checking cryptographically that they do so. A more moderate approach is to set a voluntary standard for the percentage of history each client stores.
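A minimal sketch of what such a proof-of-custody-style spot check could look like (the chunk-assignment rule, chunk count, and use of SHA-256 are all illustrative assumptions, not a real protocol):

```python
import hashlib, os

def assigned_chunks(node_id: bytes, num_chunks: int, fraction: float) -> set:
    """Deterministically assign each node a pseudorandom subset of history,
    so a verifier can compute which chunks a node is supposed to hold."""
    threshold = int(fraction * 2**64)
    return {i for i in range(num_chunks)
            if int.from_bytes(hashlib.sha256(
                node_id + i.to_bytes(8, "big")).digest()[:8], "big") < threshold}

def respond(stored: dict, chunk_id: int, nonce: bytes) -> bytes:
    # Binding the response to a fresh nonce prevents precomputing answers
    # and then discarding the data.
    return hashlib.sha256(nonce + stored[chunk_id]).digest()

def check(chunk_data: bytes, nonce: bytes, response: bytes) -> bool:
    return response == hashlib.sha256(nonce + chunk_data).digest()

node = os.urandom(32)
chunks = {i: i.to_bytes(4, "big") * 8 for i in assigned_chunks(node, 1000, 0.10)}
nonce = os.urandom(16)
cid = next(iter(chunks))
assert check(chunks[cid], nonce, respond(chunks, cid, nonce))
```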
For (2), the basic version only involves work that is already done today: Portal already stores ERA files containing the entire Ethereum history. A more thorough version would involve actually hooking this into the sync process, so that if someone wants to sync a full-history-storing node or an archive node, they can do it via a direct sync from the Portal network, even if no other archive nodes are online.
How does it interact with other parts of the roadmap?
If we want to make running or starting a node extremely easy, reducing historical storage requirements is arguably even more important than statelessness: of the 1.1 TB a node needs, about 300 GB is state and the remaining roughly 800 GB is history. Only with both statelessness and EIP-4444 can the vision of running an Ethereum node on a smartwatch, set up in just a few minutes, be realized.
Limiting historical storage also makes it more viable for newer Ethereum node implementations to support only recent versions of the protocol, which makes them much simpler. For example, many lines of code can now be safely removed because the empty storage slots created during the 2016 DoS attacks have all been removed. Now that the switch to proof of stake is history, clients can safely remove all proof-of-work-related code.
State expiry
What problem does it solve?
Even if we remove the need for clients to store history, clients' storage requirements will keep growing, by about 50 GB per year, because of ongoing state growth: account balances and nonces, contract code, and contract storage. Users can pay a one-time fee and impose a permanent burden on present and future Ethereum clients.
State is harder to “expire” than history, because the EVM is fundamentally designed around the assumption that once a state object is created, it will always exist and can be read by any transaction at any time. If we introduce statelessness, there is an argument that this problem is not that bad: only a specialized class of block builders needs to actually store state, and all other nodes (even inclusion list production!) can run statelessly. However, there is a counter-argument that we do not want to lean too heavily on statelessness, and eventually we may want state to expire in order to keep Ethereum decentralized.
What is it, and how does it work?
Today, when you create a new state object (which can happen in one of three ways: (i) sending ETH to a new account, (ii) creating a new account with code, (iii) setting a previously untouched storage slot), that state object remains part of the state forever. What we want instead is for objects to expire automatically over time. The key challenge is doing this in a way that achieves three goals:
Efficiency: No need for a large amount of additional computation to run the expiration process.
User friendliness: If someone goes into a cave for five years and comes back, they should not lose access to their ETH, ERC-20s, NFTs, CDP positions…
Developer friendliness: Developers should not have to switch to a completely unfamiliar mental model. Additionally, applications that are stale and unmaintained today should continue to work reasonably normally.
It is easy to solve the problem without meeting these goals. For example, you could have each state object store an expiry-date counter (which can be extended by burning ETH, something that could happen automatically whenever the object is read or written), and have a process that loops through the state deleting expired objects. However, this introduces extra computation (and even extra storage requirements), and it certainly fails the user-friendliness requirement. Developers would also struggle to reason about edge cases involving stored values sometimes resetting to zero. Making the expiry timer contract-wide makes developers' lives technically easier, but makes the economics harder: developers would have to think about how to “pass through” the ongoing storage costs to their users.
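For concreteness, here is a toy version of that rejected design (the renewal-by-burning-ETH part is elided and the constants are arbitrary), which makes both objections visible: the sweep is extra work, and expired objects simply vanish:

```python
import heapq

class NaiveExpiringState:
    """Sketch of the strawman above: each object carries an expiry time,
    extended whenever it is read or written, and a sweep process deletes
    whatever has lapsed. The sweep is exactly the extra computation the text
    objects to, and objects silently vanishing is the developer footgun."""

    def __init__(self):
        self.store = {}        # key -> (expiry_time, value)
        self.heap = []         # (expiry_time, key), lazily invalidated

    def touch(self, key, value, now, extension=100):
        self.store[key] = (now + extension, value)
        heapq.heappush(self.heap, (now + extension, key))

    def read(self, key, now):
        expiry, value = self.store[key]      # KeyError if already expired!
        self.touch(key, value, now)          # reading renews the lease
        return value

    def sweep(self, now):
        while self.heap and self.heap[0][0] <= now:
            _, key = heapq.heappop(self.heap)
            # A renewed object has a newer expiry in store; skip stale entries.
            if key in self.store and self.store[key][0] <= now:
                del self.store[key]
```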
These are problems the Ethereum core development community has worked on for many years, with proposals such as “blockchain rent” and “regenesis”. Eventually, we combined the best parts of these proposals and converged on two categories of “known least-bad solutions”:
Partial state-expiry solutions;
Address-period-based state-expiry proposals.
Partial state expiry
Partial state-expiry proposals all follow the same principle. We split the state into chunks. Everyone permanently stores the “top-level map” of which chunks are empty or non-empty. The data within each chunk is stored only if it has been accessed recently. There is a “resurrection” mechanism by which chunk data that is no longer stored can be brought back.
The main differences between these proposals are (i) how we define “recently” and (ii) how we define “chunk”. One concrete proposal is EIP-7736, which builds on the “stem-and-leaf” design introduced for Verkle trees (though it is compatible with any form of stateless tree, e.g., binary trees). In this design, header, code, and storage slots that are adjacent to each other are stored under the same “stem”. The data stored under one stem can be at most 256 * 31 = 7,936 bytes. In many cases, an account's entire header and code, plus many of its key storage slots, will all live under one stem. If the data under a given stem is neither read nor written for six months, the data is no longer stored, and only a 32-byte commitment to the data (the “stub”) is kept. Future transactions that access that data would need to “resurrect” it, providing the data along with a proof that checks against the stub.
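A minimal sketch of the stem-expiry-and-resurrection mechanic (using SHA-256 over raw bytes as a stand-in for the real 32-byte tree commitment; the epoch length and the flat dict are simplifications):

```python
import hashlib

EXPIRY_EPOCHS = 6  # e.g. one epoch per month gives the ~6-month window

class Stem:
    def __init__(self, data: bytes, epoch: int):
        self.data, self.last_access, self.stub = data, epoch, None

    def commit(self) -> bytes:
        # Stand-in for the real 32-byte commitment to the stem's leaves.
        return hashlib.sha256(self.data).digest()

def end_of_epoch(state: dict, epoch: int) -> None:
    """Expire stems untouched for EXPIRY_EPOCHS: drop the data, keep the stub."""
    for stem in state.values():
        if stem.data is not None and epoch - stem.last_access >= EXPIRY_EPOCHS:
            stem.stub, stem.data = stem.commit(), None

def access(state: dict, key: bytes, epoch: int, resurrection_data: bytes = None):
    """Read/write a stem; an expired stem needs the original data, checked
    against its 32-byte stub, before it can be used again."""
    stem = state[key]
    if stem.data is None:
        assert resurrection_data is not None, "expired stem: proof required"
        assert hashlib.sha256(resurrection_data).digest() == stem.stub
        stem.data, stem.stub = resurrection_data, None
    stem.last_access = epoch
    return stem.data

state = {b"stem1": Stem(b"header|code|slots", epoch=0)}
end_of_epoch(state, epoch=6)            # untouched for 6 epochs: expired
assert state[b"stem1"].data is None     # only the 32-byte stub remains
access(state, b"stem1", epoch=7, resurrection_data=b"header|code|slots")
assert state[b"stem1"].data is not None
```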
There are other ways to realize a similar idea. For example, if account-level granularity is not sufficient, we could have a scheme where each 1/2^32 portion of the tree is governed by a similar stem-and-leaf mechanism.
This is trickier for incentive reasons: an attacker could force clients to permanently store a very large amount of state by putting a very large amount of data into a single subtree and sending a single transaction every year to “renew the tree”. If you make the renewal cost proportional to the tree size (or the renewal duration inversely proportional to it), someone could harm other users by putting a very large amount of data into the same subtree as them. One could try to limit both problems by making the granularity dynamic based on subtree size: for example, each consecutive run of 2^16 = 65,536 state objects could be treated as a “group”. However, these ideas are more complex; the stem-based approach is simpler, and it aligns incentives well, because all the data under one stem is usually related to the same application or user.
Address-period-based state expiry
What if we want to avoid any permanent state growth at all, even 32-byte stubs? This is a hard problem because of resurrection conflicts: if a state object is deleted, a later EVM execution could put another state object in exactly the same position, but then what happens when someone who cares about the original state object comes back and tries to recover it? With partial state expiry, the “stub” prevents new data from being created in its place. With full state expiry, we cannot afford to store even the stub.
Address-period-based designs are the best-known idea for solving this problem. Instead of one state tree storing the entire state, we have an ever-growing list of state trees, and any state that is read or written gets saved into the most recent tree. A new empty tree is added once per period (say, one year). Older trees are frozen. Full nodes only store the two most recent trees. If a state object goes untouched for two periods and thus falls into an expired tree, it can still be read or written, but transactions would need to provide a Merkle proof for it; once proven, a copy is saved again into the latest tree.
A key concept for making this friendly to both users and developers is the address period. An address period is a number that forms part of the address. A key rule is that an address with period N can only be read from or written to during or after period N (i.e., once the state-tree list reaches length N). If you want to save a new state object (e.g., a new contract, or a new ERC-20 balance), and you make sure to put it into a contract whose address period is N or N-1, then you can save it immediately, without providing any proof that nothing was there before. Any additions or edits to state in older address periods, on the other hand, require a proof.
This design preserves most of Ethereum's current properties, requires very little extra computation, and allows applications to be written almost exactly as they are today (with one exception: ERC-20 tokens would need to be rewritten so that balances of period-N addresses are stored in a sub-contract that itself has period N). However, it has one big problem: addresses need to be extended beyond 20 bytes to fit the address period.
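Here is a toy model of the tree-list mechanics described above (plain dicts stand in for Merkle trees, and `proven_value` stands in for a verified Merkle proof against a frozen tree's root; period lengths and rules are simplified):

```python
class PeriodedState:
    def __init__(self):
        self.trees = [{}]          # trees[n] holds state written during period n

    def new_period(self):
        self.trees.append({})      # older trees freeze; full nodes keep last two

    def write(self, key, value, addr_period):
        current = len(self.trees) - 1
        # Key rule: a period-N address is only usable during or after period N.
        # (A real design also requires proofs when editing state under
        # old-period addresses; that check is elided here.)
        assert addr_period <= current, "address period not yet active"
        self.trees[-1][key] = value

    def read(self, key, proven_value=None):
        for tree in reversed(self.trees[-2:]):   # the two trees full nodes keep
            if key in tree:
                return tree[key]
        # Expired: the transaction must carry a proof; once proven, a copy
        # is saved again into the newest tree.
        assert proven_value is not None, "expired state: Merkle proof required"
        self.trees[-1][key] = proven_value
        return proven_value

s = PeriodedState()
s.write(b"alice", 100, addr_period=0)
s.new_period(); s.new_period()          # period 2: the period-0 tree is expired
assert s.read(b"alice", proven_value=100) == 100  # resurrected into newest tree
```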
Address space extension
One proposal is to introduce a new 32-byte address format that includes a version number, an address period number, and an extended hash:
0x01 (red) 0000 (orange) 000001 (green) 57aE408398dF7E5f4552091A69125d5dFcb7B8C2659029395bdF (blue)
The red part is the version number. The four orange zeros are empty space that could accommodate a shard number in the future. The green part is the address period number. The blue part is a 26-byte hash.
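A sketch of packing and parsing the proposed 32-byte format, with field widths taken from the example above (1-byte version, 2 bytes of reserved/shard space, 3-byte address period, 26-byte hash); the exact widths are illustrative:

```python
def pack_address(version: int, shard: int, period: int, hash26: bytes) -> bytes:
    assert len(hash26) == 26
    return (version.to_bytes(1, "big") + shard.to_bytes(2, "big")
            + period.to_bytes(3, "big") + hash26)

def unpack_address(addr: bytes):
    assert len(addr) == 32
    return (addr[0],                              # version
            int.from_bytes(addr[1:3], "big"),     # reserved / shard space
            int.from_bytes(addr[3:6], "big"),     # address period
            addr[6:])                             # 26-byte hash

h = bytes.fromhex("57aE408398dF7E5f4552091A69125d5dFcb7B8C2659029395bdF")
addr = pack_address(1, 0, 1, h)
assert addr.hex().startswith("010000000001") and unpack_address(addr)[2] == 1
```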
The key challenge here is backward compatibility. Existing contracts are designed around 20-byte addresses, and often use tight byte-packing techniques that explicitly assume addresses are exactly 20 bytes long. One idea for solving this involves a translation map, where old-style contracts interacting with new-style addresses would see a 20-byte hash of the new-style address. However, there is significant complexity involved in making this safe.
Address space contraction
Another approach goes in the opposite direction: we immediately forbid a 2^128-sized sub-range of addresses (e.g., all addresses starting with 0xffffffff), and then use that range to introduce addresses that contain an address period and a 14-byte hash:
0xffffffff000169125d5dFcb7B8C2659029395bdF
The main sacrifice of this approach is that it introduces security risks for counterfactual addresses: addresses that hold assets or permissions but whose code has not yet been published on-chain. The risk is that someone creates an address claiming to have one piece of (not yet published) code, when there is another valid piece of code hashing to the same address. Today, computing such a collision requires 2^80 hashes; address space contraction would reduce this to a very accessible 2^56 hashes.
The key risk area, counterfactual addresses for wallets held by more than a single owner, is relatively rare today, but it is likely to become more common as we enter a multi-L2 world. The only solution is to simply accept this risk, but identify all the common use cases where it could cause problems and come up with effective workarounds.
What are the connections to existing research?
Early Proposal
Blockchain rent;
Regenesis;
Ethereum State Size Management Theory;
Several possible paths to statelessness and state expiry;
Partial state-expiry proposals
EIP-7736;
Address space extension documents
Original proposal;
Review by the Ipsilon team;
Blog post comments;
What would break if we lose collision resistance.
What else needs to be done, and what needs to be weighed?
I believe there are four viable paths for the future:
We implement statelessness and never introduce state expiry. State grows continuously (albeit slowly: we may not see it exceed 8 TB for decades), but it only needs to be held by a relatively specialized class of users: not even PoS validators would need state.
One feature that requires access to part of the state is inclusion list production, but we can accomplish this in a decentralized way: each user is responsible for maintaining the part of the state tree that contains their own accounts. When a user broadcasts a transaction, they also broadcast proofs of the state objects accessed during the verification step (this works for both EOAs and ERC-4337 accounts). Stateless validators can then combine these proofs into a proof for the whole inclusion list (see the Merkle-branch sketch after this list).
We do partial state expiry and accept a much lower, but still non-zero, rate of permanent state-size growth. This outcome is arguably similar to how history-expiry proposals involving peer-to-peer networks accept a much lower, but still non-zero, rate of permanent history-storage growth from each client having to store a low but fixed percentage of the history data.
We do state expiry with address space extension. This would involve a multi-year process of making sure the address-format translation approach works and is safe, including for existing applications.
We do state expiry with address space contraction. This would involve a multi-year process of making sure all the security risks involving address collisions, including cross-chain interaction scenarios, are handled.
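Returning to path 1: the per-user proofs mentioned there are just Merkle (or Verkle) branches. A minimal binary-Merkle sketch of how a stateless node checks a user-supplied proof against a trusted state root (SHA-256 and the tree shape are stand-ins for the real commitment scheme):

```python
import hashlib

def sha(x: bytes) -> bytes:
    return hashlib.sha256(x).digest()

def merkle_root(leaf: bytes, index: int, branch: list) -> bytes:
    """Recompute the root from a leaf and its sibling path; `index` selects
    left/right at each level."""
    h = sha(leaf)
    for sib in branch:
        h = sha(sib + h) if index & 1 else sha(h + sib)
        index >>= 1
    return h

# Build a tiny 4-leaf tree, then validate leaf 2 statelessly from its branch.
leaves = [b"acct0", b"acct1", b"acct2", b"acct3"]
lvl1 = [sha(sha(leaves[0]) + sha(leaves[1])), sha(sha(leaves[2]) + sha(leaves[3]))]
root = sha(lvl1[0] + lvl1[1])
branch_for_leaf2 = [sha(leaves[3]), lvl1[0]]
assert merkle_root(b"acct2", 2, branch_for_leaf2) == root
```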
One important point is that, regardless of whether state-expiry schemes that depend on address-format changes are ever implemented, the hard questions around address space extension and contraction must eventually be addressed anyway. Today, generating an address collision takes roughly 2^80 hashes. That workload is already feasible for extremely well-resourced actors: a GPU can compute roughly 2^27 hashes per second, so running for a year it can compute about 2^52, meaning all the world's roughly 2^30 GPUs could compute a collision in about a quarter of a year, and FPGAs and ASICs could accelerate this further. In the future, such attacks will be open to more and more people. Hence, the actual cost of implementing full state expiry may not be as high as it seems, since we have to solve this very challenging address problem regardless.
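The arithmetic in this paragraph, written out (the constants are the article's own estimates):

```python
from math import log2

hashes_per_gpu_per_sec = 2**27
seconds_per_year = 365 * 24 * 3600
gpu_year = hashes_per_gpu_per_sec * seconds_per_year
print(f"one GPU-year ~= 2^{log2(gpu_year):.1f} hashes")   # ~2^51.9

world_gpus = 2**30
for label, work in [("20-byte address (2^80 birthday bound)", 2**80),
                    ("14-byte hash (2^56 birthday bound)", 2**56)]:
    years = work / (gpu_year * world_gpus)
    print(f"{label}: ~{years:.2g} years for the world's GPU fleet")
# 2^80 takes ~0.27 years (the 'about a quarter of a year' above);
# 2^56 takes a tiny fraction of a second, hence 'very accessible'.
```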
How does it interact with other parts of the roadmap?
State expiry potentially makes it easier to migrate from one state-tree format to another, because there would be no need for a transition procedure: you could simply start making new trees in the new format, and later do a hard fork to convert the old trees. So although state expiry is complex, it has benefits for simplifying other aspects of the roadmap.
Feature cleanup
What problem does it solve?
One of the key preconditions of security, accessibility, and credible neutrality is simplicity. If a protocol is beautiful and simple, the probability of bugs is lower. It increases the chance that new developers can come in and work on any part of it. It is more likely to be fair and easier to defend against special interests. Unfortunately, protocols, like any social system, by default become more complex over time. If we do not want Ethereum to sink into a black hole of ever-increasing complexity, we need to do one of two things: (i) stop making changes and ossify the protocol, or (ii) become able to actually remove features and reduce complexity. An intermediate path, making fewer changes to the protocol while removing at least a little complexity over time, is also possible. This section discusses how to reduce or remove complexity.
What is it, and how does it work?
There is no single giant fix that can reduce protocol complexity; the nature of the problem is that there are many small fixes.
A mostly-finished example, and a blueprint for how to handle the others, is the removal of the SELFDESTRUCT opcode. SELFDESTRUCT was the only opcode that could modify an unbounded number of storage slots within a single block, requiring clients to implement significantly more complexity to avoid DoS attacks. The opcode's original purpose was to enable voluntary state clearing, allowing the state size to decrease over time. In practice, very few ended up using it. In the Dencun hard fork, the opcode was weakened to only allow accounts created within the same transaction to self-destruct. This resolves the DoS problem and allows significant simplification of client code. In the future, it likely makes sense to eventually remove the opcode entirely.
Some key examples of protocol-simplification opportunities identified so far are listed below. First, some outside the EVM; these are relatively non-invasive, and therefore easier to reach consensus on and implement in a shorter timeframe.
RLP → SSZ conversion: Originally, Ethereum objects were serialized using an encoding called RLP. RLP is untyped and needlessly complex. Today, the beacon chain uses SSZ, which is significantly better in many ways, including supporting not just serialization but also hashing. Eventually, we want to move away from RLP entirely and move all data types into SSZ structures, which in turn would make upgrades much easier. Current EIPs include [1] [2] [3]. (A toy encoding comparison appears after this list.)
Remove old transaction types: There are too many transaction types today, and many of them could potentially be removed. A gentler alternative to outright removal is an account abstraction feature by which smart accounts can include the code to process and verify old-style transactions, if they so choose.
LOG reform: Logs create bloom filters and other logic that add complexity to the protocol but are not actually used by clients because they are too slow. We could remove these features and instead put effort into alternatives, such as extra-protocol decentralized log-reading tools that use modern technology like SNARKs.
Eventual removal of the beacon chain sync committee mechanism: The sync committee mechanism was originally introduced to enable light-client verification of Ethereum. However, it adds significant complexity to the protocol. Eventually, we will be able to verify the Ethereum consensus layer directly using SNARKs, removing the need for a dedicated light-client verification protocol. Potentially, consensus changes could let us remove sync committees even earlier, by creating a more “native” light-client protocol that involves verifying signatures from a random subset of consensus validators.
Data format unification: Today, execution state is stored in a Merkle Patricia tree, consensus state is stored in an SSZ tree, and blobs are committed to with KZG commitments. In the future, it makes sense to have a single unified format for block data and a single unified format for state. These formats would cover all the important needs: (i) simple proofs for stateless clients, (ii) serialization and erasure coding of data, and (iii) standardized data structures.
Remove beacon chain committees: This mechanism was originally introduced to support a particular version of execution sharding. Instead, we ended up doing sharding via L2s and blobs. Hence, the committees are unnecessary, and there is an ongoing effort to remove them.
Remove mixed endianness: The EVM is big-endian and the consensus layer is little-endian. It would make sense to harmonize on one or the other for everything (probably big-endian, because the EVM is harder to change).
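A toy comparison of the two encodings mentioned in the RLP → SSZ item above (only short RLP payloads are handled; the SSZ example is a fixed-size container, where the schema rather than the bytes carries the type information):

```python
def rlp_encode(item) -> bytes:
    """Minimal RLP for short byte strings and lists (payloads under 56 bytes),
    enough to show the untyped, length-prefixed structure."""
    if isinstance(item, bytes):
        if len(item) == 1 and item[0] < 0x80:
            return item
        return bytes([0x80 + len(item)]) + item
    payload = b"".join(rlp_encode(x) for x in item)
    return bytes([0xC0 + len(payload)]) + payload

def ssz_u64_pair(a: int, b: int) -> bytes:
    """SSZ-style fixed-size container: just the concatenation of fixed-width
    little-endian fields; merkleization falls out of the fixed layout."""
    return a.to_bytes(8, "little") + b.to_bytes(8, "little")

print(rlp_encode([b"\x01", b"\x02\x03"]).hex())  # c401820203
print(ssz_u64_pair(1, 2).hex())  # 01000000000000000200000000000000
```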
Now, here are some examples in EVM:
Gas mechanism simplification: The current gas rules are not well optimized to give clear limits on the quantity of resources needed to verify a block. Key examples include (i) storage read/write costs, which are meant to bound the number of reads/writes in a block but are currently quite haphazard, and (ii) memory expansion rules, which currently make it hard to estimate the EVM's maximum memory consumption. Proposed fixes include the stateless gas cost changes, which unify all storage-related costs into one simple formula, and a memory pricing proposal (see the sketch after this list).
Remove precompiles: Many of the precompiles Ethereum has today are both unnecessarily complex and relatively unused, and account for a large share of consensus-failure near-misses while being used by almost no applications. Two ways of dealing with this are (i) just removing the precompile, and (ii) replacing it with a piece of EVM code implementing the same logic (inevitably at a higher gas cost). This draft EIP proposes doing this for the identity precompile first; later, RIPEMD-160, MODEXP, and BLAKE may become candidates for removal.
Remove gas observability: Make it so that EVM execution can no longer see how much gas it has left. This would break some applications (most notably sponsored transactions), but would make future upgrades easier (e.g., to a more advanced version of multidimensional gas). The EOF spec already makes gas unobservable, but for this to help simplify the protocol, EOF would need to become mandatory.
Improvements to static analysis: EVM code is hard to statically analyze today, particularly because jumps can be dynamic. This also makes it harder to build optimized EVM implementations that compile EVM code ahead-of-time into other languages. We can fix this by removing dynamic jumps (or making them much more expensive, e.g., gas cost linear in the total number of JUMPDESTs in a contract). EOF does exactly this, although getting protocol-simplification gains from it would require making EOF mandatory.
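To illustrate the memory-pricing item above: the current EVM memory expansion rule is quadratic (3·w + w²/512 gas for w 32-byte words, per the Yellow Paper), which is what makes worst-case memory consumption awkward to bound; a flat rule in the spirit of the linear memory pricing proposal (whose actual constants are not reproduced here) is far easier to reason about:

```python
def mem_gas_quadratic(words: int) -> int:
    """Current EVM memory expansion cost: 3*w + w^2 // 512 (Yellow Paper)."""
    return 3 * words + words * words // 512

def mem_gas_linear(words: int, per_word: int = 3) -> int:
    """Hypothetical flat-rate pricing; the real proposal's constants differ."""
    return per_word * words

for kib in (1, 32, 1024):
    w = kib * 1024 // 32  # number of 32-byte words
    print(f"{kib:>5} KiB: quadratic={mem_gas_quadratic(w):>10}"
          f"  linear={mem_gas_linear(w):>8}")
```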
What are the connections to existing research?
Next steps for The Purge;
SELFDESTRUCT removal;
SSZ conversion EIPs: [1] [2] [3];
Stateless gas cost changes;
Linear memory pricing;
Precompile removal;
Bloom filter removal;
A method for reading logs securely off-chain using incrementally verifiable computation (read: recursive STARKs).
What else needs to be done, and what needs to be weighed?
The main trade-offs in doing this kind of feature simplification are (i) how much we simplify and how fast, versus (ii) backward compatibility. Ethereum's value as a chain comes from it being a platform where you can deploy an application and be confident it will still work many years later. At the same time, that ideal can be taken too far and, to borrow the words of William Jennings Bryan, “crucify Ethereum on a cross of backward compatibility”. If there are only two applications in all of Ethereum that use a given feature, one of which has had zero users for years and the other of which is almost entirely unused and secures a total of $57 of value, then we should remove the feature and, if needed, pay the victims $57 out of pocket.
The broader social problem is creating a standardized pipeline for making non-urgent backward-compatibility-breaking changes. One way to approach this is to examine and extend existing precedents, such as the SELFDESTRUCT process. The pipeline looks something like the following:
Start talking about removing feature X;
Analyze how much removing X breaks applications and, depending on the results, either (i) abandon the idea, (ii) proceed as planned, or (iii) identify a modified, least-disruptive way to remove X and proceed;
Make a formal EIP to deprecate X. Ensure that popular higher-level infrastructure (programming languages, wallets) respects this and stops using the feature;
Finally, actually remove X;
There should be a multi-year pipeline from step 1 to step 4, with clear information about which items are at which step. At that point, there is a trade-off between how vigorous and fast the feature-removal pipeline is versus being more conservative and putting more resources into other areas of protocol development, but we are still far from the Pareto frontier.
EOF
One set of major changes proposed for the EVM is the EVM Object Format (EOF). EOF introduces a large number of changes, such as banning gas observability and code observability (i.e., no CODECOPY) and allowing only static jumps. The goal is to allow the EVM to be upgraded further, in a way that gains stronger properties, while preserving backward compatibility (since the pre-EOF EVM would continue to exist).
The advantage of doing this is that it creates a natural path for adding new EVM features and encourages migration to a stricter EVM with stronger guarantees. The disadvantage is that it significantly increases protocol complexity, unless we can find a way to eventually deprecate and remove the old EVM. One major question is: what role does EOF play in EVM simplification proposals, especially if the goal is to reduce the complexity of the EVM as a whole?
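To illustrate why EOF's static-jump requirement matters for analysis: this sketch scans raw EVM bytecode (skipping PUSH immediates) and classifies each JUMP/JUMPI as static (target pushed immediately before it) or dynamic (target comes from arbitrary stack data). Under EOF only the first kind exists, which is what makes ahead-of-time compilation tractable. The opcode values are the real ones; the “preceded by a PUSH” heuristic is a simplification of what real analyzers do:

```python
JUMP, JUMPI, JUMPDEST = 0x56, 0x57, 0x5B
PUSH1, PUSH32 = 0x60, 0x7F

def classify_jumps(code: bytes):
    """Return (static_jump_offsets, dynamic_jump_offsets) for raw bytecode."""
    static, dynamic = [], []
    i, prev_push_value = 0, None
    while i < len(code):
        op = code[i]
        if PUSH1 <= op <= PUSH32:
            n = op - PUSH1 + 1                       # immediate length
            prev_push_value = int.from_bytes(code[i+1:i+1+n], "big")
            i += 1 + n
            continue
        if op in (JUMP, JUMPI):
            (static if prev_push_value is not None else dynamic).append(i)
        prev_push_value = None
        i += 1
    return static, dynamic

# PUSH1 0x04; JUMP; STOP; JUMPDEST  ->  one static jump, no dynamic ones
print(classify_jumps(bytes([0x60, 0x04, JUMP, 0x00, JUMPDEST])))
```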
How does it interact with other parts of the roadmap?
Many of the “improvement” proposals in the rest of the roadmap are also opportunities to simplify old features. To repeat some of the examples above:
Switching to single-slot finality gives us an opportunity to remove committees, redesign the economics, and make other proof-of-stake-related simplifications.
Fully implementing account abstraction allows us to remove a large amount of existing transaction processing logic and move it to the ‘default account EVM code’ that can replace all EOAs.
If we transfer the Ethereum state to a binary hash tree, this can be consistent with the new version of SSZ, so that all Ethereum data structures can be hashed in the same way.
More aggressive approach: Convert most of the content of the protocol into contract code
A more radical Ethereum simplification strategy is to keep the protocol unchanged but transfer most of its functionality from the protocol to contract code.
The most extreme version is to make Ethereum L1 “technically” only the beacon chain, and introduce a minimal virtual machine (e.g., RISC-V, Cairo, or something even more minimal specialized for proof systems) that lets anyone else create their own rollups. The EVM would then become the first of these rollups. Ironically, this is exactly the same outcome as the execution environment proposals of 2019-20, although SNARKs make it significantly more feasible to actually implement.
A more gentle approach would keep the relationship between the beacon chain and the current Ethereum execution environment unchanged, but do an in-place swap of the EVM. We could choose RISC-V, Cairo, or another VM as the new official Ethereum VM, and then mandatorily convert all EVM contracts into new-VM code that interprets the logic of the original code (by compiling or interpreting it). In theory, this could even be done with the “target VM” being a version of EOF.