This week we're revising the tech tree to reflect major new milestones in Ethereum 1.x R&D that are not quite a complete realization of Stateless Ethereum, but are much more reasonably achievable in the medium term. The most important addition to the tech tree is Alexey's reGenesis proposal. It's far from a well-specified upgrade, but the general feeling among R&D is that reGenesis offers a less dramatic but much more achievable step toward the ultimate goal of the "fully stateless" vision. In many ways, reGenesis is complementary to a static state network that would help distribute state snapshots and historical chain data in a bittorrent-style, DHT-based network. At the same time, shorter-term improvements such as code merkleization and a binary representation of state are moving ever closer to being EIP-ready. Below I'll explain and clarify the changes that have been made, and link to relevant discussions if you want to dig deeper into a particular feature.
Binary Trie
While Ethereum currently uses a hexary Merkle-Patricia Trie to encode state, there are substantial efficiency gains to be made by moving to a binary format, particularly with regard to the anticipated size of witnesses. A complete re-encoding of Ethereum's state requires a specification of the new format and a clear transition strategy. Finally, it must be decided whether smart contract code will also be merkleized, and whether that change should be incorporated into the binary transition or made as a standalone change.
Binary Trie Format
The general idea of a binary trie is quite a bit simpler (pun intended :)) than Ethereum's current hexary trie structure. Instead of having one of 16 possible paths to follow from the root of the trie down through the child nodes, a binary trie has 2. With a complete re-specification of the state trie, there is an additional opportunity to address well-established inefficiencies that have made themselves known now that Ethereum has been in operation for over 5 years. In particular, this could be an opportunity to make the state much better suited to the real-world performance challenges of database encoding (described in a previous article on the growth of the state).
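To make the 16-versus-2 difference concrete, here is a minimal Python sketch of how a key turns into a path in each kind of trie, along with a rough count of the sibling hashes a proof for one leaf would need. It assumes a perfectly balanced trie and ignores extension nodes, leaf encoding and everything else a real specification has to deal with; the function names are made up for illustration.

```python
def hex_path(key: bytes) -> list[int]:
    """Split a key into 4-bit nibbles: one of 16 branches per trie level."""
    path = []
    for byte in key:
        path.append(byte >> 4)
        path.append(byte & 0x0F)
    return path


def binary_path(key: bytes) -> list[int]:
    """Split a key into bits: one of 2 branches per trie level."""
    return [(byte >> shift) & 1 for byte in key for shift in range(7, -1, -1)]


def depth_for(num_leaves: int, arity: int) -> int:
    """Depth of a balanced trie of the given arity that can hold num_leaves."""
    depth, capacity = 0, 1
    while capacity < num_leaves:
        capacity *= arity
        depth += 1
    return depth


def sibling_hashes(num_leaves: int, arity: int) -> int:
    """Rough number of sibling hashes in a proof for a single leaf.

    Assumption: at every level the proof carries all (arity - 1) siblings.
    """
    return depth_for(num_leaves, arity) * (arity - 1)


if __name__ == "__main__":
    leaves = 16 ** 6  # about 16.7 million leaves, chosen only for illustration
    print("hexary:", sibling_hashes(leaves, 16), "sibling hashes")
    print("binary:", sibling_hashes(leaves, 2), "sibling hashes")
```

The binary trie is four times deeper, but each level contributes one sibling instead of up to fifteen, which is where the anticipated witness savings come from.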
Discussion of a formal specification of binary trie and merkleization rules can be found on ethresearch.
Binary Trie Transition
It's not just the destination (the binary trie format) that is important, but the journey itself! In an ideal transition, there would be no interruption to transaction processing on the network, meaning clients would need to build the new binary trie at the same time as they handle new blocks arriving every 15 seconds. The transition strategy that continues to look the most promising is nicknamed the overlay method, which is based in part on geth's new snapshot sync protocol. In short, new state changes will be added to the existing (hexary) trie in a binary format, creating a sort of binary/hexary hybrid during the transition. The untouched state is converted as a background process. Once the conversion is complete, both layers are flattened into a single binary trie.
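The layering described above can be sketched in a few lines of Python. This is only a toy model under loud assumptions: plain dictionaries stand in for the hexary and binary tries, deletions are not modeled, and the class and method names are invented for illustration rather than taken from any client.

```python
class OverlayState:
    """Toy model of the overlay method; dicts stand in for real tries."""

    def __init__(self, hexary_state: dict[bytes, bytes]):
        self.hexary = dict(hexary_state)          # legacy data, not yet converted
        self.converted: dict[bytes, bytes] = {}   # legacy data, now in binary form
        self.overlay: dict[bytes, bytes] = {}     # writes made during the transition

    def put(self, key: bytes, value: bytes) -> None:
        # New state changes always land in the binary overlay.
        self.overlay[key] = value

    def get(self, key: bytes):
        # Newest layer wins: overlay first, then converted data, then legacy data.
        for layer in (self.overlay, self.converted, self.hexary):
            if key in layer:
                return layer[key]
        return None

    def convert_step(self, batch_size: int) -> int:
        """Background conversion: move up to batch_size legacy entries."""
        keys = list(self.hexary)[:batch_size]
        for key in keys:
            self.converted[key] = self.hexary.pop(key)
        return len(keys)

    def flatten(self) -> dict[bytes, bytes]:
        """Merge both binary layers once nothing is left in the hexary layer."""
        if self.hexary:
            raise RuntimeError("conversion still in progress")
        return {**self.converted, **self.overlay}


if __name__ == "__main__":
    state = OverlayState({b"a": b"1", b"b": b"2", b"c": b"3"})
    state.put(b"a", b"9")            # a block arrives mid-transition
    while state.convert_step(2):     # background work, two entries at a time
        pass
    print(state.flatten())
```

The point of the sketch is that block processing (`put` and `get`) never has to wait for `convert_step`, which is what would let the network keep running during the conversion.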
It is important to note that the binary transition is a context in which client diversity matters a great deal. Each client will either have to implement its own version of the transition, or rely on other clients to convert and wait to restart on the other side of the conversion. This will likely be a "measure twice, cut once" situation, where all client teams work together to implement testing and coordinate the cutover. It is possible that, in the interest of safety and security, the network may need to briefly suspend service (e.g., mine a few empty blocks) during the transition, but it is too difficult to agree on a specific plan at the moment.
Code Merkleization
Smart contract code accounts for a significant portion of Ethereum's state trie (around 1 GB of the ~50 GB of state). A witness for any smart contract interaction will necessarily have to provide the code it interacts with in order to calculate a codeHash, and that could be a lot of additional data. Code merkleization is a way to break contract code into smaller chunks and replace codeHash with the root of another Merkle trie. This would allow a witness to replace potentially large portions of the smart contract code with reference hashes, thereby saving crucial kilobytes of witness data.
There are a few approaches to code merkleization, ranging from universal chunking (e.g., into 64-byte chunks) on the simple side to more complex methods such as static analysis based on Solidity function IDs or JUMPDEST instructions. The optimal code merkleization strategy will ultimately depend on what works best with actual data collected on mainnet.
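As a rough illustration of the simplest approach, the Python sketch below cuts code into fixed 64-byte chunks, builds a Merkle root over them, and proves a single chunk against that root. Several things here are assumptions made for the sake of a runnable example: sha256 stands in for keccak256 (which is not in the Python standard library), odd levels are padded with a constant empty hash, and none of the function names come from an actual proposal.

```python
import hashlib

CHUNK_SIZE = 64  # bytes per chunk, matching the universal-chunking example


def h(data: bytes) -> bytes:
    # Assumption: sha256 as a stand-in for Ethereum's keccak256.
    return hashlib.sha256(data).digest()


EMPTY = h(b"")  # padding value for levels with an odd number of nodes


def chunk_code(code: bytes) -> list[bytes]:
    return [code[i:i + CHUNK_SIZE] for i in range(0, len(code), CHUNK_SIZE)]


def build_levels(code: bytes) -> list[list[bytes]]:
    """All levels of the tree, from chunk hashes up to the single root."""
    level = [h(chunk) for chunk in chunk_code(code)] or [EMPTY]
    levels = [level]
    while len(level) > 1:
        padded = level + [EMPTY] if len(level) % 2 else level
        level = [h(padded[i] + padded[i + 1]) for i in range(0, len(padded), 2)]
        levels.append(level)
    return levels


def code_root(code: bytes) -> bytes:
    return build_levels(code)[-1][0]


def prove_chunk(code: bytes, index: int) -> list[bytes]:
    """Sibling hashes needed to recompute the root from chunk number index."""
    proof = []
    for level in build_levels(code)[:-1]:
        sibling = index ^ 1
        proof.append(level[sibling] if sibling < len(level) else EMPTY)
        index //= 2
    return proof


def verify_chunk(root: bytes, chunk: bytes, index: int, proof: list[bytes]) -> bool:
    node = h(chunk)
    for sibling in proof:
        node = h(node + sibling) if index % 2 == 0 else h(sibling + node)
        index //= 2
    return node == root


if __name__ == "__main__":
    code = bytes(range(256)) + bytes(44)   # 300 bytes of made-up "bytecode"
    root = code_root(code)
    proof = prove_chunk(code, 2)
    print(verify_chunk(root, chunk_code(code)[2], 2, proof))
    print(len(proof) * 32 + CHUNK_SIZE, "bytes instead of", len(code))
```

A witness built this way carries one chunk plus a handful of hashes rather than the whole contract, and the savings grow with contract size.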
reGenesis
The best place to get an idea of the reGenesis proposal is this explanation by @mandrigin or the full proposal from @realLedgerwatch, but the TL;DR is that reGenesis is essentially "spring cleaning for the blockchain". The complete state would be conceptually divided into an "active" state and an "inactive" state. Periodically, the entire "active" state would be deactivated, and new transactions would start building an active state again from almost nothing (hence the name "reGenesis"). If a transaction needed an old part of the state, it would provide a witness very similar to what would be required for Stateless Ethereum: a Merkle proof showing that the state change is consistent with some part of the inactive state. If a transaction touches an "inactive" part of the state, it automatically promotes it to "active" (whether the transaction succeeds or not), where it remains until the next reGenesis event. This has the nice property of creating some of the economic bounds on state usage that state rent had, without actually removing any state. It also allows a transaction sender who is unable to generate a witness to blindly keep retrying a transaction until everything it touches is "active" again.
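The active/inactive split can be modeled in a short Python sketch. Since reGenesis is far from specified, everything here is an assumption made for illustration: sha256 stands in for keccak256, the inactive state is committed to with a simple Merkle tree over sorted key/value pairs rather than a real state trie, creating brand-new keys is not modeled, and the full snapshot is handed in from outside at each reGenesis event (in practice the new root would be derived from the trie itself). All names are hypothetical.

```python
import hashlib


def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()  # stand-in for keccak256


EMPTY = h(b"")


def leaf(key: bytes, value: bytes) -> bytes:
    return h(len(key).to_bytes(4, "big") + key + value)


def build_levels(state: dict[bytes, bytes]) -> list[list[bytes]]:
    """Toy Merkle tree over (key, value) pairs sorted by key."""
    level = [leaf(k, state[k]) for k in sorted(state)] or [EMPTY]
    levels = [level]
    while len(level) > 1:
        padded = level + [EMPTY] if len(level) % 2 else level
        level = [h(padded[i] + padded[i + 1]) for i in range(0, len(padded), 2)]
        levels.append(level)
    return levels


def make_witness(state: dict[bytes, bytes], key: bytes):
    """Built by whoever still holds the full snapshot: (value, index, proof)."""
    index = sorted(state).index(key)
    proof, i = [], index
    for level in build_levels(state)[:-1]:
        proof.append(level[i ^ 1] if (i ^ 1) < len(level) else EMPTY)
        i //= 2
    return state[key], index, proof


def verify_witness(root: bytes, key: bytes, witness) -> bool:
    value, index, proof = witness
    node = leaf(key, value)
    for sibling in proof:
        node = h(node + sibling) if index % 2 == 0 else h(sibling + node)
        index //= 2
    return node == root


class ReGenesisNode:
    """Keeps only the active state plus a root committing to the inactive state."""

    def __init__(self, snapshot: dict[bytes, bytes]):
        self.regenesis(snapshot)

    def regenesis(self, snapshot: dict[bytes, bytes]) -> None:
        # Everything becomes inactive; only the commitment is kept.
        self.inactive_root = build_levels(snapshot)[-1][0]
        self.active: dict[bytes, bytes] = {}

    def touch(self, key: bytes, witness=None) -> bytes:
        if key in self.active:
            return self.active[key]
        if witness is None:
            raise KeyError("inactive state requires a witness")
        if not verify_witness(self.inactive_root, key, witness):
            raise ValueError("witness does not match the inactive state root")
        self.active[key] = witness[0]  # promoted until the next reGenesis
        return witness[0]

    def write(self, key: bytes, value: bytes) -> None:
        if key not in self.active:
            raise KeyError("touch the key with a witness first")
        self.active[key] = value


if __name__ == "__main__":
    snapshot = {b"alice": b"10", b"bob": b"20", b"carol": b"30"}
    node = ReGenesisNode(snapshot)
    print(node.touch(b"bob", make_witness(snapshot, b"bob")))  # needs a witness
    print(node.touch(b"bob"))                                  # now active
```

Once a key has been touched with a valid witness, later transactions can use it without one, which is the behavior that makes blind retries eventually succeed.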
The fun part about reGenesis is that it brings Ethereum a lot closer to the ultimate goal of statelessness while avoiding some of the biggest challenges of statelessness, such as how witness gas accounting works during EVM execution. It also gets some version of transaction witnesses moving across the network, allowing for leaner clients and giving dapp developers more opportunities to get used to the stateless paradigm and to witness production. "True" statelessness after reGenesis would then be a question of degree: Stateless Ethereum is really just a reGenesis after every block.
State Network
A better network protocol has been a "side quest" in the tech tree since the beginning, but with the addition of reGenesis to the scope of Stateless Ethereum, finding alternative network primitives for sharing Ethereum chain data (including state) now seems to fit much better into the main quest. Ethereum's current network protocol is a monolith, when in fact there are several distinct types of data that could be shared over different "subnets" optimized for different things.
This has been discussed as the "Three Networks" in previous stateless calls, with a DHT-based network able to serve more efficiently some of the data that doesn't change from moment to moment. With the introduction of reGenesis, the "inactive" state would fall into this category of unchanging data, and could theoretically be served by a bittorrent-like swarming network instead of piece by piece from a fully synced client, as is currently the case.
A network that passes around the state that has not changed since the last reGenesis event would be a static state network, and could be built by extending the new Discovery v5.1 specification in devp2p (Ethereum's networking protocol). Previous proposals such as Merry-go-round sync and the (more mature) SNAP protocol for syncing active state would still be valuable steps toward a fully distributed dynamic state network for clients trying to quickly sync the full state.
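For readers unfamiliar with DHTs, here is a small Python sketch of how a piece of static state could be mapped to the nodes responsible for serving it. Nothing in it comes from a state network specification: it assumes a Kademlia-style XOR distance between identifiers (the same kind of metric node discovery uses for node IDs), uses sha256 as a stand-in hash, and the function names and the content key format are invented.

```python
import hashlib


def content_id(content_key: bytes) -> int:
    """Map a content key into the same 256-bit ID space as the nodes."""
    return int.from_bytes(hashlib.sha256(content_key).digest(), "big")


def xor_distance(a: int, b: int) -> int:
    return a ^ b


def closest_nodes(target_id: int, node_ids: list[int], count: int) -> list[int]:
    """The `count` nodes whose IDs are nearest to the target by XOR distance."""
    return sorted(node_ids, key=lambda n: xor_distance(n, target_id))[:count]


if __name__ == "__main__":
    # Hypothetical peers, identified by hashing made-up names.
    peers = [content_id(f"peer-{i}".encode()) for i in range(20)]
    # Hypothetical key for one chunk of an inactive-state snapshot.
    target = content_id(b"snapshot-chunk-42")
    for peer in closest_nodes(target, peers, 3):
        print(hex(peer)[:18], "...")
```

Because inactive state does not change between reGenesis events, content stored this way can be replicated and fetched from many peers at once, much like a torrent.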
Conclusion
A more condensed and technical version of each leaf of the Stateless tech tree (not just the updated ones) is available in the Stateless Ethereum specifications repository, and active discussions on all of the topics covered here can be found in the Eth1x/2 R&D Discord. Please request an invite on ethresear.ch if you would like to join us. As always, tweet @gichiba or @JHancock with comments, questions and suggestions for new topics.