January 14 call tl;dc (too long, didn’t call)
Disclaimer: This is a summary of topics covered in the recurring Eth1.x research call and does not represent finalized plans or commitments regarding network upgrades.
The main themes of this call were:
- Raw data quantifying the benefits of moving to a binary trie structure
- Transition strategies and potential challenges for a move to binary tries
- “Merklizing” contract code for witnesses, and implications for gas scheduling/metering
- Chain pruning and historical chain/state data – network implications and distribution approaches.
Logistics
The weekend following EthCC (March 7-8), there will be a small summit on 1.x research, with the aim of having a few days of in-depth discussion and work on the topics covered here. Attendance will be capped (due to venue constraints) at 40 participants, which should be more than enough room for everyone expected to attend.
There will likely also be some informal, ad hoc gatherings around Stanford Blockchain Week and ETHDenver, but nothing explicitly planned.
The next call is tentatively scheduled for the first or second week of February, halfway between now and the Paris summit.
Technical discussion
EIP #2465
Although not directly related to stateless Ethereum, this EIP improves the network protocol for transaction propagation, and is therefore a relatively simple improvement that moves things in the right direction for the research effort. Support!
Binary Trie Size Savings
The transition to a binary trie structure (from the current hexary trie structure) should in theory reduce witness sizes by approximately 3.75x, but in practice the reduction may only be about half that, depending on how you look at it.
Witnesses are approximately 30% code and 70% hashes. The hashes within the trie shrink by about 3x, but the code portion is not improved by a binary trie, since it must still be included in the witness in full. Switching to a binary trie format should therefore bring witness sizes down to ~300-1400 kB, compared with ~800-3400 kB in the hexary trie.
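As a back-of-envelope sketch of where numbers like these come from (the figures below are illustrative, not measurements from the call): a Merkle proof must supply every sibling hash at each level of the trie, so a hexary trie needs 15 sibling hashes per level, while a binary trie needs only 1, at 4x the depth.

```python
# Back-of-envelope estimate of witness hash savings from a binary trie.
# Figures are illustrative, not measurements.

def depth(n_leaves, branching_factor):
    # Smallest depth d such that branching_factor**d >= n_leaves.
    d = 0
    while branching_factor ** d < n_leaves:
        d += 1
    return d

def proof_hashes(n_leaves, branching_factor):
    # A Merkle proof supplies every sibling hash at each level.
    siblings_per_level = branching_factor - 1
    return depth(n_leaves, branching_factor) * siblings_per_level

leaves = 16 ** 6  # an illustrative trie with ~16.7M leaves

hex_hashes = proof_hashes(leaves, 16)  # 6 levels * 15 siblings = 90
bin_hashes = proof_hashes(leaves, 2)   # 24 levels * 1 sibling  = 24
print(hex_hashes / bin_hashes)         # 3.75x fewer hashes

# With witnesses ~30% code and ~70% hashes, and only the hash portion
# shrinking (~3.75x assumed here), the overall reduction is roughly:
print(round(1 / (0.30 + 0.70 / 3.75), 2))  # ~2.05x smaller overall
```

This matches the observation above that the theoretical ~3.75x saving roughly halves once the un-shrinkable code portion of the witness is accounted for.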
Making the switch
Implementing the actual transition to a binary trie is another matter, with some questions that need further investigation. There are essentially two different possible strategies that could be followed:
Gradual transition: This is a “Ship of Theseus” transition model in which the entire state is migrated to the binary format account-by-account and storage-slot-by-storage-slot, as each part of the state is touched by EVM execution. This implies that, indefinitely, the Ethereum state would be a hexary/binary hybrid, and that dormant accounts would have to be “poked” to update them to the new trie format (perhaps with a new POKE opcode). The advantages are that it does not interrupt the normal operation of the chain and does not require large-scale coordination for the upgrade. The downside is complexity: clients must accommodate both hex and binary trie formats, and the process would never actually “finish”, because some parts of the state are not externally accessible and would have to be explicitly poked by their owners, which will probably never happen for the entire state. The gradual strategy would also require clients to re-model their database as a sort of “virtualized” binary trie layered over a hexary database layout, to avoid a sudden and dramatic increase in storage requirements for all clients (note: this database improvement could happen independently of the full gradual transition, and would be beneficial on its own).
Compute and clean-cut: This would be an “all-at-once” transition accomplished over one or more hard forks: a date in the future would be chosen for the change, at which point all participants in the network would re-compute the state as a binary trie and then switch to the new format together. This strategy would in some ways be “easier”, because it is simple from an engineering perspective, but it is more complex from a coordination perspective: the new binary state trie must be pre-computed before the fork, which could take an hour or so; during this window, it is not clear how transactions and new blocks would be handled (they would have to be applied to the not-yet-computed binary trie and/or the old trie). The process would be made harder still by the tendency of many miners and exchanges to upgrade their clients at the last minute. Alternatively, the whole chain could be halted for a short period while the new state is recomputed, a process that could be even trickier and potentially controversial to coordinate.
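The “virtualized binary trie over a hexary database” idea mentioned under the gradual strategy can be pictured with a toy key-path conversion (the function name and shape are my own illustration, not an actual client implementation): each hex nibble on a key path expands deterministically into four binary steps, so a client could present a binary-trie view over its existing hexary key layout.

```python
# Hypothetical sketch of a "virtualized" binary trie: each hexary nibble
# on a key path expands to four binary steps, so a binary trie can be
# emulated over a hexary key layout without rewriting the database.

def hex_path_to_binary_path(nibbles):
    """Expand a list of hex nibbles (0-15) into a list of bits."""
    bits = []
    for nibble in nibbles:
        for shift in (3, 2, 1, 0):      # most-significant bit first
            bits.append((nibble >> shift) & 1)
    return bits

# e.g. the hexary path [0xA, 0x3] becomes 8 binary steps:
print(hex_path_to_binary_path([0xA, 0x3]))  # [1, 0, 1, 0, 0, 0, 1, 1]
```

The point is only that the binary path is a deterministic re-encoding of the hexary path, which is why a client could serve a binary view without immediately duplicating its storage.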
Both options are still “on the table” and require further review and discussion before decisions are made regarding next steps. This includes weighing the trade-offs between implementation complexity on the one hand and coordination challenges on the other.
Code “chunking”
As for the code portion of witnesses, some prototyping work has been done on code “merklization”, which essentially allows contract code to be broken up into chunks before being put into a witness. The basic idea is that when a method in a smart contract is called, the witness should only need to include the parts of the contract code that were actually executed, rather than the entire contract. This is still very early research, but it suggests a further reduction of about 50% in the code portion of a witness. More ambitiously, code chunking could be extended to create a single global “code trie”, but this is not a well-developed idea and likely has its own challenges that warrant further investigation.
There are various methods by which code could be broken into chunks and then used to generate witnesses. The first is “dynamic”, in that it relies on finding JUMPDEST instructions and splitting near those points, resulting in chunk sizes that vary with the code being split. The second is “static”, which would split the code at fixed sizes and add some necessary metadata specifying where the valid jump destinations are within each chunk. Either approach appears workable, the two could even coexist, and the choice could be left up to users. Either way, chunking allows witness sizes to be reduced further.
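A minimal sketch of the “static” approach, under assumed parameters (32-byte chunks; JUMPDEST is opcode 0x5B). Note that a real implementation must also skip 0x5B bytes appearing inside PUSH data, which this toy version deliberately ignores:

```python
# Illustrative "static" code chunking: fixed-size chunks plus metadata
# listing valid jump destinations inside each chunk. Toy version only;
# it does not distinguish JUMPDESTs from 0x5B bytes inside PUSH data.

CHUNK_SIZE = 32
JUMPDEST = 0x5B

def chunk_code(code: bytes):
    chunks = []
    for start in range(0, len(code), CHUNK_SIZE):
        body = code[start:start + CHUNK_SIZE]
        # Metadata: offsets within this chunk that are valid jump targets.
        jumpdests = [i for i, op in enumerate(body) if op == JUMPDEST]
        chunks.append({"offset": start, "code": body, "jumpdests": jumpdests})
    return chunks

# Toy bytecode: a repeating PUSH1 0x01 / JUMPDEST / STOP pattern, 80 bytes.
code = bytes([0x60, 0x01, JUMPDEST, 0x00] * 20)
for c in chunk_code(code):
    print(c["offset"], c["jumpdests"])
```

A witness would then carry only the touched chunks (plus their metadata) instead of the full contract bytecode.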
(Un)gas
An open question is what changes to gas scheduling would be necessary or desirable with the introduction of chunked witnesses. Witness generation must be paid for in gas. If code is chunked, there will be some overlap within a block where multiple transactions touch the same code, and so parts of a block witness would be paid for more than once by the transactions included in that block. It seems like a safe idea (and one that would be good for miners) to have the originator of a transaction pay the full witness cost of their own transaction, and to let the miner keep the excess. This minimizes the need to change gas costs and incentivizes miners to produce witnesses, but it unfortunately breaks the current security model of only entrusting sub-calls (within a transaction) with a portion of the total gas committed. How this change to the security model is handled is something that needs to be examined carefully. Ultimately, the goal is to charge each transaction the cost of producing its own witness, proportional to the code it touches.
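The proposed accounting rule can be illustrated with a toy model (the per-chunk cost, function, and chunk ids below are hypothetical, purely for illustration): every transaction pays for all the chunks it touches, the block’s actual witness only needs each chunk once, and the miner keeps the difference.

```python
# Toy model of "transactions pay full witness cost; miner keeps the
# overlap excess". Per-chunk cost is a made-up illustrative number.

GAS_PER_CHUNK = 200  # hypothetical witness cost per code chunk

def block_witness_accounting(txs):
    """txs: list of sets of chunk ids touched by each transaction."""
    # Each transaction pays for every chunk it touches, independently.
    paid = sum(len(chunks) * GAS_PER_CHUNK for chunks in txs)
    # The block witness only needs each distinct chunk once.
    unique_chunks = set().union(*txs)
    actual_cost = len(unique_chunks) * GAS_PER_CHUNK
    return paid, actual_cost, paid - actual_cost  # excess kept by miner

paid, cost, miner_excess = block_witness_accounting(
    [{"A", "B"}, {"B", "C"}, {"B"}]
)
print(paid, cost, miner_excess)  # 1000 600 400
```

Here chunk "B" is paid for three times but included once, which is exactly the overlap the miner would pocket under this rule.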
Wei Tang’s UNGAS proposal could make changes to the EVM easier to achieve. It isn’t strictly necessary for stateless Ethereum, but it’s one idea for facilitating future changes to gas schedules. The question to ask is: what do the required changes look like with and without UNGAS, and, with those taken into account, does UNGAS actually make them significantly easier to implement? Answering it will require experiments that run merklized code under new gas rules, to see what needs to change around cost and execution in the EVM.
Chain pruning and data delivery
In a stateless model, nodes that lack some or all of the state need a way to signal to the rest of the network which data they have and which they are missing. This has implications for network topology: stateless clients missing data must be able to reliably and quickly find the data they need somewhere on the network, and must also be able to signal in advance which data they do not have (and might need). Adding such functionality to one of the chain pruning EIPs is a network protocol change (but not a consensus change), and it is something that can be worked on now.
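One way to picture the kind of signaling involved (the message shape and all names below are my own sketch, not an actual devp2p proposal): a node could advertise which prefixes of the state keyspace it serves, letting stateless peers route their requests appropriately.

```python
# Hypothetical capability announcement: a node advertises which key
# prefixes of the state it can serve. Purely illustrative; not a spec.

from dataclasses import dataclass

@dataclass
class StatusAnnouncement:
    node_id: str
    served_prefixes: list  # hex prefixes of the state keyspace this node serves

ann = StatusAnnouncement("node-1", ["0x0", "0x1", "0xf"])

def can_serve(ann: StatusAnnouncement, key_hex: str) -> bool:
    # A peer can answer a request if the key falls under a served prefix.
    return any(key_hex.startswith(p) for p in ann.served_prefixes)

print(can_serve(ann, "0x0abc"))  # True
print(can_serve(ann, "0x9abc"))  # False
```

A stateless client could collect such announcements at handshake time and know, before asking, which peer is worth querying for a given piece of state.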
The second aspect of this problem is where historical data will be stored, and the best solution proposed so far is an Ethereum-specific distributed storage network that can serve requested data. This could take several forms: the full state could be “chunked”, like contract code; partial-state nodes could watch over (randomly assigned) chunks of state and serve them on demand to the rest of the network; and clients could adopt an additional data-routing mechanism so that a stateless node can still obtain missing data through an intermediary (one that does not itself have the needed data, but is connected to another node that does). Whatever the implementation, the overarching goal is for clients to be able to join the network and reliably get all the data they need, without scrambling to connect to a full-state node (which is effectively what happens with LES nodes now). Work on these ideas is still in its early stages, but the Geth team has had promising results experimenting with state “tiling” (chunking), and turbo-geth is working on data routing for gossiping state data.
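The intermediary-routing idea might be sketched like this (all names and the peer-graph shape are hypothetical): a stateless node’s request for a missing state chunk is forwarded through peers until it reaches one that actually holds the chunk.

```python
# Toy sketch of routing a state-chunk request through intermediaries.
# Node names, chunk ids, and the network structure are illustrative.

def find_chunk(chunk_id, node, network, visited=None):
    """Depth-first search over the peer graph for a node holding chunk_id."""
    visited = visited or set()
    if node in visited:
        return None
    visited.add(node)
    if chunk_id in network[node]["chunks"]:
        return node
    for peer in network[node]["peers"]:
        found = find_chunk(chunk_id, peer, network, visited)
        if found:
            return found
    return None

network = {
    "stateless": {"chunks": set(),        "peers": ["partial"]},
    "partial":   {"chunks": {"c1"},       "peers": ["full"]},
    "full":      {"chunks": {"c1", "c2"}, "peers": []},
}

print(find_chunk("c2", "stateless", network))  # "full", reached via "partial"
```

Here "partial" acts as the intermediary: it lacks "c2" itself but is connected to a node that has it, which is the behavior the data-routing work aims to make routine.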
As always, if you have questions about the Eth1x efforts, requests for topics, or would like to contribute or attend an event, introduce yourself on ethresear.ch or reach out to @gichiba and/or @JHancock on Twitter.