Parallel ExecV3 Processing #16922
Merged
AskAlexSharov
approved these changes
Sep 25, 2025
sudeepdino008
approved these changes
Sep 26, 2025
485a1f9 to
14dcb60
Compare
df69df4 to
650e487
Compare
NazariiDenha
pushed a commit
that referenced
this pull request
Oct 31, 2025
Parallel Transaction Implementation (using BSTM validation)
Co-authored-by: dvovk, mholt-dv, sudeepdino008
NazariiDenha
added a commit
that referenced
this pull request
Oct 31, 2025
This reverts commit fe0b2c0.
Parallel Transaction Implementation (using BSTM validation)
This PR contains a working implementation of Block-STM based parallel execution, which has been tested on Ethereum and Polygon mainnets. The current version achieves 2-3x parallel execution with a re-execution rate of between 20% and 30% across the 1-2 million blocks synced using it.
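Below is the processing model used for execution:
<img width="943" height="453" alt="Screenshot 2025-08-31 at 18 25 50" src="https://github.com/user-attachments/assets/700d5e90-f42a-494a-8712-65c526055286" />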
Transaction execution tasks run independently and are validated in order when their results are returned to the scheduler task. Finalization applies gas calculation, fee distribution and receipt generation as a final post-execution step. After finalization, transaction and block results are sent in order to the main Exec3 worker thread, where they are applied to the underlying shared domain before commitment calculations are performed, either in batch mode or on a block-by-block basis.
Block-by-block commitment calculations are used mainly for debugging, as they can be associated with a specific block for further examination.
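The execute-then-validate-in-order flow described above can be illustrated with a small sketch. This is not Erigon's implementation (which is written in Go and runs execution tasks on worker threads); it is a simplified, single-threaded Python model under stated assumptions: state is a flat key/value map, each transaction records its read set, and a transaction is re-executed when an earlier transaction in the block changed something it read. The names `run_tx`, `execute_block` and `transfer` are hypothetical.

```python
from typing import Callable, Dict, List, Tuple

State = Dict[str, int]
# A transaction reads state through the `read` callback and returns its writes.
Tx = Callable[[Callable[[str], int]], Dict[str, int]]


def run_tx(tx: Tx, state: State) -> Tuple[Dict[str, int], Dict[str, int]]:
    """Execute one transaction against `state`, recording every value it reads."""
    reads: Dict[str, int] = {}

    def read(key: str) -> int:
        value = state.get(key, 0)
        reads[key] = value
        return value

    writes = tx(read)
    return reads, writes


def execute_block(txs: List[Tx], base: State) -> Tuple[State, int]:
    """Run all txs optimistically, then validate and apply them in order.

    Returns the final state and the number of re-executions.
    """
    # Phase 1: every tx runs against the same pre-block snapshot.
    # (Sequential here; a real scheduler would hand these to workers.)
    speculative = [run_tx(tx, base) for tx in txs]

    # Phase 2: in-order validation against the state built so far.
    state = dict(base)
    reexecuted = 0
    for tx, (reads, writes) in zip(txs, speculative):
        stale = any(state.get(k, 0) != v for k, v in reads.items())
        if stale:
            # An earlier tx changed a value this tx read: run it again.
            reads, writes = run_tx(tx, state)
            reexecuted += 1
        state.update(writes)
    return state, reexecuted


def transfer(src: str, dst: str, amount: int) -> Tx:
    """Build a toy transaction that moves `amount` from `src` to `dst`."""
    def tx(read: Callable[[str], int]) -> Dict[str, int]:
        return {src: read(src) - amount, dst: read(dst) + amount}
    return tx


if __name__ == "__main__":
    block = [transfer("a", "b", 10), transfer("c", "d", 5), transfer("b", "c", 3)]
    print(execute_block(block, {"a": 100, "b": 0, "c": 50, "d": 0}))
```

In this toy block the first two transfers touch disjoint accounts and validate cleanly, while the third reads balances the earlier transfers changed and is re-executed once. A production Block-STM scheduler uses multi-version memory and concurrent validation rather than this simple replay loop.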
Parallel execution is currently experimental and is enabled by setting the following environment variables prior to running Erigon:
ERIGON_EXEC3_PARALLEL: set to 'true' to enable parallel execution
ERIGON_EXEC3_WORKERS: set to the number of worker threads used to process transaction tasks
In addition to these environment variables, the following variable can be set to 'true' on Bor networks to read the dependency lists from Bor mainnet, which causes the parallel scheduler to set the dependencies of execution tasks. While this should theoretically increase parallel throughput, experimentation with the Erigon implementation shows that it actually slows down execution, so it is set to false by default even if parallel execution is enabled. It is likely that the dependency analysis in Bor is overly conservative, but this requires further analysis.
ERIGON_USE_TX_DEPENDENCIES: set to 'true' to enable reading of header dependencies for Bor-based chains.
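As a rough illustration of how these three variables fit together, the following Python helper builds the process environment for a parallel-execution run. The helper names `parallel_exec_env` and `launch_erigon` are hypothetical, and the sketch assumes the `erigon` binary is on `PATH`; the command-line arguments are whatever you already pass to Erigon and are deliberately left to the caller.

```python
import os
import subprocess
from typing import Dict, List, Mapping, Optional


def parallel_exec_env(
    workers: int,
    use_tx_dependencies: bool = False,
    base: Optional[Mapping[str, str]] = None,
) -> Dict[str, str]:
    """Return a copy of the environment with parallel execution enabled."""
    if workers < 1:
        raise ValueError("workers must be at least 1")
    env = dict(os.environ if base is None else base)
    env["ERIGON_EXEC3_PARALLEL"] = "true"
    env["ERIGON_EXEC3_WORKERS"] = str(workers)
    if use_tx_dependencies:
        # Off by default: the PR reports that it slows execution down.
        env["ERIGON_USE_TX_DEPENDENCIES"] = "true"
    return env


def launch_erigon(args: List[str], env: Mapping[str, str]) -> subprocess.Popen:
    """Start Erigon with the given arguments (assumes `erigon` is on PATH)."""
    return subprocess.Popen(["erigon", *args], env=dict(env))
```

Keeping the environment construction separate from the launch makes it easy to inspect or log the exact settings before starting a long sync.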
To aid debugging of state root, receipt root, logging and gas discrepancies, the following environment variables can be used to induce console tracing of various aspects of the execution process. This output is necessarily verbose, so it is only intended to be enabled for specific debug runs, and it is written to the process stdout. It is not intended to be used for general logging; block and transaction information is prepended to every line to facilitate the subsequent use of grep or similar command-line tools to filter the output.
The following variables control when tracing occurs. They limit output to specific accounts, blocks and transactions, which helps avoid overly verbose output on long runs, especially when detailed tracing is enabled.
ERIGON_TRACE_ACCOUNTS: A comma-separated list of accounts (case insensitive) to be traced (all blocks)
ERIGON_TRACE_STATE_KEYS: A comma-separated list of state keys (case insensitive) to be traced (all blocks)
ERIGON_TRACE_BLOCKS: A comma-separated list of blocks to be traced (all transactions)
ERIGON_TRACE_TXINDEXES: A comma-separated list of tx indices within traced blocks to be traced
ERIGON_STOP_AFTER_BLOCK: Block number. Block processing will be stopped (and not committed) after this block
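One way to read the filter semantics above is sketched below in Python. This is an interpretation, not Erigon's Go code: the `TraceFilter` class and its methods are hypothetical, and the sketch assumes that an empty ERIGON_TRACE_TXINDEXES means "all transactions in the traced blocks" and that account matching is done on lower-cased strings.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Mapping


def _csv(value: str) -> List[str]:
    """Split a comma-separated value, dropping blanks and surrounding spaces."""
    return [item.strip() for item in value.split(",") if item.strip()]


@dataclass(frozen=True)
class TraceFilter:
    accounts: FrozenSet[str]
    blocks: FrozenSet[int]
    tx_indexes: FrozenSet[int]

    @classmethod
    def from_env(cls, env: Mapping[str, str]) -> "TraceFilter":
        return cls(
            accounts=frozenset(
                a.lower() for a in _csv(env.get("ERIGON_TRACE_ACCOUNTS", ""))
            ),
            blocks=frozenset(
                int(b) for b in _csv(env.get("ERIGON_TRACE_BLOCKS", ""))
            ),
            tx_indexes=frozenset(
                int(i) for i in _csv(env.get("ERIGON_TRACE_TXINDEXES", ""))
            ),
        )

    def traces_account(self, address: str) -> bool:
        """Accounts are matched case-insensitively, in every block."""
        return address.lower() in self.accounts

    def traces_tx(self, block: int, tx_index: int) -> bool:
        """A tx is traced if its block is listed and, when tx indexes are
        given, its index is listed too (assumption: empty means all)."""
        if block not in self.blocks:
            return False
        return not self.tx_indexes or tx_index in self.tx_indexes
```

For example, with ERIGON_TRACE_BLOCKS set to `100,101` and ERIGON_TRACE_TXINDEXES set to `3`, this model traces only transaction index 3 in blocks 100 and 101.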
The following variables control what is traced when tracing is enabled. Typically, IO tracing traces operations that happen on IntraBlockState transitions. It is relatively concise, although all reads and writes for all transactions of large blocks can still produce a significant amount of output. Instruction tracing produces a trace of EVM instructions, which is very verbose but makes it possible to see the root cause of logical errors induced by unexpected reads or instruction outputs.
ERIGON_TRACE_TRANSACTION_IO: Provide trace output for transaction read and write actions carried out by the processing of intra-block state activity. This includes all reads and writes plus other significant activity such as account creation and deletion.
ERIGON_TRACE_INSTRUCTIONS: Provide an instruction-level trace of EVM activity. This includes the program counter, instruction names, arguments and gas costs; it may also include return values. Arguments and return values are present for the significant instructions, but coverage is incomplete. Missing data can be added by enhancing the instruction stringers that were introduced as part of this work.
ERIGON_TRACE_LOGS: Trace log production, including eventual logs in receipts.
ERIGON_TRACE_GAS: Trace various aspects of gas and fee calculations. Useful when debugging gas discrepancies.
ERIGON_TRACE_DYNAMIC_GAS: This provides additional information about how dynamic gas pricing is produced. It may not be entirely accurate but is enough for tracing purposes when investigating mismatches. It may require additional enhancement.
ERIGON_TRACE_APPLY: This traces the output of the shared domain apply function. It is useful for observing causes of root mismatches, as it shows the key updates that will be applied during root calculation. In practice this has been used for debugging the parallel framework and may not be completely useful as is.
The following variable can be used to force block-level root calculations to happen even when the execution process is in sync mode. In this mode block-level roots are always calculated, and when a root mismatch occurs the process dumps the applied keys for the block along with all block IO activity. This is useful for determining which key is missing from the calculation. These missing keys can then be used with the environment variables above to investigate the break.
BATCH_COMMITMENTS: when set to 'true' state root calculations happen for each block irrespective of the exec operating mode.
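The value of calculating a root per block, rather than once per batch, is that a mismatch can be pinned to a single block and its applied keys. The Python sketch below illustrates that idea only; it is not how Erigon computes commitments. `toy_root` is a hypothetical stand-in (a SHA-256 over sorted key/value pairs) for the real state-root calculation, and `first_mismatch` is a hypothetical helper.

```python
import hashlib
from typing import Dict, List, Optional, Tuple

Updates = Dict[str, str]


def toy_root(state: Dict[str, str]) -> str:
    """Toy stand-in for a state root: hash of the sorted key/value pairs."""
    h = hashlib.sha256()
    for key in sorted(state):
        h.update(f"{key}={state[key]};".encode())
    return h.hexdigest()


def first_mismatch(
    blocks: List[Tuple[int, Updates, str]],
    state: Dict[str, str],
) -> Optional[Tuple[int, List[str]]]:
    """Apply each block's updates and check its expected root.

    `blocks` holds (block number, applied updates, expected root) tuples.
    Returns the first mismatching block number with the keys applied in
    that block, or None if every per-block root matches.
    """
    state = dict(state)  # work on a copy so the caller's state is untouched
    for number, updates, expected_root in blocks:
        state.update(updates)
        if toy_root(state) != expected_root:
            return number, sorted(updates)
    return None
```

With a single batch-level check, only the final root could be compared, so the offending block would still have to be found by other means. Checking per block narrows the search to one block, and comparing the dumped keys with the expected ones shows which update is missing.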
Note: To be merged after 3.2 release branch cut.