@mh0lt mh0lt commented Aug 31, 2025

Parallel Transaction Implementation (using BSTM validation)

This PR contains a working implementation of Block-STM-based parallel execution, which has been tested on the Ethereum and Polygon mainnets. The current version achieves 2-3x parallel execution with a re-execution rate of between 20% and 30% across the 1-2 million blocks synced using it.

Below is the processing model used for execution:

[Image: processing model diagram, "Screenshot 2025-08-31 at 18 25 50"]

Transaction execution tasks run independently and are validated in order when results are returned to the scheduler task. Finalization applies gas calculation, fee distribution and receipt generation as a final post-execution step. After finalization, transaction and block results are sent in order to the main Exec3 worker thread, where they are applied to the underlying shared domain before commitment calculations are performed, either in batch mode or on a block-by-block basis.
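
The execute-then-validate-in-order flow described above can be sketched roughly as follows. This is a deliberately simplified, hypothetical model and not the code in this PR: `Tx`, `RunBlock`, `execute`, `valid` and `transfer` are names invented for illustration, state is a plain `map[string]int`, and all speculative runs read the same base state rather than a multi-version store. Finalization (gas, fees, receipts) is left out.

```go
package main

import (
	"fmt"
	"sync"
)

// Tx is a hypothetical transaction: it reads state through get and returns its writes.
type Tx func(get func(key string) int) map[string]int

// result records what one execution observed (reads) and produced (writes).
type result struct {
	reads  map[string]int
	writes map[string]int
}

// execute runs tx against state and records every value it reads.
func execute(tx Tx, state map[string]int) result {
	reads := make(map[string]int)
	get := func(key string) int {
		v := state[key]
		reads[key] = v
		return v
	}
	return result{reads: reads, writes: tx(get)}
}

// valid reports whether every recorded read still matches the given state.
func valid(r result, state map[string]int) bool {
	for k, v := range r.reads {
		if state[k] != v {
			return false
		}
	}
	return true
}

// RunBlock executes all txs speculatively on a worker pool, then validates and
// applies them in transaction order, re-executing any tx whose reads went stale.
// It returns the final state and the number of re-executions.
func RunBlock(base map[string]int, txs []Tx, workers int) (map[string]int, int) {
	if workers < 1 {
		workers = 1
	}
	results := make([]result, len(txs))
	jobs := make(chan int)
	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for i := range jobs {
				results[i] = execute(txs[i], base) // base is only read here
			}
		}()
	}
	for i := range txs {
		jobs <- i
	}
	close(jobs)
	wg.Wait()

	state := make(map[string]int, len(base))
	for k, v := range base {
		state[k] = v
	}
	reexecuted := 0
	for i := range txs { // in-order validation, as done by the scheduler
		if !valid(results[i], state) {
			results[i] = execute(txs[i], state)
			reexecuted++
		}
		for k, v := range results[i].writes {
			state[k] = v
		}
	}
	return state, reexecuted
}

// transfer builds a toy transaction that moves amount between two keys.
func transfer(from, to string, amount int) Tx {
	return func(get func(string) int) map[string]int {
		return map[string]int{from: get(from) - amount, to: get(to) + amount}
	}
}

func main() {
	base := map[string]int{"a": 100, "d": 50}
	txs := []Tx{transfer("a", "b", 10), transfer("b", "c", 5), transfer("d", "e", 1)}
	state, reexecuted := RunBlock(base, txs, 4)
	fmt.Println(state, reexecuted)
}
```

In this toy block only the second transfer reads a key written by an earlier transaction, so it is the only one that fails validation and is re-executed; the other two keep their speculative results.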

Block-level commitment calculations are used mainly for debugging, as they can be associated with a specific block for further examination.

Parallel execution is currently experimental and is enabled by setting the following environment variables prior to running erigon:

ERIGON_EXEC3_PARALLEL: set to 'true' to enable parallel execution
ERIGON_EXEC3_WORKERS: set to the number of worker threads used to process transaction tasks

In addition to these environment variables, the following variable can be set to true on bor networks to read dependency lists from bor mainnet, which the parallel scheduler then uses to set the dependencies of execution tasks. While this should theoretically increase parallel throughput, experimentation with the erigon implementation shows that it actually slows down execution, so it is set to false by default even if parallel execution is enabled. It is likely that the dependency analysis in bor is overly conservative, but this requires further analysis.

ERIGON_USE_TX_DEPENDENCIES: set to 'true' to enable reading of header dependencies for bor-based chains.

To aid debugging of state root, receipt root, logging and gas discrepancies, the following environment variables can be used to enable console tracing of various aspects of the execution process. This tracing is necessarily verbose, so it is only intended to be enabled for specific debug runs, and it is written to the process stdout. It is not intended to be used for general logging; it prepends block and transaction information to every line so that grep or similar command-line tools can be used to filter the output.

The following variables control when tracing occurs. They limit output to specific accounts, blocks and transactions, which helps avoid overly verbose output on long runs, especially when detailed tracing is enabled.

ERIGON_TRACE_ACCOUNTS: A comma-separated list of accounts (case-insensitive) to be traced (all blocks)
ERIGON_TRACE_STATE_KEYS: A comma-separated list of state keys (case-insensitive) to be traced (all blocks)
ERIGON_TRACE_BLOCKS: A comma-separated list of blocks to be traced (all transactions)
ERIGON_TRACE_TXINDEXES: A comma-separated list of tx indices within traced blocks to be traced
ERIGON_STOP_AFTER_BLOCK: Block number. Block processing will be stopped (and not committed) after this block
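
To make the filter semantics concrete, the sketch below parses three of these lists and answers "should this be traced?". It is a hypothetical illustration, not the code in this PR: `TraceFilter`, `NewTraceFilter`, `TraceAccount` and `TraceTx` are invented names, and treating an empty ERIGON_TRACE_TXINDEXES as "all transactions in a traced block" is an assumption drawn from the descriptions above.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
	"strings"
)

// TraceFilter holds the parsed filter lists (hypothetical type for illustration).
type TraceFilter struct {
	accounts  map[string]bool
	blocks    map[uint64]bool
	txIndexes map[int]bool
}

// splitList splits a comma-separated value, trims blanks and lower-cases items.
func splitList(s string) []string {
	var out []string
	for _, item := range strings.Split(s, ",") {
		if item = strings.ToLower(strings.TrimSpace(item)); item != "" {
			out = append(out, item)
		}
	}
	return out
}

// NewTraceFilter builds a filter using getenv as the variable lookup.
// Entries that do not parse as numbers are silently skipped.
func NewTraceFilter(getenv func(string) string) TraceFilter {
	f := TraceFilter{
		accounts:  map[string]bool{},
		blocks:    map[uint64]bool{},
		txIndexes: map[int]bool{},
	}
	for _, a := range splitList(getenv("ERIGON_TRACE_ACCOUNTS")) {
		f.accounts[a] = true
	}
	for _, b := range splitList(getenv("ERIGON_TRACE_BLOCKS")) {
		if n, err := strconv.ParseUint(b, 10, 64); err == nil {
			f.blocks[n] = true
		}
	}
	for _, t := range splitList(getenv("ERIGON_TRACE_TXINDEXES")) {
		if n, err := strconv.Atoi(t); err == nil {
			f.txIndexes[n] = true
		}
	}
	return f
}

// TraceAccount matches accounts case-insensitively, in any block.
func (f TraceFilter) TraceAccount(addr string) bool {
	return f.accounts[strings.ToLower(addr)]
}

// TraceTx reports whether a transaction in a block should be traced.
// Assumption: with no tx indices configured, every tx of a traced block matches.
func (f TraceFilter) TraceTx(block uint64, txIndex int) bool {
	if !f.blocks[block] {
		return false
	}
	return len(f.txIndexes) == 0 || f.txIndexes[txIndex]
}

func main() {
	f := NewTraceFilter(os.Getenv)
	fmt.Println(f.TraceTx(100, 0), f.TraceAccount("0xABC"))
}
```

Passing the lookup function in, rather than calling `os.Getenv` directly, keeps the parsing testable without touching the process environment.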

The following variables control what is traced when tracing is enabled. IO tracing typically traces operations that happen on IntraBlockState transitions. It is relatively concise, although all reads and writes for all transactions of large blocks can still produce a significant amount of output. Instruction tracing produces a trace of EVM instructions, which is very verbose but makes it possible to see the root cause of logical errors induced by unexpected reads or instruction outputs.

ERIGON_TRACE_TRANSACTION_IO: Provide trace output for transaction read and write actions carried out while processing intra-block state activity. This includes all reads and writes plus other significant activity such as account creation and deletion.
ERIGON_TRACE_INSTRUCTIONS: Provide an instruction-level trace of EVM activity. This includes the program counter, instruction names, arguments and gas costs; it may also include return values. Arguments and return values are present for all significant instructions, but coverage is incomplete. Missing data can be added by enhancing the instruction stringers that were introduced as part of this work.
ERIGON_TRACE_LOGS: Trace log production, including eventual logs in receipts.
ERIGON_TRACE_GAS: Trace various aspects of gas and fee calculations. Useful when debugging gas discrepancies.
ERIGON_TRACE_DYNAMIC_GAS: This provides additional information about how dynamic gas pricing is produced. It may not be entirely accurate, but it is enough for tracing purposes when investigating mismatches. It may require additional enhancement.
ERIGON_TRACE_APPLY: This traces the output of the shared domain apply function. It is useful for observing causes of root mismatches, as it shows the updated keys which will be applied during root calculation. In practice this has been used for debugging the parallel framework and may not be completely useful as is.

The following variable can be used to force block-level root calculations to happen even when the execution process is in sync mode. In this mode block-level roots are always calculated, and when a root mismatch occurs the process dumps the applied keys for the block along with all block IO activity. This is useful for determining which key is missing from the calculation. The missing keys can then be used with the environment variables above to investigate the break.

BATCH_COMMITMENTS: when set to 'true', state root calculations happen for each block irrespective of the exec operating mode.
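
The debugging value of per-block roots can be shown with a toy model. Everything here is hypothetical and unrelated to Erigon's real commitment code: `Block`, `toyRoot` and `FirstMismatch` are invented names, and the "root" is just a SHA-256 hash over sorted key/value pairs. The point is only that a per-block check pins a mismatch to the block that caused it, whereas a batch check can only report the end of the batch.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"sort"
)

// Block is a hypothetical block: the keys it updates and the root it expects.
type Block struct {
	Number   uint64
	Updates  map[string]string
	WantRoot string
}

// toyRoot is a stand-in for a real state commitment: a hash of sorted key/value pairs.
func toyRoot(state map[string]string) string {
	keys := make([]string, 0, len(state))
	for k := range state {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	h := sha256.New()
	for _, k := range keys {
		h.Write([]byte(k))
		h.Write([]byte{0})
		h.Write([]byte(state[k]))
		h.Write([]byte{0})
	}
	return hex.EncodeToString(h.Sum(nil))
}

// FirstMismatch applies blocks in order. With perBlock set it checks the root
// after every block; otherwise it checks once, after the last block of the batch.
// It returns the number of the block at which a mismatch was detected.
func FirstMismatch(blocks []Block, perBlock bool) (uint64, bool) {
	state := map[string]string{}
	for i, b := range blocks {
		for k, v := range b.Updates {
			state[k] = v
		}
		last := i == len(blocks)-1
		if (perBlock || last) && toyRoot(state) != b.WantRoot {
			return b.Number, true
		}
	}
	return 0, false
}

func main() {
	// Build three blocks with correct roots, then drop block 2's update
	// to simulate a key that went missing from the calculation.
	updates := []map[string]string{{"a": "1"}, {"b": "2"}, {"c": "3"}}
	state := map[string]string{}
	blocks := make([]Block, len(updates))
	for i, u := range updates {
		for k, v := range u {
			state[k] = v
		}
		blocks[i] = Block{Number: uint64(i + 1), Updates: u, WantRoot: toyRoot(state)}
	}
	blocks[1].Updates = map[string]string{}

	perBlockAt, _ := FirstMismatch(blocks, true)
	batchAt, _ := FirstMismatch(blocks, false)
	fmt.Println("per-block check fails at block", perBlockAt)
	fmt.Println("batch check fails at block", batchAt)
}
```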

Note: To be merged after 3.2 release branch cut.

@sudeepdino008 sudeepdino008 enabled auto-merge (squash) September 30, 2025 08:38
@sudeepdino008 sudeepdino008 merged commit ec7e6d3 into main Sep 30, 2025
17 checks passed
@sudeepdino008 sudeepdino008 deleted the async_tx branch September 30, 2025 09:03
NazariiDenha pushed a commit that referenced this pull request Oct 31, 2025
Co-authored-by: dvovk <[email protected]>
Co-authored-by: mholt-dv <[email protected]>
Co-authored-by: sudeepdino008 <[email protected]>
NazariiDenha added a commit that referenced this pull request Oct 31, 2025