
Conversation

Contributor

@uri-99 uri-99 commented Dec 18, 2024

Important

This PR was rebased here: #1664

Fix batcher queue ord

Description

There was a bug in the ordering of elements in the batcher queue, which led to wrong placement of proofs when the same sender sent proofs with the same max_fee.
The consequence of this bug appeared when the batcher queue was full: the proofs sent to the chain were ordered not from lowest to highest nonce, but the other way around.
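The effect can be illustrated with a minimal sketch (hypothetical `Entry` and `buggy_finalize`, not the actual batcher types): with the old tie-break, entries from the same sender with equal max_fee come out of the reversed queue with descending nonces.

```rust
// Hypothetical sketch of the bug; `Entry` and `buggy_finalize` are not the
// real batcher types, just an illustration of the ordering.
#[derive(Debug, Clone, PartialEq, Eq)]
struct Entry {
    max_fee: u64,
    nonce: u64,
}

// The queue keeps the entries it would discard first at the front:
// ascending max_fee. The buggy tie-break also used ascending nonce,
// so reversing the queue into the finalized batch emitted high nonces first.
fn buggy_finalize(mut queue: Vec<Entry>) -> Vec<Entry> {
    queue.sort_by(|a, b| a.max_fee.cmp(&b.max_fee).then(a.nonce.cmp(&b.nonce)));
    queue.into_iter().rev().collect()
}

fn main() {
    // Same sender, same max_fee, nonces 0, 1, 2.
    let batch: Vec<Entry> =
        buggy_finalize((0..3).map(|n| Entry { max_fee: 10, nonce: n }).collect());
    for e in &batch {
        println!("nonce: {}", e.nonce); // prints 2, 1, 0: highest nonce first
    }
}
```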

To Test

You can view the unit tests.

You can also add the following print statements (this can be done before this PR, so you can verify the bug indeed existed).

In batcher/aligned-batcher/src/lib.rs, line 1175:

        info!("resulting:");
        for (entry, _priority) in resulting_batch_queue.iter() {
            info!(
                "nonce: {:?}, max fee: {:?}",
                entry.nonced_verification_data.nonce, entry.nonced_verification_data.max_fee
            );
        }

        info!("finalized:");
        for entry in finalized_batch.iter() {
            info!(
                "nonce: {:?}, max fee: {:?}",
                entry.nonced_verification_data.nonce, entry.nonced_verification_data.max_fee
            );
        }

In messaging.rs, line 161:

        info!("Last proof nonce: {:?}", last_proof_nonce);
        info!("Current proof nonce: {:?}", batch_inclusion_data_message.user_nonce);

This will help you view the resulting state of the batcher queue.

To trigger the bug, you should send a burst bigger than the batch_qty limit. For this, it is recommended to lower the limit in config-batcher.yaml:

  max_batch_proof_qty: 5 # 5 proofs in a batch, for testing

Then send a burst of size 8, so that the first batch has size 5 and the second has size 3. (This ensures you don't hit the "batch already submitted" contract revert.)

For this you can set BURST_SIZE ?= 8 in the Makefile.

Type of change

  • New feature
  • Bug fix
  • Optimization
  • Refactor

Checklist

  • “Hotfix” to testnet, everything else to staging
  • Linked to Github Issue
  • This change depends on code or research by an external entity
    • Acknowledgements were updated to give credit
  • Unit tests added
  • This change requires new documentation.
    • Documentation has been added/updated.
  • This change is an Optimization
    • Benchmarks added/run
  • Has a known issue
  • If your PR changes the Operator compatibility (Ex: Upgrade prover versions)
    • This PR adds compatibility for operators on both versions and does not change batcher/docs/examples
    • This PR updates the batcher and docs/examples to the newer version. This requires that operators are already updated to be compatible

Member

@MarcosNicolau MarcosNicolau left a comment


Pretty edgy situation... nice catch. Working on my machine.

) -> Result<(), BatcherError> {
for (vd_batch_idx, entry) in finalized_batch.iter().enumerate() {
// iter in reverse because each sender wants to receive responses in ascending nonce order
// and finalized_batch is ordered as the PriorityQueue , low max_nonce first && high nonce first.
Member

Suggested change
// and finalized_batch is ordered as the PriorityQueue , low max_nonce first && high nonce first.
// and finalized_batch is ordered as the PriorityQueue , low max_nonce first && high nonce last.

Contributor Author


The PriorityQueue is ordered by which elements we discard first.
So we first discard proofs with low max_fee and high nonce, because we want the high max_fee and LOW nonce proofs to go in first.

Contributor

@JulianVentura JulianVentura left a comment


Code looks good and works on my machine

) -> Result<(), BatcherError> {
for (vd_batch_idx, entry) in finalized_batch.iter().enumerate() {
// iter in reverse because each sender wants to receive responses in ascending nonce order
// and finalized_batch is ordered as the PriorityQueue , low max_nonce first && high nonce first.
Contributor


nit:
When we reverse this vec, we are also reversing the order of the proofs to low max_fee first and higher max_fee last.
This shouldn't be a problem, since the sender waits for the arrival of the last nonce and doesn't care about the fees.
I think we should add a comment clarifying that.

Contributor Author


The priority queue orders first by max_fee.
This means the PriorityQueue has ascending max_fee (we first eliminate those with low max_fee), and when max_fee is equal, descending nonce (we first eliminate those with high nonce).

This means priorityQueue.rev() has descending max_fee, and when max_fee is equal, ascending nonce.

This is the desired behaviour: the highest max_fee goes first, and proofs with equal fees go in ascending nonce order.
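That ordering can be sketched with hypothetical types (not the real PriorityQueue or batcher entries): sort ascending by max_fee with descending nonce on ties, then reverse.

```rust
// Hypothetical sketch; `Entry` and `finalize` stand in for the real types.
#[derive(Debug, Clone, PartialEq, Eq)]
struct Entry {
    max_fee: u64,
    nonce: u64,
}

// Queue order: ascending max_fee (low fees are discarded first); on equal
// max_fee, descending nonce (high nonces are discarded first). Reversing
// yields descending max_fee and, on ties, ascending nonce.
fn finalize(mut queue: Vec<Entry>) -> Vec<Entry> {
    queue.sort_by(|a, b| a.max_fee.cmp(&b.max_fee).then(b.nonce.cmp(&a.nonce)));
    queue.into_iter().rev().collect()
}

fn main() {
    let batch = finalize(vec![
        Entry { max_fee: 10, nonce: 2 },
        Entry { max_fee: 10, nonce: 1 },
        Entry { max_fee: 5, nonce: 3 },
    ]);
    // Highest max_fee first; equal fees in ascending nonce order:
    // (10, 1), (10, 2), (5, 3)
    for e in &batch {
        println!("max_fee: {}, nonce: {}", e.max_fee, e.nonce);
    }
}
```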

@uri-99 uri-99 marked this pull request as draft December 19, 2024 22:10
@uri-99 uri-99 mentioned this pull request Dec 19, 2024
17 tasks
@uri-99 uri-99 closed this Dec 19, 2024
@uri-99 uri-99 deleted the fix-batcher-queue-ord branch December 20, 2024 00:03