perf(engine): remove serde(flatten) from execution payload types #3713

Merged
mattsse merged 1 commit into main from
mattsse/perf-remove-serde-flatten-payload
Feb 17, 2026
@gakonst gakonst commented Feb 17, 2026

Summary

Remove serde(flatten) from execution payload deserialization to eliminate the intermediate buffering that serde performs for flattened fields (its private Content representation).

Motivation

serde(flatten) forces deserialization through an intermediate buffer of serde's private Content values (an owned tree comparable to a Map<String, Value>), so every flattened field is allocated once in the buffer and again in its final type. For ExecutionPayloadV3, the flatten chain spanned three structs (V3 → V2 → V1, two flatten levels), meaning every inherited field, including the expensive transactions: Vec<Bytes> and the Bloom, was parsed into Content first and then re-deserialized into its actual type.

This is on the hot path for engine_newPayloadV3.

What serde(flatten) actually expands to

For ExecutionPayloadV3, which has only 2 fields of its own (blobGasUsed, excessBlobGas) plus a flattened ExecutionPayloadV2, the derived Deserialize expands to ~470 lines (via cargo expand). The critical overhead is in the visit_map implementation:

fn visit_map<__A>(self, mut __map: __A) -> Result<Self::Value, __A::Error>
where __A: serde::de::MapAccess<'de>
{
    let mut __field1: Option<u64> = None; // blobGasUsed
    let mut __field2: Option<u64> = None; // excessBlobGas

    // This is the expensive part: ALL unrecognized keys + values
    // get buffered into a Vec of Content pairs
    let mut __collect = Vec::<
        Option<(
            serde::__private::de::Content,  // key
            serde::__private::de::Content,  // value (!)
        )>,
    >::new();

    while let Some(__key) = __map.next_key::<__Field>()? {
        match __key {
            __Field::__field1 => { /* deserialize blobGasUsed */ }
            __Field::__field2 => { /* deserialize excessBlobGas */ }

            // Every other field (all 14 V1 fields + withdrawals)
            // gets captured as Content — meaning the full JSON value
            // is parsed into an owned enum tree:
            __Field::__other(__name) => {
                __collect.push(Some((
                    __name,
                    __map.next_value_seed(ContentVisitor::new())?,
                    //                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
                    // This allocates: every Bytes tx blob, the Bloom,
                    // all B256 hashes — all become Content::String(String)
                )));
            }
        }
    }

    // Then the collected Content pairs are re-deserialized through
    // FlatMapDeserializer, which reconstructs ExecutionPayloadV2
    // from the buffered data — causing a SECOND deserialization pass
    let __field0: ExecutionPayloadV2 = Deserialize::deserialize(
        FlatMapDeserializer(&mut __collect, PhantomData),
        // ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
        // This re-parses every Content value back into its real type,
        // doubling allocations for String -> B256, String -> Bloom, etc.
    )?;

    // And ExecutionPayloadV2 does the SAME thing internally for V1,
    // creating a second Content buffer + FlatMapDeserializer pass
}

For a mainnet block with ~200 transactions, each transaction Bytes blob gets:

  1. Parsed from JSON string into Content::String(String) (alloc #1)
  2. Re-deserialized from Content::String into Bytes via FlatMapDeserializer (alloc #2)
  3. And at the V2→V1 flatten level, the same happens again (alloc #3)

The Bloom (512 hex chars) and all B256 fields suffer the same double-parse.
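
The cost described above can be illustrated with a small std-only toy model. This is only a sketch: `Buffered`, `decode_hex`, `decode_direct` and `decode_buffered` are hypothetical names, `Buffered` is not serde's real Content type, and the model assumes (following the description above) that each flatten level copies every value into a fresh owned buffer before the final decode.

```rust
/// Toy stand-in for an owned, buffered JSON string value.
/// This is NOT serde's real `Content` type, only a model of it.
#[derive(Clone, Debug)]
struct Buffered(String);

/// Decode a `0x`-prefixed hex string into bytes (allocates the output Vec once).
fn decode_hex(s: &str) -> Option<Vec<u8>> {
    let digits = s.strip_prefix("0x")?;
    if digits.len() % 2 != 0 || !digits.bytes().all(|b| b.is_ascii_hexdigit()) {
        return None;
    }
    (0..digits.len())
        .step_by(2)
        .map(|i| u8::from_str_radix(&digits[i..i + 2], 16).ok())
        .collect()
}

/// Direct path: each raw string is decoded straight into its final type.
fn decode_direct(raw: &[&str]) -> Option<Vec<Vec<u8>>> {
    raw.iter().map(|s| decode_hex(s)).collect()
}

/// Buffered path: every value is copied into an owned buffer once per
/// flatten level before the final decode. Returns the decoded values and
/// the number of intermediate string copies that were made.
fn decode_buffered(raw: &[&str], flatten_levels: usize) -> Option<(Vec<Vec<u8>>, usize)> {
    if flatten_levels == 0 {
        // No flatten level means no buffering at all.
        return decode_direct(raw).map(|decoded| (decoded, 0));
    }
    let mut copies = 0usize;
    // Outermost level: buffer every raw value as an owned string.
    let mut buffer: Vec<Buffered> = Vec::with_capacity(raw.len());
    for s in raw {
        buffer.push(Buffered((*s).to_owned()));
        copies += 1;
    }
    // Each further flatten level re-buffers what it received.
    for _ in 1..flatten_levels {
        let mut next = Vec::with_capacity(buffer.len());
        for value in &buffer {
            next.push(value.clone());
            copies += 1;
        }
        buffer = next;
    }
    // Only now are the values decoded into their final type.
    let decoded = buffer
        .iter()
        .map(|b| decode_hex(&b.0))
        .collect::<Option<Vec<_>>>()?;
    Some((decoded, copies))
}
```

In this model, calling `decode_buffered(&txs, 2)` on 200 transaction strings makes 400 intermediate string copies before any bytes are decoded, while `decode_direct(&txs)` makes none. Both return the same decoded values.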

After this PR

The manual Deserialize impl uses a flat helper struct:

impl<'de> Deserialize<'de> for ExecutionPayloadV3 {
    fn deserialize<D: Deserializer<'de>>(deserializer: D) -> Result<Self, D::Error> {
        #[derive(Deserialize)]
        #[serde(rename_all = "camelCase")]
        struct Helper {
            parent_hash: B256,
            fee_recipient: Address,
            // ... all fields flat, no nesting ...
            transactions: Vec<Bytes>,
            withdrawals: Vec<Withdrawal>,
            #[serde(with = "alloy_serde::quantity")]
            blob_gas_used: u64,
            #[serde(with = "alloy_serde::quantity")]
            excess_blob_gas: u64,
        }

        let helper = Helper::deserialize(deserializer)?;
        Ok(Self {
            payload_inner: ExecutionPayloadV2 {
                payload_inner: ExecutionPayloadV1 { /* fields from helper */ },
                withdrawals: helper.withdrawals,
            },
            blob_gas_used: helper.blob_gas_used,
            excess_blob_gas: helper.excess_blob_gas,
        })
    }
}

For the flat struct, serde derives an ordinary visit_map that matches each key as it arrives (in any order) and deserializes the value straight into its target type: no intermediate Content buffering, no FlatMapDeserializer, and a single deserialization pass.
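
The single-pass idea can be sketched without serde. The example below is a hand-rolled, std-only analogue, not the code serde generates: `InnerV1`, `OuterV3`, `parse_quantity` and `parse_flat` are hypothetical, cut-down names, and the key/value slice stands in for the stream of JSON map entries. Each value is parsed into its final type as soon as its key is seen, and the nested structs are assembled only at the end.

```rust
/// Hypothetical, cut-down stand-ins for the nested payload types.
#[derive(Debug, PartialEq)]
struct InnerV1 {
    block_number: u64,
    gas_used: u64,
}

#[derive(Debug, PartialEq)]
struct OuterV3 {
    payload_inner: InnerV1,
    blob_gas_used: u64,
}

/// Parse a `0x`-prefixed hex quantity such as "0x1a".
fn parse_quantity(s: &str) -> Result<u64, String> {
    let digits = s
        .strip_prefix("0x")
        .ok_or_else(|| format!("missing 0x prefix: {s}"))?;
    if digits.is_empty() || !digits.bytes().all(|b| b.is_ascii_hexdigit()) {
        return Err(format!("invalid hex quantity: {s}"));
    }
    u64::from_str_radix(digits, 16).map_err(|e| format!("invalid hex quantity {s}: {e}"))
}

/// Single pass over the key/value pairs: each value is parsed into its final
/// type as soon as its key is seen, then the nested structs are assembled.
fn parse_flat(pairs: &[(&str, &str)]) -> Result<OuterV3, String> {
    // Plays the role of the flat `Helper` struct: one slot per field, no nesting.
    let mut block_number: Option<u64> = None;
    let mut gas_used: Option<u64> = None;
    let mut blob_gas_used: Option<u64> = None;

    for &(key, value) in pairs {
        let slot = match key {
            "blockNumber" => &mut block_number,
            "gasUsed" => &mut gas_used,
            "blobGasUsed" => &mut blob_gas_used,
            _ => continue, // unknown keys are skipped, never buffered
        };
        if slot.replace(parse_quantity(value)?).is_some() {
            return Err(format!("duplicate field: {key}"));
        }
    }

    Ok(OuterV3 {
        payload_inner: InnerV1 {
            block_number: block_number.ok_or("missing field: blockNumber")?,
            gas_used: gas_used.ok_or("missing field: gasUsed")?,
        },
        blob_gas_used: blob_gas_used.ok_or("missing field: blobGasUsed")?,
    })
}
```

Key order does not matter in this sketch: `parse_flat(&[("blobGasUsed", "0x20000"), ("gasUsed", "0x5208"), ("blockNumber", "0x1a")])` yields the same `OuterV3` as any other ordering. Unknown keys are skipped here, which differs from a type marked deny_unknown_fields such as ExecutionPayloadInputV2.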

Changes

  • Replace derived Deserialize with manual impls using flat helper structs for:
    • ExecutionPayloadV2 (was 1 flatten level)
    • ExecutionPayloadV3 (was 2 flatten levels)
    • ExecutionPayloadV4 (was 3 flatten levels)
    • ExecutionPayloadInputV2 (was 1 flatten level + deny_unknown_fields)
    • ExecutionPayloadEnvelopeV4 (was 1 flatten level)
  • Keep derived Serialize with serde(flatten) — serialization does not have the same overhead
  • Add 7 new roundtrip/validation tests

Testing

cargo test -p alloy-rpc-types-engine --features serde — all 55 tests pass including existing hive test vectors.

Prompted by: mattsse

Replace derived Deserialize with manual impls using flat helper structs
for ExecutionPayloadV2, V3, V4, ExecutionPayloadInputV2, and
ExecutionPayloadEnvelopeV4.

serde(flatten) forces deserialization through an intermediate buffer
of serde's private Content values, causing double allocation of every
flattened field value (especially expensive for the transactions
Vec<Bytes> and Bloom). The nested flatten chain (V3 -> V2 -> V1)
stacked two such buffering levels.

The manual impls deserialize into a flat helper struct, for which
serde derives a direct single-pass deserializer, avoiding the
intermediate buffer entirely.

Amp-Thread-ID: https://ampcode.com/threads/T-019c6ccb-3758-70a5-bfc4-c087afe44303
Co-authored-by: Amp <[email protected]>
@github-project-automation github-project-automation bot moved this to Reviewed in Alloy Feb 17, 2026
@mattsse mattsse merged commit 05b2452 into main Feb 17, 2026
30 checks passed
@github-project-automation github-project-automation bot moved this from Reviewed to Done in Alloy Feb 17, 2026
@mattsse mattsse deleted the mattsse/perf-remove-serde-flatten-payload branch February 17, 2026 23:36