perf(proton): 3x faster bulk insert for CSV to Proton ingestion#64
Merged
Enable LZ4 compression for the proton-go-driver native protocol to reduce bytes on the wire. Minimal impact on localhost (~1s), but significant for remote connections through an LB/NodePort where bandwidth and latency matter.

Benchmark (localhost, 1.66M rows, batch_limit=50K):
  Before: 62s
  After:  61s (~0% on localhost; expected to help on WAN)

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
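Turning on LZ4 is a one-line option change at connection time. A minimal sketch, assuming the proton-go-driver mirrors clickhouse-go v2's connection options (the import path, default port, and field names here are assumptions and may differ by driver version):

```go
package main

import (
	"github.com/timeplus-io/proton-go-driver/v2"
)

// openConn opens a native-protocol connection with LZ4 enabled.
// Compression trades a little CPU for fewer bytes on the wire,
// which matters when traffic goes through an LB or NodePort.
func openConn() (proton.Conn, error) {
	return proton.Open(&proton.Options{
		Addr: []string{"localhost:8463"}, // assumed Proton native port
		Compression: &proton.Compression{
			Method: proton.CompressionLZ4,
		},
	})
}
```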
…t loop

Remove 37 g.Debug() calls that fired for every nil value in every row. g.Debug() calls runtime.Caller() for stack walking, plus string formatting and zerolog event creation, on each invocation. With the -d flag enabled, 10 nullable columns, and 1.66M rows, this generated millions of expensive debug calls that provided zero diagnostic value (nil is normal for nullable columns).

Benchmark (localhost, 1.66M rows, batch_limit=50K, cumulative):
  Baseline: 62s (41s user, 77% CPU)
  + LZ4:    61s
  + this:   28s (10s user, 40% CPU) ← 54% faster, biggest single win

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
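The pattern behind this fix can be sketched in a few lines. The names below (`debugf`, `appendRows`) are hypothetical stand-ins for the real code: instead of a logger call per nil cell, the hot loop only counts, and a single summary line is logged per batch:

```go
package main

import (
	"fmt"
	"runtime"
)

// debugf mimics a g.Debug()-style logger: runtime.Caller walks the
// stack on every invocation, which is what made per-value calls so
// expensive in the hot loop.
func debugf(format string, args ...any) {
	_, file, line, _ := runtime.Caller(1)
	fmt.Printf("%s:%d "+format+"\n", append([]any{file, line}, args...)...)
}

// appendRows is a hypothetical hot loop: nil is normal for nullable
// columns, so it counts nil cells instead of logging each one, then
// emits one summary debug line for the whole batch.
func appendRows(rows [][]any) int {
	nilCells := 0
	for _, row := range rows {
		for _, v := range row {
			if v == nil {
				nilCells++ // no per-value logging here
			}
		}
	}
	debugf("batch done: %d rows, %d nil cells", len(rows), nilCells)
	return nilCells
}
```

With 1.66M rows and 10 nullable columns, this turns millions of runtime.Caller() walks into one per batch.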
The ds.Context.Lock()/Unlock() around batched.Append(row...) was unnecessary:
- batched is a local object created in the operation closure
- batchRows is a local snapshot fully drained from the channel
- No concurrent goroutine accesses either during the append loop

This saved 1.66M mutex lock/unlock cycles.

Benchmark (localhost, 1.66M rows, batch_limit=50K, cumulative):
  Baseline:          62s (41.0s user)
  + LZ4:             61s
  + g.Debug removal: 28s (10.5s user)
  + this:            27s ( 9.8s user) ← ~1s improvement

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
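Why no lock is needed comes down to ownership: once rows are drained from the channel into a local slice, only the draining goroutine can touch them. A minimal sketch of that drain-then-own pattern (the `drainBatch` name is hypothetical):

```go
package main

// drainBatch takes a local snapshot of up to limit rows from the
// channel. The channel receive is the only synchronization point;
// after it, batchRows is owned exclusively by this goroutine, so the
// subsequent append loop needs no mutex.
func drainBatch(ch <-chan []any, limit int) [][]any {
	batchRows := make([][]any, 0, limit)
	for len(batchRows) < limit {
		row, ok := <-ch
		if !ok {
			break // channel closed: partial final batch
		}
		batchRows = append(batchRows, row)
	}
	return batchRows
}
```

Locking inside the append loop would serialize work that is already single-owner, paying one lock/unlock per row for nothing.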
…fake transactions
Three structural improvements:
1. Extract colClassification struct + classifyColumns() function:
Column type classification (the big switch mapping DbType → column index
lists) is now computed once in BulkImportStream and reused across all
batches, instead of being rebuilt inside every batch's retry closure.
2. Hoist ValidateColumnNames + GenerateInsertStatement to BulkImportStream:
These only change on schema change events, not per-batch. With
batch_limit=1000 and 1660 batches, this avoids 1659 redundant
recomputations per import.
3. Remove fake Begin/Commit per batch:
Proton does not support transactions (line 168 comment says so).
The driver's Begin() returns nil, Commit() returns nil. These were
pure ceremony adding method-chain overhead per batch.
The retry loop now only contains: PrepareBatch → Append rows → Send.
Benchmark (localhost, 1.66M rows, cumulative):
batch_limit=50K: 27s → 28s (no change, expected on localhost)
batch_limit=1000: 36s → 36s (no change on localhost; expected to help
on WAN where per-batch CPU overhead compounds with
network round-trip latency)
Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
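The hoisted classification (point 1 above) can be sketched as follows. The `colClassification` struct and `classifyColumns` names come from the commit; the field names and type-string matching are illustrative assumptions:

```go
package main

import "strings"

// colClassification records, once per import, which column indexes
// need which converter, so each batch's retry closure can reuse it.
type colClassification struct {
	arrayCols []int // columns needing array conversion
	mapCols   []int // columns needing map conversion
}

// classifyColumns runs the DbType switch once in BulkImportStream
// instead of rebuilding the index lists inside every batch.
func classifyColumns(dbTypes []string) colClassification {
	var c colClassification
	for i, t := range dbTypes {
		switch {
		case strings.HasPrefix(t, "array("):
			c.arrayCols = append(c.arrayCols, i)
		case strings.HasPrefix(t, "map("):
			c.mapCols = append(c.mapCols, i)
		}
	}
	return c
}
```

With batch_limit=1000 and 1660 batches, computing this once saves 1659 rebuilds; the per-batch loop shrinks to PrepareBatch → Append → Send.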
Security fixes since 1.24.11:
- go1.24.12: security fixes in crypto/tls, archive/zip, net/url, go cmd
- go1.24.13: security fixes in crypto/tls, crypto/x509, go cmd

Relevant for fintech deployments connecting through TLS-terminated load balancers.

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
go mod tidy with Go 1.24.13 removed the local replace directives for github.com/flarco/g and github.com/slingdata-io/sling. These are required for development against our local forks. Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
… inserts
Fix two pre-existing bugs found by Codex review:
P1 - Idempotent ID unstable across whole-import retries:
runProtonToProton retries by calling runFileToDB() from scratch,
which creates a new connection and re-enters BulkImportStream().
The old code generated idempotent_id from time.Now().UnixNano(),
so retried batches got new IDs and were inserted again.
Fix: use TaskExecution.ExecID (a KSUID generated once per task) as
the session component of the idempotent key. ExecID is set in
runFileToDB via conn.SetProp("exec_id", t.ExecID) and read in
processBatch. The resulting key is: exec_id + table + batch_number.
This is stable across retries (same ExecID) but different across
separate imports (new ExecID per task).
P2 - In-place row mutation breaks retry for array/map types:
batchRows is reused across retries. Type converters (e.g.,
convertToArrayString) mutate row values in-place from string to
[]string. On retry, the converter sees []string instead of string
and returns "expected string, got []string".
Fix: add idempotent type-check at the top of all 22 converters.
If the value is already the target type, return it as-is.
Both bugs are pre-existing (identical in v1.2.20-timeplus.4).
Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
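The converter-side fix (P2) follows one pattern across all 22 converters: check whether the value is already the target type before converting. A simplified sketch of `convertToArrayString` (the parsing here is a toy stand-in for the real conversion logic):

```go
package main

import (
	"fmt"
	"strings"
)

// convertToArrayString sketches the idempotent type-check fix: if a
// previous attempt already mutated this row value in-place to
// []string, return it unchanged instead of failing the retry with
// "expected string, got []string".
func convertToArrayString(v any) (any, error) {
	if done, ok := v.([]string); ok {
		return done, nil // already converted on a prior attempt
	}
	s, ok := v.(string)
	if !ok {
		return nil, fmt.Errorf("expected string, got %T", v)
	}
	// Toy parse of "[a,b]" into []string{"a","b"}; the real converter
	// handles quoting, escaping, and nested types.
	return strings.Split(strings.Trim(s, "[]"), ","), nil
}
```

Because batchRows is reused across retries, this early return is what makes re-running the converter over already-converted rows safe.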
Summary
Optimizes the Proton bulk insert path. The main win is removing g.Debug() calls from the hot loop, which caused a 3x slowdown when the -d flag (recommended for customer debugging) is enabled.
Benchmark Results (fresh Proton server, 1.66M rows, controlled A/B)
The -d flag is recommended for customer debugging. With -d enabled, the baseline spends most of its CPU on g.Debug() → runtime.Caller() stack walks for every nil value in every row. This PR removes those calls, making -d mode as fast as non-debug mode.
Changes (7 commits)
Performance
Refactoring
Bug fixes (pre-existing, found by Codex review)
Infrastructure
What we tested and ruled out
Data Integrity Verification
Test plan