perf(sql): improve double-to-decimal cast performance#6792
perf(sql): improve double-to-decimal cast performance#6792bluestreak01 merged 19 commits intomasterfrom
Conversation
Remove the scientific-notation forcing from extractDoubleDigits that produced extra digits for display formatting purposes. The forcing (low=false, high=false when estDecExp < -3 or >= 8) was inherited from the JDK's FloatingDecimal and ensured Dragon4 produced at least 2 significant digits for values in scientific notation range. This is a string formatting concern that leaked into the Decimal conversion path, causing: - Inflated precision/scale (e.g., 0.0001 got precision 5 instead of 4) - Wider Decimal type selection than necessary - 1e-76 failing to fit in Decimal256 (scale 77 > max 76) when 1 digit gives scale 76 (exact fit) The string formatter (appendDouble00) already handles nDigits==1 by padding with '0', so Numbers.append output is unchanged. Add comprehensive test coverage for doubleToDecimal with 77 tests covering: auto-inference (type selection, precision/scale, all three Decimal widths, boundary values, Dragon4 fast/slow paths, powers of 2, bulk round-trips) and target cast (lossy truncation, scale/precision overflow, NaN/Infinity, all Decimal types). Co-Authored-By: Claude Opus 4.6 <[email protected]>
Change the Dragon4 digit extraction buffer from char[] to byte[],
storing raw digit values (0-9) instead of ASCII characters ('0'-'9').
This simplifies the DECIMAL consumption path, which previously had
to subtract '0' from each digit. The STRING path now adds '0' when
emitting digits via a small putAsciiDigits() helper — this is not a
regression since the previous bulk putAscii(char[], start, len) was
already a per-character loop in the CharSink default implementation,
with no specialized override in any concrete sink used by this path.
Co-Authored-By: Claude Opus 4.6 <[email protected]>
Add RyuDouble.java implementing the Ryu algorithm for O(1) double-to-decimal conversion via 128-bit multiply with precomputed lookup tables. This replaces the Dragon4 algorithm (extractDoubleDigits) which required the JDK-internal FDBigInteger class and ~250 lines of complex fallback code. The two doubleToDecimal() overloads now use RyuDouble.d2d() to get significant digits as a long plus a decimal exponent, then extract digits via division — no byte[] buffer needed. The string formatting path (Numbers.append) delegates to Double.toString() for exact JDK-compatible output, since Ryu can produce shorter-but-different representations that would break backward compatibility. Removed from Numbers.java: - extractDoubleDigits, estimateDecExpDouble, insignificantDigitsForPow2, roundupDouble - appendDouble0, appendDouble00, putAsciiDigits - Dragon4 constants (LONG_5_POW, SMALL_5_POW, N_5_BITS, etc.) - FDBigInteger import and Unsafe.addExports for jdk.internal.math Removed --add-exports java.base/jdk.internal.math from core/pom.xml (compiler and javadoc configurations). Updated CastDoubleToDecimalFunctionFactory and NumbersDoubleToDecimalTest to remove byte[] digits parameter. Co-Authored-By: Claude Opus 4.6 <[email protected]>
Replace the Double.toString()-based implementation of Numbers.append(CharSink, double, int) with a Ryu-based formatter that writes directly to CharSink. This eliminates a String allocation on every call, removing GC pressure on the hot data path. The new implementation decomposes the double via RyuDouble.d2d() and formats with three branches: fixed-point with integer part (decExp 1-7), leading-zero fixed-point (decExp 0 to -2), and scientific notation (otherwise). Scale truncation is applied in fixed-point branches only, matching prior behavior. Ryu produces the shortest decimal representation that round-trips to the original double, which may differ from JDK 11's Double.toString() by ±1 in the last digit on edge cases. Update random round-trip tests to verify parse correctness rather than exact string match with Double.toString(). Co-Authored-By: Claude Opus 4.6 <[email protected]>
Use QuestDB's ThreadLocal instead of java.lang.ThreadLocal for the tlE10 field in Numbers, and fix the modifier order to the standard private static final. Make RyuDouble internal helpers and constants private since only d2d() and decimalLength17() are called from outside the class. Update stale Dragon4 references in test comments to reflect the Ryu replacement. Co-Authored-By: Claude Opus 4.6 <[email protected]>
Add a PGWire client benchmark that measures the throughput of double-to-decimal and double-to-varchar conversions. The benchmark connects to an already-running QuestDB server and exercises all three Decimal widths (64, 128, 256) plus the VARCHAR formatting path. The test table uses exp(rnd_double(0) * 46 - 11.5) to generate a log-uniform distribution of 10M doubles spanning ~20 orders of magnitude, giving fair coverage to all formatting branches in Numbers.append() and all exponent ranges in RyuDouble.d2d(). Co-Authored-By: Claude Opus 4.6 <[email protected]>
In the trailing-zeros digit-removal loops, the `vrIsTrailingZeros` flag was being checked against `lastRemovedDigit` after it had already been overwritten with the current iteration's digit. This meant the "all removed digits were zero" invariant was tested against the wrong value. Move the `lastRemovedDigit` assignment after the trailing-zeros flag updates so that `vrIsTrailingZeros &= lastRemovedDigit == 0` correctly checks the previous iteration's digit. This fixes round-half-to-even: when the removed portion is exactly 0.5 and the retained significand is even, the algorithm now correctly avoids rounding up (e.g., 1.3607615367753142E15). Add a targeted regression test for this case. Co-Authored-By: Claude Opus 4.6 <[email protected]>
When casting a very small double (e.g. 1e-25) to a DECIMAL with a small target scale, the excess variable in the lossy truncation path could exceed the 20-element pow10 table, causing an ArrayIndexOutOfBoundsException. Since Ryu produces at most 17 significand digits, any excess >= olength means the entire significand is truncated away. Handle this case by setting output and olength to zero directly, skipping the table lookup. Add a regression test that triggers the crash with 1e-25 cast to DECIMAL(18,2). Co-Authored-By: Claude Opus 4.6 <[email protected]>
Document the shift-range precondition on mulShift64 ([65, 127]) and the bit-count precondition on multipleOfPowerOf2 ([0, 63]) in RyuDouble.java. Add an assertion in multipleOfPowerOf2 to catch violations during development. Fix incorrect comment in NumbersDoubleToDecimalTest that claimed naturalScale=42 for 1e-25 when the actual value is 25. Reorder the two test methods so testTargetLossyTruncateAllFractionalDigits comes before testTargetLossyTruncateExcessExceedsPow10Table. Add testFormatDoubleSubnormals to NumbersTest covering the Numbers.append formatting path for subnormal doubles: smallest, largest, negative, and mid-range subnormals, all verified via round-trip through Double.parseDouble. Co-Authored-By: Claude Opus 4.6 <[email protected]>
|
Important Review skippedAuto reviews are disabled on this repository. Please check the settings in the CodeRabbit UI or the You can disable this status message by setting the Use the checkbox below for a quick retry:
✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
Remove unnecessary (double) casts in NumbersDoubleToDecimalTest and NumbersTest where int-to-double and float-to-double widening conversions are implicit. Also inline a redundant local variable in testTargetBulkRoundTrip. Co-Authored-By: Claude Opus 4.6 <[email protected]>
Move the two large power-of-5 lookup table initializations (DOUBLE_POW5_INV_SPLIT and DOUBLE_POW5_SPLIT) out of the static initializer and into dedicated private static methods. This reduces the <clinit> bytecode from 9596 bytes to 30 bytes, keeping each init method well under the 8000-byte linter threshold (4542 and 5034 bytes respectively). Co-Authored-By: Claude Opus 4.6 <[email protected]>
[PR Coverage check]😍 pass : 286 / 304 (94.08%) file detail
|
|
@CodeRabbit pls review this PR critically |
|
Tip For best results, initiate chat on the files or code changes. 🧩 Analysis chain🏁 Script executed: Length of output: 2717 🏁 Script executed: Length of output: 36 🏁 Script executed: Length of output: 36 🏁 Script executed: Length of output: 36 🏁 Script executed: Length of output: 36 🏁 Script executed: Length of output: 1379 🏁 Script executed: Length of output: 32 🏁 Script executed: Length of output: 43751 🏁 Script executed: Length of output: 31793 🏁 Script executed: Length of output: 27009 🏁 Script executed: Length of output: 3586 🏁 Script executed: Length of output: 5330 🏁 Script executed: Length of output: 14664 🏁 Script executed: Length of output: 3880 🏁 Script executed: Length of output: 885 🏁 Script executed: Length of output: 6335 🏁 Script executed: Length of output: 6391 🏁 Script executed: Length of output: 3353 🏁 Script executed: Length of output: 1383 🏁 Script executed: Length of output: 66
Critical Issues1. Unsafe assertion in production code (RyuDouble.java:391)private static boolean multipleOfPowerOf2(long value, int p) {
assert p >= 0 && p < 64;
return (value & ((1L << p) - 1)) == 0;
}Problem: This assertion can be disabled with Question: Why not use a runtime check or document the precondition more explicitly? The comment on line 388 says "Callers guarantee p < 63" but this is fragile - prove it or validate it. 2. ThreadLocal safety inconsistency (Numbers.java:73, CastDoubleToDecimalFunctionFactory.java:94,133)private static final ThreadLocal<int[]> tlE10 = new ThreadLocal<>(() -> new int[1]);And in the cast functions: `@Override`
public boolean isThreadSafe() {
return false; // Line 94, 133
}Problem: If Questions:
3. Lossy truncation semantics are wrong for SQL CAST (Numbers.java:772-783)// Truncate extra fractional digits if lossy
if (lossy && naturalScale > scale) {
int excess = naturalScale - scale;
if (excess >= olength) {
output = 0;
olength = 0;
} else {
output /= pow10[excess]; // Line 779: TRUNCATION, not rounding
olength -= excess;
}Problem: SQL CAST semantics typically require rounding, not truncation. For example:
Evidence: Check PostgreSQL, Oracle, SQL Server - they all round on overflow, not truncate. Required action: Either fix the truncation to use rounding, or document why QuestDB differs from standard SQL semantics. 4. mulShift64 unsigned arithmetic correction is subtle (RyuDouble.java:376,378)long high1 = Math.multiplyHigh(m, mul1) + ((mul1 >> 63) & m);
long high0 = Math.multiplyHigh(m, mul0) + ((mul0 >> 63) & m);Concern: This correction for treating signed Questions:
5. Benchmark doesn't measure what matters (DoubleToDecimalBenchmark.java:46-47)"SELECT sum(d::DECIMAL(18,2)) FROM bench_double",
"SELECT sum(d::DECIMAL(38,10)) FROM bench_double",Problem: Using
The actual per-row conversion cost is buried in noise. Required action: Create a proper JMH microbenchmark that isolates just the `@Benchmark`
public void doubleToDecimal64(Blackhole bh) {
for (double d : testData) {
Numbers.doubleToDecimal(d, sink64, sink128, sink256);
bh.consume(sink64);
}
}6. Test coverage gaps for error pathsCoverage: 94.08% sounds good, but what are the 12 uncovered lines in RyuDouble.java? Questions to answer:
Action required: Run coverage report and explicitly test the uncovered branches, or justify why they're unreachable. 7. Fast path integer detection has off-by-one risk (RyuDouble.java:438-440)if (e2 >= -52 && e2 <= 0) {
long mask = (1L << -e2) - 1;
if ((m2 & mask) == 0) {Question:
Test: Add explicit test for 8. Removal of FDBigInteger dependency (pom.xml:0, Numbers.java:73)Observation: You removed Question: Did you verify that the new Ryu-based implementation produces identical results for all edge cases compared to the old Dragon4 implementation? Action: Add a diff test that runs both implementations (if you keep old code temporarily) on millions of random doubles and compares outputs character-by-character. 9. Decimal construction in loop - error recovery? (Numbers.java:715-720)target.ofZero();
for (int i = 0; i < olength; i++) {
int d = (int) (output / pow10[olength - 1 - i] % 10);
if (d != 0) {
target.addPowerOfTenMultiple(decExp - 1 - i + naturalScale, d);
}
}Question: If 10. Scale validation order (Numbers.java:789-798)// Validate scale
if (naturalScale > scale) {
throw NumericException.instance()
.put("scale overflow: ").put(naturalScale).put(" > ").put(scale);
}
// Validate precision
int integerDigits = Math.max(0, decExp);
int requiredPrecision = integerDigits + scale;
if (requiredPrecision > precision) {Minor issue: After lossy truncation (line 779), Required Actions
Show me the coverage report and we can drill into the uncovered lines together. 🏁 Script executed: Length of output: 235 🏁 Script executed: Length of output: 3755 🏁 Script executed: Length of output: 403 🏁 Script executed: Length of output: 1028 🏁 Script executed: Length of output: 1236 🏁 Script executed: Length of output: 739 🏁 Script executed: Length of output: 0 🏁 Script executed: Length of output: 306 🏁 Script executed: Length of output: 390 🏁 Script executed: Length of output: 1369 🏁 Script executed: Length of output: 1963 🏁 Script executed: Length of output: 5142 🏁 Script executed: Length of output: 1515 🏁 Script executed: Length of output: 451 🏁 Script executed: Length of output: 415 🏁 Script executed: Length of output: 2871
Update on "Lossy Truncation" (
|
Summary
Numbers.doubleToDecimal) that bypasses string formatting entirely, removing the double→string→decimal roundtrip inCastDoubleToDecimalFunctionFactory.jdk.internal.math.FDBigInteger, eliminating the--add-exports java.base/jdk.internal.mathcompiler flag.Test plan
NumbersDoubleToDecimalTestwith 800+ lines covering auto-inference and target-precision paths, lossy truncation, edge cases (subnormals, NaN, Infinity, zero, negative zero), and bulk fraction round-tripstestFormatDoubleSubnormalsandtestFormatDoubleRoundEveninNumbersTestfor subnormal formatting and round-half-to-even correctnesstestFormatDoubleRandom(1M random doubles) andtestFormatDoubleAsRandomFloat(1M random floats) verify round-trip correctness of the new Ryu-basedNumbers.appendCastDoubleToDecimalFunctionFactoryTestcovering runtime casts, overflow, lossy truncation, constant folding, NaN/Infinity/null handling🤖 Generated with Claude Code