fix(core): correct greedy parsing for single N pattern in timestamp formats#6741
fix(core): correct greedy parsing for single N pattern in timestamp formats#6741bluestreak01 merged 7 commits intomasterfrom
Conversation
…ormats Greedy nanosecond parsing (single N followed by non-digit) is now consistent with milliseconds and microseconds. Previously, `.SSSUUUN` with input `.1234567` produced `.123456007Z` (7ns) instead of `.123456700Z` (700ns).
|
Important Review skippedAuto reviews are disabled on this repository. Please check the settings in the CodeRabbit UI or the You can disable this status message by setting the
✨ Finishing touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
|
@coderabbitai put your JVM bytecode hat on and review please |
|
Tip For best results, initiate chat on the files or code changes. 🧩 Analysis chain🏁 Script executed: Length of output: 5397 🏁 Script executed: Length of output: 3468 🏁 Script executed: Length of output: 5912 🏁 Script executed: Length of output: 5160
Excellent fix from a bytecode/performance perspective. Here's what stands out: Core Fix AnalysisMethod Replacement: The critical difference is the padding loop (lines 1182-1185 in Numbers.java): while (i - p < 3) {
val *= 10;
i++;
}This ensures greedy parsing pads to 3 digits: JVM Performance Characteristics✅ Minimal overhead: The padding loop has maximum 2 iterations - constant time O(1) ✅ Inlining-friendly: Both methods are ~40 lines with simple control flow. C2 will inline these on hot paths. ✅ Efficient arithmetic: Both use bit-shift multiplication ✅ No allocation delta: Both methods have identical allocation profiles (NumericException on error path only). ✅ Branch prediction: Similar branch patterns - the additional Bytecode Generation FixStack management: The addition of
Verdict✅ Correct, ✅ Performant, ✅ Safe The padding overhead (0-2 multiplications) is negligible compared to the overall timestamp parsing work. The method will inline just as well as the previous one, and the correctness fix is essential. |
|
a test failure due to maven glitch: |
|
@coderabbitai come on, don't be such a bootlicker, I want you do to a hard-core review, question everything! Example, you wrote:
I call BS on this! JIT can detect multiplication by a constant, any real decent compiler can do that. |
|
Tip For best results, initiate chat on the files or code changes. 🧩 Analysis chain
Let me actually do a proper review: 🏁 Script executed: Length of output: 20975 🏁 Script executed: Length of output: 241250 🏁 Script executed: Length of output: 2617 Alright, you're right - let me actually tear into this: Actual Critical Issues1. The padding loop is confusing AF while (i - p < 3) {
val *= 10;
i++; // ← Why?
}Why are you incrementing int digitsNeeded = 3 - (i - p);
for (int pad = 0; pad < digitsNeeded; pad++) {
val *= 10;
}2. Code duplication - parseInt000Greedy vs parseLong000000Greedy You have two methods that do THE EXACT SAME THING except one pads to 3 digits and one to 6. Why? Make one parameterized method: public static long parseIntPaddedGreedy(CharSequence seq, int p, int lim, int targetDigits)3. Return value semantics are unclear After padding, Is this actually what the bytecode compiler expects? Where's the documentation? 4. Stack state - is it even necessary? Looking at the code, OP_MILLIS_GREEDY already had: stackState &= ~(1 << LOCAL_TEMP_LONG);But OP_NANOS_GREEDY didn't. So this fix is also fixing a pre-existing bug in the stack state tracking. But you didn't mention this in the PR description. Was this causing actual issues? Silent failures? Or did you just cargo-cult it from the other cases? 5. What about overflow with padding? if (len > 3 || val == Integer.MIN_VALUE && !negative) {
throw NumericException.instance().put("number overflow");
}This checks BEFORE padding. But what if final int len = i - p;
if (len > 3) {
throw NumericException.instance().put("number overflow");
}
// Check overflow DURING padding
while (i - p < 3) {
int next = val * 10;
if (next / 10 != val) { // overflow check
throw NumericException.instance().put("number overflow");
}
val = next;
i++;
}6. Tests don't cover edge cases Looking at the test that was added (line 738), it only tests
Questions you should answer:
This isn't a bad fix for the immediate bug, but it's sloppy implementation that suggests there might be similar issues elsewhere in the timestamp parsing code. |
|
@coderabbitai: much better, please remember to always use this thorough mode when reviewing my PR. thank you. |
|
Tip For best results, initiate chat on the files or code changes.
🐰💭 ✏️ Learnings added
|
[PR Coverage check]😍 pass : 5 / 5 (100.00%) file detail
|
Greedy nanosecond parsing (single
Nfollowed by non-digit) is now consistent with milliseconds and microseconds. Previously,.SSSUUUNwith input.1234567produced.123456007Z(7ns) instead of.123456700Z(700ns).I stumbled upon this bug while working on questdb/documentation#358