Skip to content

Fix wrap around issues on large numerics#543

Closed
jlusiardi wants to merge 4 commits intoultrajson:mainfrom
jlusiardi:issue_440
Closed

Fix wrap around issues on large numerics#543
jlusiardi wants to merge 4 commits intoultrajson:mainfrom
jlusiardi:issue_440

Conversation

@jlusiardi
Copy link
Copy Markdown

@jlusiardi jlusiardi commented May 27, 2022

There seemed to be a mistake on LLONG_MAX vs ULLONG_MAX (the latter is
the maximum value for an object of type unsigned long long int which
JSUINT64 should be). I fixed that.

I am unsure about how much performance is lost on the check for overflow
on digit check...

Fixes #440

Changes proposed in this pull request:

  • improve wrap around detection
  • add test for wrap arounds
  • make error messages more consistent (add ! at the end)

Joachim Lusiardi and others added 2 commits May 27, 2022 10:14
There seemed to be a mistake on LLONG_MAX vs ULLONG_MAX (the latter is
the maximum value for an object of type `unsigned long long int` which
JSUINT64 should be). I fixed that.

I am unsure about how much performance is lost on the check for overflow
on digit check...
@codecov-commenter
Copy link
Copy Markdown

codecov-commenter commented May 27, 2022

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 91.77%. Comparing base (b300d64) to head (c98d5cb).
⚠️ Report is 256 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #543      +/-   ##
==========================================
+ Coverage   91.76%   91.77%   +0.01%     
==========================================
  Files           6        6              
  Lines        1821     1824       +3     
==========================================
+ Hits         1671     1674       +3     
  Misses        150      150              

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Comment thread lib/ultrajsondec.c Outdated
Copy link
Copy Markdown
Collaborator

@bwoodsend bwoodsend left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bear with me here - this stuff makes my head hurt...

Comment thread lib/ultrajsondec.c
{
return SetError(ds, -1, overflowLimit == LLONG_MAX ? "Value is too big!" : "Value is too small");
return SetError(ds, -1, overflowLimit == LLONG_MAX ? "Value is too big" : "Value is too small");
}
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the overflowLimit == LLONG_MAX ? ... switch not redundant because of the if (intNeg == 1) and else if (intNeg == -1) conditions above it?

Copy link
Copy Markdown
Collaborator

@bwoodsend bwoodsend May 27, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(P.S. If we're ditching !s in error messages then there's that one still lurking on line 170.)

Comment thread lib/ultrajsondec.c
case '9':
{
// detect overflow on digit shift
if ((intValue * 10ULL) % 10 != 0)
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would if (intValue > ULLONG_MAX / 10) not be equivalent? It's less headache inducing and probably fractionally more performant since the ULLONG_MAX / 10 will be evaluated at compile time leaving just the > operation at runtime.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I think that's functionally identical. In both cases, an intValue of ULLONG_MAX / 10 = 1844674407370955161 would pass the check, as it should since it does not overflow on the multiplication. It might still overflow on the addition if the next digit is 6 to 9, but that would be caught in the check below. And intValue = ULLONG_MAX / 10 + 1 would fail this added check in both forms.

Copy link
Copy Markdown
Collaborator

@JustAnotherArchivist JustAnotherArchivist left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Likewise, integer arithmetic is bad enough as it is, but decoding from a string definitely induces headaches.

Comment thread lib/ultrajsondec.c
{
hasError = 1;
}
else if (intNeg == -1 && intValue > overflowLimit)
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doesn't the removal of this check mean that there is no test against the smaller overflowLimit of LLONG_MAX for negative numbers anymore? Then we could get an overflow further down in the BREAK_INT_LOOP section.

Comment thread tests/test_ujson.py
Comment on lines +1060 to +1071
@pytest.mark.parametrize(
"test_input",
[
("33333333303333333333"),
("18446744073709551616"), # 64 bit
("-18446744073709551616"), # 64 bit
("-80888888888888888888"),
],
)
def test_decode_big_numeric(test_input):
with pytest.raises(ujson.JSONDecodeError):
ujson.loads(test_input)
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be merged with test_decode_raises, which already has some tests for big ints. I have no strong opinion on whether it should all be in one test or not, though I tend towards yes (since the test code is basically identical, just with different input).

@jlusiardi
Copy link
Copy Markdown
Author

I'll close this PR and recommend looking at #544 instead.

@jlusiardi jlusiardi closed this May 30, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

UltraJSON has inconsistent behavior on parsing large integral values.

5 participants