Conversation
|
| // binary operators (x + x + … + x) both during the AST walk in semantic index building as | ||
| // well as during type checking. Using this stack size, we can handle handle expressions | ||
| // that are several times larger than the corresponding limits in existing type checkers. | ||
| .stack_size(16 * 1024 * 1024) |
There was a problem hiding this comment.
It's probably worth noting that this does not generally increase memory usage by 16 MiB × num_threads. It's just the max. available stack space that a single thread can use.
There was a problem hiding this comment.
And 16MiB/thread is not even detectable, relative to the memory we need for source texts and AST, for any project large enough to care about memory usage.
carljm
left a comment
There was a problem hiding this comment.
This looks great, thank you
| // binary operators (x + x + … + x) both during the AST walk in semantic index building as | ||
| // well as during type checking. Using this stack size, we can handle handle expressions | ||
| // that are several times larger than the corresponding limits in existing type checkers. | ||
| .stack_size(16 * 1024 * 1024) |
There was a problem hiding this comment.
And 16MiB/thread is not even detectable, relative to the memory we need for source texts and AST, for any project large enough to care about memory usage.
| total = 1{plus_one_repeated} | ||
| reveal_type(total) | ||
| ", | ||
| plus_one_repeated = " + 1".repeat(2000 - 1) |
There was a problem hiding this comment.
I chose a much smaller limit than what we can handle (but still bigger than what we can handle on main) for two reasons:
- Debug builds take up more stack space and crash at smaller input sizes
- We don't want minor increases of stack frame sizes to lead to a test failure here.
## Summary Following #17869, we can now run `ty` on `sympy`.
|
I think the long term solution here would be to implement binary expression traversal in a loop rather than with recursion. But this is fine for now |
## Summary closes astral-sh#17472 This is obviously just a band-aid solution to this problem (in that you can always make your [pathological inputs](https://github.com/sympy/sympy/blob/28994edd82a18c22a98a52245c173f3c836dcad3/sympy/polys/numberfields/resolvent_lookup.py) bigger and it will still crash), but I think this is not an unreasonable change — even if we add more sophisticated solutions later. I tried using `stacker` as suggested by @MichaReiser, and it works. But it's unclear where exactly would be the right place to put it, and even for the `sympy` problem, we would need to add it both in the semantic index builder AST traversal and in type inference. Increasing the default stack size for worker threads, as proposed here, doesn't solve the underlying problem (that there is a hard limit), but it is more universal in the sense that it is not specific to large binary-operator expression chains. To determine a reasonable stack size, I created files that look like *right associative*: ```py from typing import reveal_type total = (1 + (1 + (1 + (1 + (… + 1))))) reveal_type(total) ``` *left associative* ```py from typing import reveal_type total = 1 + 1 + 1 + 1 + … + 1 reveal_type(total) ``` with a variable amount of operands (`N`). I then chose the stack size large enough to still be able to handle cases that existing type checkers can not: ``` right N = 20: mypy takes ~ 1min N = 350: pyright crashes with a stack overflow (mypy fails with "too many nested parentheses") N = 800: ty(main) infers Literal[800] instantly N = 1000: ty(main) crashes with "thread '<unknown>' has overflowed its stack" N = 7000: ty(this branch) infers Literal[7000] instantly N = 8000+: ty(this branch) crashes left N = 300: pyright emits "Maximum parse depth exceeded; break expression into smaller sub-expressions" total is inferred as Unknown N = 5500: mypy crashes with "INTERNAL ERROR" N = 2500: ty(main) infers Literal[2500] instantly N = 3000: ty(main) crashes with "thread '<unknown>' has overflowed its stack" N = 22000: ty(this branch) infers Literal[22000] instantly N = 23000+: ty(this branch) crashes ``` ## Test Plan New regression test.
## Summary Following astral-sh#17869, we can now run `ty` on `sympy`.
Summary
closes #17472
This is obviously just a band-aid solution to this problem (in that you can always make your pathological inputs bigger and it will still crash), but I think this is not an unreasonable change — even if we add more sophisticated solutions later. I tried using
stackeras suggested by @MichaReiser, and it works. But it's unclear where exactly would be the right place to put it, and even for thesympyproblem, we would need to add it both in the semantic index builder AST traversal and in type inference. Increasing the default stack size for worker threads, as proposed here, doesn't solve the underlying problem (that there is a hard limit), but it is more universal in the sense that it is not specific to large binary-operator expression chains.To determine a reasonable stack size, I created files that look like
right associative:
left associative
with a variable amount of operands (
N). I then chose the stack size large enough to still be able to handle cases that existing type checkers can not:Test Plan
New regression test.