Conversation
Parser Change ChecklistThis PR modifies the parser. Please check that the following tests are updated:
This test should have examples of every new bit of syntax you are adding. Feel free to just check the box if your PR does not actually change the syntax (because it is refactoring the parser, say). |
nmatschke
requested changes
Sep 9, 2025
Contributor
nmatschke
left a comment
There was a problem hiding this comment.
Neat, thanks. Couple small comments.
nmatschke
requested changes
Sep 9, 2025
nmatschke
approved these changes
Sep 9, 2025
b9d4b1e to
bcfcb0b
Compare
bcfcb0b to
d63754a
Compare
mshinwell
pushed a commit
that referenced
this pull request
Sep 11, 2025
This feature adds literals for the remaining integer types: * `int8`: `42s`, * `int16`: `42S`, * `int8#`: `#42s`, * `int16#`: `#42S` * and `int#`: `#42m`. The literal syntax `#42` is not an option because it causes an ambiguity with line number directives. The literal syntax `42m` for regular tagged integers was also added to mirror the untagged immediate literals. The changes are mostly straightforward since we don't need to modify the parser. The one rough edge is converting strings to small ints. Since small ints are represented as regular `int`s in the compiler, we could just call `int_of_string` to convert, but this would not properly handle overflow[^1]. Ideally, we would call the C primitive `parse_intnat` to handle most of the overflow logic, but it takes some of its arguments unboxed. A less ideal but feasible option is to put a C stub in `utils/` that calls `parse_intnat`, but this would be awkward since it would be the only C stub used during typing. The option I chose to go with is to reimplement the overflow logic in OCaml (in `utils/misc.ml`). Extensive testing of this overflow logic is found in `testsuite/tests/typing/small-numbers/test_enabled.ml`. The choice of `m` for the untagged immediate suffix is a bit arbitrary. The main consideration is that the suffix isn't overloaded (e.g. `#42u` would be a bad choice since it could be confused for an unsigned literal). ## Reviewing Start by looking at the tests. In particular, make sure every case of overflowing literals is tested and has behavior similar to regular ints. Most of the changes in the source are copy-pastes of code that handles the other integer types, so reviewing should be easy. Two places require extra care: * In `utils/misc.ml`, make sure `cvt_small_int` handles overflow correctly (see long paragraph above) * In the printers, make sure that every time that some literals are printed with a suffix, untagged immediates are also printed with a suffix (in an earlier version of this PR, untagged immediates were not given a suffix). [^1]: Handling overflow is more complex than you'd expect: see <ocaml/ocaml#4210>. --------- Co-authored-by: James Rayman <[email protected]>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
This feature adds literals for the remaining integer types:
int8:42s,int16:42S,int8#:#42s,int16#:#42Sint#:#42m.The literal syntax
#42is not an option because it causes an ambiguity with line number directives. The literal syntax42mfor regular tagged integers was also added to mirror the untagged immediate literals.The changes are mostly straightforward since we don't need to modify the parser. The one rough edge is converting strings to small ints. Since small ints are represented as regular
ints in the compiler, we could just callint_of_stringto convert, but this would not properly handle overflow1. Ideally, we would call the C primitiveparse_intnatto handle most of the overflow logic, but it takes some of its arguments unboxed. A less ideal but feasible option is to put a C stub inutils/that callsparse_intnat, but this would be awkward since it would be the only C stub used during typing. The option I chose to go with is to reimplement the overflow logic in OCaml (inutils/misc.ml). Extensive testing of this overflow logic is found intestsuite/tests/typing/small-numbers/test_enabled.ml.The choice of
mfor the untagged immediate suffix is a bit arbitrary. The main consideration is that the suffix isn't overloaded (e.g.#42uwould be a bad choice since it could be confused for an unsigned literal).Reviewing
Start by looking at the tests. In particular, make sure every case of overflowing literals is tested and has behavior similar to regular ints.
Most of the changes in the source are copy-pastes of code that handles the other integer types, so reviewing should be easy. Two places require extra care:
utils/misc.ml, make surecvt_small_inthandles overflow correctly (see long paragraph above)Footnotes
Handling overflow is more complex than you'd expect: see https://github.com/ocaml/ocaml/issues/4210. ↩