lsf37 commented Dec 30, 2022

In %caseless mode, char classes can contain characters that are not in the input char set, leading to an exception when we try to look up the class code for such a character.

  • add a regression test case for this situation
  • make CharClasses robust against this situation and ignore characters outside the input char set during NFA construction
  • minor warning reductions
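
A rough sketch of the idea, for illustration only. This is not the code in this PR: `CharClassSketch`, `NO_CLASS`, `classCode` and `transitionClasses` are hypothetical names, and the real `CharClasses` may represent classes quite differently. The assumption is that the lookup returns a sentinel instead of throwing for characters outside the input set, and that the transition-building step skips that sentinel.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a char class table; not the JFlex implementation. */
final class CharClassSketch {
    /** Sentinel (an assumption of this sketch) for "not in the input char set". */
    static final int NO_CLASS = -1;

    private final int maxChar;   // largest code point of the input set, e.g. 127 for %7bit
    private final int[] classOf; // class code per code point; everything starts in class 0

    CharClassSketch(int maxChar) {
        this.maxChar = maxChar;
        this.classOf = new int[maxChar + 1];
    }

    /** Puts a code point into a class; code points outside the input set are ignored. */
    void assign(int codePoint, int classCode) {
        if (codePoint >= 0 && codePoint <= maxChar) {
            classOf[codePoint] = classCode;
        }
    }

    /** Returns the class code, or NO_CLASS instead of throwing for out-of-range input. */
    int classCode(int codePoint) {
        if (codePoint < 0 || codePoint > maxChar) {
            return NO_CLASS;
        }
        return classOf[codePoint];
    }

    /** Collects the distinct class codes to add transitions for, skipping NO_CLASS. */
    List<Integer> transitionClasses(int... codePoints) {
        List<Integer> result = new ArrayList<>();
        for (int cp : codePoints) {
            int code = classCode(cp);
            if (code == NO_CLASS) {
                continue; // character cannot occur in the input, so no transition is needed
            }
            if (!result.contains(code)) {
                result.add(code);
            }
        }
        return result;
    }
}
```

With a 7-bit input set (`new CharClassSketch(127)`), a caseless class built around `k` could plausibly pick up U+212A (KELVIN SIGN), which case-folds to `k` in Unicode. In this sketch `classCode(0x212A)` returns `NO_CLASS`, and `transitionClasses('k', 'K', 0x212A)` yields only the class shared by `k` and `K`.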

Fixes #974

lsf37 self-assigned this Dec 30, 2022
lsf37 added the bug label ("Not working as intended") Dec 30, 2022
lsf37 added this to the 1.9.0 milestone Dec 30, 2022
The lexer spec can mention characters that are not in the input set
(e.g. for %7bit or %8bit). In particular, in caseless matching, the
caseless class might contain such characters.

Make getClassCode() robust against this situation, and ignore such
characters when we add transitions.

Fixes #974
lsf37 merged commit c356aef into master Dec 30, 2022
lsf37 deleted the charclass branch December 30, 2022 10:24
Development

Successfully merging this pull request may close these issues.

Unexpected exception encountered in JFlex
