@will-hinson
Contributor

This pull request fixes an issue with the regex for the Comment.Single token type in the TransactSqlLexer class.

In the current version of pygments (2.18.0), the existing regex causes single-line comments to be lexed incorrectly whenever they are immediately followed by a non-comment token on the next line. For example, consider the following code:

from pygments.lexers.sql import TransactSqlLexer
for token_type, token_contents in list(
    TransactSqlLexer().get_tokens(
        """
        -- this is a single line comment
        select
        """
    )
):
    print(token_type, repr(token_contents))

When run, this produces the following stream of tokens. Note that the comment is not lexed as a single Comment.Single token; it is split into operator, name, keyword, and whitespace tokens:

Token.Text.Whitespace '        '
Token.Operator '-'
Token.Operator '-'
Token.Text.Whitespace ' '
Token.Name 'this'
Token.Text.Whitespace ' '
Token.Keyword 'is'
Token.Text.Whitespace ' '
Token.Name 'a'
Token.Text.Whitespace ' '
Token.Name 'single'
Token.Text.Whitespace ' '
Token.Name 'line'
Token.Text.Whitespace ' '
Token.Name 'comment'
Token.Text.Whitespace '\n        '
Token.Keyword 'select'
Token.Text.Whitespace '\n        \n'

Lexing the same input with the modified Comment.Single regex from this commit produces the following stream of tokens. Note that the comment is now lexed as a single Comment.Single token that includes the trailing newline:

Token.Text.Whitespace '        '
Token.Comment.Single '-- this is a single line comment\n'
Token.Text.Whitespace '        '
Token.Keyword 'select'
Token.Text.Whitespace '\n        \n'
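To illustrate the intended behavior outside of Pygments, here is a minimal sketch using only the standard-library re module. The pattern SINGLE_LINE_COMMENT and the helper split_comments are hypothetical names introduced for this example; the pattern is an assumption and is not necessarily the regex used in this commit. It only shows a comment being matched from "--" through the end of the line, trailing newline included, so that the next line starts cleanly.

```python
import re

# Hypothetical pattern for illustration only; NOT necessarily the regex from this PR.
# Matches "--", the rest of the line, and the trailing newline if there is one.
SINGLE_LINE_COMMENT = re.compile(r"--[^\n]*\n?")


def split_comments(source):
    """Split source into (kind, text) pairs, where kind is 'comment' or 'other'."""
    pieces = []
    pos = 0
    for match in SINGLE_LINE_COMMENT.finditer(source):
        # Anything between the previous match and this one is non-comment text.
        if match.start() > pos:
            pieces.append(("other", source[pos:match.start()]))
        pieces.append(("comment", match.group()))
        pos = match.end()
    # Keep whatever follows the last comment.
    if pos < len(source):
        pieces.append(("other", source[pos:]))
    return pieces


sample = "        -- this is a single line comment\n        select\n"
for kind, text in split_comments(sample):
    print(kind, repr(text))
```

This sketch ignores string literals and block comments, which a real lexer has to handle, so it should be read as a demonstration of the single-line case only.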

@Anteru
Collaborator

Anteru commented Jun 9, 2024

Can you please add the snippet there as an example snippet?

@will-hinson
Contributor Author

will-hinson commented Jun 10, 2024

Hi @Anteru,

Absolutely. I have added a test_single_line_comment.txt snippet that tests this specific case.

Additionally, I noticed that tests were failing because the single-line comments in tsql_example.sql now lex differently. The new token output looked correct to me, so I regenerated tsql_example.sql.output with tox.

Please let me know if any additional action is required of me. Thanks!

@Anteru
Collaborator

Anteru commented Jun 10, 2024

No, this looks good. Thanks!

@Anteru self-assigned this Jun 10, 2024
@Anteru added the "A-lexing area: changes to individual lexers" label Jun 10, 2024
@Anteru merged commit 4d8257d into pygments:master Aug 9, 2024
@Anteru added this to the 2.19.0 milestone Aug 9, 2024