fix: bump tree-sitter to 0.25.0 #5977

Merged
willmcgugan merged 6 commits into Textualize:main from TomJGooding:fix-bump-tree-sitter-to-0.25.0 on Jul 25, 2025

Conversation

@TomJGooding
Collaborator

@TomJGooding TomJGooding commented Jul 21, 2025

Bump tree-sitter to v0.25.0 and fix breaking change where Query.captures has been moved to QueryCursor.captures.

NB: tree-sitter dropped support for Python 3.9 back in v0.24.0, so syntax highlighting in Textual will now require Python >=3.10.

Fixes #5976

Please review the following checklist.

  • Docstrings on all new or modified functions / classes
  • Updated documentation
  • Updated CHANGELOG.md (where appropriate)

`lua-match` is specific to tree-sitter in Neovim. Currently these query
predicates are ignored so will highlight much more than they should.
Update Python versions for syntax dependencies and tests in CI, as
tree-sitter no longer supports Python 3.9.
@TomJGooding TomJGooding changed the title fix: bump tree sitter to 0.25.0 fix: bump tree-sitter to 0.25.0 Jul 21, 2025
@TomJGooding
Collaborator Author

This causes a change in the snapshot test for Python syntax highlighting which I'm struggling to understand.

Before

snapshot before

After

snapshot after

It looks like this is highlighting 'range' as type.builtin rather than function.call.

But if you comment out some of the code above, now the highlighting is the same:

snapshot with commented out code

@TomJGooding
Collaborator Author

TomJGooding commented Jul 22, 2025

I think this query_syntax_tree method hasn't been working as intended for a while. It looks like the 'range' arguments were removed from Query.captures back in tree-sitter v0.23.0.

[EDIT: Looking closer, it doesn't look like Textual ever actually used the range arguments in this method. I wasn't sure whether to just remove them or try to fix this, but I decided it was worth fixing anyway in case it might help future improvements.]

captures_kwargs = {}
if start_point is not None:
    captures_kwargs["start_point"] = start_point
if end_point is not None:
    captures_kwargs["end_point"] = end_point
captures = query.captures(self._syntax_tree.root_node, **captures_kwargs)
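
For illustration only, here is a minimal sketch of how the optional point range might be handled once captures come from a cursor rather than from the query. It assumes the cursor exposes `set_point_range(start, end)` alongside `captures(node)`, which is my understanding of `QueryCursor` in tree-sitter 0.25 but is worth checking against the py-tree-sitter docs. The helper name `captures_in_range` and the `FakeNode` / `FakeCursor` stand-ins are hypothetical; they exist only so the sketch runs without tree-sitter installed, and this is not necessarily the code in this PR.

```python
from typing import Any, Optional, Tuple

Point = Tuple[int, int]  # (row, column)


def captures_in_range(
    cursor: Any,
    root_node: Any,
    start_point: Optional[Point] = None,
    end_point: Optional[Point] = None,
) -> dict:
    """Collect captures, optionally limited to a point range.

    `cursor` is assumed to behave like tree_sitter.QueryCursor and
    `root_node` like tree_sitter.Node (assumptions, see the note above).
    """
    if start_point is not None or end_point is not None:
        # Fill in whichever end of the range was omitted.
        start = start_point if start_point is not None else (0, 0)
        end = end_point if end_point is not None else tuple(root_node.end_point)
        cursor.set_point_range(start, end)
    return cursor.captures(root_node)


# Hypothetical stand-ins so the sketch runs without tree-sitter.
class FakeNode:
    end_point = (1, 13)


class FakeCursor:
    def __init__(self) -> None:
        self.point_range = None

    def set_point_range(self, start: Point, end: Point) -> None:
        self.point_range = (start, end)

    def captures(self, node: Any) -> dict:
        return {"function.call": []}


cursor = FakeCursor()
result = captures_in_range(cursor, FakeNode(), start_point=(1, 0))

unrestricted = FakeCursor()
captures_in_range(unrestricted, FakeNode())
```

With a real tree, `cursor` would be built from the highlight query and `root_node` would be `self._syntax_tree.root_node`; when neither point is given, the range is left untouched and the whole tree is queried.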

Remove the `name` argument from `Language` initialization, as this was
removed back in tree-sitter v0.22.0.
Fix the handling of the optional point range in `query_syntax_tree`,
where the range arguments were removed from `Query.captures` back in
tree-sitter v0.23.0.

It doesn't look like Textual ever actually used these range arguments,
but perhaps they were included to allow future improvements or to be
used by other developers.
@TomJGooding
Collaborator Author

For reference here's the latest snapshot report from CI: https://github.com/Textualize/textual/actions/runs/16452663173/artifacts/3590409485

The failed snapshots look fine for the most part (and actually fix a few bugs), but unfortunately there seem to be some quirks, as mentioned above.

@TomJGooding
Collaborator Author

It looks like the syntax highlighting was possibly already broken...

PYTHON_EXAMPLE = """\
# Uncomment the line below
# foo = 69

print("Hello world")
x = range(10)
"""


from textual.app import App, ComposeResult
from textual.widgets import TextArea


class TextAreaSnapshot(App):
    def compose(self) -> ComposeResult:
        yield TextArea.code_editor(
            PYTHON_EXAMPLE,
            language="python",
        )


app = TextAreaSnapshot()
if __name__ == "__main__":
    app.run()

@TomJGooding
Collaborator Author

I think I'm finally starting to understand the issues with the highlighting.

Consider this Python code:

print("Hello world")
x = range(10)

The tree-sitter query captures (on this branch) look something like this [1]. Notice that function.call comes before type.builtin, since 'print' was matched first.

[('variable', ['print', 'range', 'x']),
 ('function.call', ['print', 'range']),
 ('function.builtin', ['print', 'range']),
 ('punctuation.bracket', ['(', '(', ')', ')']),
 ('string', ['"Hello world"']),
 ('operator', ['=']),
 ('type.builtin', ['range']),
 ('number', ['10'])]

After removing the print function:

x = range(10)

Notice how the order of the captures has changed: function.call now comes after type.builtin.

[('variable', ['x', 'range']),
 ('operator', ['=']),
 ('type.builtin', ['range']),
 ('function.call', ['range']),
 ('function.builtin', ['range']),
 ('punctuation.bracket', ['(', ')']),
 ('number', ['10'])]

The TextArea loops over the captures to build the 'highlight map':

captures = self.document.query_syntax_tree(self._highlight_query)
for highlight_name, nodes in captures.items():
    for node in nodes:
        node_start_row, node_start_column = node.start_point
        node_end_row, node_end_column = node.end_point
        if node_start_row == node_end_row:
            highlight = (node_start_column, node_end_column, highlight_name)
            highlights[node_start_row].append(highlight)

When the TextArea lines are rendered, it loops over the line highlights to apply the syntax highlighting. This explains why 'range' will be highlighted differently depending on whether there's another function call before it!

line_highlights = highlights[line_index]
for highlight_start, highlight_end, highlight_name in line_highlights:
    node_style = get_highlight_from_theme(highlight_name)
    if node_style is not None:
        line.stylize(
            node_style,
            byte_to_codepoint.get(highlight_start, 0),
            byte_to_codepoint.get(highlight_end) if highlight_end else None,
        )
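
To make the ordering effect concrete, here is a small self-contained simulation of the two loops above using plain tuples instead of tree-sitter nodes. It rests on several assumptions: spans are written by hand as `(row, start_column, end_column)` from the two capture listings; `THEMED_NAMES` is a hypothetical theme that has no style for function.builtin; and a later `stylize` call is taken to win over an earlier one on the same span. It is a sketch of the reasoning, not Textual's actual code.

```python
from collections import defaultdict

# Captures in the order shown above; only the names relevant to
# 'print' and 'range' are kept. Spans are (row, start_col, end_col).
WITH_PRINT = {
    "variable": [(0, 0, 5), (1, 4, 9), (1, 0, 1)],
    "function.call": [(0, 0, 5), (1, 4, 9)],
    "function.builtin": [(0, 0, 5), (1, 4, 9)],
    "type.builtin": [(1, 4, 9)],
}
WITHOUT_PRINT = {
    "variable": [(0, 0, 1), (0, 4, 9)],
    "type.builtin": [(0, 4, 9)],
    "function.call": [(0, 4, 9)],
    "function.builtin": [(0, 4, 9)],
}

# Hypothetical theme: names missing here are skipped, mirroring the
# `if node_style is not None` check in the rendering loop.
THEMED_NAMES = {"variable", "function.call", "type.builtin"}


def build_highlights(captures):
    """Build the per-row highlight lists in capture (dict) order."""
    highlights = defaultdict(list)
    for name, spans in captures.items():
        for row, start, end in spans:
            highlights[row].append((start, end, name))
    return highlights


def winning_highlight(highlights, row, start, end):
    """Return the last themed highlight applied to the given span."""
    winner = None
    for span_start, span_end, name in highlights[row]:
        if name in THEMED_NAMES and (span_start, span_end) == (start, end):
            winner = name  # assumed: a later style overrides an earlier one
    return winner


# 'range' sits on row 1 when the print line is present, row 0 otherwise.
with_print = winning_highlight(build_highlights(WITH_PRINT), 1, 4, 9)
without_print = winning_highlight(build_highlights(WITHOUT_PRINT), 0, 4, 9)
print(with_print, without_print)
```

Under these assumptions 'range' ends up styled by whichever themed capture name happens to come last in the dict, so an unrelated earlier line can change its colour.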

Footnotes

  1. QueryCursor.captures actually returns "A dict where the keys are the names of the captures and the values are lists of the captured nodes". I've formatted the captures to make them simpler to understand.

Update the text area snapshots due to syntax highlighting changes after
bumping tree-sitter to 0.25.0.
@TomJGooding
Collaborator Author

TomJGooding commented Jul 23, 2025

I've updated the snapshots and marked this as ready for review.

Since it turns out the syntax highlighting was already broken, I think it's better to prioritise fixing the tree-sitter crash and revisit these issues later. Bumping tree-sitter actually fixes some issues with incorrect highlighting.

Note I still need to update the CHANGELOG if this is approved. This might be considered a breaking change, since the syntax extras would now require Python >=3.10.

@willmcgugan
Member

Thanks for putting in the leg work on this one. Will get this in the new release (out soon).

@willmcgugan willmcgugan merged commit 1f70760 into Textualize:main Jul 25, 2025
23 checks passed
@willmcgugan
Member

Thanks, Tom

@TomJGooding
Collaborator Author

No problem. Sorry I didn't manage to get back to the CHANGELOG before this was merged - is it worth adding something now (and maybe also the release notes)?

Development

Successfully merging this pull request may close these issues.

TextArea with textual[syntax] causes crash as of tree-sitter 0.25.0

2 participants