Conversation
|
Current dependencies on/for this PR:
This stack of pull requests is managed by Graphite. |
485b941 to
63d0a83
Compare
ruff_lsp/server.py
Outdated
| settings = _get_settings_by_document(document.path) | ||
|
|
||
| if utils.is_stdlib_file(document.path): | ||
| if document.is_stdlib_file(): |
There was a problem hiding this comment.
You can also provide notebook specific code-actions. These are prefixed by namespace notebook. in code action kind.
63d0a83 to
c01f75c
Compare
c01f75c to
c12b459
Compare
0621462 to
095a808
Compare
c12b459 to
8505d3a
Compare
095a808 to
b295f14
Compare
8505d3a to
6ecdd97
Compare
| notebook_document = LSP_SERVER.workspace.get_notebook_document(cell_uri=uri) | ||
| if notebook_document is None: | ||
| notebook_document = LSP_SERVER.workspace.get_notebook_document( | ||
| notebook_uri=uri | ||
| ) |
There was a problem hiding this comment.
We can't use:
LSP_SERVER.workspace.get_notebook_document(notebook_uri=uri, cell_uri=uri)Because the method uses either notebook_uri or cell_uri where notebook_uri is preferred over cell_uri. While our implementation will first try the cell_uri and then the notebook_uri.
| This works for both Python files and Notebook Documents. For Notebook Documents, | ||
| the hover works at the cell level. | ||
| """ | ||
| document = LSP_SERVER.workspace.get_text_document(params.text_document.uri) |
There was a problem hiding this comment.
This is either a Python file or a Notebook cell.
ruff_lsp/server.py
Outdated
| document = Document.from_text_document( | ||
| LSP_SERVER.workspace.get_text_document(params.text_document.uri) | ||
| ) |
There was a problem hiding this comment.
Why are we creating the document again here when it's already constructed above at the start of the function?
This function is a generic implementation to create code actions at both source and line level. The document created above can only be used to create source level code actions as it represents either a Python file or an entire Notebook.
Here, we need a document which represents a Python file or an individual cell because we need to get the line level code action. Here, we already have the diagnostics information which is why this works as otherwise we would've been running Ruff on an individual cell.
There was a problem hiding this comment.
Do you mind adding something to this effect as a comment?
There was a problem hiding this comment.
I've simplified this part such that each branch uses the document through our custom abstraction or the one from pygls.
| def _result_to_edits( | ||
| document: workspace.TextDocument, | ||
| result: RunResult | None, | ||
| ) -> list[TextEdit]: | ||
| def _result_to_workspace_edit( | ||
| document: Document, result: RunResult | None | ||
| ) -> WorkspaceEdit | None: | ||
| """Converts a run result to a WorkspaceEdit.""" |
There was a problem hiding this comment.
Source level code actions should apply the fix for every cell in a Notebook. Earlier the API was to create a single text edit to replace the entire content of a text document with the fixed source, create a workspace edit corresponding the text edit and send it back. This behavior is preserved when the document kind is Python.
But, for a Notebook, we need to create text edits for each cell where the fix exists which is why this was changed.
ruff_lsp/server.py
Outdated
| if new_source.endswith("\r\n"): | ||
| new_source = new_source[:-2] | ||
| elif new_source.endswith("\n"): | ||
| new_source = new_source[:-1] |
There was a problem hiding this comment.
This is more of a TODO for me to check if this is actually required now. Ruff adds and removes the newline at the end of every cell but I need to verify that behavior. I'm not sure what kind of example can be used here. I might have to check with a Ruff version when Jupyter Notebook support was not present.
6ecdd97 to
8cbb83d
Compare
8cbb83d to
e062625
Compare
e062625 to
33a2738
Compare
MichaReiser
left a comment
There was a problem hiding this comment.
LGTM. Would love to get a second look at this because I'm unfamiliar with our Python LSP implementation
| cells: list[CodeCell | MarkdownCell] | ||
|
|
||
|
|
||
| def _create_notebook_json(notebook_document: NotebookDocument) -> str: |
There was a problem hiding this comment.
Oh wow, this feels kind of expensive. Do we run this on every keystroke or only on save? How does it perform for large notebooks?
There was a problem hiding this comment.
Good questions. What about this function feels expensive? I guess the json.dumps part?
Do we run this on every keystroke or only on save?
It depends on user settings but it could run on every keystroke when "run": "onType".
How does it perform for large notebooks?
I'll test it out.
We could potentially create our own layer of notebook synchronization but it feels not worth the effort if we're going to move the LSP to Rust.
Context: openlawlibrary/pygls#394
| # Publish diagnostics for every code cell, replacing the previous diagnostics. | ||
| for cell_idx, cell in enumerate(notebook_document.cells): | ||
| if cell.kind is not NotebookCellKind.Code: | ||
| continue |
There was a problem hiding this comment.
Might be worth adding this explanation to the inline comment
ruff_lsp/server.py
Outdated
|
|
||
| def is_notebook_file(self) -> bool: | ||
| """Return True if the document belongs to a Notebook or a cell in a Notebook.""" | ||
| return self.kind is DocumentKind.Notebook or self.path.endswith(".ipynb") |
There was a problem hiding this comment.
Interesting, so the latter case is a cell? We use DocumentKind.Text?
There was a problem hiding this comment.
Yes, although I'm thinking of adding it's own variant (#264 (comment))
ruff_lsp/server.py
Outdated
| document = Document.from_text_document( | ||
| LSP_SERVER.workspace.get_text_document(params.text_document.uri) | ||
| ) |
There was a problem hiding this comment.
Do you mind adding something to this effect as a comment?
| document = Document.from_uri(uri) | ||
| workspace_edit = await _fix_document_impl(document) | ||
| if workspace_edit is None: | ||
| return |
There was a problem hiding this comment.
Is this a behavior change? Like does this action now not show up in some cases?
There was a problem hiding this comment.
Is this a behavior change?
No, it's not.
Like does this action now not show up in some cases?
No, this action is registered and always shown on the client side using the VSCode extension. When invoked through VSCode, the code executes this command registered in the LSP.
|
|
||
|
|
||
| async def _run_format_on_document(document: workspace.TextDocument) -> RunResult | None: | ||
| async def _run_format_on_document(document: Document) -> RunResult | None: |
There was a problem hiding this comment.
Wow, so this works interchangeably with Python files and notebooks now?
There was a problem hiding this comment.
Yes, the Document object creates an abstraction to store all the information required in the format (specifically the source) it is required.
| workspace_edit = _result_to_workspace_edit(document, results) | ||
| if workspace_edit is None: | ||
| return | ||
| LSP_SERVER.apply_edit(workspace_edit, "Ruff: Format document") |
There was a problem hiding this comment.
Can you walk me through what happens if we receive a notebook cell here? How does that interact with _run_format_on_document?
There was a problem hiding this comment.
The applyFormat command works at the Notebook level.
For a Notebook the arguments is going to contain a single entry which will be a TextDocument for the current cell (active cell or the cell containing the cursor). We'll construct a Document which represents either a Python file or a Notebook.
After the formatting is done, the _result_to_workspace_edit will generate a single TextDocumentEdit for the Python file or a list of TextDocumentEdit for each cell in a Notebook. These are then a part of WorkspaceEdit which are then applied.
|
Thanks for the reviews! I just realized while playing around with the There are 2 ways to go about solving this:
(Sorry for the churn, I must've missed this in my earlier testing and I wasn't aware of the |
33a2738 to
8e0b58a
Compare
ruff_lsp/server.py
Outdated
| def is_notebook_cell(self) -> bool: | ||
| """Return True if the document belongs to a cell in a Notebook.""" | ||
| return self.kind is DocumentKind.Text and self.path.endswith(".ipynb") |
There was a problem hiding this comment.
This is similar to is_notebook_file. We represent a cell as Text because that's what pygls does as well using TextDocument.
| workspace_edit = await _fix_document_impl( | ||
| Document.from_uri(params.text_document.uri), only="I001" | ||
| ) | ||
| if workspace_edit: |
There was a problem hiding this comment.
The Document creation is done only when required as it's eager to read the source text or construct the JSON notebook format.
| workspace_edit = _result_to_workspace_edit(document, results) | ||
| if workspace_edit is None: | ||
| return | ||
| LSP_SERVER.apply_edit(workspace_edit, "Ruff: Format document") |
There was a problem hiding this comment.
The applyFormat command works at the Notebook level.
For a Notebook the arguments is going to contain a single entry which will be a TextDocument for the current cell (active cell or the cell containing the cursor). We'll construct a Document which represents either a Python file or a Notebook.
After the formatting is done, the _result_to_workspace_edit will generate a single TextDocumentEdit for the Python file or a list of TextDocumentEdit for each cell in a Notebook. These are then a part of WorkspaceEdit which are then applied.
ruff_lsp/server.py
Outdated
| class DocumentSource(NamedTuple): | ||
| """The source of a document.""" | ||
|
|
||
| path: str | ||
| """The path to the document.""" | ||
|
|
||
| text: str | ||
| """The text of the document.""" |
There was a problem hiding this comment.
This is so that we can pass a different source to Ruff than the one in the document. For example, in Format Cell document, we'll pass the notebook representation containing a single cell and then compare against the original source. Refer to format_document function.
DemoFormatting
notebook_formatting.movCode Actions
notebook_code_actions.movDiagnostics
diagnostics_on_type.mov
diagnostics_on_save.movHovernotebook_hover.mov |
8e0b58a to
56a8a5a
Compare
|
(@dhruvmanila - FYI looks like a CI failure here.) |
56a8a5a to
2b4d6ab
Compare
Sorry, it's fixed. |
charliermarsh
left a comment
There was a problem hiding this comment.
Thanks for the great work, and the thorough testing -- gives me a lot of confidence.
|
LGTM |
09587d4 to
244a246
Compare
| Cell = enum.auto() | ||
| """A cell in a Notebook Document.""" |
There was a problem hiding this comment.
I had to introduce the cell specific document kind now that they're being used in multiple places.
| @classmethod | ||
| def from_cell_or_text_uri(cls, uri: str) -> Self: | ||
| """Create a `Document` representing either a Python file or a Notebook cell from | ||
| the given URI. | ||
|
|
||
| The function will try to get the Notebook cell first, and if there's no cell | ||
| with the given URI, it will fallback to the text document. | ||
| """ |
There was a problem hiding this comment.
Not sure if there's any better name for this.
| workspace_edit = await _fix_document_impl( | ||
| Document.from_cell_or_text_uri(params.text_document.uri), | ||
| only="I001", | ||
| ) | ||
| if workspace_edit: |
There was a problem hiding this comment.
Code actions now work at cell level for a Notebook. This is consistent with everything else in the VS Code ecosystem and a user can use builtin commands (Ruff:) to run an action on the entire notebook.
| def _result_single_cell_notebook_to_edits( | ||
| document: Document, result: RunResult | ||
| ) -> list[TextEdit] | None: | ||
| """Converts a run result to a list of TextEdits. | ||
|
|
||
| if new_source == document.source: | ||
| The result is expected to be a single cell Notebook Document. | ||
| """ |
There was a problem hiding this comment.
This function exists because for formatting provider, we need to return a list of TextEdit while for the command we need to apply a WorkspaceEdit.
|
(@charliermarsh sorry for asking a review after approving, this is the final commit :)) I've pushed it as a separate commit to facilitate the review process: 244a246 Updates since the last reviewThe major update is that now the code actions work at cell level. Earlier, they were applied at the Notebook level. My thinking before was that source level code actions should be applied on the entire Notebook but it doesn't work well with VS Code settings, specifically the following: {
// Enable formatting the entire Notebook on save
"notebook.formatOnSave.enabled": true,
// Run the enabled code actions on the entire Notebook on save
"notebook.codeActionsOnSave": {
"source.fixAll": true,
"source.organizeImports.ruff": true
},
}This also means that the generic |
|
I'll review and approve this today, but before doing so I'll also check it out and poke around locally. |
charliermarsh
left a comment
There was a problem hiding this comment.
Excellent, tested this out myself too.

Summary
This PR adds support for Jupyter Notebook. It requires client support for LSP 3.17 which contains the Notebook support.
Implementation
Context
Document: LSP type representing a text file (Python file for Ruff).TextDocument:pyglsrepresentation of the LSPDocument. This is an abstraction created from aDocumentwhich provides some useful methods like getting the file path, source code, etc.NotebookDocumenttype was added to represent a Notebook which consists of a list of cells (NotebookCell). Note that these are all LSP types coming fromlsprotocol.pygls, a Notebook cell is represented as a text document (TextDocument).There are methods provided by
pyglsto get the object:get_text_document- Returns aTextDocumentwhich either represents a Python file or a Notebook cellget_notebook_document- Returns aNotebookDocumenteither using the Notebook URI or a cell URI. For cell URI, it returns theNotebookDocumentcontaining the cell.Document
A new
Documenttype was created to facilitate the implementation. This represents either a Python file, a Notebook or a Notebook cell. There are various constructor methods which should be used to create this type:from_uriorfrom_text_document.from_uriorfrom_notebook_document.from_cell_or_text_uriorfrom_notebook_cell.Notebook JSON
Ruff expects the source content of a Notebook file to be in JSON format following the [Notebook format specification] but the protocol uses it's own abstraction and doesn't store the JSON format. This means that we need to create a JSON string representing the Notebook from the available information. This doesn't need all the information as Ruff only uses the cell source and version information. So, we create a minimal JSON string representing the Notebook document and pass it to Ruff.
An example JSON string representing a Notebook Document:
{ "metadata": {}, "nbformat": 4, "nbformat_minor": 5, "cells": [ { "cell_type": "code", "metadata": null, "outputs": [], "source": "import random\nimport math" }, { "cell_type": "code", "metadata": null, "outputs": [], "source": "try:\n print()\nexcept ValueError:\n pass" }, { "cell_type": "code", "metadata": null, "outputs": [], "source": "import random\nimport pprint\n\nrandom.randint(10, 20)" }, { "cell_type": "code", "metadata": null, "outputs": [], "source": "foo = 1\nif foo is 1:\n msg = f\"Invalid foo: {foo}\"\n raise ValueError(msg)" } ] }We need to pass in every cell including the markdown cell to get an accurate information like the cell number.
For the cell document kind, the source value is a JSON string containing just a single code cell. This is required as code actions and formatting work at both cell and notebook level.
Configuration
For VSCode users, the
notebook.*configuration is used to run the formatter or code actions on save:{ // Enable formatting the entire Notebook on save "notebook.formatOnSave.enabled": true, // Run the enabled code actions on the entire Notebook on save "notebook.codeActionsOnSave": { "source.fixAll": true, "source.organizeImports.ruff": true }, }The way the above settings work in VSCode is that the editor runs the actions in parallel for every cell. This has the illusion that it was run on the entire Notebook. The commands defined by us (
Ruff: Organize importsandRuff: Fix all auto-fixable problems) are run on the entire Notebook at once. This is important because in the latter case theruffcommand is invokednnumber of times wherenis the number of cells while for the former it's run only once.Commands
Builtin
Ruff: Organize Imports: Works at Notebook levelRuff: Fix all auto-fixable problems: Works at Notebook levelVSCode specifc
Format Cell: Formats the current cellNotebook: Format Notebook: Formats the entire Notebook by running the formatter for every cellOrganize Imports: Runs thesource.organizeImportscode action on every cell in parallelFix All: Runs thesource.fixAllcode action on every cell in parallelFeature checklist
ruff.applyAutofixruff.applyOrganizeImportsruff.applyFormatTest Plan
Manually testing for all the features mentioned above.
How to run this locally?
git checkout dhruv/notebookin theruff-lsprepositoryruff-vscode, uninstallruff-lsp(pip uninstall --yes ruff-lsp) as we'd want to use the local version. To install the localruff-lspversion inruff-vscode, follow Modifying the LSP.ruff-vscodedirectory -> "Run and Debug" section from the sidebar -> "Debug Extension and Python" config.This will then open a VSCode development session which can be used to test out the notebook features.
Test notebooks:
Requires: astral-sh/ruff#7664 which was released in
v0.1.0fixes: #267
closes: astral-sh/ruff-vscode#256
closes: astral-sh/ruff-vscode#314
closes: astral-sh/ruff-vscode#51