Skip to content

Report the end/stop location / width of violations so that violations can be accurately highlighted in a text editor #4651

@james-johnston-thumbtack

Description

Search before asking

  • I searched the issues and found no similar issues.

Description

Currently, SQLFluff only supports showing the starting location of a violation. This means that anyone using the output of linting is unable to know the length / end location of the violation.

This is a limitation that is workable for a CLI tool, although even there it places limitations on what could potentially be rendered on the output. However, it becomes really bad looking for IDE plugins. I will use the plugin from Anbora labs for IntelliJ as an example: https://github.com/anboralabs/sqlfluff-intellij-community

Because the plugin doesn't know the end location of the violation, it has to make some assumption. In this case, it assumes it's just one character wide. This results in really goofy looking inspections like this in my IDE text editor (note the rule is a custom one I am making):

Screenshot 2023-03-31 at 9 41 56 PM

Notice that only the t is yellow highlighted, because the plugin doesn't know how long the violation was. Ideally, it would yellow highlight the entire test_column_three token in the IDE, since that is the portion of the statement my rule considers to be in error. That is how the builtin IDE inspections from JetBrains work.

Other linters do provide this information. For example, eslint provides a json output formatter that includes end positions: https://eslint.org/docs/latest/use/formatters/#json

Example eslint output from the documentation, which includes endLine and endColumn:

[
  {
    "filePath": "/var/lib/jenkins/workspace/Releases/eslint Release/eslint/fullOfProblems.js",
    "messages": [
      {
        "ruleId": "no-unused-vars",
        "severity": 2,
        "message": "'addOne' is defined but never used.",
        "line": 1,
        "column": 10,
        "nodeType": "Identifier",
        "messageId": "unusedVar",
        "endLine": 1,                    <------ note the end location included here!
        "endColumn": 16
      },

This is probably a major effort since all the rules would need to be updated to support this. Perhaps it could be introduced in a backwards-compatible way, where SQLFluff itself just assumes the violation is 1 character wide (or alternatively the entire starting anchor) unless the rule has been updated to provide an end location.

I am only just now learning how to write my own rule, and have never done something similar with other linters. So my perspective is limited and perhaps a different design would be better. But to try to give some initial thought to the issue: intuitively I think what would make sense is if the rule can provide an optional end_anchor in the LintResult, in addition to the anchor already given. When provided, it would be the inclusive location of where the end of the violation was. Possibly one sensible default value for rules that don't yet provide one might be to assume a default of end_anchor equal to the existing starting anchor. It might even be the right thing for some rules: for the initial rules I wrote it coincidentally would work...

An open question might be which child nodes, if any, are also included when the end_anchor is a parent node. I kind of think the answer might be "all of them" so that I don't need to traverse the tree to find the final leaf node if I just want to highlight the entire tree branch. For example, in my case, I just want to highlight the entire column_reference and have no interest otherwise in traversing it further. If someone wanted to only highlight up to a specific child node then they could just return that specific child node. Alternatively, you could be requiring users to provide the final leaf node in the tree and do that traversal themselves.

Another question might be what if the user wants to only highlight up to a specific location within the raw text of an individual tree node. However, already the starting anchor does not support returning a specific start location within the starting anchor, so I think it's orthogonal to this issue.

Use case

Ultimately I would like SQLFluff to provide a way for text editor plugins to accurately highlight the entire violation, from starting location to ending location.

I think the best way to do that will be if the JSON output of SQLFluff can be updated to include end location information, similar to ESLint.

Dialect

General improvements to SQLFluff rule API

Are you willing to work on and submit a PR to address the issue?

  • Yes I am willing to submit a PR!

Code of Conduct

Metadata

Metadata

Labels

enhancementNew feature or request

Type

No type
No fields configured for issues without a type.

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions