Search before asking
Description
Currently, SQLFluff only supports showing the starting location of a violation. This means that anyone using the output of linting is unable to know the length / end location of the violation.
This is a limitation that is workable for a CLI tool, although even there it places limitations on what could potentially be rendered on the output. However, it becomes really bad looking for IDE plugins. I will use the plugin from Anbora labs for IntelliJ as an example: https://github.com/anboralabs/sqlfluff-intellij-community
Because the plugin doesn't know the end location of the violation, it has to make some assumption. In this case, it assumes it's just one character wide. This results in really goofy looking inspections like this in my IDE text editor (note the rule is a custom one I am making):

Notice that only the t is yellow highlighted, because the plugin doesn't know how long the violation was. Ideally, it would yellow highlight the entire test_column_three token in the IDE, since that is the portion of the statement my rule considers to be in error. That is how the builtin IDE inspections from JetBrains work.
Other linters do provide this information. For example, eslint provides a json output formatter that includes end positions: https://eslint.org/docs/latest/use/formatters/#json
Example eslint output from the documentation, which includes endLine and endColumn:
[
{
"filePath": "/var/lib/jenkins/workspace/Releases/eslint Release/eslint/fullOfProblems.js",
"messages": [
{
"ruleId": "no-unused-vars",
"severity": 2,
"message": "'addOne' is defined but never used.",
"line": 1,
"column": 10,
"nodeType": "Identifier",
"messageId": "unusedVar",
"endLine": 1, <------ note the end location included here!
"endColumn": 16
},
This is probably a major effort since all the rules would need to be updated to support this. Perhaps it could be introduced in a backwards-compatible way, where SQLFluff itself just assumes the violation is 1 character wide (or alternatively the entire starting anchor) unless the rule has been updated to provide an end location.
I am only just now learning how to write my own rule, and have never done something similar with other linters. So my perspective is limited and perhaps a different design would be better. But to try to give some initial thought to the issue: intuitively I think what would make sense is if the rule can provide an optional end_anchor in the LintResult, in addition to the anchor already given. When provided, it would be the inclusive location of where the end of the violation was. Possibly one sensible default value for rules that don't yet provide one might be to assume a default of end_anchor equal to the existing starting anchor. It might even be the right thing for some rules: for the initial rules I wrote it coincidentally would work...
An open question might be which child nodes, if any, are also included when the end_anchor is a parent node. I kind of think the answer might be "all of them" so that I don't need to traverse the tree to find the final leaf node if I just want to highlight the entire tree branch. For example, in my case, I just want to highlight the entire column_reference and have no interest otherwise in traversing it further. If someone wanted to only highlight up to a specific child node then they could just return that specific child node. Alternatively, you could be requiring users to provide the final leaf node in the tree and do that traversal themselves.
Another question might be what if the user wants to only highlight up to a specific location within the raw text of an individual tree node. However, already the starting anchor does not support returning a specific start location within the starting anchor, so I think it's orthogonal to this issue.
Use case
Ultimately I would like SQLFluff to provide a way for text editor plugins to accurately highlight the entire violation, from starting location to ending location.
I think the best way to do that will be if the JSON output of SQLFluff can be updated to include end location information, similar to ESLint.
Dialect
General improvements to SQLFluff rule API
Are you willing to work on and submit a PR to address the issue?
Code of Conduct
Search before asking
Description
Currently, SQLFluff only supports showing the starting location of a violation. This means that anyone using the output of linting is unable to know the length / end location of the violation.
This is a limitation that is workable for a CLI tool, although even there it places limitations on what could potentially be rendered on the output. However, it becomes really bad looking for IDE plugins. I will use the plugin from Anbora labs for IntelliJ as an example: https://github.com/anboralabs/sqlfluff-intellij-community
Because the plugin doesn't know the end location of the violation, it has to make some assumption. In this case, it assumes it's just one character wide. This results in really goofy looking inspections like this in my IDE text editor (note the rule is a custom one I am making):
Notice that only the
tis yellow highlighted, because the plugin doesn't know how long the violation was. Ideally, it would yellow highlight the entiretest_column_threetoken in the IDE, since that is the portion of the statement my rule considers to be in error. That is how the builtin IDE inspections from JetBrains work.Other linters do provide this information. For example, eslint provides a json output formatter that includes end positions: https://eslint.org/docs/latest/use/formatters/#json
Example eslint output from the documentation, which includes
endLineandendColumn:[ { "filePath": "/var/lib/jenkins/workspace/Releases/eslint Release/eslint/fullOfProblems.js", "messages": [ { "ruleId": "no-unused-vars", "severity": 2, "message": "'addOne' is defined but never used.", "line": 1, "column": 10, "nodeType": "Identifier", "messageId": "unusedVar", "endLine": 1, <------ note the end location included here! "endColumn": 16 },This is probably a major effort since all the rules would need to be updated to support this. Perhaps it could be introduced in a backwards-compatible way, where SQLFluff itself just assumes the violation is 1 character wide (or alternatively the entire starting anchor) unless the rule has been updated to provide an end location.
I am only just now learning how to write my own rule, and have never done something similar with other linters. So my perspective is limited and perhaps a different design would be better. But to try to give some initial thought to the issue: intuitively I think what would make sense is if the rule can provide an optional
end_anchorin theLintResult, in addition to theanchoralready given. When provided, it would be the inclusive location of where the end of the violation was. Possibly one sensible default value for rules that don't yet provide one might be to assume a default ofend_anchorequal to the existing startinganchor. It might even be the right thing for some rules: for the initial rules I wrote it coincidentally would work...An open question might be which child nodes, if any, are also included when the
end_anchoris a parent node. I kind of think the answer might be "all of them" so that I don't need to traverse the tree to find the final leaf node if I just want to highlight the entire tree branch. For example, in my case, I just want to highlight the entirecolumn_referenceand have no interest otherwise in traversing it further. If someone wanted to only highlight up to a specific child node then they could just return that specific child node. Alternatively, you could be requiring users to provide the final leaf node in the tree and do that traversal themselves.Another question might be what if the user wants to only highlight up to a specific location within the raw text of an individual tree node. However, already the starting anchor does not support returning a specific start location within the starting anchor, so I think it's orthogonal to this issue.
Use case
Ultimately I would like SQLFluff to provide a way for text editor plugins to accurately highlight the entire violation, from starting location to ending location.
I think the best way to do that will be if the JSON output of SQLFluff can be updated to include end location information, similar to ESLint.
Dialect
General improvements to SQLFluff rule API
Are you willing to work on and submit a PR to address the issue?
Code of Conduct