Skip to content

feat(rules): add W012 group-by-ordinal#14

Merged
Pawansingh3889 merged 2 commits intoPawansingh3889:mainfrom
mvanhorn:feat/w012-group-by-ordinal
Apr 19, 2026
Merged

feat(rules): add W012 group-by-ordinal#14
Pawansingh3889 merged 2 commits intoPawansingh3889:mainfrom
mvanhorn:feat/w012-group-by-ordinal

Conversation

@mvanhorn
Copy link
Copy Markdown
Contributor

Summary

Implements W012 (`group-by-ordinal`) per #2. Warns on `GROUP BY` that groups by positional column ordinal -- the grouping silently breaks the moment someone reorders the SELECT list.

Should fail

```sql
SELECT region, status, COUNT(*)
FROM orders
GROUP BY 1, 2;
```

Should pass

```sql
SELECT region, status, COUNT(*)
FROM orders
GROUP BY region, status;
```

Implementation notes

  • Single rule in `sql_guard/rules/warnings.py` wired into the registry.
  • Severity: `warning` -- matches other "brittle but not broken" patterns (W001 `SELECT *`, W011 `UNION`).
  • Regex: `GROUP\s+BY\s+\d+\b(\s*,\s*\d+\b)*`.
  • Digit-prefixed identifiers like `1st_quarter` are safe: the trailing `\b` only matches if `\d+` is a pure integer token, so `GROUP BY 1st_quarter` does not fire. A dedicated test locks this in.
  • Multiline rule, so it works for the common format where `GROUP BY` lives on its own line.

Test plan

  • New fixture case in `tests/fixtures/warnings.sql` triggers W012
  • `tests/test_rules.py` adds 4 unit cases (single ordinal, multiple ordinals, explicit names pass, digit-prefixed identifier passes)
  • Registry counts updated (`ALL_RULES` goes 20 -> 21, warnings 14 -> 15)
  • Full suite: 64 passed, 1 skipped. The one pre-existing failure (`test_structural.py::test_deep_nesting_detected`) reproduces on `main` and is unrelated.

Fixes #2

Copy link
Copy Markdown

@github-actions github-actions Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your first PR on sql-sop! Quick checks:

  • pytest -q and ruff check . pass locally
  • Every new rule has both a "fires on bad SQL" test AND a
    "does NOT fire on safe SQL" test
  • Rule IDs (E001, W001, …) are public API — don't renumber
    or rename existing ones without the deprecation process in
    GOVERNANCE.md
  • CHANGELOG.md [Unreleased] entry for any user-facing change

First-PR-wins on the linked issue applies for 7 days. The
maintainer will review within 14 days of your last response.

@Pawansingh3889
Copy link
Copy Markdown
Owner

@mvanhorn — really nice rule. The 1st_quarter test especially: that's the kind of edge case I'd have happily shipped a regression for, so thanks for thinking about it.

Heads up on something that's actually my fault: your branch carries the test_duration_tracked >= 0 fix and a CHANGELOG note about it, because PR #13 hasn't landed yet. I'll merge #13 this week, and once main moves, mind rebasing off it? This should drop those two commits cleanly:

git fetch upstream
git rebase --onto upstream/main afb9c01 feat/w012-group-by-ordinal
One small ask while you're rebasing: add an [Unreleased] line under ### Added for W012. The welcome bot flagged it in the checklist too.

Couple of nits, take or leave:

The description calls ordinals "non-portable" but Postgres/MySQL/BigQuery/Snowflake all accept them. The reorder-fragility framing in the message feels truer to the actual harm.
GROUP BY region, 1 slips through because \d+ anchors right after BY. Not a blocker — sql-sop is regex-based throughout — just flagging.
Ping me when the rebase is up.

@Pawansingh3889
Copy link
Copy Markdown
Owner

Heads up: main has moved on a bit since you opened this. PR #13 (the
duration test fix) and PR #15 (architecture cleanup) both landed today,
so your branch carries the duration commits as a side effect. If you
can rebase off the new main, those should drop out cleanly:

git fetch upstream
git rebase --onto upstream/main afb9c01 feat/w012-group-by-ordinal

One small ask while you're rebasing: add an [Unreleased] entry under

Added in CHANGELOG.md for W012. Heads up that #15 added a ### Changed

block right above ### Fixed in [Unreleased], so you may need to resolve
a tiny merge conflict there. Just put your ### Added W012 line at the
top of [Unreleased].

Two optional thoughts you can take or leave:

  • The description says "non-portable" but Postgres/MySQL/BigQuery/Snowflake
    all accept ordinals. The reorder-fragility framing in the message feels
    truer to the actual harm.
  • GROUP BY region, 1 slips through because \d+ anchors right after BY.
    Not a blocker, sql-sop is regex-based throughout, just flagging.

Ping me when the rebase is up.

New warning rule that flags GROUP BY by positional column ordinal:

  -- should trigger
  SELECT region, status, COUNT(*) FROM orders GROUP BY 1, 2;

  -- should pass
  SELECT region, status, COUNT(*) FROM orders GROUP BY region, status;

Why it is a warning, not an error: it parses in every dialect and the
result set is correct -- until someone reorders the SELECT list, at
which point the grouping silently changes. Flagging it as warning
matches how other 'brittle but not broken' patterns are handled.

Regex: GROUP\s+BY\s+\d+\b(\s*,\s*\d+\b)*
The trailing word boundary on \d+\b means identifiers like 1st_quarter
do not match -- '1' is followed by a word character, so the boundary
does not hold and no false positive fires. A dedicated test locks that
in.
Add [Unreleased] ### Added entry for W012 per maintainer request.

Reframe the rule description from "non-portable" to reorder-fragility.
Postgres/MySQL/BigQuery/Snowflake all accept ordinals, so portability
is not the real harm. The harm is that the grouping silently changes
when the SELECT list is reordered.
@mvanhorn mvanhorn force-pushed the feat/w012-group-by-ordinal branch from 9e20cd7 to 02235b1 Compare April 19, 2026 13:41
@mvanhorn
Copy link
Copy Markdown
Contributor Author

Rebased onto main in 02235b1. The PR #13 duration commits dropped out cleanly since my only new commit was the W012 rule.

Added the ### Added block at the top of [Unreleased] for W012. No conflict with the ### Changed block from #15, since I was adding above it.

Reframed the rule description: dropped "non-portable" and led with the reorder-fragility framing. You're right that Postgres/MySQL/BigQuery/Snowflake all accept ordinals, so portability isn't the real harm; the brittleness is.

On the GROUP BY region, 1 gap: yep, \d+\b(\s*,\s*\d+\b)* only fires when every element after BY is an ordinal. A mixed list like region, 1 slips through. Happy to extend the pattern (or add a second pass that scans each comma-separated element) if you want it covered in a follow-up.

Verified locally: pytest -q shows 64 passed, 1 skipped. The 5 W012 tests pass, ruff clean. The one failure (test_deep_nesting_detected) reproduces on main and is unrelated.

@Pawansingh3889 Pawansingh3889 mentioned this pull request Apr 19, 2026
11 tasks
Copy link
Copy Markdown
Owner

@Pawansingh3889 Pawansingh3889 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good. The reframing away from portability onto reorder-fragility reads cleanly in the CHANGELOG and the commit message, and the 1st_quarter test locks down the word-boundary reasoning. The mixed-list gap you flagged (GROUP BY region, 1) is real but a smaller second-pass problem, fine as a follow-up. Merging as squash.

@Pawansingh3889 Pawansingh3889 merged commit 76638a5 into Pawansingh3889:main Apr 19, 2026
5 checks passed
@mvanhorn
Copy link
Copy Markdown
Contributor Author

Appreciate W012 landing in the rule set.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

rule: W012 group-by-ordinal - warn on GROUP BY 1, 2 (non-portable)

2 participants