feat(rules): warn on OVER ()#21
Conversation
Pawansingh3889
left a comment
There was a problem hiding this comment.
Thanks Prabhu. Template's filled and CI's clean. A few things to address before merge.
1. Rebase needed. Tonight's v0.5.0 release (#24) added four T-SQL rules (T001-T004) and a W002/W006 regex tweak, so ALL_RULES count and the test_rules.py registry tests have moved. Your branch is still at the pre-v0.5.0 state, so there will be conflicts on sql_guard/rules/__init__.py and tests/test_rules.py.
2. Rule ID: W013, not W014. Current highest W is W012 (group-by-ordinal), so W014 leaves a gap. Note that issue #9 suggests the S-prefix (S004) on the grounds that this is a structural pattern rather than a style warning. Either W013 or S004 works, pick one.
3. Semantic gap with issue #9. The issue body asks for "warn on OVER () without PARTITION BY" and the only "should fail" example is ROW_NUMBER() OVER (). Your implementation is looser: it passes anything with ORDER BY inside, so OVER (ORDER BY event_time) with no PARTITION BY slips through even though it still computes over the whole result set. Please either tighten the rule to match the issue (require PARTITION BY), or confirm in a comment that we are intentionally widening the scope, in which case the rule name should move away from window-missing-order-partition to something like empty-over-clause to match what it actually detects.
4. Regex and additional tests. The [^)]* inside _valid_pattern stops at the first ), so nested parens inside OVER could cause edge-case behaviour. Two tests worth adding to pin down the behaviour: verify OVER (PARTITION BY user_id) alone passes, and verify OVER with a nested function call but no ORDER / PARTITION fires. A third test on OVER (ORDER BY event_time) alone depends on how you resolve item 3.
5. Minor cleanups.
tests/test_python_scanner.pyhas an unrelated blank-line deletion betweentest_extract_captures_variable_assignmentandtest_invalid_python_returns_empty. Please revert.sql_guard/rules/warnings.pyis missing a trailing newline after the new class.
Happy to re-review once these are in.
fd9b4cf to
ea832ac
Compare
|
Thanks Prabhu, the W014 → W013 change came through cleanly, the rule now correctly fires on any OVER without PARTITION BY (matching the issue), and the unrelated test_python_scanner.py change is reverted. CI is green across all four Python versions. Good iteration. Two small things still pending and one mechanical one before I can merge:
Steps: git fetch origin Once green I'll merge same day. Thanks again for the careful work on the rule semantics — the rename and the tightening are exactly right. |
- E007 alter-add-not-null-no-default: forces table rewrite under Sch-M lock - E008 drop-column: irreversible, breaks replication subscribers and rollback - W017 leading-wildcard-like: LIKE '%foo' defeats B-tree indexes - W018 or-across-columns: WHERE a=1 OR b=2 defeats single-column indexes - W020 truncate-table: bypasses DELETE triggers and resets identity - T005 create-index-without-online: builds without WITH (ONLINE = ON) lock readers
Same-line and disable-next-line forms silence rules per occurrence so known false positives don't need a project-wide disable. Recognises '-- sql-guard: disable=W001,W003' (SQL) and '# sql-guard: disable=...' (Python). Bare 'disable' with no rule list silences every rule on the line, mirroring '# noqa' from flake8/ruff.
Loader walks up from cwd looking for .sql-guard.yml (or .sql-guard.yaml). Recognises disable, ignore, include_python, severity. Returns an empty Config when no file is present so callers don't have to None-check. CLI flags merge with and override config values. PyYAML promoted from optional to a runtime dependency.
Calls git diff --name-only against the working tree (or against a base ref via --changed-base) and intersects with the discovered file list. Falls back to a full scan with a warning when the cwd isn't a git repo or git is unavailable. Speeds up pre-commit hooks on large codebases.
Builds a SARIF document compatible with the github/codeql-action/upload-sarif action so findings render inline on PRs in the GitHub Files Changed view. The full rule catalogue ships in the tool descriptor; results carry rule id, level, file, line, and the suggestion appended to the message text. Lines below 1 are clamped to 1 to satisfy the schema.
- --config / -c PATH overrides automatic .sql-guard.yml discovery - --changed-only filters discovered files via git diff (with optional --changed-base REF). Empty diff exits cleanly without a 'no issues' scan banner. - --format sarif emits a SARIF 2.1.0 doc to stdout, or to a file via --output / -o. - --disable rule IDs are uppercased so 'w001' and 'W001' both work. - Config file disable / ignore / include_python merge with CLI flags.
Key Numbers updated to 37 rules / 143 tests / v0.6.0. Rule tables grow to include E007, E008, W017, W018, W020, T005. New Configuration sections cover .sql-guard.yml, inline disable comments, --changed-only, and SARIF output (with a copy-paste GitHub Actions snippet).
CHANGELOG covers the 6 new rules, inline disable directives, .sql-guard.yml config, --changed-only flag, SARIF reporter, and the PyYAML dependency promotion. P005 / W016 entries that were sitting in [Unreleased] are folded into the 0.6.0 release notes.
v0.6.0 declares pyyaml as a runtime dependency. micropip's resolver
sometimes fails to install it cleanly because the only PyYAML build
that works under Pyodide is the bundled one served from
pyodide-cdn.s3, not the PyPI sdist. Pre-loading via
loadPackage(['micropip', 'pyyaml']) takes the bundled emscripten
build directly and unblocks the subsequent micropip.install('sql-sop').
Also reordered: install sqlparse first (so the sql-sop resolve doesn't
fan out and re-download it as a transitive dep) and dropped the
trailing standalone sqlparse install since it now happens earlier in
the chain.
Status messages updated to surface the new two-stage init.
The catch-block error message used '\'t' (escaped backslash + apostrophe)
inside a single-quoted string literal. JS parsed the backslash as a
literal backslash, then took the apostrophe as the string terminator,
then choked on 't start the Python runtime' as an unexpected identifier.
Devtools showed it as: Uncaught SyntaxError: Unexpected identifier 't'
at sql-guard/:125
A SyntaxError on any line of an inline script kills the whole script
before any of it executes. So boot() never ran and the page sat at the
initial HTML strings ('Loading Python runtime...', 'Waiting for Pyodide
to load...') forever, regardless of CDN state, regardless of what
fixes we'd pushed elsewhere in the file.
Replace the contraction with 'Could not' to sidestep the apostrophe
entirely. Less fragile than re-escaping.
pypistats.org/packages/sql-sop confirms 530 downloads in the last 30 days as of 2026-04-26 (vs 222 in the last 7 days, 2.6k all-time per pepy.tech). The 195+ figure was several months stale; rounding down to 500+ stays conservative and verifiable.
Resolves Pawansingh3889#7. (Issue title says W017 but W017 is now leading-wildcard-like in v0.6.0; W019 is the next free ID before W020.) Warns on COUNT(DISTINCT col) without a WHERE, GROUP BY, or LIMIT in the same statement -- forces a full sort + distinct pass and is a frequent perf surprise on prod. - 'SELECT COUNT(DISTINCT user_id) FROM events;' -> fires - 'SELECT COUNT(DISTINCT user_id) FROM events WHERE event_date >= ...' -> passes - 'SELECT tenant_id, COUNT(DISTINCT user_id) FROM events GROUP BY tenant_id;' -> passes - 'SELECT COUNT(*) FROM events;' -> not flagged Updated: - sql_guard/rules/warnings.py: new CountDistinctUnbounded rule. - sql_guard/rules/__init__.py: registered in ALL_RULES. - tests/test_new_rules.py: 6 new test cases covering hit, WHERE, GROUP BY, LIMIT, whitespace handling, and plain COUNT non-firing. - tests/test_rules.py: bumped expected ALL_RULES count 32->33 and warning count 22->23. - CHANGELOG.md: Unreleased entry. Co-authored-by: Matt Van Horn <[email protected]>
Single-rule patch release. W019 lands count-distinct-unbounded from PR Pawansingh3889#29 (mvanhorn). Bypass list recognises WHERE / GROUP BY / LIMIT plus T-SQL TOP / FETCH FIRST / FETCH NEXT. Surface counts refreshed: 38 rules (33 SQL + 5 Python), 149 tests. Trusted Publishing will fire on the v0.6.1 tag push.
|
Requested changes are done. |
|
CI's flagging two things, only one of which is yours. The error: Suggested entry to drop under
The two rev-pin warnings ( Trailing newline check on warnings.py: I'll confirm on the diff once CI is green. Push the CHANGELOG line and ping me. |
|
Changes are done. |
|
Thanks Prabhu! 🙌 The CHANGELOG entry is perfect and all 6 checks are green. Nice clean rebase across the five files. One tiny thing I caught when I checked locally: If you've got 30 seconds, easiest is for you to add the newline and force-push. Otherwise I can push the one-byte fix as a maintainer, completely fine if you'd rather. Just let me know. Either way we're merging this today 🚀. Genuinely appreciate the patience across three review rounds, the rule is clean and the test coverage is exactly right. |
|
new trailing line added to warning.py. |
|
Confirmed locally, last byte of warnings.py is Brilliant work Prabhu. Nice rule, careful semantics, clean rebases through three rounds. The OVER () pattern was a real gap in the catalogue. 🎉 Looking forward to the next one. |
Single-rule patch release. W013 window-without-partition lands from PR #21 (@Prabhu-1409). Dialect-aware messaging across Postgres and Redshift, three review rounds, well-tested. Repository scaffolding shipped alongside (not user-facing in the package, but visible to contributors): - ROADMAP.md with v0.7 Performance Rules Pack milestone scope - README Contributors section (currently @tmchow, @mvanhorn, @Prabhu-1409) - scripts/scaffold_rule.py for new-rule boilerplate 39 rules total (34 SQL + 5 Python), 152 tests. Trusted Publishing fires on the v0.6.2 tag push.
What changed
Added a new warning rule W014 to detect window functions using OVER() without ORDER BY or PARTITION BY. This helps flag potentially non-deterministic queries and improves readability of SQL.
Related issue
Closes #9
If this adds or changes a rule
E00N/W0NN/S00N/P00N)sql_guard/rules/__init__.pytests/fixtures/errors.sqlorwarnings.sqltests/test_rules.pytmp_path)GOVERNANCE.md)Tests
pytest -qpassesruff check .passesChecklist
feat:,fix:,docs:,chore:,ci:,test:,style:)CHANGELOG.md[Unreleased]entry if user-facingAnything else reviewers should know