Skip to content

rule: W023 scalar-udf-in-where - warn on scalar UDF in WHERE clause #30

@Pawansingh3889

Description

@Pawansingh3889

What this rule should catch

T-SQL scalar UDFs in WHERE clauses force row-by-row evaluation, killing the optimiser's ability to use indexes or parallelise. A query that runs in milliseconds without the UDF can take minutes with it.

Bad SQL (should fire)

SELECT order_id, total
FROM orders
WHERE dbo.fn_IsHighValue(total) = 1;

The optimiser must call fn_IsHighValue(total) once per row. No index seek possible.

Good SQL (should not fire)

-- Inline the predicate
SELECT order_id, total
FROM orders
WHERE total > 1000;

-- Or use an inline TVF instead of a scalar UDF
SELECT o.order_id, o.total
FROM orders o
CROSS APPLY dbo.tvf_HighValueFlag(o.total) f
WHERE f.is_high = 1;

Suggested rule code

W023. W014 / W015 are claimed by issues #4 and #5. W016-W020 are merged. W021 / W022 are pending the rename of issues #3 and #8 (rule codes clashed with merged work).

Detection approach

Regex-first: match WHERE clause containing <schema>.<func_name>(...) where the function call is in predicate position. Scope to T-SQL initially via dialect detection or T-prefix rule code (could re-prefix as T006 if T-SQL-only). Postgres equivalent (VOLATILE functions in WHERE) is a v0.8 candidate.

Edge cases to handle

  1. Built-in T-SQL functions (LEN(), UPPER(), SUBSTRING(), CONVERT()) should NOT fire. These are built-ins, not scalar UDFs. Allowlist by name OR by lack of schema prefix (built-ins don't use dbo.).
  2. Function call in SELECT list should NOT fire. Only WHERE / HAVING / ON.
  3. EXISTS (... WHERE func(x) = 1) correlated subqueries should fire on the inner WHERE.

Reference implementation

See sql_guard/rules/warnings.py:CountDistinctUnbounded (W019) as the canonical multi-line warning rule. Mirror that shape: regex compiled at class init, multiline=True, check_statement returning a Finding.

Why this matters

This is one of the top-3 reasons production T-SQL queries inexplicably slow down at scale. Microsoft's Query Tuning Assistant and most T-SQL performance writing (Itzik Ben-Gan, Adam Machanic, Brent Ozar) flag scalar UDFs as a known anti-pattern. SQL Server 2019 added Scalar UDF Inlining but it's still off by default in many environments.

Acceptance criteria

  • New rule class registered in sql_guard/rules/__init__.py
  • Fixture line in tests/fixtures/warnings.sql
  • "Fires on bad SQL" test in tests/test_rules.py
  • "Does not fire on safe SQL" test (covering built-in functions in WHERE)
  • README rule table + Key Numbers count updated
  • CHANGELOG entry under [Unreleased] -> ### Added

Part of v0.7 — Performance Rules Pack. See ROADMAP.md for the full picture.

Metadata

Metadata

Assignees

No one assigned

    Labels

    Projects

    No projects

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions