Skip to content
/ django Public

Comments

Fixed #36795 -- Enforced quoting of all database object names.#20587

Draft
charettes wants to merge 2 commits intodjango:mainfrom
charettes:ticket-36795
Draft

Fixed #36795 -- Enforced quoting of all database object names.#20587
charettes wants to merge 2 commits intodjango:mainfrom
charettes:ticket-36795

Conversation

@charettes
Copy link
Member

Trac ticket number

ticket-36795

Branch description

Currenly trying to learn more about why we started not quoting users probided aliases as 9c52d56 which introduced the change has very little details about its origin.

AI Assistance Disclosure (REQUIRED)

  • No AI tools were used in preparing this PR.
  • If AI tools were used, I have disclosed which ones, and fully reviewed and verified their output.

@charettes
Copy link
Member Author

This is promising.

So from my Git archeology session it seems that the behaviour was introduced to support backends that care about the case of aliases such as Postgres.

If we are to move forward with this patch it means that mixing RawSQL references with annotations will have to adjust their quoting adequately (e.g. annotate(bAr=...).filter(RawSQL("bar = %s", (...))) but I think that's a worthy compromise given the benefits its provides from an SQL injection protection perspective.

Running tests on Oracle now to see if this can help with ticket-20226

@charettes
Copy link
Member Author

charettes commented Jan 27, 2026

The Oracle failures were very telling and explain ticket-20226.

As Shai pointed out on the ticket we'd have to break backward compatiblity to change that so this ticket will continue to focus solely on the quoting aspect.

It's kind of weird that we decided to default to always upper casing on Oracle to deal with SQL92 delimited (quoted) names handling of case while we decided not to do the same thing on Postgres which also follows the same standard.

Copy link
Member Author

@charettes charettes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jacobtylerwalls let me know what you think of the approach so far. The commit history obvisouly needs some rework.

We'll want to keep disallowing some characters in users provided aliases until we adjust the quote_alias method of each backend to transform them (e.g. handle " as """, strip \x00) according to their respective docs) but the current patch should at least provide a fondation to ensure we don't have ever skip calling quote_name on aliases.

Copy link
Member

@jacobtylerwalls jacobtylerwalls left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This sparks a lot of joy. Makes total sense for a first step.

My understanding is that the rest of the plan is:

  • make quote_name_unless_alias() a no-op for connection.ops.quote_name() and deprecate it
  • document the need to possibly adjust casing of RawSQL expressions
  • [in follow ups:] begin undoing FORBIDDEN_ALIAS_PATTERN by auditing the quote_name() methods for each backend to see what needs special handling

Do you anticipate any issues with these .lower() comparisons?

def resolve_expression(
self, query=None, allow_joins=True, reuse=None, summarize=False, for_save=False
):
# Resolve parents fields used in raw SQL.
if query.model:
for parent in query.model._meta.all_parents:
for parent_field in parent._meta.local_fields:
if parent_field.column.lower() in self.sql.lower():
query.resolve_ref(
parent_field.name, allow_joins, reuse, summarize
)
break

return r
if (quoted := self.quote_cache.get(name)) is not None:
return quoted
quoted = self.connection.ops.quote_name(name)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I bet we could just make this a @cached_property now, and still seed it with * up front.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I could be missing someting but the since we cache is a value of the name I don't think a cached_property would work here?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, I guess it would have to be @functools.cache. Seemed like we're maintaining our own version of that. But it's not something we have to do now.

@charettes
Copy link
Member Author

charettes commented Feb 1, 2026

@jacobtylerwalls

My understanding is that the rest of the plan is:

Yep that would be the plan, I'll spend some time re-organizing the commits to do the following

  1. Introduce SQLCompiler.quote_name, switch everything to it, document the impact to raw interfaces (extra, RawSQL) in the release notes.
  2. Deprecate quote_name_unless_alias and document it.

Adjusting quote_name to handle more chars should effectively be done in a follow up. I was thinking we could hash the tail of aliases the moment they include disallowed characters (per backend rules) and that would allow pretty much any user specified aliases to be provided and hopefully halt this torrent of reports.

Hashing the moment a disallowed character is provided is safe even if it results in a different name as long as it's deterministic. We don't have to worry about whether it's a user provided alias or not as the moment it includes disallowed characters it means an object could not have been created with this name in the first place and thus it must be a user provided alias.

Do you anticipate any issues with these .lower() comparisons?

def resolve_expression(
self, query=None, allow_joins=True, reuse=None, summarize=False, for_save=False
):
# Resolve parents fields used in raw SQL.
if query.model:
for parent in query.model._meta.all_parents:
for parent_field in parent._meta.local_fields:
if parent_field.column.lower() in self.sql.lower():
query.resolve_ref(
parent_field.name, allow_joins, reuse, summarize
)
break

Not that I can think of as even if the name is quoted it will stil be idenifiable through a containment check. In other words

'quoted_alias' in 'UPPER(quoted_alias) will still pass with 'quoted_alias' in 'UPPER("quoted_alias").

This ensures all database identifiers are quoted independently of their orign
and most importantly that user provided aliases through annotate() and alias()
which paves the way for dropping the allow list of characters such aliases can
contain.

This will require adjustments to raw SQL interfaces such as RawSQL that might
make reference to ORM managed annotations as these will now be quoted.

The `SQLCompiler.quote_name_unless_alias` method is kept for now as an alias
for the newly introduced `.quote_name` method but will be duly deprecated in
a follow up commit.
…iases feature flag.

Now that user provided aliases are systematically quoted there is no need to
disallow the usage of the dollar sign on Postgres.
@charettes charettes changed the title Refs #36795 -- Always quote user-provided aliases Fixed #36795 -- Enforced quoting of all database object names. Feb 1, 2026
@github-actions
Copy link

github-actions bot commented Feb 1, 2026

📊 Coverage Report for Changed Files

-------------
Diff Coverage
Diff: origin/main...HEAD, staged and unstaged changes
-------------
django/db/backends/mysql/compiler.py (0.0%): Missing lines 31
django/db/models/expressions.py (100%)
django/db/models/sql/compiler.py (91.7%): Missing lines 1489
django/db/models/sql/datastructures.py (100%)
-------------
Total:   17 lines
Missing: 2 lines
Coverage: 88%
-------------


Note: Missing lines are warnings only. Some lines may not be covered by SQLite tests as they are database-specific.

For more information about code coverage on pull requests, see the contributing documentation.

@jacobtylerwalls
Copy link
Member

Not that I can think of as even if the name is quoted it will stil be idenifiable through a containment check

To flesh out my thought, based on your hint I was anticipating that a user would need to change:

.annotate(bAr=...).filter(foo=RawSQL("bar = %s", (...)))

to:

.annotate(bAr=...).filter(foo=RawSQL("bAr = %s", (...)))

And then we would hit this line, where self.sql is now

(Pdb) self.sql
'bAr = %s'

But still maps to the same self.sql.lower() as before, meaning the containment check could return True when we don't want it to return True, given that we're introducing case sensitivity by quoting aliases:

# sortorder is a field on parent model
# Raw query on the proxy model using a uniquely cased alias SORTORDER should not cause parent field resolving?
>>> Proxy.objects.annotate(SORTORDER=F("pk")).annotate(foo=RawSQL("SORTORDER = %s", (1,))).query
# str()
SELECT ..., "tiles"."tileid" AS "SORTORDER", (SORTORDER = 1) AS "foo" FROM "tiles"

This still resolves sortorder field on the parent model even though SORTORDER is now case-distinct. So just a little unnecessary resolving? Don't know if this is a real problem, just hunting around for edge cases.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants