Avoid omitting parentheses for trailing attributes on call expressions#6322
Avoid omitting parentheses for trailing attributes on call expressions#6322charliermarsh merged 1 commit intomainfrom
Conversation
| return false; | ||
| } | ||
| } | ||
| } |
There was a problem hiding this comment.
I'm guessing this is not the right solution, but it passes the new fixtures. Should this instead be in expr_compare.rs?
There was a problem hiding this comment.
The method here mirrors https://github.com/psf/black/blob/2fd9d8b339e1e2e1b93956c6d68b2b358b3fc29d/src/black/lines.py#L884-L966 I had a quick glance and don't seem to find this logic. But your change seems correct because the method returns false for the provided example. I recommend adding print statements to black's code base and then running the formatting on the code snipped to better understand what the underlying logic is.
There was a problem hiding this comment.
@konstin any chance this already gets fixed by your Call chain PR?
There was a problem hiding this comment.
i don't think so, they should be independent
PR Check ResultsBenchmarkLinuxWindows |
There was a problem hiding this comment.
This logic doesn't seem to be specific to compare operators. It also applies to boolean operators and, I assume for binary expressions too.
def f():
return (
unicodedata.normalize("NFKC", s1).casefold()
== unicodedata.normalize("NFKC", s2).casefold()
)
return (
unicodedata.normalize("NFKC", s1).casefold()
and unicodedata.normalize("NFKC", s2).casefold()
)
ct_match = (
aaaaaaaaaaact_id == self.get_content_type(obj=rel_obj, using=instance._state.db).id
)
ct_match = (
aaaaaaaaaaact_id and self.get_content_type(obj=rel_obj, using=instance._state.db).id
)
ct_match = aaaaaaaaaaact_id == xyzaaaaaa
ct_match = aaaaaaaaaaact_id and xyzaaaaaa
ct_match = {aaaaaaaaaaaaaaaa} == self.get_content_type(
obj=rel_obj, using=instance._state.db
).id
ct_match = {aaaaaaaaaaaaaaaa} and self.get_content_type(
obj=rel_obj, using=instance._state.db
).id
ct_match = (aaaaaaaaaaaaaaaa) == self.get_content_type(
obj=rel_obj, using=instance._state.db
).id
ct_match = (aaaaaaaaaaaaaaaa) and self.get_content_type(
obj=rel_obj, using=instance._state.db
).idWe need to spend more time understanding what black's doing to ensure we move closer. One way to assert this is to run the ecosystem checks on our selected projects and verify that either the similarity index increases or that diffing the Black/Ruff differences between the old and new Ruff versions introduces no regressions.
I recommend you to debug Black's can_omit_invisible_parens, delimiter_split, and rhs to better understand what happening. It can also be useful to debug the selected selected transformers.
rhs->in_parentheses_only_*helpers- delimiter` -> Normal groups
We can also hop on a call and do this together. It ca
| return false; | ||
| } | ||
| } | ||
| } |
There was a problem hiding this comment.
The method here mirrors https://github.com/psf/black/blob/2fd9d8b339e1e2e1b93956c6d68b2b358b3fc29d/src/black/lines.py#L884-L966 I had a quick glance and don't seem to find this logic. But your change seems correct because the method returns false for the provided example. I recommend adding print statements to black's code base and then running the formatting on the code snipped to better understand what the underlying logic is.
| return false; | ||
| } | ||
| } | ||
| } |
There was a problem hiding this comment.
@konstin any chance this already gets fixed by your Call chain PR?
|
Thank you! Will take a look at this again today and review the Black source. |
0bd76d0 to
4705850
Compare
|
I have deduced at least that our // Only use the layout if the first or last expression has parentheses of some sort.
let first_parenthesized = visitor
.first
.is_some_and(|first| has_parentheses(first, visitor.source));
let last_parenthesized = visitor
.last
.is_some_and(|last| has_parentheses(last, visitor.source));
first_parenthesized || last_parenthesized(Although in Black, this logic is split between a few branches, I think.) Continuing to dig in... |
I think ct_match = (
aaaaaaaaaaact_id == self.get_content_type(obj=rel_obj, using=instance._state.db).id
)rather than the attribute. This makes the above-mentioned path to return Doing something like this seems to help (do we need to do the same for |
|
Ohhh smart. I didn't realize that Black formatted these differently: ct_match = (
aaaaaaaaaaact_id == self.get_content_type(obj=rel_obj, using=instance._state.db).id
)
ct_match = aaaaaaaaaaact_id == self.get_content_type(
obj=rel_obj, using=instance._state.db
) |
|
It looks like on ct_match = (
{aaaaaaaaaaaaaaaa} == self.get_content_type[obj, rel_obj, using, instance._state.db].id
)Is that covered in the same place, or elsewhere? I feel like it's elsewhere, since it's not about removing the optional parentheses, but about which expression we break (left-vs-right). |
4705850 to
c6866c2
Compare
c6866c2 to
7d8262d
Compare
|
It seems like Black prefers splitting the expression on the RHS of a binary operator, but will split the left if it can't (playground), but hesitant to keep investigating this here because it seems separate and I know you've spent more time on binary operators. |
|
Is it specific to subscript operations? It would be good to understand in depth which expressions we break differently. We implement a similar "Prefer splitting parenthesized expressions before binary expression" logic internal discord. Or it is because our groups have the wrong hierarchy:
|
|
Not specific to subscripts, I'll look into the group hierarchy, thanks! |
|
(We do seem to have a "group sequence" as described above. Gonna keep looking into it, I need to understand this better -- but I do think it can be done separately from this PR.) |
|
Separate PR makes sense. Can you include the similarity index changes in your test plan |
|
Yes sir! |
|
Done, very small improvements in three projects. |
| attr: _, | ||
| ctx: _, | ||
| }) => { | ||
| self.visit_expr(value); |
There was a problem hiding this comment.
Can you test if we need the same for Subscript? Because it has the same pattern or is it not required because we never want to break subscripts?
There was a problem hiding this comment.
I did look into this but perhaps I didn't understand what you're referencing. Are you referring to a subscript with a trailing attribute, like foo[bar].baz? Or a subscript following a function call, like foo(bar)[baz]? I believe I added test coverage for both, and both are consistent with Black.
There was a problem hiding this comment.
This PR fixes
ct_match = aaaaaaaaaaact_id == self.get_content_type(
obj=rel_obj, using=instance._state.db
)Where the last item is a call chain. The fix was to correctly set last to the call because the callee is to the left of the call. This fixed has_parentheses and has_own_parentheses to return true.
Now, we have a similar situation with subscript where we visit the value to the left of the subscript. Meaning a[b] ends with the subscript and not the node a (the subscript should be last).
Now, whether it should apply that layout is a different question, but it currently seems inconsistent because has_parentheses returns true, but we don't set the subscript to the last node.
There was a problem hiding this comment.
Black output:
ct_match = (
self.get_content_type["obj=rel_obj, using=instance._state.db"] == aaaaaaaaaaact_id
)
ct_match = (
aaaaaaaaaaact_id == self.get_content_type["obj=rel_obj, using=instance._state.db"]
)So black prefers not to break subscripts. Can we remove the Subscript case from has_own_parentheses?
There was a problem hiding this comment.
Will investigate, thank you for clarifying.
There was a problem hiding this comment.
We can merge this in the meantime. Maybe update the title?
|
I'm realizing that this didn't actually fix the whole issue. Given def f():
return (
unicodedata.normalize("NFKC", s1).casefold()
== unicodedata.normalize("NFKC", s2).casefold()
)That's stable formatting for Black, but we do: def f():
return unicodedata.normalize("NFKC", s1).casefold() == unicodedata.normalize(
"NFKC", s2
).casefold() |
|
It looks like Black treats these differently depending on whether the function call contains any arguments. |
astral-sh#6322) ## Summary This PR modifies our `can_omit_optional_parentheses` rules to ensure that if we see a call followed by an attribute, we treat that as an attribute access rather than a splittable call expression. This in turn ensures that we wrap like: ```python ct_match = aaaaaaaaaaact_id == self.get_content_type( obj=rel_obj, using=instance._state.db ) ``` For calls, but: ```python ct_match = ( aaaaaaaaaaact_id == self.get_content_type(obj=rel_obj, using=instance._state.db).id ) ``` For calls with trailing attribute accesses. Closes astral-sh#6065. ## Test Plan Similarity index before: - `zulip`: 0.99436 - `django`: 0.99779 - `warehouse`: 0.99504 - `transformers`: 0.99403 - `cpython`: 0.75912 - `typeshed`: 0.72293 And after: - `zulip`: 0.99436 - `django`: 0.99780 - `warehouse`: 0.99504 - `transformers`: 0.99404 - `cpython`: 0.75913 - `typeshed`: 0.72293
Summary
This PR modifies our
can_omit_optional_parenthesesrules to ensure that if we see a call followed by an attribute, we treat that as an attribute access rather than a splittable call expression.This in turn ensures that we wrap like:
For calls, but:
For calls with trailing attribute accesses.
Closes #6065.
Test Plan
Similarity index before:
zulip: 0.99436django: 0.99779warehouse: 0.99504transformers: 0.99403cpython: 0.75912typeshed: 0.72293And after:
zulip: 0.99436django: 0.99780warehouse: 0.99504transformers: 0.99404cpython: 0.75913typeshed: 0.72293