Fix f-string formatting in assignment statement#14454
Conversation
4f24b97 to
9f968a5
Compare
|
|
This is looking good! |
This comment was marked as resolved.
This comment was marked as resolved.
b7aa489 to
c2d8b38
Compare
19e6a99 to
4c71b5d
Compare
c2d8b38 to
55574a9
Compare
There was a problem hiding this comment.
This overall looks good, but I think we're now using the BestFit layout in too many places, which leads to bad formatting in clause headers (see my inline comment).
I think we should only use the BestFit layout if the FString uses the flat layout. In all other cases, we should use the Multiline layout.
We should add some more tests that cover BestFit layout usages in non assignment positions to verify that we got the needs_parentheses changes correct.
See
and
| // This isn't decided yet, refer to the relevant discussion: | ||
| // https://github.com/astral-sh/ruff/discussions/9785 | ||
| else if StringLike::FString(self).is_multiline(context.source()) { | ||
| } else if StringLike::FString(self).is_multiline(context) { |
There was a problem hiding this comment.
I think using BestFit for all f-strings that contain multiline expressions isn't correct. For example. It results in
if f"aaaaaaaaaaa { ttttteeeeeeeeest} more {
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
}": passbeing formatted as
if f"aaaaaaaaaaa {ttttteeeeeeeeest} more {aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa}":
passWhich seems worse than before.
I think we should keep using Multiline if the f-string has any multiline expression to avoid collapsing them.
There was a problem hiding this comment.
I think the reason it was formatted like that previously is because the needs_parentheses would return OptionalParentheses::Never. If we would return Multiline then I think parentheses will be added making it:
if (
f"aaaaaaaaaaa {ttttteeeeeeeeest} more {
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
}"
):
passThere was a problem hiding this comment.
We can also decide to return Never. It's not entirely clear to me what the better and more consistent layout is. I would have to play with a few layouts but I don't think it should be what it is now.
Using Never has the advantage that it avoids unnecessary parentheses and is closer to what we had today (and no one complained?). Adding parentheses is similar to having "aaaaa" + tttttt + "more(aaaaaaaaaaaaaaaaaaaaaaaaaa). It might be good to play with the formatting in other positions where we use BestFit to decide what the best layout is
There was a problem hiding this comment.
Yeah, that's a good idea, I can do that.
There was a problem hiding this comment.
We should add a test for this
| if string.is_implicit_concatenated() || !string.is_multiline(context) { | ||
| return false; | ||
| } |
There was a problem hiding this comment.
Your change is an improvement to what we had before.
For example, this was correctly formatted code before
call(f"{
testeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee
}")But I don't think it's correct to use the "hug" layout if an inner expression is multiline
call(f"{
aaaaaa
+ '''test
more'''
}")There was a problem hiding this comment.
I'm not sure I follow here. Are you saying that both the code snippet should result in the formatting where the f-string is on the same line as the call expression (call(f"{)?
There was a problem hiding this comment.
This is not directly related to your PR, but I noticed it when reviewing it because you changed is_multiline. We have to make a decision on the idiomatic formatting for "hugging multiline strings" for f-strings.
I'm leaning towards that we should only "hug" if the f-string itself contains any multiline string literal, but not if any inner-expression is multiline. Similar to prettier:
call(
`${
aaaaaaaaaaaaaaaaaaaaaaaaaaaa +
`test more
aaa` +
morrrrrrrrrrrrrrrrrrrrrr
}`,
);So what I'm saying is that we should probably not hug for the both cases above.
There was a problem hiding this comment.
I suggest that we add an explicit is_triple_quoted check here. I don't think we ever want this to apply to any non-triple quoted strings. And let's add some test cases that cover f-strings too (cases where we don't want the layout to apply as well as cases where it should apply)
crates/ruff_python_formatter/resources/test/fixtures/ruff/expression/fstring.py
Show resolved
Hide resolved
03896f0 to
4ce0c42
Compare
There was a problem hiding this comment.
This looks good. I still think we should add more tests for f-strings in clause headers and assert statements. For example:
if (f"teaaaaaaaaaaaaaaaaaaaaaaaa{
expressioneeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee # comment
}aast".contains('a')
):
passWe now remove parentheses from:
aaaaaaaaaaaaaaaaaa = (
f"testeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee{[a, b,
# comment
]}moee" # comment
) but keep them for
aaaaaaaaaaaaaaaaaa = (
f"testeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee{[a, b,
]}moee"
# comment
) The second case is consistent with how we handle parentheses for all other non-fstring assignments.
But I think what you have now is consistent with how e.g. lists are formatted where the parentheses are removed as well
aaaaaaaaaaaaaaaaaa = (
[testeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee,
# comment
]
) | if is_f_string_formatting_enabled(context) { | ||
| // Expressions containing comments can't be joined. | ||
| // | ||
| // Format specifiers needs to be checked as well. For example, the | ||
| // following should be considered multiline because the literal | ||
| // part of the format specifier contains a newline at the end | ||
| // (`.3f\n`): | ||
| // | ||
| // ```py | ||
| // x = f"hello {a + b + c + d:.3f | ||
| // } world" | ||
| // ``` | ||
| context.comments().contains_comments(expression.into()) | ||
| || expression.format_spec.as_deref().is_some_and(|spec| { | ||
| contains_line_break_or_comments(&spec.elements, context) |
There was a problem hiding this comment.
Do we need to consider debug expressions too?
There was a problem hiding this comment.
I don't think so. If debug expressions are present, then there are no breakpoints in that f-string.
There was a problem hiding this comment.
But they could be multiline, no? The logic here isn't specific to the assignment formatting but is generally used to determine if a string is known to be multiline and it's unclear to me how a newline in a f-string literal is different from a newline in the debug expression
There was a problem hiding this comment.
Hmm, so something like the following which cannot be flattened:
aaaaaaaa = f"aaaaaaaaaa {
aaaaaaa + bbbbbbb + cccccccc = } dddddddddd"So, I think we should check if there's a debug expression and if so then check for line breaks in the expression itself.
There was a problem hiding this comment.
Yes, but I think we also have to test in the debug expression because the above has no line break in the expression, only in the debug expression but the whole f-string should be considered multiline (whether it can be flattened is irrelevant here because this is a general purpose method to determine if a string contains a hard line break)
| if string.is_implicit_concatenated() || !string.is_multiline(context) { | ||
| return false; | ||
| } |
There was a problem hiding this comment.
I suggest that we add an explicit is_triple_quoted check here. I don't think we ever want this to apply to any non-triple quoted strings. And let's add some test cases that cover f-strings too (cases where we don't want the layout to apply as well as cases where it should apply)
1e63fdc to
0587004
Compare
|
I've added a bunch of test cases for f-strings in various positions. The formatting is different between the one in assignment vs e.g., a So, the following: aaaaaa = f"testeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee{
expression}moreeeeeeeeeeeeeeeeeeeeeeeeeeeeee"
aaaaaa = (
f"testeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee{expression}moreeeeeeeeeeeeeeeeeeeeeeeeeeeee"
)
for a in f"testeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee{
expression}moreeeeeeeeeeeeeeeeeeeeeeeeeeeeee":
pass
for a in (
f"testeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee{expression}moreeeeeeeeeeeeeeeeeeeeeeeeeeeee"
):
passwill get formatted to: aaaaaa = (
f"testeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee{expression}moreeeeeeeeeeeeeeeeeeeeeeeeeeeeee"
)
aaaaaa = (
f"testeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee{expression}moreeeeeeeeeeeeeeeeeeeeeeeeeeeee"
)
for a in f"testeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee{
expression
}moreeeeeeeeeeeeeeeeeeeeeeeeeeeeee":
pass
for a in (
f"testeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee{expression}moreeeeeeeeeeeeeeeeeeeeeeeeeeeee"
):
passI wonder if this would need to be done instead at the f-string expression level. |
|
The for case seems fine to me. It's mostly consistent with the formatting in if-statements: for a in f"testeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee{
expression}moreeeeeeeeeeeeeeeeeeeeeeeeeeeeee":
pass
for a in f"testeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee{expression}moreeeeeeeeeeeeeeeeeeeeeeeeeeeee":
pass
if f"testeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee{
expression}moreeeeeeeeeeeeeeeeeeeeeeeeeeeeee":
pass
if (
f"testeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee{expression}moreeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee"
):
passfor a in f"testeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee{
expression
}moreeeeeeeeeeeeeeeeeeeeeeeeeeeeee":
pass
for a in (
f"testeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee{expression}moreeeeeeeeeeeeeeeeeeeeeeeeeeeee"
):
pass
if f"testeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee{
expression
}moreeeeeeeeeeeeeeeeeeeeeeeeeeeeee":
pass
if f"testeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee{expression}moreeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee":
passThis seems fine to me and we can iterate on the exact formatting in future style guides based on actual user feedback |
|
The ecosystem change looks correct to me and is similar to what we do for regular strings: --- /Users/dhruv/playground/ruff/formatter/preview.py
+++ /Users/dhruv/playground/ruff/formatter/preview.py
@@ -1,6 +1,10 @@
if 1:
if 1:
- print("Expected num_entities: {len(self.batchs)*num_entities}. \
- Acutal num_entites: {self.get_thread_local_collection().num_entities}")
- print(f"Expected num_entities: {len(self.batchs)*num_entities}. \
- Acutal num_entites: {self.get_thread_local_collection().num_entities}")
+ print(
+ "Expected num_entities: {len(self.batchs)*num_entities}. \
+ Acutal num_entites: {self.get_thread_local_collection().num_entities}"
+ )
+ print(
+ f"Expected num_entities: {len(self.batchs) * num_entities}. \
+ Acutal num_entites: {self.get_thread_local_collection().num_entities}"
+ )Going to merge this now. |
## Summary This PR fixes a bug in the f-string formatting to not consider the escaped newlines for `is_multiline`. This is done by checking if the f-string is triple-quoted or not similar to normal string literals. This is not required to be gated behind preview because the logic change for `is_multiline` was added in #14454. ## Test Plan Add a test case which formats differently on `main`: https://play.ruff.rs/ea3c55c2-f0fe-474e-b6b8-e3365e0ede5e
Summary
fixes: #13813
This PR fixes a bug in the formatting assignment statement when the value is an f-string.
This is resolved by using custom best fit layouts if the f-string is (a) not already a flat f-string (thus, cannot be multiline) and (b) is not a multiline string (thus, cannot be flattened). So, it is used in cases like the following:
Which is (a)
FStringLayout::Multilineand (b) not a multiline.There are various other examples in the PR diff along with additional explanation and context as code comments.
Test Plan
Add multiple test cases for various scenarios.