Split string formatting to individual nodes#9058
Conversation
e54f469 to
2cfc547
Compare
|
f1e269d to
6d7690d
Compare
|
Current dependencies on/for this PR:
This stack of pull requests is managed by Graphite. |
3afecc0 to
966af18
Compare
966af18 to
3e1635f
Compare
|
Would you mind including a comparision of the |
There was a problem hiding this comment.
Moving the formatting into the corresponding nodes helps to untangle some of the complexity.
It overall looks good, but there are some places where we can reduce the complexity by limiting the option types further.
My main feedback is not specifically related to this PR but to the string parts refactor.
I continue to find it extremelly difficult to keep the different string nodes apart. because we have ExprStringLiteral and StringLiteral. I made a naming recommendation on your most recent AST refactor and I believe that implementing it could greatly help with naming (including downstream logic like the formatting code here). Doing this renaming sooner rather than later will ensure that downstream crates use the terminology consistently.
My other feedback related to this refactor is that I find the StringLiteralValue concept difficult to understand because it seems to be mainly an implementation detail. Reducing that additional concept (from the already complicated string nodes) by making value and the StringLiteralValue struct private, and moving all StringLiteralValue methods to StringLiteralExpression might help further reduce the complexity of our string nodes (at least the perceived complexity)
I don't expect us to do this refactor as part of this PR but I think it will be worthwile to improve the naming of the string nodes, because I fail to understand the structure even after having looked up their definitions a couple of time.
crates/ruff_python_formatter/src/expression/expr_string_literal.rs
Outdated
Show resolved
Hide resolved
crates/ruff_python_formatter/src/expression/expr_string_literal.rs
Outdated
Show resolved
Hide resolved
|
Thanks for the detailed review. I've made most of the requested changes:
Open points:
|
Yeah, thanks for providing your perspective on this. With the AST rename PR, I'm able to understand why this might be so. I'll probably do a follow-up next week to merge |
MichaReiser
left a comment
There was a problem hiding this comment.
This is looking good to me but I think it would be helpful to avoid using StringOptions (or remove it entirely) and instead use more specific option types. I'm saying this because I struggled hard to understand which combinations of StringOptions are possible in the different formatting functions.
I commented on the PR where I think we can narrow the options but you can also use the following diff if you want all changes applied (requires some cleanup because I didn't always rename fields to match the type of the options).
Index: crates/ruff_python_formatter/src/expression/expr_f_string.rs
IDEA additional info:
Subsystem: com.intellij.openapi.diff.impl.patch.CharsetEP
<+>UTF-8
===================================================================
diff --git a/crates/ruff_python_formatter/src/expression/expr_f_string.rs b/crates/ruff_python_formatter/src/expression/expr_f_string.rs
--- a/crates/ruff_python_formatter/src/expression/expr_f_string.rs (revision Staged)
+++ b/crates/ruff_python_formatter/src/expression/expr_f_string.rs (date 1702518874243)
@@ -9,20 +9,19 @@
in_parentheses_only_group, NeedsParentheses, OptionalParentheses,
};
use crate::prelude::*;
-use crate::string::{AnyString, FormatStringContinuation, Quoting, StringOptions};
+use crate::string::{AnyString, FormatStringContinuation, Quoting};
#[derive(Default)]
pub struct FormatExprFString;
impl FormatNodeRule<ExprFString> for FormatExprFString {
fn fmt_fields(&self, item: &ExprFString, f: &mut PyFormatter) -> FormatResult<()> {
- let options =
- StringOptions::default().with_quoting(f_string_quoting(item, &f.context().locator()));
+ let quoting = f_string_quoting(item, &f.context().locator());
match item.value.as_slice() {
- [f_string_part] => f_string_part.format().with_options(options).fmt(f),
+ [f_string_part] => f_string_part.format().with_options(quoting).fmt(f),
_ => in_parentheses_only_group(
- &FormatStringContinuation::new(&AnyString::FString(item)).with_options(options),
+ &FormatStringContinuation::new(&AnyString::FString(item)).with_options(quoting),
)
.fmt(f),
}
Index: crates/ruff_python_formatter/src/expression/expr_string_literal.rs
IDEA additional info:
Subsystem: com.intellij.openapi.diff.impl.patch.CharsetEP
<+>UTF-8
===================================================================
diff --git a/crates/ruff_python_formatter/src/expression/expr_string_literal.rs b/crates/ruff_python_formatter/src/expression/expr_string_literal.rs
--- a/crates/ruff_python_formatter/src/expression/expr_string_literal.rs (revision Staged)
+++ b/crates/ruff_python_formatter/src/expression/expr_string_literal.rs (date 1702519544384)
@@ -6,18 +6,37 @@
use crate::expression::parentheses::{
in_parentheses_only_group, NeedsParentheses, OptionalParentheses,
};
+use crate::other::string_literal::StringLiteralKind;
use crate::prelude::*;
-use crate::string::{
- AnyString, FormatStringContinuation, StringOptions, StringPrefix, StringQuotes,
-};
+use crate::string::{AnyString, FormatStringContinuation, Quoting, StringPrefix, StringQuotes};
#[derive(Default)]
pub struct FormatExprStringLiteral {
- options: StringOptions,
+ options: ExprStringLiteralKind,
+}
+
+#[derive(Default, Copy, Clone, Debug)]
+pub enum ExprStringLiteralKind {
+ #[default]
+ String,
+ Docstring,
+}
+
+impl ExprStringLiteralKind {
+ const fn string_literal_kind(self) -> StringLiteralKind {
+ match self {
+ ExprStringLiteralKind::String => StringLiteralKind::String,
+ ExprStringLiteralKind::Docstring => StringLiteralKind::Docstring,
+ }
+ }
+
+ const fn is_docstring(self) -> bool {
+ matches!(self, Self::Docstring)
+ }
}
impl FormatRuleWithOptions<ExprStringLiteral, PyFormatContext<'_>> for FormatExprStringLiteral {
- type Options = StringOptions;
+ type Options = ExprStringLiteralKind;
fn with_options(mut self, options: Self::Options) -> Self {
self.options = options;
@@ -30,11 +49,15 @@
let ExprStringLiteral { value, .. } = item;
match value.as_slice() {
- [string_literal] => string_literal.format().with_options(self.options).fmt(f),
- _ => in_parentheses_only_group(
- &FormatStringContinuation::new(&AnyString::String(item)).with_options(self.options),
- )
- .fmt(f),
+ [string_literal] => string_literal
+ .format()
+ .with_options(self.options.string_literal_kind())
+ .fmt(f),
+ _ => {
+ assert!(!self.options.is_docstring());
+ in_parentheses_only_group(&FormatStringContinuation::new(&AnyString::String(item)))
+ .fmt(f)
+ }
}
}
Index: crates/ruff_python_formatter/src/other/f_string_part.rs
IDEA additional info:
Subsystem: com.intellij.openapi.diff.impl.patch.CharsetEP
<+>UTF-8
===================================================================
diff --git a/crates/ruff_python_formatter/src/other/f_string_part.rs b/crates/ruff_python_formatter/src/other/f_string_part.rs
--- a/crates/ruff_python_formatter/src/other/f_string_part.rs (revision Staged)
+++ b/crates/ruff_python_formatter/src/other/f_string_part.rs (date 1702519544305)
@@ -2,16 +2,17 @@
use ruff_python_ast::FStringPart;
use crate::other::f_string::FormatFString;
+use crate::other::string_literal::StringLiteralKind;
use crate::prelude::*;
-use crate::string::StringOptions;
+use crate::string::Quoting;
#[derive(Default)]
pub struct FormatFStringPart {
- options: StringOptions,
+ options: Quoting,
}
impl FormatRuleWithOptions<FStringPart, PyFormatContext<'_>> for FormatFStringPart {
- type Options = StringOptions;
+ type Options = Quoting;
fn with_options(mut self, options: Self::Options) -> Self {
self.options = options;
@@ -22,12 +23,11 @@
impl FormatRule<FStringPart, PyFormatContext<'_>> for FormatFStringPart {
fn fmt(&self, item: &FStringPart, f: &mut PyFormatter) -> FormatResult<()> {
match item {
- FStringPart::Literal(string_literal) => {
- string_literal.format().with_options(self.options).fmt(f)
- }
- FStringPart::FString(f_string) => {
- FormatFString::new(f_string, self.options.quoting()).fmt(f)
- }
+ FStringPart::Literal(string_literal) => string_literal
+ .format()
+ .with_options(StringLiteralKind::InFString(self.options))
+ .fmt(f),
+ FStringPart::FString(f_string) => FormatFString::new(f_string, self.options).fmt(f),
}
}
}
Index: crates/ruff_python_formatter/src/other/string_literal.rs
IDEA additional info:
Subsystem: com.intellij.openapi.diff.impl.patch.CharsetEP
<+>UTF-8
===================================================================
diff --git a/crates/ruff_python_formatter/src/other/string_literal.rs b/crates/ruff_python_formatter/src/other/string_literal.rs
--- a/crates/ruff_python_formatter/src/other/string_literal.rs (revision Staged)
+++ b/crates/ruff_python_formatter/src/other/string_literal.rs (date 1702519695435)
@@ -3,19 +3,40 @@
use ruff_text_size::Ranged;
use crate::prelude::*;
-use crate::string::{docstring, StringOptions, StringPart};
+use crate::string::{docstring, Quoting, StringPart};
use crate::QuoteStyle;
#[derive(Default)]
pub struct FormatStringLiteral {
- options: StringOptions,
+ layout: StringLiteralKind,
+}
+
+#[derive(Copy, Clone, Debug, Default)]
+pub enum StringLiteralKind {
+ #[default]
+ String,
+ Docstring,
+ InFString(Quoting),
+}
+
+impl StringLiteralKind {
+ pub(crate) const fn is_docstring(self) -> bool {
+ matches!(self, StringLiteralKind::Docstring)
+ }
+
+ fn quoting(self) -> Quoting {
+ match self {
+ StringLiteralKind::String | StringLiteralKind::Docstring => Quoting::CanChange,
+ StringLiteralKind::InFString(quoting) => quoting,
+ }
+ }
}
impl FormatRuleWithOptions<StringLiteral, PyFormatContext<'_>> for FormatStringLiteral {
- type Options = StringOptions;
+ type Options = StringLiteralKind;
fn with_options(mut self, options: Self::Options) -> Self {
- self.options = options;
+ self.layout = options;
self
}
}
@@ -25,24 +46,24 @@
let locator = f.context().locator();
let parent_docstring_quote_style = f.context().docstring();
- let quote_style = if self.options.is_docstring() {
+ let quote_style = match self.layout {
// Per PEP 8 and PEP 257, always prefer double quotes for docstrings
- QuoteStyle::Double
- } else {
- f.options().quote_style()
+ StringLiteralKind::Docstring => QuoteStyle::Double,
+ StringLiteralKind::String | StringLiteralKind::InFString(_) => {
+ f.options().quote_style()
+ }
};
let normalized = StringPart::from_source(item.range(), &locator).normalize(
- self.options.quoting(),
+ self.layout.quoting(),
&locator,
quote_style,
parent_docstring_quote_style,
);
- if self.options.is_docstring() {
- docstring::format(&normalized, f)
- } else {
- normalized.fmt(f)
+ match self.layout {
+ StringLiteralKind::Docstring => docstring::format(&normalized, f),
+ StringLiteralKind::String | StringLiteralKind::InFString(_) => normalized.fmt(f),
}
}
}
Index: crates/ruff_python_formatter/src/statement/suite.rs
IDEA additional info:
Subsystem: com.intellij.openapi.diff.impl.patch.CharsetEP
<+>UTF-8
===================================================================
diff --git a/crates/ruff_python_formatter/src/statement/suite.rs b/crates/ruff_python_formatter/src/statement/suite.rs
--- a/crates/ruff_python_formatter/src/statement/suite.rs (revision Staged)
+++ b/crates/ruff_python_formatter/src/statement/suite.rs (date 1702519544227)
@@ -9,9 +9,10 @@
leading_comments, trailing_comments, Comments, LeadingDanglingTrailingComments,
};
use crate::context::{NodeLevel, TopLevelStatementPosition, WithIndentLevel, WithNodeLevel};
+use crate::expression::expr_string_literal::ExprStringLiteralKind;
+use crate::other::string_literal::StringLiteralKind;
use crate::prelude::*;
use crate::statement::stmt_expr::FormatStmtExpr;
-use crate::string::StringOptions;
use crate::verbatim::{
suppressed_node, write_suppressed_statements_starting_with_leading_comment,
write_suppressed_statements_starting_with_trailing_comment,
@@ -609,7 +610,7 @@
leading_comments(node_comments.leading),
string_literal
.format()
- .with_options(StringOptions::docstring()),
+ .with_options(ExprStringLiteralKind::Docstring),
]
)?;
Index: crates/ruff_python_formatter/src/string/mod.rs
IDEA additional info:
Subsystem: com.intellij.openapi.diff.impl.patch.CharsetEP
<+>UTF-8
===================================================================
diff --git a/crates/ruff_python_formatter/src/string/mod.rs b/crates/ruff_python_formatter/src/string/mod.rs
--- a/crates/ruff_python_formatter/src/string/mod.rs (revision Staged)
+++ b/crates/ruff_python_formatter/src/string/mod.rs (date 1702519544492)
@@ -13,6 +13,7 @@
use crate::comments::{leading_comments, trailing_comments};
use crate::expression::parentheses::in_parentheses_only_soft_line_break_or_space;
use crate::other::f_string::FormatFString;
+use crate::other::string_literal::StringLiteralKind;
use crate::prelude::*;
use crate::QuoteStyle;
@@ -47,21 +48,29 @@
}
}
- fn parts(&self) -> Vec<AnyStringPart<'a>> {
+ fn parts(&self, quoting: Quoting) -> Vec<AnyStringPart<'a>> {
match self {
- Self::String(ExprStringLiteral { value, .. }) => {
- value.iter().map(AnyStringPart::String).collect()
- }
+ Self::String(ExprStringLiteral { value, .. }) => value
+ .iter()
+ .map(|part| AnyStringPart::String {
+ literal: part,
+ layout: StringLiteralKind::String,
+ })
+ .collect(),
Self::Bytes(ExprBytesLiteral { value, .. }) => {
value.iter().map(AnyStringPart::Bytes).collect()
}
Self::FString(ExprFString { value, .. }) => value
.iter()
.map(|f_string_part| match f_string_part {
- ast::FStringPart::Literal(string_literal) => {
- AnyStringPart::String(string_literal)
- }
- ast::FStringPart::FString(f_string) => AnyStringPart::FString(f_string),
+ ast::FStringPart::Literal(string_literal) => AnyStringPart::String {
+ literal: string_literal,
+ layout: StringLiteralKind::InFString(quoting),
+ },
+ ast::FStringPart::FString(f_string) => AnyStringPart::FString {
+ string: f_string,
+ quoting,
+ },
})
.collect(),
}
@@ -100,17 +109,23 @@
#[derive(Clone, Debug)]
enum AnyStringPart<'a> {
- String(&'a ast::StringLiteral),
+ String {
+ literal: &'a ast::StringLiteral,
+ layout: StringLiteralKind,
+ },
Bytes(&'a ast::BytesLiteral),
- FString(&'a ast::FString),
+ FString {
+ string: &'a ast::FString,
+ quoting: Quoting,
+ },
}
impl<'a> From<&AnyStringPart<'a>> for AnyNodeRef<'a> {
fn from(value: &AnyStringPart<'a>) -> Self {
match value {
- AnyStringPart::String(part) => AnyNodeRef::StringLiteral(part),
+ AnyStringPart::String { literal, .. } => AnyNodeRef::StringLiteral(literal),
AnyStringPart::Bytes(part) => AnyNodeRef::BytesLiteral(part),
- AnyStringPart::FString(part) => AnyNodeRef::FString(part),
+ AnyStringPart::FString { string, .. } => AnyNodeRef::FString(string),
}
}
}
@@ -118,101 +133,49 @@
impl Ranged for AnyStringPart<'_> {
fn range(&self) -> TextRange {
match self {
- Self::String(part) => part.range(),
+ Self::String { literal, .. } => literal.range(),
Self::Bytes(part) => part.range(),
- Self::FString(part) => part.range(),
+ Self::FString { string, .. } => string.range(),
+ }
+ }
+}
+
+impl Format<PyFormatContext<'_>> for AnyStringPart<'_> {
+ fn fmt(&self, f: &mut PyFormatter) -> FormatResult<()> {
+ match self {
+ AnyStringPart::String { literal, layout } => {
+ literal.format().with_options(*layout).fmt(f)
+ }
+ AnyStringPart::Bytes(bytes_literal) => bytes_literal.format().fmt(f),
+ AnyStringPart::FString { string, quoting } => {
+ FormatFString::new(string, *quoting).fmt(f)
+ }
}
}
}
#[derive(Copy, Clone, Debug, Default)]
-pub(crate) enum Quoting {
+pub enum Quoting {
#[default]
CanChange,
Preserve,
}
-#[derive(Default, Copy, Clone, Debug)]
-pub(crate) enum StringLayout {
- #[default]
- Default,
- DocString,
-}
-
-/// Resolved options for formatting any kind of string. This can be either a string,
-/// bytes or f-string.
-#[derive(Copy, Clone, Debug, Default)]
-pub struct StringOptions {
- quoting: Quoting,
- layout: StringLayout,
-}
-
-impl StringOptions {
- /// Creates a new context with the docstring layout.
- pub(crate) fn docstring() -> Self {
- Self {
- layout: StringLayout::DocString,
- ..Self::default()
- }
- }
-
- /// Returns a new context with the given [`Quoting`] style.
- pub(crate) const fn with_quoting(mut self, quoting: Quoting) -> Self {
- self.quoting = quoting;
- self
- }
-
- /// Returns the [`Quoting`] style to use for the string.
- pub(crate) const fn quoting(self) -> Quoting {
- self.quoting
- }
-
- /// Returns `true` if the string is a docstring.
- pub(crate) const fn is_docstring(self) -> bool {
- matches!(self.layout, StringLayout::DocString)
- }
-}
-
-struct FormatStringPart<'a> {
- part: &'a AnyStringPart<'a>,
- options: StringOptions,
-}
-
-impl<'a> FormatStringPart<'a> {
- fn new(part: &'a AnyStringPart<'a>, options: StringOptions) -> Self {
- Self { part, options }
- }
-}
-
-impl Format<PyFormatContext<'_>> for FormatStringPart<'_> {
- fn fmt(&self, f: &mut PyFormatter) -> FormatResult<()> {
- match self.part {
- AnyStringPart::String(string_literal) => {
- string_literal.format().with_options(self.options).fmt(f)
- }
- AnyStringPart::Bytes(bytes_literal) => bytes_literal.format().fmt(f),
- AnyStringPart::FString(f_string) => {
- FormatFString::new(f_string, self.options.quoting()).fmt(f)
- }
- }
- }
-}
-
pub(crate) struct FormatStringContinuation<'a> {
string: &'a AnyString<'a>,
- options: StringOptions,
+ quoting: Quoting,
}
impl<'a> FormatStringContinuation<'a> {
pub(crate) fn new(string: &'a AnyString<'a>) -> Self {
Self {
string,
- options: StringOptions::default(),
+ quoting: Quoting::default(),
}
}
- pub(crate) fn with_options(mut self, options: StringOptions) -> Self {
- self.options = options;
+ pub(crate) fn with_options(mut self, options: Quoting) -> Self {
+ self.quoting = options;
self
}
}
@@ -223,11 +186,11 @@
let mut joiner = f.join_with(in_parentheses_only_soft_line_break_or_space());
- for part in self.string.parts() {
+ for part in self.string.parts(self.quoting) {
joiner.entry(&format_args![
line_suffix_boundary(),
leading_comments(comments.leading(&part)),
- FormatStringPart::new(&part, self.options),
+ part,
trailing_comments(comments.trailing(&part))
]);
}I recommend taking a second look where omitting a call to with_options (and assuming the default) leads to invalid results and consider removing the AsFormat and IntoFormat implementations for these types (to force callers to specify all fields when cosntructing the Format*.
Note: It seems GitHub (reviewed with VS code) lost a few of my comments or even misplaced them. I'll try to identify the missing ones but the patch above should include all places where we can replace
StringOptionswith more specific types (to the point where we can deleteStringOptionsall together)
crates/ruff_python_formatter/src/expression/expr_string_literal.rs
Outdated
Show resolved
Hide resolved
|
Alright, this is looking good. I've made the suggested changes and some more:
I'll merge this but feel free to add any further comments and I can take it up in a follow-up PR. |
806abae to
79872e0
Compare

This PR splits the string formatting code in the formatter to be handled by the respective nodes.
Previously, the string formatting was done through a single
FormatStringinterface. Now, the nodes themselves are responsible for formatting.The following changes were made:
StringLayout::ImplicitStringConcatenationInBinaryLikeand inline the call toFormatStringContinuation. After the refactor, the binary like formatting would delegate toFormatStringwhich would then delegate toFormatStringContinuation. This removes the intermediary steps.FStringPartwhich delegates it to the respective string literal or f-string node.ExprStringLiteralKindwhich is eitherStringorDocstring. If it's a docstring variant, then the string expression would not be implicitly concatenated. This is guaranteed by theDocstringStmt::try_from_expressionconstructor.StringLiteralKindwhich is either aString,DocstringorInImplicitlyConcatenatedFString. The last variant is for when the string literal is implicitly concatenated with an f-string ("foo" f"bar {x}").FormatString.ExprFStringorFormatStringContinuation).Formatter ecosystem result
This PR
main