feat(core): embedded formatting in html files #7467

Merged
ematipico merged 20 commits into next from feat/embedded-formatter
Sep 13, 2025

@ematipico
Member

Summary

Part of #6657

This PR establishes the infrastructure for embedded formatting and implements it for .html files.

The infrastructure reflects my ideas of how we should do it, so feel free to discuss it.

Formatter infrastructure

The infrastructure was designed to avoid cyclic dependencies of any kind, which is a central part of the design. Because of this, the formatting is done inside biome_service, the crate that imports all biome_*_formatter crates.

The formatting of embedded code happens in two passes. The first pass is the usual formatting that we implement in all languages; the second pass is done via a new function called format_embedded. This function accepts a closure that receives the TextRange where the embedded node is found (we will get to that TextRange in a bit). The caller uses that TextRange to retrieve the CST root, and then formats it with its own formatter (JS and CSS in this implementation). In the end, the closure must return a formatter Document.
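To make the closure contract easier to discuss, here is a minimal sketch of the caller side. It is not the code in this PR: TextRange, Document, EmbeddedRoot, format_js, format_css and format_with_roots are simplified, hypothetical stand-ins, and returning an Option from the closure is an assumption made for the illustration.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a text range (byte offsets into the HTML file).
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub struct TextRange {
    pub start: u32,
    pub end: u32,
}

/// Hypothetical stand-in for the formatter `Document`.
#[derive(Debug, Clone, PartialEq)]
pub struct Document(pub String);

/// Hypothetical stand-in for a parsed embedded root kept by the caller.
#[derive(Debug, Clone)]
pub enum EmbeddedRoot {
    Js(String),
    Css(String),
}

/// Stand-ins for the language formatters; they only trim and tag the source.
fn format_js(source: &str) -> Document {
    Document(format!("js:{}", source.trim()))
}

fn format_css(source: &str) -> Document {
    Document(format!("css:{}", source.trim()))
}

/// Second pass: every range tracked during the first pass is handed to the
/// closure, which returns the document to splice in (`None` if the range is
/// unknown; that fallback is an assumption of this sketch).
pub fn format_embedded<F>(tracked: &[TextRange], mut format_range: F) -> Vec<Option<Document>>
where
    F: FnMut(TextRange) -> Option<Document>,
{
    tracked.iter().map(|range| format_range(*range)).collect()
}

/// Caller side: look up the root by range, then pick the matching formatter.
pub fn format_with_roots(
    tracked: &[TextRange],
    roots: &HashMap<TextRange, EmbeddedRoot>,
) -> Vec<Option<Document>> {
    format_embedded(tracked, |range| {
        roots.get(&range).map(|root| match root {
            EmbeddedRoot::Js(source) => format_js(source),
            EmbeddedRoot::Css(source) => format_css(source),
        })
    })
}
```

The point of the shape is that format_embedded never needs to know about JS or CSS; only the caller, which already depends on every formatter crate, does the dispatch.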

The function format_embedded is called inside biome_formatter, which now tracks the embedded nodes during the first formatting pass. The node is tracked using the new Tag::StartEmbedded(TextRange). The TextRange is precisely the range that I mentioned in the previous paragraph. This range is collected using a new function called embedded_node_range, which is implemented in each FormatNode of the language. In this PR, we implement the function inside the element HtmlEmbeddedContent. The range we save must match the range of the CST we save in the Workspace. That's how we track the embedded nodes.
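A rough sketch of how the tracking could look, under stated assumptions: the trait name FormatNodeRule, the method signatures, the PlainText node and the fact that only the tag pair is emitted for an embedded node are all hypothetical simplifications, not the signatures used in biome_formatter.

```rust
/// Hypothetical stand-in for a text range.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct TextRange {
    pub start: u32,
    pub end: u32,
}

/// Hypothetical subset of the formatter tags.
#[derive(Debug, Clone, PartialEq)]
pub enum Tag {
    StartEmbedded(TextRange),
    EndEmbedded,
}

#[derive(Debug, Clone, PartialEq)]
pub enum FormatElement {
    Text(String),
    Tag(Tag),
}

/// Hypothetical formatting rule. Most nodes keep the default `None`.
pub trait FormatNodeRule {
    fn embedded_node_range(&self) -> Option<TextRange> {
        None
    }

    /// Regular first-pass formatting of the node.
    fn fmt_fields(&self, out: &mut Vec<FormatElement>);
}

/// Hypothetical non-embedded node.
pub struct PlainText {
    pub text: String,
}

impl FormatNodeRule for PlainText {
    fn fmt_fields(&self, out: &mut Vec<FormatElement>) {
        out.push(FormatElement::Text(self.text.clone()));
    }
}

/// The node that reports its range, as in this PR.
pub struct HtmlEmbeddedContent {
    pub range: TextRange,
}

impl FormatNodeRule for HtmlEmbeddedContent {
    fn embedded_node_range(&self) -> Option<TextRange> {
        Some(self.range)
    }

    fn fmt_fields(&self, _out: &mut Vec<FormatElement>) {
        // Assumption of this sketch: the content is produced in the second pass.
    }
}

/// First pass: nodes that report a range leave a tag pair behind instead of
/// their own content; every other node is formatted as usual.
pub fn format_node(rule: &dyn FormatNodeRule, out: &mut Vec<FormatElement>) {
    match rule.embedded_node_range() {
        Some(range) => {
            out.push(FormatElement::Tag(Tag::StartEmbedded(range)));
            out.push(FormatElement::Tag(Tag::EndEmbedded));
        }
        None => rule.fmt_fields(out),
    }
}
```

Because the default returns None, existing formatting rules would not need to change; only nodes that host another language opt in.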

During the second pass, we replace Tag::StartEmbedded(TextRange) with the format elements returned by the closure described above. The element Tag::EndEmbedded is currently replaced with a hard line. I left a note there, so in the future we can use a better heuristic and choose an element that fits the situation.

The replacement of the FormatElements is done using a new visitor infrastructure. The code of the visitor was suggested by an AI agent. I suggested a solution that allows replacing a vector element with nested infrastructure.
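For discussion, this is one way such a visitor could be shaped. It is only a sketch under assumptions: ElementVisitor, transform, EmbeddedReplacer and the Nested variant are hypothetical names, and the real FormatElement type has many more variants.

```rust
/// Hypothetical stand-in for a text range.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct TextRange {
    pub start: u32,
    pub end: u32,
}

/// Hypothetical, heavily reduced format element.
#[derive(Debug, Clone, PartialEq)]
pub enum FormatElement {
    Text(String),
    HardLine,
    StartEmbedded(TextRange),
    EndEmbedded,
    /// Stands in for elements that own a vector of other elements.
    Nested(Vec<FormatElement>),
}

/// Hypothetical visitor: return `Some(replacement)` to swap one element for
/// zero or more elements, or `None` to keep it as it is.
pub trait ElementVisitor {
    fn visit(&mut self, element: &FormatElement) -> Option<Vec<FormatElement>>;
}

/// Walks the vector, recursing into nested vectors, and rebuilds it with the
/// replacements supplied by the visitor.
pub fn transform<V: ElementVisitor>(
    elements: Vec<FormatElement>,
    visitor: &mut V,
) -> Vec<FormatElement> {
    let mut out = Vec::with_capacity(elements.len());
    for element in elements {
        if let Some(replacement) = visitor.visit(&element) {
            out.extend(replacement);
            continue;
        }
        match element {
            FormatElement::Nested(children) => {
                out.push(FormatElement::Nested(transform(children, &mut *visitor)));
            }
            other => out.push(other),
        }
    }
    out
}

/// Swaps the start tag for the formatted elements and the end tag for a hard line.
pub struct EmbeddedReplacer<F> {
    pub format_range: F,
}

impl<F> ElementVisitor for EmbeddedReplacer<F>
where
    F: FnMut(TextRange) -> Option<Vec<FormatElement>>,
{
    fn visit(&mut self, element: &FormatElement) -> Option<Vec<FormatElement>> {
        match element {
            // An unknown range keeps the tag untouched (assumption of this sketch).
            FormatElement::StartEmbedded(range) => (self.format_range)(*range),
            FormatElement::EndEmbedded => Some(vec![FormatElement::HardLine]),
            _ => None,
        }
    }
}
```

Returning a vector from visit is what lets a single element expand into several, and the recursion in transform is what reaches tags that sit inside nested content.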

Parsing of embedded nodes

The parsing of embedded nodes is now done inside the Workspace because we must retrieve the parsing options of the current project.

ServiceLanguage and Workspace

The ServiceLanguage has been updated with a new capability called format_embedded. I chose to add a new capability because it needs to accept a list of embedded nodes saved inside the Workspace. This allows us to switch from one formatting to another depending on the presence of some embedded nodes.

The ServiceLanguage now exposes a new function called parse_options. Up until now, the resolution of these options was done only inside the parse function. Now the scope has become larger, so I needed a common infrastructure to retrieve those options. When we parse embedded nodes, we use the parsing options defined in the configuration.
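The idea of a single place that resolves parser options can be sketched as follows. Everything here is a simplified assumption for illustration: the Settings struct, the css_modules field, the associated type and the parse_* helpers are hypothetical and do not mirror the actual ServiceLanguage trait.

```rust
/// Hypothetical project settings resolved from the configuration.
#[derive(Debug, Clone, Default)]
pub struct Settings {
    pub css_modules: bool,
}

/// Hypothetical reduced version of the trait: one function turns settings
/// into parser options, so every parse entry point resolves them the same way.
pub trait ServiceLanguage {
    type ParserOptions;

    fn parse_options(settings: &Settings) -> Self::ParserOptions;
}

pub struct CssLanguage;

#[derive(Debug, Clone, PartialEq)]
pub struct CssParserOptions {
    pub css_modules: bool,
}

impl ServiceLanguage for CssLanguage {
    type ParserOptions = CssParserOptions;

    fn parse_options(settings: &Settings) -> CssParserOptions {
        CssParserOptions {
            css_modules: settings.css_modules,
        }
    }
}

/// Stand-in parser: it only records which options were applied.
pub fn parse_css(source: &str, options: &CssParserOptions) -> String {
    format!("css(modules={}):{}", options.css_modules, source)
}

/// Parsing a standalone CSS file.
pub fn parse_file(source: &str, settings: &Settings) -> String {
    parse_css(source, &CssLanguage::parse_options(settings))
}

/// Parsing style content embedded in an HTML file, with the same options.
pub fn parse_embedded_style(source: &str, settings: &Settings) -> String {
    parse_css(source, &CssLanguage::parse_options(settings))
}
```

With this shape, an embedded snippet and a standalone file of the same language cannot drift apart in how the configuration is applied.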

Test Plan

I added some tests. For some reason, the CSS formatting is a bit funky. I will tackle that in the next PR.

Docs

I believe it doesn't require docs for now.


Labels

A-CLI (Area: CLI)
A-Formatter (Area: formatter)
A-Parser (Area: parser)
A-Project (Area: project)
A-Tooling (Area: internal tools)
L-CSS (Language: CSS and super languages)
L-Grit (Language: GritQL)
L-HTML (Language: HTML and super languages)
L-JavaScript (Language: JavaScript and super languages)
L-JSON (Language: JSON and super languages)

3 participants