feat(core): embedded formatting in html files by ematipico · Pull Request #7467 · biomejs/biome

ematipico · 2025-09-10T13:46:02Z

Summary

Part of #6657

This PR establishes the infrastructure for embedded formatting and implements it for .html files.

The infrastructure reflects my ideas of how we should do it, so feel free to discuss it.

Formatter infrastructure

The infrastructure was designed to avoid any sort of cyclic dependency; this is central to the design. Because of this, the formatting is done inside biome_service, which is the crate that imports all the biome_*_formatter crates.

The formatting of embedded code is done in two passes. The first pass is the usual formatting that we already implement for all languages; the second pass is done via a new function called format_embedded. This function accepts a closure that receives the TextRange where the embedded node is found (we will get to that TextRange in a bit). The caller uses that TextRange to retrieve the CST root, and then formats it with its own formatter (JS and CSS in the case of this implementation). Eventually, the closure must return a formatter Document.
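As a rough illustration of the caller side, here is a minimal sketch of such a closure. Everything in it is a simplified stand-in and not the actual Biome API: `TextRange` is reduced to two offsets, `EmbeddedRoot` stores the snippet source instead of a CST root, `Document` is a plain `Vec<String>`, and `format_js`, `format_css`, and `embedded_formatter` are hypothetical names.

```rust
use std::collections::HashMap;

/// Simplified stand-in for a text range (start and end offsets).
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub struct TextRange {
    pub start: u32,
    pub end: u32,
}

/// Stand-in for the parsed roots kept for embedded snippets.
/// A real root would be a CST; here it is just the snippet source.
pub enum EmbeddedRoot {
    Js(String),
    Css(String),
}

/// Stand-in for a formatter document.
pub type Document = Vec<String>;

// Placeholder formatters (assumption): they only trim and label the snippet.
fn format_js(source: &str) -> Document {
    vec![format!("js: {}", source.trim())]
}

fn format_css(source: &str) -> Document {
    vec![format!("css: {}", source.trim())]
}

/// Builds the closure handed to the second pass: it maps a range back to the
/// stored root and formats it with the formatter of the matching language.
/// Unknown ranges yield `None`.
pub fn embedded_formatter(
    roots: &HashMap<TextRange, EmbeddedRoot>,
) -> impl Fn(TextRange) -> Option<Document> + '_ {
    move |range| {
        let document = match roots.get(&range)? {
            EmbeddedRoot::Js(source) => format_js(source),
            EmbeddedRoot::Css(source) => format_css(source),
        };
        Some(document)
    }
}
```

Keeping the language dispatch in the closure is what lets the generic formatter stay unaware of the JS and CSS formatters, which is the point of avoiding cyclic dependencies.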

The function format_embedded is called inside biome_formatter, which now tracks the embedded nodes during the first formatting pass. The node is tracked using the new Tag::StartEmbedded(TextRange). The TextRange is precisely the range that I mentioned in the previous paragraph. This range is collected using a new function called embedded_node_range, which is implemented in each FormatNode of the language. In this PR, we implement the function for the element HtmlEmbeddedContent. The range we save must match the range of the CST we save in the Workspace. That's how we track the embedded nodes.
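The tracking step can be pictured with the sketch below. It assumes heavily reduced types: the `FormatNode` trait here is a hypothetical simplification, `HtmlText` is an invented helper node, and emitting only the two tags (with nothing between them) is an assumption of the sketch. Only the names `StartEmbedded`, `EndEmbedded`, `embedded_node_range`, and `HtmlEmbeddedContent` come from the description above.

```rust
/// Simplified stand-in for a text range (start and end offsets).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct TextRange {
    pub start: u32,
    pub end: u32,
}

#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Tag {
    StartEmbedded(TextRange),
    EndEmbedded,
}

#[derive(Debug, Clone, PartialEq, Eq)]
pub enum FormatElement {
    Text(String),
    Tag(Tag),
}

/// Hypothetical reduction of a node formatting trait.
pub trait FormatNode {
    /// Regular first-pass formatting of the node itself.
    fn fmt_fields(&self, out: &mut Vec<FormatElement>);

    /// Nodes that hold embedded code override this to report its range.
    fn embedded_node_range(&self) -> Option<TextRange> {
        None
    }

    /// Emits the embedded tags when a range is reported (assumption: nothing
    /// is written between them), otherwise formats the node as usual.
    fn fmt(&self, out: &mut Vec<FormatElement>) {
        match self.embedded_node_range() {
            Some(range) => {
                out.push(FormatElement::Tag(Tag::StartEmbedded(range)));
                out.push(FormatElement::Tag(Tag::EndEmbedded));
            }
            None => self.fmt_fields(out),
        }
    }
}

/// Invented plain node, used only for contrast.
pub struct HtmlText(pub String);

impl FormatNode for HtmlText {
    fn fmt_fields(&self, out: &mut Vec<FormatElement>) {
        out.push(FormatElement::Text(self.0.clone()));
    }
}

/// Reduced version of the node that holds embedded code.
pub struct HtmlEmbeddedContent {
    pub range: TextRange,
    pub text: String,
}

impl FormatNode for HtmlEmbeddedContent {
    fn fmt_fields(&self, out: &mut Vec<FormatElement>) {
        out.push(FormatElement::Text(self.text.clone()));
    }

    fn embedded_node_range(&self) -> Option<TextRange> {
        Some(self.range)
    }
}
```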

During the second pass, we replace Tag::StartEmbedded(TextRange) with the format elements returned by the closure mentioned earlier. The element Tag::EndEmbedded is currently replaced with a hard line. I left a note there, so in the future, we can use a better heuristic and choose an element that fits the situation.
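A minimal sketch of that replacement over a flat list of elements, under stated assumptions: the types are simplified stand-ins, the signature of `format_embedded` is hypothetical, and the closure returning `Option` (with `None` simply dropping the start tag) is a choice made for this sketch.

```rust
/// Simplified stand-in for a text range (start and end offsets).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct TextRange {
    pub start: u32,
    pub end: u32,
}

#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Tag {
    StartEmbedded(TextRange),
    EndEmbedded,
}

#[derive(Debug, Clone, PartialEq, Eq)]
pub enum FormatElement {
    Text(String),
    HardLine,
    Tag(Tag),
}

/// Second pass: splice the closure's output in place of each start tag and
/// turn each end tag into a hard line.
pub fn format_embedded<F>(elements: Vec<FormatElement>, mut format_node: F) -> Vec<FormatElement>
where
    F: FnMut(TextRange) -> Option<Vec<FormatElement>>,
{
    let mut out = Vec::with_capacity(elements.len());
    for element in elements {
        match element {
            FormatElement::Tag(Tag::StartEmbedded(range)) => {
                // Assumption: an unknown range contributes no elements.
                if let Some(formatted) = format_node(range) {
                    out.extend(formatted);
                }
            }
            FormatElement::Tag(Tag::EndEmbedded) => out.push(FormatElement::HardLine),
            other => out.push(other),
        }
    }
    out
}
```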

The replacement of the FormatElements is done using a new visitor infrastructure. The code of the visitor was suggested using an AI agent. I suggested a solution that allows replacing a vector element inside nested structures.
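To make the idea of replacing an element inside nested vectors concrete, here is a small sketch. It is not the visitor from this PR: `ElementVisitor`, `transform`, `ExpandPlaceholder`, and the `Placeholder` and `Nested` variants are all hypothetical names chosen for illustration.

```rust
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum FormatElement {
    Text(String),
    /// Marker that a visitor may expand into several elements.
    Placeholder(u32),
    /// Container holding its own vector of elements.
    Nested(Vec<FormatElement>),
}

pub trait ElementVisitor {
    /// Returns `Some(replacement)` to splice new elements in place of `element`.
    fn replace(&mut self, element: &FormatElement) -> Option<Vec<FormatElement>>;
}

/// Rebuilds the vector, recursing into nested containers so that a single
/// element can be replaced by any number of elements at any depth.
pub fn transform<V: ElementVisitor>(
    elements: Vec<FormatElement>,
    visitor: &mut V,
) -> Vec<FormatElement> {
    let mut out = Vec::with_capacity(elements.len());
    for element in elements {
        if let Some(replacement) = visitor.replace(&element) {
            out.extend(replacement);
            continue;
        }
        match element {
            FormatElement::Nested(children) => {
                out.push(FormatElement::Nested(transform(children, visitor)));
            }
            other => out.push(other),
        }
    }
    out
}

/// Example visitor: expands one placeholder id into a fixed list of elements.
pub struct ExpandPlaceholder {
    pub id: u32,
    pub with: Vec<FormatElement>,
}

impl ElementVisitor for ExpandPlaceholder {
    fn replace(&mut self, element: &FormatElement) -> Option<Vec<FormatElement>> {
        match element {
            FormatElement::Placeholder(id) if *id == self.id => Some(self.with.clone()),
            _ => None,
        }
    }
}
```

Rebuilding the vector instead of mutating it in place avoids index bookkeeping when one element expands into several.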

Parsing of embedded nodes

The parsing of embedded nodes is now done inside the Workspace because we must retrieve the parsing options of the current project.

`ServiceLanguage` and `Workspace`

The ServiceLanguage has been updated with a new capability called format_embedded. I chose to add a new capability because it needs to accept a list of embedded nodes saved inside the Workspace. This allows us to switch from one formatting path to the other depending on whether embedded nodes are present.
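The switch between the two capabilities could look roughly like the sketch below. The struct layout, the function pointer signatures, and the names `FormatterCapabilities`, `EmbeddedNode`, `format_file`, `plain_format`, and `embedded_format` are assumptions made for illustration, not the actual definitions in biome_service.

```rust
/// Hypothetical record of an embedded node saved by the workspace.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct EmbeddedNode {
    pub start: u32,
    pub end: u32,
}

/// Hypothetical capability table: each entry is optional per language.
pub struct FormatterCapabilities {
    pub format: Option<fn(&str) -> String>,
    pub format_embedded: Option<fn(&str, &[EmbeddedNode]) -> String>,
}

/// Uses `format_embedded` only when it exists and embedded nodes are present;
/// otherwise falls back to the regular `format` capability.
pub fn format_file(
    capabilities: &FormatterCapabilities,
    source: &str,
    embedded: &[EmbeddedNode],
) -> Option<String> {
    match capabilities.format_embedded {
        Some(format_embedded) if !embedded.is_empty() => Some(format_embedded(source, embedded)),
        _ => capabilities.format.map(|format| format(source)),
    }
}

// Placeholder capabilities (assumption): they only label which path ran.
pub fn plain_format(source: &str) -> String {
    format!("plain: {}", source)
}

pub fn embedded_format(source: &str, embedded: &[EmbeddedNode]) -> String {
    format!("embedded({}): {}", embedded.len(), source)
}
```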

The ServiceLanguage now exposes a new function called parse_options. Up until now, the resolution of these options was done only inside the parse function. Now the scope has become larger, so I needed a common infrastructure to retrieve those options. When we parse embedded nodes, we use the parsing options defined in the configuration.
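A minimal sketch of why a shared parse_options helps: both the regular parse and the embedded parse resolve their options through the same function, so they cannot drift apart. The trait shape, the `Settings` struct with its `css_modules` field, and the `parse_file` and `parse_embedded` helpers are hypothetical simplifications, not the real signatures.

```rust
/// Hypothetical project configuration; the field is illustrative only.
#[derive(Debug, Default, Clone)]
pub struct Settings {
    pub css_modules: bool,
}

/// Reduced version of the trait: one place to resolve parsing options.
pub trait ServiceLanguage {
    type ParseOptions;

    fn parse_options(settings: &Settings) -> Self::ParseOptions;
}

pub struct CssLanguage;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct CssParseOptions {
    pub css_modules: bool,
}

impl ServiceLanguage for CssLanguage {
    type ParseOptions = CssParseOptions;

    fn parse_options(settings: &Settings) -> Self::ParseOptions {
        CssParseOptions { css_modules: settings.css_modules }
    }
}

/// Stand-in parse result that records which options were used.
#[derive(Debug, PartialEq, Eq)]
pub struct Parsed<O> {
    pub source: String,
    pub options: O,
}

/// Parsing of a whole file.
pub fn parse_file<L: ServiceLanguage>(source: &str, settings: &Settings) -> Parsed<L::ParseOptions> {
    Parsed { source: source.to_string(), options: L::parse_options(settings) }
}

/// Parsing of an embedded snippet: same option resolution as `parse_file`.
pub fn parse_embedded<L: ServiceLanguage>(
    snippet: &str,
    settings: &Settings,
) -> Parsed<L::ParseOptions> {
    Parsed { source: snippet.to_string(), options: L::parse_options(settings) }
}
```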

Test Plan

I added some tests. For some reason, the CSS formatting is a bit funky. I will tackle that in the next PR.

Docs

I believe it doesn't require docs for now.

ematipico merged 20 commits into next from feat/embedded-formatter
