-
Notifications
You must be signed in to change notification settings - Fork 4.6k
Introduce block validation levels #73116
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: trunk
Are you sure you want to change the base?
Conversation
|
The following accounts have interacted with this PR and/or linked issues. I will continue to update these lists as activity occurs. You can also manually ask me to refresh this list by adding the If you're merging code through a pull request on GitHub, copy and paste the following into the bottom of the merge commit message. To understand the WordPress project's expectations around crediting contributors, please review the Contributor Attribution page in the Core Handbook. |
|
Size Change: +674 B (+0.03%) Total Size: 2.58 MB
ℹ️ View Unchanged
|
This comment was marked as outdated.
This comment was marked as outdated.
|
From the HTML test example, all blocks should be silently reconstructed, except for the level 5 case, which should error out in the UI. |
This comment was marked as outdated.
This comment was marked as outdated.
| // - Both have content, OR | ||
| // - Both are empty (valid empty block), OR | ||
| // - Original is empty but generated has content (adding generated classes/structure) | ||
| const contentIsReasonable = hasGeneratedContent || ! hasOriginalContent; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Which case we're excluding here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Was wondering the same.
|
|
||
| // Shortcut to avoid costly validation. | ||
| // Shortcut: Fallback blocks (freeform/unregistered) are marked as raw transformed. | ||
| if ( isFallbackBlock ) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if this should apply to both unregistered and freeform? Feels like it should apply only to "freeform". Also I feel like it should apply only when there's no comment delimiters for the block, in other words, if the block comment explicitly states that it should use core/classic or core/html for instance, it shouldn't auto-transform even if that block is marked as "freeform" block content handler.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I revised this logic a bit. It comes down to the fact we wrap unregistered blocks in core/missing which handles the UI. A block with core/missing is "valid" from the perspective of the validation flow.
- RECONSTRUCTED_BLOCK - REGENERATED_BLOCK With the second level handling more trivial attributes and level three doing entire block reconstruction as long as the block flag is set
de17480 to
3a520de
Compare
3a520de to
160e352
Compare
mcsf
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking at the test cases:
Level 2: Reconstructed
Test cases:
- Dynamic block rendering server-side content differently
- Template blocks with variable output
- Blocks with computed/derived innerHTML from attributes
and
Level 4: Invalid
- Block with incompatible save output
What do these mean?
From the HTML test example, all blocks should be silently reconstructed, except for the level 5 case, which should error out in the UI.
The goal is to 1) silently reconstruct without saving, then 2) saving the reconstruction if the user modified anything. Correct?
Level 3: RawTransformedBlock
Definition: Raw handling functions applied; block type name preserved
When: Freeform or unregistered type handlers process the content
Action: Raw HTML transformation applied
Test Cases:
Unregistered block type converted to freeform
Classic block with raw HTML content
HTML block with freeform content
Not very clear to me how much this is supposed to change from what already happens. Reading "raw handling functions applied" makes me think that we can be more confident in transforming non-block content into blocks, instead of always carefully isolating them as Freeform, but the rest of the text suggests maintaining the isolation approach.
I think this is a good start, but it would help me if we could discuss how we see this making a difference for users, what kind of new interactions they would see, and what — if any — contingencies we should plan. For example, and without endorsing of these ideas:
- If we want to perform silent transformations more often, do we need to change the granularity of post revisions, in case users want to better understand what the system did?
- Similarly, can we improve the undo/redo system to have some awareness of these transformations? Or, on the contrary, do we solidify the idea that these are meant to be invisible (under the condition that they be robust, of course)?
- Scenarios like a
<!-- wp:headingsurrounding both a H2 and a P tag and discarding the P (suggested somewhere above) give me pause, hence the questions above.
There was also an example of a block that should be regenerated whose tag didn't match the save output. For example, a Heading block with a single P tag. The premise is that, if we can parse the attributes, we can regenerate the whole thing. Well, but how do we parse the content attribute, since it's meant to be sourced using a selector that won't match?
| const isDirectAttributeIssue = | ||
| message.includes( 'Expected attribute' ) || | ||
| message.includes( 'Expected attributes' ) || | ||
| message.includes( 'Unexpected attribute' ); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This won't match. The logger is currently passed:
Encountered unexpected attribute `%s`.
This is irrelevant for evaluating the overall PR, but it will easily be missed later if I don't point it out.
| * | ||
| * @return {boolean} Whether issues are only attribute-related. | ||
| */ | ||
| export function areOnlyAttributeDifferences( validationIssues ) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably fine for a first effort, but adding a piece in the system that interprets the validation issues logged by our own validation function is roundabout, and the whole validation module could probably use an overhaul with these levels in mind.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed here maybe we can update validationIssues to be more structured to avoid relying on strings.
| * | ||
| * @return {Object} Validation result with isValid computed property. | ||
| */ | ||
| export function createValidationResult( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Has tests but is unused.
| // - Both have content, OR | ||
| // - Both are empty (valid empty block), OR | ||
| // - Original is empty but generated has content (adding generated classes/structure) | ||
| const contentIsReasonable = hasGeneratedContent || ! hasOriginalContent; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Was wondering the same.
| * false otherwise. Invalid HTML is not considered equivalent, even if the | ||
| * strings directly match. | ||
| * false otherwise. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I noticed that the first branch in this function contradicted the last sentence of this comment, so I removed it.
|
I want to underscore how useful this is for what we are trying to do with AI
This would solve this problem - we would take manipulated tree and depend on UI to reconstitute the syntax. |
Deprecations were incorrectly matching when their save() output qualified for automated reconstruction rather than an exact match.
These cases jump to level 4 (invalid block)
|
@youknowriad @mcsf thanks for the initial look. I think this is ready for more in depth reviews. I included some html to test in the original description. It should cover multiple levels and give a sense for how much reconstruction helps. (Compare what trunk produces versus this branch.) The amount of UI errors is drastically reduced. Added a condition to be a bit more conservative in reconstruction and take into account volume of content discrepancies. If it's above a threshold skip to level 4. I think this is a good safety net to balance, but the specific threshold I think is up for debate. Without it, level 4 is very hard to reach for core blocks (which is not a bad thing, in a sense). All tests and fixtures should be passing. I would like to follow this up with introducing a UI signal when a block was reconstructed at level 4. The signal should be unobtrusive but present for user review if necessary. But this PR should introduce no UI changes and just focus on the validation logic. A final thing is that with this in place perhaps there are a lot of manually written block deprecations that can be removed from the codebase entirely. |
|
Also @ellatrix after this, I'd like to continue the idea of automatically converting freeform/classic content to blocks (maybe as a flag that can be set in wp_config). |
| // block comment (e.g., className, ariaLabel). Unlike Level 2/3 reconstruction | ||
| // which would lose these values on save, fixes preserve them in the block's | ||
| // attributes for regeneration. | ||
| const fixedBlock = applyBuiltInValidationFixes( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this be considered a level of validation on its own? I see that right now, we're just considering it "valid_block"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure, these built in validation and normalization routines are applied in a few places, including some raw handling work.
| // If the block was migrated via deprecation, update validation level to Level 1 | ||
| if ( updatedBlock.__wasMigrated ) { | ||
| updatedBlock.validationLevel = VALIDATION_LEVEL.MIGRATED_BLOCK; | ||
| delete updatedBlock.__wasMigrated; // Clean up internal flag |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Minor: Makes me wonder if we should update the return value of applyBlockDeprecatedVersions to use the same format used here (add the validation level instead of a flag)
| // This ensures deprecations (explicit author instructions) take priority over | ||
| // automated block reconstruction. | ||
| if ( ! updatedBlock.isValid ) { | ||
| const [ canReconstruct, , reconstructMeta ] = validateBlock( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In terms of "flow". These new validation levels can be considered new kind of "built-in" fixes. But in your implementation, they are being done in the "validateBlock" function, kind of changing what this function is about and also now the built-in fixes are kind of done in two places during the parsing flow.
I wonder if we should move the new logic and the code of applyBlockValidation also inline in this function. To schematize, the code of this function could become more clear I think something like:
if ( isblockValid ) {
return { level 1 }
}
if ( applyBuiltinFixes() ) {
return { level 2 }
}
if ( applyDeprecations() ) {
return { level 3 }
}
if ( applyReconstruction() ) {
return { level 4 }
}
if ( applyRegeneration() ) {
return { level 5 }
}
return { level6 (invalid) }
Each step could be tested separately and the flow becomes a lot more clear. WDYT? Would that be too big of a change for now?
| const isFallbackBlock = | ||
| block.name === getFreeformContentHandlerName() || | ||
| block.name === getUnregisteredTypeHandlerName(); | ||
| export function validateBlock( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just noting that this is a public API, so we should be careful about the changes not impacting existing usage. It seems fine at first sight, but you know better :P
|
The main case that I found a bit weird in my tests is the following: Content loss becomes I also noticed that it's almost impossible to get the paragraph block to be invalid in both this PR and trunk, it absorbs all content within the "p", it's a bit weird but it's a separate issue. I think it's a bit hard to know the full extent of the impact of this PR, but I'm willing to give it a try, especially given it impacts invalid content only. |
youknowriad
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can give this a try.
I left some potential flow/code enhancements but nothing really major.
|
We may want to document the new flag of the block API in the handbook. |
| * | ||
| * @type {Object} | ||
| */ | ||
| export const VALIDATION_LEVEL_NAME = { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this used anywhere?
dmsnell
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
May WordPress never corrupt or lose user content! Thanks for working on this.
As I read through this patch and reflect on the original discussion, I hear a mixture of issues that now seems to branch out beyond where it was before:
- Improve our ability to auto-detect benign changes or corruption and fix them reliability when the content in a post appears as a block but is different than the editor expects.
- Gracefully auto-update blocks lacking a proper migration/deprecation from older versions of the markup.
- Smush LLM-generated block content into valid markup when it’s not.
And I think it’s hard to see this proposal as solving all of them at once because I think the needs and characteristics are different. For content which has been manually adjusted or is out of date, or which a plugin might have altered, we expect small and intentional changes within the existing structure. For LLM-generated content we expect random and self-contradictory content which looks like blocks, and importantly, without the same sense of intention behind the content. Or, perhaps more tersely: intentional edits vs. new pluasible block content.
For automation purposes, I don’t fully understand the need for the validation levels, and to some extent have always struggled understanding their role. If we attempt to rewrite blocks automatically, as is already done when loading a post, then it seems like we could add just a few more steps and wouldn’t need the classification system.
In fact, I believe that Gutenberg already handles levels 0–2, meaning only level 4 is “new” and since it requires an opt-in, requires no apparent classification scheme to introduce as a new thing. We could quietly add the flag on block registration and handle these through the existing loader.
One of the changes I’ve found most valuable from my own work is the combination of #38794, #38923, and #39523 where we started preserving so-called “invalid” content (so-called, because usually it’s valid but the editor is unaware). Before then, it was common and easy for the editor to erase content in a way similar to how this change can wipe out content from something like a Heading block which uses the DIV or SPAN element as a wrapper instead of H1–H6.
Being able to preserve content that the editor doesn’t understand seems fundamental to providing the reliability people need for trusting the editor with their content. A common flow where this is normal is when copying and pasting a post from one WordPress to another, where one lacks the appropriate plugin or version to run every block. “Invalid blocks” should stay invalid and verbatim so that they work properly when copied back to the original site.
To that end I wonder if we have seen the project evolve more towards needing a new idiom: “fix slop” with a better name. In this case, that could be the addition of a new function such as fuzzyParseBlock( blockNode, fullBlockHTML ) which can be called in flows where structure is a guess (e.g. from LLM contexts), but also which could be called as a response to a user clicking on “Attempt to fix this broken block.”
Given the liberty taken in the resolutions here, and the risk of corrupting or losing content, and the combinatorial explosion of ways to try and detect invalid markup, I think it would be wise to hesitate to add this additional transformation to the loading stage.
If an LLM workflow wants to dump something into the editor it could call fuzzyParseBlocks() separately and then dump the output of that function into the editor, meaning this can all be separate, external, tested in isolation, and exchangeable.
In some of the earlier discussions (though apparently not in #21703) I advocated for the idea of allowing blocks to register their own attribute parser to determine validity in custom ways. That still offers merit that an automated system can’t, and I think it would fit in nice with a fuzzy-parse stage. For example, a Heading block could register a fuzzy parser and do any number of things, such as grabbing the first element, or taking the plaintext content of the entire provided HTML as the heading text, or that but including formatting elements, etc… We originally had in mind benign changes like added styles, classes, and attributes, something that has been incorporated in various ways over time in the core system. Having a fuzzy system as a separate action for cases where resolution isn’t obvious seems like an appropriate match to the qualitatively distinct operation of trying to make sense out of content that is self-contradictory.
@artpi we have tools to read and write block HTML structure on the server. let’s talk if you are having issues with this.
@mtias I see a few type changes that are only visible inline where they are read. this means that unless someone knows where to look for them they won’t likely be able to find them. it would be awesome if we could add documentation in the places people will look.
__wasMigratedandvalidationLevelcould be added to theWPBlocktype with an inline description of what it represents, even as an optional property.allowsReconstructioncould be added to the docblock onregisterBlockType()and also theWPBlockType. we have exampleblock.jsonfiles we could extend showing its use.
In summary, if our goal is more along the lines of cleaning up LLM slop then it seems like this could be handled on its own outside of the primary loading and validation framework and cause less risk to inadvertently erasing intentional content. Maybe a lot of the mechanisms here could be moved over into a new function for that purpose.
That would give users more freedom: to click on a block and have the editor attempt lossy reconstruction, to click “Attempt cleanup” or “Paste exactly” on a paste handler if non-valid block content is detected, and to be called manually in AI flows before inserting via createBlock() or whatever.
| const issues = [ | ||
| { | ||
| log: jest.fn(), | ||
| args: [ 'Expected attributes Array(2), saw Array(1).' ], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
are these text-only error messages the only way we have available here to mention the issues? would it not help if each issue were indicated as the level it corresponds to? then the areOnlyAttributeDifferences() function would amount to !issues.some( issue => issue.level > LEVELS.ATTRIBUTE_ONLY )
| if ( message.includes( 'Expected token of type' ) ) { | ||
| // Extract the token objects from the logger arguments. | ||
| const expectedToken = issue.args?.[ 2 ]; | ||
| const actualToken = issue.args?.[ 4 ]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
a simple example of what we expect on the issue array would greatly clarify what we’re trying to extract.
| expectedTagName = match?.[ 1 ]; | ||
| } else if ( expectedToken?.tagName ) { | ||
| expectedTagName = expectedToken.tagName; | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this whole section is a bit unclear: is there a way to show in a comment some inputs where we would expect it to run?
| const originalLength = block.originalContent?.trim().length || 0; | ||
| const generatedLength = generatedBlockContent?.trim().length || 0; | ||
| const wouldLoseContent = | ||
| originalLength > 0 && generatedLength < originalLength * 0.5; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it seems a bit arbitrary to measure 50%, especially when even in the example we have more tag markup than we do plaintext.
I wonder if this could be adjustable and go off of more robust heuristics. it seems like a good verification step in any case, and it’s good to see us thinking about how to detect when we mess up what someone produced.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I'm curious if there's better ideas for more robust heuristic to have as a default. I think this is better than nothing, but can be tweaked.
|
The goal of this is not at all influenced by cleaning up LLM HTML yet, though it happens to also assist with it since precision and deterministic structure is not its strength. The fundamental reason is to reduce or eliminate the confusing and paralyzing "this block is malformed" message to an end user, which erodes confidence in the system and is puzzling if you don't know how things work or what to do next. @dmsnell probably the most illustrative is to copy the HTML shared at the top and paste it in Gutenberg. Many of the examples "break" in trunk and we've been shifting the responsibility to resolve them to the user. The other motivation for this implementation is to make explicit a lot of the flow already in place but difficult to act on after the fact because it often left no trail in the system.
Levels 0-1 existed and were implicit. Level 2 partially, but fairly ad hoc. Level 3 is new, but is not set as opt-in but opt-out. All core blocks are thus opt-in by default. I'd like to follow this up with specific UI for handling this, which I'm working on That's the other reason for these explicit levels. As we introduce more ways of manipulating content, including collaboration flows, suggestion flows, etc, being able to keep track of what level of integrity a block is at is going to be useful to develop specific UIs to help users disambiguate when needed. Which also can apply to a better revision system, aware of the state of blocks, where we could easily classify changes that were done automatically (migrations, deprecations, reconstruction levels) from user changes, and collapse them, stack them, etc, to make the interface more useful.
I initially had this as a separate level (a RAW_TRANSFORM one) but came to the conclusion (as you point out at the beginning) that it's a separate problem about dropping arbitrary HTML and converting it to the best block approximations. That's something we wanted to do with the classic -> blocks as an "always on" setting. I think it's worth exploring separately. That gets grouped with the overall "paste arbitrary html" and have it mapped to the most adequate places. That does require more knowledge of blocks and may require block author instructions to be done properly. |
|
@mtias I was tracking this work in the past in #38922
to me this is one of the most significant issues, which also led to data corruption when loading blocks which don’t invalidate. specifically, when we upgrade the list block we moved from a selector for inner content to inner blocks, and so the block loaded with zero list items, which was a valid parse, and erased any existing items from the previous version. the deprecations didn’t catch this either, though a version number would have.
Are you talking about #7604? At some point @jasmussen had some lovely figures of colored dots to the side of each block with issues. I felt like there was a huge win to be had with those, because we could differentiate warnings and errors, get distraction out of the block for things like grammar mistakes even, or blocks whose attributes require attention.
this is the part I would love to better understand, because I don’t quite get what the levels give us. the diff and resolution dialog is not that useful, especially in the way it hides and ignores inner blocks. I feel like we could improve that, but also I think there’s more to attribute differences than just differences. added and removed attributes seems more useful, whereas additional classes or attributes largely seem benign and should be preserved. |
I can dig up those mockups again, and there was potential. But I think there might be a better option still: a full fledged block editor linting tool. Accessible perhaps from within the document outline (2nd tab under list view), and emphasized with an unread dot on the list view icon if an error is there. And also surfaced as a panel in the pre-publish flow. Linting options would be in 3 levels like we're used to:
What do you think? |
|
thanks @jasmussen. well I still like the unobtrusive dots in situ, and I think you said with some better wording what I was trying to convey, that these can represent statuses for action or review. I love having a view like the list view and of showing these in the publish flow. but.
you had me until this 😆. I guarantee it will appear endlessly after everything has been seen and read, and it will distract people from writing, and people will try and find ways to hide it, but it will tirelessly return. for me this crosses from a helpful notice into demanding feigned urgency, probably because without switching tasks and opening the view, the editor doesn’t know what the dot is for — is it there because something is new? is something broken? is the dot misbehaving? and by the time we’ve opened up the view and realized the dot isn’t relevant we’ve forgotten what we were doing. |
|
@dmsnell no, it's a branch I pushed building upon this work, doesn't have a PR yet. :) I'll share a mockup of how it looks. |
Note
Implements #21703
This introduces a 5-tier block validation classification system to replace the current binary valid/invalid approach, allowing for more intelligent handling of block validation edge cases.
The aim of this work is to reduce the cases where a user is presented with a "this block is invalid" message.
Validation Levels
Level 0: ValidBlock
save(attributes) => originalContentLevel 1: MigratedBlock
Level 2: ReconstructedBlock
Level 3: RegeneratedBlock
allowsReconstruction !== false)level:3, HTML has<h2>)Level 4: InvalidBlock
allowsReconstruction: false)Implementation
1. Core API - Validation Level Constants
2. validateBlock() Return Format
The function returns a tuple with 3 elements:
Example usage:
3. Validation Level Detection Logic
Level 0: ValidBlock - Exact HTML match
Level 1: MigratedBlock - Handled in parser via deprecation system
Level 2: ReconstructedBlock - Attribute-only differences
Level 3: RegeneratedBlock - Content regenerated from attributes
Level 4: InvalidBlock - All validation failed
4. Block Type Registration
Blocks can opt out of regeneration (Level 3):
Blocks Can Opt Out
Blocks requiring stricter validation can disable regeneration:
Example HTML to test that has multiple failures in trunk