@mtias mtias commented Nov 8, 2025

Note

Implements #21703

This introduces a five-level block validation classification (Levels 0 to 4) to replace the current binary valid/invalid check, so that blocks whose saved markup differs from what save() regenerates can be recovered automatically when it is safe to do so.

The aim is to reduce how often a user is shown a "this block is invalid" message.

Validation Levels

Level 0: ValidBlock

  • Definition: Idempotent operation where save(attributes) => originalContent
  • When: Perfect match between saved content and regenerated content
  • Action: No intervention needed

Level 1: MigratedBlock

  • Definition: Source matched against defined deprecations sequentially
  • When: Block validated through a deprecation path
  • Action: Apply the matching deprecation's migrate() function
  • Test Cases:
    • Block with old attribute schema migrated to new schema
    • Block with changed save structure using deprecation
    • Block with multiple deprecations matching the most recent applicable one

Level 2: ReconstructedBlock

  • Definition: Conservative reconstruction with only attribute-level differences
  • When: Content structure is intact but attributes differ (e.g., missing generated classes)
  • Action: This level is always allowed and considered safe
  • Test Cases:
    • Block with missing generated class names from attributes
    • Block with different attribute values but same structure
    • Block with attribute order variations

Level 3: RegeneratedBlock

  • Definition: Content regenerated from attributes; may have structural differences
  • When: Content has structural differences (wrong tags, different text) but can be regenerated from attributes. Also applies to freeform/unregistered blocks.
  • Action: Regenerate content from attributes (requires allowsReconstruction !== false)
  • Test Cases:
    • Block with wrong HTML tag (comment says level:3, HTML has <h2>)
    • Dynamic block rendering server-side content differently
    • Template blocks with variable output
    • Blocks with computed/derived innerHTML from attributes
    • Unregistered block type converted to freeform
    • Classic block with raw HTML content
    • HTML block with freeform content

Level 4: InvalidBlock

  • Definition: Cannot be safely restored; requires user intervention
  • When: All other validation levels fail
  • Action: Show invalid block warning UI
  • Test Cases:
    • Block with corrupted HTML structure (when allowsReconstruction: false)
    • Block with missing required attributes
    • Block with save() function that throws errors
    • Block with incompatible save output

Implementation

1. Core API - Validation Level Constants

/**
 * Validation level constants ordered by decreasing certainty over content integrity.
 */
export const VALIDATION_LEVEL = {
    VALID_BLOCK: 0,           // Idempotent save operation
    MIGRATED_BLOCK: 1,        // Matched deprecation
    RECONSTRUCTED_BLOCK: 2,   // Attribute-level differences only
    REGENERATED_BLOCK: 3,     // Content regenerated from attributes
    INVALID_BLOCK: 4,         // Cannot be safely restored
};

/**
 * Human-readable names for validation levels.
 */
export const VALIDATION_LEVEL_NAME = {
    [ VALIDATION_LEVEL.VALID_BLOCK ]: 'ValidBlock',
    [ VALIDATION_LEVEL.MIGRATED_BLOCK ]: 'MigratedBlock',
    [ VALIDATION_LEVEL.RECONSTRUCTED_BLOCK ]: 'ReconstructedBlock',
    [ VALIDATION_LEVEL.REGENERATED_BLOCK ]: 'RegeneratedBlock',
    [ VALIDATION_LEVEL.INVALID_BLOCK ]: 'InvalidBlock',
};
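
As a rough illustration of how these constants might be consumed, the sketch below adds two small helpers. Both `isAcceptedLevel` and `getLevelName` are hypothetical names that do not appear in this PR; the constants are repeated only so the snippet runs on its own.

```javascript
// Constants repeated from the snippet above so this example is self-contained.
const VALIDATION_LEVEL = {
	VALID_BLOCK: 0,
	MIGRATED_BLOCK: 1,
	RECONSTRUCTED_BLOCK: 2,
	REGENERATED_BLOCK: 3,
	INVALID_BLOCK: 4,
};

const VALIDATION_LEVEL_NAME = {
	[ VALIDATION_LEVEL.VALID_BLOCK ]: 'ValidBlock',
	[ VALIDATION_LEVEL.MIGRATED_BLOCK ]: 'MigratedBlock',
	[ VALIDATION_LEVEL.RECONSTRUCTED_BLOCK ]: 'ReconstructedBlock',
	[ VALIDATION_LEVEL.REGENERATED_BLOCK ]: 'RegeneratedBlock',
	[ VALIDATION_LEVEL.INVALID_BLOCK ]: 'InvalidBlock',
};

// Hypothetical helper (not in the PR): levels 0-3 are accepted, level 4 is not.
function isAcceptedLevel( level ) {
	return (
		level >= VALIDATION_LEVEL.VALID_BLOCK &&
		level < VALIDATION_LEVEL.INVALID_BLOCK
	);
}

// Hypothetical helper (not in the PR): readable name with a fallback
// for values outside the 0-4 range.
function getLevelName( level ) {
	return VALIDATION_LEVEL_NAME[ level ] ?? 'Unknown';
}
```

Because the levels are ordered by decreasing certainty, a single numeric comparison is enough to separate accepted blocks from invalid ones.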

2. validateBlock() Return Format

The function returns a tuple with 3 elements:

return [
    isValid,              // boolean - true for levels 0-3, false for level 4
    validationIssues,     // array of validation issues
    metadata              // object with validationLevel, originalContent, generatedContent
];

Example usage:

const [ isValid, issues, metadata ] = validateBlock( block, blockType );

console.log( metadata.validationLevel );  // 0-4
console.log( metadata.originalContent );  // HTML from database
console.log( metadata.generatedContent ); // HTML from save()
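
One way a caller might digest the tuple is sketched below. `summarizeValidation` is a hypothetical helper, and `sampleResult` is a hand-written tuple in the documented shape rather than real `validateBlock()` output.

```javascript
// Hypothetical helper (not in the PR): reduce a validateBlock()-style
// tuple to the facts a caller usually needs.
function summarizeValidation( [ isValid, issues, metadata ] ) {
	const differs = metadata.originalContent !== metadata.generatedContent;
	return {
		isValid,
		level: metadata.validationLevel,
		issueCount: issues.length,
		// True when the block is accepted but its stored HTML would be rewritten.
		contentWillChange: isValid && differs,
	};
}

// Hand-written sample in the documented tuple shape (assumed values).
const sampleResult = [
	true,
	[],
	{
		validationLevel: 3,
		originalContent: '<h2>Title</h2>',
		generatedContent: '<h3 class="wp-block-heading">Title</h3>',
	},
];

const summary = summarizeValidation( sampleResult );
console.log( summary.level, summary.contentWillChange );
```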

3. Validation Level Detection Logic

Level 0: ValidBlock - Equivalent HTML match (as judged by isEquivalentHTML)

if ( isEquivalentHTML( block.originalContent, generatedBlockContent ) ) {
    return [
        true,
        [],
        {
            validationLevel: VALIDATION_LEVEL.VALID_BLOCK,
            originalContent: block.originalContent,
            generatedContent: generatedBlockContent,
        },
    ];
}

Level 1: MigratedBlock - Handled in parser via deprecation system

// In apply-block-deprecated-versions.js
block = {
    ...block,
    attributes: migratedAttributes,
    innerBlocks: migratedInnerBlocks,
    isValid: true,
    validationIssues: [],
    __wasMigrated: true, // Flag detected in parser
};

// In parser/index.js
if ( updatedBlock.__wasMigrated ) {
    updatedBlock.validationLevel = VALIDATION_LEVEL.MIGRATED_BLOCK;
    delete updatedBlock.__wasMigrated;
}
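
The flag handoff above mutates the block in place. For comparison, a non-mutating variant could look like the sketch below; `finalizeMigratedBlock` is a hypothetical name, and the numeric constant stands in for `VALIDATION_LEVEL.MIGRATED_BLOCK`.

```javascript
// Stand-in for VALIDATION_LEVEL.MIGRATED_BLOCK so the snippet runs alone.
const MIGRATED_BLOCK = 1;

// Hypothetical helper (not in the PR): promote the temporary
// __wasMigrated flag to a validationLevel without mutating the input.
function finalizeMigratedBlock( block ) {
	if ( ! block.__wasMigrated ) {
		return block;
	}
	// Drop the internal flag and attach the public level.
	const { __wasMigrated, ...rest } = block;
	return { ...rest, validationLevel: MIGRATED_BLOCK };
}

const migrated = finalizeMigratedBlock( {
	name: 'core/button',
	isValid: true,
	__wasMigrated: true,
} );
```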

Level 2: ReconstructedBlock - Attribute-only differences

// If only attribute-level differences are found (structure intact)
// This is always allowed and considered safe
if ( areOnlyAttributeDifferences( logger.getItems() ) ) {
    return [
        true,
        [],
        {
            validationLevel: VALIDATION_LEVEL.RECONSTRUCTED_BLOCK,
            originalContent: block.originalContent,
            generatedContent: generatedBlockContent,
        },
    ];
}
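
The body of `areOnlyAttributeDifferences()` is not shown here. The sketch below is one plausible shape for such a check, assuming each logged issue carries a `kind` string; that field, the kind names, and the function name `onlyAttributeIssues` are all hypothetical and may not match the PR.

```javascript
// Hypothetical issue kinds; the real logger items may be shaped differently.
const ATTRIBUTE_KINDS = new Set( [
	'attribute-missing',
	'attribute-value',
	'attribute-extra',
] );

// Hypothetical sketch: true only when at least one issue was logged
// and every logged issue is an attribute-level difference.
function onlyAttributeIssues( issues ) {
	// Assumed design choice: with no issues there is nothing to classify.
	if ( issues.length === 0 ) {
		return false;
	}
	return issues.every( ( issue ) => ATTRIBUTE_KINDS.has( issue.kind ) );
}
```

A single structural issue (for example a tag mismatch) is enough to push the block past Level 2 and on to the Level 3 checks.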

Level 3: RegeneratedBlock - Content regenerated from attributes

// If block allows reconstruction and has attributes, it's valid at Level 3
// We trust the attributes and will regenerate HTML from them
const allowsReconstruction = blockType.allowsReconstruction !== false;

if ( allowsReconstruction && block.attributes ) {
    // Check if generated content is reasonable
    const hasOriginalContent = block.originalContent?.trim().length > 0;
    const hasGeneratedContent = generatedBlockContent?.trim().length > 0;
    const contentIsReasonable = hasGeneratedContent || ! hasOriginalContent;

    if ( contentIsReasonable ) {
        return [
            true,
            [],
            {
                validationLevel: VALIDATION_LEVEL.REGENERATED_BLOCK,
                originalContent: block.originalContent,
                generatedContent: generatedBlockContent,
            },
        ];
    }
}

// Freeform/unregistered blocks also get Level 3
const isFallbackBlock =
    block.name === getFreeformContentHandlerName() ||
    block.name === getUnregisteredTypeHandlerName();

if ( isFallbackBlock ) {
    return [
        true,
        [],
        {
            validationLevel: VALIDATION_LEVEL.REGENERATED_BLOCK,
            originalContent: block.originalContent,
            generatedContent: block.originalContent,
        },
    ];
}
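
To make the `contentIsReasonable` condition easier to reason about, the sketch below isolates it as a standalone function (the function wrapper is an addition for illustration; the boolean logic is the same as in the snippet above) and runs it over a few inputs.

```javascript
// Same boolean logic as the Level 3 check, extracted for illustration.
function contentIsReasonable( originalContent, generatedContent ) {
	const hasOriginalContent = ( originalContent?.trim().length ?? 0 ) > 0;
	const hasGeneratedContent = ( generatedContent?.trim().length ?? 0 ) > 0;
	// Reject only when there was original content but nothing was generated.
	return hasGeneratedContent || ! hasOriginalContent;
}

const cases = [
	[ '<h2>A</h2>', '<h3>A</h3>' ], // both present
	[ '<div>text</div>', '' ], // original present, generated blank
	[ '', '' ], // both blank
	[ undefined, '   ' ], // nothing stored, whitespace generated
	[ '<div>text</div>', '<h2></h2>' ], // generated is an empty tag
];

const results = cases.map( ( [ original, generated ] ) =>
	contentIsReasonable( original, generated )
);
console.log( results );
```

Note that the check looks only at whether the generated string is blank, so an empty element such as `<h2></h2>` still counts as generated content under this condition.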

Level 4: InvalidBlock - All validation failed

// All validation levels failed
return [
    false,
    logger.getItems(),
    {
        validationLevel: VALIDATION_LEVEL.INVALID_BLOCK,
        originalContent: block.originalContent,
        generatedContent: generatedBlockContent,
    },
];
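
Read together, the snippets above suggest a cascade from the strictest check to the most permissive. The sketch below condenses that order into one function, under the assumption that the checks run in the order they are listed here. `classify` is a hypothetical name, its inputs are precomputed booleans rather than real HTML comparisons, and Level 1 is omitted because it is assigned in the parser.

```javascript
const LEVEL = {
	VALID: 0,
	MIGRATED: 1, // assigned in the parser, not in this cascade
	RECONSTRUCTED: 2,
	REGENERATED: 3,
	INVALID: 4,
};

// Hypothetical condensation of the detection order described above.
function classify( {
	htmlMatches = false,
	onlyAttributeDiffs = false,
	allowsReconstruction = true,
	hasAttributes = false,
	contentIsReasonable = false,
	isFallbackBlock = false,
} ) {
	if ( htmlMatches ) {
		return LEVEL.VALID;
	}
	if ( onlyAttributeDiffs ) {
		return LEVEL.RECONSTRUCTED;
	}
	if ( allowsReconstruction && hasAttributes && contentIsReasonable ) {
		return LEVEL.REGENERATED;
	}
	if ( isFallbackBlock ) {
		return LEVEL.REGENERATED;
	}
	return LEVEL.INVALID;
}
```

For example, a block with structural differences and `allowsReconstruction: false` falls through every branch and ends up at Level 4, while the same block with the default setting stops at Level 3.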

4. Block Type Registration

Blocks can opt out of regeneration (Level 3):

registerBlockType( 'my-plugin/strict-block', {
    // ... other properties

    /**
     * Whether the block allows regeneration from attributes.
     * When false, only exact HTML match (Level 0), deprecation (Level 1),
     * or attribute-only differences (Level 2) are acceptable.
     * Default: true
     */
    allowsReconstruction: false,
} );

Blocks Can Opt Out

Blocks requiring stricter validation can disable regeneration:

registerBlockType( 'my-plugin/strict-block', {
    allowsReconstruction: false,
    // Only Level 0 (exact match), Level 1 (deprecation),
    // or Level 2 (attribute-only differences) will pass
} );
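
Since the check is written as `allowsReconstruction !== false`, only the literal value `false` opts a block out. The sketch below wraps that expression in a hypothetical helper (`blockAllowsReconstruction`, not part of the PR) to show how different settings resolve.

```javascript
// Hypothetical wrapper around the expression used in the Level 3 check.
function blockAllowsReconstruction( blockType ) {
	return blockType.allowsReconstruction !== false;
}

const settings = {
	omitted: blockAllowsReconstruction( {} ),
	explicitTrue: blockAllowsReconstruction( { allowsReconstruction: true } ),
	explicitFalse: blockAllowsReconstruction( { allowsReconstruction: false } ),
	// Falsy but not strictly false, so regeneration stays enabled.
	zero: blockAllowsReconstruction( { allowsReconstruction: 0 } ),
};
console.log( settings );
```

Blocks that omit the property therefore keep the default behavior and remain eligible for Level 3.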

Example HTML for testing (several of these blocks fail validation in trunk)

<!-- This file contains block HTML examples that test different validation levels -->

<!-- ========================================== -->
<!-- Level 0: ValidBlock - Perfect match -->
<!-- ========================================== -->
<!-- The saved HTML exactly matches what save() would generate -->

<!-- wp:paragraph -->
<p>This is a perfect match paragraph.</p>
<!-- /wp:paragraph -->

<!-- wp:heading {"level":2} -->
<h2 class="wp-block-heading">Perfect heading with correct class</h2>
<!-- /wp:heading -->


<!-- ========================================== -->
<!-- Level 1: MigratedBlock - Deprecation applied -->
<!-- ========================================== -->
<!-- These match an old save() format and get migrated via deprecation -->

<!-- Button with old borderRadius attribute (migrates to style.border.radius) -->
<!-- wp:button {"borderRadius":25} -->
<div class="wp-block-button"><a class="wp-block-button__link" style="border-radius:25px">Deprecated
button</a></div>
<!-- /wp:button -->

<!-- Old button with customBackgroundColor (migrates to style.color.background) -->
<!-- wp:button {"customBackgroundColor":"#ff0000"} -->
<div class="wp-block-button"><a class="wp-block-button__link" style="background-color:#ff0000">Old color
button</a></div>
<!-- /wp:button -->


<!-- ========================================== -->
<!-- Level 2: ReconstructedBlock - Attribute-only differences -->
<!-- ========================================== -->
<!-- HTML structure matches but attributes differ (missing classes, different values) -->
<!-- These are always safe - no allowsReconstruction check needed -->

<!-- Paragraph with color attribute but missing generated classes -->
<!-- wp:paragraph {"textColor":"vivid-red"} -->
<p>Paragraph missing color classes</p>
<!-- /wp:paragraph -->

<!-- Heading with color but missing generated classes -->
<!-- wp:heading {"level":2,"textColor":"vivid-cyan-blue"} -->
<h2>Heading missing color classes</h2>
<!-- /wp:heading -->

<!-- Button with background color but missing generated classes -->
<!-- wp:button {"backgroundColor":"vivid-green-cyan"} -->
<div class="wp-block-button"><a class="wp-block-button__link wp-element-button">Button missing bg
classes</a></div>
<!-- /wp:button -->

<!-- Paragraph with extra unexpected class attribute -->
<!-- wp:paragraph -->
<p class="custom-class">Paragraph with extra class not in attributes</p>
<!-- /wp:paragraph -->


<!-- ========================================== -->
<!-- Level 3: RegeneratedBlock - Structural differences -->
<!-- ========================================== -->
<!-- HTML structure differs but can be regenerated from attributes -->
<!-- Requires allowsReconstruction !== false (default true for core blocks) -->

<!-- Wrong heading level tag (comment says 3, HTML has h2) -->
<!-- wp:heading {"level":3} -->
<h2>Should be h3 tag, will be regenerated</h2>
<!-- /wp:heading -->

<!-- Wrong HTML tag entirely -->
<!-- wp:heading -->
<span>This should be a heading tag, not span</span>
<!-- /wp:heading -->

<!-- Different text content than what attributes would generate -->
<!-- wp:heading {"level":2} -->
<h2>Text here doesn't affect regeneration - attributes are trusted</h2>
<!-- /wp:heading -->

<!-- Malformed/mismatched tags -->
<!-- wp:paragraph -->
<p>Opening tag is p, closing is wrong</div>
<!-- /wp:paragraph -->

<!-- Extra nested content that shouldn't exist -->
<!-- wp:heading {"level":2} -->
<h2>Heading text</h2>
<p>Extra paragraph that will be discarded on regeneration</p>
<!-- /wp:heading -->

<!-- Unregistered/missing blocks become Level 3 via fallback handler -->
<!-- wp:fake/nonexistent-block -->
<div>This block type doesn't exist - handled by missing block handler</div>
<!-- /wp:fake/nonexistent-block -->

<!-- wp:custom/missing-plugin -->
<p>Plugin was deactivated - handled by missing block handler</p>
<!-- /wp:custom/missing-plugin -->


<!-- ========================================== -->
<!-- Level 4: InvalidBlock - Cannot be reconstructed -->
<!-- ========================================== -->
<!-- These show the "This block contains unexpected or invalid content" warning -->

<!-- Example 1: Content length threshold exceeded (generated < 50% of original) -->
<!-- The heading only captures text from h2, but extra content would be lost -->
<!-- wp:heading -->
<h2>Title</h2>
<p>This content will be completely lost during regeneration.</p>
<!-- /wp:heading -->

<!-- Example 2: Empty generated content -->
<!-- The heading selector (h1-h6) finds nothing in a div, so content is empty -->
<!-- save() generates an empty heading, but original had content = Level 4 -->
<!-- wp:heading -->
<div>This content is in a div, not a heading tag. The heading block's content
selector only matches h1-h6, so it won't capture this text. The generated
output will be an empty heading tag while the original has substantial content.</div>
<!-- /wp:heading -->

<!-- Example 3: Too many validation issues -->
<!-- Multiple wrong tags and structural mismatches accumulate issues -->
<!-- wp:heading {"level":2} -->
<div class="wrong"><span>Wrong</span><em>tags</em><strong>everywhere</strong><a
href="#">links</a><code>code</code><mark>mark</mark></div>
<!-- /wp:heading -->


<!-- ========================================== -->
<!-- MIXED CONTENT TEST - Realistic Post Content -->
<!-- ========================================== -->
<!-- Simulates a realistic post with various validation levels -->

<!-- Level 0: ValidBlock -->
<!-- wp:paragraph -->
<p>Welcome to my post! This paragraph is perfect.</p>
<!-- /wp:paragraph -->

<!-- Level 2: ReconstructedBlock - Missing color classes -->
<!-- wp:heading {"level":2,"textColor":"vivid-cyan-blue"} -->
<h2>Section Title Without Classes</h2>
<!-- /wp:heading -->

<!-- Level 0: ValidBlock -->
<!-- wp:paragraph -->
<p>This is normal content that matches perfectly.</p>
<!-- /wp:paragraph -->

<!-- Level 3: RegeneratedBlock - Wrong tag structure -->
<!-- wp:heading {"level":3} -->
<h4>Wrong heading level tag</h4>
<!-- /wp:heading -->

<!-- Level 1: MigratedBlock - Old button format -->
<!-- wp:button {"borderRadius":10} -->
<div class="wp-block-button"><a class="wp-block-button__link" style="border-radius:10px">Click me</a></div>
<!-- /wp:button -->

<!-- Level 0: ValidBlock -->
<!-- wp:paragraph -->
<p>More regular content.</p>
<!-- /wp:paragraph -->

@mtias mtias added the [Status] In Progress label Nov 8, 2025

github-actions bot commented Nov 8, 2025

The following accounts have interacted with this PR and/or linked issues. I will continue to update these lists as activity occurs. You can also manually ask me to refresh this list by adding the props-bot label.

If you're merging code through a pull request on GitHub, copy and paste the following into the bottom of the merge commit message.

Co-authored-by: mtias <[email protected]>
Co-authored-by: mcsf <[email protected]>
Co-authored-by: youknowriad <[email protected]>
Co-authored-by: lezama <[email protected]>
Co-authored-by: dmsnell <[email protected]>
Co-authored-by: artpi <[email protected]>
Co-authored-by: jasmussen <[email protected]>

To understand the WordPress project's expectations around crediting contributors, please review the Contributor Attribution page in the Core Handbook.


github-actions bot commented Nov 8, 2025

Size Change: +674 B (+0.03%)

Total Size: 2.58 MB

Filename Size Change
build/scripts/blocks/index.min.js 57.1 kB +674 B (+1.19%)
Unchanged files:
Filename Size
build/modules/a11y/index.min.js 355 B
build/modules/abilities/index.min.js 42.3 kB
build/modules/block-editor/utils/fit-text-frontend.min.js 549 B
build/modules/block-library/accordion/view.min.js 779 B
build/modules/block-library/file/view.min.js 346 B
build/modules/block-library/form/view.min.js 528 B
build/modules/block-library/image/view.min.js 1.95 kB
build/modules/block-library/navigation/view.min.js 1.03 kB
build/modules/block-library/query/view.min.js 518 B
build/modules/block-library/search/view.min.js 498 B
build/modules/block-library/tabs/view.min.js 859 B
build/modules/boot/index.min.js 102 kB
build/modules/core-abilities/index.min.js 890 B
build/modules/edit-site-init/index.min.js 1.41 kB
build/modules/interactivity-router/full-page.min.js 451 B
build/modules/interactivity-router/index.min.js 11.5 kB
build/modules/interactivity/index.min.js 14.9 kB
build/modules/latex-to-mathml/index.min.js 56.5 kB
build/modules/latex-to-mathml/loader.min.js 131 B
build/modules/lazy-editor/index.min.js 18.8 kB
build/modules/route/index.min.js 24.6 kB
build/modules/workflow/index.min.js 36.8 kB
build/scripts/a11y/index.min.js 1.06 kB
build/scripts/annotations/index.min.js 2.38 kB
build/scripts/api-fetch/index.min.js 2.83 kB
build/scripts/autop/index.min.js 2.18 kB
build/scripts/blob/index.min.js 631 B
build/scripts/block-directory/index.min.js 8.03 kB
build/scripts/block-editor/index.min.js 325 kB
build/scripts/block-library/index.min.js 277 kB
build/scripts/block-serialization-default-parser/index.min.js 1.16 kB
build/scripts/block-serialization-spec-parser/index.min.js 3.08 kB
build/scripts/commands/index.min.js 19.9 kB
build/scripts/components/index.min.js 272 kB
build/scripts/compose/index.min.js 13.9 kB
build/scripts/core-commands/index.min.js 4.13 kB
build/scripts/core-data/index.min.js 86.7 kB
build/scripts/customize-widgets/index.min.js 12.3 kB
build/scripts/data-controls/index.min.js 793 B
build/scripts/data/index.min.js 9.62 kB
build/scripts/date/index.min.js 23.6 kB
build/scripts/deprecated/index.min.js 752 B
build/scripts/dom-ready/index.min.js 476 B
build/scripts/dom/index.min.js 4.91 kB
build/scripts/edit-post/index.min.js 16.3 kB
build/scripts/edit-site/index.min.js 234 kB
build/scripts/edit-widgets/index.min.js 20 kB
build/scripts/editor/index.min.js 284 kB
build/scripts/element/index.min.js 5.19 kB
build/scripts/escape-html/index.min.js 586 B
build/scripts/format-library/index.min.js 10.8 kB
build/scripts/hooks/index.min.js 1.83 kB
build/scripts/html-entities/index.min.js 494 B
build/scripts/i18n/index.min.js 2.46 kB
build/scripts/is-shallow-equal/index.min.js 568 B
build/scripts/keyboard-shortcuts/index.min.js 1.57 kB
build/scripts/keycodes/index.min.js 1.53 kB
build/scripts/list-reusable-blocks/index.min.js 2.44 kB
build/scripts/media-utils/index.min.js 69.5 kB
build/scripts/notices/index.min.js 1.11 kB
build/scripts/nux/index.min.js 1.88 kB
build/scripts/patterns/index.min.js 7.88 kB
build/scripts/plugins/index.min.js 2.14 kB
build/scripts/preferences-persistence/index.min.js 2.15 kB
build/scripts/preferences/index.min.js 3.31 kB
build/scripts/primitives/index.min.js 1.01 kB
build/scripts/priority-queue/index.min.js 1.61 kB
build/scripts/private-apis/index.min.js 1.06 kB
build/scripts/react-i18n/index.min.js 832 B
build/scripts/react-refresh-entry/index.min.js 9.44 kB
build/scripts/react-refresh-runtime/index.min.js 3.59 kB
build/scripts/redux-routine/index.min.js 3.36 kB
build/scripts/reusable-blocks/index.min.js 2.93 kB
build/scripts/rich-text/index.min.js 12.9 kB
build/scripts/router/index.min.js 5.96 kB
build/scripts/server-side-render/index.min.js 1.91 kB
build/scripts/shortcode/index.min.js 1.58 kB
build/scripts/style-engine/index.min.js 2.33 kB
build/scripts/theme/index.min.js 20.8 kB
build/scripts/token-list/index.min.js 739 B
build/scripts/undo-manager/index.min.js 917 B
build/scripts/url/index.min.js 3.98 kB
build/scripts/vendors/react-dom.min.js 43 kB
build/scripts/vendors/react-jsx-runtime.min.js 691 B
build/scripts/vendors/react.min.js 4.27 kB
build/scripts/viewport/index.min.js 1.22 kB
build/scripts/warning/index.min.js 454 B
build/scripts/widgets/index.min.js 7.81 kB
build/scripts/wordcount/index.min.js 1.04 kB
build/styles/block-directory/style-rtl.css 1.05 kB
build/styles/block-directory/style.css 1.05 kB
build/styles/block-editor/content-rtl.css 4.8 kB
build/styles/block-editor/content.css 4.79 kB
build/styles/block-editor/default-editor-styles-rtl.css 224 B
build/styles/block-editor/default-editor-styles.css 224 B
build/styles/block-editor/style-rtl.css 16.4 kB
build/styles/block-editor/style.css 16.4 kB
build/styles/block-library/accordion-heading/style-rtl.css 325 B
build/styles/block-library/accordion-heading/style.css 325 B
build/styles/block-library/accordion-item/style-rtl.css 180 B
build/styles/block-library/accordion-item/style.css 180 B
build/styles/block-library/accordion-panel/style-rtl.css 99 B
build/styles/block-library/accordion-panel/style.css 99 B
build/styles/block-library/accordion/style-rtl.css 62 B
build/styles/block-library/accordion/style.css 62 B
build/styles/block-library/archives/editor-rtl.css 61 B
build/styles/block-library/archives/editor.css 61 B
build/styles/block-library/archives/style-rtl.css 90 B
build/styles/block-library/archives/style.css 90 B
build/styles/block-library/audio/editor-rtl.css 149 B
build/styles/block-library/audio/editor.css 151 B
build/styles/block-library/audio/style-rtl.css 132 B
build/styles/block-library/audio/style.css 132 B
build/styles/block-library/audio/theme-rtl.css 134 B
build/styles/block-library/audio/theme.css 134 B
build/styles/block-library/avatar/editor-rtl.css 115 B
build/styles/block-library/avatar/editor.css 115 B
build/styles/block-library/avatar/style-rtl.css 104 B
build/styles/block-library/avatar/style.css 104 B
build/styles/block-library/breadcrumbs/style-rtl.css 203 B
build/styles/block-library/breadcrumbs/style.css 203 B
build/styles/block-library/button/editor-rtl.css 265 B
build/styles/block-library/button/editor.css 265 B
build/styles/block-library/button/style-rtl.css 554 B
build/styles/block-library/button/style.css 554 B
build/styles/block-library/buttons/editor-rtl.css 291 B
build/styles/block-library/buttons/editor.css 291 B
build/styles/block-library/buttons/style-rtl.css 349 B
build/styles/block-library/buttons/style.css 349 B
build/styles/block-library/calendar/style-rtl.css 239 B
build/styles/block-library/calendar/style.css 239 B
build/styles/block-library/categories/editor-rtl.css 132 B
build/styles/block-library/categories/editor.css 131 B
build/styles/block-library/categories/style-rtl.css 152 B
build/styles/block-library/categories/style.css 152 B
build/styles/block-library/classic-rtl.css 321 B
build/styles/block-library/classic.css 321 B
build/styles/block-library/code/editor-rtl.css 53 B
build/styles/block-library/code/editor.css 53 B
build/styles/block-library/code/style-rtl.css 139 B
build/styles/block-library/code/style.css 139 B
build/styles/block-library/code/theme-rtl.css 122 B
build/styles/block-library/code/theme.css 122 B
build/styles/block-library/columns/editor-rtl.css 108 B
build/styles/block-library/columns/editor.css 108 B
build/styles/block-library/columns/style-rtl.css 421 B
build/styles/block-library/columns/style.css 421 B
build/styles/block-library/comment-author-avatar/editor-rtl.css 124 B
build/styles/block-library/comment-author-avatar/editor.css 124 B
build/styles/block-library/comment-author-name/style-rtl.css 72 B
build/styles/block-library/comment-author-name/style.css 72 B
build/styles/block-library/comment-content/style-rtl.css 120 B
build/styles/block-library/comment-content/style.css 120 B
build/styles/block-library/comment-date/style-rtl.css 65 B
build/styles/block-library/comment-date/style.css 65 B
build/styles/block-library/comment-edit-link/style-rtl.css 70 B
build/styles/block-library/comment-edit-link/style.css 70 B
build/styles/block-library/comment-reply-link/style-rtl.css 71 B
build/styles/block-library/comment-reply-link/style.css 71 B
build/styles/block-library/comment-template/style-rtl.css 191 B
build/styles/block-library/comment-template/style.css 191 B
build/styles/block-library/comments-pagination-numbers/editor-rtl.css 122 B
build/styles/block-library/comments-pagination-numbers/editor.css 121 B
build/styles/block-library/comments-pagination/editor-rtl.css 168 B
build/styles/block-library/comments-pagination/editor.css 168 B
build/styles/block-library/comments-pagination/style-rtl.css 201 B
build/styles/block-library/comments-pagination/style.css 201 B
build/styles/block-library/comments-title/editor-rtl.css 75 B
build/styles/block-library/comments-title/editor.css 75 B
build/styles/block-library/comments/editor-rtl.css 842 B
build/styles/block-library/comments/editor.css 842 B
build/styles/block-library/comments/style-rtl.css 637 B
build/styles/block-library/comments/style.css 637 B
build/styles/block-library/common-rtl.css 1.11 kB
build/styles/block-library/common.css 1.11 kB
build/styles/block-library/cover/editor-rtl.css 631 B
build/styles/block-library/cover/editor.css 631 B
build/styles/block-library/cover/style-rtl.css 1.82 kB
build/styles/block-library/cover/style.css 1.81 kB
build/styles/block-library/details/editor-rtl.css 65 B
build/styles/block-library/details/editor.css 65 B
build/styles/block-library/details/style-rtl.css 86 B
build/styles/block-library/details/style.css 86 B
build/styles/block-library/editor-elements-rtl.css 75 B
build/styles/block-library/editor-elements.css 75 B
build/styles/block-library/editor-rtl.css 11.8 kB
build/styles/block-library/editor.css 11.8 kB
build/styles/block-library/elements-rtl.css 54 B
build/styles/block-library/elements.css 54 B
build/styles/block-library/embed/editor-rtl.css 331 B
build/styles/block-library/embed/editor.css 331 B
build/styles/block-library/embed/style-rtl.css 448 B
build/styles/block-library/embed/style.css 448 B
build/styles/block-library/embed/theme-rtl.css 133 B
build/styles/block-library/embed/theme.css 133 B
build/styles/block-library/file/editor-rtl.css 324 B
build/styles/block-library/file/editor.css 324 B
build/styles/block-library/file/style-rtl.css 278 B
build/styles/block-library/file/style.css 278 B
build/styles/block-library/footnotes/style-rtl.css 198 B
build/styles/block-library/footnotes/style.css 197 B
build/styles/block-library/form-input/editor-rtl.css 229 B
build/styles/block-library/form-input/editor.css 229 B
build/styles/block-library/form-input/style-rtl.css 366 B
build/styles/block-library/form-input/style.css 366 B
build/styles/block-library/form-submission-notification/editor-rtl.css 344 B
build/styles/block-library/form-submission-notification/editor.css 341 B
build/styles/block-library/form-submit-button/style-rtl.css 69 B
build/styles/block-library/form-submit-button/style.css 69 B
build/styles/block-library/freeform/editor-rtl.css 2.59 kB
build/styles/block-library/freeform/editor.css 2.59 kB
build/styles/block-library/gallery/editor-rtl.css 615 B
build/styles/block-library/gallery/editor.css 616 B
build/styles/block-library/gallery/style-rtl.css 1.84 kB
build/styles/block-library/gallery/style.css 1.84 kB
build/styles/block-library/gallery/theme-rtl.css 108 B
build/styles/block-library/gallery/theme.css 108 B
build/styles/block-library/group/editor-rtl.css 335 B
build/styles/block-library/group/editor.css 335 B
build/styles/block-library/group/style-rtl.css 103 B
build/styles/block-library/group/style.css 103 B
build/styles/block-library/group/theme-rtl.css 79 B
build/styles/block-library/group/theme.css 79 B
build/styles/block-library/heading/style-rtl.css 205 B
build/styles/block-library/heading/style.css 205 B
build/styles/block-library/html/editor-rtl.css 419 B
build/styles/block-library/html/editor.css 419 B
build/styles/block-library/image/editor-rtl.css 763 B
build/styles/block-library/image/editor.css 763 B
build/styles/block-library/image/style-rtl.css 1.6 kB
build/styles/block-library/image/style.css 1.59 kB
build/styles/block-library/image/theme-rtl.css 137 B
build/styles/block-library/image/theme.css 137 B
build/styles/block-library/latest-comments/style-rtl.css 355 B
build/styles/block-library/latest-comments/style.css 354 B
build/styles/block-library/latest-posts/editor-rtl.css 139 B
build/styles/block-library/latest-posts/editor.css 138 B
build/styles/block-library/latest-posts/style-rtl.css 520 B
build/styles/block-library/latest-posts/style.css 520 B
build/styles/block-library/list/style-rtl.css 107 B
build/styles/block-library/list/style.css 107 B
build/styles/block-library/loginout/style-rtl.css 61 B
build/styles/block-library/loginout/style.css 61 B
build/styles/block-library/math/editor-rtl.css 105 B
build/styles/block-library/math/editor.css 105 B
build/styles/block-library/math/style-rtl.css 61 B
build/styles/block-library/math/style.css 61 B
build/styles/block-library/media-text/editor-rtl.css 321 B
build/styles/block-library/media-text/editor.css 320 B
build/styles/block-library/media-text/style-rtl.css 543 B
build/styles/block-library/media-text/style.css 542 B
build/styles/block-library/more/editor-rtl.css 393 B
build/styles/block-library/more/editor.css 393 B
build/styles/block-library/navigation-link/editor-rtl.css 645 B
build/styles/block-library/navigation-link/editor.css 647 B
build/styles/block-library/navigation-link/style-rtl.css 190 B
build/styles/block-library/navigation-link/style.css 188 B
build/styles/block-library/navigation-submenu/editor-rtl.css 295 B
build/styles/block-library/navigation-submenu/editor.css 294 B
build/styles/block-library/navigation/editor-rtl.css 2.24 kB
build/styles/block-library/navigation/editor.css 2.24 kB
build/styles/block-library/navigation/style-rtl.css 2.27 kB
build/styles/block-library/navigation/style.css 2.25 kB
build/styles/block-library/nextpage/editor-rtl.css 392 B
build/styles/block-library/nextpage/editor.css 392 B
build/styles/block-library/page-list/editor-rtl.css 356 B
build/styles/block-library/page-list/editor.css 356 B
build/styles/block-library/page-list/style-rtl.css 192 B
build/styles/block-library/page-list/style.css 192 B
build/styles/block-library/paragraph/editor-rtl.css 251 B
build/styles/block-library/paragraph/editor.css 251 B
build/styles/block-library/paragraph/style-rtl.css 341 B
build/styles/block-library/paragraph/style.css 340 B
build/styles/block-library/post-author-biography/style-rtl.css 74 B
build/styles/block-library/post-author-biography/style.css 74 B
build/styles/block-library/post-author-name/style-rtl.css 69 B
build/styles/block-library/post-author-name/style.css 69 B
build/styles/block-library/post-author/style-rtl.css 188 B
build/styles/block-library/post-author/style.css 189 B
build/styles/block-library/post-comments-count/style-rtl.css 72 B
build/styles/block-library/post-comments-count/style.css 72 B
build/styles/block-library/post-comments-form/editor-rtl.css 96 B
build/styles/block-library/post-comments-form/editor.css 96 B
build/styles/block-library/post-comments-form/style-rtl.css 525 B
build/styles/block-library/post-comments-form/style.css 525 B
build/styles/block-library/post-comments-link/style-rtl.css 71 B
build/styles/block-library/post-comments-link/style.css 71 B
build/styles/block-library/post-content/style-rtl.css 61 B
build/styles/block-library/post-content/style.css 61 B
build/styles/block-library/post-date/style-rtl.css 62 B
build/styles/block-library/post-date/style.css 62 B
build/styles/block-library/post-excerpt/editor-rtl.css 71 B
build/styles/block-library/post-excerpt/editor.css 71 B
build/styles/block-library/post-excerpt/style-rtl.css 155 B
build/styles/block-library/post-excerpt/style.css 155 B
build/styles/block-library/post-featured-image/editor-rtl.css 719 B
build/styles/block-library/post-featured-image/editor.css 717 B
build/styles/block-library/post-featured-image/style-rtl.css 347 B
build/styles/block-library/post-featured-image/style.css 347 B
build/styles/block-library/post-navigation-link/style-rtl.css 215 B
build/styles/block-library/post-navigation-link/style.css 214 B
build/styles/block-library/post-template/style-rtl.css 414 B
build/styles/block-library/post-template/style.css 414 B
build/styles/block-library/post-terms/style-rtl.css 96 B
build/styles/block-library/post-terms/style.css 96 B
build/styles/block-library/post-time-to-read/style-rtl.css 70 B
build/styles/block-library/post-time-to-read/style.css 70 B
build/styles/block-library/post-title/style-rtl.css 162 B
build/styles/block-library/post-title/style.css 162 B
build/styles/block-library/preformatted/style-rtl.css 125 B
build/styles/block-library/preformatted/style.css 125 B
build/styles/block-library/pullquote/editor-rtl.css 133 B
build/styles/block-library/pullquote/editor.css 133 B
build/styles/block-library/pullquote/style-rtl.css 365 B
build/styles/block-library/pullquote/style.css 365 B
build/styles/block-library/pullquote/theme-rtl.css 176 B
build/styles/block-library/pullquote/theme.css 176 B
build/styles/block-library/query-pagination-numbers/editor-rtl.css 121 B
build/styles/block-library/query-pagination-numbers/editor.css 118 B
build/styles/block-library/query-pagination/editor-rtl.css 154 B
build/styles/block-library/query-pagination/editor.css 154 B
build/styles/block-library/query-pagination/style-rtl.css 237 B
build/styles/block-library/query-pagination/style.css 237 B
build/styles/block-library/query-title/style-rtl.css 64 B
build/styles/block-library/query-title/style.css 64 B
build/styles/block-library/query-total/style-rtl.css 64 B
build/styles/block-library/query-total/style.css 64 B
build/styles/block-library/query/editor-rtl.css 438 B
build/styles/block-library/query/editor.css 438 B
build/styles/block-library/quote/style-rtl.css 238 B
build/styles/block-library/quote/style.css 238 B
build/styles/block-library/quote/theme-rtl.css 233 B
build/styles/block-library/quote/theme.css 236 B
build/styles/block-library/read-more/style-rtl.css 131 B
build/styles/block-library/read-more/style.css 131 B
build/styles/block-library/reset-rtl.css 472 B
build/styles/block-library/reset.css 472 B
build/styles/block-library/rss/editor-rtl.css 126 B
build/styles/block-library/rss/editor.css 126 B
build/styles/block-library/rss/style-rtl.css 284 B
build/styles/block-library/rss/style.css 283 B
build/styles/block-library/search/editor-rtl.css 199 B
build/styles/block-library/search/editor.css 199 B
build/styles/block-library/search/style-rtl.css 665 B
build/styles/block-library/search/style.css 666 B
build/styles/block-library/search/theme-rtl.css 113 B
build/styles/block-library/search/theme.css 113 B
build/styles/block-library/separator/editor-rtl.css 100 B
build/styles/block-library/separator/editor.css 100 B
build/styles/block-library/separator/style-rtl.css 248 B
build/styles/block-library/separator/style.css 248 B
build/styles/block-library/separator/theme-rtl.css 195 B
build/styles/block-library/separator/theme.css 195 B
build/styles/block-library/shortcode/editor-rtl.css 286 B
build/styles/block-library/shortcode/editor.css 286 B
build/styles/block-library/site-logo/editor-rtl.css 773 B
build/styles/block-library/site-logo/editor.css 770 B
build/styles/block-library/site-logo/style-rtl.css 218 B
build/styles/block-library/site-logo/style.css 218 B
build/styles/block-library/site-tagline/editor-rtl.css 87 B
build/styles/block-library/site-tagline/editor.css 87 B
build/styles/block-library/site-tagline/style-rtl.css 65 B
build/styles/block-library/site-tagline/style.css 65 B
build/styles/block-library/site-title/editor-rtl.css 85 B
build/styles/block-library/site-title/editor.css 85 B
build/styles/block-library/site-title/style-rtl.css 143 B
build/styles/block-library/site-title/style.css 143 B
build/styles/block-library/social-link/editor-rtl.css 314 B
build/styles/block-library/social-link/editor.css 314 B
build/styles/block-library/social-links/editor-rtl.css 339 B
build/styles/block-library/social-links/editor.css 338 B
build/styles/block-library/social-links/style-rtl.css 1.51 kB
build/styles/block-library/social-links/style.css 1.51 kB
build/styles/block-library/spacer/editor-rtl.css 346 B
build/styles/block-library/spacer/editor.css 346 B
build/styles/block-library/spacer/style-rtl.css 48 B
build/styles/block-library/spacer/style.css 48 B
build/styles/block-library/style-rtl.css 16.5 kB
build/styles/block-library/style.css 16.5 kB
build/styles/block-library/tab/style-rtl.css 202 B
build/styles/block-library/tab/style.css 202 B
build/styles/block-library/table-of-contents/style-rtl.css 83 B
build/styles/block-library/table-of-contents/style.css 83 B
build/styles/block-library/table/editor-rtl.css 394 B
build/styles/block-library/table/editor.css 394 B
build/styles/block-library/table/style-rtl.css 641 B
build/styles/block-library/table/style.css 640 B
build/styles/block-library/table/theme-rtl.css 152 B
build/styles/block-library/table/theme.css 152 B
build/styles/block-library/tabs/editor-rtl.css 236 B
build/styles/block-library/tabs/editor.css 236 B
build/styles/block-library/tabs/style-rtl.css 983 B
build/styles/block-library/tabs/style.css 983 B
build/styles/block-library/tag-cloud/editor-rtl.css 92 B
build/styles/block-library/tag-cloud/editor.css 92 B
build/styles/block-library/tag-cloud/style-rtl.css 248 B
build/styles/block-library/tag-cloud/style.css 248 B
build/styles/block-library/template-part/editor-rtl.css 368 B
build/styles/block-library/template-part/editor.css 368 B
build/styles/block-library/template-part/theme-rtl.css 113 B
build/styles/block-library/template-part/theme.css 113 B
build/styles/block-library/term-count/style-rtl.css 63 B
build/styles/block-library/term-count/style.css 63 B
build/styles/block-library/term-description/style-rtl.css 126 B
build/styles/block-library/term-description/style.css 126 B
build/styles/block-library/term-name/style-rtl.css 62 B
build/styles/block-library/term-name/style.css 62 B
build/styles/block-library/term-template/editor-rtl.css 225 B
build/styles/block-library/term-template/editor.css 225 B
build/styles/block-library/term-template/style-rtl.css 114 B
build/styles/block-library/term-template/style.css 114 B
build/styles/block-library/text-columns/editor-rtl.css 95 B
build/styles/block-library/text-columns/editor.css 95 B
build/styles/block-library/text-columns/style-rtl.css 165 B
build/styles/block-library/text-columns/style.css 165 B
build/styles/block-library/theme-rtl.css 715 B
build/styles/block-library/theme.css 719 B
build/styles/block-library/verse/style-rtl.css 123 B
build/styles/block-library/verse/style.css 123 B
build/styles/block-library/video/editor-rtl.css 415 B
build/styles/block-library/video/editor.css 416 B
build/styles/block-library/video/style-rtl.css 202 B
build/styles/block-library/video/style.css 202 B
build/styles/block-library/video/theme-rtl.css 134 B
build/styles/block-library/video/theme.css 134 B
build/styles/commands/style-rtl.css 1.72 kB
build/styles/commands/style.css 1.72 kB
build/styles/components/style-rtl.css 14 kB
build/styles/components/style.css 14 kB
build/styles/customize-widgets/style-rtl.css 1.44 kB
build/styles/customize-widgets/style.css 1.44 kB
build/styles/edit-post/classic-rtl.css 426 B
build/styles/edit-post/classic.css 427 B
build/styles/edit-post/style-rtl.css 3.42 kB
build/styles/edit-post/style.css 3.42 kB
build/styles/edit-site/style-rtl.css 16.1 kB
build/styles/edit-site/style.css 16.2 kB
build/styles/edit-widgets/style-rtl.css 4.67 kB
build/styles/edit-widgets/style.css 4.67 kB
build/styles/editor/style-rtl.css 18.8 kB
build/styles/editor/style.css 18.8 kB
build/styles/format-library/style-rtl.css 326 B
build/styles/format-library/style.css 326 B
build/styles/list-reusable-blocks/style-rtl.css 1.02 kB
build/styles/list-reusable-blocks/style.css 1.02 kB
build/styles/media-utils/style-rtl.css 400 B
build/styles/media-utils/style.css 400 B
build/styles/nux/style-rtl.css 622 B
build/styles/nux/style.css 618 B
build/styles/patterns/style-rtl.css 611 B
build/styles/patterns/style.css 611 B
build/styles/preferences/style-rtl.css 415 B
build/styles/preferences/style.css 415 B
build/styles/reusable-blocks/style-rtl.css 275 B
build/styles/reusable-blocks/style.css 275 B
build/styles/widgets/style-rtl.css 1.17 kB
build/styles/widgets/style.css 1.18 kB

compressed-size-action

@mtias

This comment was marked as outdated.

@mtias
Member Author

mtias commented Nov 9, 2025

From the HTML test example, all blocks should be silently reconstructed, except for the level 5 case, which should error out in the UI.

@mtias

This comment was marked as outdated.

// - Both have content, OR
// - Both are empty (valid empty block), OR
// - Original is empty but generated has content (adding generated classes/structure)
const contentIsReasonable = hasGeneratedContent || ! hasOriginalContent;
Contributor

Which case are we excluding here?

Contributor

Was wondering the same.
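
For reference, enumerating the four input combinations of the expression in question shows which case it rejects. This is a minimal sketch: only `hasGeneratedContent`, `hasOriginalContent`, and the boolean expression come from the snippet; the wrapper function and the loop are illustrative.

```javascript
// Same expression as in the snippet under review, wrapped so it can be called.
function contentIsReasonable( hasOriginalContent, hasGeneratedContent ) {
	return hasGeneratedContent || ! hasOriginalContent;
}

// Enumerate all four combinations of the two flags.
const rows = [];
for ( const hasOriginalContent of [ false, true ] ) {
	for ( const hasGeneratedContent of [ false, true ] ) {
		rows.push( {
			hasOriginalContent,
			hasGeneratedContent,
			reasonable: contentIsReasonable(
				hasOriginalContent,
				hasGeneratedContent
			),
		} );
	}
}

// Keep only the combinations the expression rejects.
const excluded = rows.filter( ( row ) => ! row.reasonable );
console.log( excluded );
```

The only rejected combination is an original with content whose regenerated output is empty, which is the case where regenerating would drop everything.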


// Shortcut to avoid costly validation.
// Shortcut: Fallback blocks (freeform/unregistered) are marked as raw transformed.
if ( isFallbackBlock ) {
Contributor

I wonder if this should apply to both unregistered and freeform? Feels like it should apply only to "freeform". Also, I feel like it should apply only when there are no comment delimiters for the block. In other words, if the block comment explicitly states that it should use core/classic or core/html, for instance, it shouldn't auto-transform even if that block is registered as the freeform content handler.

Member Author

I revised this logic a bit. It comes down to the fact we wrap unregistered blocks in core/missing which handles the UI. A block with core/missing is "valid" from the perspective of the validation flow.

mtias added 3 commits November 9, 2025 13:32
- RECONSTRUCTED_BLOCK
- REGENERATED_BLOCK

With the second level handling more trivial attributes
and level three doing entire block reconstruction as
long as the block flag is set
@mtias mtias force-pushed the add/validation-levels branch from de17480 to 3a520de Compare November 11, 2025 03:23
@mtias mtias force-pushed the add/validation-levels branch from 3a520de to 160e352 Compare November 11, 2025 03:25
Contributor

@mcsf mcsf left a comment

Looking at the test cases:

Level 2: Reconstructed
Test cases:

  • Dynamic block rendering server-side content differently
  • Template blocks with variable output
  • Blocks with computed/derived innerHTML from attributes

and

Level 4: Invalid

  • Block with incompatible save output

What do these mean?


From the HTML test example, all blocks should be silently reconstructed, except for the level 5 case, which should error out in the UI.

The goal is to 1) silently reconstruct without saving, then 2) saving the reconstruction if the user modified anything. Correct?


Level 3: RawTransformedBlock
Definition: Raw handling functions applied; block type name preserved
When: Freeform or unregistered type handlers process the content
Action: Raw HTML transformation applied
Test Cases:
Unregistered block type converted to freeform
Classic block with raw HTML content
HTML block with freeform content

Not very clear to me how much this is supposed to change from what already happens. Reading "raw handling functions applied" makes me think that we can be more confident in transforming non-block content into blocks, instead of always carefully isolating them as Freeform, but the rest of the text suggests maintaining the isolation approach.


I think this is a good start, but it would help me if we could discuss how we see this making a difference for users, what kind of new interactions they would see, and what contingencies, if any, we should plan for. For example, and without endorsing any of these ideas:

  • If we want to perform silent transformations more often, do we need to change the granularity of post revisions, in case users want to better understand what the system did?
  • Similarly, can we improve the undo/redo system to have some awareness of these transformations? Or, on the contrary, do we solidify the idea that these are meant to be invisible (under the condition that they be robust, of course)?
  • Scenarios like a <!-- wp:heading surrounding both a H2 and a P tag and discarding the P (suggested somewhere above) give me pause, hence the questions above.

There was also an example of a block that should be regenerated whose tag didn't match the save output. For example, a Heading block with a single P tag. The premise is that, if we can parse the attributes, we can regenerate the whole thing. Well, but how do we parse the content attribute, since it's meant to be sourced using a selector that won't match?
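
The sourcing problem can be sketched as follows, assuming the Heading block sources `content` through a selector along the lines of `h1,h2,h3,h4,h5,h6`. The `sourceInnerHTML` helper is a hypothetical regex stand-in for querying a parsed DOM, and it only handles flat markup.

```javascript
// Hypothetical stand-in for selector-based sourcing: returns the inner HTML
// of the first element whose tag is in `tagNames`, or undefined if none match.
function sourceInnerHTML( html, tagNames ) {
	const pattern = new RegExp(
		`<(${ tagNames.join( '|' ) })\\b[^>]*>([\\s\\S]*?)</\\1>`,
		'i'
	);
	const match = html.match( pattern );
	return match ? match[ 2 ] : undefined;
}

// Assumed selector for the Heading block's content attribute.
const headingTags = [ 'h1', 'h2', 'h3', 'h4', 'h5', 'h6' ];

const fromHeading = sourceInnerHTML(
	'<h2 class="wp-block-heading">Some content</h2>',
	headingTags
);
const fromParagraph = sourceInnerHTML(
	'<p class="wp-block-heading">Some content</p>',
	headingTags
);

console.log( fromHeading ); // Some content
console.log( fromParagraph ); // undefined
```

With nothing sourced, regenerating from attributes can only produce an empty heading, so the text inside the P tag has no path into the regenerated block.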

const isDirectAttributeIssue =
message.includes( 'Expected attribute' ) ||
message.includes( 'Expected attributes' ) ||
message.includes( 'Unexpected attribute' );
Contributor

This won't match. The logger is currently passed:

Encountered unexpected attribute `%s`.

This is irrelevant for evaluating the overall PR, but it will easily be missed later if I don't point it out.
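
A quick check of the mismatch, using the message format quoted above. `String.prototype.includes` is case-sensitive, so none of the three substrings is found. The case-insensitive variant at the end is only one possible fix and is an assumption, not code from the PR.

```javascript
// Message format quoted in the review comment.
const loggedMessage = 'Encountered unexpected attribute `%s`.';

// The check as written in the PR. All three substrings start with a capital
// letter that the logged message does not contain at that position.
const matchesAsWritten =
	loggedMessage.includes( 'Expected attribute' ) ||
	loggedMessage.includes( 'Expected attributes' ) ||
	loggedMessage.includes( 'Unexpected attribute' );

// One possible fix (assumption): compare case-insensitively.
const matchesCaseInsensitive = loggedMessage
	.toLowerCase()
	.includes( 'unexpected attribute' );

console.log( { matchesAsWritten, matchesCaseInsensitive } );
```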

*
* @return {boolean} Whether issues are only attribute-related.
*/
export function areOnlyAttributeDifferences( validationIssues ) {
Contributor

Probably fine for a first effort, but adding a piece in the system that interprets the validation issues logged by our own validation function is roundabout, and the whole validation module could probably use an overhaul with these levels in mind.

Contributor

Agreed. Maybe we can update validationIssues to be more structured, to avoid relying on strings.

*
* @return {Object} Validation result with isValid computed property.
*/
export function createValidationResult(
Contributor

Has tests but is unused.

Comment on lines -613 to +667
- * false otherwise. Invalid HTML is not considered equivalent, even if the
- * strings directly match.
+ * false otherwise.
Contributor

I noticed that the first branch in this function contradicted the last sentence of this comment, so I removed it.

@artpi
Contributor

artpi commented Nov 27, 2025

I want to underscore how useful this is for what we are trying to do with AI.
We are currently manipulating the block tree in PHP using AI. AI does a good job of manipulating the JSON tree, but syncing the HTML and attributes proves to be a lot of pain and hassle.

Level 2: ReconstructedBlock

This would solve this problem: we would take the manipulated tree and depend on the UI to reconstitute the syntax.

Deprecations were incorrectly matching when their save() output
qualified for automated reconstruction rather than an exact match.
@mtias mtias changed the title Add/validation levels Introduce block validation levels Dec 5, 2025
@mtias mtias added [Feature] Block API API that allows to express the block paradigm. [Feature] Parsing Related to efforts to improving the parsing of a string of data and converting it into a different f [Type] Enhancement A suggestion for improvement. and removed [Feature] Block API API that allows to express the block paradigm. [Feature] Parsing Related to efforts to improving the parsing of a string of data and converting it into a different f labels Dec 5, 2025
@mtias mtias requested a review from ockham December 5, 2025 16:12
These cases jump to level 4 (invalid block)
@mtias
Member Author

mtias commented Dec 6, 2025

@youknowriad @mcsf thanks for the initial look. I think this is ready for more in depth reviews.

I included some HTML to test in the original description. It should cover multiple levels and give a sense of how much reconstruction helps. (Compare what trunk produces versus this branch.) The number of UI errors is drastically reduced.

I added a condition to be a bit more conservative in reconstruction and take into account the volume of content discrepancies. If it's above a threshold, the block skips to level 4. I think this is a good safety net, but the specific threshold is up for debate. Without it, level 4 is very hard to reach for core blocks (which is not a bad thing, in a sense).

All tests and fixtures should be passing.

I would like to follow this up with introducing a UI signal when a block was reconstructed at level 4. The signal should be unobtrusive but present for user review if necessary. But this PR should introduce no UI changes and just focus on the validation logic.

A final thing is that with this in place perhaps there are a lot of manually written block deprecations that can be removed from the codebase entirely.

@mtias
Member Author

mtias commented Dec 6, 2025

Also @ellatrix after this, I'd like to continue the idea of automatically converting freeform/classic content to blocks (maybe as a flag that can be set in wp_config).

@mtias mtias removed the [Status] In Progress Tracking issues with work in progress label Dec 6, 2025
// block comment (e.g., className, ariaLabel). Unlike Level 2/3 reconstruction
// which would lose these values on save, fixes preserve them in the block's
// attributes for regeneration.
const fixedBlock = applyBuiltInValidationFixes(
Contributor

Should this be considered a level of validation on its own? I see that right now, we're just considering it "valid_block"

Member Author

I'm not sure, these built in validation and normalization routines are applied in a few places, including some raw handling work.

// If the block was migrated via deprecation, update validation level to Level 1
if ( updatedBlock.__wasMigrated ) {
updatedBlock.validationLevel = VALIDATION_LEVEL.MIGRATED_BLOCK;
delete updatedBlock.__wasMigrated; // Clean up internal flag
Contributor

Minor: Makes me wonder if we should update the return value of applyBlockDeprecatedVersions to use the same format used here (add the validation level instead of a flag)

// This ensures deprecations (explicit author instructions) take priority over
// automated block reconstruction.
if ( ! updatedBlock.isValid ) {
const [ canReconstruct, , reconstructMeta ] = validateBlock(
Contributor

In terms of "flow": these new validation levels can be considered a new kind of "built-in" fix. But in your implementation, they are applied in the "validateBlock" function, which changes what this function is about, and the built-in fixes now happen in two places during the parsing flow.

I wonder if we should also move the new logic and the code of applyBlockValidation inline into this function. Schematically, I think the function could become clearer, something like:

if ( isblockValid ) {
    return { level 1 }
}

if ( applyBuiltinFixes() ) {
   return { level 2 }
}

if ( applyDeprecations() ) {
   return { level 3 }
}

if ( applyReconstruction() ) {
   return { level 4 }
}

if ( applyRegeneration() ) {
   return { level 5 }
}

return { level6 (invalid) }

Each step could be tested separately and the flow becomes a lot more clear. WDYT? Would that be too big of a change for now?
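
A minimal runnable sketch of that flow, under stated assumptions: the level numbers follow the schematic above rather than the PR's own numbering, and `classifyBlock`, the `steps` array, and the stand-in step functions are hypothetical. Only the `allowsReconstruction !== false` condition comes from the PR description.

```javascript
// Hypothetical level numbers, following the ordering in the schematic.
const LEVEL = {
	VALID: 1,
	FIXED: 2,
	MIGRATED: 3,
	RECONSTRUCTED: 4,
	REGENERATED: 5,
	INVALID: 6,
};

/**
 * Runs the steps in order and returns the level of the first one that
 * succeeds. A step returns the (possibly updated) block, or null to pass.
 */
function classifyBlock( block, steps ) {
	for ( const { level, apply } of steps ) {
		const result = apply( block );
		if ( result ) {
			return { level, block: result };
		}
	}
	return { level: LEVEL.INVALID, block };
}

// Stand-in steps for illustration only. Real steps would call into the
// validation, deprecation, and reconstruction code.
const steps = [
	{ level: LEVEL.VALID, apply: ( b ) => ( b.saved === b.original ? b : null ) },
	{ level: LEVEL.FIXED, apply: () => null },
	{
		level: LEVEL.MIGRATED,
		apply: ( b ) =>
			b.deprecatedOutputs.includes( b.original )
				? { ...b, saved: b.original }
				: null,
	},
	{ level: LEVEL.RECONSTRUCTED, apply: () => null },
	{
		level: LEVEL.REGENERATED,
		apply: ( b ) =>
			b.allowsReconstruction !== false ? { ...b, original: b.saved } : null,
	},
];

const valid = classifyBlock(
	{ original: '<p>a</p>', saved: '<p>a</p>', deprecatedOutputs: [] },
	steps
);
const migrated = classifyBlock(
	{
		original: '<p class="old">a</p>',
		saved: '<p>a</p>',
		deprecatedOutputs: [ '<p class="old">a</p>' ],
	},
	steps
);
const invalid = classifyBlock(
	{
		original: '<div>a</div>',
		saved: '<p>a</p>',
		deprecatedOutputs: [],
		allowsReconstruction: false,
	},
	steps
);

console.log( valid.level, migrated.level, invalid.level ); // 1 3 6
```

Because every step shares one signature, each can be unit-tested on its own and the order of precedence is visible in a single array.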

const isFallbackBlock =
block.name === getFreeformContentHandlerName() ||
block.name === getUnregisteredTypeHandlerName();
export function validateBlock(
Contributor

Just noting that this is a public API, so we should be careful about the changes not impacting existing usage. It seems fine at first sight, but you know better :P

@youknowriad
Contributor

The main case that I found a bit weird in my tests is the following:

Content loss

<!-- wp:heading -->
<p class="wp-block-heading">Some content</p>
<!-- /wp:heading -->

becomes

<!-- wp:heading -->
<h2 class="wp-block-heading"></h2>
<!-- /wp:heading -->

I also noticed that it's almost impossible to get the paragraph block to be invalid in both this PR and trunk, it absorbs all content within the "p", it's a bit weird but it's a separate issue.

I think it's a bit hard to know the full extent of the impact of this PR, but I'm willing to give it a try, especially given it impacts invalid content only.

Contributor

@youknowriad youknowriad left a comment

We can give this a try.

I left some potential flow/code enhancements but nothing really major.

@youknowriad
Contributor

We may want to document the new flag of the block API in the handbook.

*
* @type {Object}
*/
export const VALIDATION_LEVEL_NAME = {
Contributor

Is this used anywhere?

Member

@dmsnell dmsnell left a comment

May WordPress never corrupt or lose user content! Thanks for working on this.

As I read through this patch and reflect on the original discussion, I hear a mixture of issues that now seems to branch out beyond where it was before:

  • Improve our ability to auto-detect benign changes or corruption and fix them reliably when the content in a post appears as a block but is different than the editor expects.
  • Gracefully auto-update blocks lacking a proper migration/deprecation from older versions of the markup.
  • Smush LLM-generated block content into valid markup when it’s not.

And I think it’s hard to see this proposal as solving all of them at once because I think the needs and characteristics are different. For content which has been manually adjusted or is out of date, or which a plugin might have altered, we expect small and intentional changes within the existing structure. For LLM-generated content we expect random and self-contradictory content which looks like blocks, and importantly, without the same sense of intention behind the content. Or, perhaps more tersely: intentional edits vs. new plausible block content.


For automation purposes, I don’t fully understand the need for the validation levels, and to some extent have always struggled understanding their role. If we attempt to rewrite blocks automatically, as is already done when loading a post, then it seems like we could add just a few more steps and wouldn’t need the classification system.

In fact, I believe that Gutenberg already handles levels 0–2, meaning only level 4 is “new” and since it requires an opt-in, requires no apparent classification scheme to introduce as a new thing. We could quietly add the flag on block registration and handle these through the existing loader.


One of the changes I’ve found most valuable from my own work is the combination of #38794, #38923, and #39523 where we started preserving so-called “invalid” content (so-called, because usually it’s valid but the editor is unaware). Before then, it was common and easy for the editor to erase content in a way similar to how this change can wipe out content from something like a Heading block which uses the DIV or SPAN element as a wrapper instead of H1–H6.

Being able to preserve content that the editor doesn’t understand seems fundamental to providing the reliability people need for trusting the editor with their content. A common flow where this is normal is when copying and pasting a post from one WordPress to another, where one lacks the appropriate plugin or version to run every block. “Invalid blocks” should stay invalid and verbatim so that they work properly when copied back to the original site.


To that end I wonder if we have seen the project evolve more towards needing a new idiom: “fix slop” with a better name. In this case, that could be the addition of a new function such as fuzzyParseBlock( blockNode, fullBlockHTML ) which can be called in flows where structure is a guess (e.g. from LLM contexts), but also which could be called as a response to a user clicking on “Attempt to fix this broken block.”

Given the liberty taken in the resolutions here, and the risk of corrupting or losing content, and the combinatorial explosion of ways to try and detect invalid markup, I think it would be wise to hesitate to add this additional transformation to the loading stage.

If an LLM workflow wants to dump something into the editor it could call fuzzyParseBlocks() separately and then dump the output of that function into the editor, meaning this can all be separate, external, tested in isolation, and exchangeable.

In some of the earlier discussions (though apparently not in #21703) I advocated for the idea of allowing blocks to register their own attribute parser to determine validity in custom ways. That still offers merit that an automated system can’t, and I think it would fit in nice with a fuzzy-parse stage. For example, a Heading block could register a fuzzy parser and do any number of things, such as grabbing the first element, or taking the plaintext content of the entire provided HTML as the heading text, or that but including formatting elements, etc… We originally had in mind benign changes like added styles, classes, and attributes, something that has been incorporated in various ways over time in the core system. Having a fuzzy system as a separate action for cases where resolution isn’t obvious seems like an appropriate match to the qualitatively distinct operation of trying to make sense out of content that is self-contradictory.
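
To make the idea concrete, here is a rough sketch of a per-block fuzzy parser. Everything in it is hypothetical: `registerFuzzyParser` and the registry do not exist in Gutenberg, and the shape of `blockNode` is assumed. Only the `fuzzyParseBlock( blockNode, fullBlockHTML )` signature comes from the proposal above. The heading strategy shown is the "plaintext of the entire provided HTML" option.

```javascript
// Hypothetical registry mapping a block name to its fuzzy parser.
const fuzzyParsers = new Map();

function registerFuzzyParser( blockName, parser ) {
	fuzzyParsers.set( blockName, parser );
}

// Returns recovered attributes, or null when the block registered no parser.
function fuzzyParseBlock( blockNode, fullBlockHTML ) {
	const parser = fuzzyParsers.get( blockNode.blockName );
	return parser ? parser( blockNode, fullBlockHTML ) : null;
}

// Example strategy for a heading: keep the comment attributes and take the
// plaintext of the provided HTML as the heading text (naive tag stripping).
registerFuzzyParser( 'core/heading', ( blockNode, html ) => ( {
	...blockNode.attrs,
	content: html
		.replace( /<!--[\s\S]*?-->/g, '' ) // drop block comment delimiters
		.replace( /<[^>]*>/g, '' ) // drop remaining tags
		.trim(),
} ) );

const recovered = fuzzyParseBlock(
	{ blockName: 'core/heading', attrs: { level: 2 } },
	'<!-- wp:heading -->\n<p class="wp-block-heading">Some content</p>\n<!-- /wp:heading -->'
);
const unknown = fuzzyParseBlock( { blockName: 'my/unknown', attrs: {} }, '' );

console.log( recovered, unknown );
```

Keeping this as an explicit call rather than part of the load path means a block with no registered parser is left untouched, which matches the goal of preserving content the editor does not understand.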


@artpi we have tools to read and write block HTML structure on the server. let’s talk if you are having issues with this.


@mtias I see a few type changes that are only visible inline where they are read. this means that unless someone knows where to look for them they won’t likely be able to find them. it would be awesome if we could add documentation in the places people will look.

  • __wasMigrated and validationLevel could be added to the WPBlock type with an inline description of what it represents, even as an optional property.
  • allowsReconstruction could be added to the docblock on registerBlockType() and also the WPBlockType. we have example block.json files we could extend showing its use.

In summary, if our goal is more along the lines of cleaning up LLM slop then it seems like this could be handled on its own outside of the primary loading and validation framework and cause less risk to inadvertently erasing intentional content. Maybe a lot of the mechanisms here could be moved over into a new function for that purpose.

That would give users more freedom: to click on a block and have the editor attempt lossy reconstruction, to click “Attempt cleanup” or “Paste exactly” on a paste handler if non-valid block content is detected, and to be called manually in AI flows before inserting via createBlock() or whatever.

const issues = [
{
log: jest.fn(),
args: [ 'Expected attributes Array(2), saw Array(1).' ],
Member

are these text-only error messages the only way we have available here to mention the issues? would it not help if each issue were indicated as the level it corresponds to? then the areOnlyAttributeDifferences() function would amount to !issues.some( issue => issue.level > LEVELS.ATTRIBUTE_ONLY )
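
A sketch of what that could look like with structured issues. The `LEVELS` scale, the `code` field, and the issue shape are hypothetical; only the `! issues.some( ... )` check is taken from the comment above.

```javascript
// Hypothetical severity scale and issue shape. Neither exists in the PR.
const LEVELS = { ATTRIBUTE_ONLY: 1, STRUCTURAL: 2 };

// The check from the comment above, applied to structured issues.
function areOnlyAttributeDifferences( issues ) {
	return ! issues.some( ( issue ) => issue.level > LEVELS.ATTRIBUTE_ONLY );
}

const attributeIssues = [
	{ level: LEVELS.ATTRIBUTE_ONLY, code: 'unexpected-attribute', name: 'style' },
];
const mixedIssues = [
	...attributeIssues,
	{ level: LEVELS.STRUCTURAL, code: 'tag-mismatch', expected: 'h2', actual: 'p' },
];

console.log( areOnlyAttributeDifferences( attributeIssues ) ); // true
console.log( areOnlyAttributeDifferences( mixedIssues ) ); // false
```

Note that an empty issue list also returns `true` here, so a caller would still need to establish that the block failed validation before acting on the result.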

if ( message.includes( 'Expected token of type' ) ) {
// Extract the token objects from the logger arguments.
const expectedToken = issue.args?.[ 2 ];
const actualToken = issue.args?.[ 4 ];
Member

a simple example of what we expect on the issue array would greatly clarify what we’re trying to extract.
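
One possible illustration, assuming the validation logger is called with a format string followed by the type and token object for each side. The exact message text and the token fields (`type`, `tagName`, `chars`) are assumptions here, not taken from the PR; only the `args?.[ 2 ]` and `args?.[ 4 ]` lookups mirror the snippet under review.

```javascript
// Assumed shape of one logged issue: a start tag was expected, text was found.
const issue = {
	args: [
		'Expected token of type `%s` (%o), instead saw `%s` (%o).',
		'StartTag',
		{ type: 'StartTag', tagName: 'h2', attributes: [] },
		'Chars',
		{ type: 'Chars', chars: 'Some content' },
	],
};

const message = issue.args[ 0 ];
let expectedTagName;
let actualTagName;

if ( message.includes( 'Expected token of type' ) ) {
	// Same index access as the snippet under review: the token objects sit
	// after their type strings, at positions 2 and 4.
	const expectedToken = issue.args?.[ 2 ];
	const actualToken = issue.args?.[ 4 ];
	expectedTagName = expectedToken?.tagName;
	actualTagName = actualToken?.tagName; // text tokens carry no tag name
}

console.log( expectedTagName, actualTagName );
```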

expectedTagName = match?.[ 1 ];
} else if ( expectedToken?.tagName ) {
expectedTagName = expectedToken.tagName;
}
Member

this whole section is a bit unclear: is there a way to show in a comment some inputs where we would expect it to run?

const originalLength = block.originalContent?.trim().length || 0;
const generatedLength = generatedBlockContent?.trim().length || 0;
const wouldLoseContent =
originalLength > 0 && generatedLength < originalLength * 0.5;
Member

it seems a bit arbitrary to measure 50%, especially when even in the example we have more tag markup than we do plaintext.

I wonder if this could be adjustable and go off of more robust heuristics. it seems like a good verification step in any case, and it’s good to see us thinking about how to detect when we mess up what someone produced.

Member Author

Yeah, I'm curious if there's better ideas for more robust heuristic to have as a default. I think this is better than nothing, but can be tweaked.
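
To see how the raw-length check behaves, here it is applied to the heading markup from the "Content loss" example in this thread. `wouldLoseContent` restates the PR's check as a function. `stripTags` and `wouldLoseText` are a hypothetical text-only variant, offered as one candidate heuristic rather than a recommendation.

```javascript
// Markup from the "Content loss" example in this thread.
const originalContent = '<p class="wp-block-heading">Some content</p>';
const generatedContent = '<h2 class="wp-block-heading"></h2>';

// The check as written in the PR: compares raw string lengths, markup included.
function wouldLoseContent( original, generated ) {
	const originalLength = original?.trim().length || 0;
	const generatedLength = generated?.trim().length || 0;
	return originalLength > 0 && generatedLength < originalLength * 0.5;
}

// Hypothetical alternative: compare visible text only (naive tag stripping).
function stripTags( html ) {
	return html.replace( /<[^>]*>/g, '' ).trim();
}

function wouldLoseText( original, generated, ratio = 0.5 ) {
	const originalText = stripTags( original );
	const generatedText = stripTags( generated );
	return (
		originalText.length > 0 &&
		generatedText.length < originalText.length * ratio
	);
}

const rawCheck = wouldLoseContent( originalContent, generatedContent );
const textCheck = wouldLoseText( originalContent, generatedContent );
console.log( { rawCheck, textCheck } ); // { rawCheck: false, textCheck: true }
```

The raw check passes this case (34 characters of regenerated markup against a threshold of 22) even though all of the text is gone, because the wrapper markup dominates both lengths. The text-only comparison flags it.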

@mtias
Member Author

mtias commented Dec 9, 2025

The goal of this is not at all driven by cleaning up LLM HTML yet, though it happens to also assist with that, since precision and deterministic structure are not an LLM's strengths. The fundamental reason is to reduce or eliminate the confusing and paralyzing "this block is malformed" message shown to an end user, which erodes confidence in the system and is puzzling if you don't know how things work or what to do next. @dmsnell probably the most illustrative test is to copy the HTML shared at the top and paste it in Gutenberg. Many of the examples "break" in trunk, and we've been shifting the responsibility for resolving them to the user.

The other motivation for this implementation is to make explicit a lot of the flow already in place but difficult to act on after the fact because it often left no trail in the system.

In fact, I believe that Gutenberg already handles levels 0–2, meaning only level 4 is “new” and since it requires an opt-in, requires no apparent classification scheme to introduce as a new thing. We could quietly add the flag on block registration and handle these through the existing loader.

Levels 0-1 existed and were implicit. Level 2 existed partially, but fairly ad hoc. Level 3 is new, and it is opt-out rather than opt-in, so all core blocks are opted in by default. I'd like to follow this up with specific UI for handling this, which I'm working on separately in add/block-validation-indicator.

That's the other reason for these explicit levels. As we introduce more ways of manipulating content, including collaboration flows, suggestion flows, etc, being able to keep track of what level of integrity a block is at is going to be useful to develop specific UIs to help users disambiguate when needed.

This could also apply to a better revision system, aware of the state of blocks, where we could easily distinguish changes that were made automatically (migrations, deprecations, reconstruction levels) from user changes, and collapse them, stack them, etc., to make the interface more useful.

> If an LLM workflow wants to dump something into the editor it could call fuzzyParseBlocks() separately and then dump the output of that function into the editor, meaning this can all be separate, external, tested in isolation, and exchangeable.

I initially had this as a separate level (a RAW_TRANSFORM one) but came to the conclusion (as you point out at the beginning) that it's a separate problem: dropping in arbitrary HTML and converting it to the best block approximations. That's something we wanted to do with the classic -> blocks conversion as an "always on" setting. I think it's worth exploring separately. It belongs with the overall "paste arbitrary HTML" problem of mapping content to the most adequate places. That does require more knowledge of blocks and may require block author instructions to be done properly.

@dmsnell
Member

dmsnell commented Dec 9, 2025

@mtias I was tracking this work in the past in #38922

> it often left no trail in the system

to me this is one of the most significant issues, which also led to data corruption when loading blocks which don’t invalidate. specifically, when we upgraded the list block we moved from a selector for inner content to inner blocks, and so the block loaded with zero list items, which was a valid parse, and erased any existing items from the previous version.

the deprecations didn’t catch this either, though a version number would have.
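
as a rough sketch of the version-number idea, and only under assumptions: `version` is a hypothetical field on both the stored block and the block definition (nothing here confirms the block format carries one), and `decideLoad` is an illustrative helper, not an existing API. the point is that the version comparison runs before any validity check, so an old block that happens to parse as valid is still routed to migration.

```typescript
// Hypothetical shapes: `version` is an assumed field used only to
// illustrate the idea; it is not a confirmed part of the block format.
interface StoredBlock {
  name: string;
  version: number;
  innerHTML: string;
}

interface BlockDefinition {
  name: string;
  version: number;
}

type LoadDecision = "load" | "migrate" | "newer-than-editor";

// Compare versions before validating, so a block that parses cleanly under
// the new definition (for example a list with zero inner blocks) is still
// sent through migration instead of silently losing its old content.
function decideLoad(
  stored: StoredBlock,
  definition: BlockDefinition
): LoadDecision {
  if (stored.version === definition.version) {
    return "load";
  }
  return stored.version < definition.version ? "migrate" : "newer-than-editor";
}

// A list saved by the older definition, whose items live in the markup
// rather than in inner blocks.
const oldList: StoredBlock = {
  name: "core/list",
  version: 1,
  innerHTML: "<ul><li>one</li><li>two</li></ul>",
};
const currentList: BlockDefinition = { name: "core/list", version: 2 };

console.log(decideLoad(oldList, currentList));
```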

> add/block-invalidation-indicator

Are you talking about #7604?

At some point @jasmussen had some lovely figures of colored dots to the side of each block with issues. I felt like there was a huge win to be had with those, because we could differentiate warnings and errors, get distraction out of the block for things like grammar mistakes even, or blocks whose attributes require attention.

> That's the other reason for these explicit levels. As we introduce more ways of manipulating content, including collaboration flows, suggestion flows, etc, being able to keep track of what level of integrity a block is at is going to be useful to develop specific UIs to help users disambiguate when needed.

this is the part I would love to better understand, because I don’t quite get what the levels give us.

the diff and resolution dialog is not that useful, especially in the way it hides and ignores inner blocks. I feel like we could improve that, but also I think there’s more to attribute differences than just differences. added and removed attributes seem more useful, whereas additional classes or attributes largely seem benign and should be preserved.

@jasmussen
Contributor

> At some point @jasmussen had some lovely figures of colored dots to the side of each block with issues. I felt like there was a huge win to be had with those, because we could differentiate warnings and errors, get distraction out of the block for things like grammar mistakes even, or blocks whose attributes require attention.

I can dig up those mockups again, and there was potential. But I think there might be a better option still: a full-fledged block editor linting tool. Accessible perhaps from within the document outline (2nd tab under list view), and emphasized with an unread dot on the list view icon if an error is there. And also surfaced as a panel in the pre-publish flow. Linting options would be in 3 levels like we're used to:

  • Notice: You've an empty block. (e.g. an image placeholder)
  • Warning: You've a failing contrast issue.
  • Error: You have an invalid block.

What do you think?

@dmsnell
Member

dmsnell commented Dec 11, 2025

thanks @jasmussen. well I still like the unobtrusive dots in situ, and I think you said with some better wording what I was trying to convey, that these can represent statuses for action or review. I love having a view like the list view and of showing these in the publish flow.

but.

> with an unread dot

you had me until this 😆. I guarantee it will appear endlessly after everything has been seen and read, and it will distract people from writing, and people will try and find ways to hide it, but it will tirelessly return.

for me this crosses from a helpful notice into demanding feigned urgency, probably because without switching tasks and opening the view, the editor doesn’t know what the dot is for — is it there because something is new? is something broken? is the dot misbehaving? and by the time we’ve opened up the view and realized the dot isn’t relevant we’ve forgotten what we were doing.

@mtias
Member Author

mtias commented Dec 12, 2025

@dmsnell no, it's a branch I pushed building upon this work, doesn't have a PR yet. :) I'll share a mockup of how it looks.

Labels

[Feature] Block API: API that allows to express the block paradigm.
[Type] Enhancement: A suggestion for improvement.

8 participants