Block Custom CSS: encode values to prevent wp_kses mangling#76480
Block Custom CSS: encode values to prevent wp_kses mangling#76480
Conversation
…tion This update introduces functions to encode and decode CSS strings as base64 data URIs, ensuring that CSS is safely stored and retrieved without being altered by wp_kses. The changes include a new decoding function in PHP and corresponding encoding/decoding logic in JavaScript, allowing for seamless handling of custom CSS in the block editor.
|
Size Change: +229 B (0%) Total Size: 8.75 MB
ℹ️ View Unchanged
|
|
Flaky tests detected in 8b5997f. 🔍 Workflow run URL: https://github.com/WordPress/gutenberg/actions/runs/23074150668
|
| for ( const byte of bytes ) { | ||
| binary += String.fromCharCode( byte ); | ||
| } | ||
| return CSS_BASE64_PREFIX + btoa( binary ); |
There was a problem hiding this comment.
I think we can have the browser do this for us simply by creating a Blob. I may be wrong, but it could be worth exploring.
function createTextDataURL( rawText: string, mimeType: string ): Promise<string> {
const blob = new Blob( [ ( new TextEncoder() ).encode( rawText ) ], { type: mimeType } );
const file = new FileReader();
file.readAsDataURL( blob );
return file.result;
}
const attributeValue = await createTextDataURL( css, 'text/css' );there’s a reverse transformation from this as well. it’s not like it’s considerably less code than what you have proposed, but it removes some of the internal details about converting text and producing the data URL, so might be simpler to maintain.
lib/block-supports/custom-css.php
Outdated
| // phpcs:ignore WordPress.PHP.DiscouragedPHPFunctions.obfuscation_base64_decode | ||
| $decoded = base64_decode( substr( $value, strlen( $prefix ) ), /* strict */ true ); | ||
| // If the decoded string contains characters from outside the base64 alphabet or a null byte, set the CSS to an empty string. | ||
| return ( false === $decoded || str_contains( $decoded, "\0" ) ) ? '' : $decoded; |
There was a problem hiding this comment.
what’s the purpose of checking for the null byte? I am imagining a couple of ideas but it’s a guess.
by the way, I think the comment might be a little confused, because the decoded string absolutely should include characters from outside the base64 alphabet.
perhaps there is a valid case to be made to check for wp_is_valid_utf8(), but HTML and CSS are both capable of handling a null byte, and removing one would increase the risk of accidentally introducing an exploit.
what you’ve done by rejecting the CSS entirely is the far-preferable alternative to stripping out null bytes, but I would be curious to learn more the reasoning behind the check in the first place.
There was a problem hiding this comment.
would there be a reason to return $value here instead of '' on a failure to decode? what do you think the tradeoffs would be?
There was a problem hiding this comment.
Good catch, and thanks for questioning. The comment is wrong.
This was definitely hacked together on a Friday eve. 😇
The strict = true flag on base64_decode is what handles characters outside the base64 alphabet.
The way I understand things, rejecting characters from outside the base64 alphabet is okay when decoding. Because encoding only produces A-Za-z0-9+/= anyway.
Can you help me understand the case for not rejecting? Happy to update this in any way you think is correct.
what’s the purpose of checking for the null byte?
The null byte check is probably an echo now that I pause to think about it. I have to research it again, but some brain cell kept a memory that it might be treated as the end of the string.
color: red;\0</style><script>alert(1)</script>
Shows how old I am: https://pentesterlab.com/glossary/null-byte-injection
"Most modern languages and frameworks now properly handle or reject null bytes. PHP 5.3.4+ throws an error on null bytes in paths. However, legacy systems may still be vulnerable."
I think I'm fine to relax and touch grass 😄
would there be a reason to return $value here instead of '' on a failure to decode? what do you think the tradeoffs would be?
I like that question.
I suppose neither is super duper for UX, but I went with '' to be conservative.
My reasoning was thus - this is inside a render operation function ultimately, so returning nothing is preferable to returning wrong output. Maybe the function name needs to be updated, e..g, to something like gutenberg_decode_custom_css_attribute_for_display. 🤷🏻
Now that we're talking about it, I guess a storage decoder on failure should return $value (preserve the encoded string to avoid data loss). I can't think of a valid use case right now, maybe import/export tools (??) or if we ever wanted the block editor to always receive plain CSS (and not have to call decodeCSSAttribute).
Anyway, I appreciate the help and discussion again. This was a POC to push @glendaviesnz's case forward, I hope we get there!
lib/block-supports/custom-css.php
Outdated
| // that wp_kses cannot corrupt characters like `&`, `>`, or nested selectors. | ||
| // Plain CSS (saved by users with `unfiltered_html`, or legacy content) is | ||
| // returned unchanged by this function. | ||
| $custom_css = gutenberg_decode_css_attribute( $custom_css ); |
There was a problem hiding this comment.
I like this approach here. while it does leave open some confusion by blurring raw content from data-URI-encoded content, I think the likelihood of finding CSS starting with a data URI prefix is low.
this approach runs the CSS through the existing flow, which is a nice touch, because ultimately, regardless of how we store or source the CSS, we want it running through WordPress’ broader policies.
| if ( ! str_starts_with( $value, $prefix ) ) { | ||
| return $value; | ||
| } | ||
| // phpcs:ignore WordPress.PHP.DiscouragedPHPFunctions.obfuscation_base64_decode |
There was a problem hiding this comment.
I’m saddened on your behalf that you have to pollute the code with noise in order to prevent the linter from rejecting your valid and legitimate work.
on the other hand, maybe we should add rules rejecting every builtin PHP function 🤔, or reject all contributions without a sworn disclaimer…
…decode_custom_css_attribute_for_display`. - Remove misleading comment. - Excise the null check
…ves outside the custom css control workflow, e.g., REST, direct editing in the editor.
…ks for verifying base64-encoded CSS values.
There was a problem hiding this comment.
Pull request overview
This PR introduces base64 encoding for per-block Custom CSS values to prevent wp_kses from corrupting CSS content (notably nested selectors and characters like & / >), while decoding the stored value for display and render-time processing.
Changes:
- Encode the
style.csscustom CSS attribute asdata:text/css;base64,...before KSES runs (JS onChange + PHPpre_kses). - Decode the stored base64 value for editor display and server-side render processing.
- Add PHPUnit coverage for encode/decode behavior and render-path integration.
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
phpunit/block-supports/custom-css-test.php |
Adds unit + integration tests for decode/display and pre-KSES encoding. |
packages/block-editor/src/hooks/custom-css.js |
Encodes custom CSS before saving; decodes for UI display and editor-time processing/validation. |
lib/compat/wordpress-7.0/kses.php |
Adds a pre_kses filter to base64-encode custom CSS attributes across all save paths (recurses into inner blocks). |
lib/block-supports/custom-css.php |
Adds PHP decode shim and decodes prior to server-side validation + CSS processing. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
You can also share your feedback on Copilot code review. Take the survey.
| function gutenberg_decode_custom_css_attribute_for_display( $value ) { | ||
| $prefix = 'data:text/css;base64,'; | ||
| if ( ! str_starts_with( $value, $prefix ) ) { | ||
| return $value; | ||
| } | ||
| // phpcs:ignore WordPress.PHP.DiscouragedPHPFunctions.obfuscation_base64_decode | ||
| $decoded = base64_decode( substr( $value, strlen( $prefix ) ), /* strict */ true ); | ||
| return false !== $decoded ? $decoded : ''; |
| $custom_css = gutenberg_decode_custom_css_attribute_for_display( $custom_css ); | ||
|
|
| function gutenberg_encode_custom_css_for_kses( $content ) { | ||
| if ( ! has_blocks( $content ) ) { | ||
| return $content; | ||
| } |
|
Looks like the approach will be to strip until a general approach to CSS is worked out. See #76650 |
What
When a user without
unfiltered_html(e.g. an Editor role) saves a post, WordPress runswp_kseson the post content.wp_kses
treats block attribute string values as HTML, so it mangles CSS containing characters like&,>, or nested selectors —& p { color: red }becomes& p { color: red }, and the rendered selector becomes.wp-custom-css-xxxamp; p {}`.How
Before saving, the JS
onChangehandler base64-encodes the CSS value as adata:text/css;base64,…URI.wp_ksespasses base64 characters through untouched. On read, the value is decoded back to plain CSS in three places:CustomCSSControl— decodes intodisplayStylebefore passing toAdvancedPanel, so the user always sees readable CSS in the paneluseBlockProps— decodes beforevalidateCSSandprocessCSSNesting, so the editor preview works correctlygutenberg_render_custom_css_support_styles— decodes beforeprocess_blocks_custom_css, so the frontend renders the correct CSSValues that don't start with the prefix (legacy content, or content saved by admins who bypass KSES entirely) are returned unchanged.
Migration path
This encoding is intentionally temporary. When
WP_CSS_Token_Processoror a proper CSS sanitizer API lands in WordPress core:encodeCSSAttributecall inonChange— plain CSS can then be stored directlydecodeCSSAttributeandgutenberg_decode_css_attributeas read-time shims — they decode any previously encoded values, and the next save will write plain CSS, so migration is lazy with no batch script requiredNotes
&already in the DB) is not auto-repaired — the user must re-edit and re-save the affected blockonChange. It only fires when someone edits CSS. It doesn't fire for passive saves where the CSS attribute is just carried along.Testing
unfiltered_html), add a block and open Advanced → Additional CSSbackground: green; & p { color: yellow; padding: 20px; }