Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix URL encoding in Link HTTP response header #1907

Merged

Conversation

ShyamGadde
Copy link
Contributor

Summary

Fixes #1906

Relevant technical choices

  • Removed urldecode() to prevent double encoding.
  • Adjusted regex in encode_url_for_response_header() to preserve existing percent-encoded sequences while properly encoding invalid characters.

Signed-off-by: Shyamsundar Gadde <[email protected]>
@ShyamGadde ShyamGadde added [Type] Bug An existing feature is broken [Plugin] Optimization Detective Issues for the Optimization Detective plugin [Plugin] Image Prioritizer Issues for the Image Prioritizer plugin (dependent on Optimization Detective) labels Mar 5, 2025
Copy link

codecov bot commented Mar 5, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 69.59%. Comparing base (db322a3) to head (20a03ca).
Report is 10 commits behind head on trunk.

Additional details and impacted files
@@            Coverage Diff             @@
##            trunk    #1907      +/-   ##
==========================================
- Coverage   69.59%   69.59%   -0.01%     
==========================================
  Files          86       86              
  Lines        6983     6982       -1     
==========================================
- Hits         4860     4859       -1     
  Misses       2123     2123              
Flag Coverage Δ
multisite 69.59% <100.00%> (-0.01%) ⬇️
single 39.63% <0.00%> (+<0.01%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@@ -489,7 +489,7 @@ public function data_provider_to_test_add_link(): array {
'expected_html' => '
<link data-od-added-tag rel="preload" href="https://example.com/חנות/wp-content/uploads/2025/01/example.jpg?ver=1+2" as="image">
',
'expected_header' => 'Link: <https://example.com/%D7%97%D7%A0%D7%95%D7%AA/wp-content/uploads/2025/01/example.jpg?ver=1%202>; rel="preload"; as="image"',
'expected_header' => 'Link: <https://example.com/%D7%97%D7%A0%D7%95%D7%AA/wp-content/uploads/2025/01/example.jpg?ver=1+2>; rel="preload"; as="image"',
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you add test cases for the example URLs I have in the issue description as well?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done in 20a03ca.

Signed-off-by: Shyamsundar Gadde <[email protected]>
@ShyamGadde ShyamGadde marked this pull request as ready for review March 5, 2025 13:21
@ShyamGadde ShyamGadde requested a review from felixarntz as a code owner March 5, 2025 13:21
@ShyamGadde ShyamGadde requested a review from westonruter March 5, 2025 13:21
Copy link

github-actions bot commented Mar 5, 2025

The following accounts have interacted with this PR and/or linked issues. I will continue to update these lists as activity occurs. You can also manually ask me to refresh this list by adding the props-bot label.

If you're merging code through a pull request on GitHub, copy and paste the following into the bottom of the merge commit message.

Co-authored-by: ShyamGadde <[email protected]>
Co-authored-by: westonruter <[email protected]>

To understand the WordPress project's expectations around crediting contributors, please review the Contributor Attribution page in the Core Handbook.

// Encode characters not allowed in a URL per RFC 3986 (anything that is not among the reserved and unreserved characters).
$encoded_url = (string) preg_replace_callback(
'/[^A-Za-z0-9\-._~:\/?#\[\]@!$&\'()*+,;=]/',
'/[^A-Za-z0-9\-._~:\/?#\[\]@!$&\'()*+,;=%]/',
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One thing that comes to mind. What if someone has a URL like: https://example.com/price.jpg?amount=100%-free. What happens here? Will browsers force that % to get encoded as if they encountered non-ASCII characters?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After looking into how WordPress handles standalone % characters in URLs via esc_url_raw(), I found that:

  1. WordPress allows standalone % characters in URLs since they are part of the allowed character set.
  2. It only strips truly invalid characters, not malformed percent encodings.

Additionally, modern browsers (tested on Edge, Chrome, and Firefox) accept URLs with standalone % characters. For example, given a URL like https://example.com/price.jpg?amount=100%-free:

  • Both browsers and WordPress preserve the standalone %.
  • The image loads correctly without errors.
  • The URL is not rejected as invalid.

I considered encoding standalone % with something like:

preg_replace_callback(
    '/%(?![0-9A-Fa-f]{2})/',
    static function () {
        return '%25';
    },
    $url
);

However, this would create inconsistency between how the URL appears in HTML vs. HTTP headers, potentially causing the same double-request issue we're trying to fix.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can confirm that in trunk right now, a bare % is resulting in the incorrect behavior where the % is getting percent-encoded:

image

Warning

The resource https://upload.wikimedia.org/wikipedia/commons/8/8d/American_bison_k5680-1.jpg?amount=100%25-free was preloaded using link preload but not used within a few seconds from the window's load event. Please make sure it has an appropriate as value and it is preloaded intentionally.

Whereas on this branch it is working correctly with only one image is getting preloaded:

image

And no warning appears in the console.

Copy link
Member

@westonruter westonruter left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you!!

@westonruter westonruter merged commit a5c48c4 into WordPress:trunk Mar 5, 2025
21 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
[Plugin] Image Prioritizer Issues for the Image Prioritizer plugin (dependent on Optimization Detective) [Plugin] Optimization Detective Issues for the Optimization Detective plugin [Type] Bug An existing feature is broken
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Preload links for image URLs containing commas can be erroneously percent-encoded
2 participants