-
Notifications
You must be signed in to change notification settings - Fork 110
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implement Compression Streams API for URL Metrics JSON Compression #1905
Implement Compression Streams API for URL Metrics JSON Compression #1905
Conversation
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## trunk #1905 +/- ##
==========================================
+ Coverage 71.01% 71.18% +0.16%
==========================================
Files 85 85
Lines 6959 7000 +41
==========================================
+ Hits 4942 4983 +41
Misses 2017 2017
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
A couple things following up from #1886: performance/plugins/optimization-detective/detect.js Lines 668 to 670 in 29fff3c
For this let's remove the warn( 'Unable to look up XPath for element' ); Note that For this: performance/plugins/optimization-detective/detect.js Lines 806 to 813 in 29fff3c
Let's just remove the |
Co-authored-by: Weston Ruter <[email protected]>
Co-authored-by: Weston Ruter <[email protected]>
The following accounts have interacted with this PR and/or linked issues. I will continue to update these lists as activity occurs. You can also manually ask me to refresh this list by adding the If you're merging code through a pull request on GitHub, copy and paste the following into the bottom of the merge commit message.
To understand the WordPress project's expectations around crediting contributors, please review the Contributor Attribution page in the Core Handbook. |
Amazing! With 200 images being tracked, the result is now: Note Sending URL Metric (5,167 bytes, 8% of 64 KiB limit) Whereas before it was an error: Caution Unable to send URL Metric because it is 101,527 bytes, 159% of 64 KiB limit I can even go up to 1,000 images and it's still well under the limit: Note Sending URL Metric (20,343 bytes, 32% of 64 KiB limit) |
@@ -1,3 +1,5 @@ | |||
// noinspection JSUnusedGlobalSymbols |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added this because in PhpStorm because otherwise there is a warning:
The reason the IDE doesn't know the export is used is because it is used in PHP and via a dynamic export. See also https://stackoverflow.com/questions/54687009/export-default-unused-in-webstorm
This noinspection
annotation is used in Gutenberg once as well: https://github.com/WordPress/gutenberg/blob/8b88ada2fd675509bf0c39a55c23a75cc67987cb/packages/components/src/mobile/link-settings/test/edit.native.js#L1
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is so good, but there's one more thing that needs to be done: we need to account for whether gzdecode()
is available.
} | ||
|
||
try { | ||
$decompressed_body = gzdecode( $compressed_body ); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I looked at this in PhpStorm and I noticed an error:
So then I added it to require
in c59c3ed, but then I wondered what core does with this function. I found this:
Note that core checks if the function exists. I think what we'll have to do is do the same, and that od_get_detection_script()
we'll need to add a new param which is exported to the client like gzdecodeAvailable
which contains the value of function_exists( 'gzdecode' )
. This can then be used to conditionally compress the URL Metric data before sending it to the REST API.
We should use the function in the same way it is being used in core, perhaps even with the error suppression.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added logic for handling gzdecode()
availability in af1d1dc
function od_decompress_rest_request_body( $result, WP_REST_Server $server, WP_REST_Request $request ) { | ||
if ( | ||
$request->get_route() === '/' . OD_REST_API_NAMESPACE . OD_URL_METRICS_ROUTE && | ||
'application/gzip' === $request->get_header( 'Content-Type' ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
'application/gzip' === $request->get_header( 'Content-Type' ) | |
'application/gzip' === $request->get_header( 'Content-Type' ) && | |
function_exists( 'gzdecode' ) |
const jsonBody = JSON.stringify( urlMetric ); | ||
const compressedJsonBody = await compress( jsonBody ); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Per above, this should be conditional based on gzdecodeAvailable
.
@@ -346,3 +325,65 @@ function od_trigger_page_cache_invalidation( int $cache_purge_post_id ): void { | |||
/** This action is documented in wp-includes/post.php. */ | |||
do_action( 'save_post', $post->ID, $post, /* $update */ true ); | |||
} | |||
|
|||
/** |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just had a thought that perhaps an Accept
response header should be added for this endpoint. This apparently isn't needed because I don't see Accept: application/json
being served anywhere now in REST API responses. Just something to think about.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Minor suggestion: Let's have gzdecodeAvailable
always passed to the client.
Co-authored-by: Weston Ruter <[email protected]>
/* | ||
* The limit for data sent via navigator.sendBeacon() is 64 KiB. This limit is checked in detect.js so that the | ||
* request will not even be attempted if the payload is too large. This server-side restriction is added as a | ||
* safeguard against clients sending possibly malicious payloads much larger than 64 KiB which should never be | ||
* getting sent. | ||
*/ | ||
$max_size = 64 * 1024; | ||
$content_length = strlen( (string) wp_json_encode( $url_metric ) ); | ||
if ( $content_length > $max_size ) { | ||
return new WP_Error( | ||
'rest_content_too_large', | ||
sprintf( | ||
/* translators: 1: the size of the payload, 2: the maximum allowed payload size */ | ||
__( 'JSON payload size is %1$s bytes which is larger than the maximum allowed size of %2$s bytes.', 'optimization-detective' ), | ||
number_format_i18n( $content_length ), | ||
number_format_i18n( $max_size ) | ||
), | ||
array( 'status' => 413 ) | ||
); | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I removed this check from od_handle_rest_request
as we can compress the URL metrics. However, in cases where gzdecode
is unavailable, should we reintroduce this check in od_handle_rest_request
, conditioned upon the availability of gzdecode
?
Additionally, if a malicious user sends data with application/json
content type and gzdecode
is available, there is currently no check enforcing an upper limit on the size of the URL metrics.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh yes, good points!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we add a limit of 1MB for URL metrics? I am not sure about the what should be the limit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder as well, when gzdecode()
is available, what happens if I try (maliciously) submitting a URL Metric which has 1,000,000 instances of the same element? Given that compression will likely be very effective in this case with the duplication, will it result in something extremely large being submitted when decompressed? If so, should we have some fallback size constraint for a decompressed URL Metric?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(I think we were thinking along the same lines here and commented at the same time.)
Yes, let's maybe introduce a new function that returns the maximum byte size for a URL Metric serialized to JSON. The function's return value can be filtered, but defaulting to MB_IN_BYTES
seems good to me. This should perhaps get exported to the client so that the pre-compressed JSON size can be checked to see if it is too big so the client will skip sending it in that case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a thought—should we compress the URL metrics before storing them in the database? Would using gzdecode
cause performance issues, since we'll need to decompress the data on every request to the optimized page? Compressing the metrics could allow us to store larger URL metric data, but I'm not sure if any real-world scenario would generate more than 1 MB of URL metrics.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we need to compress in the DB, no. That would be an over-optimization and could negatively impact performance at runtime with having to decompress. Storage is cheap. What is expensive is network bandwidth, especially with the limitations of the beacon size. So that's what we want to optimize for.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tried submitting a URL Metric with various numbers of copies of the first element completely populating the elements
. I was able to get up to 18,000 identical elements being copied before going over the 64 KiB limit with compression. Before this URL Metric was compressed, it was a total of 10,584,362 bytes (10.6 MB). So 1,800 elements is roughly 1 MB. This seems like a reasonable limit to impose.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tried doing something silly and visit every single tag:
add_action( 'od_register_tag_visitors', function ( OD_Tag_Visitor_Registry $registry ) {
$registry->register( 'all', '__return_true' );
} );
On a post with 1,000 Image blocks, this resulted in 2,198 elements in the URL Metric. The uncompressed size of the URL Metric is 1,068,353 bytes (about 1 MB), with the compressed size being 33,047 bytes (52% of 64 KiB limit). So indeed, I think 1 MB is probably far larger than any URL Metric will naturally be, so it's a safe upper bound for what should be allowed.
…th fallback Co-authored-by: Weston Ruter <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
❤️
Unfortunately, I found a serious bug here which prevents URL Metrics from being submitted: #1928. I think this needs to be reverted for the next release. |
Summary
Fixes #1893
Relevant technical choices
This PR implements the Compression Streams API to efficiently compress URL Metrics JSON before sending it via
navigator.sendBeacon()
. This significantly reduces the payload size, allowing more data to be sent without exceeding the 64 KiB limit.