
Capture userAgent in URL metric when debugging #1490

Closed
westonruter wants to merge 1 commit into trunk from add/od-opt-in-user-agent-capture

Conversation

westonruter
Member

Important

This is a follow-up PR to #1489. Merge that PR before this one so that this PR will merge into trunk.

I discovered on my site when looking at URL metrics that there was a crazy tall viewport:

"viewport": {
    "width": 1024,
    "height": 12140
}

Apparently this is because Googlebot is the crawler for the URL, as I found from https://www.seroundtable.com/googlebot-9000px-high-viewport-24727.html (although I need to confirm this).

In any case, it isn't currently possible to determine the user agent responsible for a given URL metric. So I think it will be very helpful when debugging URL metrics data to have this information included.

Separately, we'll need to compute the aspect ratio of the viewport and abort capturing the URL metric if it is not conceivably a regular device form factor!
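The aspect-ratio guard mentioned above could look something like the following sketch. The function name and the ratio bounds are illustrative assumptions for this discussion, not values from Optimization Detective:

```javascript
// Hypothetical sketch: reject viewports whose aspect ratio falls outside
// the range plausible for real devices. The bounds below are illustrative
// guesses (roughly 9:21 portrait through 21:9 landscape), not plugin values.
function isConceivableViewport( width, height ) {
	const aspectRatio = width / height;
	const MIN_ASPECT_RATIO = 9 / 21;
	const MAX_ASPECT_RATIO = 21 / 9;
	return aspectRatio >= MIN_ASPECT_RATIO && aspectRatio <= MAX_ASPECT_RATIO;
}

// The Googlebot viewport from above would be rejected:
isConceivableViewport( 1024, 12140 ); // → false
isConceivableViewport( 1024, 768 );   // → true
```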

@westonruter westonruter self-assigned this Aug 21, 2024
@westonruter
Member Author

Actually, I don't like having to special-case this optional property. It would be better to make the schema extensible, so that a mini plugin can be installed which logs the user agent in the URL metric. This will pair nicely with #1373, which introduced client-side extensions to amend URL metrics prior to sending.
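Such a mini plugin might be sketched as below. The extension shape (an `extendURLMetric` hook that amends the metric before it is sent) is an assumption for illustration; the actual client-side extension API from #1373 may differ:

```javascript
// Hypothetical sketch of a client-side extension in the spirit of #1373.
// The hook name `extendURLMetric` is assumed for illustration only; it is
// not the confirmed Optimization Detective extension API.
const userAgentExtension = {
	// Amend the URL metric just before it is submitted to the REST endpoint.
	extendURLMetric( urlMetric ) {
		urlMetric.userAgent = navigator.userAgent;
		return urlMetric;
	},
};
```

With an extensible schema, the `userAgent` property would only be present on sites where such a debugging plugin is active, instead of being a special case in core.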

@adamsilverstein
Member

> Apparently this is because Googlebot is the crawler for the URL

Ah, I wonder if crawler bots could be a problem generally as some may not render as well as others. Having the metrics would be helpful to see for sure; could we exclude bots entirely if we find they cause issues with collected metrics?

@westonruter
Member Author

@adamsilverstein I think we could exclude known bots by user agent, like Googlebot. But simply excluding by extreme aspect ratios should solve the problem too, right?
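Excluding known bots by user agent could be as simple as the following sketch. The pattern is an illustrative, far-from-exhaustive assumption, not an actual list from the plugin:

```javascript
// Hypothetical sketch: skip URL-metric collection for known crawlers.
// This bot list is illustrative and intentionally incomplete.
const KNOWN_BOT_PATTERN = /\b(Googlebot|bingbot|DuckDuckBot|YandexBot|Baiduspider)\b/i;

function isKnownBot( userAgent ) {
	return KNOWN_BOT_PATTERN.test( userAgent );
}
```

A maintained user-agent list would be needed in practice, which is part of why the aspect-ratio check is attractive: it catches any headless renderer with an implausible viewport, known or not.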

@adamsilverstein
Member

> @adamsilverstein I think we could exclude known bots by user agent, like Googlebot. But simply excluding by extreme aspect ratios should solve the problem too, right?

Probably for most sites; I'm just thinking about sites with JS front ends, for example, that may not get rendered properly by a crawler, potentially leading to invalid metrics being saved. I think it's fine to leave for now, unless we discover this issue is more widespread than we expect.

I guess the good part about allowing the crawlers is that we are more likely to get and update metrics even for low-traffic sites.

Base automatically changed from add/od-debugging to trunk August 21, 2024 18:17
@westonruter westonruter added [Type] Enhancement A suggestion for improvement of an existing feature [Plugin] Optimization Detective Issues for the Optimization Detective plugin labels Aug 21, 2024
@westonruter westonruter force-pushed the add/od-opt-in-user-agent-capture branch from 97eab3b to 341ec69 Compare August 21, 2024 20:51
@westonruter
Member Author

Closing in favor of another PR to make the schema extensible.
