Skip to content

Conversation

@markkelnar
Copy link
Contributor

What does this implement/fix? Explain your changes.

The graphql url cache key header returned includes http/https as well as path. But this differs from integration with WPE Varnish cache, where this currently runs. Bring this to consistent/expected behavior where the varnish rule uses and invalidates only the hostname.

Does this close any currently open issues?

part of wp-graphql/wp-graphql-smart-cache/issues/246

Any relevant logs, error output, GraphiQL screenshots, etc?

(If it’s long, please paste to https://ghostbin.com/ and insert the link here.)

Any other comments?

Where has this been tested?

Operating System:

WordPress Version:

@markkelnar markkelnar changed the title Use hostname for graphql cache header url for varnish fix: Use hostname for graphql cache header url for varnish Aug 9, 2023
@markkelnar markkelnar requested a review from jasonbahl August 9, 2023 22:28
@coveralls
Copy link

coveralls commented Aug 9, 2023

Coverage Status

coverage: 84.969% (-0.01%) from 84.981% when pulling 5649e04 on markkelnar:fix/cache-key-url into 027323f on wp-graphql:develop.

@markkelnar markkelnar requested a review from justlevine August 10, 2023 13:51
justlevine
justlevine previously approved these changes Aug 10, 2023
Copy link
Collaborator

@justlevine justlevine left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM 🚀

The lockfile version changes included here are explicit in #2891, so i'm not concerned that they slip in here too.

fix per phpcs
@qlty-cloud-legacy
Copy link

Code Climate has analyzed commit 5649e04 and detected 0 issues on this pull request.

View more on Code Climate.

@jasonbahl jasonbahl merged commit ad5a4e9 into wp-graphql:develop Aug 10, 2023
@jasonbahl
Copy link
Collaborator

jasonbahl commented Aug 10, 2023

@markkelnar we might need to revert this.

The full endpoint is returned to allow for multi-tenant caches to not accidentally purge the incorrect thing.

see: varnish/varnish-modules#41 (comment)

For WordPress instances that are set up as sub-directory multisites (site.com/site-1/graphql, site.com/site-2/graphql), caching clients (Varnish, etc) can prefix the cache keys with the graphql endpoint to ensure they're cached uniquely.

If we use the X-GraphQL-Url to prefix our keys, and the value is only the host, not the full endpoint, then we could have collisions.

For example a query from site-1 and a query from site-2 could both return the same node (post:1, for example).

Currently they would be tagged as site.com/site-1/graphql:post:1 and site.com/site-2/graphql:post:1 respectively.

When a post in site 1 changes, it would then purge site.com/site-1/graphql:post:1 but not purge the site-2 post, and vis-versa.

If we go back to just the host, instead of the full endpoint, then we end up tagging both as: site.com:graphql:post:1 and an update to post:1 on either site will purge the cache for both sites.

If we expand to a multisite network of hundreds of sites that each have post:1 the problem gets worse.

I believe, unless I'm missing something about how Varnish + XKey, etc works, I think we probably need to keep the full endpoint here.


Some more context:

@markkelnar
Copy link
Contributor Author

I see site_url() used in graphql_get_endpoint_url(). We should confirm that is indeed returning the value for public/queryable hostname/path, ex. domain.com/pathy/graphql/, that you are expecting and not the path to wp-admin path. We might want to use home_url() instead for the url of where requests are sent.

It might also make sense not to include the protocol (http/https) in the url cache header.

Without knowing those, we might not have a change in behavior for anyone using caching. If you are more comfortable reverting out this change and figuring that out before we try again, we can do that.

@jasonbahl
Copy link
Collaborator

@markkelnar I set up a multisite using localwp and set it to use sub-directories.

My site is cleverly named multisite.local.

I created a 2nd site, cleverly named site-2. This site's GraphQL endpoint can be accessed at multisite.local/site-2/graphql

When I query data from site-2, the expected fully qualified endpoint is output in the headers:

CleanShot 2023-08-10 at 14 33 33

I believe we will cause a regression for subdomain multisites if we make this change.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants