---
title: Query Plan Caching
subtitle: Configure in-memory and distributed caching for query plans
description: Configure query plan caching to improve router performance by storing generated query plans in memory or Redis.
---
import RedisTls from '../../../shared/redis-tls.mdx';
Whenever your router receives an incoming GraphQL operation, it generates a query plan to determine which subgraphs it needs to query to resolve that operation.
By caching previously generated query plans, your router can skip generating them again if a client later sends the exact same operation—improving your router's responsiveness.
The router is a highly scalable and low-latency runtime. Even with all caching disabled, the time to process operations and generate query plans is minimal (nanoseconds to milliseconds) compared to the overall supergraph request, except in edge cases of extremely large operations and supergraphs.
Caching offers stability for those running a large graph: the overhead for a given operation stays consistent, rather than dramatically improving. To validate the performance wins of operation caching, use the router's traces and metrics to take measurements before and after enabling it.
In extreme edge cases, though, the cache can save 2–10x the time needed to create the query plan, which is still a small part of the overall request.
GraphOS Router enables query plan caching by default using an in-memory LRU cache. In your router's YAML config file, you can configure the maximum number of query plan entries in the cache:
```yaml
supergraph:
  query_planning:
    cache:
      in_memory:
        limit: 512 # This is the default value.
```

When loading a new schema, a query plan might change for some queries, so cached query plans cannot be reused.
To prevent increased latency upon query plan cache invalidation, the router precomputes query plans for the most used queries from the cache when a new schema is loaded.
Precomputed plans are cached before the router switches traffic over to the new schema.
You can also send the header `Apollo-Expose-Query-Plan: dry-run` to generate query plans at runtime, which you can use to warm up your cache instances with a custom-defined operation list.
By default, the router warms up the cache with 30% of the queries already in cache, but you can configure it as follows:
```yaml
supergraph:
  query_planning:
    # Pre-plan the 100 most used operations when the supergraph changes
    warmed_up_queries: 100
```

Additionally, the router can use the persisted query list to prewarm the cache. By default, the router prewarms the cache when loading a new schema but not on startup. You can configure the router to change these defaults.
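As an illustration, prewarming from the persisted query list can be tuned with the `persisted_queries` options. The `experimental_prewarm_query_plan_cache` settings shown here are experimental and may change between router versions, so treat this as a sketch:

```yaml
persisted_queries:
  enabled: true
  # Control when the persisted query list prewarms the query plan cache
  experimental_prewarm_query_plan_cache:
    on_startup: true         # also prewarm when the router starts (off by default)
    on_reconfiguration: true # prewarm when a new schema is loaded (the default)
```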
With router v1.61.0 and later (including v2.x), if you have enabled exposing query plans via `--dev` mode or `plugins.experimental.expose_query_plan: true`, you can pass the `Apollo-Expose-Query-Plan` header to return query plans in the GraphQL response extensions. You must set the header to one of the following values:
- `true`: Returns a human-readable string and a JSON blob of the query plan while still executing the query to fetch data.
- `dry-run`: Generates the query plan and aborts without executing the query.
After using `dry-run`, query plans are saved to your configured cache locations. Using real, mirrored, or similar-to-production operations is a great way to warm up the caches before transitioning traffic to new router instances.
To get more information on the planning and warm-up process, use the following metrics (where `<storage>` can be `redis` for the distributed cache or `memory`):

- `apollo.router.cache.hit.time.count{kind="query planner", storage="<storage>"}`
- `apollo.router.cache.miss.time.count{kind="query planner", storage="<storage>"}`
- `apollo.router.query_planning.plan.duration`: time spent planning queries, with the following attributes:
  - `planner`: the query planner implementation used (`rust` or `js`)
  - `outcome`: the outcome of the query planning process (`success`, `timeout`, `cancelled`, or `error`)
- `apollo.router.schema.load.duration`: time spent loading a schema
- `apollo.router.cache.hit.time{kind="query planner", storage="<storage>"}`: time to get a value from the cache
- `apollo.router.cache.miss.time{kind="query planner", storage="<storage>"}`: time spent on a cache miss
- `apollo.router.cache.size{kind="query planner", storage="memory"}`: current size of the cache (in-memory cache only)
- `apollo.router.cache.storage.estimated_size{kind="query planner", storage="memory"}`: estimated storage size of the cache (in-memory query planner cache only)
To define the right size of the in-memory cache, monitor `apollo.router.cache.size` and the cache hit rate. Then examine `apollo.router.schema.load.duration` and `apollo.router.query_planning.plan.duration` to decide how much time to spend warming up queries.
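For example, one way to inspect these metrics is to enable the router's Prometheus exporter. This is a minimal sketch; the listen address and path are illustrative:

```yaml
telemetry:
  exporters:
    metrics:
      prometheus:
        enabled: true
        listen: 127.0.0.1:9090 # example address
        path: /metrics
```

With this enabled, the cache and query-planning metrics above are exposed on the `/metrics` endpoint in Prometheus text format.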
<PlanRequired plans={["Free", "Developer", "Standard", "Enterprise"]}>
Rate limits apply on the Free plan. Performance pricing applies on Developer and Standard plans. Developer and Standard plans require Router v2.6.0 or later.
If you have multiple GraphOS Router instances, those instances can share a Redis-backed cache for their query plans. This means that if any of your router instances caches a particular value, all of your instances can look up that value to significantly improve responsiveness.
To use distributed caching:
- You must have a Redis cluster (or single instance) that your router instances can communicate with.
- You must have a GraphOS Enterprise plan and connect your router to GraphOS.
Whenever a router instance requires a query plan to resolve a client operation:
1. The router instance checks its own in-memory cache for the required value and uses it if found.
2. If not found, the router instance checks the distributed Redis cache for the required value and uses it if found. It also replicates the found value in its own in-memory cache.
3. If not found, the router instance generates the required query plan.
4. The router instance stores the obtained value in both the distributed cache and its in-memory cache.
The distributed caching configuration must contain one or more URLs using different schemes depending on the expected deployment:
- `redis` — TCP connection to a centralized server.
- `rediss` — TLS connection to a centralized server.
- `redis-cluster` — TCP connection to a cluster.
- `rediss-cluster` — TLS connection to a cluster.
- `redis-sentinel` — TCP connection to a centralized server behind a sentinel layer.
- `rediss-sentinel` — TLS connection to a centralized server behind a sentinel layer.
The URLs must have the following format:
```
redis|rediss :// [[username:]password@] host [:port][/database]
```

Example: `redis://localhost:6379`
```
redis|rediss[-cluster] :// [[username:]password@] host [:port][?[node=host1:port1][&node=host2:port2][&node=hostN:portN]]
```

or, if configured with multiple URLs:

```
[
  "redis|rediss[-cluster] :// [[username:]password@] host [:port]",
  "redis|rediss[-cluster] :// [[username:]password@] host1 [:port1]",
  "redis|rediss[-cluster] :// [[username:]password@] host2 [:port2]"
]
```
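Following the cluster URL format above, a configuration might look like this sketch (hostnames and ports are illustrative):

```yaml
supergraph:
  query_planning:
    cache:
      redis:
        urls:
          # Example cluster node hostnames; replace with your own
          - "redis-cluster://redis-node-1:6379?node=redis-node-2:6379&node=redis-node-3:6379"
```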
```
redis|rediss[-sentinel] :// [[username1:]password1@] host [:port][/database][?[node=host1:port1][&node=host2:port2][&node=hostN:portN]
[&sentinelServiceName=myservice][&sentinelUsername=username2][&sentinelPassword=password2]]
```

or, if configured with multiple URLs:

```
[
  "redis|rediss[-sentinel] :// [[username:]password@] host [:port][/database][?[&sentinelServiceName=myservice][&sentinelUsername=username2][&sentinelPassword=password2]]",
  "redis|rediss[-sentinel] :// [[username1:]password1@] host [:port][/database][?[&sentinelServiceName=myservice][&sentinelUsername=username2][&sentinelPassword=password2]]"
]
```
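Similarly, following the sentinel URL format, a sketch with illustrative hostnames and a hypothetical service name `mymaster`:

```yaml
supergraph:
  query_planning:
    cache:
      redis:
        urls:
          # Example sentinel endpoints; replace with your own
          - "redis-sentinel://sentinel-1:26379/0?sentinelServiceName=mymaster"
          - "redis-sentinel://sentinel-2:26379/0?sentinelServiceName=mymaster"
```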
In your router's YAML config file, you should specify your Redis URLs via environment variables and variable expansion. This prevents your Redis URLs from being committed to version control, which is especially dangerous if they include authentication information like a username and/or password.
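For example, using the router's variable expansion with an environment variable of your choosing (here, a hypothetical `QUERY_PLAN_REDIS_URL`):

```yaml
supergraph:
  query_planning:
    cache:
      redis:
        # Expanded from the environment at startup; keeps credentials out of version control
        urls: ["${env.QUERY_PLAN_REDIS_URL}"]
```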
Cached query plans are not evicted on schema refresh, which can quickly lead to distributed cache overflow when combined with cache warm-up and frequent schema publishes.
Test your cache configuration with expected queries and consider decreasing the TTL to prevent cache overflow.
To enable distributed caching of query plans, add the following to your router's YAML config file:
```yaml
supergraph:
  query_planning:
    cache:
      redis:
        urls: ["redis://..."]
```

The value of `urls` is a list of URLs for all Redis instances in your cluster.
All query plan cache entries are prefixed with `plan.` within the distributed cache.
```yaml
supergraph:
  query_planning:
    cache:
      redis:
        urls: ["redis://..."]
        username: admin # Optional, can be part of the urls directly. This field takes precedence over the username in the URL.
        password: admin/123 # Optional, can be part of the urls directly; mainly useful if your password contains characters like '/' that don't work in a URL. This field takes precedence over the password in the URL.
        timeout: 2s # Optional, by default: 500ms
        ttl: 24h # Optional
        namespace: "prefix" # Optional
        #tls:
        required_to_start: false # Optional, defaults to false
        reset_ttl: true # Optional, defaults to true
        pool_size: 4 # Optional, defaults to 1
```

Connecting and sending commands to Redis have a timeout of 500ms by default, which you can override.
The `ttl` option defines the default global expiration for Redis entries. For query plan caching, the default expiration is set to 30 days.
When enabling distributed caching, consider how frequently you publish new schemas and configure the TTL accordingly. When new schemas are published, the router pre-warms the in-memory and distributed caches but doesn't invalidate existing cached query plans in the distributed cache, creating an additive effect on cache utilization.
To prevent cache overflow, consider decreasing the TTL to 24 hours or twice the median publish interval (whichever is less), and monitor cache utilization in your environment, especially during schema publish events.
Also note that when cache warm-up is enabled, each router instance will warm the distributed cache with query plans from its own in-memory cache. In the worst case, a schema publish will increase the number of query plans in the distributed cache by the number of router instances multiplied by the number of warmed-up queries per instance, which may noticeably increase the total cache utilization.
Be sure to test your configuration with expected queries and during schema publish events to understand the impact of distributed caching on cache utilization.

When using the same Redis instance for multiple purposes, the `namespace` option defines a prefix for all the keys defined by the router.
Nest this `tls` configuration under `supergraph.query_planning.cache.redis`.
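As a sketch, assuming a CA certificate file at an illustrative path, a TLS setup for the query plan cache might look like:

```yaml
supergraph:
  query_planning:
    cache:
      redis:
        urls: ["rediss://..."] # TLS scheme
        tls:
          certificate_authorities: "${file./path/to/ca.crt}" # path is illustrative
```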
When active, the `required_to_start` option prevents the router from starting if it cannot connect to Redis. By default, the router still starts without a connection to Redis, in which case it uses only the in-memory cache for query planning.
When the `reset_ttl` option is active, accessing a cache entry in Redis resets its expiration.
The `pool_size` option defines the number of connections to Redis that the router opens. By default, the router opens a single connection. If there is a lot of traffic between the router and Redis, or some latency in those requests, increasing the pool size can reduce that latency.
If the router uses distributed caching for query plans, the warm-up phase also stores the new query plans in Redis. Since all router instances might have the same distribution of queries in their in-memory caches, the list of queries is shuffled before warm-up, so each router instance plans queries in a different order and shares its results through the cache.