Caching tricks with Cloudflare workers

tl;dr: Cloudflare lets you vary cached responses by headers such as Cookie or User-Agent, but only on an enterprise plan (~$5000). You can either pay $5000 or solve the problem with $5 and some JavaScript code.

Intro

Cloudflare offers caching on their global CDN. For sites that rarely update, Cloudflare handles most of the traffic for you without it ever reaching your origin server. Once a user visits a specific page, Cloudflare keeps the page response in the cache and serves it to the next visitor. This reduces the load on your servers while also improving page performance, since the page is served closer to the user via one of Cloudflare's POPs.

While this is great in theory, it's challenging to implement for any site that requires page customization per visitor (read: most sites). For example: we may want to serve different cache responses based on the cookie (unique caching per visitor) or on the User-Agent (unique caching per device type - mobile / tablet / desktop).

Luckily, Cloudflare allows you to vary the cache by HTTP headers:

  • Accept-Encoding - caches each resource by the encoding of the payload.
  • Cookie - allows the cache to be unique per cookie. This is useful if the cache is unique per user or session.
  • User-Agent - caching per User-Agent ensures that the page is cached differently per device. E.g. mobile clients may receive one version of the page while desktop clients receive another.

The catch? Only Accept-Encoding is freely available. The other two headers require you to upgrade to an enterprise plan, which is rumored to cost around $5000. And even on an enterprise plan you still wouldn't be able to vary the cache by any other HTTP headers.

Why do I need cache variation?

Why varying the cache by headers is useful is best explained with a practical example.

At findwork.dev we deliver different versions of a page based on the User-Agent. For mobile versions we omit certain parts of the page which don't fit on small screens and only include them for desktop clients. We do this by checking the User-Agent header and rendering the page differently with django-user-agents.

Here is a fictional example which renders buttons with different sizes depending on whether the user is on a mobile device or a desktop device.

{% if request.user_agent.is_mobile %}
<a class="btn btn-sm">Company info</a>
{% else %}
<a class="btn btn-lg">Company info</a>
{% endif %}

Recently we enabled a Cloudflare page rule to cache everything (including the HTML). Unfortunately we noticed some problems right away: if one user visited a page on a mobile device, Cloudflare would cache the mobile version of the page. When another user then accessed the same page on a desktop, Cloudflare would serve the mobile-optimized version because that was the page in the cache. This obviously resulted in very ugly looking pages.
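To make the failure mode concrete, here is a minimal sketch of a cache that is keyed only by URL. It is a toy model rather than Cloudflare's actual implementation, and the names `renderAtOrigin` and `handle` are made up for illustration.

```javascript
// Toy model of a cache keyed only by URL (an assumption for illustration,
// not how Cloudflare is implemented internally).
const cache = new Map();

// Hypothetical origin: renders a different button size per device type,
// mirroring the template above.
function renderAtOrigin(userAgent) {
  const size = /mobile/i.test(userAgent) ? 'btn-sm' : 'btn-lg';
  return `<a class="btn ${size}">Company info</a>`;
}

// Serve from the cache when the URL is known, otherwise ask the origin
// and remember whatever it returned.
function handle(url, userAgent) {
  if (!cache.has(url)) {
    cache.set(url, renderAtOrigin(userAgent));
  }
  return cache.get(url);
}

// A mobile visitor arrives first and populates the cache...
const mobileHit = handle('https://findwork.dev/', 'Mozilla/5.0 (iPhone) Mobile Safari');
// ...so the desktop visitor gets the mobile markup for the same URL.
const desktopHit = handle('https://findwork.dev/', 'Mozilla/5.0 (Windows NT 10.0) Chrome');

console.log(desktopHit === mobileHit); // true
```

Because the User-Agent never enters the cache key, whichever device type hits a URL first decides what everyone else sees until the entry expires.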

Cloudflare workers to the rescue

We briefly discussed moving our CDN layer to AWS and CloudFront (which allows for arbitrary cache variation headers). However, moving our entire infrastructure to AWS just to work around a caching limitation is impractical.

Cloudflare recently launched Cloudflare workers. Cloudflare workers are JavaScript snippets that run on the Cloudflare infrastructure. The workers can interface with various parts of that infrastructure, including the caching API. This means that we can write arbitrary code to customize how Cloudflare caches and delivers our content.

The Cloudflare docs state:

Unlike the browser Cache API, Cloudflare Workers do not support the ignoreSearch or ignoreVary options on match(). You can accomplish this behavior by removing query strings or HTTP headers at put() time.

So by default the HTTP Vary header is not taken into account in the cache key. A workaround is to add a query parameter which distinguishes the response by device.

  • a request for a mobile version of findwork.dev could be cached under https://findwork.dev?version=mobile
  • a desktop version could be cached under https://findwork.dev?version=desktop.
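The key construction itself can be sketched in a few lines. The helper name `variantUrl` is hypothetical, and the parameter name `version` simply follows the two example URLs above.

```javascript
// Build the URL used as the cache key for a given device type.
// Note: this overwrites any existing `version` query parameter, so pick a
// parameter name your site does not already use.
function variantUrl(originalUrl, device) {
  const url = new URL(originalUrl);
  url.searchParams.set('version', device);
  return url.toString();
}

console.log(variantUrl('https://findwork.dev', 'mobile'));
// https://findwork.dev/?version=mobile
console.log(variantUrl('https://findwork.dev/jobs?page=2', 'desktop'));
// https://findwork.dev/jobs?page=2&version=desktop
```

The extended URL is only ever used as a cache key. The origin should still receive the original URL, so the backend never sees the extra parameter.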

Here's the code snippet we came up with to solve our problem (it names the query parameter ua rather than version, but the idea is the same):

async function run(event) {
  const { request } = event;

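  // The Workers cache only stores GET requests, so pass everything else
  // straight through to the origin.
  if (request.method !== 'GET') {
    return fetch(request);
  }

  const cache = caches.default;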

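  // Read the user agent of the request. The header may be missing, so fall
  // back to an empty string to keep the match below from throwing.
  const ua = request.headers.get('user-agent') || '';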
  let uaValue;

  if (ua.match(/mobile/i)) {
    uaValue = 'mobile';
  } else {
    uaValue = 'desktop';
  }

  // Construct a new request object whose URL distinguishes the cache key by
  // device type.
  const url = new URL(request.url);
  url.searchParams.set('ua', uaValue);
  const newRequest = new Request(url, request);

  let response = await cache.match(newRequest);
  if (!response) {
    // Use the original request object when fetching the response from the
    // server to avoid passing on the query parameters to our backend.
    response = await fetch(request);

    // Store the cached response with our extended query parameters.
    event.waitUntil(cache.put(newRequest, response.clone()));
  }

  return response;
}

// Workers have no `window` global; register the handler on the global scope.
addEventListener('fetch', (event) => {
  event.respondWith(run(event));
});

It's worth noting that this approach isn't limited to varying the cache by HTTP headers. You could get creative and vary the cache by the contents of the body if you wanted.
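As one example of a finer-grained variation, the mobile / desktop check in the worker could be extended to the mobile / tablet / desktop split mentioned in the intro. The sketch below is an assumption on our part rather than something we run in production: `classifyDevice` is a hypothetical helper, and the regular expressions are rough heuristics, not a complete User-Agent parser.

```javascript
// Map a User-Agent string to one of three cache buckets.
// Heuristic only: for example, Android tablets that advertise neither
// "Tablet" nor "Mobile" end up in the desktop bucket.
function classifyDevice(userAgent) {
  const ua = userAgent || ''; // the header can be missing entirely

  // Check tablets first: iPad user agents typically also contain "Mobile".
  if (/ipad|tablet/i.test(ua)) {
    return 'tablet';
  }
  if (/mobile/i.test(ua)) {
    return 'mobile';
  }
  return 'desktop';
}

console.log(classifyDevice('Mozilla/5.0 (iPad; CPU OS 12_2 like Mac OS X) Mobile/15E148')); // tablet
console.log(classifyDevice('Mozilla/5.0 (iPhone; CPU iPhone OS 12_2) Mobile/15E148')); // mobile
console.log(classifyDevice(null)); // desktop
```

The return value would take the place of uaValue in the worker. Keep the number of buckets small: every extra bucket is another copy of each page that has to be fetched from the origin before the cache is warm.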

Don't enable caching in the Cloudflare UI if you're using workers. It may mess with the workers and cause inconsistencies. Either use the workers for caching or use the UI / page rules, not both.