Full page caching: Design and Reality

Full page caching is a common way to handle high traffic. Despite its clear advantages, many companies still choose not to cache. In this article, Maksym Moskvychev explains best practices for configuring full page caching and weighs its pros and cons.

Design: How to set up full page caching on AWS

Three rules to start with full page caching:

  1. Route all website traffic through a CDN (for example, AWS CloudFront).
  2. The application (website) should return headers to cache content for 15 minutes.
  3. Design your processes so that cache invalidation is not needed.

Route all website traffic through a CDN. Varnish dominated the list of caching software a few years ago, but nowadays it makes no sense to set up dedicated Varnish servers. It is much easier (and also more efficient) to use a CDN for full page caching. Modern CDNs have caching nodes around the world, are easy to configure, and most of them have no fixed monthly fee (you pay only for traffic). For example, AWS CloudFront is not the cheapest CDN, but it has 200+ locations worldwide, is easy to set up, and supports configuration via CloudFormation and Terraform.

The application (website) should return headers to cache content for 15 minutes. Fifteen minutes is a good starting point if you are not sure which cache expiration time to choose. How do you achieve it? With HTTP response headers. Some CDNs let you configure the expiration time on the CDN side, but it is much more flexible to have your application return the headers: that gives you precise control over which pages are cached and which are not.

Cache-Control: public, max-age=900

Cache-Control: public, s-maxage=900

With the first response header, the page may be cached both in the user's browser and in the CDN. With the second, the page is cached only in the CDN, because the s-maxage directive applies to shared caches and is ignored by browsers.

The Cache-Control directives are described in full in RFC 7234 (since obsoleted by RFC 9111).
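To illustrate how an application can decide per page which header to send, here is a minimal Python sketch. It is not tied to any particular framework: it is a plain WSGI callable, and the names cache_control_for, UNCACHED_PREFIXES and the example paths are assumptions made up for this illustration.

```python
CACHE_TTL_SECONDS = 900  # 15 minutes, the starting point suggested above

# Hypothetical path prefixes that must never be cached (illustration only).
UNCACHED_PREFIXES = ("/cart", "/account")


def cache_control_for(path: str, cache_in_browser: bool = False) -> str:
    """Return the Cache-Control value for a request path."""
    if path.startswith(UNCACHED_PREFIXES):
        # Personalized pages: keep them out of every cache.
        return "private, no-store"
    # max-age applies to browsers and the CDN; s-maxage only to shared caches.
    directive = "max-age" if cache_in_browser else "s-maxage"
    return f"public, {directive}={CACHE_TTL_SECONDS}"


def app(environ, start_response):
    """Minimal WSGI application that attaches the caching header."""
    path = environ.get("PATH_INFO", "/")
    headers = [
        ("Content-Type", "text/html; charset=utf-8"),
        ("Cache-Control", cache_control_for(path)),
    ]
    start_response("200 OK", headers)
    return [b"<h1>Hello</h1>"]
```

Any WSGI server can serve app. The point of the sketch is only that the caching decision lives in application code, next to the knowledge of which pages are personalized.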

Cache invalidation should not be needed by design. Many CDNs support cache invalidation, but it is rarely quick. The main difficulty is that a CDN has hundreds of servers around the globe, and an invalidation has to reach all of them. The best thing you can do is to build your development processes so that you never need to invalidate the cache. For example, agree within the organization that any change to a cached page may take up to 15 minutes to become visible.

Sometimes, for debugging purposes, you need to see a page that is not served from the cache. For that, configure the CDN to include a special query parameter in the cache key. For example, if you enable the query parameter "debug_param", then https://example.com/page?debug_param=1 is cached separately from https://example.com/page?debug_param=2 and from https://example.com/page . Requesting the page with a value that has not been used before therefore fetches a fresh copy from the application.
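The effect of including a query parameter in the cache key can be modelled in a few lines of Python. This is only a simplified model of the idea, not how any particular CDN computes its keys. The function name cache_key and the set CACHE_KEY_PARAMS are assumptions for illustration.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit

# Query parameters the CDN is assumed to include in the cache key.
CACHE_KEY_PARAMS = {"debug_param"}


def cache_key(url: str) -> str:
    """Build a simplified cache key: host, path and allowed query parameters."""
    parts = urlsplit(url)
    kept = sorted(
        (name, value)
        for name, value in parse_qsl(parts.query, keep_blank_values=True)
        if name in CACHE_KEY_PARAMS
    )
    query = urlencode(kept)
    return f"{parts.netloc}{parts.path}" + (f"?{query}" if query else "")
```

In this model, https://example.com/page?debug_param=1 and https://example.com/page?debug_param=2 produce different keys, while a parameter that is not in the set (say, a tracking parameter) is ignored and maps to the same key as https://example.com/page .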

Reality: What to expect from full page caching, and what does it really bring?

You will need to move your domain or name servers. Many CDNs rely on special "Alias" DNS records, which let visitors connect to the closest location. For you, this means that you have to move your domain or its name servers (NS records).

Multiple cache locations and random invalidations reduce cache efficiency. For example, CloudFront has 200+ edge locations, which work independently: a page cached at one edge is not cached at another. There must be enough users from the same geographical area for full page caching to start paying off.
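A rough back-of-the-envelope model shows why traffic volume per location matters. The sketch below assumes that requests for a page are spread evenly over the edges that receive traffic, that each edge caches independently, and that each such edge has exactly one miss per expiration window. Real CDNs are more complicated (uneven traffic, evictions, intermediate cache layers), so treat the numbers as an illustration only. The function name estimated_hit_ratio is made up for this example.

```python
def estimated_hit_ratio(requests_per_ttl: int, active_edges: int) -> float:
    """Estimate the cache hit ratio for one page during one TTL window.

    Simplified model: every edge that receives traffic has to fetch the
    page from the origin once per window, all other requests are hits.
    """
    if requests_per_ttl <= 0:
        return 0.0
    # With fewer requests than edges, every request lands on a cold edge.
    misses = min(requests_per_ttl, active_edges)
    return (requests_per_ttl - misses) / requests_per_ttl
```

Under these assumptions, 1000 requests in 15 minutes spread over 200 edges give a hit ratio of 0.8, while 100 requests spread over the same 200 edges give 0.0: the cache never gets a chance to help.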

Developers will hate it. Caching makes developers' lives more complex: it is one more thing to keep in mind. Setting up DTAP (development, testing, acceptance, production) environments is also more complex if you want to test caching behavior.

Average page load time will improve. This is the most expected outcome of caching: better website performance.

You will save a lot of server resources. This is another win, often forgotten when talking about caching. Because some pages are delivered from the cache, your servers do not spend CPU and memory on them. So adding caching frees up server resources and can decrease your monthly bill.
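Both effects, faster average page loads and lower server load, follow from the cache hit ratio with simple arithmetic. The Python sketch below makes that explicit. The function names and all numbers in the usage note are hypothetical and are not measurements from any real site.

```python
def average_latency_ms(hit_ratio: float, edge_ms: float, origin_ms: float) -> float:
    """Weighted average of the response time for cache hits and misses."""
    return hit_ratio * edge_ms + (1.0 - hit_ratio) * origin_ms


def origin_requests(total_requests: int, hit_ratio: float) -> float:
    """Number of requests that still reach the application servers."""
    return total_requests * (1.0 - hit_ratio)
```

For example, with an assumed hit ratio of 0.8, 50 ms for a response from the CDN and 500 ms for a response from the application, the average drops from 500 ms to about 140 ms, and out of 1,000,000 requests only about 200,000 still reach the servers.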

Is full page caching the right tool for me?

It mostly depends on the specifics of your website. If you run an online casino, then probably not, because you want your visitors to see unique pages with personalized content. If you run a classic e-commerce site, definitely yes: product pages are the same for everyone, and the personalization part can be handled by CDP (Customer Data Platform) solutions, which update data on the page on the client side.