Loïc Faugeron Technical Blog

HTTP Cache 26/11/2015

TL;DR: Enable HTTP cache by setting one of the following headers in your Responses: Cache-Control, Expires, Last-Modified or ETag.

The HTTP protocol specifies how to cache Responses:

sequence diagram

The following actors are involved:

Its advantages:

Its drawbacks:

Cacheable Response

A Response can be cached if the incoming Request complies with the following rules:

For example, the following Request can produce cacheable Responses:

GET /v1/items?page=1
Accept: application/json

Sending the following Request would produce a different cacheable Response, provided the Responses carry a Vary: Accept header (Vary is set on the Response, not on the Request):

GET /v1/items?page=1
Accept: text/html

To be cacheable, a Response should also have one of the headers described in the next section.

Note: the values of the headers listed in Vary are stored in the Reverse Proxy, as part of the cache key. Sensitive data (e.g. API keys, passwords) shouldn't be used as a cache key.
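To illustrate how the headers listed in Vary take part in the cache key, here is a minimal sketch in Python. The cache_key function and the way the key is assembled are assumptions made for this example; real Reverse Proxies build their keys in their own way.

```python
import hashlib


def cache_key(method, url, request_headers, vary):
    """Build a cache key from the method, the URL and the headers listed in Vary.

    Hypothetical helper: actual Reverse Proxies use their own key layout.
    """
    # Header names are case-insensitive, so normalise them before the lookup.
    headers = {name.lower(): value for name, value in request_headers.items()}
    parts = [method.upper(), url]
    for name in sorted(h.strip().lower() for h in vary.split(",") if h.strip()):
        parts.append(f"{name}={headers.get(name, '')}")
    # Hashing only keeps the key short and uniform.
    return hashlib.sha256("\n".join(parts).encode("utf-8")).hexdigest()


json_key = cache_key("GET", "/v1/items?page=1", {"Accept": "application/json"}, "Accept")
html_key = cache_key("GET", "/v1/items?page=1", {"Accept": "text/html"}, "Accept")
```

Both Requests target the same URL, yet they end up with two distinct keys because the value of Accept differs.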

Strategies

HTTP cache offers two strategies, expiration and validation:

Expiration

If a Response can be safely cached for a fixed period of time (e.g. 10 minutes), use the Cache-Control HTTP header:

HTTP/1.1 200 OK
Cache-Control: max-age=600
Content-Type: application/json

{"id":42,"name":"Arthur Dent"}

Note: the Reverse Proxy will add a header to the Response indicating its age:

HTTP/1.1 200 OK
Age: 23
Cache-Control: max-age=600
Content-Type: application/json

{"id":42,"name":"Arthur Dent"}
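The relation between max-age and Age can be sketched as follows. This is a simplified illustration: the max_age and is_fresh helpers are hypothetical names, and other Cache-Control directives (s-maxage, no-cache, etc) are deliberately ignored.

```python
import re


def max_age(cache_control):
    """Return the max-age value in seconds, or None if the directive is absent."""
    match = re.search(r'(?:^|,)\s*max-age\s*=\s*"?(\d+)', cache_control, re.IGNORECASE)
    return int(match.group(1)) if match else None


def is_fresh(headers):
    """A cached Response is fresh while its Age is lower than max-age."""
    limit = max_age(headers.get("Cache-Control", ""))
    if limit is None:
        # Without max-age this sketch cannot tell, so it treats the copy as stale.
        return False
    return int(headers.get("Age", "0")) < limit
```

With the headers shown above (Age: 23, max-age=600), is_fresh returns True: the copy can still be served for another 577 seconds.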

If a Response can be safely cached until a known date (e.g. the 30th of October 1998, at 2:19pm), use the Expires HTTP header:

HTTP/1.1 200 OK
Expires: Fri, 30 Oct 1998 14:19:41 GMT
Content-Type: application/json

{"id":42,"name":"Arthur Dent"}

Note: the HTTP date format is required.
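As an illustration, a cache could decide whether such a Response has expired along these lines. This is a sketch relying on Python's standard library; is_expired is a hypothetical helper, not part of any proxy.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def is_expired(expires, now=None):
    """Return True once the date given in the Expires header has passed."""
    now = now or datetime.now(timezone.utc)
    try:
        expires_at = parsedate_to_datetime(expires)
    except (TypeError, ValueError):
        # An invalid date (e.g. "0") is treated as already expired.
        return True
    if expires_at.tzinfo is None:
        # Assumption: a date without a usable timezone is read as GMT.
        expires_at = expires_at.replace(tzinfo=timezone.utc)
    return now >= expires_at
```

Passing now explicitly makes the check easy to try against a fixed point in time.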

Validation

The Reverse Proxy can serve a stale cached copy and then asynchronously check with the Application whether it needs to be refreshed, using the Last-Modified (a date) and/or ETag (a hash) HTTP headers:

HTTP/1.1 200 OK
Last-Modified: Tue, 08 Sep 2015 13:35:23 GMT
ETag: "a3e455afd"
Content-Type: application/json

{"id":42,"name":"Arthur Dent"}

Note: the Reverse Proxy will add the If-Modified-Since (a date) and/or If-None-Match (a hash) headers to its requests:

GET /v1/users/42 HTTP/1.1
If-Modified-Since: Tue, 08 Sep 2015 13:35:23 GMT
If-None-Match: "a3e455afd"
Accept: application/json

If the Response shouldn't be updated, the Server should return a 304 Not Modified Response.
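On the Application side, the validation exchange could look like the following sketch. The make_etag and respond functions are hypothetical, the hash choice is arbitrary, and only strong validators are handled (a W/ weak prefix is not).

```python
import hashlib


def make_etag(body):
    """Derive a quoted ETag from a hash of the Response body (arbitrary choice)."""
    return '"' + hashlib.sha1(body).hexdigest()[:9] + '"'


def respond(request_headers, body):
    """Return (status, headers, body), answering 304 when the cached copy is still valid."""
    etag = make_etag(body)
    headers = {"ETag": etag}
    candidates = [tag.strip() for tag in request_headers.get("If-None-Match", "").split(",")]
    if etag in candidates or "*" in candidates:
        # The cached copy is still valid: no body is sent back.
        return 304, headers, b""
    headers["Content-Type"] = "application/json"
    return 200, headers, body
```

A first call without If-None-Match returns a 200 with the ETag; replaying the request with that ETag returns a 304 with an empty body.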

Cheatsheet

Cache-Control values

Keep in mind that the Cache-Control header can be used by both the Reverse Proxy and the Client (e.g. a browser) to cache the Response.

HTTP date format

The date MUST be represented in Greenwich Mean Time (GMT), with the following format (in PHP date notation): D, d M Y H:i:s GMT. For example: Tue, 08 Sep 2015 13:35:23 GMT
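For instance, in Python the standard library can produce this format. This is a small sketch; http_date is a hypothetical wrapper name.

```python
from datetime import datetime, timezone
from email.utils import format_datetime


def http_date(moment):
    """Format an aware datetime as an HTTP date, e.g. for Expires or Last-Modified."""
    # usegmt=True requires a UTC datetime and emits "GMT" instead of "+0000".
    return format_datetime(moment.astimezone(timezone.utc), usegmt=True)


last_modified = http_date(datetime(2015, 9, 8, 13, 35, 23, tzinfo=timezone.utc))
```

Converting to UTC first means a datetime expressed in another timezone is still rendered in GMT.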

Conclusion

The Expires header is usually used to cache Responses for applications which are updated on a regular schedule (e.g. synchronization with a data source).

The Last-Modified header is used when the data has an updated_at field we can rely on.

The ETag header is used when we can safely create a hash of the Response and compare it with the cached copy.

Note: With Last-Modified and ETag, the Reverse Proxy first serves the cached Response and then checks asynchronously with the Application whether it is stale.

Finally the Cache-Control header is usually used in any other situation.

More readings about HTTP Cache can be found here: