After recently implementing an Azure-based solution that uses Azure CDN to mitigate SharePoint Online’s poor image rendition performance (see Chris O’Brien’s post on this issue and Fran R’s post on other image rendition issues), I’ve reached a few conclusions about setting appropriate cache control headers. The aim is a practical balance between performance and how quickly users receive updates to files.
Before continuing, it is important to understand the fundamental building blocks when using a CDN. At any time a file can be present in three types of location: the blob or source file, the CDN endpoint(s), and users’ browser caches. In this Azure CDN setup, the source file is a blob in Azure Blob Storage. Depending on the CDN and its configuration, the file may be cached at dozens of CDN endpoints dispersed around the globe. Without a CDN, the only consideration is the cache timeout for files stored in the user’s browser cache. With a CDN, we must also consider the cache timeout between the CDN endpoint and the source file.
Another important point to call out is that CDNs generally only copy content to an endpoint when it is first requested: on-demand. The first user to request an asset from a given endpoint incurs a delay while the source blob is transferred to that endpoint. The size of this delay depends on the file size and on the distance between the source blob and the CDN endpoint. Increasing the s-maxage header (discussed below) makes this transfer happen less often.
Relevant cache control headers
Definitions
- max-age : Defines the period during which the client will use the cached file without contacting the server. ‘Client’ refers to a user’s browser cache as well as a CDN.
- s-maxage : If provided, overrides max-age for shared caches such as CDNs only; browsers ignore it
- public : Explicitly marks the file as not user specific, so shared caches may store it
- no-transform : Proxy servers may compress or re-encode images to improve performance or reduce bandwidth. This header prevents that from occurring. It is preferable to avoid this header, assuming that you can spare the effort to ensure the files being served are not adversely affected.
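To make the relationship between max-age and s-maxage concrete, here is a minimal Python sketch of how a cache might pick its freshness period from a Cache-Control value. The function names are my own, chosen for illustration, and real caches apply more rules than this; the sketch only models the two definitions above.

```python
def parse_cache_control(header: str) -> dict:
    """Split a Cache-Control value into a {directive: value-or-None} dict."""
    directives = {}
    for part in header.split(","):
        part = part.strip()
        if not part:
            continue
        name, _, value = part.partition("=")
        directives[name.strip().lower()] = value.strip() or None
    return directives


def freshness_lifetime(header: str, shared_cache: bool) -> int:
    """Seconds a cache may reuse the file without contacting the server.

    shared_cache=True models a CDN endpoint, False models a browser.
    """
    directives = parse_cache_control(header)
    if shared_cache and directives.get("s-maxage") is not None:
        # s-maxage overrides max-age, but only for shared caches.
        return int(directives["s-maxage"])
    return int(directives.get("max-age") or 0)


header = "public,max-age=82800,s-maxage=3600"
print(freshness_lifetime(header, shared_cache=False))  # browser: 82800
print(freshness_lifetime(header, shared_cache=True))   # CDN endpoint: 3600
```

When s-maxage is absent, the sketch falls back to max-age for both kinds of cache, which matches the "if provided" wording in the definition.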
A good summary of the many remaining cache control headers that I didn’t feel were relevant to this post can be found here:
A beginners guide to HTTP cache headers
In practice
- For an image that has been previously requested:
  - When s-maxage has not expired and max-age has not expired, the browser serves the file from its cache and reports 200 (OK), the file is not downloaded again [0ms]
  - When s-maxage has not expired but max-age has expired, server responds with 304 (not modified), the file is not downloaded again [<100ms]
  - When s-maxage has expired but max-age has not expired, the browser serves the file from its cache and reports 200 (OK), the file is not downloaded again [0ms]
  - When s-maxage has expired and max-age has expired and the blob has not changed, server responds with 304 (not modified), the file is not downloaded again [<100ms]
  - When s-maxage has expired and max-age has expired and the blob has changed, server responds with 200 (OK), the file is downloaded again [download image]
- A request for an image will return 200 (OK) until max-age has expired and then 304 (not modified) for every subsequent request until the blob is updated. Once updated, this process repeats
- If an existing image is updated, the longest a user can wait to see the updated image is:
  - Without clearing browser cache: max-age + s-maxage
  - With clearing browser cache: s-maxage
- If a user views an image from the CDN for the first time, it is only guaranteed to be the latest version of that image if the blob hasn’t been updated in the last s-maxage
- SharePoint library images are served with a max-age of 24 hours
- As SharePoint library images are not served via a CDN they have an effective s-maxage of 0
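The observations in this list can be condensed into a small Python sketch. It is only a model of the behaviour I saw, not of how Azure CDN is implemented, and the function names are hypothetical.

```python
def cdn_response(s_maxage_expired: bool, max_age_expired: bool, blob_changed: bool) -> str:
    """Outcome for a previously requested image, following the list above."""
    if not max_age_expired:
        # The browser copy is still fresh, so s-maxage does not matter yet.
        return "200 (OK), not downloaded"
    if not s_maxage_expired or not blob_changed:
        # The browser revalidates and is told its copy is still good.
        return "304 (not modified), not downloaded"
    # Both periods have expired and the blob has changed.
    return "200 (OK), downloaded again"


def worst_case_wait(max_age: int, s_maxage: int, browser_cache_cleared: bool) -> int:
    """Longest time in seconds before a user sees an updated image."""
    return s_maxage if browser_cache_cleared else max_age + s_maxage


print(cdn_response(s_maxage_expired=True, max_age_expired=True, blob_changed=True))
print(worst_case_wait(82800, 3600, browser_cache_cleared=False))  # 86400
```

With max-age at 23 hours and s-maxage at 1 hour, the worst case without clearing the browser cache is 86400 seconds, the same 24 hours that SharePoint library images use.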
My recommendations
Keeping all of the above in mind, I feel that the most important factor is to replicate the experience that users expect from images being served from the SharePoint environment. This can be presented as a couple of simple rules:
- max-age + s-maxage = 24 hours = 86400 seconds
- s-maxage is as low as possible whilst satisfying bandwidth and performance targets (especially for locations most distant to the source blob)
For a recent SharePoint/CDN solution, I used the following cache control headers:
- max-age: 23 hours
- s-maxage: 1 hour
- public
- no-transform
Which looks like this:
no-transform,public,max-age=82800,s-maxage=3600
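As a quick check of the arithmetic, the following Python sketch derives max-age from a chosen s-maxage so that the two always add up to 24 hours, then assembles the header value. The helper name and its parameters are my own invention for illustration.

```python
DAY_SECONDS = 24 * 60 * 60  # 86400, the total budget from the rules above


def build_cache_control(s_maxage_hours: float,
                        budget_seconds: int = DAY_SECONDS,
                        no_transform: bool = True) -> str:
    """Build a Cache-Control value where max-age + s-maxage == budget_seconds."""
    s_maxage = int(s_maxage_hours * 3600)
    if not 0 <= s_maxage <= budget_seconds:
        raise ValueError("s-maxage must fit inside the total budget")
    max_age = budget_seconds - s_maxage
    parts = ["no-transform"] if no_transform else []
    parts += ["public", f"max-age={max_age}", f"s-maxage={s_maxage}"]
    return ",".join(parts)


print(build_cache_control(1))
# no-transform,public,max-age=82800,s-maxage=3600
```

Raising s_maxage_hours shifts more of the 24-hour budget to the CDN endpoints, which helps distant locations at the cost of slower updates for users with an empty browser cache.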
Setting the cache headers served by Azure CDN and Azure Blob Storage
In Azure, cache control headers are set on the blob itself; they are not a CDN configuration setting.
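One way to set the header on a blob is with the azure-storage-blob Python package (version 12). This is a sketch of an approach rather than the exact method I used. The function name is hypothetical, and it assumes blob_client is a BlobClient for an existing blob. Because set_http_headers replaces all of the blob's content settings at once, the sketch reads the current settings first and changes only the cache control value.

```python
def set_blob_cache_control(blob_client, cache_control: str) -> None:
    """Set Cache-Control on a blob while keeping its other content settings.

    blob_client is assumed to be an azure.storage.blob.BlobClient.
    """
    # Read the existing settings (content type, encoding, and so on).
    properties = blob_client.get_blob_properties()
    settings = properties.content_settings
    # Change only the cache control value.
    settings.cache_control = cache_control
    # Write the full set back; any setting left out here would be cleared.
    blob_client.set_http_headers(content_settings=settings)
```

A client can be created with BlobClient.from_connection_string, passing your own connection string, container name and blob name, and then handed to the function together with the value no-transform,public,max-age=82800,s-maxage=3600. Endpoints that have already cached the blob may keep serving the old headers until their cached copy expires.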
Paul.