Mitigating Performance Impact from Personalisation

There are only two hard things in Computer Science: cache invalidation and naming things.
Phil Karlton

This is the fourth post in a series I started while consulting at rtCamp back in 2022. The series explores the personalisation of page content in order to optimise the conversion rate of lead generation on a business website.

A multi-brand, multi-site enterprise website may have hundreds of landing pages and possibly thousands of posts and other single-content pages. To keep page response times low as the number of visitors grows, CMS platforms such as WordPress rely on page caching together with CDNs to reduce server load. However, cached content is pre-rendered static HTML, so how can performance and personalisation coexist?

Performance impact of personalisation

As a general rule of thumb, the performance impact of a request can be assessed using the following hierarchy. At the top, and least costly, are requests served from cached content, whether the cache sits on the client side in the browser or on the server side. Cookies play an important role here because they store session state and access credentials in the browser, which avoids hitting the server for that information. At the next level are server-rendered requests, which require access to server resources such as the database and are therefore slower than requests served from cache. Finally, the slowest are requests that require API calls to third-party services on external servers.

Rule of thumb for request performance as a function of resource access.

When content is personalised using preferences stored in a cookie, such as the preferred theme for reading a site (some websites offer a dark mode), the personalisation data is already held on the client and no server resources need to be accessed to retrieve it.

Personalisation based on first-party data stored on the server will, in most cases, require the content to be rendered on the server, which increases the page response time. In addition, if the conditional data has to be fetched from an external third-party service, such as a CDP, using an API call, this degrades performance further and requires additional, expensive server resources in order to scale with increased traffic.

Caching personalised page variants

One way to mitigate the impact on performance is to restrict the personalised variations of a landing page to fewer than a dozen or so and cache all of the variants. This is feasible if the personalisation is based on segments. A user's persona or segment can then be stored and updated in a cookie, and each request served from the cached page variant for that segment.
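The idea can be sketched as a cache key that combines the page path with the visitor's segment. This is only an illustration: the `variant_cache_key` function and the `segment` cookie name are hypothetical and do not belong to any hosting provider's API.

```php
<?php
// Hypothetical helper, not part of any hosting provider's API.
// Builds the key under which a cache could store one variant of a page.
function variant_cache_key(string $path, array $cookies, array $allowed_segments): string
{
    // Read the segment from an assumed "segment" cookie.
    $segment = $cookies['segment'] ?? 'default';

    // Unknown values fall back to the default variant, so a visitor
    // cannot create extra cache entries by editing the cookie.
    if (!in_array($segment, $allowed_segments, true)) {
        $segment = 'default';
    }

    return $path . '|segment=' . $segment;
}

echo variant_cache_key('/landing', ['segment' => 'b2b'], ['b2b', 'b2c']), PHP_EOL;
echo variant_cache_key('/landing', [], ['b2b', 'b2c']), PHP_EOL;
```

Restricting the key to a known list of segments is what keeps the number of cached variants bounded.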

How do cached page variants work?

A caching server, sometimes also known as a caching engine, keeps a set of data resources in RAM in order to speed up access to that data and lower the cost of serving it at scale. Supporting the equivalent load with traditional server resources such as databases, hard disks and CPU would require considerably more resources at a higher cost, and still would not match the low latency of RAM access.

These cached resources are generally transient, meaning that they may be deleted and replaced by more up-to-date versions of the data. The control of cached data is determined by Cache-Control header directives.
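As a small illustration, the sketch below builds a Cache-Control value using the standard `max-age` and `s-maxage` directives. The `cache_control_header` helper and the lifetimes chosen are assumptions made for this example, not settings recommended by any particular host.

```php
<?php
// Hypothetical helper: formats a Cache-Control header value.
// max-age applies to the browser cache, s-maxage to shared caches such as a CDN.
function cache_control_header(int $browser_ttl, int $shared_ttl): string
{
    return sprintf('public, max-age=%d, s-maxage=%d', $browser_ttl, $shared_ttl);
}

// Browsers may reuse the page for 60 seconds, shared caches for 300 seconds.
echo 'Cache-Control: ' . cache_control_header(60, 300), PHP_EOL;
```

In a page template the same value would be passed to PHP's `header()` function before any output is sent.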

The HTTP Caching specification introduced the Vary header as a mechanism for content negotiation when making page requests to a server. The Vary header takes as its value one or more request header names, for example,

Vary: Accept-Encoding

and the Accept-Encoding request header sent by a browser that can handle compressed responses might be,

Accept-Encoding: compress, gzip

and the server may respond with gzip-compressed content instead of plain, uncompressed HTML. This is an example of content negotiation, the use case for which the Vary header was originally specified.
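To make the mechanism concrete, here is a simplified model of how a cache might derive a storage key from the Vary header. The `vary_cache_key` function and the `X-Segment` header are hypothetical, and real caches normalise header values far more carefully.

```php
<?php
// Simplified model only: combines the URL with the value of every
// request header named in the response's Vary header.
function vary_cache_key(string $url, string $vary, array $request_headers): string
{
    // Header names are case-insensitive, so compare them in lower case.
    $headers = array_change_key_case($request_headers, CASE_LOWER);

    $parts = [$url];
    foreach (explode(',', $vary) as $name) {
        $name = strtolower(trim($name));
        if ($name === '') {
            continue; // skip empty entries such as a trailing comma
        }
        // A missing request header contributes an empty value.
        $parts[] = $name . '=' . ($headers[$name] ?? '');
    }

    return implode('|', $parts);
}

// Two requests for the same URL that differ in a varied header get different keys.
echo vary_cache_key('/pricing', 'Accept-Encoding', ['Accept-Encoding' => 'gzip']), PHP_EOL;
echo vary_cache_key('/pricing', 'X-Segment', ['X-Segment' => 'b2b']), PHP_EOL;
echo vary_cache_key('/pricing', 'X-Segment', ['X-Segment' => 'b2c']), PHP_EOL;
```

Because the key includes the varied header's value, each distinct value produces its own cached copy of the page.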

The Vary header is now also being used to distinguish requests for content variations such as personalised pages. Let's look at how various hosting providers use it to handle multiple variations of a personalised page.

WordPress VIP cache

The WordPress VIP hosting network provides an interface API for developers to flag page variations which are then stored on the VIP CDN caching network with a Vary header. The interface provides a Vary_Cache class,

use Automattic\VIP\Cache\Vary_Cache;

which provides functionality to register different segments of users. For example, suppose a business has two types of customers, dealers (B2B) and retail (B2C), and wants to personalise a landing page depending on the segment to which a user belongs. The first step is to register a customer group,

Vary_Cache::register_group( 'customer' );

The API handles these variations with a custom header and a cookie on the client browser, which will store the value 'b2b' for dealer clients and 'b2c' for retail clients. In addition, the instruction to the caching engine is issued automatically with a Vary header.

Once a visitor is identified as a client, the API can be used to set that user’s appropriate segment,

//user identified as a dealer.
Vary_Cache::set_group_for_user( 'customer', 'b2b' );

Finally, when a previously identified client requests a page, the API exposes the client's segment so that the page can conditionally render the content variation appropriate for that segment,

if ( Vary_Cache::is_user_in_group_segment( 'customer', 'b2b' ) ) {
  // display content for dealer clients
} elseif ( Vary_Cache::is_user_in_group_segment( 'customer', 'b2c' ) ) {
  // display content for retail clients
} else {
  // display content for other visitors
}

As the API handles the cache Vary header automatically, the VIP CDN will store the response such that subsequent requests made by users in the same segment are served the correct page version from the cache. The segment personalisation is handled with a cookie stored on the client browser. For sensitive conditional data that should not be shown to other segment groups, however, the API has functionality to encrypt the identification string.

WP Engine cache

WP Engine makes use of the Vary header with its own custom X-WPENGINE-SEGMENT header, which is populated from the wpe-us segment cookie on the client browser. Therefore, when a user is identified as belonging to a specific segment, the cookie is set,

setcookie('wpe-us', 'customer-b2b'); //customer is a dealer

The cookie value is passed to the server in each request as the X-WPENGINE-SEGMENT header, so personalised pages can read it to select the conditional content,

// Flag this page to the caching engine as having multiple versions.
header('Vary: X-WPENGINE-SEGMENT');
// The header is absent for visitors without a segment cookie, so default to ''.
$cookie_value = $_SERVER['HTTP_X_WPENGINE_SEGMENT'] ?? '';
if ($cookie_value === 'customer-b2b') {
    // dealer content
} elseif ($cookie_value === 'customer-b2c') {
    // retail content
} else {
    // default content
}

Setting the Vary: X-WPENGINE-SEGMENT header at the start of the page ensures the caching engine stores this page as a variation keyed on the value of the cookie in the browser, so that subsequent requests are served from the cache. A plugin is available to leverage WP Engine segment caching with the help of shortcodes.

Pantheon cache

Pantheon's hosting service uses a special cookie naming convention to distinguish between different versions of a page. Any page that sets a cookie whose name begins with the prefix STYXKEY is flagged by Pantheon's global CDN caching engine with a Vary header,

Vary: Cookie

Following our previous example, setting the cookie,

setcookie('STYXKEY_customer', 'b2b');

on a personalised page will cause that page to be cached as the variant for dealer customers.
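On later requests the page can read the same cookie to decide which content to render. The sketch below assumes the cookie name and values from the example above. The `pantheon_variant` helper is hypothetical and is not part of Pantheon's tooling.

```php
<?php
// Hypothetical helper: maps the STYXKEY_customer cookie to a content variant.
// It takes the cookie array as a parameter so it can be tested in isolation.
function pantheon_variant(array $cookies): string
{
    $segment = $cookies['STYXKEY_customer'] ?? '';

    switch ($segment) {
        case 'b2b':
            return 'dealer';
        case 'b2c':
            return 'retail';
        default:
            return 'default'; // visitors without a recognised segment
    }
}

echo pantheon_variant(['STYXKEY_customer' => 'b2b']), PHP_EOL;
echo pantheon_variant([]), PHP_EOL;
```

In a page template the function would be called with PHP's `$_COOKIE` superglobal.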

A note of caution

Content caching is an important tool for optimising the performance of a website so that it can scale with higher traffic and more page requests. However, a page cache is still a limited resource. It is therefore important to limit the amount of cache memory used by reducing the number of cached page versions, which means limiting both the number of user segments and the number of pages that are personalised.
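A rough estimate shows how quickly the number of cached entries grows. The sketch assumes that every personalised page is cached once per segment plus once for unsegmented visitors, and it ignores other sources of variation such as multiple CDN edge nodes. The `cached_variant_count` function and the figures used are purely illustrative.

```php
<?php
// Illustrative estimate of the number of entries in a page cache.
function cached_variant_count(int $personalised_pages, int $segments, int $other_pages): int
{
    // Each personalised page has one variant per segment plus a default variant.
    return $personalised_pages * ($segments + 1) + $other_pages;
}

// 20 personalised landing pages and 1,000 other pages.
echo cached_variant_count(20, 2, 1000), PHP_EOL;  // 2 segments: 1060 entries
echo cached_variant_count(20, 11, 1000), PHP_EOL; // 11 segments: 1240 entries
```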

Conclusion

Personalising page content on WordPress is easy, but it requires some precautions to make sure these pages scale. Caching personalised content as different versions of the same page is readily feasible with hosting services such as WordPress VIP, WP Engine and Pantheon, as well as others, but it requires planning to ensure the cache server is not overloaded with too many page versions.