Cache tags are a game changer for your caching strategy in Drupal 8.
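Until Drupal 8, Drupal had a single caching strategy: cache expiration. It cached computed output for a fixed period of time (e.g. 1 hour). There are two downsides to this approach: content can stay stale until the period ends, and cached items are thrown away and recomputed when the period ends even if nothing has changed.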
Drupal 8 introduced another option called cache invalidation. This is where you set the cache lifetime to be permanent and invalidate (purge) the cached item when it's no longer relevant. Drupal 8 does this by storing metadata about the cached item. Then, when an event occurs, such as an update on a node, the metadata can be searched to find all cache items that contain computed data about the updated node, and those items can be invalidated. This solves the two problems of cache expiration: updates show up immediately because affected items are invalidated as soon as the data changes, and unaffected items stay cached for as long as they remain valid.
Pretty neat, huh? But this means that any successful cache invalidation strategy must have accurate cache tagging occurring everywhere... including your custom code!
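To make the idea concrete, here is a minimal plain-PHP sketch of tag-based invalidation. The `TaggedCache` class is a hypothetical stand-in written for this post, not Drupal's cache backend (which tracks invalidations differently); it only models the behaviour described above: items are stored permanently with tags, and invalidating a tag removes every item that carries it.

```php
<?php

// Hypothetical stand-in for a tag-aware cache; not Drupal's implementation.
final class TaggedCache {
  /** @var array<string, array{data: mixed, tags: string[]}> */
  private array $items = [];

  // Store an item permanently, together with the tags describing its sources.
  public function set(string $cid, $data, array $tags = []): void {
    $this->items[$cid] = ['data' => $data, 'tags' => $tags];
  }

  // Return the cached data, or NULL on a cache miss.
  public function get(string $cid) {
    return $this->items[$cid]['data'] ?? NULL;
  }

  // Drop every item that carries at least one of the given tags.
  public function invalidateTags(array $tags): void {
    foreach ($this->items as $cid => $item) {
      if (array_intersect($item['tags'], $tags)) {
        unset($this->items[$cid]);
      }
    }
  }
}

$cache = new TaggedCache();
$cache->set('markdown:123', ['title' => '## Hello'], ['node:123']);
$cache->set('markdown:456', ['title' => '## World'], ['node:456']);

// Simulate "node 123 was updated".
$cache->invalidateTags(['node:123']);

var_dump($cache->get('markdown:123') === NULL); // bool(true)
var_dump($cache->get('markdown:456') !== NULL); // bool(true)
```

Only the item tagged with node:123 is lost; the item for node 456 stays cached, with no expiry time involved.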
First up, let's look at how you might use Drupal's native application caching with cache tags:
    use Drupal\node\Entity\Node;
    use Drupal\Core\Cache\Cache;

    $nid = 123;
    $cid = 'markdown:' . $nid;

    // Look for the item in cache so we don't have to do the work if we don't need to.
    if ($item = \Drupal::cache()->get($cid)) {
      return $item->data;
    }

    // Build up the markdown array we're going to use later.
    $node = Node::load($nid);
    $markdown = [
      // getTitle() returns the title as a string, which is what sprintf() needs.
      'title' => sprintf('## %s', $node->getTitle()),
      //...
    ];

    // Set the cache so we don't need to do this work again until $node changes.
    \Drupal::cache()->set($cid, $markdown, Cache::PERMANENT, $node->getCacheTags());
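See the last line of code, where the cached item is written to the cache provider. The last two arguments of the set() call show that this code obeys a cache invalidation strategy: Cache::PERMANENT means the item never expires on its own, and $node->getCacheTags() attaches the node's cache tags (for example node:123) so the item can be found and invalidated when the node changes.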
Now whenever Entity::save() is called (or extended versions of it), the entity's cache tags will be invalidated, which will include our cached item above (if it's the node that's being saved). Drupal does this by using Drupal\Core\Cache\Cache::invalidateTags() like this:
    use Drupal\Core\Cache\Cache;

    Cache::invalidateTags($node->getCacheTagsToInvalidate());
You can use this function as well if certain events occur that require cache tags to be invalidated that Drupal is not aware of (example at the end of this blog post).
While cache tags help identify what needs to be invalidated, they don't help validate whether a cached item is fit for purpose. To do that, we need to ensure the cache ID varies with every environmental variable the cached item depends on. For example, if your site is using entity translations, you'll want to ensure that the language code is a part of any cache ID you create.
    $cid = 'markdown:' . $node->id() . ':' . $node->get('langcode')->value;
Inside the Render API, the renderer does the cache lookup for you and it may not know all the environmental variables required to produce an accurate cache ID. This is where cache contexts come in. Let's look at using cache contexts inside the Render API:
    /**
     * Implements hook_preprocess_block().
     */
    function mymodule_preprocess_block(&$variables) {
      // Unique cache per search string.
      if ($variables['elements']['#id'] == 'search_hero_type_1') {
        $variables['search_string'] = \Drupal::request()->get('search');
        // Append the context rather than overwrite any that are already set.
        $variables['#cache']['contexts'][] = 'url.query_args:search';
      }
    }
In this example, the 'search_string' variable changes subject to the value of the 'search' query parameter in the URL. Because of this, the block cache needs to vary based on the value of that query string. By adding the cache context 'url.query_args:search', Drupal ensures the cached item ID will contain the value of the 'search' query parameter. This ensures that we don't accidentally pull from cache an item with a different 'search_string' value.
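The effect of a cache context on the cache ID can be sketched in plain PHP. The `build_cid()` helper below is hypothetical and only approximates the idea; the exact ID format Drupal generates internally is not shown in this post and may differ. It appends each context's current value to the base cache keys, so two requests with different 'search' values end up with two different cache IDs.

```php
<?php

// Hypothetical helper, not Drupal's implementation: append each context's
// current value to the base cache keys to get a per-variation cache ID.
function build_cid(array $keys, array $context_values): string {
  $parts = $keys;
  foreach ($context_values as $context => $value) {
    $parts[] = '[' . $context . ']=' . $value;
  }
  return implode(':', $parts);
}

// Assumed base keys for the block; the names are for illustration only.
$keys = ['block', 'search_hero_type_1'];

// One cache ID per distinct value of the 'search' query parameter.
echo build_cid($keys, ['url.query_args:search' => 'drupal']), "\n";
echo build_cid($keys, ['url.query_args:search' => 'cache']), "\n";
```

Because the context value is part of the ID, a lookup for ?search=cache can never return the item that was built for ?search=drupal.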
The more entities and contexts you have inside your cached items, the higher the probability a cache lookup will either miss (no such variation exists) or be invalidated (affected by many entity change events).
The Render API is a deep tree of keyed arrays that maps out a rendering structure. Caching only the array at the top would generally result in frequent invalidations, so sublevels of the array are also cached. That way they are not rebuilt when other areas of the render array are invalidated.
This is where cache keys come into play. They provide keys that are eventually combined with contexts to generate a cache ID used to store that level of the render array in cache.
    use Drupal\Core\Cache\Cache;

    $build['#cache'] = [
      'keys' => ['entity_view', 'node', $node->id()],
      'contexts' => ['languages'],
      'tags' => $node->getCacheTags(),
      'max-age' => Cache::PERMANENT,
    ];
The above code would result in storing $build in cache with a cid such as 'entity_view:node:123:en'. If the 'keys' key is not provided at this level, then no caching occurs at this level. However, the cache tags bubble up to the top level of the render array and become a part of the cache tags of the larger cache item.
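Bubbling can be pictured as a recursive merge of tags from the leaves of the render array up to the root. The following is a simplified plain-PHP sketch; `collect_tags()` is a hypothetical helper, not the renderer's actual code, and the tag names are made up for illustration.

```php
<?php

// Hypothetical helper: collect the cache tags of an element and all its children.
function collect_tags(array $element): array {
  $tags = $element['#cache']['tags'] ?? [];
  foreach ($element as $key => $child) {
    // Children are the array entries whose key does not start with '#'.
    $key = (string) $key;
    if (is_array($child) && $key !== '' && $key[0] !== '#') {
      $tags = array_merge($tags, collect_tags($child));
    }
  }
  // De-duplicate and sort so the result is stable.
  $tags = array_unique($tags);
  sort($tags);
  return $tags;
}

$build = [
  '#cache' => ['keys' => ['page'], 'tags' => ['node:1']],
  'sidebar' => [
    // No 'keys' here: this level is not cached on its own, but its tags still count.
    '#cache' => ['tags' => ['block:search']],
    'teaser' => ['#cache' => ['tags' => ['node:2', 'node:1']]],
  ],
];

print_r(collect_tags($build));
```

The top-level item ends up tagged with block:search, node:1 and node:2, so a change to any of them invalidates the whole page-level cache item.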
With an invalidation cache strategy, remember that any data you use needs to be registered with cache tags (and contexts or keys if using a render array). This is especially the case when using entity queries:
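    use Drupal\Core\Cache\Cache;
    use Drupal\node\Entity\Node;

    // Load the published product nodes that reference the given brand.
    $nodes = Node::loadMultiple(\Drupal::entityQuery('node')
      ->condition('type', 'product')
      ->condition('field_product_brand', $brand->id())
      ->condition('status', 1)
      ->execute());

    // Build a list of cache tags from the retrieved nodes.
    // The initial value [] keeps the array type hint valid on the first call.
    $tags = array_reduce($nodes, function (array $tags, Node $node) {
      return Cache::mergeTags($tags, $node->getCacheTags());
    }, []);

    \Drupal::cache()->set('my_products:' . $brand->id(), $nodes, Cache::PERMANENT, $tags);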
In the code above, the cache tags from the queried nodes are aggregated together to store inside the cache item. Now take another look, because something is wrong... What if a new product node is added?
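If a new product is added, how do we invalidate the right my_products: cache item? None of the stored node tags belong to the new node, so saving it invalidates nothing. One way around this is to also tag the item with a custom per-brand cache tag, such as 'my_products:' . $brand->id(), and invalidate that tag whenever a product or brand node is saved or deleted: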
    use Drupal\Core\Cache\Cache;
    use Drupal\node\NodeInterface;

    /**
     * Implements hook_node_presave().
     */
    function mymodule_node_presave(NodeInterface $node) {
      mymodule_invalidate_node($node);
    }

    /**
     * Implements hook_node_delete().
     */
    function mymodule_node_delete(NodeInterface $node) {
      mymodule_invalidate_node($node);
    }

    /**
     * Invalidate custom cache associated to brand node.
     */
    function mymodule_invalidate_node(NodeInterface $node) {
      $tags = [];
      if ($node->getType() == 'product' && !$node->get('field_product_brand')->isEmpty()) {
        $tags[] = 'my_products:' . $node->get('field_product_brand')->entity->id();
      }
      elseif ($node->getType() == 'brand') {
        $tags[] = 'my_products:' . $node->id();
      }
      if (!empty($tags)) {
        Cache::invalidateTags($tags);
      }
    }
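The hooks above only help if the cached list actually carries the my_products: tag they invalidate. The plain-PHP sketch below models why. The `$cache` array and `invalidate_tags()` function are hypothetical stand-ins for the cache backend, and brand ID 7 and the node IDs are made-up values.

```php
<?php

// Hypothetical model of a cache bin: cid => ['data' => ..., 'tags' => [...]].
$cache = [];

// Remove all items carrying any of the given tags.
function invalidate_tags(array &$cache, array $tags): void {
  foreach ($cache as $cid => $item) {
    if (array_intersect($item['tags'], $tags)) {
      unset($cache[$cid]);
    }
  }
}

// Product list for brand 7, tagged only with the nodes it contained.
$cache['my_products:7'] = ['data' => [1, 2], 'tags' => ['node:1', 'node:2']];

// A new product (node 3) is saved: its tag matches nothing, so the list stays stale.
invalidate_tags($cache, ['node:3']);
var_dump(isset($cache['my_products:7'])); // bool(true)

// The same list, now also tagged with a custom per-brand tag.
$cache['my_products:7']['tags'][] = 'my_products:7';

// Saving a product for brand 7 invalidates the custom tag, and the list is rebuilt.
invalidate_tags($cache, ['my_products:7']);
var_dump(isset($cache['my_products:7'])); // bool(false)
```

In Drupal terms, this would mean adding the custom tag to the tag list passed to \Drupal::cache()->set() alongside the node tags.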
This is an existing problem for the Views module too.
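Core mitigates this by adding a list cache tag (node_list for nodes) to every view that lists nodes. That tag is invalidated whenever any node is created, updated or deleted.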
Yup, it kind of kills the caching strategy a little. There is an issue on Drupal.org that suggests including the bundle name also (see #2145751) which might improve the performance a little. But otherwise this is a bit of a gotcha!
We can however mitigate this at the page caching level if you use a caching proxy such as Acquia Cloud or Acquia Cloud Edge.
A cache invalidation strategy will help you achieve killer performance and maximize the mileage you can get from your underlying hardware platform. But it doesn't come for free - you need to review it, validate it works (test it), and ensure your custom code uses it correctly.
In my experience, Drupal core handles cache tags very well, and with the use of Views Custom Cache Tags it can be quite easy to produce a strong invalidation strategy. However, if you have a lot of custom code (e.g. controllers, REST plugins), it is quite likely it won't support an invalidation strategy and you'll have to settle for expiration instead.
Wed Apr 03 2019 23:36:15 GMT+0000 (Coordinated Universal Time)