Date Published: June 26, 2025

How do I manage an application that receives lots of requests for faceted searches?

Issue

There is a known issue with sites that use faceted search filters: web crawlers and bots do not always respect the "nofollow" or "noindex" hints on the links those facets generate. Because each facet presents many links to follow, crawlers can become stuck crawling a site (or a section of it) indefinitely. This contributes to, and in some cases causes, performance issues on the faceted search pages, and in some circumstances can even trigger downtime events for an application.

This is more of a problem than it might seem: these requests are hard to cache, or may not be cached at all, so many thousands of requests reach Drupal and place the origin under additional load.

Cause

Every faceted search filter that a user (or crawler) adds to a search generates a new, unique URL variation that matches some subset of items based on the selected categories. Some combinations may return no results, but the facet links still exist and are still clickable.

Because each filter is a separate query parameter (f[0], f[1], f[2], f[3], and so on), the combinations multiply, producing hundreds of thousands of variations of the same page and trapping crawlers in a loop, as the sketch below illustrates.
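
As a rough illustration of the scale involved, the following Python sketch counts the distinct facet-URL variations for a hypothetical search page; the facet groups and value counts are invented for the example:

from math import comb

# Hypothetical facet groups and the number of values in each.
facets = {
    "category": 10,
    "author": 20,
    "year": 10,
    "topic": 10,
}

# Total number of facet values a crawler can click through.
total_values = sum(facets.values())  # 50

# Distinct URLs produced by selecting up to four facet values
# (f[0] .. f[3]). This count ignores parameter ordering in the
# query string, so it is a lower bound on the real URL space.
variations = sum(comb(total_values, k) for k in range(1, 5))
print(f"{variations:,} distinct facet URLs")  # 251,175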

These requests are NOT cacheable: all of them reach the Drupal application, placing load on Apache, MySQL, and Acquia Search (Solr), which can in certain cases saturate the infrastructure and have further knock-on effects.

Resolution

There are three options for resolving the issue:

  1. Enable and configure the nofollow_noindex module so that facet links carry:
rel="nofollow, noindex"

However, note that some crawlers will not respect these hints on links.
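
For illustration, a facet link rendered with these attributes might look like the following; the path and facet value here are hypothetical:

<!-- Well-behaved crawlers should neither follow nor index this link. -->
<a href="/search?f[0]=category:news" rel="nofollow, noindex">News (42)</a>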

  2. Update your robots.txt to include Disallow rules for the path pattern that indicates a faceted search. This prevents these links being crawled by user agents that respect the robots.txt file, for example:
# For all user agents

User-agent: *
Disallow: /*?f[*
Disallow: /*&f[*
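
These two rules cover a facet parameter appearing as either the first or a subsequent query parameter. For example (hypothetical paths):

/search?f[0]=category:news          <- blocked by Disallow: /*?f[*
/search?page=2&f[0]=category:news   <- blocked by Disallow: /*&f[*
/search?page=2                      <- still crawlable
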
  3. Block or challenge the traffic on your WAF for the pattern; the precise logic required will vary by provider. See below for an example using a Cloudflare WAF firewall rule:
(http.request.uri.query contains "f%5B" AND (lower(http.user_agent) contains "spider" OR lower(http.user_agent) contains "bot" OR lower(http.user_agent) contains "crawler" OR lower(http.user_agent) contains "netestate"))

The logic here looks for query strings containing a parameter that indicates a faceted search ("f%5B" is the URL-encoded form of "f[") and then establishes whether the request was made by a crawler, bot, or other automated service, challenging it or blocking it entirely before the application processes it.

You should also review your analytics for queries containing the faceted search parameters, to check whether any other user agents are making large numbers of requests. Add any you find to your WAF rule logic accordingly.
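
As a starting point, a Python sketch like the following can tally the user agents making faceted-search requests from a combined-format access log; the log path and format are assumptions, so adjust them for your environment:

import re
from collections import Counter

# Combined log format: ... "GET /path HTTP/1.1" status size "referer" "user-agent"
LINE = re.compile(r'"[A-Z]+ (?P<path>\S+) HTTP/[^"]+" \d+ \S+ "[^"]*" "(?P<agent>[^"]*)"')

agents = Counter()
with open("access.log") as log:  # hypothetical log location
    for line in log:
        m = LINE.search(line)
        # Count the user agent whenever the request targets a faceted
        # search, in either the raw (f[) or URL-encoded (f%5B) form.
        if m and ("f[" in m.group("path") or "f%5B" in m.group("path")):
            agents[m.group("agent")] += 1

# The noisiest agents are candidates for your WAF rule.
for agent, count in agents.most_common(20):
    print(f"{count:8d}  {agent}")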

A similar custom rule can be configured in the Akamai WAF; consult your WAF management team for more detail on the exact configuration.

Depending on the features available to you in your WAF, you may also be able to use Rate Limiting to slow and reduce the volume of requests to faceted search paths.
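
For example, a Cloudflare rate limiting rule could reuse the same matching expression as the firewall rule above; the threshold and action here are illustrative assumptions, not recommendations:

When incoming requests match:   http.request.uri.query contains "f%5B"
With the same characteristics:  IP address
When rate exceeds:              30 requests per 10 seconds
Then take action:               Managed Challenge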
