Web crawlers (such as Bing, Huawei Cloud, Yandex, and Semrush) sometimes crawl a Drupal View's search results pages and follow the links to each of the view's filtering options. This places extra load on your site. Moreover, this crawling, even when done by legitimate search engines, may not increase your site's visibility to search engine users.
Therefore, we suggest blocking or re-routing this traffic to reduce resource consumption on the Acquia platform, avoid overages on your Acquia entitlements (for Acquia Search, Views & Visits, etc.), and generally help your site perform better.
You may need multiple strategies to resolve this problem. One is to add rules such as the following to your site's robots.txt file:
# Do not crawl or follow links that have a query string
# (e.g. /search?page=123 or /search?size=small&color=red)
User-agent: *
Disallow: /*?
# If your views or search pages use a module to convert facets/filters
# to clean URLs (e.g. /search/page/123 or /search/size/small)
# you can try disallowing the search page's URL
User-agent: *
Disallow: /search*
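To see which paths the two Disallow patterns above would cover, here is a minimal Python sketch. It assumes the wildcard semantics described in RFC 9309 ('*' matches any run of characters, a trailing '$' anchors the end, and everything else is a literal prefix match); individual crawlers may interpret patterns differently. The helper names robots_pattern_to_regex and is_disallowed are hypothetical and exist only for this illustration.

```python
import re


def robots_pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate a robots.txt path pattern into an anchored regex.

    Assumption: RFC 9309 style matching, where '*' matches any run of
    characters, a trailing '$' anchors the end of the URL, and all other
    characters (including '?') are literals.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape each literal piece and join the pieces with '.*' for each '*'.
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile("^" + body + ("$" if anchored else ""))


def is_disallowed(url_path: str, disallow_patterns: list[str]) -> bool:
    """Return True if any Disallow pattern matches the start of url_path."""
    return any(
        robots_pattern_to_regex(p).match(url_path) is not None
        for p in disallow_patterns
    )


if __name__ == "__main__":
    rules = ["/*?", "/search*"]
    for path in ["/search?page=123", "/search/size/small", "/node/1"]:
        print(path, is_disallowed(path, rules))
```

Note that this only models pattern matching. It does not fetch or parse a robots.txt file, and it says nothing about whether a given crawler actually honors the rules.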
While not a perfect solution, blocking at the Apache layer with the following .htaccess rule can help.
If you do decide to try out this code in your .htaccess, please follow the included instructions.
# EXAMPLE ROBOT BLOCKING CODE for Search pages or views.
# From: https://support-acquia.force.com/s/article/4408794498199-Block-excessive-crawling-of-Drupal-Views-or-search-results
# Robot list updated: 2024-01-22
# NOTE: May need editing depending on your use case(s).
#
# INSTRUCTIONS:
# PLACE THIS BLOCK directly after the "RewriteEngine on" line
# in your docroot/.htaccess file.
#
# This will block some known robots/crawlers on URLs when query arguments are present.
# DOES allow basic URLs like /news/feed, /node/1 or /rss, etc.
# BLOCKS only when search arguments are present like
# /news/feed?search=XXX or /rss?page=21.
# Note: You can add more conditions if needed.
# For example, to only block on URLs that begin with '/search', add this
# line before the RewriteRule:
# RewriteCond %{REQUEST_URI} ^/search
#
RewriteCond %{QUERY_STRING} .
RewriteCond %{HTTP_USER_AGENT} "11A465|AddThis.com|AdsBot-Google|Ahrefs|aiohttp|alexa site audit|AlipesNewsBot|Amazonbot|Amazon-Route53-Health-Check-Service|ApacheBench|AppDynamics|Applebot|ArchiveBot|Archive-It|AspiegelBot|Assetnote|axios|azure-logic-apps|Baiduspider|Barkrowler|bingbot|BLEXBot|BLP_bbot|BluechipBacklinks|Buck|Bytespider|CatchBot|CCBot|check_http|ClaudeBot|CloudFlare-Prefetch|cludo.com bot|colly|contentkingapp|Cookiebot|CopperEgg|crawler4j|Csnibot|Curebot|curl|CyotekWebCopy|Daum|Datadog Agent|DataForSeoBot|Detectify|DotBot|Dow Jones Searchbot|DuckDuckBot|facebookexternalhit|Faraday|FeedBurner|FeedFetcher-Google|feedonomics|feroxbuster|Fess|Funnelback|Fuzz Faster U Fool|GAChecker|Ghost Inspector|GPTBot|Grapeshot|gobuster|gocolly|Googlebot|GoogleStackdriverMonitoring|Go-http-client|go-resty|GuzzleHttp|HeadlessChrome|heritrix|hokifyBot|Honolulu-bot|HTTrack|HubSpot Crawler|ICC-Crawler|Imperva|IonCrawl|jooble|KauaiBot|Kinza|LieBaoFast|linabot|Linespider|Linguee|LinkChecker|LinkedInBot|LinkUpBot|LinuxGetUrl|LMY47V|MacOutlook|Magnet.me|Magus Bot|Mail.RU_Bot|MauiBot|Mb2345Browser|MegaIndex|Microsoft Office|Microsoft Outlook|Microsoft Word|MicroMessenger|mindbreeze-crawler|mirrorweb.com|MJ12bot|monitoring-plugins|Monsidobot|MQQBrowser|msnbot|MSOffice|MTRobot|nagios-plugins|nettle|Neevabot|NewsCred|newspaper|node-fetch|Nuclei|NukeScan|okhttp|OnCrawl|Orbbot|PageFreezer|panscient.com|PetalBot|Pingdom.com|Pinterestbot|PiplBot|python-requests|Qwantify|Re-re Studio|Riddler|RocketValidator|rogerbot|RustBot|Safeassign|Scrapy|Screaming Frog|SeobilityBot|Search365bot|SearchBlox|SearchmetricsBot|searchunify|Seekport|SemanticScholarBot|SemrushBot|SEOkicks|seoscanners|serpstatbot|SessionCam|SeznamBot|Site24x7|SiteAuditBot|siteimprove|Siteimprove|SiteLockSpider|SiteSucker|SkypeRoom|Slackbot|Slurp|Sogou web spider|special_archiver|SpiderLing|StatusCake|Swiftbot|Synack|Turnitin|trendictionbot|trendkite-akashic-crawler|UCBrowser|Uptime|UptimeRobot|usasearch|UT-Dorkbot|weborama-fetcher|WhiteHat Security|WidenWebhookClient|Wget|WTWBot|www.loc.gov|Xenu Link Sleuth|Vagabondo|VelenPublicWebCrawler|Yeti|Veracode Security Scan|YandexBot|YandexImages|YisouSpider|Y!J|Zabbix|ZoominfoBot|ZoomSpider" [NC]
RewriteRule ^.* - [F,L]
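Before deploying, it can help to reason about which requests the two RewriteCond lines would match. The following Python sketch is a rough simulation of that logic, not a substitute for testing against Apache itself. BOT_PATTERN is a deliberately abbreviated, illustrative subset of the robot list above, and would_block is a hypothetical helper name.

```python
import re

# Assumption: a short illustrative subset of the robot list in the rule above.
# The real rule uses the full list; matching is case-insensitive, like [NC].
BOT_PATTERN = re.compile(
    r"Ahrefs|bingbot|Bytespider|GPTBot|SemrushBot|YandexBot",
    re.IGNORECASE,
)


def would_block(query_string: str, user_agent: str) -> bool:
    """Approximate the two RewriteCond checks from the .htaccess example.

    Both conditions must hold, mirroring mod_rewrite's default AND behavior:
      1. RewriteCond %{QUERY_STRING} .       -> query string is non-empty
      2. RewriteCond %{HTTP_USER_AGENT} ...  -> user agent contains a listed name
    """
    has_query = re.search(r".", query_string) is not None
    is_listed_bot = BOT_PATTERN.search(user_agent) is not None
    return has_query and is_listed_bot


if __name__ == "__main__":
    bot_ua = "Mozilla/5.0 (compatible; SemrushBot/7~bl)"
    browser_ua = "Mozilla/5.0 (X11; Linux x86_64; rv:120.0) Gecko/20100101 Firefox/120.0"
    print(would_block("page=21", bot_ua))      # listed bot with a query string
    print(would_block("", bot_ua))             # listed bot, no query string
    print(would_block("page=21", browser_ua))  # ordinary browser with a query string
```

Because the user-agent pattern is an unanchored substring match, any client whose User-Agent header contains one of the listed names is affected, so review the list for entries that might match traffic you want to keep.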
Wed Oct 22 2025 08:59:29 GMT+0000 (Coordinated Universal Time)