As a website owner or developer, you must control access and manage unwanted traffic to ensure the security and performance of your website.
Acquia recommends that you implement controls through Acquia Edge WAF/CDN or Varnish (VCL) when possible, as these methods stop traffic before it reaches your application servers to conserve resources. If these methods are unavailable, you can control access at the origin through the .htaccess file.
The following are the common methods to manage unwanted traffic on your website:
Each of these methods blocks a different type of entity from accessing your sites. When you use the recommended code snippets, ensure that the regular expressions and Apache directives are properly formatted and adapted for your specific use case.
Before you can block traffic, you must identify the source. If your site experiences slowdowns or a spike in requests (visible in Stack Metrics), connect to the environment through SSH and use the following commands on the access.log to identify the offenders.
Top 20 IP addresses
To identify the specific hosts that generate the most requests:
awk '{print $1}' /var/log/nginx/access.log* | sort | uniq -c | sort -rn | head -20
Top 20 user agents (bots/browsers)
To identify the bots or scrapers involved:
awk -F'"' '{print $6}' /var/log/nginx/access.log* | sort | uniq -c | sort -rn | head -20Top 20 requested paths (targeted URLs)
To identify the targeted endpoints:
awk -F'"' '{print $2}' /var/log/nginx/access.log* | awk '{print $2}' | sort | uniq -c | sort -rn | head -20Once identified, use these IPs or User Agents in the following implementation methods.
Use this method when you have identified one or a few specific IP addresses that abuse your site, for example, through brute-force attacks.
If you use Acquia Edge, implement IP blocks through the WAF. This is the most performant method as it does not consume origin resources.
Create a WAF rule that matches the abusive IPs. For example, use the following expression in Cloudflare: (ip.src in {198.51.100.23 203.0.113.0/24})
Set Action to Block.
Deploy the rule.
Select the appropriate action to ensure that you stop abuse with no impact on legitimate users.
Block: Use when malicious traffic comes from a specific source. For example, block specific IPs.
Rate-Limit: Use when the unwanted traffic is distributed across many IPs, such as scrapers that hit /search or API endpoints, or when a legitimate service makes multiple requests. Rate-limiting caps the number of requests per IP in a specific time frame. When the threshold is exceeded, the visitor might be challenged through CAPTCHA or temporarily blocked.
This typically requires Acquia Edge powered by Cloudflare.
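If Acquia Edge is not available, you can block the same addresses at the origin through the .htaccess file. The following is a minimal sketch that uses Apache 2.4 mod_authz_core directives; the IP address and CIDR range are placeholders that you must replace with the offenders you identified:
<RequireAll>
  Require all granted
  Require not ip 198.51.100.23
  Require not ip 203.0.113.0/24
</RequireAll>
This approach consumes origin resources for every blocked request, so prefer the WAF when it is available.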
To block specific user agents, add the following code snippet in the .htaccess file:
RewriteCond %{HTTP_USER_AGENT} UserAgent
RewriteRule .* - [F,L]
You can also block more than one user agent at a time with the [OR] ('or next condition') flag, and the [NC] ('no case') flag makes the match case insensitive. Here are examples of user agents with properly escaped regexes:
RewriteCond %{HTTP_USER_AGENT} Baiduspider [NC,OR]
RewriteCond %{HTTP_USER_AGENT} HTTrack [NC,OR]
RewriteCond %{HTTP_USER_AGENT} Yandex [NC,OR]
RewriteCond %{HTTP_USER_AGENT} Scrapy [NC,OR]
RewriteCond %{HTTP_USER_AGENT} Mozilla/5\.0\ \(compatible;\ Yahoo [NC,OR]
RewriteCond %{HTTP_USER_AGENT} AppleNewsBot [NC,OR]
RewriteCond %{HTTP_USER_AGENT} Googlebot [NC,OR]
RewriteCond %{HTTP_USER_AGENT} Mozilla/5\.0\ \(compatible;\ YandexBot [NC]
RewriteRule .* - [F,L]
To avoid website errors, you must properly escape special characters in your regular expressions (regex). HTTP_USER_AGENT accepts a regex as its argument, and in the preceding code, several user agent strings require escaping because their names contain special characters. Instead of constructing the pattern manually, you can use websites such as https://beautifycode.net/regex-escape to quickly build a properly escaped regex.
Run the following command to test that the site responds:
curl -H "host:www.url_you_are_testing.url http://localhost/
Run the following command to test that the user agent is blocked:
curl -H "host:www.url_you_are_testing.url" -H "user-agent:Pcore-HTTP/v0.25.0" http://localhost/
This is an example command to test a user agent named Pcore. You can run a similar command for your intended user agent.
A robot crawler can cause problems for a site by making a large number of requests. To block a specific bot that sends requests without a referer, use the following code:
RewriteCond %{HTTP_REFERER} ^$
RewriteCond %{HTTP_USER_AGENT} "<exact_name_for_the_bot>"
RewriteRule ^(.*)$ - [F,L]
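To verify the rule, send a request that uses the bot's user agent and omits the Referer header, and confirm that the response is 403 Forbidden. The following is a sketch that assumes a hypothetical bot named ExampleBot; the -I flag returns only the response headers so that you can see the status code:
curl -I -H "host:www.url_you_are_testing.url" -H "user-agent:ExampleBot" http://localhost/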
Website owners want to protect their websites from other sites that steal content or hotlink to images and consume bandwidth.
To prevent hotlinking, use the following code and replace domain.com with your domain name.
RewriteCond %{HTTP_REFERER} !^$
RewriteCond %{HTTP_REFERER} !^https?://(www\.)?domain\.com/.*$ [NC]
RewriteRule \.(gif|jpg|swf|flv|png)$ /feed/ [R=302,L]
In this code, R=302 indicates a temporary redirect. If you want a permanent redirect, adjust it to R=301.
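To verify the hotlink protection, request an image with a Referer header from a different domain and confirm that the response is a 302 redirect to /feed/. The following is a sketch that assumes a hypothetical image path; the -e flag sets the Referer header and the -I flag returns only the response headers:
curl -I -e "http://othersite.example/" -H "host:www.url_you_are_testing.url" http://localhost/sites/default/files/example.png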
Staging environment slowdowns often result from configuration issues or internal testing rather than external attacks.
If your non-production site is unresponsive:
Check recent activity: Verify if there were recent code deployments, configuration changes, or feature flag toggles.
Verify access controls: Confirm IP ACLs or Basic Auth settings have not accidentally blocked legitimate users or testing tools.
Review logs for internal tools: Check the access logs for repetitive requests from internal testing or monitoring tools. Use the commands mentioned in the Identify unwanted traffic sources section.
Check application errors: Review the error.log and mysql-slow.log for application failures rather than traffic volume.
Check cron status: Ensure that scheduled jobs are not stuck and are not causing resource exhaustion.
If the issue persists or you suspect a large-scale DDoS attack, collect the following data before you create a Support ticket. Providing this information upfront expedites resolution.
Environment: Production, Staging, or Development, and the environment ID.
Time Window (UTC): Exact start and end time of the incident. For example, 2025-11-17 18:00–18:30 UTC.
Triage Data: Output from the Triage CLI commands. For example, Top 5 IPs, User Agents, and Paths.
Actions Taken: Specific WAF rules, rate limits, or .htaccess changes that you applied.
Sample Request: A recent X-Request-ID header value from a failed or blocked request, if available.
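If you do not have a recent failed request on hand, you can capture an X-Request-ID value with a quick request against the affected page. The following is a sketch that assumes the identifier is returned as a response header on your domain:
curl -sI https://www.url_you_are_testing.url/ | grep -i x-request-id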
If this content did not answer your questions, try searching or contacting our support team for further assistance.