---
title: "Blocking unwanted traffic on your website"
date: "2024-02-14T06:18:38+00:00"
summary: "Learn effective methods to block unwanted traffic on your website using .htaccess, including user agents, robot crawlers, and hotlinks."
image:
type: "page"
url: "/acquia-cloud-platform/blocking-unwanted-traffic-your-website"
id: "9a14d377-e27a-4916-a068-e155ef5d8d37"
---

As a website owner or developer, you must control access and manage unwanted traffic to ensure the security and performance of your website.

Acquia recommends that, when possible, you implement traffic controls through Acquia Edge WAF or your own managed WAF, as these methods stop traffic before it reaches your application on the Acquia Platform and conserve resources. If these methods are unavailable, you can control access at the origin through the `.htaccess` file.

The following are the common methods to manage unwanted traffic on your website:

*   [Identify unwanted traffic sources](#unwanted-traffic)
    
*   [Block specific IP addresses](#blocking-ip-addresses)
    
*   [Decide between blocking and rate-limiting](#deciding-block-rate-limit)
    
*   [Block specific user agents](#blocking-user-agents)
*   [Block traffic from robot crawlers](#blocking-traffic-crawlers)
*   [Block hotlinks](#blocking-hotlinks)

Each of these methods blocks a different type of client from accessing your sites. When you use the code snippets in these methods, ensure that the regular expressions and Apache directives are properly formatted and adapted for your specific use case.

Identify unwanted traffic sources
---------------------------------

Before you can block traffic, you must identify the source. If your site experiences slowdowns or a spike in requests (visible in [Stack Metrics](/acquia-cloud-platform/using-stack-metrics-monitor-activity-your-environments)), connect to the environment through SSH and use the following commands on the `access.log` to identify the offenders.

**Top 20 IP addresses**

To identify the specific hosts that generate the most requests:

    awk '{print $1}' /var/log/nginx/access.log* | sort | uniq -c | sort -rn | head -20

**Top 20 user agents (bots/browsers)**

To identify the bots or scrapers involved:

    awk -F'"' '{print $6}' /var/log/nginx/access.log* | sort | uniq -c | sort -rn | head -20

**Top 20 requested paths (targeted URLs)**

To identify the targeted endpoints:

    awk -F'"' '{print $2}' /var/log/nginx/access.log* | awk '{print $2}' | sort | uniq -c | sort -rn | head -20

Once identified, use these IPs or User Agents in the following implementation methods.
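As a quick illustration, you can exercise the first pipeline against a small inline sample before running it on real logs. The log entries, IP addresses, and file path below are fabricated for the example:

```shell
# Create a tiny sample access log (combined log format, fabricated entries)
cat > /tmp/sample-access.log <<'EOF'
203.0.113.9 - - [17/Nov/2025:18:00:01 +0000] "GET /search HTTP/1.1" 200 512 "-" "Scrapy/2.11"
203.0.113.9 - - [17/Nov/2025:18:00:02 +0000] "GET /search HTTP/1.1" 200 512 "-" "Scrapy/2.11"
203.0.113.9 - - [17/Nov/2025:18:00:03 +0000] "GET /search HTTP/1.1" 200 512 "-" "Scrapy/2.11"
198.51.100.7 - - [17/Nov/2025:18:00:04 +0000] "GET / HTTP/1.1" 200 1024 "-" "Mozilla/5.0"
EOF

# The same top-IP pipeline as above, pointed at the sample file;
# the busiest IP (203.0.113.9, 3 requests) is listed first
awk '{print $1}' /tmp/sample-access.log | sort | uniq -c | sort -rn | head -20
```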

Block specific IP addresses
---------------------------

Use this method when you have identified one or a few specific IP addresses that abuse your site, such as the source of a brute-force attack.

### Acquia Edge WAF (Recommended)

If you use Acquia Edge, implement IP blocks through the WAF. This is the most performant method as it does not consume origin resources.

1.  Create a WAF rule that matches the abusive IPs. For example, in Cloudflare, use the expression: `(ip.src in {198.51.100.23 203.0.113.0/24})`
    
2.  Set **Action** to **Block**.
    
3.  Deploy the rule.
    

### .htaccess (Fallback)

If Edge WAF is unavailable, you can use the `.htaccess` file. The rules defined in this file consume server resources such as CPU and memory. Use this method only for small lists of IPs.

Use the following syntax (Apache 2.4) in your `.htaccess` file:

    <RequireAll>
      Require all granted
      # Block a single IP
      Require not ip 198.51.100.23
      # Block an IP range
      Require not ip 203.0.113.0/24
    </RequireAll>
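After a block is deployed, one way to confirm it is taking effect is to count the response codes that the offending IP receives in the access log. A sketch against fabricated sample data (on the platform, point the same pipeline at the real log under `/var/log/nginx/`):

```shell
# Fabricated log entries: the blocked IP now receives 403 responses
cat > /tmp/blocked-check.log <<'EOF'
198.51.100.23 - - [17/Nov/2025:18:05:01 +0000] "GET /user/login HTTP/1.1" 403 199 "-" "curl/8.5"
198.51.100.23 - - [17/Nov/2025:18:05:02 +0000] "GET /user/login HTTP/1.1" 403 199 "-" "curl/8.5"
192.0.2.10 - - [17/Nov/2025:18:05:03 +0000] "GET / HTTP/1.1" 200 1024 "-" "Mozilla/5.0"
EOF

# Count status codes seen by the blocked IP ($9 is the status field
# in the combined log format)
grep '^198\.51\.100\.23 ' /tmp/blocked-check.log | awk '{print $9}' | sort | uniq -c
```

If the block works, the offending IP should show only `403` responses while other visitors continue to receive `200`.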

Decide between blocking and rate-limiting
-----------------------------------------

Select the appropriate action to ensure that you stop abuse with no impact to legitimate users.

*   **Block:** Use when malicious traffic comes from a specific source. For example, block specific IPs.
    
*   **Rate-Limit:** Use when the unwanted traffic is distributed across many IPs, such as scrapers that hit `/search` or API endpoints, or when a legitimate service makes multiple requests. Rate-limiting caps the number of requests per IP in a specific time frame. When the threshold is exceeded, the visitor might be challenged through CAPTCHA or temporarily blocked.
    

This typically requires [Acquia Edge powered by Cloudflare](https://docs.acquia.com/acquia-edge/config/rate-limiting/).
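Before you set a rate-limit threshold, it helps to measure how many requests per minute a single IP actually makes, so that legitimate users stay under the limit. A sketch against fabricated sample data, keying on the IP plus the timestamp truncated to the minute:

```shell
# Fabricated combined-log entries spanning two minutes
cat > /tmp/rate-sample.log <<'EOF'
203.0.113.9 - - [17/Nov/2025:18:00:01 +0000] "GET /search HTTP/1.1" 200 512 "-" "Scrapy/2.11"
203.0.113.9 - - [17/Nov/2025:18:00:20 +0000] "GET /search HTTP/1.1" 200 512 "-" "Scrapy/2.11"
203.0.113.9 - - [17/Nov/2025:18:01:05 +0000] "GET /search HTTP/1.1" 200 512 "-" "Scrapy/2.11"
EOF

# Requests per IP per minute: split the timestamp field on ":" and
# keep only the date, hour, and minute components
awk '{split($4, t, ":"); print $1, t[1] ":" t[2] ":" t[3]}' /tmp/rate-sample.log \
  | sort | uniq -c | sort -rn | head -20
```

The highest count per minute gives you a baseline for where an abusive rate starts on your site.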

Block specific user agents
--------------------------

Caution

Do not block reputable crawlers, such as Googlebot or Bingbot, as this negatively impacts your Search Engine Optimization (SEO). Verify the source of the traffic before you block a common user agent.

To block specific user agents, add the following code snippet to the `.htaccess` file:

    RewriteCond %{HTTP_USER_AGENT} UserAgent
    RewriteRule .* - [F,L]

You can also block more than one user agent at a time with the `[OR]` ('or next condition') flag; the `[NC]` ('no case') flag makes the match case-insensitive. The following example blocks several user agents with properly escaped regexes:

    RewriteCond %{HTTP_USER_AGENT} Baiduspider [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} HTTrack [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} Yandex [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} Scrapy [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} Mozilla/5\.0\ \(compatible;\ Yahoo [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} AppleNewsBot [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} Googlebot [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} Mozilla/5\.0\ \(compatible;\ YandexBot [NC]
    RewriteRule .* - [F,L]

To avoid website errors, you must properly escape special characters in your regular expressions (regex). `HTTP_USER_AGENT` can take a regex as an argument. In the preceding code, several user agents require escaping because their names contain special characters. Instead of constructing a rule manually, you can use websites such as [https://beautifycode.net/regex-escape](https://beautifycode.net/regex-escape) to quickly build a properly escaped regex.
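If you prefer to stay in the terminal, a simple `sed` substitution that backslash-escapes every character other than letters, digits, and underscores produces a string safe to paste into a `RewriteCond` (extra escapes of characters such as `/` or `;` are harmless in a regex):

```shell
# Backslash-escape every non-word character in a user agent string
printf '%s' 'Mozilla/5.0 (compatible; Yahoo' | sed 's/[^A-Za-z0-9_]/\\&/g'
# -> Mozilla\/5\.0\ \(compatible\;\ Yahoo
```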

Test the blocking of user agents
--------------------------------

1.  Run the following command to test that the site responds:
    
    `curl -H "host:www.url_you_are_testing.url" http://localhost/`
    
2.  Run the following command to test that the user agent is blocked:
    
    `curl -H "host:www.url_you_are_testing.url" -H "user-agent:Pcore-HTTP/v0.25.0" http://localhost/`
    
    This is an example command to test a user agent named Pcore. You can run a similar command for your intended user agent.
    

Block traffic from robot crawlers
---------------------------------

A robot crawler can cause problems for a site by making a large number of requests. You can use the following code in either of these situations:

*   When robots do not adhere to the `robots.txt` file
*   When you want to block the traffic immediately instead of waiting for robot crawlers to fetch the `robots.txt` file

    RewriteCond %{HTTP_REFERER} ^$
    RewriteCond %{HTTP_USER_AGENT} "<exact_name_for_the_bot>"
    RewriteRule ^(.*)$ - [F,L]
    
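For example, to block a hypothetical crawler that identifies itself as `BadBot/1.0`, the template filled in looks like:

```apache
RewriteCond %{HTTP_REFERER} ^$
RewriteCond %{HTTP_USER_AGENT} "BadBot"
RewriteRule ^(.*)$ - [F,L]
```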

Block hotlinks
--------------

Website owners often want to protect their sites from other websites that steal content or hotlink to images, which consumes bandwidth.

To prevent hotlinking, use the following code and replace `domain.com` with your domain name.

    RewriteCond %{HTTP_REFERER} !^$
    RewriteCond %{HTTP_REFERER} !^https?://(www\.)?domain\.com/.*$ [NC]
    RewriteRule \.(gif|jpg|swf|flv|png)$ /feed/ [R=302,L]
    

In this code, `R=302` indicates a temporary redirect. If you want a permanent redirect, adjust it to `R=301`.
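If you would rather return an error than redirect, replace the rewrite target with the forbidden flag. This is a minor, standard mod_rewrite variation of the same rule:

```apache
RewriteCond %{HTTP_REFERER} !^$
RewriteCond %{HTTP_REFERER} !^https?://(www\.)?domain\.com/.*$ [NC]
RewriteRule \.(gif|jpg|swf|flv|png)$ - [F,L]
```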

Troubleshoot staging and non-production outages
-----------------------------------------------

Staging environment slowdowns often result from configuration issues or internal testing rather than external attacks.

If your non-production site is unresponsive:

*   **Check recent activity:** Verify if there were recent code deployments, configuration changes, or feature flag toggles.
    
*   **Verify access controls:** Confirm [IP ACLs or Basic Auth settings](/acquia-cloud-platform/controlling-access-cloud-platform) have not accidentally blocked legitimate users or testing tools.
    
*   **Review logs for internal tools:** Check the access logs for repetitive requests from internal testing or monitoring tools. Use the commands mentioned in the [Identify unwanted traffic sources](#unwanted-traffic) section.
    
*   **Check Application Errors:** Review the `error.log` and `mysql-slow.log` for application failures rather than traffic volume.
    
*   **Check Cron Status:** Ensure that scheduled jobs are not stuck or do not cause resource exhaustion.
    
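When reviewing `error.log`, aggregating repeated messages quickly shows whether a single failure dominates. A sketch against fabricated entries (adjust the path and timestamp format for your environment's log):

```shell
# Fabricated error-log entries with a timestamp prefix
cat > /tmp/sample-error.log <<'EOF'
2025/11/17 18:00:01 [error] 123#0: connect() failed (111: Connection refused)
2025/11/17 18:00:05 [error] 123#0: connect() failed (111: Connection refused)
2025/11/17 18:00:09 [warn] 123#0: low disk space
EOF

# Drop the variable timestamp (first two fields), then count identical messages
cut -d' ' -f3- /tmp/sample-error.log | sort | uniq -c | sort -rn
```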

Before you create a Support ticket
----------------------------------

If the issue persists or you suspect a large-scale DDoS attack, collect the following data before you create a Support ticket. Providing this information upfront expedites resolution.

*   **Environment:** Production, Staging, or Development, and the environment ID.
    
*   **Time Window (UTC):** Exact start and end time of the incident. For example, `2025-11-17 18:00–18:30 UTC`.
    
*   **Triage Data:** Output from the triage commands in the [Identify unwanted traffic sources](#unwanted-traffic) section. For example, the top 5 IPs, user agents, and paths.
    
*   **Actions Taken:** Specific WAF rules, rate limits, or `.htaccess` changes that you applied.
    
*   **Sample Request:** A recent `X-Request-ID` header value from a failed or blocked request, if available.