Date Published: February 10, 2022

Blocking access using rewrites

Issue

How can I restrict access to part or all of my website?

Resolution

Using RewriteCond and RewriteRule, you can block or restrict access to one section of a website, or to an entire website. The restriction can be based on one or more IP addresses, domains, and so on. The setup of some of these methods is covered in other Help Center articles.

For help in generating rules automatically, there is a series of tools at IncrediBILL's htaccess tools. These tools cover a number of the use cases discussed in the following sections.

Block access to an entire domain

You can prevent users from accessing a domain, or you can prevent users that are coming from a particular domain from accessing your website.

Preventing access to a domain

When using an origin domain, it's highly probable that you will want to limit the users of that domain by allowlisting only a particular group of domains or IP addresses. Here's an example:

# Protect the origin
<IfModule mod_setenvif.c>
  SetEnvIf Host ^origin\.domain\.com$ origin=1
  SetEnvIf AH_CLIENT_IP ^127\.0\.0\.1$ whitelist=1
  SetEnvIf AH_CLIENT_IP ^10\.171\. whitelist=1
  SetEnvIf AH_CLIENT_IP ^10\.210\.122\.14$ whitelist=1
  SetEnvIf AH_CLIENT_IP ^192\.168\.204\.109$ whitelist=1
  SetEnvIf AH_CLIENT_IP ^192\.168\.236\.194$ whitelist=1
  Order deny,allow
  Deny from env=origin
  Allow from env=whitelist
</IfModule>

The Deny from line ensures that this snippet only applies to users accessing the site through the origin host.
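To illustrate how the SetEnvIf patterns above behave, here is a small Python sketch (for illustration only; it is not part of the Apache configuration, and the IP addresses are examples). SetEnvIf matches its regex anywhere in the value, which is why each pattern is anchored with ^, and with $ for exact matches:

```python
import re

# Example allowlist patterns mirroring the SetEnvIf lines above.
ALLOWLIST = [
    r"^127\.0\.0\.1$",       # exact loopback match
    r"^10\.171\.",           # prefix match: any IP beginning with 10.171.
    r"^10\.210\.122\.14$",   # exact match; $ prevents 10.210.122.140, etc.
]

def is_allowlisted(ip: str) -> bool:
    """Return True if the client IP matches any allowlist pattern."""
    return any(re.search(p, ip) for p in ALLOWLIST)

print(is_allowlisted("10.171.55.2"))     # True: prefix pattern matches
print(is_allowlisted("10.210.122.140"))  # False: the $ anchor rejects it
print(is_allowlisted("127.0.0.1"))       # True: exact match
```

Note how the trailing $ distinguishes an exact address from a prefix: without it, the pattern for 10.210.122.14 would also allow 10.210.122.140 through.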

In a similar fashion, you can block access to the DNS name of your elastic load balancer (ELB) like this:

<Limit GET>
order allow,deny
allow from all
deny from [ELB DNS name]
</Limit>

If you have dedicated balancers, you may also be able to do this in your VCL (Varnish).
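As a rough, hypothetical VCL 4.0 sketch (the ELB hostname below is a placeholder; adapt it to your own configuration), such a block might look like:

```vcl
# Sketch only: reject requests whose Host header is the ELB DNS name
# rather than the public site domain.
sub vcl_recv {
  if (req.http.host ~ "(?i)my-elb-123456\.us-east-1\.elb\.amazonaws\.com") {
    return (synth(403, "Forbidden"));
  }
}
```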

Disallowing access to users coming from a domain

Requests from specific domains can be blocked by adding the following to .htaccess:

RewriteCond %{HTTP_REFERER} domain-name\.com [NC] 
RewriteRule .* - [F]

For multiple domains, use something similar to the following:

RewriteCond %{HTTP_REFERER} domain-one\.com [NC,OR] 
RewriteCond %{HTTP_REFERER} domain-two\.com [NC]
RewriteRule .* - [F]

Blocking by IP address

You can use rewrites to block access to one or more IP addresses. For a list of examples, see Blocking by IP address.

Blocking access to specific areas

You may want to block access to parts of your site, only giving users from particular IP ranges or domains the ability to access them. The previously mentioned articles cover this scenario in different ways, and include some reasonably easy examples of address blocking.

Note

If you want to restrict access to your Acquia subscription, see Securing your Acquia subscription with IP address allowlisting.

Blocking access to specific pages

Some customers would only like specific IPs to be able to log into the Drupal website. If this is the case, you have to do the following:

Note

If you are on Acquia hosting, or you're using Varnish, you'll want to ensure your page caches are not serving restricted content. See Set max-age headers for specific pages, files, or paths for help in setting max-age=0 in appropriate circumstances.

In many cases, RewriteCond may be chained with other conditions.

  • Ensure only accepted IPs can access the page
  • Ensure that the user page is not cached after the accepted IP has accessed the page.
  • On Acquia Cloud, you must use the AH_Client_IP variable, because when the website is behind an elastic load balancer (ELB), the X-Forwarded-For header contains both the original client IP and the ELB IP. An exact match against that header would therefore fail and the user would be blocked. Substitute IP addresses in this example as necessary:
Header always set Cache-Control max-age=0 env=userpage 
RewriteCond %{REQUEST_URI} ^/user(/login)*$ [OR] 
RewriteCond %{QUERY_STRING} ^q=user(/login)*$ 
RewriteCond %{ENV:AH_Client_IP} !^2.121.88.122$ 
RewriteCond %{ENV:AH_Client_IP} !^12.211.9.49$ 
RewriteCond %{ENV:AH_Client_IP} !^123.45. 
RewriteCond %{ENV:AH_SITE_ENVIRONMENT} ^(dev|test|prod)$ 
RewriteRule ^ http://%{HTTP_HOST} [R=302,L,E=userpage:1]
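The REQUEST_URI condition above uses the pattern ^/user(/login)*$, which matches /user plus zero or more literal /login segments. This Python sketch (for illustration only) shows which paths it catches:

```python
import re

# Mirror of the RewriteCond pattern ^/user(/login)*$
pattern = re.compile(r"^/user(/login)*$")

print(bool(pattern.match("/user")))         # True: zero /login segments
print(bool(pattern.match("/user/login")))   # True: one /login segment
print(bool(pattern.match("/user/1/edit")))  # False: other user paths pass through
```

Because the pattern is anchored at both ends, other paths under /user, such as a user's edit page, are not affected by the restriction.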

In conjunction, ensure Drupal does not cache the user page by entering the following in settings.php:

if (isset($_GET['q']) && strpos($_GET['q'], 'user') === 0) {
  $conf['cache'] = FALSE;
}

You can use a similar method to restrict access to a single page to one IP address. The following code restricts access to /path/to/page to the IP 192.168.2.3:

// Add this to the end of settings.php
// Handle allowlisting for a /path/to/page URL
if ($_SERVER['REQUEST_URI'] == '/path/to/page') {
  // Turn off page caching (internal and external) for this page
  // to avoid a valid IP getting this page into cache.
  drupal_page_is_cacheable(FALSE);
  // Only allow access to an IP.
  if ($_ENV['AH_Client_IP'] != '192.168.2.3') {
    // Deny access
    drupal_add_http_header('Status', '403 Forbidden');
    echo "<h1>403 Forbidden</h1><p>Access is forbidden from " . $_ENV['AH_Client_IP'] . ". Shoo!</p>\n\n";
    exit;
  }
}

Once this is done, you should clear the Varnish cache for that page.

Blocking core Drupal pages

Files such as CHANGELOG.txt can allow a malicious script or user to quickly identify the version of Drupal you are running and target known security vulnerabilities. While there are a number of ways to identify the version of Drupal in use, one quick addition to your .htaccess file can make it slightly less obvious.

# Various alias rules 
# Current Drupal version
Redirect 404 /core/CHANGELOG.txt
Redirect 404 /core/COPYRIGHT.txt
Redirect 404 /core/INSTALL.mysql.txt
Redirect 404 /core/INSTALL.pgsql.txt
Redirect 404 /core/INSTALL.sqlite.txt
Redirect 404 /core/INSTALL.txt
Redirect 404 /core/LICENSE.txt
Redirect 404 /core/MAINTAINERS.txt
Redirect 404 /core/PATCHES.txt
Redirect 404 /core/UPDATE.txt
Redirect 404 /core/USAGE.txt
Redirect 404 /INSTALL.txt
Redirect 404 /modules/README.txt
Redirect 404 /profiles/README.txt
Redirect 404 /README.md
Redirect 404 /sites/README.txt
Redirect 404 /themes/README.txt

# Drupal 7
Redirect 404 /CHANGELOG.txt 
Redirect 404 /COPYRIGHT.txt 
Redirect 404 /cron.php 
Redirect 404 /INSTALL.mysql.txt 
Redirect 404 /INSTALL.pgsql.txt 
Redirect 404 /INSTALL.sqlite.txt 
Redirect 404 /INSTALL.txt 
Redirect 404 /install.php 
Redirect 404 /LICENSE.txt 
Redirect 404 /MAINTAINERS.txt 
Redirect 404 /PATCHES.txt 
Redirect 404 /README.txt 
Redirect 404 /update.php 
Redirect 404 /UPGRADE.txt 
Redirect 404 /web.config

Blocking file resources from all but a handful of sites

You may want to keep a specific directory from being accessed by the general public unless the request is referred by a particular website. This example blocks requests to the /sites/default/files directory unless the request comes from prod-kb.example.com, kb.example.com, or example.com (with or without www).

RewriteCond %{REQUEST_URI} ^/sites/default/files 
RewriteCond %{HTTP_REFERER} !^http://prod-kb.example.com [NC] 
RewriteCond %{HTTP_REFERER} !^http://kb.example.com [NC] 
RewriteCond %{HTTP_REFERER} !^http://(www.)?example.com [NC] 
RewriteRule .* - [F]

Blocking access to a subfolder

You can block access to a folder (substitute the folder's name for subfolder) using code like this:

RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /subfolder/.*\ HTTP [NC]
RewriteRule .* - [F,L]

Time-based blocks

If your website should only be available during a specific time period, you can enforce that. This condition and rule block access during the 4 PM hour (16:00 to 16:59 server time):

RewriteCond %{TIME_HOUR} ^16$
RewriteRule ^.*$ - [F,L]
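If you instead want to block access for all hours before 4 PM, mod_rewrite CondPatterns beginning with < perform a lexicographic string comparison. Because TIME_HOUR is always two digits, the comparison works correctly for hours:

```apache
# Block access any time before 16:00 server time.
RewriteCond %{TIME_HOUR} <16
RewriteRule ^.*$ - [F,L]
```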

Blocking HTTP commands

You may not want certain types of HTTP requests to be processed by your site.

This blocks any HTTP request that is not a GET or a POST request.

RewriteCond %{REQUEST_METHOD} !^(GET|POST) 
RewriteRule .* - [F]
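One subtlety worth noting: the pattern ^(GET|POST) has no trailing $, so any method string that merely begins with GET or POST would also be allowed through. This Python sketch (for illustration only; POSTX is a made-up method name) shows the difference anchoring makes:

```python
import re

# Loose pattern, as in the RewriteCond above, versus a fully anchored one.
loose = re.compile(r"^(GET|POST)")
strict = re.compile(r"^(GET|POST)$")

print(bool(loose.match("PROPFIND")))  # False: blocked by either pattern
print(bool(loose.match("POSTX")))     # True: slips past the loose pattern
print(bool(strict.match("POSTX")))    # False: the $ anchor rejects it
```

If you want an exact method match, use !^(GET|POST)$ in the condition.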

Blocking attacks on your website

There are multiple methods that can be used to deflect attacks.

Block specific user agents

If your website is the victim of a DDoS attack, and you want to block a group of IP addresses using the same User Agent, the following code may be helpful. Replace the UserAgent with the name of the agent you want to block:

RewriteCond %{HTTP_USER_AGENT} UserAgent 
RewriteRule .* - [F,L]

You can also block more than one User Agent at a time with the [OR] ('or next condition') flag, and the [NC] ('no case') flag renders the string case insensitive. Here are some examples of some user-agents with properly escaped regexes:

RewriteCond %{HTTP_USER_AGENT} Baiduspider [NC,OR] 
RewriteCond %{HTTP_USER_AGENT} HTTrack [NC,OR] 
RewriteCond %{HTTP_USER_AGENT} Yandex [NC,OR] 
RewriteCond %{HTTP_USER_AGENT} Scrapy [NC,OR] 
RewriteCond %{HTTP_USER_AGENT} Mozilla/5\.0\ \(compatible;\ Yahoo [NC,OR] 
RewriteCond %{HTTP_USER_AGENT} AppleNewsBot [NC,OR] 
RewriteCond %{HTTP_USER_AGENT} Googlebot [NC,OR] 
RewriteCond %{HTTP_USER_AGENT} Mozilla/5\.0\ \(compatible;\ YandexBot [NC] 
RewriteRule .* - [F,L]
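The [NC] flag corresponds to case-insensitive matching, and the conditions are searched anywhere within the User-Agent string. This Python sketch (for illustration only; the user-agent strings are examples) mirrors that behavior with re.IGNORECASE:

```python
import re

# A subset of the blocked patterns from the conditions above.
BLOCKED = [
    r"Baiduspider",
    r"HTTrack",
    r"Mozilla/5\.0 \(compatible; YandexBot",
]

def is_blocked(user_agent: str) -> bool:
    """Return True if any blocked pattern matches, ignoring case ([NC])."""
    return any(re.search(p, user_agent, re.IGNORECASE) for p in BLOCKED)

print(is_blocked("baiduspider/2.0"))                        # True: case-insensitive
print(is_blocked("Mozilla/5.0 (compatible; YandexBot/3.0)")) # True
print(is_blocked("curl/7.85.0"))                             # False
```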

Important

Properly escape characters inside your regex (regular expressions) to avoid website errors.

The HTTP_USER_AGENT condition takes a regex as its argument. As the example above shows, many User Agent strings need careful escaping because of the complexity of their names. Rather than escaping them manually, websites such as https://www.regex-escape.com/regex-escaping-online.php can help construct a properly escaped regex quickly.

How to test that the block is working

Test that the site is responding:

curl -H "host:www.url_you_are_testing.url" http://localhost/

Test that the user-agent (Pcore as an example) is indeed blocked:

curl -H "host:www.url_you_are_testing.url" -H "user-agent:Pcore-HTTP/v0.25.0" http://localhost/

Block traffic from robot crawlers

While a robot crawler may not technically be an attack, some crawlers can cause real problems. You can use this when the robots do not obey the robots.txt file, or if you need an immediate block, because robots.txt is generally not fetched immediately by crawlers.

RewriteCond %{HTTP_REFERER} ^$
RewriteCond %{HTTP_USER_AGENT} "<exact_name_for_the_bot>"
RewriteRule ^(.*)$ - [F,L]

The last thing most website owners want is other websites stealing their content, or worse, hotlinking to their images and stealing their bandwidth. Here's a simple bit of code that prevents it; replace domain.com with your domain name:

RewriteCond %{HTTP_REFERER} !^$
RewriteCond %{HTTP_REFERER} !^http://(www\.)?domain\.com/.*$ [NC]
RewriteRule \.(gif|jpg|swf|flv|png)$ /feed/ [R=302,L]
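The two conditions combine so that only requests carrying a referer from another site are redirected: an empty referer (a direct visit or a privacy-stripping browser) passes through. This Python sketch (for illustration only; domain.com is the placeholder from the rule) mirrors that logic:

```python
import re

# Mirror of the second condition: referers from (www.)domain.com are allowed.
ALLOWED = re.compile(r"^http://(www\.)?domain\.com/", re.IGNORECASE)

def hotlink_blocked(referer: str) -> bool:
    """True when a non-empty referer comes from a foreign site."""
    if referer == "":
        return False  # first condition (!^$) fails: request is allowed
    return ALLOWED.match(referer) is None

print(hotlink_blocked(""))                            # False: direct request
print(hotlink_blocked("http://www.domain.com/page"))  # False: own site
print(hotlink_blocked("http://evil.example/steal"))   # True: hotlink blocked
```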

Did not find what you were looking for?

If this content did not answer your questions, try searching or contacting our support team for further assistance.
