Monitoring uptime for your site
The Acquia site health service monitors your site's uptime and responsiveness. This service checks your site every minute to see if it is online and serving pages. The service does this by loading a special, secure URL (the "canary URL") served by the Acquia Connector module. The response to this canary URL informs us that your site can bootstrap Drupal, and that all of its database and other basic dependencies are functioning. The site health monitoring service also checks other pages on your site (such as your home page) less frequently and tracks their response time. Only if the page checks return errors repeatedly does the site health monitoring service send an alert that the site is down. This approach helps minimize false alarms.
Site health monitoring is a self-service feature you can use to monitor trends in response times and availability. It does not automatically notify Acquia Support of application outages or issues. Acquia uses different systems to monitor the health of a site's servers. If Acquia detects server issues that may impact site availability for a prolonged period of time, a proactive ticket will be filed by Acquia Support. However, not all application issues are caused by server problems, so site health monitoring will sometimes catch issues that Acquia's server monitoring does not.
The site health service should be used to monitor only production sites; it should not be used to monitor non-production sites.
Before you begin
To use the site health monitoring service, you must install and enable the Acquia Connector module (version 7.x-2.13 or later, or 6.x-2.14 or later) on your production site. Acquia recommends that you always use the latest version of the Acquia Connector module. After you install or update the module, go to the Home > Admin > Reports page of your Drupal site and click Status reports. In the Acquia SPI section, click the manually send SPI data link to register the presence of the new module with Acquia Insight. You may need to wait a few minutes before you can enable the site health monitoring service.
Enabling site health monitoring
To enable site health monitoring:
- On the Sites > Insight page, under Site Health, click Enable.
- In the Edit environment settings dialog, select Monitor site health.
- Optionally, select the domain that you want to monitor.
Viewing the uptime graph
To view information about your website's uptime, open the Sites > Insight page. Under Site Health, the uptime graph displays your site's responsiveness over time, measured in milliseconds.
The graph displays your site's response with bars colored according to the response:
|Bar color|Meaning|
|orange|Site health monitoring configuration error|
|red|Site down, or site DNS error|
|black|Site is unreachable|
Move your mouse over a bar on the graph to see details of the response, including date and time, exact response time in milliseconds, and response (ok or error). Note that a single error does not necessarily mean that the site was down for a whole minute. It may have been a momentary issue.
You can use the date pickers to change the start and end dates of the data presented in the graph.
Getting more information about errors
When the uptime service receives an error, a notification of the error is sent and displayed in the Insight notification list. The notification describes the type of error and possible causes. The following categories of errors are reported:
- Site down. Here are some possible causes:
  - Newly deployed code may contain a PHP error that requires a code rollback. Check your PHP error logs for any fatal errors.
  - A newly enabled module has created an error condition on the site. Disable the module.
  - Caches were recently cleared and the site is hitting a memory limit when attempting to rebuild the cache.
- Site impaired. The site is returning errors for some, but not all, of the pages Acquia monitors.
- Site online. The site has come back online.
- Site in Maintenance mode. The site reports that it is in Drupal maintenance mode.
- Site health monitoring configuration error. The site is not responding to testing. Ensure that the Acquia Connector module is enabled, that its version is 7.x-2.13 or later (or 6.x-2.14 or later), and that the site is connected to Acquia Insight.
- Site not responding. This could be due to general internet connectivity issues, or the servers hosting the site may be offline. If your site is hosted with Acquia and the servers powering it are down, Acquia will open a support ticket with you shortly.
- Site DNS error. The DNS record for the site is failing to resolve, meaning we did not receive a timely response during a DNS lookup. This usually means either that your DNS provider or registrar is having an issue, or that there are networking issues outside of Acquia's control. DNS resolution problems are often resolved very quickly. If you can load the domain, the issue may be localized or already resolved. Acquia cannot investigate DNS errors.
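For the Site DNS error case above, you can check from your own machine whether the domain currently resolves. The following is a minimal Python sketch, not part of any Acquia tooling. The function name `resolve_addresses` and the domain `www.example.com` are hypothetical placeholders. The `resolver` parameter exists only so that the lookup can be swapped out for testing.

```python
import socket
from typing import Callable, List


def resolve_addresses(hostname: str,
                      resolver: Callable = socket.getaddrinfo) -> List[str]:
    """Return the sorted, de-duplicated IP addresses that `hostname` resolves to.

    An empty list means the lookup failed from this machine.
    """
    try:
        results = resolver(hostname, None)
    except socket.gaierror:
        # The name did not resolve (or the resolver itself failed).
        return []
    # Each result is (family, type, proto, canonname, sockaddr);
    # the first element of sockaddr is the IP address.
    return sorted({info[4][0] for info in results})


# Example (hypothetical domain, performs a real DNS lookup):
#   addresses = resolve_addresses("www.example.com")
#   print(addresses or "DNS lookup failed")
```

If the lookup succeeds locally while Insight still reports a DNS error, the problem may be localized or already resolved, as noted above.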
Receiving email notifications
You can enable email notifications of site health issues. If you do so, the site health monitoring service sends an email notification after three consecutive minutes of errors. This threshold helps reduce false alarms. You cannot customize this three-minute threshold.
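To illustrate why a three-minute threshold filters out momentary errors, here is a minimal Python sketch of a consecutive-failure counter. It is not Acquia's implementation, which is not documented here. The function name `alert_minutes` is hypothetical, and the sketch assumes one check per minute and one alert per uninterrupted run of failures.

```python
from typing import Iterable, List

ALERT_THRESHOLD = 3  # consecutive failed one-minute checks, per the text above


def alert_minutes(checks: Iterable[bool],
                  threshold: int = ALERT_THRESHOLD) -> List[int]:
    """Return the minute indexes at which an alert would fire.

    `checks` holds one boolean per minute; True means the check succeeded.
    Assumption: an alert fires once, when a run of consecutive failures
    first reaches `threshold`.
    """
    alerts = []
    consecutive_failures = 0
    for minute, ok in enumerate(checks):
        if ok:
            # Any successful check resets the run of failures.
            consecutive_failures = 0
            continue
        consecutive_failures += 1
        if consecutive_failures == threshold:
            alerts.append(minute)
    return alerts
```

With this logic, two failed checks followed by a success produce no alert, while a run of three or more failures produces exactly one.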
The site health monitoring service needs to be able to receive an HTTP 200 response from your homepage and the canary URL in order to function. If, when you first enable site health monitoring, the Site Health graph shows errors, the most likely causes are:
- The Acquia Connector module on your site is not up-to-date.
- Acquia Insight has not yet received confirmation from your site that the Acquia Connector module has been updated. Check the last connection date and time shown on the left sidebar. Run cron if it has not run recently.
- The site health monitoring service can't reach the canary URL or your homepage because it is being redirected or blocked.
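To check the last cause yourself, you can request your homepage without following redirects and see whether it returns HTTP 200 directly. The following Python sketch uses only the standard library. The names `fetch_status` and `diagnose` and the URL `https://www.example.com/` are hypothetical, and the sketch assumes that 401 and 403 responses indicate blocking. It does not test the canary URL, whose address is not given here.

```python
import urllib.error
import urllib.request
from typing import Optional, Tuple


class NoRedirect(urllib.request.HTTPRedirectHandler):
    """Redirect handler that refuses to follow redirects."""

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        # Returning None makes urllib raise HTTPError for the 3xx response.
        return None


def fetch_status(url: str, timeout: float = 10.0) -> Tuple[int, Optional[str]]:
    """Return (status code, Location header or None) without following redirects.

    Connection failures (urllib.error.URLError) are not caught here.
    """
    opener = urllib.request.build_opener(NoRedirect)
    try:
        with opener.open(url, timeout=timeout) as response:
            return response.status, None
    except urllib.error.HTTPError as err:
        return err.code, err.headers.get("Location")


def diagnose(status: int, location: Optional[str] = None) -> str:
    """Summarize whether a response would satisfy an HTTP 200 requirement."""
    if status == 200:
        return "ok"
    if 300 <= status < 400:
        return f"redirected to {location}" if location else "redirected"
    if status in (401, 403):
        return "blocked"
    return f"unexpected status {status}"


# Example (hypothetical URL, performs a real HTTP request):
#   print(diagnose(*fetch_status("https://www.example.com/")))
```

Anything other than "ok" suggests that the monitored domain may need to be changed, or that a redirect or access rule is in the way.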
If you have questions about an alert and your Acquia subscription entitles you to open support tickets, you can contact Acquia Support. The site health monitoring service may detect momentary, transient issues on a site, or in the network between our service and a site's servers. For this reason, Acquia Support will not investigate momentary outages. We will, however, investigate recurring instances of partial or complete downtime detected by this feature, or issues which are currently in progress. Furthermore, Support will not investigate DNS errors, which usually mean either that your DNS provider or registrar (often the same) is having an issue, or that there are networking issues outside of Acquia's control.