FAQs and troubleshooting in Cloud Next

Cloud Platform versions: Cloud Platform documentation refers to both the Cloud Classic and Cloud Next versions of this product, unless otherwise noted.

This page details the FAQs that are specific to Cloud Next.

Acquia began upgrading customer applications to Cloud Next on 24 March 2021. Cloud Next technologies provide better performance, scalability, and resiliency during high load. Powered by advanced serverless architecture, Cloud Next ensures applications can scale faster to satisfy the demands of high-traffic events and resource-intensive processes.

For more information on Cloud Next and to learn how it differs from Cloud Classic, see Cloud Next and Cloud Classic.

Important

Cloud Next technologies have been extensively tested against a broad variety of Drupal applications, including various custom configurations, with assistance from Cloud Platform subscribers all over the world. In the event that you encounter any technical issues with applications running on Cloud Next infrastructure, contact Acquia support and Acquia will work with you to resolve the issue as quickly as possible.

How does infrastructure upsizing and downsizing work in Cloud Next?

Cloud Next is powered by Kubernetes-managed clusters and advanced file system and database management technologies that leverage pooled resources across hundreds of containerized and virtually isolated environments. This architecture allows non-production environments to run in a high-availability configuration by default. Also, it allows production environments to leverage the full capacity of an entire regional cluster, if necessary, to respond to spikes in traffic or other forms of high resource utilization.

On Cloud Next, non-production environments are permitted to scale up to a predefined point before additional resources stop getting allocated. Acquia Support can raise this limit on a case-by-case basis for customers who have not exceeded their Resource Limits or contractual Cloud Capacity Unit thresholds.

Production environments are also permitted to scale up to a predefined point above the amount of cloud capacity expected to be required to satisfy the level of traffic associated with a given subscription. However, in the unlikely event that any production environment nears that threshold, Acquia Support will step in to audit the activity on your account and determine whether additional auto-scaling can be permitted or whether some form of intervention is required to reduce utilization back to normal levels, for example, if your site is under attack or inadvertently consumes more resources than expected.

Once a spike in resource utilization subsides, all environments on Acquia Cloud Next automatically reduce the amount of dedicated capacity to make it available to the regional pool once more.

Could the dynamic auto-scaling on Cloud Next result in overage fees?

The dynamic auto-scaling functionality in Cloud Next provides flexibility and almost limitless growth when extra cloud capacity is required.

For customers on Acquia’s Views/Visits pricing model, Acquia reaches out only if:

  • the aggregate resource utilization across all applications and environments exceeds the Resource Limits guidelines published in Acquia’s Products & Services Guide for a period lasting longer than eight hours.
  • recurring spikes above thresholds are observed on a routine basis over the course of two or more weeks.

In either case, overage fees will be assessed starting from the date of initial notification by Acquia.
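The eight-hour condition above can be illustrated with a minimal sketch. It assumes utilization is available as timestamped samples and that a single numeric limit applies; the source does not say how Acquia actually samples or aggregates utilization, so the function names, units, and sampling interval here are hypothetical.

```python
from datetime import datetime, timedelta


def longest_run_above(samples, limit):
    """Return the longest continuous time span spent above `limit`.

    `samples` is a list of (timestamp, utilization) tuples sorted by time.
    The span is measured from the first to the last consecutive sample
    that exceeds the limit (an assumption made for this sketch).
    """
    longest = timedelta(0)
    run_start = None
    for timestamp, value in samples:
        if value > limit:
            if run_start is None:
                run_start = timestamp
            longest = max(longest, timestamp - run_start)
        else:
            # Utilization dropped back to or below the limit: the run ends.
            run_start = None
    return longest


def exceeds_sustained_threshold(samples, limit, max_duration=timedelta(hours=8)):
    """True if utilization stayed above `limit` for longer than `max_duration`."""
    return longest_run_above(samples, limit) > max_duration
```

With hourly samples, ten consecutive readings above the limit span nine hours and would be flagged, whereas a five-reading spike would not.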

For customers on Acquia’s legacy infrastructure-based pricing model, Acquia doesn’t notify you about transient Cloud Capacity Unit spikes that exceed your contractual capacity limits. Such spikes must be rare and shouldn’t exceed three hours in duration. However, recurring or excessive spikes in resource utilization above your contractual capacity limits may result in:

  • overage fees
  • requirement that you upgrade your subscription to an appropriate package using Acquia’s Views/Visits pricing model.

In either case, overage fees will be assessed starting from the date of initial notification by Acquia. However, fees may be waived in the event of a subscription upgrade.

Can I limit or manually trigger scaling on my Cloud Next environments?

Environments running on Cloud Next dynamically scale to meet active demand. Hence, they can’t be scaled manually, even upon request. Acquia will provide the ability to limit scaling per-environment in a later release.

Can I run my applications on a dedicated (non-pooled) version of Cloud Next?

Cloud Next leverages pooled capacity to ensure every production environment can utilize as much capacity as it needs, when it needs it. It was also designed with security and compliance in mind, and leverages an architecture that virtually eliminates the need for permanently dedicated capacity to properly protect an environment.

Hence, Cloud Next is currently only available in a pooled configuration. More information about dedicated capacity configurations will be available later this year.

Should I still notify Acquia in advance of high traffic events on my site? Should my sites still get upsized proactively?

As a Cloud Platform customer, you are always advised to contact Acquia Support before any high-traffic event occurs. This information ensures that Acquia’s on-call teams know about the possibility of a spike in legitimate traffic on an environment. This also gives Acquia the context needed to provide assistance, if required.

Environments running on the Cloud Next technologies don’t require proactive capacity increases as such environments dynamically scale as needed. However, Acquia teams may adjust some settings in anticipation of increased traffic to ensure optimal performance throughout the event.

If Cloud Next leverages Kubernetes and containers, can I download and run a local image or provide my own custom builds for Acquia to run?

At this time, Acquia can neither provide build scripts and images for the Cloud Next containers nor run container images and build scripts provided by customers. To get this functionality added to the Cloud Platform roadmap, contact your Acquia account team.

How are Atomic code updates different from traditional code deployments?

On the Cloud Classic infrastructure, code deployments involve a multi-step process:

  1. Once you trigger code deployment, the Cloud Classic web infrastructure performs a git pull operation to pull the new code to a hidden repository folder.
  2. Once that process is complete, the Cloud Classic web infrastructure syncs the changes to the active directory for your docroot.

This process, while typically fast, can result in inconsistent code deployments across multiple web nodes.

In contrast, modern deployment methodologies advocate for atomic updates, a concept Acquia has implemented in Cloud Next with the following sequence:

  1. Once you trigger code deployment, the Cloud Next application layer automatically provisions new pods and containers, and deploys your code and environment settings.
  2. Once the new pods are healthy, traffic is diverted from your old Drupal pods to your new Drupal pods and post-code-deploy Cloud Hooks are triggered.
  3. Once traffic stops routing to your old pods, they will shut down.

This process not only allows for clean and consistent updates to your environment but also ensures that your environments constantly run on fresh and healthy pods. Code deployments on Cloud Next environments take approximately one minute longer, compared to deployments on Cloud Classic environments. This extra time is required to deploy new infrastructure and ensure a clean build.
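The three-step sequence above can be sketched as a toy model. This is not Acquia's implementation or any Kubernetes API: the pod names, the `is_healthy` callback, and the hook signature are all hypothetical, and the behavior on a failed health check (traffic stays on the old pods) is an assumption the source does not spell out.

```python
def atomic_deploy(active_pods, new_version, pod_count, is_healthy, post_deploy_hooks=()):
    """Return the list of pods that should serve traffic after the deploy."""
    # Step 1: provision fresh pods that carry the new code and settings.
    new_pods = [f"drupal-{new_version}-{i}" for i in range(pod_count)]

    # Assumption: if any new pod fails its health check, nothing is switched
    # and the old pods keep serving traffic.
    if not all(is_healthy(pod) for pod in new_pods):
        return active_pods

    # Step 2: traffic moves to the new pods in a single switch, after which
    # the post-code-deploy hooks run.
    for hook in post_deploy_hooks:
        hook(new_version)

    # Step 3: the old pods no longer receive traffic and can shut down;
    # in this model they are simply absent from the returned list.
    return new_pods
```

The point of the model is that visitors are served either entirely by the old pods or entirely by the new ones, which is what avoids the inconsistent state that an in-place sync across several web nodes can produce.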

How is Memcached architected differently on Cloud Next?

On the Cloud Classic infrastructure, Memcached was typically configured to run across all available web nodes with 64 MB of memory allocated to each infrastructure. This isolated a relatively small amount of system memory, pulling it away from PHP, cron, and other application-layer services for the exclusive use of Memcached.

Some customers on the Cloud Classic infrastructure also invested in dedicated Memcached capacity, where some infrastructure was set aside for the exclusive use of Memcached so that two or more nodes could allocate nearly all of their available memory to Memcached. This configuration was typically only available to top-tier accounts and for production environments.

On Cloud Next, Memcached is rearchitected to run in a high-availability configuration on both production and non-production environments. This not only results in improved performance on non-production environments but also expands access to a more performant Memcached configuration to all customers by default.

For environments running on Cloud Next, Drupal pods connect to Memcached pods through a relay point running mcrouter for improved performance and resiliency. Owing to this optimization, the mcstat command won’t function during SSH sessions, and the Memcache module is also affected. After using SSH to access an environment, use acquia-memcache stats to access Memcached data.

The Cloud Platform team is actively researching potential mid-term solutions to both limitations to improve customer visibility into the health and performance of Memcached on your environments.
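If you capture the output of acquia-memcache stats for monitoring, a small parser can turn it into numbers. The sketch below assumes the output follows the standard Memcached stats text format (`STAT <name> <value>` lines); the source does not document the command's output, so verify the format on your environment before relying on this. The function names are hypothetical.

```python
def parse_memcached_stats(text):
    """Parse 'STAT <name> <value>' lines into a dict of strings.

    Lines that do not match that shape (for example a trailing 'END')
    are ignored.
    """
    stats = {}
    for line in text.splitlines():
        parts = line.strip().split(None, 2)
        if len(parts) == 3 and parts[0] == "STAT":
            stats[parts[1]] = parts[2]
    return stats


def hit_ratio(stats):
    """Return get_hits / (get_hits + get_misses), or None if there were no gets."""
    hits = int(stats.get("get_hits", 0))
    misses = int(stats.get("get_misses", 0))
    total = hits + misses
    return hits / total if total else None
```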

How does logging behave differently on Cloud Next?

On Acquia’s Cloud Classic infrastructure, you have access to the /var/log/sites directory to view environment log data and store custom logs. On average, these logs remain available for weeks or months at a time, but would be lost whenever infrastructure was relaunched during maintenance events, sometimes limiting the historical data available.

Since Cloud Next leverages ephemeral containers and pods, the same methodology is not available for upgraded environments. Hence, you must configure your custom logs to be stored in an alternative location: /shared/logs. You must also access historical logs for your environments using an alternative method, such as Acquia CLI.
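For a custom script or cron job that writes its own log file, pointing the log handler at /shared/logs could look like the sketch below. Only the /shared/logs directory comes from this page; the helper name, the file name custom-app.log, and the log format are hypothetical choices for illustration.

```python
import logging
import os


def build_custom_logger(name, log_dir="/shared/logs", filename="custom-app.log"):
    """Return a logger that appends to <log_dir>/<filename>.

    The default directory is the location named in this page; the file
    name is a placeholder.
    """
    os.makedirs(log_dir, exist_ok=True)
    path = os.path.abspath(os.path.join(log_dir, filename))

    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)

    # Avoid attaching a second handler for the same file on repeated calls.
    already_attached = any(
        isinstance(handler, logging.FileHandler) and handler.baseFilename == path
        for handler in logger.handlers
    )
    if not already_attached:
        handler = logging.FileHandler(path)
        handler.setFormatter(logging.Formatter("%(asctime)s %(levelname)s %(message)s"))
        logger.addHandler(handler)
    return logger
```

A script would then call `build_custom_logger("nightly_import").info("started")` instead of writing to /var/log/sites.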

For this reason, Acquia has updated the Download Logs functionality in the Cloud Platform user interface and Cloud API v2. Prior to this update, you could only download logs for the current day. With this update, you can select a day or time period within the past 30 days, and each download can cover up to 24 hours of logs. This enhancement not only makes log access easier but also ensures that your logs persist even after maintenance events.
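When a script needs logs for a period longer than one day, the 24-hour and 30-day limits above suggest splitting the period into separate requests. The sketch below only computes the time windows; it does not call the Cloud API, and the function name and the strict validation are assumptions made for illustration.

```python
from datetime import datetime, timedelta

MAX_WINDOW = timedelta(hours=24)   # longest span per download, per this page
RETENTION = timedelta(days=30)     # how far back logs can be selected


def download_windows(start, end, now):
    """Split [start, end) into consecutive windows of at most 24 hours."""
    if end <= start:
        raise ValueError("end must be after start")
    if start < now - RETENTION:
        raise ValueError("start is older than the 30-day window")
    if end > now:
        raise ValueError("end is in the future")

    windows = []
    cursor = start
    while cursor < end:
        stop = min(cursor + MAX_WINDOW, end)
        windows.append((cursor, stop))
        cursor = stop
    return windows
```

Each returned (start, stop) pair can then be passed to whichever download method you use.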

How do Cron, SSH, and Cloud Hooks behave differently on Cloud Next?

On the Cloud Classic infrastructure, cron jobs, SSH, and Cloud Hooks run on the same infrastructure. Hence, they might compete with other processes, such as PHP or MySQL, for system resources. You can also SSH directly to a specific web node and purchase dedicated cron capacity to ensure that a single node is always available and isolated for the sole use of frequent or resource-intensive cron processes.

On Cloud Next, the existing Cloud Classic functionality is enhanced. By default, all production and non-production environments have access to isolated cron, SSH, and Cloud Hooks pods that run only when needed and shut down whenever they are not. Thus, your environments are more efficient by default and utilize fewer Cloud Capacity Units when idle because excess pods aren’t left running. Additionally, cron, SSH, and Cloud Hook activity can’t interfere with Drupal, Memcached, or MySQL activity by competing for resources.

On Cloud Next, long-running or stuck processes, such as processes running cron or Cloud Hooks, can’t be terminated manually. Contact Acquia Support for assistance if this happens in one of your environments.

Environments upgraded to Cloud Next may require adjustments to scripts, cron jobs, and other automations due to the updated location of several file directories. For more information, see Working with files.