Information for: DEVELOPERS   PARTNERS   SUPPORT

Unloading Query Results to Amazon S3

When you need to access more than 10k rows of data or set up some post-processing of query results, you can unload query results to Amazon S3 buckets.

This workflow involves the following steps:

  1. Setting up an Amazon S3 Bucket
  2. Unloading query results to Amazon S3
  3. Accessing query results on Amazon S3

Setting up an Amazon S3 Bucket

This is a one-time activity with the following options:

Unloading query results to a self-managed Amazon S3 bucket

Acquia recommends that you use your own Amazon S3 bucket to have full control over how long the query results are retained in the bucket.

To set up your S3 bucket:

  1. Create a new IAM user in your AWS account for CDP. Give the user a descriptive name to easily identify the user in the future.
  2. Create a S3 bucket with the following properties:
    • Bucket Name: Specify a descriptive name to differentiate the bucket from other buckets in your AWS account.
    • Region: Specify the region for your buckets. US clients must create buckets in the US East (North Virginia) aka us-east-1 AWS region. European clients must create their buckets in the EU (Ireland) aka eu-west-1 AWS region.
  3. Define additional properties for the S3 bucket. For example, versioning, logging. Such parameters are not required and can be disabled.
  4. Ensure that the previously-created IAM user has full read and write permissions in the newly created S3 bucket.
  5. Ensure that the S3 bucket is not publicly accessible.
  6. Send the following details to Acquia Support:
    • AWS access key ID for the IAM user
    • AWS secret access key for the IAM user
    • AWS bucket name
    • AWS region where your S3 bucket is located
    • (Optional) The name of the folder where you want to drop your files

After you set up your S3 bucket, Acquia Support sends you an S3 stage name that you can leverage to unload your query results. For example, A1SF_EXTSTAGE_101_123 where 123 is the tenant ID.

If you have any issues while setting up your Amazon S3 bucket, see AWS documentation.

Unloading query results to a CDP-managed Amazon S3 bucket

To unload your query results to a CDP-managed Amazon S3 bucket, you can contact Acquia Support and request for a bucket. If Acquia sets up your bucket, you get read-only access to the files. The unloaded query results are retained in the S3 bucket for seven days.

After Acquia sets up the bucket, Acquia Support sends you an S3 stage name that you can leverage to unload your query results. For example, A1SF_EXTSTAGE_101_123 where 123 is the tenant ID.

Unloading query results to Amazon S3

Run the following query in the Interactive Queries Web user interface:

COPY INTO @S3_stage_name[/folder_path][/file_name][.csv.gz]
-- [.csv.gz] should only be used when SINGLE = TRUE option is used
FROM
(
<your sql query goes here>
)
SINGLE = TRUE -- Specifies whether to generate a single file or multiple files
MAX_FILE_SIZE = 100000000 -- Number that specifies the maximum size (in bytes) of each file to be generated
HEADER = TRUE -- Specifies whether to include the table column headings in the output files
OVERWRITE = TRUE -- Specifies whether the COPY command overwrites existing files, if any, in the location where files are stored

Accessing query results on Amazon S3

Acquia recommends that you use Cyberduck.

To access query results on Amazon S3:

  1. Access your tool. For example, Cyberduck.
  2. Open a connection to Amazon S3 by specifying the following details:
    • Server: [bucket-name].s3.amazonaws.com
    • Port: 443
    • Access Key ID: Obtained from the earlier step
    • Secret Access Key: Obtained from the earlier step