---
title: "Indexing attachments with Apache Solr Search module"
date: "2024-02-14T06:18:38+00:00"
summary: "Enhance your Drupal site's search functionality by indexing file attachments with Apache Solr Search. Learn how to configure and optimize attachment indexing for improved content discoverability and user experience."
image:
type: "page"
url: "/acquia-cloud-platform/indexing-attachments-apache-solr-search-module"
id: "bfe3876f-5ec8-42d9-aa80-d41151571fdc"
---

In addition to Drupal nodes, the Acquia Search results presented to your users can include files that contain matches to the search query. File formats that can be indexed include HTML, XML, Microsoft Office documents, OpenDocument, PDF, RTF, `.zip` and other compression formats, text formats, audio formats, video formats, and more. For a complete list of supported document formats, see the [Apache Tika documentation](http://tika.apache.org).

Search results for attached files display a direct link to the attached file and the node to which it is attached:

![Search results with attachments](https://acquia.widen.net/content/xazcfrvuat/jpeg/search_search-results.jpeg?position=c&color=ffffffff&quality=80&u=soxprt)

Searching file attachments requires the Apache Solr Attachments module. The module uses the [Apache Tika Content Analysis Toolkit](http://tika.apache.org) to detect and extract metadata and structured text content from a wide variety of file formats. After extraction, this information is indexed and available to your users through Acquia Search.

Note

Ensure that the most recent versions of your [modules](/acquia-cloud-platform/features/acquia-search/getting-started/modules) are in use. If you use older versions, you may experience problems with extracting attachments.

Installing the Apache Solr Attachments module
---------------------------------------------

To index attachments:

1.  Install and enable the [Apache Solr Attachments 7.x-1.x-dev](https://www.drupal.org/project/apachesolr_attachments) module on your website. You must use this version: it is currently the only version compatible with the Acquia Search Platform.
2.  Go to the **Apache Solr search** > **Configuration** > **Default Index** page.
3.  In the **Configuration** section, select the **File** checkbox as an entity to be indexed.
4.  Click **Save**.

![Search index configuration](https://acquia.widen.net/content/nsqq0fpvla/jpeg/search_index-configuration.jpeg?position=c&color=ffffffff&quality=80&u=soxprt)

The Apache Solr Search configuration page now displays the Attachments tab. Use this tab to configure file attachment indexing.

Note

The Apache Solr Attachments module is compatible only with Apache Solr Search. If you are using the Search API module, you can use the [Search API attachments module](https://www.drupal.org/project/search_api_attachments) instead. [Configure it](#search-config-attachment-settings) to work with Acquia Search. Be sure to use version 7.x-1.4 or later; earlier versions of the Search API attachments module are not compatible with Acquia Search.

Configuring file attachment index settings
------------------------------------------

The Attachments tab of the Apache Solr Search configuration page contains the following configuration options for indexing attachments in Apache Solr Search:

![Apache Solr Search attachments configuration](https://acquia.widen.net/content/vnbwngruov/jpeg/search_apache-solr-search-attachment.jpeg?position=c&color=ffffffff&quality=80&u=soxprt)

| No. | Item | Description |
| --- | --- | --- |
| 1 | **Excluded file extensions** | A space-separated list of file extensions that are excluded from indexing. Modify this list to suit the needs of your site. Extensions are internally mapped to a MIME type, so it is not necessary to include variations that map to the same type. For example, `tif` is sufficient to exclude both the `tif` and `tiff` file extensions. |
| 2 | **Extract using** | Acquia Search includes Apache Tika for indexing documents. For best performance, select **Solr (remote server)**. |
| 3 | **File Size Limit** | If a file is larger than this limit, do not index it. Default is 41943040 bytes (40 MB). |
| 4 | **Tika directory** | Leave this blank. |
| 5 | **Tika jar file** | Leave this set to the default value. |
| 6 | **Tika extracting servlet path** | `update/extract` |
| 7 | **Tika extracting servlet parameters** | `extractOnly=true` |
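
The interaction between the **Excluded file extensions** and **File Size Limit** settings can be illustrated with a short sketch. This is not the module's own code (the module is written in PHP); it is a minimal Python illustration that assumes a hypothetical exclusion list of `tif zip`, uses Python's standard `mimetypes` table as a stand-in for the module's internal extension-to-MIME mapping, and uses hypothetical function names.

```python
import mimetypes

# Assumed value for illustration; the real list comes from the
# "Excluded file extensions" setting on your site.
EXCLUDED_EXTENSIONS = "tif zip"

# Documented default limit: 40 * 1024 * 1024 = 41943040 bytes.
FILE_SIZE_LIMIT = 40 * 1024 * 1024


def excluded_mime_types(extensions: str) -> set:
    """Map a space-separated extension list to the MIME types it covers."""
    types = set()
    for ext in extensions.split():
        mime, _ = mimetypes.guess_type(f"file.{ext}")
        if mime:
            types.add(mime)
    return types


def should_index(filename: str, size_in_bytes: int) -> bool:
    """Return True if the file passes both the MIME-type and size checks."""
    mime, _ = mimetypes.guess_type(filename)
    if mime in excluded_mime_types(EXCLUDED_EXTENSIONS):
        return False
    # Only files larger than the limit are skipped.
    return size_in_bytes <= FILE_SIZE_LIMIT
```

Because the comparison happens on MIME types rather than raw extensions, `should_index("scan.tiff", 1000)` is rejected in this sketch even though only `tif` appears in the assumed exclusion list.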

### Index and cache controls

File attachments get indexed at the same time as their parent entities. Under **Actions**, you can:

| Action | Description |
| --- | --- |
| **Clear the attachment text extraction cache** | Clears all extracted data. |
| **Delete the attachments from index** | Deletes all the attached files on your site from the Acquia Search index. You need to do this if you change what types of files should be indexed, if your search index becomes corrupted, or if you install a new `schema.xml`. |
| **Test your tika extraction** | Tests if your Tika configuration settings work. |
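
To make the **Tika extracting servlet path** and **Tika extracting servlet parameters** settings more concrete, the following sketch shows how the two values could combine into a single request URL. The base URL `https://solr.example.com/solr/my-core/` and the function name are hypothetical placeholders, and the sketch makes no claim about how the module itself builds or sends its requests. In Solr's extracting request handler, `extractOnly=true` asks for the extracted content to be returned without indexing the document.

```python
from urllib.parse import urljoin

# Hypothetical Solr index URL; substitute the one for your own index.
SOLR_BASE_URL = "https://solr.example.com/solr/my-core/"

SERVLET_PATH = "update/extract"      # "Tika extracting servlet path" setting
SERVLET_PARAMS = "extractOnly=true"  # "Tika extracting servlet parameters" setting


def extraction_url(base_url: str, path: str, params: str) -> str:
    """Combine the base URL, servlet path, and raw parameter string."""
    return f"{urljoin(base_url, path)}?{params}"


print(extraction_url(SOLR_BASE_URL, SERVLET_PATH, SERVLET_PARAMS))
```

Note that `urljoin` appends the path only when the base URL ends with a trailing slash; without it, the last path segment of the base URL would be replaced.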