
Selecting a language

Acquia Search can index websites in most languages. During indexing, it passes content through a language-specific stemmer, which is determined by the search schema that you configured for your website.

Using the Search API Solr Drupal module

To use a language-specific Solr configset, use the built-in functionality in the Search API Solr Drupal module to generate a new configset optimized for one or more languages. You can then deploy it by following Custom Solr configuration with Solr 3/4 or Custom Solr configuration with Solr 7.

The Search API Solr module runs a check to compare the configuration in Drupal and Solr, and shows an error message if it detects a mismatch.

If you’re using an outdated version of Drupal and Solr, the module displays an error message such as "You are using outdated Solr configuration set." For more information, see Troubleshooting the Search API Solr message.

Acquia Search with non-Latin languages

Although the default schema of Acquia Search is optimized for English searches, Acquia also provides schemas for other languages. Most Latin-based languages share grammatical and linguistic similarities, so you can use their schemas in a straightforward manner.

Non-Latin languages, such as Chinese, Japanese, and Korean (CJK), use different stemming and spacing rules compared to Latin-based languages. The Solr / Lucene search engine must handle non-Latin languages differently from Latin-based languages.
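The spacing difference can be illustrated with a short Python sketch. It is not part of Acquia Search or Solr, and the sample strings are assumptions chosen only for illustration. It shows that splitting on whitespace separates an English sentence into words but leaves a Chinese sentence as a single unsearchable token:

```python
# Illustration only: whitespace splitting stands in for a naive tokenizer.
# The sample strings are hypothetical and not taken from Acquia documentation.
english = "search engines index documents"
chinese = "搜索引擎索引文档"  # written without spaces between words

english_tokens = english.split()
chinese_tokens = chinese.split()

print(len(english_tokens), english_tokens)
print(len(chinese_tokens), chinese_tokens)
```

Because the Chinese sentence contains no spaces, the whole sentence comes back as one token, which is why CJK content needs a dedicated analysis chain.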

For instructions on how to add support for the CJK languages, see Custom Solr configuration with Solr 3/4 or Custom Solr configuration with Solr 7.

Important

EOL notice! Acquia Search with Solr 3/4 will reach end-of-life on Tuesday, March 1, 2022. For more information, see Frequently Asked Questions (FAQ) regarding End of Support for Acquia Search with SOLR 3/4.

Acquia Search with Solr 7 uses CJKBigramFilterFactory.

For more information, see the following section.

The SmartChineseSentenceTokenizerFactory and SmartChineseWordTokenFilterFactory classes are available if your Acquia Search index uses Solr 3.5.

If your Acquia Search index uses Solr 4.5.1, use CJKTokenizerFactory instead.

Acquia recommends that you first test your configuration by following the instructions in Testing your custom Solr configset with Solr 7, and then apply it to your Solr instance.
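To give a rough idea of what bigram tokenization of the kind performed by CJKBigramFilterFactory produces, the following Python sketch splits a run of CJK characters into overlapping two-character tokens. It is a simplified illustration, not Solr's implementation: the function name cjk_bigrams is hypothetical, and the sketch assumes the input is already a single run of CJK characters with no script detection or width normalization.

```python
from typing import List


def cjk_bigrams(text: str) -> List[str]:
    """Return overlapping two-character tokens for a run of CJK characters.

    Hypothetical helper for illustration; assumes `text` contains only
    CJK characters. A lone character is returned as a single token.
    """
    if len(text) < 2:
        return [text] if text else []
    # Slide a two-character window across the text, one character at a time.
    return [text[i:i + 2] for i in range(len(text) - 1)]


tokens = cjk_bigrams("搜索引擎")
print(tokens)
```

Indexing overlapping pairs lets a query match any two adjacent characters without requiring a dictionary to find word boundaries, at the cost of a larger index and some false matches.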

More information

For more background information about indexing and searching CJK languages, see the following resources: