Acquia Search with non-Latin languages

Although the default Acquia Search schema is optimized for English searches, Acquia also provides schemas for other languages. For more information, see Selecting a language. Because most Latin-based languages share grammatical and vocabulary similarities, these schemas can be used in a straightforward fashion.

Non-Latin languages, such as Chinese, Japanese, and Korean (CJK), follow different stemming and word-spacing rules, so the Solr / Lucene search engine must handle them differently than Latin-based languages.

For instructions on how to add support for the CJK languages, see Custom Solr configuration.


The SmartChineseSentenceTokenizerFactory and SmartChineseWordTokenFilterFactory classes are available if your Acquia Search core uses Solr 3.5. If your Acquia Search core uses Solr 4.5.1, use CJKTokenizerFactory instead.
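
As a rough illustration, the classes named above could be wired into field types in a custom schema.xml along the following lines. This is a minimal sketch, not Acquia's shipped configuration: the field type names text_zh and text_cjk and the positionIncrementGap value are assumptions chosen for the example, and only the field type that matches your core's Solr version should be included.

```xml
<!-- Hypothetical field type names; adapt them to your own schema.xml. -->

<!-- Solr 3.5 cores: Smart Chinese sentence and word segmentation -->
<fieldType name="text_zh" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <!-- Splits the input into sentences -->
    <tokenizer class="solr.SmartChineseSentenceTokenizerFactory"/>
    <!-- Breaks each sentence into words -->
    <filter class="solr.SmartChineseWordTokenFilterFactory"/>
  </analyzer>
</fieldType>

<!-- Solr 4.5.1 cores: CJK tokenization, which indexes overlapping character bigrams -->
<fieldType name="text_cjk" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.CJKTokenizerFactory"/>
  </analyzer>
</fieldType>
```

A field that holds CJK content would then reference one of these types through its type attribute. Test the analysis chain locally against sample text before submitting the files for review.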

After you have developed and tested your configuration changes, contact Acquia support, and we will review and deploy the changed files to your Acquia Search index.

More information

For more background information about indexing and searching CJK languages, see the following resources: