apachesolr

CJK (Chinese, Japanese and Korean) Language support and Acquia Search

By default, the Acquia Solr Search schema is optimized for English searches. Other Latin-based languages (e.g. Italian) will work because of similarities in grammar and vocabulary, but Chinese, Japanese and Korean (CJK) languages follow different stemming and spacing rules than Latin-based languages, and the Solr/Lucene search engine handles them differently.

These references provide helpful background information about searching CJK languages, and they are intended as a resource as you consider adding CJK support to the Acquia Solr Search index for your site(s).

Release notes - Acquia Search

Looking for the latest and greatest new features and changes to Acquia Search? Read on and check back regularly to see what else we’ve done.

Acquia Search - 1 June 2015

Mon, 1 Jun 2015

Acquia Search was updated to fix the following issue:

Fixed issue

  • Acquia Search did not display the search configuration gear to some users, even though they had the appropriate permissions to view the schema collection tool. (NN-11964)

Acquia Search - 4 August 2014

Mon, 4 Aug 2014

  • nginx upgraded to version 1.4

    The version of nginx used by Acquia Search has been upgraded to nginx 1.4. This version of nginx provides better performance, as well as many significant bug fixes.

Acquia Search - 10 April 2014

Thu, 10 Apr 2014

Acquia Search was updated on April 10 with the following new features and enhancements:

  • Updated version of the Acquia Search module for Drupal 7

    The latest version of the Acquia Search module for Drupal 7, which is included in version 7.x-2.14 of the Acquia Network Connector module, fixes an issue that could result in failed search indexing. For more information, see the release notes for Acquia Network Connector 7.x-2.14 on drupal.org.

Acquia Search - 18 March 2014

Tue, 18 Mar 2014

Acquia Search will be upgraded to include new libraries to support search schema customizations for Chinese, Japanese, and Korean. These new libraries are available only to Acquia Search customers who are eligible to deploy custom schema.xml files, that is, Acquia Network Enterprise and Acquia Cloud Enterprise customers. For more information, see Custom Solr configuration.

No downtime is expected for customer sites or search functionality. However, during the upgrade, search indexes will be in read-only mode and therefore will not be updated with new or changed content for a period of 1 to 10 minutes.

The upgrade adds a set of libraries that are useful for customizing the Apache Solr schema.xml file for Chinese, Japanese, and Korean. By default, Solr was built with Western languages in mind. However, in some other languages, such as Chinese or Japanese, words are not necessarily separated by spaces. Another important difference is that some characters in these languages look the same but mean something very different. Because some characters carry multiple meanings, searches should return appropriate results for any of those meanings. To learn more, read a series of blog posts at Discovery Grindstone.
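To illustrate why whitespace splitting is not enough for CJK text, here is a minimal Python sketch. It compares a naive whitespace tokenizer with overlapping two-character tokens (bigrams), one common way of indexing CJK text in Lucene-based search engines. This is an illustration only, not the behavior of the libraries added to Acquia Search: the function names are hypothetical, the character ranges are a rough approximation, and non-CJK text is simply skipped.

```python
def whitespace_tokens(text):
    """Split on whitespace, as a naive Western-language tokenizer would."""
    return text.split()


def is_cjk(ch):
    """Rough check (an assumption for this sketch) for common CJK characters."""
    code = ord(ch)
    return (
        0x4E00 <= code <= 0x9FFF      # CJK Unified Ideographs
        or 0x3040 <= code <= 0x30FF   # Hiragana and Katakana
        or 0xAC00 <= code <= 0xD7A3   # Hangul syllables
    )


def cjk_bigrams(text):
    """Emit overlapping two-character tokens for each run of CJK characters.

    A run of length one is emitted as a single-character token.
    Non-CJK characters only act as separators and are otherwise ignored.
    """
    tokens = []
    run = []
    for ch in text + " ":  # trailing space flushes the final run
        if is_cjk(ch):
            run.append(ch)
            continue
        if len(run) == 1:
            tokens.append(run[0])
        else:
            tokens.extend(run[i] + run[i + 1] for i in range(len(run) - 1))
        run = []
    return tokens


# The whole phrase comes back as one token because it contains no spaces.
print(whitespace_tokens("東京都に住む"))
# Bigrams make substrings searchable, at the cost of some false matches.
print(cjk_bigrams("東京都"))
```

In this sketch, the bigrams for 東京都 (Tokyo Metropolis) are 東京 (Tokyo) and 京都 (Kyoto). A search for either one would match the text, even though Kyoto is an unrelated city. This is the kind of ambiguity that a customized schema for CJK content needs to account for.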

The classes being added to Acquia Search are:

Learn more about ICU (International Components for Unicode).

Acquia Search - 11 March 2014

Tue, 11 Mar 2014

If you have more than one search index linked to your subscription, the Acquia Search page at https://insight.acquia.com/acquia-search now shows combined statistics for all of the search indexes in your subscription. Previously, it showed only the size of your first index and ignored the others. Therefore, if you have multiple indexes, your usage graphs will show an increase, because they now include the sums of the searches, documents, and storage usage of all your indexes. For more information, see Viewing search usage and statistics.
