Cloud Platform

Sharing Solr indexes in read/write mode

Allowing multiple websites to write to a single Solr instance has benefits, but it also carries inherent risks. Potential problems include accidental Solr index pollution (such as a development website writing to a production index) and accidental deletion of the wrong data when you try to remove a non-production index.

The following methods can help you share Solr indexes safely, so that your website can continue to serve its users:

  • Consider using read-only mode
    Ensure that your application actually needs to write to a Solr index; in some cases, reading is sufficient. For example, if your development and production websites have the same content, you can use the production index in read-only mode for search.

  • Ensure correct implementation of Solr hashes
    Solr-related Drupal modules (including Apache Solr Multisite Search and recent versions of Search API Solr Search) should tag all indexed items with a unique string ID called a hash. This hash value should be unique to each website that writes to the same Solr index, which allows you to logically share a single Solr index across websites. All queries, indexing, and delete operations are limited to items that have this hash value. Problems can occur, however, when you move a Drupal database. Hash values are created when the module is installed and are stored as configuration data in Drupal’s database, so moving a database to another environment can also move the configuration data (and by extension, the hash values). The two websites then share a hash value, and their items are no longer kept separate in the index. To resolve this, we recommend that you override the hash value in code. To review your hash values, examine the following items:

  • Use tools to understand your index
    Tools such as Solr Devel (for Drupal 7) enable you to both examine outgoing queries and understand the data in your index. For information about using Solr Devel, see Using Solr Devel module to debug indexing and queries.

  • Be prepared to delete data and reindex
    If a single Solr index is polluted with data from unintended sources, it can be difficult to fix. This is because Drupal stores its own tracking information in its database about what it believes it has already indexed, separately from the actual Solr data. To recover from this issue, and to help prevent future pollution, complete the following steps:

    1. Ensure that you close any connections from unwanted websites writing to the target Solr index.

    2. Completely wipe data from the target Solr index.

    3. Reindex all of the data that needs to be written to this Solr index.
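
The hash scoping and the recovery steps above can be illustrated with a small sketch. It only builds the request parameters and bodies that a client might send to Solr, and it does not contact any server. The field name `hash`, the sample hash values, and all function names are assumptions made for illustration; check your module's schema and configuration for the actual field and values.

```python
import json
from urllib.parse import urlencode

# Assumption: the module stores each website's hash in a field named "hash".
HASH_FIELD = "hash"


def hash_filter(site_hash: str) -> str:
    """Build a Solr query clause that matches only one website's items."""
    escaped = site_hash.replace("\\", "\\\\").replace('"', '\\"')
    return f'{HASH_FIELD}:"{escaped}"'


def scoped_delete_body(site_hash: str) -> str:
    """JSON update body that deletes only the items tagged with site_hash."""
    return json.dumps({"delete": {"query": hash_filter(site_hash)}})


def wipe_all_body() -> str:
    """JSON update body that deletes every document in the index (step 2)."""
    return json.dumps({"delete": {"query": "*:*"}})


def hash_facet_params() -> str:
    """Query string that asks Solr how many items each hash value has."""
    return urlencode({
        "q": "*:*",
        "rows": 0,
        "facet": "true",
        "facet.field": HASH_FIELD,
        "wt": "json",
    })


def unexpected_hashes(facet_response: dict, expected: set) -> dict:
    """Return {hash: count} for hash values that should not be in the index.

    Solr's default JSON output lists facet values as a flat
    [value, count, value, count, ...] array.
    """
    flat = facet_response["facet_counts"]["facet_fields"][HASH_FIELD]
    counts = dict(zip(flat[0::2], flat[1::2]))
    return {h: n for h, n in counts.items() if n > 0 and h not in expected}


if __name__ == "__main__":
    # Hypothetical facet response: a development website ("dev002") has
    # written 7 items into an index that should only hold "prod01" items.
    sample = {"facet_counts": {"facet_fields": {HASH_FIELD: ["prod01", 120, "dev002", 7]}}}
    print(unexpected_hashes(sample, {"prod01"}))
    print(scoped_delete_body("dev002"))
```

A facet query such as the one built by `hash_facet_params()` is one way to spot pollution before you decide whether a full wipe is necessary. After a full wipe, remember that Drupal's own tracking information must also be reset so that all of the content is queued for reindexing.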