Posted On: Apr 21, 2020

Amazon Elasticsearch Service now supports adding custom dictionary files to your domains. You can specify synonym, stop word, and segmentation files to improve your indexing, matching, and search relevancy. Previously, you could only include these types of customizations directly in your mapping, which could make them unwieldy and difficult to manage.

Synonyms provide a means of expanding matches along similar concepts. For example, you can specify the synonym “one -> 1” so that queries and documents containing either form match each other. Stop words are common, low-value terms like “a,” “an,” “and,” and “the” that do not contribute positively to matching or relevance. These words are removed from indexes and queries. A custom segmentation dictionary is particularly important for indexing free text in Asian languages and German. These languages have compound terms or characters that can mean different things depending on context and how they are split. A segmentation dictionary gives you tight control over this decomposition.
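To make the effect of synonyms and stop words concrete, here is a minimal Python sketch of the idea. It is a toy illustration, not how Elasticsearch implements analysis: the `SYNONYMS` and `STOP_WORDS` values, and the `analyze` and `matches` helpers, are hypothetical names introduced only for this example.

```python
import re

# Hypothetical sample dictionaries; real ones would live in your dictionary files.
SYNONYMS = {"one": "1"}                  # corresponds to the rule "one -> 1"
STOP_WORDS = {"a", "an", "and", "the"}   # low-value terms to drop


def analyze(text):
    """Lowercase, split on non-alphanumerics, drop stop words, map synonyms."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    kept = [t for t in tokens if t not in STOP_WORDS]
    return [SYNONYMS.get(t, t) for t in kept]


def matches(query, document):
    """True when every analyzed query token appears in the analyzed document."""
    return set(analyze(query)) <= set(analyze(document))
```

Because the same analysis is applied to both sides, `matches("one ring", "The 1 ring")` returns `True`: the stop word “the” is dropped and “one” is normalized to “1” before comparison.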

With support for custom dictionaries, Amazon Elasticsearch Service can now import your dictionary files from Amazon S3 and make them available to associate with your Amazon Elasticsearch Service domains as needed. Custom dictionary support is available for all versions of Elasticsearch on Amazon Elasticsearch Service. To learn more, see the documentation.
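Once a dictionary file has been imported and associated with a domain, an index can reference it from its analysis settings. The following Python sketch builds such a settings body. It assumes the file is referenced through the standard Elasticsearch `synonyms_path` option using a path of the form `analyzers/<package ID>`; check the service documentation for the exact convention. The helper name `synonym_index_settings`, the filter and analyzer names, and the package ID used below are hypothetical placeholders.

```python
def synonym_index_settings(package_id):
    """Build an index-creation body whose synonym filter reads a dictionary file.

    Assumption: the associated file is addressed as "analyzers/<package_id>".
    """
    return {
        "settings": {
            "index": {
                "analysis": {
                    "filter": {
                        # Hypothetical filter name; "synonym" is the built-in filter type.
                        "my_synonyms": {
                            "type": "synonym",
                            "synonyms_path": f"analyzers/{package_id}",
                        }
                    },
                    "analyzer": {
                        # Hypothetical analyzer name that applies the synonym filter.
                        "my_analyzer": {
                            "type": "custom",
                            "tokenizer": "standard",
                            "filter": ["lowercase", "my_synonyms"],
                        }
                    },
                }
            }
        }
    }
```

For example, `synonym_index_settings("F111111111")` produces a body you could send when creating an index, keeping the synonym list itself out of the index definition.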

Custom dictionary support is now available for Amazon Elasticsearch Service domains in 21 regions globally: US East (N. Virginia, Ohio), US West (Oregon, N. California), Amazon Web Services GovCloud (US-Gov-East, US-Gov-West), Canada (Central), South America (Sao Paulo), EU (Ireland, London, Frankfurt, Paris, Stockholm), Asia Pacific (Singapore, Sydney, Tokyo, Seoul, Mumbai, Hong Kong), the China (Beijing) region, operated by Sinnet, and the China (Ningxia) region, operated by NWCD. Please refer to the Amazon Web Services Region Table for more information about Amazon Elasticsearch Service availability.