Configuring FTS for localization
The ARSystemInstallDir\ftsconfiguration directory contains the files described in the following table that you can use to configure FTS for localization:
Configuration files for FTS
Configuration file | Description |
|---|---|
stopwords_locale.txt | Specifies stop words used for eliminating common words. Each stop word is a separate line item in the text file. You create a file for each locale or language. You can update this file through the Ignore Words List field on the AR System Administration: Server Information form. |
rootwords_locale.txt | Lists words and their corresponding root word. You create a file for each locale or language. By default, FTS uses stemmers particular to the installed locale. A stemmer takes words, such as fast, faster, and fastest, and converts them to stem words at indexing time so that using a word, such as fast, finds all references to it, such as faster and fastest. The rootwords_locale.txt file overrides how the FTS or third-party stemmers define root words. If a word is found in the root words list, then the root word is used, and the stemmer is not run on the original word. Each line in the rootwords_locale.txt file contains space-separated words with the first word being the root word and the others being words that map to the root word, for example: run running runs |
rootcharacters_locale.txt | Lists characters and their corresponding root character. The rootcharacters.txt file uses encoding for characters. All characters are separated by a space, and the first character is the replacement for the characters that follow. For example, the entry for A is: A = À Á Â Ã Ä Å Ā Ă Ą Ə Ǎ Ǟ Ǡ Ǻ Ȁ Ȃ Ȧ Ⱥ ᴀ Ḁ Ạ Ả Ấ Ầ Ẩ Ẫ Ậ Ắ Ằ Ẳ Ẵ Ặ Ⓐ A The entry for a is: a = à á â ã ä å ā ă ą ǎ ǟ ǡ ǻ ȁ ȃ ȧ ɐ ə ɚ ᶏ ᶕ ḁ ẚ ạ ả ấ ầ ẩ ẫ ậ ắ ằ ẳ ẵ ặ ₐ ₔ ⓐ ⱥ Ɐ a You can modify the FTS locale configuration file, FTSLocaleConfig.xml, to contain a subset of characters. You can also add characters as needed. In an on-premises environment, you can modify the FTS locale configuration file (FTSLocaleConfig.xml) by using configmaps. For more information about configmaps, see Using ConfigMaps to access the configuration files. Perform the following steps to modify the rootcharacters.txt file located in installDirectory\ftsconfiguration\conf\: Important: Make sure that you have the appropriate version of the FTS plug-in that works with accented characters. |
thesaurus_locale.txt | Contains synonyms used to perform thesaurus expansion during indexing. You create a file for each locale or language. If the thesaurus_locale.txt file is present, any terms that FTS finds in the thesaurus are expanded within the index to contain their synonym values at the same word location. Each line in the text file contains space-separated words that are synonyms, for example: quick fast speedy |
If you modify any of the FTS configuration files, you must restart the server for the changes to take effect.
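The line formats described in the table can be illustrated with a short Python sketch. It is only a reading aid for the file layouts, not how FTS itself loads these files. The function names (parse_root_words, parse_root_characters, fold) are hypothetical, and the sample lines are shortened versions of the examples in the table.

```python
def parse_root_words(lines):
    """Map each variant to its root word (the first word on the line)."""
    mapping = {}
    for line in lines:
        words = line.split()
        if len(words) < 2:
            continue  # skip blank lines and lines that list no variants
        root = words[0]
        for variant in words[1:]:
            mapping[variant] = root
    return mapping


def parse_root_characters(lines):
    """Map each character to its root character, given 'A = À Á Â' lines."""
    mapping = {}
    for line in lines:
        root, separator, rest = line.partition("=")
        if not separator:
            continue  # skip lines without the '=' separator
        root = root.strip()
        for character in rest.split():
            mapping[character] = root
    return mapping


def fold(text, char_map):
    """Replace every mapped character in text with its root character."""
    return "".join(char_map.get(ch, ch) for ch in text)


# Shortened sample lines (assumed content, based on the table examples).
root_words = parse_root_words(["run running runs"])
root_chars = parse_root_characters(["A = À Á Â", "a = à á â"])

print(root_words["running"])      # root word for "running"
print(fold("Ámbar", root_chars))  # accented characters folded to roots
```

With these sample lines, running and runs both map to run, and the accented Á folds to A.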
The FTSLocaleConfig.xml file, in ARSystemInstallDir\ftsconfiguration, contains pointers to the configuration files. For example:
<locale locale="en">
  <stopwordfile>enStopword.txt</stopwordfile>
  <rootwordfile>enRootword.txt</rootwordfile>
  <thesaurusfile>enThesaurus.txt</thesaurusfile>
  <indexAnalyzer> </indexAnalyzer>
  <searchAnalyzer> </searchAnalyzer>
  <stemmer>English</stemmer>
</locale>
<locale locale="de">
  .
  .
  .
You can include as many locale elements as you need in the FTSLocaleConfig.xml file. By default, AR System server is installed with two-letter locales defined, but you can also include country codes, for example, <locale locale="en_US">.
If any element is blank or missing in any locale section of the FTSLocaleConfig.xml file, Full Text Search does not use that item in the analysis process.
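To check which items a locale section actually configures, a script along the following lines can help. This is a minimal sketch using Python's standard xml.etree.ElementTree module. The wrapping <locales> root element and the configured_items helper are assumptions made for illustration, because the excerpt above does not show the real root element of FTSLocaleConfig.xml.

```python
import xml.etree.ElementTree as ET

# Sample based on the excerpt above. The <locales> wrapper is hypothetical.
SAMPLE = """
<locales>
  <locale locale="en">
    <stopwordfile>enStopword.txt</stopwordfile>
    <rootwordfile>enRootword.txt</rootwordfile>
    <thesaurusfile>enThesaurus.txt</thesaurusfile>
    <indexAnalyzer> </indexAnalyzer>
    <searchAnalyzer> </searchAnalyzer>
    <stemmer>English</stemmer>
  </locale>
</locales>
"""

ELEMENTS = [
    "stopwordfile", "rootwordfile", "thesaurusfile",
    "indexAnalyzer", "searchAnalyzer", "stemmer",
]


def configured_items(xml_text):
    """Return {locale: {element: value}} for elements that are not blank."""
    result = {}
    root = ET.fromstring(xml_text.strip())
    for locale in root.iter("locale"):
        used = {}
        for name in ELEMENTS:
            # findtext returns None when the element is missing.
            value = (locale.findtext(name) or "").strip()
            if value:
                used[name] = value
        result[locale.get("locale")] = used
    return result


config = configured_items(SAMPLE)
for locale_name, items in config.items():
    print(locale_name, sorted(items))
```

For the sample above, indexAnalyzer and searchAnalyzer are blank, so they are left out of the result for the en locale.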
For advanced configuration, you can enter the name of an index analyzer, search analyzer, or stemmer file in the FTSLocaleConfig.xml file. For more information, see Advanced FTS configuration files.
AR System server does not support Full Text Search if you have a read-only database. For more information about using a read-only database, see Using read-only database.