The Text Manager module settings are loaded through the TextModule.default.xml configuration file. Table 3-7 shows the parameters in this file.

Parameter | Default | Description |
---|---|---|
min.term.length | 2 | The minimum term length considered for indexing. This limit does not apply to terms in the list of preserved terms or to single-digit terms. |
max.term.length | 20 | The maximum term length considered valid for indexing. This value must match the Term Lexicon Manager parameter term.length.max. |
custom.term.weight.tag.start | ctw{ | Indicates the start of a custom term weighting parameter among search terms. For example, the complete parameter for the search term “Sybase” with the weighting increased 5 times is ctw{Sybase,5}. |
custom.term.weight.tag.delim | , | The delimiter that separates the search term from its custom weight. |
custom.term.weight.tag.end | } | Indicates the end of a custom term weighting parameter. |
stopwords.filename | locale/Stopwords_en.xml | Contains a list of stopwords to remove during the indexing and querying processes to improve system performance. See “Defining the list of stopwords”. |
preserved.terms.filename | locale/PreservedTerms_en.xml | Contains a list of preserved terms that are not stemmed during indexing. The list can also include terms shorter than the minimum term length defined by the min.term.length parameter. See “Defining the list of preserved terms”. |
term.splitter.class | com.isdduk.text.BreakIteratorSplitter | Specifies the Java class used to break text into separate words. The default BreakIteratorSplitter handles all double-byte character sets. |
term.stemmer.class | com.isdduk.text.Porter2Stemmer | Specifies the Java class used for term stemming. The default Porter2Stemmer is for English text. |
query.augmentor.filename | locale/QueryAugmenter_en.xml | Contains a list of synonyms and acronyms. See “Augmenting queries”. |
query.augmentor.verboseLoad | false | Specifies whether to log details of cases where the configuration cannot be strictly adhered to. For example, if the synonyms "transport" and "transportation" are provided, the QueryAugmentor logs that "transportation" will collapse to "transport," so the synonyms will not be loaded. |
parsers.filename | Parsers.xml | The name of the file in the config directory that contains the list of text parsers. |

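
To illustrate how the three custom.term.weight.tag parameters combine, the following is a minimal sketch that extracts weighted terms from a query string using the default values in the table above. The class CustomTermWeightSketch and the method parseWeights are hypothetical names invented for this example and are not part of Sybase Search. The sketch also assumes that weights are whole numbers, which the product documentation does not state.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration only; not a Sybase Search class.
class CustomTermWeightSketch {
    static final String TAG_START = "ctw{"; // custom.term.weight.tag.start default
    static final String TAG_DELIM = ",";    // custom.term.weight.tag.delim default
    static final String TAG_END = "}";      // custom.term.weight.tag.end default

    // Returns each weighted term mapped to its weight, in query order.
    // Assumption: weights are integers.
    static Map<String, Integer> parseWeights(String query) {
        Map<String, Integer> weights = new LinkedHashMap<>();
        int from = 0;
        while (true) {
            int start = query.indexOf(TAG_START, from);
            if (start < 0) {
                break; // no further tags
            }
            int bodyStart = start + TAG_START.length();
            int end = query.indexOf(TAG_END, bodyStart);
            if (end < 0) {
                break; // unterminated tag: stop scanning
            }
            String body = query.substring(bodyStart, end);
            int delim = body.lastIndexOf(TAG_DELIM);
            if (delim > 0) {
                String term = body.substring(0, delim).trim();
                String weightText = body.substring(delim + TAG_DELIM.length()).trim();
                try {
                    weights.put(term, Integer.parseInt(weightText));
                } catch (NumberFormatException e) {
                    // Non-numeric weight: this sketch simply skips the tag.
                }
            }
            from = end + TAG_END.length();
        }
        return weights;
    }

    public static void main(String[] args) {
        // prints {Sybase=5, index=2}
        System.out.println(parseWeights("ctw{Sybase,5} search ctw{index,2}"));
    }
}
```

If you change any of the three tag parameters, queries must use the new start, delimiter, and end strings in the same positions.
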
You can set the term splitter and stemmer classes to either language-independent or language-specific classes. Language-specific stemmers can improve system performance when Sybase Search indexes documents in one language only.
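
As a rough illustration of what a term splitter does, the following sketch breaks text into words with the standard java.text.BreakIterator class and then applies the default min.term.length and max.term.length limits. It assumes that the default splitter behaves similarly to BreakIterator, which its name suggests but the product documentation does not confirm. The class TermSplitSketch and its methods are hypothetical and are not part of Sybase Search. Stopword removal, preserved terms, and stemming are omitted.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical sketch for illustration only; not the Sybase Search splitter.
class TermSplitSketch {
    static final int MIN_TERM_LENGTH = 2;  // min.term.length default
    static final int MAX_TERM_LENGTH = 20; // max.term.length default

    // Splits text at word boundaries and keeps terms within the length limits.
    static List<String> splitTerms(String text, Locale locale) {
        BreakIterator words = BreakIterator.getWordInstance(locale);
        words.setText(text);
        List<String> terms = new ArrayList<>();
        int start = words.first();
        for (int end = words.next(); end != BreakIterator.DONE; start = end, end = words.next()) {
            String token = text.substring(start, end);
            // Skip whitespace and punctuation segments between words.
            if (token.isEmpty() || !Character.isLetterOrDigit(token.codePointAt(0))) {
                continue;
            }
            if (isIndexable(token)) {
                terms.add(token);
            }
        }
        return terms;
    }

    // Single-digit terms are exempt from the minimum length, as Table 3-7 notes.
    static boolean isIndexable(String term) {
        boolean singleDigit = term.length() == 1 && Character.isDigit(term.charAt(0));
        if (singleDigit) {
            return true;
        }
        return term.length() >= MIN_TERM_LENGTH && term.length() <= MAX_TERM_LENGTH;
    }

    public static void main(String[] args) {
        // prints [Sybase, Search, indexes, 7, page, document]
        System.out.println(splitTerms("Sybase Search indexes a 7 page document.", Locale.ENGLISH));
    }
}
```

In this example, the one-letter term "a" is dropped because it is shorter than the minimum term length, while the single-digit term "7" is kept.
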