Setting Text Manager parameters

The Text Manager settings are loaded through the TextModule.default.xml configuration file.

Table 3-8: TextModule.default.xml parameters

Parameter

Default

Description

acronym.suf.filename

acronym.ser.suf

Indicates the location where your system stores updates to your acronyms.

custom.term.weight.tag.start

ctw{

Indicates the start of a custom term weighting parameter among search terms. For example, to increase the weighting of the search term “Sybase” five times, the complete parameter is ctw{Sybase,5}.

custom.term.weight.tag.delim

,

The delimiter used to separate search terms from the custom weight.

custom.term.weight.tag.end

}

Indicates the end of a custom term weighting parameter.

custom.synonym.tag.start

syn{

Indicates the start of a query synonym among search terms. For example, a query to find a manager named Joseph or Joe has the format “Manager syn{Joseph,Joe} Williams”.

custom.synonym.tag.delim

,

The delimiter used to separate the synonymous search terms.

custom.synonym.tag.end

}

Indicates the end of the query synonym.

min.term.length

2

The minimum term length considered valid for indexing. This parameter does not apply to preserved terms or to single-digit terms.

max.term.length

20

The maximum term length considered valid for indexing. This value must match the Term Lexicon Manager parameter term.length.max.

para.minVTC

50

Indicates the minimum valid term count for breaking the paragraph text.

para.maxVTC

100

Indicates the maximum valid term count for breaking the paragraph text.

para.maxChars

1500

The maximum number of characters in a paragraph. When this limit is reached, the paragraph is force-broken even if para.minVTC has not been reached. This ensures that bad text data does not result in overly large paragraphs.

parsers.filename

Parsers.xml

The name of the file in the config directory that contains the list of text parsers.

preserved.terms.filename

locale/PreservedTerms_en.xml

Contains a list of preserved terms that are not stemmed during indexing. The list can also include terms shorter than the minimum term length defined in the min.term.length parameter. See “Preserved terms”.

preserved.terms.suf.filename

preserved.terms.ser.suf

Indicates the location where your system stores updates to your preserved terms.

query.augmentor.filename

locale/QueryAugmenter_en.xml

Contains a list of synonyms and acronyms.

query.augmentor.verboseLoad

false

Set to true to log details of cases where the configuration cannot be strictly adhered to. For example, if the synonyms “transport” and “transportation” are provided, the QueryAugmentor logs a message stating that “transportation” collapses to “transport,” so the synonyms are not loaded.

slice.idealVTC

2000

The ideal valid term count for slicing documents.

stopwords.filename

locale/Stopwords_en.xml

Contains a list of stopwords to ignore during the indexing and querying processes to improve system performance. See “Stopwords”.

stopwords.suf.filename

stopwords.ser.suf

Indicates the location where your system stores updates to your stopwords.

synonym.suf.filename

synonym.ser.suf

Indicates the location where your system stores updates to your synonyms.
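
The custom term weighting and query synonym tags in Table 3-8 can be illustrated with a short parsing sketch. The following Python example is not Sybase Search code. It only shows how the default start, delimiter, and end values combine into tags such as ctw{Sybase,5} and syn{Joseph,Joe}. The function name parse_query and the structure it returns are hypothetical.

```python
import re

# Default tag values taken from Table 3-8. The parsing logic below is an
# illustrative assumption, not the product's implementation.
CTW_START, CTW_DELIM, CTW_END = "ctw{", ",", "}"
SYN_START, SYN_DELIM, SYN_END = "syn{", ",", "}"


def _tag_pattern(start, end):
    """Build a regex that captures the text between a start tag and an end tag."""
    return re.compile(re.escape(start) + r"(.*?)" + re.escape(end))


def parse_query(query):
    """Split a query into plain terms, custom weights, and synonym groups.

    Hypothetical helper: returns (plain_terms, weights, synonym_groups).
    A weighting tag without a numeric weight raises ValueError.
    """
    weights = {}
    synonym_groups = []

    def take_weight(match):
        # "Sybase,5" -> term "Sybase", weight 5.0; the term stays in the query.
        term, _, weight = match.group(1).rpartition(CTW_DELIM)
        weights[term.strip()] = float(weight)
        return term.strip()

    def take_synonyms(match):
        # "Joseph,Joe" -> ["Joseph", "Joe"]; the group is removed from the plain terms.
        synonym_groups.append([s.strip() for s in match.group(1).split(SYN_DELIM)])
        return " "

    query = _tag_pattern(CTW_START, CTW_END).sub(take_weight, query)
    query = _tag_pattern(SYN_START, SYN_END).sub(take_synonyms, query)
    return query.split(), weights, synonym_groups
```

With these defaults, parse_query("Manager syn{Joseph,Joe} Williams") yields the plain terms Manager and Williams plus one synonym group containing Joseph and Joe. If you change a tag value in the configuration file, the corresponding constant in the sketch would change in the same way.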

You can set the text tokenizer and stemmer classes to language-independent classes or to language-specific classes. Language-specific stemmers improve system performance when Sybase Search indexes documents in only one language.
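
The interaction of the para.minVTC, para.maxVTC, and para.maxChars parameters in Table 3-8 can be sketched as follows. This Python example is a minimal illustration under stated assumptions, not the algorithm that Sybase Search uses. It assumes that a term is "valid" when its length falls between min.term.length and max.term.length, that a paragraph may break at a sentence end once para.minVTC is reached, and that a break is forced at para.maxVTC or para.maxChars. Preserved terms and the single-digit exception are not modeled. The names is_valid_term and break_paragraphs are hypothetical.

```python
# Default values taken from Table 3-8.
PARA_MIN_VTC = 50
PARA_MAX_VTC = 100
PARA_MAX_CHARS = 1500
MIN_TERM_LENGTH = 2
MAX_TERM_LENGTH = 20


def is_valid_term(token):
    """Assumption: a term is valid when its length is within the indexing limits."""
    return MIN_TERM_LENGTH <= len(token) <= MAX_TERM_LENGTH


def break_paragraphs(text, min_vtc=PARA_MIN_VTC, max_vtc=PARA_MAX_VTC,
                     max_chars=PARA_MAX_CHARS):
    """Split text into paragraphs using valid term counts and a character cap."""
    paragraphs = []
    current = []
    valid_terms = 0
    chars = 0
    for token in text.split():
        current.append(token)
        chars += len(token) + 1  # count the token plus one separating space
        if is_valid_term(token):
            valid_terms += 1
        at_sentence_end = token.endswith((".", "!", "?"))
        if (
            valid_terms >= max_vtc                            # hard upper bound on terms
            or chars >= max_chars                             # force-break on bad text data
            or (valid_terms >= min_vtc and at_sentence_end)   # natural break point
        ):
            paragraphs.append(" ".join(current))
            current, valid_terms, chars = [], 0, 0
    if current:
        paragraphs.append(" ".join(current))
    return paragraphs
```

In this sketch, text made up only of one-character tokens never accumulates valid terms, so only the character cap ends each paragraph. That is the kind of bad text data that para.maxChars is described as guarding against.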