
Chapter 2: Configuring OmniQ Enterprise

Stopwords

Stopwords are common words such as “I,” “a,” “an,” “the,” and so on, that are ignored during the indexing or querying process. Removing the most common words during the indexing process keeps index sizes smaller, which enhances performance.
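To illustrate the idea, here is a minimal Python sketch of stopword filtering. It is not OmniQ Enterprise code: the `STOPWORDS` set, `tokenize`, and `index_terms` are hypothetical names chosen for this example, and the tokenizer is a simple assumption rather than the product's actual parser.

```python
import re

# Hypothetical stoplist for illustration only; not the list shipped with OmniQ.
STOPWORDS = {"i", "a", "an", "the"}

def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())

def index_terms(text, stopwords=STOPWORDS):
    """Return the tokens that remain to be indexed once stopwords are dropped."""
    return [t for t in tokenize(text) if t not in stopwords]

doc = "The cat sat on a mat while I read an article"
all_tokens = tokenize(doc)
kept = index_terms(doc)

# Fewer terms reach the index, so less data is stored per document.
print(len(all_tokens), "tokens before filtering,", len(kept), "after")
print(kept)
```

In this sketch, four of the eleven tokens are stopwords, so only seven terms would be stored.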

You can change the list of stopwords in one of two ways:

Note: The stopword list must be UTF-8 encoded. Because words on the stoplist are ignored when you index documents (that is, each document is indexed as if those words did not exist), you must make any changes to the stoplist before you index. If you add new stopwords after you have already indexed your documents, those words are dropped from your queries, but the disk space consumed by their associated index data is not reclaimed until you reindex your documents.

Removing stopwords after you have already indexed your documents has no effect until you reindex your documents.
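The interaction between stoplist changes and an existing index can be sketched with a toy inverted index. This is an assumption-laden illustration in Python, not OmniQ's implementation: `ToyIndex`, its methods, and the sample documents are all hypothetical.

```python
import re
from collections import defaultdict

def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())

class ToyIndex:
    """Minimal inverted index (hypothetical; not OmniQ's implementation)."""

    def __init__(self, stopwords):
        self.stopwords = set(stopwords)
        self.postings = defaultdict(set)  # term -> set of document ids

    def add(self, doc_id, text):
        # Stopwords are skipped at index time, as if they were not in the text.
        for term in tokenize(text):
            if term not in self.stopwords:
                self.postings[term].add(doc_id)

    def search(self, query):
        # Stopwords are also dropped from the query before matching.
        terms = [t for t in tokenize(query) if t not in self.stopwords]
        if not terms:
            return set()
        return set.intersection(*(self.postings.get(t, set()) for t in terms))

docs = {1: "the quick report", 2: "a very quick summary"}

index = ToyIndex(stopwords={"the", "a"})
for doc_id, text in docs.items():
    index.add(doc_id, text)

# Add a stopword after indexing: it is dropped from queries...
index.stopwords.add("very")
hits = index.search("very quick")
# ...but its postings still occupy space in the existing index.
stale = "very" in index.postings

# Remove a stopword after indexing: nothing was ever stored for it.
index.stopwords.discard("the")
missing = "the" not in index.postings

# Reindex with the updated stoplist so that both changes take effect.
rebuilt = ToyIndex(stopwords=index.stopwords)
for doc_id, text in docs.items():
    rebuilt.add(doc_id, text)

print(hits, stale, missing)
print(sorted(rebuilt.postings))
```

Only the rebuilt index reflects the new stoplist: the data for "very" is gone and "the" becomes a searchable term.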





Copyright © 2005. Sybase Inc. All rights reserved.
