Configuring the term splitter

The term splitter interface defines numerous methods, many of which behave identically regardless of the splitting algorithm used. To simplify implementing new term splitters, Sybase Search includes an abstract base class that you can extend to inherit much of the required functionality:

com.isdduk.text.AbstractTermSplitter

The convenience base class does not itself implement a splitting algorithm. The split methods defined by the term splitter interface are listed below (see the Javadocs for the full interface method listing):

com.isdduk.text.TermSplitter
     split(java.lang.String source) :
          com.isdduk.util.set.StringSet
     split(java.lang.String source, boolean validate) :
          com.isdduk.text.StringList
     splitWords(java.lang.CharSequence source) :
          java.util.SortedSet<com.isdduk.text.TermSplitter.WordIndex>
     splitWords(java.lang.CharSequence source, int offset, int length) :
          java.util.SortedSet<com.isdduk.text.TermSplitter.WordIndex>
     splitFrequencies(java.lang.CharSequence source,
          com.isdduk.util.map.FastTermMap insertInto) : void
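
The relationship between a convenience base class and a concrete splitter can be sketched as follows. This is only an illustration of the pattern, not the Sybase Search API: SketchWordIndex, SketchBaseSplitter, and WhitespaceSplitter are hypothetical names, the whitespace algorithm is an arbitrary choice, and java.util.Map stands in for com.isdduk.util.map.FastTermMap, whose methods are not described here. The base class supplies the overloads that are the same for every algorithm, and the subclass supplies a single method that performs the actual split.

```java
import java.util.Map;
import java.util.SortedSet;
import java.util.TreeSet;

// Hypothetical stand-in for a word plus its start position.
// This is NOT the real com.isdduk.text.TermSplitter.WordIndex.
final class SketchWordIndex implements Comparable<SketchWordIndex> {
    final String word;
    final int start;

    SketchWordIndex(String word, int start) {
        this.word = word;
        this.start = start;
    }

    // Order by position first so a SortedSet yields words in source order.
    @Override
    public int compareTo(SketchWordIndex other) {
        int byStart = Integer.compare(start, other.start);
        return byStart != 0 ? byStart : word.compareTo(other.word);
    }

    @Override
    public String toString() {
        return word + "@" + start;
    }
}

// Hypothetical base class: the algorithm-independent overloads delegate
// to one abstract method, mirroring the role of a convenience base class.
abstract class SketchBaseSplitter {
    // The only method a concrete splitter has to provide in this sketch.
    abstract SortedSet<SketchWordIndex> splitWords(CharSequence source, int offset, int length);

    // Same for every algorithm: split the whole input.
    SortedSet<SketchWordIndex> splitWords(CharSequence source) {
        return splitWords(source, 0, source.length());
    }

    // Same for every algorithm: count occurrences of each word.
    // java.util.Map is an assumption standing in for FastTermMap.
    void splitFrequencies(CharSequence source, Map<String, Integer> insertInto) {
        for (SketchWordIndex w : splitWords(source)) {
            insertInto.merge(w.word, 1, Integer::sum);
        }
    }
}

// Hypothetical concrete splitter: breaks the input on whitespace.
final class WhitespaceSplitter extends SketchBaseSplitter {
    @Override
    SortedSet<SketchWordIndex> splitWords(CharSequence source, int offset, int length) {
        SortedSet<SketchWordIndex> out = new TreeSet<>();
        int end = offset + length;
        int wordStart = -1; // -1 means "not currently inside a word"
        for (int i = offset; i <= end; i++) {
            // The end of the range is treated as a word boundary.
            boolean boundary = i == end || Character.isWhitespace(source.charAt(i));
            if (!boundary) {
                if (wordStart < 0) {
                    wordStart = i;
                }
            } else if (wordStart >= 0) {
                String word = source.subSequence(wordStart, i).toString();
                out.add(new SketchWordIndex(word, wordStart));
                wordStart = -1;
            }
        }
        return out;
    }
}
```

In this sketch, a new algorithm only needs to override the three-argument splitWords method; the whole-input and frequency variants then work without further code. A real term splitter would extend com.isdduk.text.AbstractTermSplitter and follow the signatures given in the Javadocs instead.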