The Document Filter parameters are loaded through the DocumentFilters.xml configuration file.
The default filters in the configuration file are:
HTML filter
Plain Text filter
PDF filter
POI filter
RFC822 filter
RTF filter
XMLInputMatching filter
ImportXML filter
Streaming API for XML (StAX) filter
ZIP filter
Each filter specifies which class is loaded for the filter. In addition, installing Sybase Search Content Adapter adds these two filters to DocumentFilters.xml:
SearchML
SearchMLExport
See “Setting Document Filter parameters for Content Adapter”.
Parameter | Default | Description |
---|---|---|
DocumentFilter class | None | The Java class that defines the filter. |
Timeout millis | 45,000 | Indicates the time, in milliseconds, the filter waits while filtering a document. If the filter exceeds the given time, the filter aborts. |
TempFiles keep | false | If set to true, the filter keeps any temporary files produced during the filtering process. |
FallbackCharset | None | Indicates the character set decoding scheme to use for decoding the text bytes when the encoding is not supplied and cannot be determined. |
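
The Timeout millis behavior described above can be illustrated with a minimal Java sketch. This is not the product's implementation: the class `TimeoutSketch` and the method `filterWithTimeout` are hypothetical names, and the sketch only assumes that a filtering task is abandoned once the configured number of milliseconds has elapsed.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical illustration only; not part of Sybase Search.
public class TimeoutSketch {

    // Mirrors the documented default of 45,000 milliseconds.
    static final long DEFAULT_TIMEOUT_MILLIS = 45_000L;

    // Runs a filtering task and returns its text, or null if the task
    // does not finish within timeoutMillis (the task is then cancelled).
    static String filterWithTimeout(Callable<String> filterTask, long timeoutMillis) {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        Future<String> result = executor.submit(filterTask);
        try {
            return result.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            result.cancel(true); // interrupt the worker: the filter "aborts"
            return null;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt flag
            return null;
        } catch (ExecutionException e) {
            throw new IllegalStateException("filter failed", e.getCause());
        } finally {
            executor.shutdownNow();
        }
    }

    public static void main(String[] args) {
        // A fast task completes well inside the limit.
        System.out.println(filterWithTimeout(() -> "extracted text", 1_000L));
        // A slow task exceeds a 100 ms limit and is aborted.
        System.out.println(filterWithTimeout(() -> {
            Thread.sleep(5_000L);
            return "never returned";
        }, 100L));
    }
}
```

In this sketch an aborted task yields `null`. How the real filter reports an abort is not stated in this section.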
Within the MimeMapping tag, DocumentFilters.xml specifies which MIME type is associated with each filter.
Parameter | Default | Value |
---|---|---|
MimeType | None | Specifies the MIME type associated with the filter. |
DocumentFilterName | None | Specifies the name of the filter that maps to the MIME type. |
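
As a rough illustration of how a MIME type resolves to a filter name, the following Java sketch parses a small XML fragment and looks up the filter for a given MIME type. The element names MimeMapping, MimeType, and DocumentFilterName come from the table above. Everything else is an assumption made for the example: the `DocumentFilters` root element, the nesting of the elements, the filter names `HTML` and `PDF`, and the class `MimeMappingSketch`. The actual layout of DocumentFilters.xml may differ.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical illustration only; not part of Sybase Search.
public class MimeMappingSketch {

    // ASSUMED layout: the real DocumentFilters.xml schema may differ.
    static final String SAMPLE_XML =
          "<DocumentFilters>"
        + "<MimeMapping><MimeType>text/html</MimeType>"
        + "<DocumentFilterName>HTML</DocumentFilterName></MimeMapping>"
        + "<MimeMapping><MimeType>application/pdf</MimeType>"
        + "<DocumentFilterName>PDF</DocumentFilterName></MimeMapping>"
        + "</DocumentFilters>";

    // Builds a MIME type -> filter name map from every MimeMapping element.
    static Map<String, String> parseMimeMappings(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            Map<String, String> byMimeType = new LinkedHashMap<>();
            NodeList mappings = doc.getElementsByTagName("MimeMapping");
            for (int i = 0; i < mappings.getLength(); i++) {
                Element mapping = (Element) mappings.item(i);
                byMimeType.put(childText(mapping, "MimeType"),
                               childText(mapping, "DocumentFilterName"));
            }
            return byMimeType;
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse mapping XML", e);
        }
    }

    // Returns the trimmed text of the first descendant element with this tag.
    private static String childText(Element parent, String tag) {
        return parent.getElementsByTagName(tag).item(0).getTextContent().trim();
    }

    public static void main(String[] args) {
        Map<String, String> mappings = parseMimeMappings(SAMPLE_XML);
        System.out.println(mappings.get("text/html"));
        System.out.println(mappings.get("application/pdf"));
    }
}
```

A MIME type with no MimeMapping entry is simply absent from the map in this sketch. What the product does with unmapped MIME types is not covered here.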