A document store is a collection of documents in Sybase Search related by physical location. You can organize documents into these types of document stores:
File system
Database
Passive (Web)
The file system and database document stores acquire and maintain documents through internal processes. The Web document stores are passive, and are managed by a Web robot.
A file system document store represents one or more collections of documents imported into Sybase Search from a local file system, including mapped network drives or mounted remote file systems. The file system document store accepts one or more directory roots (for example, D:\documents\office), the contents of which Sybase Search indexes.
Although documents from different file systems (for example, C:\docs\ and \\network-share\docs) can coexist in the same document store, internally, all documents found in all root directories of a file system document store are indexed together. This means they share the same data structures, and they are updated and removed together. Sybase Search analyzes directories and subdirectories. Files with valid MIME types are then indexed. You can customize the list of valid MIME types.
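The scan described above can be pictured with a minimal Java sketch. It is not Sybase Search's implementation: the class name `RootScanSketch`, the `VALID_MIME_TYPES` allow-list, and both method names are hypothetical, and MIME detection is delegated to the JDK's `Files.probeContentType`, whose results vary by platform.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.stream.Stream;

final class RootScanSketch {

    // Hypothetical allow-list; in Sybase Search the list of valid MIME types is configurable.
    static final Set<String> VALID_MIME_TYPES =
            Set.of("text/plain", "text/html", "application/pdf");

    // A file qualifies only if its MIME type was detected and is on the allow-list.
    static boolean isIndexable(String mimeType) {
        return mimeType != null && VALID_MIME_TYPES.contains(mimeType);
    }

    // Walks every root directory and collects the files that would be indexed together.
    static List<Path> collectIndexable(List<Path> roots, boolean includeSubdirectories)
            throws IOException {
        List<Path> result = new ArrayList<>();
        // Depth 1 visits only the root and its direct children.
        int depth = includeSubdirectories ? Integer.MAX_VALUE : 1;
        for (Path root : roots) {
            try (Stream<Path> entries = Files.walk(root, depth)) {
                for (Path p : (Iterable<Path>) entries::iterator) {
                    // probeContentType may return null when the type is unknown.
                    if (Files.isRegularFile(p) && isIndexable(Files.probeContentType(p))) {
                        result.add(p);
                    }
                }
            }
        }
        return result;
    }

    public static void main(String[] args) throws IOException {
        List<Path> roots = new ArrayList<>();
        for (String arg : args) {
            roots.add(Path.of(arg));
        }
        for (Path p : collectIndexable(roots, true)) {
            System.out.println(p);
        }
    }
}
```

Passing several roots (for example, `C:\docs\` and `\\network-share\docs`) yields a single combined list, which mirrors how all roots of one file system document store are indexed together.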
Creating file system document stores
Click Document Management.
Click File System.
Click Import from file system.
Complete these fields:
Field | Description |
---|---|
Name | Name of the document store. |
Manager | Document store manager for which the document store should exist. Typically, there is one document store manager for each server where document indexing occurs. The document store manager for each document store that you create lets you set up document indexing on the different servers in the system. See “Managing document stores”. |
Member of | Document groups of which the document store is a member. See “Grouping document stores”. |
Not Member of | Document groups of which the document store is not a member. |
Store Indexed Text | Indicates that the raw text from each document is stored within the document store. By default, the option is selected. If you unselect the option, the search results page does not include the View Text link for each result, because there is no cached text to display. |
Index now | Indicates whether to proceed with indexing immediately or to save the configuration without indexing. See “Indexing document stores”. |
Directories | One or more root directories whose contents are indexed and available for searching. |
Include subdirectories | Indexes all subdirectories under the root directory. |
File Type Filter | Includes or excludes documents by file extension or MIME type. |
Click Create.
The Document Stores Summary page shows the details of the document store. An indexing summary is also listed, and, if the store is being indexed, the current indexing session information appears. See “Indexing document stores”.
A database document store represents a collection of documents imported into Sybase Search from one or more database tables. Use a SQL query to import documents from database tables into Sybase Search. See “Constructing an import SQL statement”.
All data conversions are handled internally, including files stored in binary format and links to files elsewhere on a system. Sybase Search can import data from any database for which you can obtain JDBC drivers.
The database document stores use Java Database Connectivity (JDBC) drivers to import data. Before creating a database document store, make sure that the appropriate JDBC driver is available in install_location/OmniQ/lib. If it is not available, copy an appropriate JDBC driver to install_location/OmniQ/lib and restart the container that manages the database import function. For more information about the JDBC driver’s location, see your database vendor’s documentation.
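One way to confirm that a driver JAR is loadable before creating the store is a small stand-alone check such as the sketch below. It is not part of Sybase Search; the class name `DriverCheckSketch` and its method are hypothetical, and the default driver class name (jConnect's `com.sybase.jdbc4.jdbc.SybDriver`) is only an example. Use the class name given in your vendor's documentation.

```java
final class DriverCheckSketch {

    // Returns true if the named class can be loaded from the current classpath.
    static boolean isDriverAvailable(String driverClassName) {
        try {
            Class.forName(driverClassName);
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            // Missing JAR, or a JAR that cannot be linked in this JVM.
            return false;
        }
    }

    public static void main(String[] args) {
        // Example class name only; pass your own driver class as the first argument.
        String driver = args.length > 0 ? args[0] : "com.sybase.jdbc4.jdbc.SybDriver";
        String verdict = isDriverAvailable(driver) ? " is" : " is NOT";
        System.out.println(driver + verdict + " on the classpath");
    }
}
```

The check only reflects the classpath of the JVM that runs it, so it is meaningful only when that classpath includes the JARs in install_location/OmniQ/lib (for example, by passing that directory with the `-cp` option). It does not prove that the Sybase Search container has been restarted and has picked up the driver.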
Creating database document stores
Click Document Management.
Click Database.
Click Import from database.
Complete these fields:
Field | Description |
---|---|
Name | Name of the document store. |
Manager | Document store manager for which the document store should exist. Typically, there is one document store manager for each server where document indexing occurs. The document store manager for each document store lets you set up document indexing on the different servers in the system. See “Managing document stores”. |
Member of | Document groups of which the document store is a member. See “Grouping document stores”. |
Not Member of | Document groups of which the document store is not a member. |
JDBC connection details | |
Host | Indicates the network name or IP address of the database server. |
DB Name | Indicates the name of the database. |
Username | Indicates the name of the user who has authenticated access to the database. |
Password | Indicates the password used to authenticate access to the database. |
Preset | Indicates the type of database and the configuration of the JDBC options. When you select a database from the Preset list, Sybase Search automatically displays the port, driver, and URL with common values for the type of database selected. If you do not select a preset database from the list, enter appropriate driver and URL values. |
Port | Indicates the database server listener port. If you select a database from the Preset list, this field is populated automatically. |
Driver | Indicates the full class name of the JDBC driver. If you select a database from the Preset list, this field is populated automatically. |
URL | Indicates the JDBC URL to use to contact the database. If you select a database from the Preset list, this field is populated automatically. |
SQL Query | Indicates the SQL statement designed to import documents from a database. See “Constructing an import SQL statement”. |
Document reference | |
Class | Indicates the Java class type that Sybase Search uses internally to store the DOC_REF SQL datatype. The document reference class is automatically determined the first time data is extracted from the database, and it cannot be changed. |
Length | Identifies the document reference length, which is used only for java.lang.String document reference types (the lengths of other types are implicit). In most cases, the value in this field should be the same as the width of the VARCHAR column from which the document references are being extracted. If the document reference is not a string, this value is ignored. |
Store Indexed Text | Indicates that the raw text from each document is stored within the document store. By default, the option is selected. If you unselect the option, the search results page does not include the View Text link for each result, because there is no cached text to display. |
Index now | Indicates whether to proceed with indexing immediately or to save the configuration without indexing. See “Indexing document stores”. |
Click Create.
The Document Stores Summary page shows the details of the document store. An indexing summary is also listed, and, if the store is being indexed, the current indexing session information appears. See “Indexing document stores”.
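To illustrate how a Preset selection relates to the Port, Driver, and URL fields, the following Java sketch maps a preset name to a default port, a driver class, and a URL pattern. The preset names, the class `JdbcPresetSketch`, and the `{host}`/`{port}`/`{db}` placeholders are assumptions made for this example, not the actual contents of the Preset list. The driver class names, URL formats, and ports are commonly used defaults for those vendors' JDBC drivers and should be checked against the vendor documentation.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class JdbcPresetSketch {

    // One illustrative preset: default port, driver class, and URL pattern.
    static final class Preset {
        final int defaultPort;
        final String driver;
        final String urlPattern;

        Preset(int defaultPort, String driver, String urlPattern) {
            this.defaultPort = defaultPort;
            this.driver = driver;
            this.urlPattern = urlPattern;
        }

        // Builds the JDBC URL from the Host and DB Name fields using the default port.
        String url(String host, String dbName) {
            return url(host, defaultPort, dbName);
        }

        // Same, but with an explicitly entered Port value.
        String url(String host, int port, String dbName) {
            return urlPattern
                    .replace("{host}", host)
                    .replace("{port}", Integer.toString(port))
                    .replace("{db}", dbName);
        }
    }

    // Hypothetical preset table; the real Preset list may differ.
    static final Map<String, Preset> PRESETS = new LinkedHashMap<>();
    static {
        PRESETS.put("Sybase ASE (jConnect)", new Preset(
                5000, "com.sybase.jdbc4.jdbc.SybDriver", "jdbc:sybase:Tds:{host}:{port}/{db}"));
        PRESETS.put("PostgreSQL", new Preset(
                5432, "org.postgresql.Driver", "jdbc:postgresql://{host}:{port}/{db}"));
        PRESETS.put("MySQL", new Preset(
                3306, "com.mysql.cj.jdbc.Driver", "jdbc:mysql://{host}:{port}/{db}"));
    }

    public static void main(String[] args) {
        for (Map.Entry<String, Preset> e : PRESETS.entrySet()) {
            Preset p = e.getValue();
            System.out.println(e.getKey() + ": " + p.driver + " " + p.url("dbhost", "docs"));
        }
    }
}
```

When no preset applies, the equivalent of this lookup is done by hand: the driver class name and the URL are entered directly, using the format documented by the driver vendor.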
A passive document store represents a collection of documents imported into Sybase Search by an external process such as a Web robot. The Web robot manages the download of Web content from the Internet and intranets. The Web content is sent to a passive document store, where it is indexed and made available for searching. See “Web robots”.
Sybase Search also allows you to create custom processes for externally managed document stores. See “Customizing externally managed document stores”.