Document stores

A document store is a collection of documents in Sybase Search related by physical location. You can organize documents into these types of document stores:

• File system document stores

• Database document stores

• Passive (Web) document stores

The file system and database document stores acquire and maintain documents through internal processes. Web document stores are passive and are managed by a Web robot.

File system document stores

A file system document store represents one or more collections of documents imported into Sybase Search from a local file system, including mapped network drives or mounted remote file systems. The file system document store accepts one or more directory roots (for example, D:\documents\office), the contents of which Sybase Search indexes.

Documents from different file systems (for example, C:\docs\ and \\network-share\docs) can coexist in the same document store. Internally, all documents found in all root directories of a file system document store are indexed together: they share the same data structures, and they are updated and removed together. Sybase Search analyzes directories and subdirectories, and indexes the files that have valid MIME types. You can customize the list of valid MIME types.
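The scan described above can be pictured with a minimal Python sketch. It is not how Sybase Search is implemented; it only illustrates walking several root directories and keeping files whose MIME type is on an allow-list. The function name find_indexable_files and the VALID_MIME_TYPES set are hypothetical, and MIME types are guessed from file extensions with the standard mimetypes module.

```python
import mimetypes
import os

# Assumed allow-list for illustration; the real list of valid MIME types
# is configured in Sybase Search.
VALID_MIME_TYPES = {"text/plain", "text/html", "application/pdf"}


def find_indexable_files(roots, valid_types=VALID_MIME_TYPES):
    """Walk every root directory and return files whose guessed MIME type is allowed."""
    matches = []
    for root in roots:
        # os.walk visits the root and all of its subdirectories.
        for dirpath, _dirnames, filenames in os.walk(root):
            for name in sorted(filenames):
                mime_type, _encoding = mimetypes.guess_type(name)
                if mime_type in valid_types:
                    matches.append(os.path.join(dirpath, name))
    return matches
```

Because all roots feed one result list, the sketch also mirrors the point that documents from every root directory of a store are handled together.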

Creating file system document stores

  1. Click Document Management.

  2. Click File System.

  3. Click Import from file system.

  4. Complete these fields:

    Field

    Description

    Name

    Name of the document store.

    Manager

    Document store manager for which the document store should exist. Typically, there is one document store manager for each server where document indexing occurs. The document store manager for each document store that you create lets you set up document indexing on the different servers in the system. See “Managing document stores”.

    Member of

    Document groups in which the document store is a member. See “Grouping document stores”.

    Not Member of

    Document groups of which the document store is not a member.

    Store Indexed Text

    Indicates whether the raw text from each document is stored within the document store. By default, this option is selected. If you clear the option, the search results page does not include the View Text link for each result, because there is no cached text to display.

    Note: If you do not need to view the raw text from your documents and have disk space constraints, you can clear this option.

    Index now

    Indicates whether to proceed with indexing immediately or to save the configuration without indexing. See “Indexing document stores”.

    Directories

    One or more root directories whose contents are indexed and available for searching.

    Include subdirectories

    Index all subdirectories under the root directory.

    File Type Filter

    Includes or excludes documents by file extension or MIME type:

    • Include – indexes documents of the specified file type.

    • Exclude – indexes all documents except those of the specified file type.

  5. Click Create.

    The Document Stores Summary page shows the details of the document store. An indexing summary is also listed, and, if the store is being indexed, the current indexing session information appears. See “Indexing document stores”.
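The Include and Exclude modes of the File Type Filter field in step 4 can be illustrated with a short Python sketch. This is an assumption-laden illustration of the semantics only: the function passes_filter and its parameters are hypothetical, and it filters by file extension rather than MIME type.

```python
from pathlib import PurePath


def passes_filter(filename, extensions, mode="include"):
    """Return True if the file should be indexed under the given filter."""
    # Normalize both sides so "PDF", ".pdf", and "pdf" compare equal.
    ext = PurePath(filename).suffix.lower().lstrip(".")
    listed = ext in {e.lower().lstrip(".") for e in extensions}
    if mode == "include":
        return listed        # index only the listed file types
    if mode == "exclude":
        return not listed    # index everything except the listed file types
    raise ValueError(f"unknown filter mode: {mode!r}")


files = ["report.pdf", "notes.txt", "logo.png"]
included = [f for f in files if passes_filter(f, ["pdf", "txt"], "include")]
not_excluded = [f for f in files if passes_filter(f, ["png"], "exclude")]
```

In this example both lists contain report.pdf and notes.txt: the first because those types are explicitly included, the second because only png files are excluded.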

Database document stores

A database document store represents a collection of documents imported into Sybase Search from one or more database tables. Use a SQL query to import documents from database tables into Sybase Search. See “Constructing an import SQL statement”.

All data conversions are handled internally, including files stored in binary format and links to files elsewhere on a system. Sybase Search can import data from any database for which you can obtain JDBC drivers.

Note: The database document stores use Java Database Connectivity (JDBC) drivers to import data. Before creating a database document store, make sure that the appropriate JDBC driver is available in install_location/OmniQ/lib. If it is not available, copy an appropriate JDBC driver to install_location/OmniQ/lib and restart the container that manages the database import function. For more information about the JDBC driver’s location, see your database vendor’s documentation.
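One way to check which driver files are present before creating the store is a small script like the following Python sketch. It only lists JAR files in a directory; it does not verify that a JAR contains the driver class you need. The function name find_driver_jars is hypothetical, and install_location stands for your own installation path.

```python
from pathlib import Path


def find_driver_jars(lib_dir):
    """Return the names of the JAR files in lib_dir, sorted alphabetically."""
    lib = Path(lib_dir)
    if not lib.is_dir():
        # A missing directory simply means no drivers were found.
        return []
    return sorted(
        p.name for p in lib.iterdir()
        if p.is_file() and p.suffix.lower() == ".jar"
    )


# Replace install_location with the actual installation directory.
jars = find_driver_jars("install_location/OmniQ/lib")
```

An empty result suggests that no driver has been copied to the directory yet, or that the path is wrong.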

Creating database document stores

  1. Click Document Management.

  2. Click Database.

  3. Click Import from database.

  4. Complete these fields:

    Field

    Description

    Name

    Name of the document store.

    Manager

    Document store manager for which the document store should exist. Typically, there is one document store manager for each server where document indexing occurs. The document store manager for each document store lets you set up document indexing on the different servers in the system. See “Managing document stores”.

    Member of

    Document groups in which the document store is a member. See “Grouping document stores”.

    Not Member of

    Document groups of which the document store is not a member.

    JDBC connection details

    Host

    Indicates the network name or IP address of the database server.

    DB Name

    Indicates the name of the database.

    Username

    Indicates the name of the user who has authenticated access to the database.

    Password

    Indicates the password used to authenticate access to the database.

    Preset

    Indicates the type of database and the configuration of the JDBC options. When you select a database from the Preset list, Sybase Search automatically displays the port, driver, and URL with common values for the type of database selected.

    To use a preset database:

    1. Complete the Name, Manager, and Member of fields for the database document store.

    2. Complete the Host, DB Name, Username, Password, and Port fields for the JDBC connection details.

    3. Select a preset. The port, driver, and URL fields display the corresponding default values.

    4. Click the Translate URL placeholders link to replace the URL template placeholders with the correct values.

    If you do not select a preset database from the list, enter appropriate driver and URL values.

    Note: Inclusion of a database driver in the Presets list does not mean the driver is available to the system. Make sure that the correct driver is available to the selected document store manager.

    Port

    Indicates the database server listener port. If you select a database from the Preset list, this field is populated automatically.

    Driver

    Indicates the full class name of the JDBC driver. If you select a database from the Preset list, this field is populated automatically.

    URL

    Indicates the JDBC URL to use to contact the database. If you select a database from the Preset list, this field is populated automatically.

    SQL Query

    Indicates the SQL statement designed to import documents from a database. See “Constructing an import SQL statement”.

    Document reference

    Class

    Signifies the Java class type that should be used by Sybase Search internally to store the DOC_REF SQL datatype. The document reference class is automatically determined the first time data is extracted from the database, and it cannot be changed.

    Length

    Identifies the document reference length, which is used only for java.lang.String document reference types (the lengths of other types are implicit). In most cases, the value in this field should be the same as the VARCHAR column width from which the document references are being extracted. If the document reference is not a string, this value is ignored.

    Store Indexed Text

    Indicates whether the raw text from each document is stored within the document store. By default, this option is selected. If you clear the option, the search results page does not include the View Text link for each result, because there is no cached text to display.

    Index now

    Indicates whether to proceed with indexing immediately or to save the configuration without indexing. See “Indexing document stores”.

  5. Click Create.

    The Document Stores Summary page shows the details of the document store. An indexing summary is also listed, and, if the store is being indexed, the current indexing session information appears. See “Indexing document stores”.
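The Translate URL placeholders link used with presets in step 4 substitutes the connection details into a URL template. The Python sketch below illustrates the idea only. The {host}, {port}, and {dbname} placeholder syntax is an assumption made for this example, not the syntax Sybase Search uses, and translate_url is a hypothetical helper. The template follows the jConnect URL form jdbc:sybase:Tds:host:port/database, and the host and database names are made up.

```python
def translate_url(template, host, port, db_name):
    """Substitute connection details into a JDBC URL template."""
    return template.format(host=host, port=port, dbname=db_name)


# jConnect-style template; the {placeholder} syntax is assumed for illustration.
TEMPLATE = "jdbc:sybase:Tds:{host}:{port}/{dbname}"

url = translate_url(TEMPLATE, "dbserver.example.com", 5000, "documents")
print(url)  # jdbc:sybase:Tds:dbserver.example.com:5000/documents
```

Whatever the real placeholder syntax is, the resulting URL must match the format that the selected JDBC driver expects.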

Passive (Web) document stores

A passive document store represents a collection of documents imported into Sybase Search by an external process such as a Web robot. The Web robot manages the download of Web content from the Internet and intranets. The Web content is sent to a passive document store, where it is indexed and made available for searching. See “Web robots”.

Sybase Search also allows you to create custom processes for externally managed document stores. See “Customizing externally managed document stores”.