Categories

You can set up a category by defining a query and assigning a relevance threshold to it. You can also include metadata filtering in your category query. A category must have at least one search term or one metadata expression.

Sybase Search assigns a document to a category if the document's relevance percentage is equal to or greater than the threshold.

For example, a query that consists of search terms and a minimum document relevance creates a category of documents that are grouped by their relevance to search terms defined in the given query. The document relevance helps ensure that the documents in the category are valid matches.
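
The assignment rule can be sketched in a few lines of Python. This is a minimal illustration, not Sybase Search code: the `Category` class, its field names, and the `belongs_to` function are hypothetical, and relevance is assumed to be a percentage from 0 to 100.

```python
from dataclasses import dataclass, field


@dataclass
class Category:
    """Hypothetical stand-in for a category definition (not a Sybase Search API)."""
    name: str
    query_terms: str = ""            # natural-language query; may be empty
    metadata_expressions: list = field(default_factory=list)
    min_relevance: int = 50          # threshold as a percentage (assumed 0-100)

    def __post_init__(self):
        # A category needs at least one search term or one metadata expression.
        if not self.query_terms.strip() and not self.metadata_expressions:
            raise ValueError("category needs a search term or a metadata expression")


def belongs_to(category: Category, relevance_percent: float) -> bool:
    """A document joins the category when its relevance meets the threshold."""
    return relevance_percent >= category.min_relevance


football = Category("Football", query_terms="England World Cup football", min_relevance=60)
print(belongs_to(football, 60))   # True: equal to the threshold counts
print(belongs_to(football, 59))   # False: below the threshold
```

A relevance equal to the threshold is enough for membership, which is why the comparison uses `>=` rather than `>`.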

You can also use a category query that consists of only metadata, such as “File Type = HTML”. This creates a category that contains only HTML documents. You can either search within a category about a certain subject, such as “England World Cup football”, or simply use a category to filter search results, for example by searching within a category of HTML documents.

Another way to categorize documents is to base the category on the content of one or more training documents. With category training, Sybase Search extracts the most relevant content from the training documents and uses this information as a new internal query to generate matching documents. The “Train Category” feature is similar to the “Find Similar” feature, which uses only one document as the source document. With category training, relevant content can be extracted from more than one document, which helps ensure that the extracted content is relevant to the training documents. This method has the following benefits:

Creating a category using a base query

  1. Click Document Management.

  2. Click Categories.

  3. Click Create.

  4. In the Category Query Terms field, enter a natural-language query. The more information you provide, the more accurate your results are. See “Searching across documents”.

  5. In the Not Terms field, enter terms to indicate concepts dissimilar to those for which you are searching. See “Searching across documents”.

  6. Click the Details tab.

  7. Assign a name to distinguish this category from others.

  8. Enter text to further describe the category.

  9. Click the Document Groups tab.

  10. Select one or more document groups to restrict your search.

  11. Click the Metadata tab. To include metadata in the category:

    1. Select a metadata parameter from the metadata list. You can add as many as five metadata parameters to the category.

    2. Select an operator. All metadata types support the equal to (=) operator. The integer and date types also support the greater than or equal to (>=) and less than or equal to (<=) operators.

    3. Enter a value for the metadata parameter. Table 6-1 lists the predefined metadata parameters and types.

    4. If the metadata parameter contains a value that consists of more than one term, select the Within expression operator.

      When you set the operator to AND, every term must be present in the document metadata for the match to succeed. When you set the operator to OR, only one of the terms must be present in the document metadata for the match to succeed.

    5. If you have defined at least two metadata parameters, select the Across Expressions operator. When you set the operator to AND, both metadata parameters must succeed for the match to succeed. When you set the operator to OR, only one of the metadata parameters must succeed.

      See “Searching across documents”.

  12. Click the Result Options tab. To set up result options:

    1. From the Minimum document relevance list, select a percentage. The percentage you select defines the minimum relevance ranking that a document must score for it to be included in the category. Documents with scores lower than the percentage that you enter are not included.

    2. Select Score Unknown Terms to have the scoring algorithm consider terms that are unknown to the system, that is, terms that do not exist in any indexed document.

    3. Under Training Options, specify the number of results to display per page and the number of paragraphs to display for each document. Select Term Highlighting to highlight the query terms in the search results.

      Note: The fields under Training Options assist category training and have no effect on category creation. The values specified in these fields are not saved during category creation.

  13. Click Create. Sybase Search creates the category, assigns a unique system-generated numeric ID to it, and automatically adds documents that match the category criteria. The new category and the list of relevant documents appear on the View Category page.
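
The metadata operators described in step 11 can be read as follows. The sketch below is an illustration under stated assumptions, not the product's implementation: the function names, the metadata names `Title` and `Modified`, and the choice to treat a text value as a set of whitespace-separated, case-insensitive terms are all hypothetical.

```python
from datetime import date


def match_text(doc_value: str, wanted: str, within: str = "AND") -> bool:
    """Match a multi-term value against text metadata.

    within="AND": every term must be present; within="OR": one term is enough.
    Splitting on whitespace and ignoring case are assumptions of this sketch.
    """
    doc_terms = set(doc_value.lower().split())
    hits = [term in doc_terms for term in wanted.lower().split()]
    return all(hits) if within == "AND" else any(hits)


def match_ordered(doc_value, operator: str, wanted) -> bool:
    """Integer and date metadata support >= and <= in addition to =."""
    if operator == "=":
        return doc_value == wanted
    if operator == ">=":
        return doc_value >= wanted
    if operator == "<=":
        return doc_value <= wanted
    raise ValueError(f"unsupported operator: {operator}")


def combine(results: list, across: str = "AND") -> bool:
    """Across Expressions: AND needs every expression to succeed, OR needs one."""
    return all(results) if across == "AND" else any(results)


# Made-up document metadata for illustration only.
doc = {"Title": "World Cup final report", "Modified": date(2006, 7, 9)}
title_ok = match_text(doc["Title"], "world cup", within="AND")
date_ok = match_ordered(doc["Modified"], ">=", date(2007, 1, 1))
print(title_ok, date_ok)                      # True False
print(combine([title_ok, date_ok], "AND"))    # False
print(combine([title_ok, date_ok], "OR"))     # True
```

The Within expression operator works on the terms of a single metadata value, while the Across Expressions operator combines the outcomes of whole metadata expressions, so the two are kept as separate functions here.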

Creating a category using training documents

  1. Perform steps 1 through 12 of “Creating a category using a base query”.

  2. Click Run Category Query. Each search result contains the Add to training documents link.

  3. From the search results, determine the training document that matches the information you are searching for and click Add to training documents. The name or title of the specified document appears in the Training documents box. You can add up to five training documents.

  4. Select Training Documents.

  5. Click Train Category. The search result displays documents that fall into the category, sorted by relevance.

  6. Click Create.
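
To give a rough feel for how several training documents can yield one internal query, the sketch below builds a query from the terms that the training documents share. How Sybase Search actually extracts relevant content is not described here, so simple document-frequency counting is swapped in as a stand-in. The function name, the stopword list, and the sample texts are all hypothetical.

```python
from collections import Counter
import re

STOPWORDS = {"the", "a", "an", "of", "and", "in", "to", "is", "for", "on"}  # tiny illustrative list
MAX_TRAINING_DOCS = 5  # the procedure allows up to five training documents


def build_training_query(training_docs: list, top_n: int = 5) -> list:
    """Return the terms that occur in the most training documents.

    Counting document frequency favours terms the training documents share,
    which is the intuition behind training on more than one document.
    This is a stand-in technique, not the product's algorithm.
    """
    if not 1 <= len(training_docs) <= MAX_TRAINING_DOCS:
        raise ValueError("expected between one and five training documents")
    doc_frequency = Counter()
    for text in training_docs:
        # Count each term once per document, ignoring case and stopwords.
        terms = set(re.findall(r"[a-z]+", text.lower())) - STOPWORDS
        doc_frequency.update(terms)
    # Sort by frequency (descending), then alphabetically for a stable result.
    ranked = sorted(doc_frequency.items(), key=lambda item: (-item[1], item[0]))
    return [term for term, _ in ranked[:top_n]]


docs = [
    "England won the World Cup in 1966",
    "The World Cup is a football tournament",
    "England football squad for the World Cup",
]
print(build_training_query(docs, top_n=4))   # ['cup', 'world', 'england', 'football']
```

Terms that appear in only one training document, such as "squad" or "tournament", rank lowest, which mirrors the idea that content extracted from several documents is more likely to be relevant to all of them.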

Editing a category

  1. From the Categories page, determine the category that you want to edit.

  2. Click Edit.

  3. Make the required changes. See “Creating a category using a base query”.

  4. Click Save.

Removing a category

  1. From the Categories page, determine the category that you want to remove.

  2. Click Remove.

  3. Click OK to confirm the removal.