Customizing externally managed document stores

Importing documents into an externally managed document store involves two components: the import client and the import handler.

The import client is implemented as a concrete class: com.omniq.repository.indexing.DocumentIndexingSession

This class defines a set of methods that let you authenticate against, and perform updates on, a target externally managed document store.

The import handler is an HTTP handler that is configured in the Sybase Search container XML file. Every externally managed document store uses a single import handler. The default URL for the HTTP handler is:

http://hostname:port/em/indexer

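where hostname and port are the host name and HTTP port of the container that hosts the externally managed document store manager.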

Note: The externally managed (or passive) document store managers require HTTP handlers to be enabled. If the HTTP handlers are disabled on a container that hosts an externally managed document store manager, that manager cannot start any indexing sessions. Also, the Web robot requires an active externally managed document store manager to index its Web pages.

Table 5-3 describes the custom HTTP headers used for client and server communication.

Table 5-3: HTTP import format

Header name

Description

X-Import-Command

The following options are supported:

  • Begin – starts a new indexing session on the target externally managed document store.

  • Import – posts a new or updated document to the externally managed document store.

  • Remove – requests that a document be removed from the externally managed document store.

  • Abort – exits the indexing session.

  • End – ends the indexing session.

  • Ping – notifies the server that the client is still active.

  • Tell – notifies the server after the client has ended the indexing session; the server notifies the client whether the changes to the indexes are complete.

X-Import-Address

Specifies the address of the document store on which to begin an indexing session. This header is used in conjunction with the Begin command.

X-Import-Authenticate

Specifies either a user name and password pair, or a security session ID for the server to authenticate.

If a user name and password pair has been set explicitly, the authentication token has the format username:password, where both the user name and the password are UTF-8 encoded and then base64 encoded. If no user name and password pair has been set, a thread-local security session ID must be available, in which case the authentication token is a long integer value. This header is used in conjunction with the Begin command.

Note: If neither a user name and password pair nor a security session ID is available, an IllegalStateException is thrown.

X-Import-Session-ID

Specifies the ID of the indexing session. The server creates the ID token and returns it in the response to the Begin command. The client stores the value and sends it with every subsequent request it makes to the server.

For example: 123-10010_fewk8fba, where the ID is made up of the document store address and a unique identifier.

X-Import-Doc-Ref

Specifies the reference of the document that the client is attempting to import or remove.

X-Import-Response

Specifies the server’s response to the client request. This is an integer value that maps to an error code, where zero indicates success.
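The headers in Table 5-3 can be illustrated with a short sketch that builds, but does not send, a Begin request using the standard java.net.http API (Java 11 or later). This is not the import client's implementation. The class and method names are hypothetical. The use of POST for Begin, the choice to base64-encode the user name and password separately before joining them with a colon, and the split of the session ID at the first underscore are all assumptions read from the descriptions above.

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical helper class; not part of the Sybase Search API.
class ImportHeaderSketch {

    // Encodes one credential part as UTF-8 bytes, then as base64 text.
    static String encodePart(String value) {
        return Base64.getEncoder()
                .encodeToString(value.getBytes(StandardCharsets.UTF_8));
    }

    // Builds an X-Import-Authenticate value.
    // ASSUMPTION: each part is encoded separately and joined with ':'.
    static String authToken(String user, char[] password, Long securitySessionId) {
        if (user != null && password != null) {
            return encodePart(user) + ":" + encodePart(new String(password));
        }
        if (securitySessionId != null) {
            // Fall back to the security session ID, sent as a long integer.
            return Long.toString(securitySessionId);
        }
        throw new IllegalStateException(
                "Neither credentials nor a security session ID is available");
    }

    // Builds a Begin request carrying the three headers used with Begin.
    // ASSUMPTION: the handler accepts POST with an empty body for Begin.
    static HttpRequest beginRequest(URI handler, String storeAddress, String token) {
        return HttpRequest.newBuilder(handler)
                .header("X-Import-Command", "Begin")
                .header("X-Import-Address", storeAddress)
                .header("X-Import-Authenticate", token)
                .POST(HttpRequest.BodyPublishers.noBody())
                .build();
    }

    // Splits a session ID into { store address, unique identifier }.
    // ASSUMPTION: the first underscore separates the two parts.
    static String[] splitSessionId(String sessionId) {
        int sep = sessionId.indexOf('_');
        if (sep < 0) {
            throw new IllegalArgumentException("Unexpected session ID: " + sessionId);
        }
        return new String[] {
            sessionId.substring(0, sep), sessionId.substring(sep + 1)
        };
    }

    // Interprets an X-Import-Response value: zero means success.
    static boolean isSuccess(String responseValue) {
        return Integer.parseInt(responseValue.trim()) == 0;
    }

    public static void main(String[] args) {
        String token = authToken("robin", "sherwood".toCharArray(), null);
        HttpRequest begin = beginRequest(
                URI.create("http://localhost:7701/em/indexer"), "122-10010", token);
        System.out.println(begin.headers().map());

        String[] parts = splitSessionId("123-10010_fewk8fba");
        System.out.println(parts[0] + " / " + parts[1]);
        System.out.println(isSuccess("0"));
    }
}
```

Every request after Begin would carry the stored X-Import-Session-ID value in the same way, and Import and Remove requests would add X-Import-Doc-Ref.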

This sample code illustrates how third-party applications can import documents into an externally managed document store using the import API.

// Connect to the import handler of the target container.
URL url = new URL("http://localhost:7701/em/indexer");
IndexingSessionConfig cfg = new IndexingSessionConfig(url);
DocumentIndexingSession session = new DocumentIndexingSession(cfg);
session.setAuthDetails("robin", "sherwood".toCharArray());

// Begin an indexing session on document store 122-10010.
IndexingActionResult result = session.begin("122-10010");
if (result.isSuccessful()) {
    String docRef = "1";
    FastMap metadata = // acquire metadata
    Reader content = // acquire content
    result = session.indexDocument(docRef, metadata, content);
    if (result.isSuccessful()) {
        System.out.println("Document indexed OK");
        session.end();
        // Wait until the indexed document becomes live.
        while (!session.awaitCompletion(1).isSuccessful()) {
            // keep waiting
        }
        System.out.println("Document is now live");
    }
    // else handle error
}
// else handle error
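The sample waits in an unbounded loop, so a client could wait forever if the server never reports completion. A minimal sketch of a bounded wait follows. The BoundedWaitSketch class and its waitUntil method are hypothetical and not part of the import API, and the check is passed in as a BooleanSupplier so that the sketch runs without a server.

```java
import java.time.Duration;
import java.util.function.BooleanSupplier;

// Hypothetical helper class; not part of the Sybase Search API.
class BoundedWaitSketch {

    // Polls the check until it returns true or the timeout elapses.
    // Returns true on success, false on timeout or interruption.
    static boolean waitUntil(BooleanSupplier check, Duration timeout, Duration pause) {
        long deadline = System.nanoTime() + timeout.toNanos();
        while (true) {
            if (check.getAsBoolean()) {
                return true;
            }
            if (System.nanoTime() - deadline >= 0) {
                return false;
            }
            try {
                Thread.sleep(pause.toMillis());
            } catch (InterruptedException e) {
                // Restore the interrupt flag and stop waiting.
                Thread.currentThread().interrupt();
                return false;
            }
        }
    }

    public static void main(String[] args) {
        // Stand-in check that succeeds on the third call.
        int[] calls = {0};
        boolean live = waitUntil(() -> ++calls[0] >= 3,
                Duration.ofSeconds(5), Duration.ofMillis(10));
        System.out.println("live=" + live + " after " + calls[0] + " checks");
    }
}
```

In the sample above, the check would be () -> session.awaitCompletion(1).isSuccessful(). If awaitCompletion itself blocks for its argument's duration, which the sample does not state, the extra pause could be reduced to zero.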