Adaptive Server generates a query plan containing the optimal join order for a multitable query without regard to the storage location of each table. If remote tables are represented in the query, Component Integration Services, which takes the storage location into account, performs additional optimization for the following conditions:
Join processing
Aggregate processing
In order to make intelligent evaluations of a query to improve performance in the above areas, statistics are required. These are obtained by executing the command update statistics for a specific table.
When updating statistics on a remote table, Component Integration Services intercepts the request and provides meaningful statistics for the remote table and all of its indexes (if any). The result of executing an update statistics command is a distribution statistics page stored in the database, for each index.
In Adaptive Server, data used to create this distribution page comes from local index pages. When you are updating statistics on a remote table, the data used to create the distribution statistics page comes from the keys used to make up the index on the remote table.
The Adaptive Server issues a query to the remote server to obtain all columns making up the index, sorted according to position within the index. For example, if table1 has an index made up of two columns, col1 and col2, then the query to that server is sent as follows when update statistics is executed:
select col1, col2 from table1 order by col1, col2
The results are then used to construct a distribution page in the format needed by the optimizer.
The detailed distribution statistics are used to determine optimal join order. This gives the server the ability to generate optimal queries against remote databases that may not support cost-based query optimization.
On large tables, update statistics can take a long time. To speed up the process, turn on trace flag 11209 before executing update statistics. This trace flag instructs update statistics to obtain only row counts on remote tables. The Adaptive Server query optimizer uses the row count information to make assumptions about the selectivity of a particular index. While these assumptions are not as complete as the full distribution statistics, they provide the minimal information needed to handle query optimization.
Component Integration Services remote location optimizer isolates join conditions represented in the query plan. For each remote server that is represented by two or more tables in the join, Component Integration Services modifies the query plan to appear as though a single virtual table is being processed for that server. Component Integration Services then forwards the join conditions to the remote server during query execution.
For example, if a query involves four tables, two that are located on the remote server SERVERA and two that are located on the remote server SERVERB, Component Integration Services processes the query as though it were a two-way join. The following query:
select * from A1, A2, B1, B2 where A1.id = A2.id and A2.id = B1.id and B1.id = B2 id
gets converted to:
select * from V1, V2 where V1.id = V2.id
V1 is the virtual table representing the results of the join between A1 and A2 (processed by SERVERA), and V2 is the virtual table representing the results of the join between B1 and B2 (processed by SERVERB). Since the Adaptive Server uses nested iteration (looping) to process inner tables of a join, the query is processed as follows:
open cursor on V1 fetch V1 row for each row in V1 open a cursor on V2 fetch V2 route results V1, V2 to client close cursor on V2
Component Integration Services optimizes queries containing ungrouped aggregate functions (min, max, sum, and count) by passing the aggregate to the remote server if the remote server is capable of performing the function.
For example, consider the following query on the remote table A1:
select count(*) from A1 where id > 100
The count(*) aggregate is forwarded to the remote server that owns A1.