The most compelling reason to implement eager aggregation is online data archiving, which is a distributed query processing (DQP) installation where recent OLTP read-write data is on an Adaptive Server and historical read-only data is on another server, either Adaptive Server or ASIQ.
The following view, v
,
offers decision support system (DSS) applications transparent access
to local Adaptive Server data in ase_tab and,
through the Component Integration Services (CIS) proxy_asiq_tab, to
remote historical data on an ASIQ server.
create view v(v1, v2) as select a1, a2 from ase_tab union all select q1, q2 from proxy_asiq_tab
The DSS applications ignore the distributed nature of the data and use such union-in-view tables as the base tables of their complex queries, typically using aggregation:
select t1, sum(v1) from t,v where t2=v2 group by t1
After view and union resolution, the following operator tree is obtained:
As this tree uses a CIS proxy table, the CIS layer uses a specialized remote scan operator to generate and ship a plan fragment to the remote site.
Figure 6-4: Suboptimal classical CIS behavior
As such, this mechanism is suboptimal: the entire history table is shipped through the CIS layer to the Adaptive Server side, incurring a large network cost; furthermove, the advanced ASIQ bitmap-based grouping algorithms are not used.
Figure 6-5: Classical processing of aggregation
Ideally, transformations are performed on the operators tree and grouping on the ASIQ side, so that only aggregated data is transferred.
Figure 6-6: Desired aggregation processing layout
In this example, there are two operators between the group and the CIS proxy: a join and a union. The next transform pushes grouping below the join and the union, achieving eager aggregation:
Grouping is now adjacent to the CIS proxy. The CIS layer can now send a grouped query to ASIQ and return aggregated data.
Figure 6-8: Optimal CIS behavior with eager aggregation
The efficient execution of complex aggregated DSS queries in a distributed environment is a challenge not only in online data archivings; it is, in general, a generic DSS/DQP problem in online data archiving DSS/DQP:
A typical query involves complex joins and unions, and aggregation is performed at the top of the query tree.
The data is distributed across the nodes, and the intermediate results must be shipped from a producer node to a consumer node for further processing.
Although the examples above are for DQP in general, and online data archiving in particular, the eager aggregation performance impact goes beyond shipping intermediate results between DQP nodes.
Eager aggregation enhances the performance of aggregated complex queries by reducing intermediate result sets. Since aggregated complex queries are typical, eager aggregation enhances Adaptive Server performance in all DSS applications.