HashVectorAgOp operator

The HashVectorAgOp operator is a blocking operator. All rows from the child operator must be processed before the first row from the HashVectorAgOp operator can be returned to its parent operator. Other than this, the algorithm is similar to the HashDistinctOp operator’s algorithm.

Rows are fetched from the child operator. Each row is hashed on the query’s grouping columns. The bucket that is hashed is searched to see if the vector already exists.

If the group by values do not exist, the vector is added and the aggregate values are initialized using this first row. If the group by values do exist, the current row is aggregated to the existing values. This example collects a list of all cities with the number of authors that live in each city.

1> select city, total_authors = count(*)
2>     from authors
3>     group by city
4> go

QUERY PLAN FOR STATEMENT 1 (at line 3).

2 operator(s) under root

The type of query is SELECT.

ROOT:EMIT Operator

	|HASH VECTOR AGGREGATE Operator
	|  GROUP BY
	|  Evaluate Grouped COUNT AGGREGATE.
	| Using Worktable1 for internal storage.
	|
	|   |SCAN Operator
	|   |  FROM TABLE
	|   |  authors
	|   |  Table Scan.
	|   |  Forward Scan.
	|   |  Positioning at start of table.
	|   |  Using I/O Size 2 Kbytes for data pages.
	|   |  With LRU Buffer Replacement Strategy for data pages.

In this query plan, the HashVectorAgOp operator reads all of the rows from its child operator, which is scanning the authors table. Each row is checked to see if there is already an entry bucket entry for the current city value. If there is not, a hash entry row is added with the new city grouping value and the count result is initialized to 1. If there is already a hash entry for the new row’s city value, the aggregation function is applied. In this case, the count result is incriminated.

The showplan output prints a group by message specifically for the HashVectorAgOp operator, then prints the grouped aggregation messages:

 |  Evaluate Grouped COUNT AGGREGATE.

The showplan output reports what worktable will be used to store spilled groups and unprocessed rows:

 | Using Worktable1 for internal storage.