The MERGE
JOIN
operator is a binary operator. The left
and right children are the outer and inner data streams, respectively.
Both data streams must be sorted on the MERGE
JOIN
’s key values.
First, a row from the outer stream is fetched. This initializes
the MERGE JOIN
’s join
key values. Then, rows from the inner stream are fetched until a
row with key values that match or are greater than (less than if
key column is descending) is encountered. If the join key matches,
the qualifying row is passed on for additional processing, and a
subsequent next call to the MERGE JOIN
operator
continues fetching from the currently active stream.
If the new values are greater than the current comparison key, these values are used as the new comparison join key while fetching rows from the other stream. This process continues until one of the data streams is exhausted.
Generally, the MERGE JOIN
strategy
is effective when a scan of the data streams requires that most
of the rows must be processed, and that, if any of the input streams
are large, they are already sorted on the join keys.
select ta.title_id from titleauthor ta, authors a where a.au_id = ta.au_id and au_lname = "Bloom" go QUERY PLAN FOR STATEMENT 1 (at line 2). STEP 1 The type of query is EXECUTE. Executing a newly cached statement. QUERY PLAN FOR STATEMENT 1 (at line 1). STEP 1 The type of query is SELECT. 3 operator(s) under root ROOT:EMIT Operator (VA = 3) |MERGE JOIN Operator (Join Type: Inner Join) | Using Worktable2 for internal storage. | Key Count: 1 | Key Ordering: ASC | | |SORT Operator | | Using Worktable1 for internal storage. | | | | |SCAN Operator | | | FROM TABLE | | | authors | | | a | | | Index : aunmind | | | Forward Scan. | | | Positioning by key. | | | Keys are: | | | au_lname ASC | | | Using I/O Size 2 Kbytes for index leaf pages. | | | With LRU Buffer Replacement Strategy for index leaf pages. | | | Using I/O Size 2 Kbytes for data pages. | | | With LRU Buffer Replacement Strategy for data pages. | | |SCAN Operator | | FROM TABLE | | titleauthor | | ta | | Index : auidind | | Forward Scan. | | Positioning at index start. | | Using I/O Size 2 Kbytes for index leaf pages. | | With LRU Buffer Replacement Strategy for index leaf pages. | | Using I/O Size 2 Kbytes for data pages. | | With LRU Buffer Replacement Strategy for data pages.
In this example, a sort operator is the left child, or outer
stream. The data source for the sort operator is the authors table.
The sort operator is required because the authors table
has no index on au_id that would otherwise
provide the necessary sorted order. A scan of the titleauthor table
is the right child/inner stream. The scan uses the auidind index,
which provides the necessary ordering for the MERGE
JOIN
strategy.
A row is fetched from the outer stream (the authors table is the original source) to establish an initial join key comparison value. Then rows are fetched from the titleauthor table until a row with a join key equal to or greater than the comparison key is found.
Inner stream rows with matching keys are stored in a cache
in case they need to be refetched. These rows are refetched when
the outer stream contains duplicate keys. When a titleauthor.au_id value
that is greater than the current join key comparison value is fetched, the MERGE
JOIN
operator starts fetching from the outer
stream until a join key value equal to or greater than the current titleauthor.au_id value
is found. The scan of the inner stream resumes at that point.
The MERGE JOIN
operator’s showplan output
contains a message indicating the worktable to be used for the inner
stream’s backing store. The worktable is written to if
the inner rows with duplicate join keys no longer fits in cached memory.
The width of a cached row is limited to 64 kilobytes.