Here are some suggestions for improving sort-merge join performance:
To reduce the size of worktables select only needed columns for tables used in merge joins. Avoid using select * unless you need all columns of the tables. This reduces the load on tempdb and the cost of sorting the result tables.
If you are concerned about possible performance impacts of merge joins or possible space problems in tempdb, see Chapter 29, “Introduction to Abstract Plans,” for a discussion of how abstract query plans can help determine which queries on your system use merge joins.
Look for opportunities for index covering. One example is queries where joins are in the form:
select t1.c3, t3.c4 from t1, t2, t3 wehre t1.c1 = t2.c1 and t2.c2 = t3.c2 and ...
and columns from t2 are not in the select list, or only the join columns are in the select list. An index on the join columns, t2(c1, c2) covers the query, allowing a merge join to avoid accessing the data pages of t2.
Merge joins can use indexes created in ascending or descending order when two tables are joined on multiple columns, such as these:
A.c1 = B.c1 and A.c2 = B.c2 and A.c3 = B.c3
The column order specified for the indexes must be an exact match, or exactly the reverse, for all columns to be used as join predicates when costing the join and accessing the data. If there is a mismatch of ordering in second or subsequent columns, only the matching columns are used for the join, and the remaining columns are used to restrict the results after the row has been retrieved. This table shows some examples for the query above:
Index creation order |
Clauses used as join predicates |
---|---|
A(c1 asc, c2 asc, c3 asc) B(c1 asc, c2 asc, c3 asc) |
All three clauses. |
A(c1 asc, c2 asc, c3 asc) B(c1 desc, c2 desc, c3 desc) |
All three clauses. |
A(c1 asc, c2 asc, c3 asc) B(c1 desc, c2 desc, c3 asc) |
The first two join clauses are used as join predicates and the third clause is evaluated as a restriction on the result. |
A1(c1 asc, c2 desc, c3 desc) B1(c1 desc, c2 desc, c3 asc) |
Only the first join clause is used as a join predicate. The remaining two clauses is evaluated as restrictions on the result set. |
Index key ordering is generally chosen to eliminate sort costs for order by queries. Using compatible ordering for frequently joined tables can also reduce join costs.