ANSI outer joins  Nested ANSI outer joins

Chapter 4: Joins: Retrieving Data from Several Tables

Should the predicate be in the on or where clause?

The result set of an ANSI outer join depends on whether you place the restriction in the on or the where clause. The on clause defines the result set of a joined table and which rows of this joined table have null-supplied values; the where clause defines which rows of the joined table are included in the result set.

Whether you use an on or a where clause in your join condition depends on what you want your result set to include. The following examples may help you decide whether to place the predicate in the on or the where clause.

Predicate restrictions on an outer table

The following query places a restriction on the outer table in the where clause. Because the restriction is applied to the result of the outer join, it removes all the rows for which the condition is not true:

select title, titles.title_id, price, au_id
from titles left join titleauthor
on titles.title_id = titleauthor.title_id
where titles.price > $20.00
title               title_id price      au_id
------------------- -------- ---------- -----------------
But Is It User F... PC1035        22.95 238-95-7766
Computer Phobic ... PS1372        21.59 724-80-9391
Computer Phobic ... PS1372        21.59 756-30-7391
Onions, Leeks, a... TC3218        20.95 807-91-6654
(4 rows affected)

Four rows meet the criteria and only these are included in the result set.

However, if you move this restriction on the outer table to the on clause, the result set includes all the rows that meet the on clause condition. Rows from the outer table that do not meet the condition are null-extended:

select title, titles.title_id, price, au_id
from titles left join titleauthor
on titles.title_id = titleauthor.title_id
and titles.price > $20.00
title                  title_id   price   au_id       
--------------------   ---------  ------  ---------------
The Busy Executive’s   BU1032     19.99   NULL
Cooking with Compute   BU1111     11.95   NULL
You Can Combat Compu   BU2075      2.99   NULL
Straight Talk About    BU7832     19.99   NULL
Silicon Valley Gastro  MC2222     19.99   NULL
The Gourmet Microwave  MC3021      2.99   NULL
The Psychology of Com  MC3026      NULL   NULL
But Is It User Friend  PC1035     22.95   238-95-7766
Secrets of Silicon Va  PC8888     20.00   NULL
Net Etiquette          PC9999      NULL   NULL
Computer Phobic and    PS1372     21.59   724-80-9391
Computer Phobic and    PS1372     21.59   756-30-7391
Is Anger the Enemy?    PS2091     10.95   NULL
Life Without Fear      PS2106      7.00   NULL
Prolonged Data Depri   PS3333     19.99   NULL
Emotional Security:    PS7777      7.99   NULL
Onions, Leeks, and Ga  TC3218     20.95   807-91-6654
Fifty Years in Buckin  TC4203     11.95   NULL
Sushi, Anyone?         TC7777     14.99   NULL
(19 rows affected)

Moving the restriction to the on clause added 15 null-supplied rows to the result set.

Generally, if your query uses a restriction on an outer table, and you want the result set to remove only the rows for which the restriction is false, you should probably place the restriction in the where clause to limit the rows of the result set. Outer table predicates are not used for index keys if they are in the on clause.

Whether you place the restriction on an outer table in the on or where clause ultimately depends on the information you need the query to return. If you only want the result set to include only the rows for which the restriction is true, you should place the restriction in the where clause. However if you require the result set to include all the rows of the outer table, regardless of whether they satisfy the restriction, you should place the restriction in the on clause.

Restrictions on an inner table

The following query includes a restriction on an inner table in the where clause:

select title, titles.title_id, titles.price, au_id
from titleauthor left join titles
on titles.title_id = titleauthor.title_id
where titles.price > $20.00
title            title_id   price    au_id
-------------    --------   -----    -----------
But Is It U...   PC1035     22.95    238-95-7766
Computer Ph...   PS1372     21.59    724-80-9391
Computer Ph...   PS1372     21.59    756-30-7391
Onions, Lee...   TC3218     20.95    807-91-6654
(4 rows affected)

Because the restriction of the where clause is applied to the result set after the join is made, it removes all the rows for which the restriction is not true. Put another way, the where clause is not true for all null-supplied values and removes them. A join that places its restriction in the where clause is effectively an inner join.

However, if you move the restriction to the on clause, it is applied during the join and is utilized in the production of the joined table. In this case, the result set includes all the rows of the inner table for which the restriction is true, plus all the rows of the outer table, which are null-extended if they do not meet the restriction criteria:

select title, titles.title_id, price, au_id
from titleauthor left join titles
on titles.title_id = titleauthor.title_id
and price > $20.00
title      title_id   price           au_id
---------  ---------  -----------     -----------
NULL       NULL       NULL            172-32-1176
NULL       NULL       NULL            213-46-8915
. . .
Onions,    TC3218     20.95           807-91-6654
. . . 
NUL        NULL       NULL            998-72-3567
NULL       NULL       NULL            998-72-3567
(25 rows affected)

This result set includes 21 rows that the previous example did not include.

Generally, if your query requires a restriction on an inner table (for example “and price > $20.00” in query above), you probably want to place the condition in the on clause; this preserves the rows of the outer table. If you include a restriction for an inner table in the where clause, the result set might not include the rows of the outer table.

Like the criteria for the placement of a restriction on an outer table, whether you place the restriction for an inner table in the on or where clause ultimately depends on the result set you want. If you are interested only in the rows for which the restriction is true, and not including the full set of rows for the outer table, place the restriction in the where clause. However, if you require the result set to include all the rows of the outer table, regardless of whether they satisfy the restriction, you should place the restriction in the on clause.

Restrictions included in both an inner and outer table

The restriction in the where clause of the following query includes both the inner and outer tables:

select title, titles.title_id, price, price*qty, qty 
from salesdetail left join titles
on titles.title_id = salesdetail.title_id
where price*qty > $30000.00
title               title_id   price  qty
-----------------   --------   -----  ---------  -----
Silicon Valley Ga   MC2222     19.99  40,619.68  2032
But Is It User Fr   PC1035     22.95  45,900.00  2000
But Is It User Fr   PC1035     22.95  45,900.00  2000
But Is It User Fr   PC1035     22.95  49,067.10  2138
Secrets of Silico   PC8888     20.00  40,000.00  2000
Prolonged Data De   PS3333     19.99  53,713.13  2687
Fifty Years in Bu   TC4203     11.95  32,265.00  2700
Fifty Years in Bu   TC4203     11.95  41,825.00  3500
(8 rows affected)

Placing the restriction in the where clause eliminates the following rows from the result set:

To keep the unmatched rows of the outer table, move the restriction into the on clause, as in this example:

select title, titles.title_id, price, price*qty, qty 
from salesdetail left join titles
on titles.title_id = salesdetail.title_id
and price*qty > $30000.00
title               title_id   price  qty
-----------------   --------   -----  ---------  -----
NULL                NULL       NULL   NULL       75
NULL                NULL       NULL   NULL       75 
. . .
Secrets of Silico   PC8888     20.00  40,000.00  2000 
. . .
NULL                NULL       NULL   NULL       300
NULL                NULL       NULL   NULL       400
(116 rows affected)

This query retains all 116 rows of the salesdetail table in the result set, and null-extends the rows that do not meet the restriction.

Where you place the restriction that includes both the inner and outer table depends on the result set you want. If you are interested in only the rows for which the restriction is true, place the restriction in the where clause. However, if you want to include all the rows of the outer table, regardless of whether they satisfy the restriction, place the restriction in the on clause.





Copyright © 2005. Sybase Inc. All rights reserved. Nested ANSI outer joins

View this book as PDF