SELECT statement

Description

Retrieves information from the database.

Syntax

SELECT [ ALL | DISTINCT ] [ FIRST | TOP number-of-rows ] select-list
... [ INTO { host-variable-list | variable-list | table-name } ]
... [ FROM table-list ]
... [ WHERE search-condition ]
... [ GROUP BY [ expression [,...]
      | ROLLUP ( expression [,...] )
      | CUBE ( expression [,...] ) ] ] 
... [ HAVING search-condition ]
... [ ORDER BY { expression | integer } [ ASC | DESC ] [, ...] ]

Parameters

select-list:: { column-name | expression [ [ AS ] alias-name ] | * }

Examples

Example 1

List all the tables and views in the system catalog.

SELECT tname
FROM SYS.SYSCATALOG
WHERE tname LIKE 'SYS%' ;

Example 2

List all customers and the total value of their orders.

SELECT company_name,
	CAST( sum(sales_order_items.quantity *
	product.unit_price) AS INTEGER) VALUE
FROM customer
	LEFT OUTER JOIN sales_order
	LEFT OUTER JOIN sales_order_items
	LEFT OUTER JOIN product
GROUP BY company_name
ORDER BY VALUE DESC

Example 3

How many employees are there?

SELECT count(*)
FROM Employee ;

Example 4

The following statement shows an Embedded SQL SELECT statement:

SELECT count(*) INTO :size FROM employee

Example 5

List the total sales by year, model, and color.

SELECT year, model, color, sum(sales) FROM sales_tab GROUP BY ROLLUP (year, model, color);

Example 6

Select all items with a certain discount into a temporary table:

SELECT * INTO #TableTemp FROM lineitem WHERE l_discount < 0.5

Usage

The SELECT statement is used for retrieving results from the database.

A SELECT statement can be used in DBISQL to browse data in the database or to export data from the database to an external file.

A SELECT statement can also be used in procedures or in Embedded SQL. The SELECT statement with an INTO clause is used for retrieving results from the database when the SELECT statement only returns one row. (Tables created with SELECT INTO do not inherit IDENTITY/AUTOINCREMENT tables.) For multiple-row queries, you must use cursors.When you select more than one column and do not use #table, SELECT INTO creates a permanent base table. SELECT INTO #table always creates a temporary table regardless of the number of columns. SELECT INTO table with a single column selects into a host variable.

Tables with the same name but different owners require aliases. A query like the following returns incorrect results:

SELECT * FROM user1.t1
WHERE NOT EXISTS
(SELECT *
FROM user2.t1
WHERE user2.t1.col1 = user1.t.col1);

For correct results, use an alias for each table, as follows:

SELECT * FROM user1.t1 U1
WHERE NOT EXISTS
(SELECT *
FROM user2.t1 U2
WHERE U2.col1 = U1.col1);

The INTO clause with a variable-list is used in procedures only.

A SELECT statement can also be used to return a result set from a procedure. The various parts of the SELECT statement are described below:

ALL or DISTINCT If Neither ALL nor DISTINCT is specified, ALL rows that satisfy the clauses of the SELECT statement are retrieved. If DISTINCT is specified, duplicate output rows are eliminated. This is called the projection of the result of the statement. In many cases, statements take significantly longer to execute when DISTINCT is specified. Thus, the use of DISTINCT should be reserved for cases where it is necessary.

If DISTINCT is used, the statement cannot contain an aggregate function with a DISTINCT parameter.

FIRST or TOP number-of-rows Specifies the number of rows returned from a query. FIRST returns the first row selected from the query. TOP returns the specified number of rows from the query, where number-of-rows is in the range 1 - 32767, and can be an integer constant or integer variable.

FIRST and TOP are used primarily with the ORDER BY clause. If these keywords are used without an ORDER BY clause, then the result may vary from run to run of the same query, as the optimizer may choose a different query plan.

FIRST and TOP are permitted only in the top level SELECT of a query, so they cannot be used in derived tables or view definitions. Using FIRST or TOP in a view definition may result in the keyword being ignored when a query is run on the view.

Using FIRST is the same as setting the ROW_COUNT database option to 1. Using TOP is the same as setting the ROW_COUNT option to the same number of rows, except that the maximum number of rows returned for TOP is 32767. ROW_COUNT does not have an upper limit for the number of rows returned. If both TOP and ROW_COUNT are set, then the value of TOP takes precedence.

Specifying FIRST or TOP in the SELECT statement has exactly the same effect as setting the ROW_COUNT option, as long as you can limit the number of rows returned to 32767. If you need the query to return more than 32K rows, then use ROW_COUNT. For more information on the ROW_COUNT database option, see “ROW_COUNT option”.

select-list The select-list is a list of expressions separated by commas specifying what will be retrieved from the database. If asterisk (*) is specified, it is expanded to select all columns of all tables in the FROM clause (table-name all columns of the named table). Aggregate functions and analytical functions are allowed in the select-list (see Chapter 5, “SQL Functions”).

In Sybase IQ, scalar subqueries (nested selects) are allowed in the select list of the top level SELECT, as in Adaptive Server Anywhere and Adaptive Server Enterprise. Subqueries cannot be used inside a conditional value expression (for example, in a CASE statement).

In Sybase IQ, subqueries can also be used in a WHERE or HAVING clause predicate (one of the supported predicate types). However, inside the WHERE or HAVING clause, subqueries cannot be used inside a value expression or inside a BETWEEN, CONTAINS, or LIKE predicate. Subqueries are not allowed in the ON clause of outer joins or in the GROUP BY clause.

For more details on the use of subqueries, see “Subqueries in expressions” and “Subqueries in search conditions”.

alias-names can be used throughout the query to represent the aliased expression. Alias names are also displayed by DBISQL at the top of each column of output from the SELECT statement. If the optional alias-name is not specified after an expression, DBISQL will display the expression. You cannot use the same name or expression for a column alias as the column name; IQ prevents this usage because it would be a recursive reference.

INTO host-variable-list This clause is used in Embedded SQL only. It specifies where the results of the SELECT statement will go. There must be one host-variable item for each item in the select-list. Select list items are put into the host variables in order. An indicator host variable is also allowed with each host-variable so the program can tell if the select list item was NULL.

INTO variable-list This clause is used in procedures only. It specifies where the results of the SELECT statement will go. There must be one variable for each item in the select list. Select list items are put into the variables in order.

INTO table-name This clause is used to create a table and fill it with data.

If the table name starts with # then the table is created as a temporary table. Otherwise, the table is created as a permanent base table. For permanent tables to be created, the query must satisfy the following conditions:

The select-list contains more than one item, and the INTO target is a single table-name identifier, or
The select-list contains a * and the INTO target is specified as owner.table.

To create a permanent table with one column, the table name must be specified as owner.table. Omit the owner specification for a temporary table.

This statement causes a COMMIT before execution as a side effect of creating the table. RESOURCE authority is required to execute this statement. No permissions are granted on the new table: the statement is a short form for CREATE TABLE followed by INSERT... SELECT.

Tables created using this statement do not have a primary key defined. You can add a primary key using ALTER TABLE. A primary key should be added before applying any UPDATEs or DELETEs to the table; otherwise, these operations result in all column values being logged in the transaction log for the affected rows.

Use of this clause is restricted to queries that are valid Adaptive Server Anywhere queries. Sybase IQ extensions are not supported.

FROM table-list Rows are retrieved from the tables and views specified in the table-list. Joins can be specified using join operators. For more information, see FROM clause. A SELECT statement with no FROM clause can be used to display the values of expressions not derived from tables. For example:

SELECT @@version

displays the value of the global variable @@version. This is equivalent to:

SELECT @@version
FROM DUMMY

If you omit the FROM clause, or if all tables in the query are in the SYSTEM dbspace, the query is processed by Adaptive Server Anywhere instead of Sybase IQ and may behave differently, especially with respect to syntactic and semantic restrictions and the effects of option settings. See the Adaptive Server Anywhere documentation for rules that may apply to processing.

If you have a query that does not require a FROM clause, you can force the query to be processed by Sybase IQ by adding the clause “FROM iq_dummy,” where iq_dummy is a one row, one column table that you create in your database.

WHERE search-condition This clause specifies which rows will be selected from the tables named in the FROM clause. It is also used to do joins between multiple tables. This is accomplished by putting a condition in the WHERE clause that relates a column or group of columns from one table with a column or group of columns from another table. Both tables must be listed in the FROM clause.

The use of the same CASE statement in both the SELECT and the WHERE clause of a grouped query is not allowed.

See “Search conditions” for a full description.

GROUP BY clause You can group by columns or alias names or functions. GROUP BY expressions must also appear in the select list. The result of the query contains one row for each distinct set of values in the named columns, aliases, or functions. The resulting rows are often referred to as groups since there is one row in the result for each group of rows from the table list. For the sake of GROUP BY, all NULL values are treated as identical. Aggregate functions can then be applied to these groups to get meaningful results.

The GROUP BY expression must contain more than a single constant. You do not need to add constants to the GROUP BY clause in order to select the constants in grouped queries. An error is returned and the query is rejected, if the GROUP BY expression contains only a single constant.

When GROUP BY is used, the select list, HAVING clause, and ORDER BY clause cannot reference any identifiers except those named in the GROUP BY clause. The following exception applies: The select-list and HAVING clause may contain aggregate functions.

You can use GROUP BY to calculate results or display a column or expression that does not appear in the select list. The GROUP BY clause displays all groups, even those excluded from calculations by a WHERE clause.

The following extensions to ANSI standards GROUP BY are supported in order to make Adaptive Server Enterprise queries run easily with Sybase IQ.

A select list that includes aggregates can include extended columns that are not arguments of aggregate functions and are not included in the GROUP BY clause.
The GROUP BY ALL clause displays all groups, even those excluded from calculations by a WHERE clause.
The HAVING clause can include columns or expressions that are not in the select list and not in the group by clause
An item in the ORDER BY clause could also contain extended columns as long as that item is either in the select list or the GROUP BY clause.

Some exceptions apply to this support:

If any extended column is from a view or derived table that is base on a query with group by, distinct or aggregation syntax, IQ will reject the query.
If there are two or more base tables involved in the extended group by query, IQ rejects the query. This means IQ only supports the extended group by clause for single table reference.
ASE does not support any OLAP function, so if there is any OLAP function in the extended group by clause, IQ rejects the query.

See the Examples in this section.

ROLLUP operator The ROLLUP operator in the GROUP BY clause allows you to analyze subtotals using different levels of detail. It creates subtotals that “roll up” from a very detailed level to a grand total.

The ROLLUP operator requires an ordered list of grouping expressions to be supplied as arguments. ROLLUP first calculates the standard aggregate values specified in the GROUP BY. Then ROLLUP moves from right to left through the list of grouping columns and creates progressively higher-level subtotals. A grand total is created at the end. If n is the number of grouping columns, then ROLLUP creates n+1 levels of subtotals.

Restrictions on the ROLLUP operator are:

The ROLLUP operator supports all of the aggregate functions available to the GROUP BY clause, but ROLLUP does not currently support COUNT DISTINCT and SUM DISTINCT.
ROLLUP can only be used in the SELECT statement; you cannot use ROLLUP in a SELECT subquery.
A multiple grouping specification that combines ROLLUP, CUBE, and GROUP BY columns in the same GROUP BY clause is not currently supported.
Constant expressions as GROUP BY keys are not supported.

GROUPING is used with the ROLLUP operator to distinguish between stored NULL values and “NULL” values in query results created by ROLLUP.

ROLLUP syntax:

SELECT ... [ GROUPING (column-name) ... ] ...
GROUP BY [ expression [,...]
| ROLLUP ( expression [,...] ) ]

See the section “Expressions” in Chapter 3, “SQL Language Elements” for the format of an operator expression.

GROUPING takes a column name as a parameter and returns a Boolean value as listed in Table 6-12.

**Table 6-12: Values returned by GROUPING with the ROLLUP operator**
If the value of the result is	GROUPING returns
NULL created by a ROLLUP operation	1 (TRUE)
NULL indicating the row is a subtotal	1 (TRUE)
not created by a ROLLUP operation	0 (FALSE)
a stored NULL	0 (FALSE)

ROLLUP example 1 The following example illustrates the use of ROLLUP and GROUPING and displays a set of mask columns created by GROUPING. The digits 0 and 1 displayed in columns S, N, and C are the values returned by GROUPING to represent the value of the ROLLUP result. A program can analyze the results of this query by using a mask of “011” to identify subtotal rows and “111” to identify the row of overall totals.

select size, name, color, SUM(quantity),
grouping (size) as S,
grouping (name) as N,
grouping (color) as C
from product
group by rollup (size, name, color)
having (S=1 or N=1 or C=1)
order by size, name, color;

size	name	color	SUM(product_quantity)	S	N	C
Large	Sweatshirt	(NULL)	71	0	0	1
Large	(NULL)	(NULL)	71	0	1	1
Medium	Shorts	(NULL)	80	0	0	1
Medium	Tee Shirt	(NULL)	54	0	0	1
Medium	(NULL)	(NULL)	134	0	1	1
One size fits all	Baseball Cap	(NULL)	124	0	0	1
One size fits all	Tee Shirt	(NULL)	75	0	0	1
One size fits all	Visor	(NULL)	64	0	0	1
One size fits all	(NULL)	(NULL)	263	0	1	1
Small	Tee Shirt	(NULL)	28	0	0	1
Small	(NULL)	(NULL)	28	0	1	1
(NULL)	(NULL)	(NULL)	496	1	1	1

ROLLUP example 2 The following example illustrates the use of GROUPING to distinguish stored NULL values and “NULL” values created by the ROLLUP operation. GROUPING checks the result of the ROLLUP and returns 0 for a stored NULL or 1 for a “NULL” value created by ROLLUP. Stored NULL values are then displayed as [NULL] in column prod_id, and “NULL” values created by ROLLUP are replaced with ALL in column PROD_IDS, as specified in the query.

select ship_date, prod_id, SUM(quantity),
case grouping (prod_id) when 1 then 'ALL' else CAST (prod_id as VARCHAR(15) END as PROD_IDS,
case grouping (ship_date) when 1 then 'ALL' else CAST (ship_date as VARCHAR(20)) END as SHIP_DATE
from sales_order_items
group by rollup (ship_date, prod_id)
having SUM(quantity) > 36 ORDER BY ship_date, prod_id;

ship_date	prod_id	SUM (sales_order_items. quantity)	PROD_IDS	SHIP_DATES
2004-01-03 00:00:00.000	(NULL)	60	ALL	2004-01-03 00:00:00.000
2004-01-09 00:00:00.000	400	48	400	2004-01-09 00:00:00.000
2004-01-09 00:00:00.000	401	48	401	2004-01-09 00:00:00.000
2004-01-09 00:00:00.000	(NULL)	96	ALL	2004-01-09 00:00:00.000
2004-01-13 00:00:00.000	(NULL)	72	ALL	2004-01-13 00:00:00.000
2004-01-14 00:00:00.000	500	60	500	2004-01-14 00:00:00.000
2004-01-14 00:00:00.000	501	60	501	2004-01-14 00:00:00.000
2004-01-14 00:00:00.000	(NULL)	120	ALL	2004-01-14 00:00:00.000
2004-01-15 00:00:00.000	500	60	500	2004-01-15 00:00:00.000
2004-01-15 00:00:00.000	(NULL)	72	ALL	2004-01-15 00:00:00.000
2004-01-17 00:00:00.000	(NULL)	72	ALL	2004-01-17 00:00:00.000
2004-01-20 00:00:00.000	700	60	700	2004-01-20 00:00:00.000
2004-01-20 00:00:00.000	(NULL)	84	ALL	2004-01-20 00:00:00.000
2004-01-21 00:00:00.000	(NULL)	120	ALL	2004-01-21 00:00:00.000
2004-01-23 00:00:00.000	(NULL)	48	ALL	2004-01-23 00:00:00.000
2004-01-24 00:00:00.000	600	48	600	2004-01-24 00:00:00.000
2004-01-24 00:00:00.000	601	48	601	2004-01-24 00:00:00.000
2004-01-24 00:00:00.000	(NULL)	120	ALL	2004-01-24 00:00:00.000
2004-01-28 00:00:00.000	(NULL)	84	ALL	2004-01-28 00:00:00.000
2004-01-29 00:00:00.000	(NULL)	168	ALL	2004-01-29 00:00:00.000

CUBE operator The CUBE operator in the GROUP BY clause analyzes data by forming the data into groups in more than one dimension. CUBE requires an ordered list of grouping expressions (dimensions) as arguments and enables the SELECT statement to calculate subtotals for all possible combinations of the group of dimensions.

Restrictions on the CUBE operator are:

The CUBE operator supports all of the aggregate functions available to the GROUP BY clause, but CUBE does not currently support COUNT DISTINCT and SUM DISTINCT.
CUBE can only be used in the SELECT statement; you cannot use CUBE in a SELECT subquery.
A multiple GROUPING specification that combines ROLLUP, CUBE, and GROUP BY columns in the same GROUP BY clause is not currently supported.
Constant expressions as GROUP BY keys are not supported.

GROUPING is used with the CUBE operator to distinguish between stored NULL values and “NULL” values in query results created by CUBE.

CUBE syntax:

SELECT ... [ GROUPING (column-name) ... ] ...
GROUP BY [ expression [,...]
| CUBE ( expression [,...] ) ]

GROUPING takes a column name as a parameter and returns a Boolean value as listed in Table 6-13.

**Table 6-13: Values returned by GROUPING with the CUBE operator**
If the value of the result is	GROUPING returns
NULL created by a CUBE operation	1 (TRUE)
NULL indicating the row is a subtotal	1 (TRUE)
not created by a CUBE operation	0 (FALSE)
a stored NULL	0 (FALSE)

See the examples in the description of the ROLLUP operator for illustrations of the use of the GROUPING function to interpret results.

When generating a query plan, the IQ optimizer estimates the total number of groups generated by the GROUP BY CUBE hash operation. The MAX_CUBE_RESULTS database option sets an upper boundary for the number of estimated rows the optimizer will consider for a hash algorithm that can be run. If the actual number of rows exceeds the MAX_CUBE_RESULT option value, the optimizer stops processing the query and returns the error message “Estimate number: nnn exceed the DEFAULT_MAX_CUBE_RESULT of GROUP BY CUBE or ROLLUP”, where nnn is the number estimated by the IQ optimizer. See the section “MAX_CUBE_RESULT option” in Chapter 2, “Database Options” for information on setting the MAX_CUBE_RESULT option.

CUBE example

The following queries use data from a census, including the state (geographic location), gender, education level, and income of people. The first query contains a GROUP BY clause that organizes the results of the query into groups of rows, according to the values of the columns state, gender, and education in the table census and computes the average income and the total counts of each group. This query uses only the GROUP BY clause without the CUBE operator to group the rows.

SELECT state, gender, education, COUNT(*),
CAST (ROUND ( AVG ( income ), 2 ) AS NUMERIC (18,2))
FROM census
GROUP BY state, gender, education;

The results of this query are:

state	gender	education	count(*)	avg income
MA	f	BA	3	48333.33
MA	f	HS	2	40000.00
MA	f	MS	1	45000.00
MA	m	BA	4	55000.00
MA	m	HS	1	55000.00
MA	m	MS	3	85000.00
NH	f	HS	2	50000.00
NH	f	MS	1	85000.00
NH	m	BA	3	55000.00
NH	m	MS	1	49000.00

Use the CUBE extension of the GROUP BY clause, if you want to compute the average income in the entire census of state, gender, and education and compute the average income in all possible combinations of the columns state, gender, and education, while making only a single pass through the census data. For example, use the CUBE operator if you want to compute the average income of all females in all states, or compute the average income of all people in the census according to their education and geographic location.

When CUBE calculates a group, CUBE puts a NULL value in the column(s) whose group is calculated. The distinction is difficult between the type of group each row represents and whether the NULL is a NULL stored in the database or a NULL resulting from CUBE. The GROUPING function solves this problem by returning 1, if the designated column has been merged to a higher level group.

The following query illustrates the use of the GROUPING function with GROUP BY CUBE.

SELECT
CASE GROUPING ( state ) WHEN 1 THEN 'ALL' ELSE state END AS c_state,
CASE GROUPING ( gender ) WHEN 1 THEN 'ALL' ELSE gender END AS c_gender,
CASE GROUPING ( education ) WHEN 1 THEN 'ALL' ELSE education END AS c_education,
COUNT(*), CAST (ROUND ( AVG ( income ), 2 ) AS NUMERIC (18,2)) AS average
FROM census
GROUP BY CUBE (state, gender, education);

The results of this query are shown below. Note that the NULLs generated by CUBE to indicate a subtotal row are replaced with ALL in the subtotal rows, as specified in the query.

c_state	c_gender	c_education	count(*)	average
MA	f	BA	3	48333.33
MA	f	HS	2	40000.00
MA	f	MS	1	45000.00
MA	f	ALL	6	45000.00
MA	m	BA	4	55000.00
MA	m	HS	1	55000.00
MA	m	MS	3	85000.00
MA	m	ALL	8	66250.00
MA	ALL	ALL	14	57142.86
NH	f	HS	2	50000.00
NH	f	MS	1	85000.00
NH	f	ALL	3	61666.67
NH	m	BA	3	55000.00
NH	m	MS	1	49000.00
NH	m	ALL	4	53500.00
NH	ALL	ALL	7	57000.00
ALL	ALL	ALL	21	57095.24
ALL	ALL	BA	10	53000.00
ALL	ALL	MS	6	72333.33
ALL	ALL	HS	5	47000.00
ALL	f	ALL	9	50555.56
ALL	m	ALL	12	62000.00
ALL	f	BA	3	48333.33
ALL	m	HS	1	55000.00
ALL	m	MS	4	76000.00
ALL	m	BA	7	55000.00
ALL	f	MS	2	65000.00
ALL	f	HS	4	45000.00
NH	ALL	HS	2	50000.00
NH	ALL	MS	2	67000.00
MA	ALL	MS	4	75000.00
MA	ALL	HS	3	45000.00
MA	ALL	BA	7	52142.86
NH	ALL	BA	3	55000.00

HAVING search-condition based on the group values and not on the individual row values. The HAVING clause can only be used if either the statement has a GROUP BY clause or if the select list consists solely of aggregate functions. Any column names referenced in the HAVING clause must either be in the GROUP BY clause or be used as a parameter to an aggregate function in the HAVING clause.

ORDER BY clause Orders the results of a query. Each item in the ORDER BY list can be labeled as ASC for ascending order or DESC for descending order. Ascending is assumed if neither is specified. If the expression is an integer n, then the query results will be sorted by the nth item in the select list.

In Embedded SQL, the SELECT statement is used for retrieving results from the database and placing the values into host variables via the INTO clause. The SELECT statement must return only one row. For multiple row queries, you must use cursors.

You cannot include a java class in the SELECT list, but you can, for example, create a function or variable that acts as a wrapper for the Java class and then select it.

Side effects

None.

Standards

SQL/92 Entry level feature. check clauses.

Sybase Supported by Adaptive Server Enterprise, with some differences in syntax.

Permissions

Must have SELECT permission on the named tables and views.