The ANSI SQL/OLAP extensions provide a number of additional aggregate functions that permit statistical analysis of numeric data. This support includes functions to compute variance, standard deviation, correlation, and linear regression.
The SQL/OLAP general set functions that take one argument include STDDEV, STDDEV_POP, STDDEV_SAMP, VARIANCE, VAR_POP, and VAR_SAMP.
<SIMPLE WINDOW AGGREGATE FUNCTION TYPE> ::= <BASIC AGGREGATE FUNCTION TYPE> | STDDEV | STDDEV_POP | STDDEV_SAMP | VARIANCE | VARIANCE_POP | VARIANCE_SAMP
STDDEV_POP – Computes the population standard deviation of the provided value expression evaluated for each row of the group or partition (if DISTINCT was specified, then each row that remains after duplicates have been eliminated), defined as the square root of the population variance.
STDDEV_SAMP – Computes the population standard deviation of the provided value expression evaluated for each row of the group or partition (if DISTINCT was specified, then each row that remains after duplicates have been eliminated), defined as the square root of the sample variance.
VAR_POP – Computes the population variance of value expression evaluated for each row of the group or partition (if DISTINCT was specified, then each row that remains after duplicates have been eliminated), defined as the sum of squares of the difference of value expression from the mean of value expression, divided by the number of rows (remaining) in the group or partition.
VAR_SAMP – Computes the sample variance of value expression evaluated for each row of the group or partition (if DISTINCT was specified, then each row that remains after duplicates have been eliminated), defined as the sum of squares of the difference of value expression, divided by one less than the number of rows (remaining) in the group or partition.
These functions, including STDDEV and VARIANCE, are true aggregate functions in that they can compute values for a partition of rows as determined by the query’s ORDER BY clause. As with other basic aggregate functions such as MAX or MIN, their computation ignores NULL values in the input. Also, regardless of the domain of the expression being analyzed, all variance and standard deviation computation uses IEEE double-precision floating point. If the input to any variance or standard deviation function is the empty set, then each function returns NULL as its result. If VAR_SAMP is computed for a single row, it returns NULL, while VAR_POP returns the value 0.