WIDTH_BUCKET function [Numerical]

Function

For a given expression, the WIDTH_BUCKET function returns the bucket number that the result of this expression will be assigned after it is evaluated.

Syntax

WIDTH_BUCKET ( expression, min_value, max_value, num_buckets) 

Parameters

expression is the expression for which the histogram is being created. This expression must evaluate to a numeric or datetime value or to a value that can be implicitly converted to a numeric or datetime value. If expr evaluates to null, then the expression returns null.

min_value An expression that resolves to the end points of the acceptable range for expr. Must also evaluate to numeric or datetime values and cannot evaluate to null.

max_value An expression that resolves to the end points of the acceptable range for expr. Must also evaluate to numeric or datetime values and cannot evaluate to null.

num_buckets Is an expression that resolves to a constant indicating the number of buckets. This expression must evaluate to a positive integer.

Examples

The following example creates a ten-bucket histogram on the credit_limit column for customers in Massachusetts in the sample table and returns the bucket number (“Credit Group”) for each customer. Customers with credit limits greater than the maximum value are assigned to the overflow bucket, 11:

SELECT customer_id, cust_last_name, credit_limit,
  WIDTH_BUCKET(credit_limit, 100, 5000, 10) "Credit
  Group" 
  FROM customers WHERE territory = 'MA'
  ORDER BY "Credit Group";
CUSTOMER_ID  CUST_LAST_NAME  CREDIT_LIMIT  Credit Group
-----------  --------------  ------------  ------------
825          Dreyfuss                 500             1
826          Barkin                   500             1
853          Palin                    400             1
827          Siegel                   500             1
843          Oates                    700             2
844          Julius                   700             2
835          Eastwood                1200             3
840          Elliott                 1400             3
842          Stern                   1400             3
841          Boyer                   1400             3
837          Stanton                 1200             3
836          Berenger                1200             3
848          Olmos                   1800             4
849          Kaurusmdki              1800             4
828          Minnelli                2300             5
829          Hunter                  2300             5
852          Tanner                  2300             5
851          Brown                   2300             5
850          Finney                  2300             5
830          Dutt                    3500             7
831          Bel Geddes              3500             7
832          Spacek                  3500             7
838          Nicholson               3500             7
839          Johnson                 3500             7
833          Moranis                 3500             7
834          Idle                    3500             7
845          Fawcett                 5000            11
846          Brando                  5000            11
847          Streep                  5000            11

When the bounds are reversed, the buckets are open-closed intervals. For example: WIDTH_BUCKET (credit_limit, 5000, 0, 5). In this example, bucket number 1 is (4000, 5000], bucket number 2 is (3000, 4000], and bucket number 5 is (0, 1000]. The overflow bucket is numbered 0 (5000, +infinity), and the underflow bucket is numbered 6 (-infinity, 0].

Usage

You can generate equiwidth histograms with the WIDTH_BUCKET function. Equiwidth histograms divide data sets into buckets whose interval size (highest value to lowest value) is equal. The number of rows held by each bucket will vary. A related function, NTILE, creates equiheight buckets.

Equiwidth histograms can be generated only for numeric, date or datetime data types; therefore, the first three parameters should be all numeric expressions or all date expressions. Other types of expressions are not allowed. If the first parameter is NULL, the result is NULL. If the second or the third parameter is NULL, an error message is returned, as a NULL value cannot denote any end point (or any point) for a range in a date or numeric value dimension. The last parameter (number of buckets) should be a numeric expression that evaluates to a positive integer value; 0, NULL, or a negative value will result in an error.

Buckets are numbered from 0 to (n+1). Bucket 0 holds the count of values less than the minimum. Bucket(n+1) holds the count of values greater than or equal to the maximum specified value.

Standards and compatibility

See also

“NTILE function [Analytical]”, which creates equiheight histograms.

Chapter 4, “Using OLAP” in the Sybase IQ Performance and Tuning Guide