Partition skew plays an important part in determining whether a parallel partitioned scan can be used. Adaptive Server partition skew is defined as the ratio of the size of the largest partition to the average size of a partition. Consider a table with four partitions of sizes 20, 20, 35, and 80 pages. The size of the average partition is (20 + 20 + 35 + 85)/4 = 40 pages. The biggest partition has 85 pages so partition skew is calculated as 85/40 = 2.125. In partitioned scans, the cost of doing a parallel scan is as expensive as doing the scan on the largest partition. Instead, a hash-based partition may turn out to be fast, as each worker process may hash on a page number or an allocation unit and scan its portion of the data. The penalty paid in terms of loss of performance by skewed partitions is not always at the scan level, but rather as more complex operators like several join operations are built over the data. The margin of error increases exponentially in such cases.
Run sp_help on a table to see the partition skews:
sp_help HA2 ........ name type partition_type partitions partition_keys ------ -------------------- -------------- ----------- HA2 base table hash 2 a1, a2 partition_name partition_id pages segment create_date -------------------------- ------------ ----------- -------------- -------------------------- HA2_752002679 752002679 324 default Aug 10 2005 2:05PM HA2_768002736 768002736 343 default Aug 10 2005 2:05PM Partition_Conditions -------------------- NULL Avg_pages Max_pages Min_pages Ratio(Max/Avg) Ratio(Min/Avg) ----------- ----------- ----------- --------------------------- --------------------------- 333 343 324 1.030030 0.972973
Alternatively, you can calculate skew by querying the systabstats system catalog, where the number of pages in each partition is listed.