SQL Server Analytical Functions

Introduction to Analytical Functions in SQL Server
July 24th, 2013, amit

Analytic functions compute an aggregate value over a group of rows. Unlike aggregate functions,
which collapse each group to a single row, they return a value for every row in the group. Let us
look at the analytical functions in SQL Server.
An analytical clause has four parts:
the analytical function, for example AVG, LEAD, PERCENT_RANK
the partitioning clause, for example PARTITION BY job or PARTITION BY dept, job
the order by clause, for example ORDER BY job (Oracle also accepts NULLS LAST; T-SQL does not)
the windowing clause, for example RANGE UNBOUNDED PRECEDING or ROWS BETWEEN CURRENT ROW AND
UNBOUNDED FOLLOWING
Note: when there is neither a windowing clause nor an ORDER BY, the window is equal to the
partition. When ORDER BY is present without a windowing clause, the window defaults to RANGE
BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW.
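The effect of ORDER BY on the default window can be checked directly. The sketch below uses Python's built-in sqlite3 module (SQLite 3.25 or later) as a stand-in for SQL Server, since the window-function SQL involved is standard; the table t and its data are hypothetical.

```python
import sqlite3

# SQLite stands in for SQL Server here; table t and its rows are made-up sample data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (grp TEXT, n INTEGER)")
conn.executemany("INSERT INTO t VALUES (?, ?)",
                 [("a", 1), ("a", 2), ("a", 3), ("b", 10), ("b", 20)])

frame_rows = conn.execute("""
    SELECT grp, n,
           SUM(n) OVER (PARTITION BY grp)            AS partition_total,  -- whole partition
           SUM(n) OVER (PARTITION BY grp ORDER BY n) AS running_total     -- up to current row
    FROM t
    ORDER BY grp, n
""").fetchall()

for row in frame_rows:
    print(row)
# ('a', 1, 6, 1)
# ('a', 2, 6, 3)
# ('a', 3, 6, 6)
# ('b', 10, 30, 10)
# ('b', 20, 30, 30)
```

Without ORDER BY the sum covers the whole partition; with ORDER BY it becomes a running total.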
Aggregate functions perform a calculation on a set of values and return a single value. With the
exception of COUNT(*), aggregate functions ignore null values. Aggregate functions are often used
with the GROUP BY clause of the SELECT statement.
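Aggregate functions in T-SQL
AVG
BINARY_CHECKSUM
CHECKSUM
CHECKSUM_AGG
COUNT
COUNT_BIG
GROUPING
MAX
MIN
STDEV
STDEVP
SUM
VAR
VARP
Note: CHECKSUM and BINARY_CHECKSUM are computed per row over a list of expressions; CHECKSUM_AGG
is the aggregate form.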
Example for aggregate function

SELECT MAX(st1.field3) FROM table1 AS st1
Example for Analytical function
select ename
,      job
,      sal
,      hiredate
,      first_value(sal) over ( partition by job
                               order by hiredate
                               range between current row and unbounded following
                             ) first_sal_range
,      first_value(sal) over ( partition by job
                               order by hiredate
                               rows between current row and 2 following
                             ) first_sal_rows
from   emp
where  sal < 2500
order by job
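The contrast between an aggregate and an analytical function is easiest to see side by side. This is a minimal sketch using Python's sqlite3 module (SQLite 3.25 or later) in place of SQL Server; the emp rows are hypothetical sample data, not taken from any real schema.

```python
import sqlite3

# Hypothetical emp data; SQLite is used only as a convenient stand-in engine.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (ename TEXT, job TEXT, sal INTEGER)")
conn.executemany("INSERT INTO emp VALUES (?, ?, ?)", [
    ("SMITH", "CLERK", 800), ("ADAMS", "CLERK", 1100),
    ("ALLEN", "SALESMAN", 1600), ("WARD", "SALESMAN", 1250),
])

# Aggregate: one row per job.
grouped = conn.execute(
    "SELECT job, MAX(sal) FROM emp GROUP BY job ORDER BY job").fetchall()

# Analytical: every employee row is kept, with the job maximum alongside.
windowed = conn.execute("""
    SELECT ename, job, sal, MAX(sal) OVER (PARTITION BY job) AS job_max
    FROM emp
    ORDER BY job, ename
""").fetchall()

print(grouped)   # [('CLERK', 1100), ('SALESMAN', 1600)]
for row in windowed:
    print(row)
# ('ADAMS', 'CLERK', 1100, 1100)
# ('SMITH', 'CLERK', 800, 1100)
# ('ALLEN', 'SALESMAN', 1600, 1600)
# ('WARD', 'SALESMAN', 1250, 1600)
```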
SQL Server Analytical Functions

Analytical functions were introduced in SQL Server 2005.
Analytical functions available in SQL Server 2005-2008
RANK: Numbers the rows starting at 1. Rows that have duplicate values receive the same rank, and
a gap appears in the sequence after each set of duplicates.
DENSE_RANK: Same as RANK(), except that the ranks are consecutive, with no gaps.
ROW_NUMBER: Returns the sequential row number within the result set, starting at 1 for the first
row in each partition. For rows that have duplicate values, numbers are assigned arbitrarily.
NTILE: Divides an ordered partition into a specified number of groups and assigns each group a
number. If the number of rows isn't divisible by the number of groups, the first few groups each
have one more row than the later groups. Otherwise, every group has the same number of rows.
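The four ranking functions can be compared on one small data set. The sketch below runs through Python's sqlite3 module (SQLite 3.25 or later) as a stand-in for SQL Server; the scores table is hypothetical. A tiebreaker column is added to the ORDER BY of ROW_NUMBER and NTILE so that the output is deterministic.

```python
import sqlite3

# Hypothetical scores table with one tie (bob and cat both have 90).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE scores (name TEXT, score INTEGER)")
conn.executemany("INSERT INTO scores VALUES (?, ?)",
                 [("ann", 100), ("bob", 90), ("cat", 90), ("dan", 80), ("eve", 70)])

rank_rows = conn.execute("""
    SELECT name, score,
           RANK()       OVER (ORDER BY score DESC)       AS rnk,
           DENSE_RANK() OVER (ORDER BY score DESC)       AS dense_rnk,
           ROW_NUMBER() OVER (ORDER BY score DESC, name) AS row_num,
           NTILE(2)     OVER (ORDER BY score DESC, name) AS tile
    FROM scores
    ORDER BY score DESC, name
""").fetchall()

for row in rank_rows:
    print(row)
# ('ann', 100, 1, 1, 1, 1)
# ('bob', 90, 2, 2, 2, 1)
# ('cat', 90, 2, 2, 3, 1)
# ('dan', 80, 4, 3, 4, 2)
# ('eve', 70, 5, 4, 5, 2)
```

RANK skips 3 after the tie, DENSE_RANK does not, and NTILE(2) puts three of the five rows in the first group.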
New Analytical functions available in SQL Server 2012

CUME_DIST: Calculates the cumulative distribution of a value in a group of values. The result is
greater than 0 and at most 1, and is the fraction of rows in the partition whose value is less
than or equal to the current row's value (for ascending order).
Syntax:
Without Partition Clause
SELECT SalesOrderID, OrderQty,
CUME_DIST() OVER(ORDER BY SalesOrderID) AS CDist
FROM Sales.SalesOrderDetail
WHERE SalesOrderID IN (43670, 43669, 43667, 43663)
ORDER BY CDist DESC

With Partition Clause
SELECT SalesOrderID, OrderQty, ProductID,
CUME_DIST() OVER(PARTITION BY SalesOrderID
ORDER BY ProductID ) AS CDist
FROM Sales.SalesOrderDetail s
ORDER BY s.SalesOrderID DESC, CDist DESC
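To see what CUME_DIST returns for tied values, the sketch below compares it with a hand count of rows less than or equal to the current value. It uses Python's sqlite3 module (SQLite 3.25 or later) as a stand-in for SQL Server, with a hypothetical orders table.

```python
import sqlite3

# Hypothetical order quantities; 20 appears twice to show how ties behave.
quantities = [10, 20, 20, 40]
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (qty INTEGER)")
conn.executemany("INSERT INTO orders VALUES (?)", [(q,) for q in quantities])

cume_rows = conn.execute("""
    SELECT qty, CUME_DIST() OVER (ORDER BY qty) AS cdist
    FROM orders
    ORDER BY qty
""").fetchall()

# The same figure by hand: rows with value <= current value, divided by total rows.
by_hand = [(q, sum(1 for other in quantities if other <= q) / len(quantities))
           for q in sorted(quantities)]

print(cume_rows)  # [(10, 0.25), (20, 0.75), (20, 0.75), (40, 1.0)]
print(by_hand)    # [(10, 0.25), (20, 0.75), (20, 0.75), (40, 1.0)]
```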
FIRST_VALUE: Returns the first value in an ordered set of values. Available from SQL Server 2012.
Syntax:
1) FIRST_VALUE (column) OVER (PARTITION BY column ORDER BY column)
2) FIRST_VALUE (column) IGNORE NULLS OVER (PARTITION BY column ORDER BY column)
The second form skips null values. IGNORE NULLS is Oracle syntax; SQL Server 2012 does not accept
it.
3) FIRST_VALUE (column) OVER (PARTITION BY column ORDER BY column ROWS UNBOUNDED PRECEDING)
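LAST_VALUE: Returns the last value in an ordered set of values, that is, the value from the last
row of the window (not the highest value). With ORDER BY and no windowing clause the window ends
at the current row, so specify ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING to get the
last row of the partition.
Syntax:
1) LAST_VALUE (column) OVER (PARTITION BY column ORDER BY column)
2) LAST_VALUE (column) IGNORE NULLS OVER (PARTITION BY column ORDER BY column)
The second form skips null values. IGNORE NULLS is Oracle syntax; SQL Server 2012 does not accept
it.
3) LAST_VALUE (column) OVER (PARTITION BY column ORDER BY column ROWS UNBOUNDED PRECEDING)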
SELECT s.SalesOrderID,s.SalesOrderDetailID,s.OrderQty,
FIRST_VALUE(SalesOrderDetailID) OVER (PARTITION BY SalesOrderID
ORDER BY SalesOrderDetailID
ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) FstValue,
LAST_VALUE(SalesOrderDetailID) OVER (PARTITION BY SalesOrderID
ORDER BY SalesOrderDetailID
ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) LstValue
FROM Sales.SalesOrderDetail s
ORDER BY s.SalesOrderID,s.SalesOrderDetailID,s.OrderQty
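The explicit ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING in the query above matters mostly for LAST_VALUE. The sketch below shows the difference, using Python's sqlite3 module (SQLite 3.25 or later) as a stand-in for SQL Server and a hypothetical detail table in place of Sales.SalesOrderDetail.

```python
import sqlite3

# Hypothetical detail rows: order 1 has details 1-3, order 2 has details 4-5.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE detail (order_id INTEGER, detail_id INTEGER)")
conn.executemany("INSERT INTO detail VALUES (?, ?)",
                 [(1, 1), (1, 2), (1, 3), (2, 4), (2, 5)])

value_rows = conn.execute("""
    SELECT order_id, detail_id,
           FIRST_VALUE(detail_id) OVER (PARTITION BY order_id
                                        ORDER BY detail_id) AS first_val,
           -- default window ends at the current row
           LAST_VALUE(detail_id)  OVER (PARTITION BY order_id
                                        ORDER BY detail_id) AS last_default,
           -- explicit window covers the whole partition
           LAST_VALUE(detail_id)  OVER (PARTITION BY order_id
                                        ORDER BY detail_id
                                        ROWS BETWEEN UNBOUNDED PRECEDING
                                                 AND UNBOUNDED FOLLOWING) AS last_full
    FROM detail
    ORDER BY order_id, detail_id
""").fetchall()

for row in value_rows:
    print(row)
# (1, 1, 1, 1, 3)
# (1, 2, 1, 2, 3)
# (1, 3, 1, 3, 3)
# (2, 4, 4, 4, 5)
# (2, 5, 4, 5, 5)
```

With the default window, LAST_VALUE simply echoes the current row; only the full-partition window returns the last detail of each order.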
Notes
When you order a set of rows in an analytic function, you can restrict the calculation to a range
of rows around the current row and ignore the others. You do this with the ROWS clause:
UNBOUNDED PRECEDING: the range starts at the first row of the partition.
UNBOUNDED FOLLOWING: the range ends at the last row of the partition.
CURRENT ROW: the range begins or ends at the current row.
n PRECEDING or n FOLLOWING: the range starts or ends n rows before or after the current row.
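The ROWS boundaries listed above can be combined as in the following sketch, which computes a running total, a three-row moving sum and a "remaining" total over the same ordering. It runs on Python's sqlite3 module (SQLite 3.25 or later) as a stand-in for SQL Server; the sales table and its column names are hypothetical.

```python
import sqlite3

# Hypothetical daily sales amounts.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (sale_day INTEGER, amt INTEGER)")
conn.executemany("INSERT INTO sales VALUES (?, ?)",
                 [(1, 10), (2, 20), (3, 30), (4, 40), (5, 50)])

window_rows = conn.execute("""
    SELECT sale_day, amt,
           SUM(amt) OVER (ORDER BY sale_day
                          ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS running,
           SUM(amt) OVER (ORDER BY sale_day
                          ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING)         AS moving,
           SUM(amt) OVER (ORDER BY sale_day
                          ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING) AS remaining
    FROM sales
    ORDER BY sale_day
""").fetchall()

for row in window_rows:
    print(row)
# (1, 10, 10, 30, 150)
# (2, 20, 30, 60, 140)
# (3, 30, 60, 90, 120)
# (4, 40, 100, 120, 90)
# (5, 50, 150, 90, 50)
```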
LEAD and LAG: LAG returns a column value from a previous row and LEAD from a following row in
the partition, relative to the current row, at the row offset given in the function (1 by
default), without the use of a self-join.
SELECT s.SalesOrderID,s.SalesOrderDetailID,s.OrderQty,
LEAD(SalesOrderDetailID) OVER (ORDER BY SalesOrderDetailID
) LeadValue,
LAG(SalesOrderDetailID) OVER (ORDER BY SalesOrderDetailID
) LagValue
FROM Sales.SalesOrderDetail s
ORDER BY s.SalesOrderID,s.SalesOrderDetailID,s.OrderQty
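LAG and LEAD also accept an optional offset and a default value for rows that have no neighbour at that offset. A small sketch, again using Python's sqlite3 module (SQLite 3.25 or later) as a stand-in for SQL Server, with a hypothetical readings table:

```python
import sqlite3

# Hypothetical readings; id gives the ordering.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE readings (id INTEGER, amt INTEGER)")
conn.executemany("INSERT INTO readings VALUES (?, ?)",
                 [(1, 100), (2, 150), (3, 130)])

lag_rows = conn.execute("""
    SELECT id, amt,
           LAG(amt)       OVER (ORDER BY id) AS prev_amt,      -- offset 1, NULL at the edge
           LEAD(amt)      OVER (ORDER BY id) AS next_amt,
           LAG(amt, 1, 0) OVER (ORDER BY id) AS prev_or_zero,  -- default 0 instead of NULL
           LEAD(amt, 2)   OVER (ORDER BY id) AS two_ahead      -- offset 2
    FROM readings
    ORDER BY id
""").fetchall()

for row in lag_rows:
    print(row)
# (1, 100, None, 150, 0, 130)
# (2, 150, 100, 130, 100, None)
# (3, 130, 150, None, 150, None)
```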
PERCENTILE_CONT (): Calculates a percentile based on a continuous distribution of the column
value, interpolating between adjacent values where necessary. PERCENTILE_CONT(0.5) is the median.
The return type is float(53), and the percentile argument must be between 0 and 1.

PERCENTILE_DISC (): Computes a specific percentile for sorted values in an entire rowset or
within distinct partitions of a rowset. For a given percentile value P, PERCENTILE_DISC sorts
the values of the expression in the ORDER BY clause and returns the value with the smallest
CUME_DIST value (with respect to the same sort specification) that is greater than or equal to
P.
SELECT SalesOrderID, OrderQty, ProductID,
CUME_DIST() OVER(PARTITION BY SalesOrderID
ORDER BY ProductID ) AS CDist,
PERCENTILE_DISC(0.5) WITHIN GROUP (ORDER BY ProductID)
OVER (PARTITION BY SalesOrderID) AS PercentileDisc
FROM Sales.SalesOrderDetail
ORDER BY SalesOrderID DESC
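The difference between the two percentile functions can be illustrated on four values. SQLite has no PERCENTILE_DISC or PERCENTILE_CONT, so the sketch below emulates them rather than calling them: the discrete version follows the CUME_DIST definition given above, and the continuous version uses linear interpolation between neighbouring sorted values, which is an assumption about how PERCENTILE_CONT interpolates. The table vals and the helper percentile_cont are hypothetical names.

```python
import sqlite3

values = [10, 20, 30, 40]
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE vals (v INTEGER)")
conn.executemany("INSERT INTO vals VALUES (?)", [(v,) for v in values])

# Emulated PERCENTILE_DISC(0.5): the value with the smallest CUME_DIST >= P.
disc_median = conn.execute("""
    SELECT MIN(v)
    FROM (SELECT v, CUME_DIST() OVER (ORDER BY v) AS cd FROM vals)
    WHERE cd >= ?
""", (0.5,)).fetchone()[0]

def percentile_cont(data, p):
    """Emulated PERCENTILE_CONT: linear interpolation between sorted neighbours."""
    ordered = sorted(data)
    pos = p * (len(ordered) - 1)          # zero-based fractional position
    lower = int(pos)                      # floor, since pos is never negative
    upper = min(lower + 1, len(ordered) - 1)
    return ordered[lower] + (ordered[upper] - ordered[lower]) * (pos - lower)

cont_median = percentile_cont(values, 0.5)

print(disc_median)  # 20
print(cont_median)  # 25.0
```

The discrete percentile always returns a value that exists in the data, while the continuous one may return a value between two rows.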
PERCENT_RANK (): Returns the relative standing of a value within a query result set or partition.
The formula for PERCENT_RANK () is as follows:
PERCENT_RANK () = (RANK () - 1) / (Total Rows - 1)

SELECT SalesOrderID, OrderQty, ProductID,
RANK() OVER(PARTITION BY SalesOrderID
ORDER BY ProductID ) Rnk,
PERCENT_RANK() OVER(PARTITION BY SalesOrderID
ORDER BY ProductID ) AS PctDist
FROM Sales.SalesOrderDetail
ORDER BY PctDist DESC
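The PERCENT_RANK formula can be verified by computing it from RANK() and the row count in the same query. The sketch below does so with Python's sqlite3 module (SQLite 3.25 or later) as a stand-in for SQL Server; the items table is hypothetical.

```python
import sqlite3

# Hypothetical product ids with one tie.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (product_id INTEGER)")
conn.executemany("INSERT INTO items VALUES (?)", [(10,), (20,), (20,), (40,)])

pct_rows = conn.execute("""
    SELECT product_id,
           RANK()         OVER (ORDER BY product_id) AS rnk,
           PERCENT_RANK() OVER (ORDER BY product_id) AS pct_rank,
           -- (RANK - 1) / (total rows - 1), forced to floating-point division
           (RANK() OVER (ORDER BY product_id) - 1) * 1.0
               / (COUNT(*) OVER () - 1)              AS by_formula
    FROM items
    ORDER BY product_id
""").fetchall()

for product_id, rnk, pct_rank, by_formula in pct_rows:
    print(product_id, rnk, round(pct_rank, 4), round(by_formula, 4))
# 10 1 0.0 0.0
# 20 2 0.3333 0.3333
# 20 2 0.3333 0.3333
# 40 4 1.0 1.0
```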
More Analytical functions

There is a commercial add-in package for SQL Server that adds hundreds of analytic functions to
SQL Server 2005, 2008 and 2012, including PERCENTILE and PERCENTRANK.
http://westclintech.com/Products/XLeratorDBstatistics/XLeratorDBstatisticsDocumentation/tabid/159/topic/PERCENTILE/Default.aspx
Analytical Functions in Oracle

Here is the list of analytic functions available in Oracle up to 10g Release 2 (10.2).
AVG
CORR
COVAR_POP
COVAR_SAMP
COUNT
CUME_DIST
DENSE_RANK
FIRST
FIRST_VALUE
LAG
LAST
LAST_VALUE
LEAD
MAX
MIN
NTILE
PERCENT_RANK
PERCENTILE_CONT
PERCENTILE_DISC
RANK
RATIO_TO_REPORT
REGR_ (Linear Regression) Functions
ROW_NUMBER
STDDEV
STDDEV_POP
STDDEV_SAMP
SUM
VAR_POP
VAR_SAMP
VARIANCE
New analytical functions in 11g
NTH_VALUE
LISTAGG
Details
http://docs.oracle.com/cd/E11882_01/server.112/e26088/functions004.htm
Reference Links
http://msdn.microsoft.com/en-us/library/hh213234.aspx
http://technology.amis.nl/2004/10/04/analytical-sql-functions-theory-and-examples-part-2-on-theorder-by-and-windowing-clauses/2/
http://blog.sqlauthority.com/2007/10/09/sql-server-2005-sample-example-of-ranking-functionsrow_number-rank-dense_rank-ntile/
