Sunday, 6 December 2020

Over Clause with Ranking and Aggregate functions

Introduction 

Over clause is very important used for partitioning and ordering of rowset. It means partition with over clause used to group data based on requested column and ordering is used to sort data based on given sort column.


Over clause also used with ranking functions (ROW_NUMBER ,RANK etc), aggregate functions (SUM, AVG, COUNT, etc) and many more area. Now you know importance of over clause as it is widely used with lot of functions will see below. 


create table department

(

id int primary key identity(1,1),

name varchar(50)

)


insert into department values ('IT')

insert into department values ('HR')

insert into department values ('Sales')


create table Employee

(

 id int primary key identity (1,1),

 name varchar(50),

 departmentId int foreign key references department(id)

)


insert into Employee values ('Rohan',1)

insert into Employee values ('Nasim',1)

insert into Employee values ('Shahid',1)

insert into Employee values ('Diksha',1)

insert into Employee values ('Sammer',2)

insert into Employee values ('Sanket',2)

insert into Employee values ('Sohel',2)

insert into Employee values ('Sachin',2)

insert into Employee values ('John',3)

insert into Employee values ('Peter',3)

insert into Employee values ('Suman',3)



Ranking functions

1. Row_number() : 

SQL function helps to add sequence or unique number to each row.


select 

ROW_NUMBER() over (order by dept.name) as RowNumber_function,

emp.name as EmployeeName, dept.name as Department  

from employee emp inner join department dept on emp.departmentId = dept.id


Query result shown below:


RowNumber_Function                   employeeName            Department

1                                 Sammer                 HR

2                                 Sanket                 HR

3                                 Sohel                 HR

4                                 Sachin                 HR

5                                 Rohan                 IT

6                                 Nasim                 IT

7                                 Shahid                 IT

8                                 Diksha                 IT

9                                 John                 Sales

10                                 Peter                 Sales

11                                 Suman                 Sales



2. Rank() : 

SQL function helps to add sequencing but here it shares same rank/row number for department. Check the Rank_function column it shares same rownumber if department is same but not following proper sequencing (1,5,9)

select 

RANK() over (order by dept.name) as Rank_Function,

emp.name as EmployeeName, dept.name as Department  

from employee emp inner join department dept on emp.departmentId = dept.id


Query result shown below:


Rank_Function                   employeeName            Department

1                     Sammer         HR

1                     Sanket                 HR

1                     Sohel                 HR

1                     Sachin                 HR

5                     Rohan                 IT

5                     Nasim                 IT

5                     Shahid                 IT

5                     Diksha                 IT

9                     John                 Sales

9                     Peter                 Sales

9                     Suman                 Sales


3. Dense_Rank() :

SQL function, it is same as Rank function but follow proper sequencing like after 1 it always pick sequence as 2 no matter how many rows shared sequence 1 in query result.

select 

DENSE_RANK() over (order by Dept.name) as DenseRank_function,

emp.name as EmployeeName, dept.name as Department  

from employee emp inner join department dept on emp.departmentId = dept.id


Rank_Function                   employeeName            Department

1                    Sammer         HR

1                    Sanket                 HR

1                    Sohel                 HR

1                    Sachin                 HR

2                     Rohan                  IT

2                     Nasim                  IT

2                     Shahid                 IT

2                     Diksha                  IT

3                     John                 Sales

3                      Peter                 Sales

3                     Suman                 Sales


To check difference between all 3 ranking function use below SQL script and see query result

select 

ROW_NUMBER() over (order by dept.name) as RowNumber_function,

RANK() over (order by dept.name) as Rank_Function,

DENSE_RANK() over (order by Dept.name) as DenseRank_function,

emp.name as EmployeeName, dept.name as Department  

from employee emp inner join department dept on emp.departmentId = dept.id




Till now we use Over clause with order by clause now we will use partition clause with oder by. As discussed partition by will group table data based on any column which share common data like here we have Department share common data .

4 employees belongs to HR, 4 employees again belongs to It and remaining from Sales

we can group table query based on Department column using partition by clause.


lest see below query with row number function.



select 

ROW_NUMBER() over (partition by dept.Name order by dept.name) as RowNumber_function,

emp.name as EmployeeName, dept.name as Department  

from employee emp inner join department dept on emp.departmentId = dept.id


Rank_Function                   employeeName            Department

1                     Sammer             HR

2                     Sanket                     HR

3                     Sohel                      HR

4                     Sachin                      HR

1                     Rohan                     IT

2                     Nasim                     IT

3                     Shahid                      IT

4                     Diksha                     IT

1                     John                     Sales

2                     Peter                     Sales

3                     Suman                     Sales


 Partition by clause support multiple columns shown in below scenario with comma separated column name. Create Employee table with some data.

create table Employee1

(

 id int primary key identity (1,1),

 name varchar(50),

 departmentId int foreign key references department(id),

 city varchar(20)

)


insert into Employee1 values ('Rohan',1,'Mumbai')

insert into Employee1 values ('Nasim',1,'Mumbai')

insert into Employee1 values ('Shahid',1,'Nagpur')

insert into Employee1 values ('Diksha',1,'Nagpur')

insert into Employee1 values ('Sammer',2,'London')

insert into Employee1 values ('Sanket',2,'London')

insert into Employee1 values ('Sohel',2,'Leicester')

insert into Employee1 values ('Sachin',2,'London')

insert into Employee1 values ('John',3,'New York')

insert into Employee1 values ('Peter',3,'Austin')

insert into Employee1 values ('Suman',3,'Austin')


select 

ROW_NUMBER() over (partition by dept.Name,emp.city order by dept.name) as RowNumber,

emp.name as EmployeeName, dept.name as Department,emp.city

from employee1 emp inner join department dept on emp.departmentId = dept.id


Query result grouped based on two column first it apply group on department column and then City. RowNumber calculated based on City column group and not on department.


RowNumber    EmployeeName  Department City

1          Sohel     HR         Leicester

1          Sachin     HR         London

2          Sammer     HR         London

3         Sanket     HR         London

1         Rohan     IT         Mumbai

2         Nasim     IT         Mumbai

1         Shahid     IT     Nagpur

2         Diksha     IT     Nagpur

1         Peter     Sales     Austin

2         Suman     Sales     Austin

1         John     Sales     New York





Aggregate functions

There are couple of aggregate functions in MS SQL server will show you example of count but you can check same with other aggregate functions like Min,Max Sum ... etc

1. Count : return number of row count for table or group of records in table.


select count(*) from Employee -- return count 11 (number of rows in table)


select 

count(emp.name)  over (partition by dept.name) as EmployeeCount, dept.name as Department  

from employee emp inner join department dept on emp.departmentId = dept.id


EmployeeCount Department

4                 HR

4                 HR

4                 HR

4                 HR

4                 IT

4                 IT

4                 IT

4                 IT

3                 Sales

3                 Sales

3                 Sales


same we can achieve with below query without any duplication of rows or with above query we can use distinct keyword to avoid duplicate records.


select 

count(emp.name)  as EmployeeCount, dept.name as Department  

from employee emp inner join department dept on emp.departmentId = dept.id

group by dept.name


EmployeeCount Department

4                 HR

4                 IT

3                 Sales


2 Comments:

At 11 December 2020 at 02:41 , Blogger Yogesh Patil said...

Very well explained in detail for things at one place which we use many times. Good keep it up.

 
At 14 December 2020 at 05:12 , Blogger Asif Sayyad said...

Thank Yogesh

 

Post a Comment

Subscribe to Post Comments [Atom]

<< Home