This SQL tutorial focuses on the SQL Group Functions in SQL Server, and provides explanations, examples and exercises. For this lesson’s exercises use this link.
 

Definition

SQL Server Group Functions operate on sets of rows to give one result per group, for example :

group_functions

 

SQL Server Common Group Functions

Function Description Syntax
SUM Returns the total sum
SUM(salary)
-- Result: 20000
MIN Returns the lowest value
MIN (salary)
-- Result: 2000
MAX Returns the highest value
MAX(salary)
-- Result: 7000
AVG Returns the average value
AVG(salary)
-- Result: 4000
COUNT (*) Returns the number of records in a table
COUNT(*)
-- Result: 5
COUNT (column) Returns the number of values (NULL values will not be counted) of the specified column
COUNT(name)
-- Result: 4
COUNT (DISTINCT column) Returns the number of distinct values
COUNT(DISTINCT name)
-- Result: 3

* Results based on the illustration mentioned above 
 

SQL Server GROUP Functions and NULL

  • In SQL Server, All Group functions ignore NULL values. For example: the average salary is calculated based on the rows in the table where a valid value is stored (the total salary divided by the number of employees receiving a salary).
  • In SQL Server You can use the ISNULL function to force group functions to include NULL values, in the following example the average is calculated based on all rows in the table, regardless of whether null values are stored in the salary column (the total salary divided by the total number of rows in the table):
SELECT       AVG(ISNULL(salary,0))
FROM         Employees

 

The SQL Server GROUP BY Clause

SELECT       column_name , group_function(column_name)
FROM         table_name
WHERE        condition
GROUP BY     column_name

So far, each group function described here has treated the table as one large group of data. In most cases you need to divide the table into smaller groups, instead of getting the average salary of all employees in Employees table, you would rather see, for example, the average salary grouped by each department (what is the average salary of the HR department, IT department and so on).

You can use the SQL Server GROUP BY clause to divide the rows in a table into groups. Then you can use the group functions to retrieve summary information for each group.

SELECT      department_id , AVG(salary)
FROM        employees
GROUP BY    department_id

 

A Few Guidelines

All columns in the SQL Server SELECT clause that are not group functions must be in the GROUP BY clause

When specifying group functions (AVG)  in a SELECT clause, alongside with other individual items (department_id), you must include a GROUP BY clause. In the GROUP BY clause you must specify these individual items (department_id in this case)  otherwise an error will be generated.

sql_group_by_clause

The SQL Server GROUP BY columns don’t have to be in the SQL Server SELECT clause

It is absolutely possible to group by different columns, but not to specify these columns in the SQL Server SELECT clause (however the result will not be meaningful). In this example we’re displaying the average salary for each departments without displaying the respective department numbers.

SELECT      AVG(salary)
FROM        employees
GROUP BY    department_id

You can list more than one column after the SQL Server GROUP BY clause

Sometimes you need to see the result for groups within groups, for example: in each department there are various positions (administrative positions, clerk, maintenance employees and so on). While the operations carried out so far were meant to display the average of each department, the query specified below allows seeing the average of each department, divided by position type:

SELECT      department_id , job_id , AVG(salary)
FROM        employees
WHERE       department_id IN (50, 80, 90)
GROUP BY    department_id , job_id

Using the SQL Server WHERE clause, you can exclude rows before dividing them into groups.

For example, if you need to display the average salary only for departments 50, 80, and 90:

SELECT      department_id , AVG(salary)
FROM        employees
WHERE       department_id IN (50, 80, 90)
GROUP BY    department_id

You cannot use the SQL Server WHERE clause to restrict groups

As seen in the last example, using the SQL Server WHERE clause you can restrict rows before dividing them into groups. However, it is not possible to specify group functions in a SQL Server WHERE clause, as that would result in an error.

The departments where the average salary is higher than 5000:

 SELECT      department_id , AVG(salary)
 FROM        employees
 WHERE       AVG(salary) > 5000
 GROUP BY    department_id    

(error)

 

The HAVING Clause

The SQL Server HAVING clause allows filtering of aggregated results produced by the SQL Server GROUP BY clause. In the same way you used SQL Server WHERE clause to restrict rows, you use the SQL Server HAVING clause to restrict groups.

SELECT       column_name , group_function(column_name)
FROM         table_name
WHERE        condition
GROUP BY     column_name
HAVING       condition

The departments where the average salary is higher than 5000:

SELECT      department_id , AVG(salary)
 FROM        employees
 GROUP BY    department_id
 HAVING      AVG(salary) > 5000

In a single query you can use both SQL Server HAVING and WHERE clauses. Out of departments 80, 50, and 90, the departments where the average salary is higher than 5000:

SELECT      department_id , AVG(salary)
FROM        employees
WHERE       department_id IN (50, 80, 90)
GROUP BY    department_id
HAVING      AVG(salary) > 5000

You can filter based on another group function than the one that appears in a SQL Server SELECT statement:

 SELECT      department_id , AVG(salary)
 FROM        employees
 WHERE       department_id IN (50, 80, 90)
 GROUP BY    department_id
 HAVING      MAX(salary) > 5000