This SQL tutorial focuses on the MySQL Group By Clause, and provides explanations, examples and exercises. For this lesson’s exercises use this link.
This tutorial is a part of several posts explaining how to use the MySQL Group Functions. To read additional posts regarding this subject, please use the following links:
- MySQL group functions
- MySQL Group By clause
- MySQL Having Clause
The MySQL GROUP BY Clause
SELECT column_name , group_function(column_name) FROM table_name WHERE condition GROUP BY column_name
So far, each group function described in the previous tutorial has treated the table as one large group of data. In most cases you need to divide the table into smaller groups, instead of getting the average price of all books in Books table, you would rather see, for example, the average price grouped by each category (what is the average price of the “Fiction” category, “Travel” category and so on).
You can use the MySQL GROUP BY clause to divide the rows in a table into groups. Then you can use the group functions to retrieve summary information for each group.
This MySQL example would retrieve a summary of the average price for each category number:
SELECT category_id , AVG(price) FROM books GROUP BY category_id
A Few Guidelines
All columns in the MySQL SELECT clause that are not group functions must be in the GROUP BY clause
When specifying group functions (such as AVG) in the SELECT clause, alongside with other individual items (such as category_id), you must include a GROUP BY clause. In the GROUP BY clause you must specify these individual items (category_id in this case).
Note: If an individual column (such as category_id) is not specified at the GROUP BY clause, The server is free to choose any value from this column to display alongside every group. This is an extension of the standard SQL (in Oracle or Microsoft this kind of action would result an error), and could be useful when all values of the individual column are the same for each group.
The MySQL GROUP BY columns don’t have to be in the MySQL SELECT clause
It is absolutely possible to group by different columns, but not to specify these columns in the MySQL SELECT clause (however the result will not be meaningful). This MySQL example would retrieve a summary of the average price for each category without displaying the respective category numbers.
SELECT AVG(price) FROM books GROUP BY category_id
You can list more than one column after the MySQL GROUP BY clause
Sometimes you need to see the result for groups within groups, for example: each category is provided by different publishers (“Fiction” books are provided by 4 different publishers). While the MySQL examples carried out so far were meant to display the average of each category, the query specified below retrieves a summary of the average price for each category and for each publisher:
SELECT category_id , publisher_id , AVG(price) FROM books GROUP BY category_id , publisher_id
Using the MySQL WHERE clause, you can exclude rows before dividing them into groups.
For example, if you need to display the average price only for categories 50, 80, and 90:
SELECT category_id , AVG(price) FROM books WHERE category_id IN (50, 80, 90) GROUP BY category_id
You cannot use the MySQL WHERE clause to restrict groups
As seen in the last example, using the MySQL WHERE clause you can restrict rows before dividing them into groups. However, it is not possible to specify group functions in a MySQL WHERE clause, as that would result in an error.
The categories where the average price is higher than 50:
SELECT category_id , AVG(price) FROM books WHERE AVG(price) > 50 GROUP BY category_id -- (error)