SQL GROUP BY
Tweet |
The SQL GROUP BY statement is used together with the SQL aggregate functions to group the retrieved data by one or more columns. The GROUP BY concept is one of the most complicated concepts for people new to the SQL language and the easiest way to understand it, is by example.
We want to retrieve a list with unique customers from our Sales table, and at the same time to get the total amount each customer has spent in our store.
OrderID | OrderDate | OrderPrice | OrderQuantity | CustomerName |
1 | 12/22/2005 | 160 | 2 | Smith |
2 | 08/10/2005 | 190 | 2 | Johnson |
3 | 07/13/2005 | 500 | 5 | Baldwin |
4 | 07/15/2005 | 420 | 2 | Smith |
5 | 12/22/2005 | 1000 | 4 | Wood |
6 | 10/2/2005 | 820 | 4 | Smith |
7 | 11/03/2005 | 2000 | 2 | Baldwin |
You already know how to retrieve a list with unique customer using the DISTINCT keyword:
The SQL statement above works just fine, but it doesn't return the total amount of money spent for each of the customers. In order to accomplish that we will use both SUM SQL function and the GROUP BY clause:
We have 2 columns specified in our SELECT list - CustomerName and SUM(OrderPrice). The problem is that SUM(OrderPrice), returns a single value, while we have many customers in our Sales table. The GROUP BY clause comes to the rescue, specifying that the SUM function has to be executed for each unique CustomerName value. In this case the GROUP BY clause acts similar to DISTINCT statement, but for the purpose of using it along with SQL aggregate functions. The result set retrieved from the statement above will look like this:
CustomerName | OrderPrice |
Baldwin | 2500 |
Johnson | 190 |
Smith | 1400 |
Wood | 1000 |
You do grouping using GROUP BY by more than one column, for example:
When grouping, keep in mind that all columns that appear in your SELECT column list, that are not aggregated (used along with one of the SQL aggregate functions), have to appear in the GROUP BY clause too.