SQL GROUP BY





The SQL GROUP BY clause is used together with the SQL aggregate functions to group the retrieved data by one or more columns. GROUP BY is one of the most complicated concepts for people new to the SQL language, and the easiest way to understand it is by example.

We want to retrieve a list of unique customers from our Sales table and, at the same time, get the total amount each customer has spent in our store.

OrderID  OrderDate   OrderPrice  OrderQuantity  CustomerName
1        12/22/2005  160         2              Smith
2        08/10/2005  190         2              Johnson
3        07/13/2005  500         5              Baldwin
4        07/15/2005  420         2              Smith
5        12/22/2005  1000        4              Wood
6        10/2/2005   820         4              Smith
7        11/03/2005  2000        2              Baldwin

You already know how to retrieve a list of unique customers using the DISTINCT keyword:

SELECT DISTINCT CustomerName FROM Sales

The SQL statement above works just fine, but it doesn't return the total amount of money spent by each customer. To accomplish that, we will use both the SQL SUM function and the GROUP BY clause:

SELECT CustomerName, SUM(OrderPrice) FROM Sales GROUP BY CustomerName

We have two columns specified in our SELECT list: CustomerName and SUM(OrderPrice). The problem is that SUM(OrderPrice) on its own returns a single value, while we have many customers in our Sales table. The GROUP BY clause comes to the rescue, specifying that the SUM function has to be executed for each unique CustomerName value. In this case the GROUP BY clause acts much like the DISTINCT keyword, but for the purpose of using it along with SQL aggregate functions. The result set retrieved from the statement above will look like this:

CustomerName  OrderPrice
Baldwin       2500
Johnson       190
Smith         1400
Wood          1000
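
If you would like to try this yourself, the following is a minimal sketch that loads the sample Sales rows into an in-memory SQLite database through Python's standard sqlite3 module and runs both statements. The column types and the ORDER BY clause (added only to make the output order predictable) are assumptions, not part of the original example.

```python
import sqlite3

# In-memory SQLite database; the column types below are assumptions.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Sales ("
    "OrderID INTEGER, OrderDate TEXT, OrderPrice INTEGER, "
    "OrderQuantity INTEGER, CustomerName TEXT)"
)
conn.executemany(
    "INSERT INTO Sales VALUES (?, ?, ?, ?, ?)",
    [
        (1, "12/22/2005", 160, 2, "Smith"),
        (2, "08/10/2005", 190, 2, "Johnson"),
        (3, "07/13/2005", 500, 5, "Baldwin"),
        (4, "07/15/2005", 420, 2, "Smith"),
        (5, "12/22/2005", 1000, 4, "Wood"),
        (6, "10/2/2005", 820, 4, "Smith"),
        (7, "11/03/2005", 2000, 2, "Baldwin"),
    ],
)

# DISTINCT returns each customer once, with no totals.
names = conn.execute(
    "SELECT DISTINCT CustomerName FROM Sales ORDER BY CustomerName"
).fetchall()

# GROUP BY returns each customer once, with SUM computed per customer.
totals = conn.execute(
    "SELECT CustomerName, SUM(OrderPrice) FROM Sales "
    "GROUP BY CustomerName ORDER BY CustomerName"
).fetchall()

for name, total in totals:
    print(name, total)
```

Both queries return one row per customer; only the GROUP BY version carries the per-customer total.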

You can also group by more than one column, for example:

SELECT CustomerName, OrderDate, SUM(OrderPrice) FROM Sales GROUP BY CustomerName, OrderDate
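
As a rough illustration of what grouping by two columns does, the sketch below runs the statement above against the sample Sales rows in an in-memory SQLite database (Python's standard sqlite3 module). The column types and the added ORDER BY are assumptions made for the demo. Note that in the sample data every CustomerName and OrderDate pair occurs only once, so each group contains a single order.

```python
import sqlite3

# In-memory SQLite database; the column types below are assumptions.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Sales ("
    "OrderID INTEGER, OrderDate TEXT, OrderPrice INTEGER, "
    "OrderQuantity INTEGER, CustomerName TEXT)"
)
conn.executemany(
    "INSERT INTO Sales VALUES (?, ?, ?, ?, ?)",
    [
        (1, "12/22/2005", 160, 2, "Smith"),
        (2, "08/10/2005", 190, 2, "Johnson"),
        (3, "07/13/2005", 500, 5, "Baldwin"),
        (4, "07/15/2005", 420, 2, "Smith"),
        (5, "12/22/2005", 1000, 4, "Wood"),
        (6, "10/2/2005", 820, 4, "Smith"),
        (7, "11/03/2005", 2000, 2, "Baldwin"),
    ],
)

# One result row per distinct (CustomerName, OrderDate) combination.
per_day = conn.execute(
    "SELECT CustomerName, OrderDate, SUM(OrderPrice) FROM Sales "
    "GROUP BY CustomerName, OrderDate "
    "ORDER BY CustomerName, OrderDate"
).fetchall()

for name, order_date, total in per_day:
    print(name, order_date, total)
```

If a customer had placed two orders on the same date, those two rows would fall into one group and their prices would be added together.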

When grouping, keep in mind that every column in your SELECT list that is not aggregated (that is, not used along with one of the SQL aggregate functions) has to appear in the GROUP BY clause too.
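
One way to follow this rule is sketched below: every selected column is either listed in GROUP BY (CustomerName) or wrapped in an aggregate function. The use of COUNT and of SUM on OrderQuantity is an addition for illustration, as are the column types and the ORDER BY. The demo runs on SQLite through Python's sqlite3 module; be aware that SQLite happens to be lenient and accepts a bare non-aggregated column in a grouped query, whereas stricter database systems reject such a statement, so the rule is worth following regardless.

```python
import sqlite3

# In-memory SQLite database; the column types below are assumptions.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Sales ("
    "OrderID INTEGER, OrderDate TEXT, OrderPrice INTEGER, "
    "OrderQuantity INTEGER, CustomerName TEXT)"
)
conn.executemany(
    "INSERT INTO Sales VALUES (?, ?, ?, ?, ?)",
    [
        (1, "12/22/2005", 160, 2, "Smith"),
        (2, "08/10/2005", 190, 2, "Johnson"),
        (3, "07/13/2005", 500, 5, "Baldwin"),
        (4, "07/15/2005", 420, 2, "Smith"),
        (5, "12/22/2005", 1000, 4, "Wood"),
        (6, "10/2/2005", 820, 4, "Smith"),
        (7, "11/03/2005", 2000, 2, "Baldwin"),
    ],
)

# CustomerName is in GROUP BY; every other selected value is aggregated.
summary = conn.execute(
    "SELECT CustomerName, COUNT(*), SUM(OrderQuantity), SUM(OrderPrice) "
    "FROM Sales GROUP BY CustomerName ORDER BY CustomerName"
).fetchall()

for name, orders, quantity, total in summary:
    print(name, orders, quantity, total)
```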