GROUP BY
Simple Grouping
Section titled “Simple Grouping”Orders Table
|CustomerId|ProductId|Quantity|Price |---|---|---|---|--- |1|2|5|100 |1|3|2|200 |1|4|1|500 |2|1|4|50 |3|5|6|700
When grouping by a specific column, only unique values of this column are returned.
SELECT customerIdFROM ordersGROUP BY customerId;Return value:
|customerId |---|---|---|---|--- |1 |2 |3
Aggregate functions like count() apply to each group and not to the complete table:
SELECT customerId, COUNT(productId) as numberOfProducts, sum(price) as totalPriceFROM ordersGROUP BY customerId;Return value:
|customerId|numberOfProducts|totalPrice |---|---|---|---|--- |1|3|800 |2|1|50 |3|1|700
GROUP BY multiple columns
Section titled “GROUP BY multiple columns”One might want to GROUP BY more than one column
declare @temp table(age int, name varchar(15))
insert into @tempselect 18, 'matt' union allselect 21, 'matt' union allselect 21, 'matt' union allselect 18, 'luke' union allselect 18, 'luke' union allselect 21, 'luke' union allselect 18, 'luke' union allselect 21, 'luke'
SELECT Age, Name, count(1) countFROM @tempGROUP BY Age, Namewill group by both age and name and will produce:
|Age|Name|count |---|---|---|---|--- |18|luke|3 |21|luke|2 |18|matt|1 |21|matt|2
GROUP BY with ROLLUP and CUBE
Section titled “GROUP BY with ROLLUP and CUBE”The ROLLUP operator is useful in generating reports that contain subtotals and totals.
SELECT CASE WHEN (GROUPING(Item) = 1) THEN 'ALL' ELSE ISNULL(Item, 'UNKNOWN') END AS Item, CASE WHEN (GROUPING(Color) = 1) THEN 'ALL' ELSE ISNULL(Color, 'UNKNOWN') END AS Color, SUM(Quantity) AS QtySumFROM InventoryGROUP BY Item, Color WITH ROLLUP
Item Color QtySum-------------------- -------------------- --------------------------Chair Blue 101.00Chair Red 210.00Chair ALL 311.00Table Blue 124.00Table Red 223.00Table ALL 347.00ALL ALL 658.00(7 row(s) affected)
If the ROLLUP keyword in the query is changed to CUBE, the CUBE result set is the same, except these two additional rows are returned at the end:
ALL Blue 225.00ALL Red 433.00https://technet.microsoft.com/en-us/library/ms189305(v=sql.90).aspx
Group by with multiple tables, multiple columns
Section titled “Group by with multiple tables, multiple columns”Group by is often used with join statement. Let’s assume we have two tables. The first one is the table of students:
|Id|Full Name|Age |---|---|---|---|--- |1|Matt Jones|20 |2|Frank Blue|21 |3|Anthony Angel|18
Second table is the table of subject each student can take:
|Subject_Id|Subject |---|---|---|---|--- |1|Maths |2|P.E. |3|Physics
And because one student can attend many subjects and one subject can be attended by many students (therefore N:N relationship) we need to have third “bounding” table. Let’s call the table Students_subjects:
|Subject_Id|Student_Id |---|---|---|---|--- |1|1 |2|2 |2|1 |3|2 |1|3 |1|1
Now lets say we want to know the number of subjects each student is attending. Here the standalone GROUP BY statement is not sufficient as the information is not available through single table. Therefore we need to use GROUP BY with the JOIN statement:
Select Students.FullName, COUNT(Subject Id) as SubjectNumber FROM Students_SubjectsLEFT JOIN StudentsON Students_Subjects.Student_id = Students.IdGROUP BY Students.FullNameThe result of the given query is as follows:
|FullName|SubjectNumber |---|---|---|---|--- |Matt Jones|3 |Frank Blue|2 |Anthony Angel|1
For an even more complex example of GROUP BY usage, let’s say student might be able to assign the same subject to his name more than once (as shown in table Students_Subjects). In this scenario we might be able to count number of times each subject was assigned to a student by GROUPing by more than one column:
SELECT Students.FullName, Subjects.Subject,COUNT(Students_subjects.Subject_id) AS NumberOfOrdersFROM ((Students_SubjectsINNER JOIN StudentsON Students_Subjcets.Student_id=Students.Id)INNER JOIN SubjectsON Students_Subjects.Subject_id=Subjects.Subject_id)GROUP BY Fullname,SubjectThis query gives the following result:
|FullName|Subject|SubjectNumber |---|---|---|---|--- |Matt Jones|Maths|2 |Matt Jones|P.E|1 |Frank Blue|P.E|1 |Frank Blue|Physics|1 |Anthony Angel|Maths|1
HAVING
Section titled “HAVING”Because the WHERE clause is evaluated before GROUP BY, you cannot use WHERE to pare down results of the grouping (typically an aggregate function, such as COUNT(*)). To meet this need, the HAVING clause can be used.
For example, using the following data:
DECLARE @orders TABLE(OrderID INT, Name NVARCHAR(100))
INSERT INTO @orders VALUES( 1, 'Matt' ),( 2, 'John' ),( 3, 'Matt' ),( 4, 'Luke' ),( 5, 'John' ),( 6, 'Luke' ),( 7, 'John' ),( 8, 'John' ),( 9, 'Luke' ),( 10, 'John' ),( 11, 'Luke' )If we want to get the number of orders each person has placed, we would use
SELECT Name, COUNT(*) AS 'Orders'FROM @ordersGROUP BY Nameand get
|Name|Orders |---|---|---|---|--- |Matt|2 |John|5 |Luke|4
However, if we want to limit this to individuals who have placed more than two orders, we can add a HAVING clause.
SELECT Name, COUNT(*) AS 'Orders'FROM @ordersGROUP BY NameHAVING COUNT(*) > 2will yield
|Name|Orders |---|---|---|---|--- |John|5 |Luke|4
Note that, much like GROUP BY, the columns put in HAVING must exactly match their counterparts in the SELECT statement. If in the above example we had instead said
SELECT Name, COUNT(DISTINCT OrderID)our HAVING clause would have to say
HAVING COUNT(DISTINCT OrderID) > 2