LeetCode 596 - Classes More Than 5 Students

Database Language: SQL Server

Difficulty: Easy

Problem Description

Input

Table: Courses

| Column Name | Type    |
| ----------- | ------- |
| student     | varchar |
| class       | varchar |

(`student`, `class`) is the primary key (combination of columns with unique values) for this table.

Each row of this table indicates the name of a student and the class in which they are enrolled.

Requirement

Write a solution to find all the classes that have at least five students.

Return the result table in any order.

The result format is in the following example.

Examples

Example 1

Input

Courses table:

| student | class    |
| ------- | -------- |
| A       | Math     |
| B       | English  |
| C       | Math     |
| D       | Biology  |
| E       | Math     |
| F       | Computer |
| G       | Math     |
| H       | Math     |
| I       | Math     |
Output
| class   |
| ------- |
| Math    |
Explanation

SQL Schema

CREATE TABLE Courses (student varchar(255), class varchar(255));
ALTER TABLE Courses ADD CONSTRAINT PK_Courses PRIMARY KEY (student, class);

TRUNCATE TABLE Courses;
INSERT INTO Courses (student, class) values ('A', 'Math');
INSERT INTO Courses (student, class) values ('B', 'English');
INSERT INTO Courses (student, class) values ('C', 'Math');
INSERT INTO Courses (student, class) values ('D', 'Biology');
INSERT INTO Courses (student, class) values ('E', 'Math');
INSERT INTO Courses (student, class) values ('F', 'Computer');
INSERT INTO Courses (student, class) values ('G', 'Math');
INSERT INTO Courses (student, class) values ('H', 'Math');
INSERT INTO Courses (student, class) values ('I', 'Math');

Solutions

To be able to find all the classes that have at least five students, the number of students per class needs to be determined first. To cound the number of students per class, the `COUNT()` aggregation function will be used together with the `GROUP BY` clause of the `SELECT` statement:

SELECT class, COUNT(student) as student_count
FROM Courses
GROUP BY class
| class    | student_count |
| -------- | ------------- |
| Math     | 6             |
| English  | 1             |
| Biology  | 1             |
| Computer | 1             |

To return just the list of classes that have at least five student, the result above need to be filtered using the `HAVING` clause:

SELECT class, COUNT(student) as student_count
FROM Courses
GROUP BY class
HAVING student_count >= 5
| class | student_count |
| ----- | ------------- |
| Math  | 6             |

The required output for this question is just the class and not include the `student_count` so this column needs to be removed:

SELECT class
FROM Courses
GROUP BY class
HAVING student_count >= 5

But removing the `student_count` column generates the following error because it is being referenced in the `HAVING` clause of the statement:

Query 1 ERROR: Msg: 207, Line 4, State: 1, Level: 16
Invalid column name 'student_count'.

To overcome this error, the `student_count` in the `HAVING` clause needs to be replaced with the expression used for the `student_count`, which is `COUNT(student)`. The final solution is as follows:

# Final Solution Query
SELECT class
FROM Courses
GROUP BY class
HAVING COUNT(student) >= 5

Here's the query plan generated by SQL Server for this query:

  |--Filter(WHERE:([Expr1003]>=(5)))
    |--Compute Scalar(DEFINE:([Expr1003]=CONVERT_IMPLICIT(int,[Expr1006],0)))
            |--Stream Aggregate(GROUP BY:([leetcode].[dbo].[Courses].[class])
               DEFINE:([Expr1006]=COUNT([leetcode].[dbo].[Courses].[student])))
                |--Sort(ORDER BY:([leetcode].[dbo].[Courses].[class] ASC))
                    |--Table Scan(OBJECT:([leetcode].[dbo].[Courses]))

And here's the fastest runtime:

Related Articles: