LeetCode 184 - Department Highest Salary¶
Database Language: PostgreSQL
Difficulty:
Problem Description¶
Input¶
Table: Employee¶
Column Name | Type |
---|---|
id | int |
name | varchar |
salary | int |
departmentId | int |
id
is the primary key (column with unique VALUES) for this table. departmentId
is a foreign key (reference columns) of the ID from the Department
table. Each row of this table indicates the ID, name, and salary of an employee. It also contains the ID of their department.
Table: Department¶
Column Name | Type |
---|---|
id | int |
name | varchar |
id
is the primary key (column with unique VALUES) for this table. It is guaranteed that department name is not NULL. Each row of this table indicates the ID of a department and its name.
Requirement¶
Write a solution to find employees who have the highest salary in each of the departments.
Return the result table in any order.
The result format is in the following example.
Examples¶
Example 1¶
Input¶
Employee table:
id | name | salary | departmentId |
---|---|---|---|
1 | Joe | 70000 | 1 |
2 | Jim | 90000 | 1 |
3 | Henry | 80000 | 2 |
4 | Sam | 60000 | 2 |
5 | Max | 90000 | 1 |
Department table:
id | name |
---|---|
1 | IT |
2 | Sales |
Output¶
Department | Employee | Salary |
---|---|---|
IT | Jim | 90000 |
Sales | Henry | 80000 |
IT | Max | 90000 |
Explanation¶
Max and Jim both have the highest salary in the IT department and Henry has the highest salary in the Sales department.
SQL Schema¶
CREATE TABLE IF NOT EXISTS Employee (id int PRIMARY KEY, name VARCHAR(255), salary INT, departmentId INT);
CREATE TABLE IF NOT EXISTS Department (id int PRIMARY KEY, name VARCHAR(255));
TRUNCATE TABLE Employee;
INSERT INTO Employee (id, name, salary, departmentId) VALUES ('1', 'Joe', '70000', '1');
INSERT INTO Employee (id, name, salary, departmentId) VALUES ('2', 'Jim', '90000', '1');
INSERT INTO Employee (id, name, salary, departmentId) VALUES ('3', 'Henry', '80000', '2');
INSERT INTO Employee (id, name, salary, departmentId) VALUES ('4', 'Sam', '60000', '2');
INSERT INTO Employee (id, name, salary, departmentId) VALUES ('5', 'Max', '90000', '1');
TRUNCATE TABLE Department;
INSERT INTO Department (id, name) VALUES ('1', 'IT');
INSERT INTO Department (id, name) VALUES ('2', 'Sales');
Solution¶
Solution #1 - Using IN Operator with Row Constructors¶
The first step in finding the employees who have the highest salary in each of the department is to identify the highest salary of each department. This can be accomplished with the MAX()
aggregate function together with the GROUP BY
clause:
SELECT departmentId, MAX(salary)
FROM Employee
GROUP BY departmentId
The result of this query can now be used as a row constructor which allows the simultaneous comparison of multiple values. The result of the query is used in the following query that returns the employee name, deparment name and employee salary for all employees whose department ID and salary matches the highest salary for the same department:
# Final Solution 1 Query - Using IN Operator with Row Constructors
SELECT Department.name AS Department, Employee.name AS Employee, Employee.Salary
FROM Employee INNER JOIN Department
ON Employee.departmentId = Department.id AND
(Department.id, Employee.Salary) IN (SELECT departmentId, MAX(salary) FROM Employee GROUP BY departmentId)
Here's the query plan generated by PostgreSQL for the query above:
Nested Loop (cost=15.74..28.25 rows=1 width=1036)
-> Hash Join (cost=15.60..27.73 rows=1 width=528)
Hash Cond: ((employee.departmentid = employee_1.departmentid) AND (employee.salary = (max(employee_1.salary))))
-> Seq Scan on employee (cost=0.00..11.40 rows=140 width=524)
-> Hash (cost=13.50..13.50 rows=140 width=8)
-> HashAggregate (cost=12.10..13.50 rows=140 width=8)
Group Key: employee_1.departmentid
-> Seq Scan on employee employee_1 (cost=0.00..11.40 rows=140 width=8)
-> Index Scan using department_pkey on department (cost=0.14..0.51 rows=1 width=520)
Index Cond: (id = employee.departmentid)
And here's the fastest runtime:
- Runtime: 226ms
- Beats: 91.62% as of July 15, 2024
Solution #2 - Using Common Table Expression (CTE)¶
The second way of getting the highest salary of a department is through the use of a common-table expression (CTE). A common table expression (CTE) is a named temporary result set that exists within the scope of a single statement and that can be referred to later within that statement, possibly multiple times.
The first step is to create a query that will get the highest salary for each department. There are a couple of ways of doing this. The first is by using the MAX()
aggregate function together with the GROUP BY
clause of the SELECT
statement using just the Employee
table:
SELECT departmentId, MAX(salary) AS Salary
FROM Employee
GROUP BY departmentId
departmentId | Salary |
---|---|
1 | 90000 |
2 | 80000 |
The second way is to still use the MAX()
aggregate function together with the GROUP BY
clause but instead of just using the Employee
table, the table is joined with the Department
table since the department name is needed in the output.
SELECT Department.name, departmentId, MAX(salary) AS Salary
FROM Employee INNER JOIN Department
ON Employee.departmentId = Department.id
GROUP BY Department.name, departmentId
To simplify the final query, it will be this second option that will be used in the common-table expression (CTE). Now that we have the maximum salary for each deparment by name and ID, this result can now be joined back with the Employee
table to return the names of the employees whose salary matches the highest salary for their department. The final query will look as follows:
# Final Solution 2 Query - Using Common Table Expression (CTE)
WITH CTE_MaxSalary (Department, DepartmentID, Salary) AS
(SELECT Department.name, departmentId, MAX(salary) AS Salary
FROM Employee INNER JOIN Department
ON Employee.departmentId = Department.id
GROUP BY Department.name, departmentId)
SELECT cte.Department, Employee.name AS Employee, cte.Salary
FROM Employee INNER JOIN CTE_MaxSalary cte
ON Employee.departmentId = cte.departmentId AND
Employee.Salary = cte.Salary
Here's the query plan generated by PostgreSQL for this query:
Hash Join (cost=30.88..43.34 rows=1 width=1036)
Hash Cond: ((employee.departmentid = cte.departmentid) AND (employee.salary = cte.salary))
-> Seq Scan on employee (cost=0.00..11.40 rows=140 width=524)
-> Hash (cost=28.78..28.78 rows=140 width=524)
-> Subquery Scan on cte (cost=25.98..28.78 rows=140 width=524)
-> HashAggregate (cost=25.98..27.38 rows=140 width=524)
Group Key: department.name, employee_1.departmentid
-> Hash Join (cost=13.15..24.93 rows=140 width=524)
Hash Cond: (employee_1.departmentid = department.id)
-> Seq Scan on employee employee_1 (cost=0.00..11.40 rows=140 width=8)
-> Hash (cost=11.40..11.40 rows=140 width=520)
-> Seq Scan on department (cost=0.00..11.40 rows=140 width=520)
And here's the fastest runtime for this query:
- Runtime: 222ms
- Beats: 95.81% as of July 15, 2024
Solution Runtime Comparison¶
Here's the comparison of the fastest runtime for each of the solutions.
Solution # | Runtime | Beats |
---|---|---|
1 - Using IN Operator with Row Constructors | 226ms | 91.62% |
2 - Using Common Table Expression (CTE) | 222ms | 95.81% |