Database Language: SQL Server
Difficulty: Easy
| Column Name     | Type |
| --------------- | ---- |
| order_number    | int  |
| customer_number | int  |
`order_number` is the primary key (column with unique values) for this table.
This table contains information about the order ID and the customer ID.
Write a solution to find the customer_number for the customer who has placed the largest number of orders.
The test cases are generated so that exactly one customer will have placed more orders than any other customer.
The result format is in the following example.
Orders table:
| order_number | customer_number |
| ------------ | --------------- |
| 1            | 1               |
| 2            | 2               |
| 3            | 3               |
| 4            | 3               |

Output:

| customer_number |
| --------------- |
| 3               |
Customer 3 has two orders, which is more than customer 1 or customer 2, each of whom has only one order. So the result is customer_number 3.
```sql
CREATE TABLE orders (order_number INT PRIMARY KEY, customer_number INT);
TRUNCATE TABLE orders;
INSERT INTO orders (order_number, customer_number) values ('1', '1');
INSERT INTO orders (order_number, customer_number) values ('2', '2');
INSERT INTO orders (order_number, customer_number) values ('3', '3');
INSERT INTO orders (order_number, customer_number) values ('4', '3');
```
To find the customer who has placed the largest number of orders, the number of orders placed by each customer needs to be determined first. This can be done by grouping the rows by `customer_number` and applying the `COUNT()` aggregate function:
```sql
SELECT customer_number, COUNT(order_number) AS order_count
FROM Orders
GROUP BY customer_number
```

| customer_number | order_count |
| --------------- | ----------- |
| 1               | 1           |
| 2               | 1           |
| 3               | 2           |
Since the question wants the customer who placed the largest number of orders, the output of the previous query needs to be sorted by the `order_count` in descending order:
```sql
SELECT customer_number, COUNT(order_number) AS order_count
FROM Orders
GROUP BY customer_number
ORDER BY order_count DESC
```

| customer_number | order_count |
| --------------- | ----------- |
| 3               | 2           |
| 1               | 1           |
| 2               | 1           |
The query above returns all customers, but the question only wants the single customer who has placed the most orders. To address this requirement, the output needs to be limited to just 1 row using the `TOP 1` clause (SQL Server does not support `LIMIT`):
```sql
SELECT TOP 1 customer_number, COUNT(order_number) AS order_count
FROM Orders
GROUP BY customer_number
ORDER BY order_count DESC
```

| customer_number | order_count |
| --------------- | ----------- |
| 3               | 2           |
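`TOP 1` is sufficient here only because the test cases guarantee a single customer with the most orders. As a hedged sketch of what could be done if that guarantee did not hold (an assumption beyond this problem's requirements), SQL Server's `WITH TIES` option returns every row that ties with the last row kept by `TOP`:

```sql
-- Sketch only: assumes ties for first place are possible,
-- which this problem's test cases rule out.
-- WITH TIES requires an ORDER BY clause and returns all rows
-- whose ORDER BY value equals that of the last row kept by TOP.
SELECT TOP 1 WITH TIES customer_number, COUNT(order_number) AS order_count
FROM Orders
GROUP BY customer_number
ORDER BY order_count DESC;
```

On the example data this still yields only customer 3, since no other customer has two orders.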
Lastly, the output needed is just the `customer_number` without the number of orders made by that customer so the `order_count` column in the `SELECT` clause needs to be removed:
```sql
SELECT TOP 1 customer_number
FROM Orders
GROUP BY customer_number
ORDER BY order_count DESC
```
But this generates the following error, because the `ORDER BY` clause still references `order_count` even though that alias is no longer defined in the `SELECT` list:

```
Query 1 ERROR: Msg: 207, Line 4, State: 1, Level: 16
Invalid column name 'order_count'.
```
To resolve this, the aggregate expression that previously defined the `order_count` column, `COUNT(order_number)`, is used directly in the `ORDER BY` clause:
```sql
-- Final solution query
SELECT TOP 1 customer_number
FROM Orders
GROUP BY customer_number
ORDER BY COUNT(order_number) DESC
```

| customer_number |
| --------------- |
| 3               |
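An alternative to repeating the aggregate in `ORDER BY` is to compute the counts in a common table expression, so the alias remains available to the outer query. This is a minimal sketch rather than part of the original solution; the CTE name `order_counts` is hypothetical and chosen only for illustration:

```sql
-- Sketch: keep the order_count alias by computing it in a CTE.
-- "order_counts" is an illustrative name, not from the original write-up.
WITH order_counts AS (
    SELECT customer_number, COUNT(order_number) AS order_count
    FROM Orders
    GROUP BY customer_number
)
SELECT TOP 1 customer_number
FROM order_counts
ORDER BY order_count DESC;
```

The outer query can sort by `order_count` because the column exists in the CTE's result, even though it is not part of the final output.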
Here's the query plan generated by SQL Server for this query:
```
  |--Sort(TOP 1, ORDER BY:([Expr1002] DESC))
       |--Compute Scalar(DEFINE:([Expr1002]=CONVERT_IMPLICIT(int,[Expr1005],0)))
            |--Stream Aggregate(GROUP BY:([leetcode].[dbo].[Orders].[customer_number]) DEFINE:([Expr1005]=Count(*)))
                 |--Sort(ORDER BY:([leetcode].[dbo].[Orders].[customer_number] ASC))
                      |--Clustered Index Scan(OBJECT:([leetcode].[dbo].[Orders].[PK_Orders]))
```
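The plan text above resembles what `SET SHOWPLAN_TEXT` produces. Assuming that is how it was captured (the write-up does not say), a plan like this could be reproduced roughly as follows. `GO` is the batch separator used by tools such as SSMS and sqlcmd, not a T-SQL statement:

```sql
-- Sketch: ask SQL Server for the estimated plan as text.
-- SET SHOWPLAN_TEXT must be the only statement in its batch.
SET SHOWPLAN_TEXT ON;
GO

-- While SHOWPLAN_TEXT is ON, the query is not executed;
-- the plan is returned instead of the result rows.
SELECT TOP 1 customer_number
FROM Orders
GROUP BY customer_number
ORDER BY COUNT(order_number) DESC;
GO

SET SHOWPLAN_TEXT OFF;
GO
```

Operator names and expression labels such as `Expr1002` may differ between servers and versions.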
And here's the fastest runtime for this query:
Runtime: 446ms
Beats: 93.21% as of July 27, 2024