Database Language: SQL Server
Difficulty: Easy
| Column Name | Type | | ----------- | ------- | | id | int | | name | varchar |
`id` is the primary key (column with unique VALUES) for this table. Each row of this table indicates the ID and name of a customer.
| Column Name | Type | | ----------- | ---- | | id | int | | customerId | int |
`id` is the primary key (column with unique VALUES) for this table.
customerId is a foreign key (reference columns) of the ID from the Customers table.
Each row of this table indicates the ID of an order and the ID of the customer who ordered it.
Write a solution to find all customers who never order anything.
Return the result table in any order.
The result format is in the following example.
Customers table:
| id | name | | -- | ----- | | 1 | Joe | | 2 | Henry | | 3 | Sam | | 4 | Max |
Orders table:
| id | customerId | | -- | ---------- | | 1 | 3 | | 2 | 1 |
| Customers | | --------- | | Henry | | Max |
CREATE TABLE [dbo].[Customers] (id INT PRIMARY KEY, name VARCHAR(255)); CREATE TABLE [dbo].[Orders] (id INT PRIMARY KEY, customerId INT); TRUNCATE TABLE [dbo].[Customers]; INSERT INTO [dbo].[Customers] (id, name) VALUES ('1', 'Joe'); INSERT INTO [dbo].[Customers] (id, name) VALUES ('2', 'Henry'); INSERT INTO [dbo].[Customers] (id, name) VALUES ('3', 'Sam'); INSERT INTO [dbo].[Customers] (id, name) VALUES ('4', 'Max'); TRUNCATE TABLE [dbo].[Orders]; INSERT INTO [dbo].[Orders] (id, customerId) VALUES ('1', '3'); INSERT INTO [dbo].[Orders] (id, customerId) VALUES ('2', '1');
When doing queries on 2 tables wherein the goal is to return all rows in the first table that does not exist on a second table, there are 3 possible ways of accomplishing this:
To identify rows in one table that does not exist on another table using the `LEFT OUTER JOIN`, the basic query structure is as follows:
SELECT <Output Columns> FROM <Table 1> LEFT OUTER JOIN <Table 2> ON <Table 1>.<Column 1> = <Table 2>.<Column 1> WHERE <Table 2>.<Column 2> IS NULL
Using this basic query structure, to identify customers in the `Customers` who has never ordered anything (or does not have any rows in the `Orders` table based on the `customerId`), the 2 tables need to be joined on the customer ID column and then the result needs to be filtered out by checking if the `Orders.id IS NULL`. Here's how the query will look like using the `LEFT OUTER JOIN`:
# Final Solution 1 Query - Using LEFT OUTER JOIN SELECT name AS Customers FROM Customers LEFT OUTER JOIN Orders ON Customers.id = Orders.customerId WHERE Orders.id IS NULL
Here's the corresponding query plan generated by SQL Server:
|--Filter(WHERE:([leetcode].[dbo].[Orders].[id] IS NULL)) |--Nested Loops(Left Outer Join, WHERE:([leetcode].[dbo].[Customers].[id]=[leetcode].[dbo].[Orders].[customerId])) |--Clustered Index Scan(OBJECT:([leetcode].[dbo].[Customers].[PK_Customer])) |--Clustered Index Scan(OBJECT:([leetcode].[dbo].[Orders].[PK_Orders]))
And here's the fastest runtime for this query:
Runtime: 362ms
Beats: 83.54% as of July 16, 2024
The basic structure of using `NOT EXISTS` in identifying rows in one table where no row exists on another table is as follows:
SELECT <Output Columns> FROM <Table 1> WHERE NOT EXISTS (SELECT 'X' FROM <Table 2> WHERE <Table 1>.<Column> = <Table 2>.<Column>)
Using this basic query structure that uses the `NOT EXISTS` to identify customers in the `Customers` table that do not have any orders in the `Orders` table using the `customerId` in the `Orders` to match against the `id` in the `Customers` table, the query will look as follows:
# Final Solution 2 Query - Using NOT EXISTS SELECT name AS Customers FROM Customers WHERE NOT EXISTS (SELECT 'X' FROM Orders WHERE Customers.id = Orders.customerId)
Here's the query plan generated by SQL Server for this query:
|--Nested Loops(Left Anti Semi Join, OUTER REFERENCES:([leetcode].[dbo].[Customers].[id])) |--Clustered Index Scan(OBJECT:([leetcode].[dbo].[Customers].[PK_Customer])) |--Top(TOP EXPRESSION:((1))) |--Clustered Index Scan(OBJECT:([leetcode].[dbo].[Orders].[PK_Orders]), WHERE:([leetcode].[dbo].[Customers].[id]=[leetcode].[dbo].[Orders].[customerId]))
And here's the fastest runtime for the `NOT EXISTS` solution:
Runtime: 361ms
Beats: 84.02% as of July 16, 2024
The basic structure in using `NOT IN` in identifying rows in one table where no row exists on another table is as follows:
SELECT <Output Columns> FROM <Table 1> WHERE <Column 1> NOT IN (SELECT <Column> FROM <Table 2>)
Using this basic query structure that uses the `NOT IN` to identify customers in the `Customers` table that do not have any orders in the `Orders` table using the `customerId` in the `Orders` to match against the `id` in the `Customers` table, the query will look as follows:
# Final Solution 3 Query - Using NOT IN SELECT name AS Customers FROM Customers WHERE id NOT IN (SELECT customerId FROM Orders)
Here's the query plan generated by SQL Server for this query:
|--Nested Loops(Left Anti Semi Join, WHERE:([leetcode].[dbo].[Orders].[customerId] IS NULL OR [leetcode].[dbo].[Customers].[id]=[leetcode].[dbo].[Orders].[customerId])) |--Clustered Index Scan(OBJECT:([leetcode].[dbo].[Customers].[PK_Customer])) |--Clustered Index Scan(OBJECT:([leetcode].[dbo].[Orders].[PK_Orders]))
And here's the fastest runtime for the `NOT EXISTS` solution:
Runtime: 364ms
Beats: 82.37% as of July 16, 2024
Here's the comparison of the fastest runtime for each of the solutions.
| Solution # | Runtime | Beats | | ------------------------- | ------- | ------ | | 1 - Using LEFT OUTER JOIN | 362ms | 83.54% | | 2 - Using NOT EXISTS | 361ms | 84.02% | | 3 - Using NOT IN | 364ms | 82.37% |