LeetCode 584 - Find Customer Referee¶
Database Language: MySQL
Difficulty:
Problem Description¶
Input¶
Table: Customer¶
Column Name | Type |
---|---|
id | int |
name | varchar |
referee_id | int |
In SQL, id
is the primary key column for this table.
Each row of this table indicates the id of a customer, their name, and the id of the customer who referred them.
Requirement¶
Find the names of the customer that are not referred by the customer with id = 2.
Return the result table in any order.
The result format is in the following example.
Examples¶
Example 1¶
Input¶
Customer table:
id | name | referee_id |
---|---|---|
1 | Will | NULL |
2 | Jane | NULL |
3 | Alex | 2 |
4 | Bill | NULL |
5 | Zack | 1 |
6 | Mark | 2 |
Output¶
name |
---|
Will |
Jane |
Bill |
Zack |
SQL Schema¶
CREATE TABLE IF NOT EXISTS Customer (id INT PRIMARY KEY, name VARCHAR(25), referee_id INT);
TRUNCATE TABLE Customer;
INSERT INTO Customer (id, name, referee_id) values ('1', 'Will', NULL);
INSERT INTO Customer (id, name, referee_id) values ('2', 'Jane', NULL);
INSERT INTO Customer (id, name, referee_id) values ('3', 'Alex', '2');
INSERT INTO Customer (id, name, referee_id) values ('4', 'Bill', NULL);
INSERT INTO Customer (id, name, referee_id) values ('5', 'Zack', '1');
INSERT INTO Customer (id, name, referee_id) values ('6', 'Mark', '2');
Solutions¶
There are three possible ways of finding the names of customers that are not referred by the customer with id = 2, namely:
- Using the
OR
operator - Using the
IFNULL
flow control function - Using the
COALESCE
comparison function
Solution 1 - Using OR Operator¶
To find the names of customers that are not referred by the customer with id = 2, all that needs to be done is check if the referee_id
is not equal to 2. This can be achieved by the following query:
SELECT name FROM Customer
WHERE referee_id != 2
Running this query returns the following result, which is not the required result:
name |
---|
Zack |
The query is not returning those customers where the referee_id
is NULL because NULL != 2
returns a NULL value, which is treated as false by MySQL. According to MySQL - Working with NULL Values,
You cannot use arithmetic comparison operators such as =, <, or <> to test for NULL. Because the result of any arithmetic comparison with NULL is also NULL, you cannot obtain any meaningful results from such comparisons.
To include customers that were not referred by any other customers, referee_id IS NULL
needs to be added in the query. The condition referee_id = NULL
cannot be used because as mentioned above, the result of any arithmetic comparison, such as =
, with NULL is also NULL. Adding the referee_id IS NULL
condition to the query above now yields the following query:
# Final Solution Query
SELECT name FROM Customer
WHERE referee_id != 2 OR referee_id IS NULL
This query now returns the desired result. Here's the query plan generated by MySQL for this query:
id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
---|---|---|---|---|---|---|---|---|---|---|---|
1 | SIMPLE | Customer | ALL | 6 | 86.11 | Using where |
And here's the fastest runtime for this query:
- Runtime: 386ms
- Beats: 97.04% as of July 16, 2024
Alternatively, instead of using the IS NULL
comparison operator, the <=>
(NULL-safe equal) operator can be used.
SELECT name FROM Customer
WHERE referee_id != 2 OR referee_id <=> NULL
One may ask, does the order of the conditions used by the OR
operator matter? So instead of WHERE referee_id != 2 OR referee_id IS NULL
, will it be faster if the conditions were interchanged to WHERE referee_id IS NULL OR referee_id != 2
? Will the following query be faster than the one earlier?
# Final Solution Query
SELECT name FROM Customer
WHERE referee_id IS NULL OR referee_id != 2
Comparing the query plan of this updated query with the one above, it can be seen that the same query plan was generated by MySQL:
id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
---|---|---|---|---|---|---|---|---|---|---|---|
1 | SIMPLE | Customer | ALL | 6 | 86.11 | Using where |
In addition, the fastest runtime for the "interchanged" condition is somewhat close to the one above:
- Runtime: 393ms
- Beats: 95.28% as of July 16, 2024
There have been a lot of discussions in the MySQL community about whether MySQL performs a "short-circuit" when evaluating the WHERE
clause that involves the OR
operator and no "official" documentation can be found that neither confirms nor denies that MySQL performs a "short-circuitwhen evaluating the
WHEREclause that involves the
OR` operator.
One other question that may be asked, given that the referee_id
is being compared between 2 values, namely the value 2 and NULL, can't the NOT IN
comparison operator be used:
SELECT name FROM Customer
WHERE referee_id NOT IN (NULL, 2);
Running this query will return an empty set. The reason the query is returning an empty set is because according to MySQL Comparisong Functions and Operators - IN Operator:
To comply with the SQL standard, IN() returns NULL not only if the expression on the left hand side is NULL, but also if no match is found in the list and one of the expressions in the list is NULL.
Since one of the expressions or values in the list is a NULL value, the result of the condition returns a NULL value. Thus, the referee_id NOT IN (NULL, 2)
condition returns False for each row in the table because in MySQL, 0 or NULL means false and anything else means true (Reference: MySQL - Working with NULL Values)
Solution 2 - Using IFNULL Flow Control Function¶
The second way of finding the names of customers who were not referred by the customer with id = 2 is with the use of the IFNULL
flow control function. The IFNULL
function accepts 2 parameters and returns the first parameter if it is not NULL; otherwise it returns the second parameter.
Here's how the query will now look using the IFNULL
flow control function instead of the OR
operator:
# Final Solution Query
SELECT name FROM Customer
WHERE IFNULL(referee_id, 0) != 2
With the IFNULL(referee_id, 0)
, if the value of referee_id
is not NULL, then the function returns the value of referee_id
. Otherwise, if the value of referee_id
is NULL, the function returns the second parameter, which is in this case, 0. The value passed as the second parameter of the IFNULL
function, in this case, can be any number as long as it is not 2 so that if referee_id
is NULL, the WHERE
clause will return a true value.
Here's the query plan generated by MySQL:
id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
---|---|---|---|---|---|---|---|---|---|---|---|
1 | SIMPLE | Customer | ALL | 6 | 100.00 | Using where |
And here's the fastest runtime for this query using the IFNULL
function:
- Runtime: 399ms
- Beats: 93.37%
Solution 3 - Using COALESCE Comparison Function¶
The third way of finding the names of customers who were not referred by the customer with id = 2 is with the use of the COALESCE
comparison function. The COALESCE
function returns the first non-NULL value in the list or NULL if there are no non-NULL values in the list.
Here's how the query will look like using the COALESCE
comparison function instead of the IFNULL
flow control function or the OR
operator:
# Final Solution Query
SELECT name FROM Customer
WHERE COALESCE(referee_id, 0) != 2
Similar to the IFNULL
function in the second solution, with the COALESCE(referee_id, 0)
, if the value of referee_id
is not NULL, then the COALESCE
function returns the value of referee_id
because that's the first non-NULL value in the list of parameters passed to the COALESCE
function. Otherwise, if the value of referee_id
is NULL, the COALESCE(referee_id, 0)
function returns 0 because that's the first non-NULL value in the list of parameters passed to the function. Just like the case for the IFNULL
function in the second solution, the second parameter passed to the COALESCE
function can be any number aside from 0 as long as it is not 2 so that the WHERE
clause will return a true value if referee_id
is NULL.
Here's the query plan generated by MySQL:
id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
---|---|---|---|---|---|---|---|---|---|---|---|
1 | SIMPLE | Customer | ALL | 6 | 100.00 | Using where |
The query plan using the COALESCE
comparison function is the same as the query plan for the IFNULL
function. And here's the fastest runtime for the query that uses the COALESCE
function:
- Runtime: 411ms
- Beats: 88.48% as of July 16, 2024
Solution Runtime Comparison¶
Here's the comparison of the fastest runtime for each of the solutions.
Solution # | Runtime | Beats |
---|---|---|
1a - Using OR Operator - NULL Check Last | 386ms | 97.04% |
1b - Using OR Operator - NULL Check First | 393ms | 95.28% |
2 - Using IFNULL Function | 399ms | 93.37% |
3 - Using COALESCE Function | 411ms | 88.48% |
As can be seen, the fastest runtime for both IFNULL
and COALESCE
solutions are similar (or near each other) since both solutions have the same execution plan. Also, as can be expected, the solution that uses just the OR
operator is faster than the solutions that uses either the IFNULL
or the COALESCE
function because the OR
solution is not using any additional functions.