LeetCode 175 - Combine Two Tables¶
Database Language: SQL Server
Difficulty:
Problem Description¶
Input¶
Table: Person¶
Column Name | Type |
---|---|
personId | int |
lastName | varchar |
firstName | varchar |
personId
is the primary key (column with unique values) for this table.
This table contains information about the ID of some persons and their first and last names.
Table: Address¶
Column Name | Type |
---|---|
addressId | int |
personId | int |
city | varchar |
state | varchar |
addressId
is the primary key (column with unique values) for this table.
Each row of this table contains information about the city and state of one person with ID = PersonId.
Requirement¶
Write a solution to report the first name, last name, city, and state of each person in the Person table. If the address of a personId is not present in the Address table, report null instead.
Return the result table in any order.
The result format is in the following example.
Examples¶
Example 1¶
Input¶
Person table:
personId | lastName | firstName |
---|---|---|
1 | Wang | Allen |
2 | Alice | Bob |
Address table:
addressId | personId | city | state |
---|---|---|---|
1 | 2 | New York City | New York |
2 | 3 | Leetcode | California |
Output¶
firstName | lastName | city | state |
---|---|---|---|
Allen | Wang | NULL | NULL |
Bob | Alice | New York City | New York |
Explanation¶
There is no address in the address table for the personId = 1 so we return null in their city and state. addressId = 1 contains information about the address of personId = 2.
SQL Schema¶
CREATE TABLE [dbo].[Person] (personId int PRIMARY KEY, firstName varchar(255), lastName varchar(255));
CREATE TABLE [dbo].[Address] (addressId int PRIMARY KEY, personId int, city varchar(255), state varchar(255));
TRUNCATE TABLE [dbo].[Person];
INSERT INTO [dbo].[Person] (personId, lastName, firstName) VALUES ('1', 'Wang', 'Allen');
INSERT INTO [dbo].[Person] (personId, lastName, firstName) VALUES ('2', 'Alice', 'Bob');
TRUNCATE TABLE Address;
INSERT INTO [dbo].[Address] (addressId, personId, city, state) VALUES ('1', '2', 'New York City', 'New York');
INSERT INTO [dbo].[Address] (addressId, personId, city, state) VALUES ('2', '3', 'Leetcode', 'California');
Solution¶
Given that the requirement is to return all rows from the Person
table even if there are no corresponding row in the Address
table, the LEFT OUTER JOIN
(or LEFT JOIN
for short) will be used to join the Person
table with the Address
table with the Person
table on the left side of the join clause. The 2 tables will be joined on the personId
column and here's how the join clause will look like:
FROM Person LEFT OUTER JOIN Address
ON Person.personId = Address.personId
The requirement wants the person's firstName
and lastName
as well as the person's city
and state
if it exists in the Address
table and return a NULL value to these 2 columns if they don't exist. Since we are using a LEFT OUTER JOIN
, the city
and state
will automatically return NULL values if an address record does not exist for a person. Here's how the SELECT
statement will look like:
SELECT Person.firstName, Person.lastName, Address.city, Address.state
It is a good practice to include the table name as part of the column name to avoid any ambiguity if you will be joining tables that has a column that exists in both tables. Putting these 2 together yields the following query:
# Final Query
SELECT Person.firstName, Person.lastName, Address.city, Address.state
FROM Person LEFT OUTER JOIN Address
ON Person.personId = Address.personId
The query plan generated by SQL Server for this query is as follows:
|--Nested Loops(Left Outer Join, WHERE:([leetcode].[dbo].[Person].[personId]=[leetcode].[dbo].[Address].[personId]))
|--Clustered Index Scan(OBJECT:([leetcode].[dbo].[Person].[PK_Person]))
|--Clustered Index Scan(OBJECT:([leetcode].[dbo].[Address].[PK_Address]))
And here's the fastest runtime for the query above:
- Runtime: 426ms
- Beats: 84.98% as of July 15, 2024