LeetCode 1148 - Article Views I¶
Database Language: MySQL
Difficulty:
Problem Description¶
Input¶
Table: Views¶
Column Name | Type |
---|---|
article_id | int |
author_id | int |
viewer_id | int |
view_date | date |
There is no primary key (column with unique values) for this table, the table may have duplicate rows.
Each row of this table indicates that some viewer viewed an article (written by some author) on some date.
Note that equal author_id
and viewer_id
indicate the same person.
Requirement¶
Write a solution to find all the authors that viewed at least one of their own articles.
Return the result table sorted by id in ascending order.
The result format is in the following example.
Examples¶
Example 1¶
Input¶
Views table:
article_id | author_id | viewer_id | view_date |
---|---|---|---|
1 | 3 | 5 | 2019-08-01 |
1 | 3 | 6 | 2019-08-02 |
2 | 7 | 7 | 2019-08-01 |
2 | 7 | 6 | 2019-08-02 |
4 | 7 | 1 | 2019-07-22 |
3 | 4 | 4 | 2019-07-21 |
3 | 4 | 4 | 2019-07-21 |
Output¶
id |
---|
4 |
7 |
SQL Schema¶
CREATE TABLE IF NOT EXISTS Views (article_id int, author_id int, viewer_id int, view_date date);
TRUNCATE TABLE Views;
INSERT INTO Views (article_id, author_id, viewer_id, view_date)
VALUES ('1', '3', '5', '2019-08-01');
INSERT INTO Views (article_id, author_id, viewer_id, view_date)
VALUES ('1', '3', '6', '2019-08-02');
INSERT INTO Views (article_id, author_id, viewer_id, view_date)
VALUES ('2', '7', '7', '2019-08-01');
INSERT INTO Views (article_id, author_id, viewer_id, view_date)
VALUES ('2', '7', '6', '2019-08-02');
INSERT INTO Views (article_id, author_id, viewer_id, view_date)
VALUES ('4', '7', '1', '2019-07-22');
INSERT INTO Views (article_id, author_id, viewer_id, view_date)
VALUES ('3', '4', '4', '2019-07-21');
INSERT INTO Views (article_id, author_id, viewer_id, view_date)
VALUES ('3', '4', '4', '2019-07-21');
Solutions¶
Solution 1 - Using DISTINCT Clause¶
Based on the description of the input table as well as the description of the requirements, the query should return the id
of the authors from the Views
table where the author_id
is the same as the viewer_id
and return the result in ascending order by the id
of the author. From this requirement, the query will be as follows (an alias of id
is assigned to the author_id
as this is what the output requires):
SELECT author_id AS id
FROM Views
WHERE author_id = viewer_id
ORDER BY id
The output of this query is as follows:
id |
---|
4 |
4 |
7 |
As can be seen from this output, author #4 is returned twice because there are 2 rows in the Views
table where author #4 read the same article but the required output only wants each author once even if the author read multiple articles that the author wrote. To remove duplicates, the DISTINCT
clause will be added to the query:
# Final Solution Query
SELECT DISTINCT author_id AS id
FROM Views
WHERE author_id = viewer_id
ORDER BY id
Here's the query execution plan generated by MySQL for this query:
-> Sort: views.author_id (actual time=0.053..0.053 rows=2 loops=1)
-> Table scan on <temporary> (actual time=0.000..0.001 rows=2 loops=1)
-> Temporary table with deduplication (cost=0.95 rows=7) (actual time=0.043..0.044 rows=2 loops=1)
-> Filter: (views.viewer_id = views.author_id) (cost=0.95 rows=7) (actual time=0.025..0.029 rows=3 loops=1)
-> Table scan on Views (cost=0.95 rows=7) (actual time=0.019..0.026 rows=7 loops=1)
And here's the fastest time for this query:
- Runtime: 365ms
- Beats: 87.22% as of August 3, 2024
Solution 2 - Using GROUP BY Clause¶
Instead of using the DISTINCT
clause, another way of removing duplicates from the output is with the use of GROUP BY
. According to MySQL - DISTINCT Optimization:
In most cases, a
DISTINCT
clause can be considered as a special case ofGROUP BY
. For example, the following two queries are equivalent:
SELECT DISTINCT c1, c2, c3 FROM t1 WHERE c1 > const;
SELECT c1, c2, c3 FROM t1 WHERE c1 > const GROUP BY c1, c2, c3;
Here's how the query will look like using the GROUP BY
instead of the DISTINCT
clause:
# Final Solution Query
SELECT author_id AS id
FROM Views
WHERE author_id = viewer_id
GROUP BY author_id
ORDER BY author_id
Here's the query execution plan generated by MySQL for this query:
-> Sort: views.author_id (actual time=0.087..0.087 rows=2 loops=1)
-> Table scan on <temporary> (actual time=0.001..0.002 rows=2 loops=1)
-> Temporary table with deduplication (cost=0.95 rows=7) (actual time=0.071..0.072 rows=2 loops=1)
-> Filter: (views.viewer_id = views.author_id) (cost=0.95 rows=7) (actual time=0.037..0.046 rows=3 loops=1)
-> Table scan on Views (cost=0.95 rows=7) (actual time=0.027..0.041 rows=7 loops=1)
And here's the fastest runtime for the query:
- Runtime: 356ms
- Beats: 92.44% as of August 3, 2024
Solution Runtime Comparison¶
Here's the comparison of the fastest runtime for each of the solutions.
Solution # | Runtime | Beats |
---|---|---|
1 - Using DISTINCT | 365ms | 87.22% |
2 - Using GROUP BY | 356ms | 92.44% |