Introduction
Joins are always an essential topic of SQL. In most of the Data Engineering interviews, questions are asked about SQL Joins irrespective of the number of years of experience one has.
If you know SQL Joins concepts then these SQL joins questions are not difficult to answer. But sometimes interviewers make it trickier by asking SQL joins puzzles, where these join columns may contain either duplicate values or NULL values which makes the result interesting and people easily get confused. Here in this blog, I am going to explain some of these SQL Joins tricky interview questions.
In this blog, we are going to cover the SQL Join interview questions with an explanation of where columns with SQL Joins conditions may contain:
- Unique values
- Duplicate values
- Null values
This blog on SQL Joins will help you in getting more information about SQL Joins concepts.
Tables to be use
In this blog, we are going to use 2 tables tbl_samplejoin1 & tbl_samplejoin2. Each of these 2 tables contains one column on which the join condition has been defined. Table tbl_samplejoin1 contains the column Col1 and table tbl_samplejoin2 has the column Col2.
Here we are going to explain the impact of having the unique records or duplicate records or Null values on the result of the Inner Join, Left Outer Join, Right Outer Join, and the Full Outer Join.
Scenario 1: Join Columns having Unique Values
This is the most basic and common scenario where you are asked the result of a query containing either Inner SQL Join or Left Outer Join or Right Outer Join or the Full Outer Join using the table containing the unique records in the columns on which join is defined.
Below is the screenshot of the tables containing the unique records.
Answers with explanation:
a) Inner Join: If we look at the records of both tables, there are only 2 matching records with values 1 & 2. Therefore, when we execute the inner query mentioned in the below screenshot, it gives only 2 records.
b) Left Outer Join: If look at both tables' data, we found that two records (1 & 2) of the table tbl_samplejoin1 have the matching records in the second table tbl_samplejoin2. But the remaining two records (3 & 5) don’t have the matching record in the second table. So result-set returns records 1 & 2 of table tbl_samplejoin1 with matching records 1 & 2 from right table and for records 3 & 5, it return with NULL value.
c) Right Outer Join: Since the table on the right tbl_samplejoin2 contains 3 rows with values 1, 2 & 4. We have matching values for 1 & 2 in the left table but not for 4. So the first two rows of the outcome contain the matching records of values 1& 2 but the third row has NULL from the left table and 4 from the Right table.
d) FULL Outer Join: In the case of a FULL outer join, we got 5 rows. The first & second rows contain the matching records of both tables. Since there is no record in the right table with values 3 & 5 so we have 3rd & 4th rows with values 3 & 5 from the left table and NULL value from the right table. Now left table tbl_samplejoin1 contains no record with value 4 so we have the 5th row in the outcome where we have value 4 from the right table and NULL value from the left table.
Scenario 2: Join columns having duplicate values
Let insert value 1 into the table tbl_samplejoin2 so that these tables contain the duplicate values. Let's check how this duplicate record impacts the outcomes of the join queries.
a) Inner Join: With duplicate values in the join column, we found one extra row has been added to the result-set. This is because we have now 2 matching rows in table tbl_samplejoin2 containing the value “1” for the value “1” of the table tbl_samplejoin1.
b) Left Outer Join: If we look at both tables' data, we will find that two records (1 & 2) of the table tbl_samplejoin1 have the matching records in the second table tbl_samplejoin2. But the remaining two records (3 & 5) don’t have the matching record in the second table. So result-set returns values of 1 & 2 from table tbl_samplejoin1 with matching values 1 & 2 from second table and for records 3 & 5, it return with NULL value.
c) Right Outer Join: The table on the right tbl_samplejoin2 contains 4 rows with values 1,1, 2 & 4. We have matching values for 1 & 2 in the left table but not for 4. So the first three rows of the outcome contain the matching records of values 1& 2 but the fourth row has NULL from the left table and 4 from the Right table.
d) Full Outer Join: In the case of a FULL outer join, we got 6 rows. The first, second, & third rows contain the matching records of both tables. Since there is no record in the right table with values 3 & 5 so we have 4th & 5thth rows with values 3 & 5 from the left table and NULL value from the right table. Now left table tbl_samplejoin1 contains no record with value 4 so we have the 6th row in the outcome where we have value 4 from the right table and NULL value from the left table.
Scenario 3: One Join table contains a Null Value
After adding the duplicate values, let's make it more interesting by adding the Null values and will see their impact on the joins. Let's add NULL value in the 2nd table tbl_samplejoin2.
a) Inner Join: There will be no impact of adding NULL value in the 2nd table and we got the same result-set as we got in the 2nd scenario.
b) Left Outer Join: Since in the case of Left outer join, we got all records of the left table with corresponding matching records from the right table in case of match or NULL value from the right table in case of no match. So there is no impact of adding NULL value in the 2nd table tbl_samplejoin2.
c) Right Outer Join: As we have added one extra record of NULL value in the 2nd table tbl_samplejoin2 and there is no matching value of NULL in the 1st table. So we have the 5th row with one NULL value from the right table tbl_samplejoin2 and another NULL from the left table as there is no matching record of NULL value in the left table.
d) Full Outer Join: In the case of a FULL Outer Join we will get one extra row. This extra row contains one NULL value from the right table tbl_samplejoin2 and another NULL from the left table as there is no matching record of NULL value in the left table.
Note: Scenario 4: Both Tables containing Null Values, and Scenario 5: Join tables containing Multiple Null Values will read in the next post.