SQL
by Mike Thompson, 10/11/2015
Table joins are an intermediate-level skill in SQL. Inner joins, the default kind, are good enough until you run into a situation where a value(s) of the foreign key you're using doesn't exist in the other table. Outer joins, the non-default kind, I find to be more useful for this reason, because data is rarely as pure and white as the driven snow. In the real world, inconsistencies creep into your data.
I have an example that I like to use in order to explain the differences between inner and outer joins, and when you would use one rather than the other. The example is based on two tables: one contains the colors of the rainbow, the other contains the colors of the US flag.
table_rainbow
- red
- orange
- yellow
- green
- blue
- purple
table_flag
- red
- white
- blue
The following query displays all the rows in table_rainbow.
Select table_rainbow.color From table_rainbow
- red
- orange
- yellow
- green
- blue
- purple
If we add an inner join to the above query, here's what the query will look like, and what will be returned when it is run.
Select table_rainbow.color From table_rainbow Inner Join table_flag On table_rainbow.color = table_flag.color
- red
- blue
What happened to the other four colors in table_rainbow? Inner join means, "Show only the rows in the first table that have corresponding rows in the second table", the key field(s) being specified in the On clause. The query showed us the colors in table_rainbow that are also in table_flag.
Outer join means, "Show the rows in the first table, whether or not they have corresponding rows in the second table." The following query is identical to the one above, except that "Inner" has been replaced with "Outer". Note that it returns exactly the same results as the very first query, which contained no join at all.
Select table_rainbow.color From table_rainbow Outer Join table_flag On table_rainbow.color = table_flag.color
- red
- orange
- yellow
- green
- blue
- purple
So if a query with an outer join returns the same rows as a query with no join at all, why would you ever bother to use an outer join? Well, let's say that you wanted to list the colors that are in table_rainbow but are not in table_flag. Here's a query that does that. Note carefully both the Outer Join clause and the Where clause (but not the bear claws, which I already ate when you weren't looking, along with the apple fritter).
Select table_rainbow.color From table_rainbow Outer Join table_flag On table_rainbow.color = table_flag.color Where table_flag.color Is Null
- orange
- yellow
- green
- purple
"Where table_flag.color Is Null" means "Where there is no corresponding color in the flag." Of course, you can play this game in the other direction too.
Select table_flag.color From table_flag Outer Join table_rainbow On table_flag.color = table_rainbow.color Where table_rainbow.color Is Null
- white
Finally, what if you want to know all the colors that are either in table_rainbow or in table_flag? The easiest way to do it is with a Union clause because Union, like the set operation after which it is named, eliminates duplicates.
Select table_rainbow.color From table_rainbow Union Select table_flag.color From table_flag
- red
- orange
- yellow
- green
- blue
- purple
- white