In a Relational Database, tables are often related to each other in a way that allows their information to only be written once in the whole database. Then, when you need to analyze the data, you'll need to combine the info from those related tables.

To do this in SQL, you can use JOIN statements. The LEFT JOIN statement is one of the various JOIN statements available. When you use it to join two tables, it keeps all the rows of the first table (the left table), even if there is not a corresponding match on the second table.

You can use JOIN in a SELECT query to join two tables, table_1 and table_2, like this:

SELECT columns
FROM table_1
LEFT OUTER JOIN table_2
ON relation;
SELECT columns
FROM table_1
LEFT JOIN table_2
ON relation;

First you write which columns will be present in the joined table. You can specify to which table the column belongs by prefixing the table name to the column name. This is necessary if some columns have the same name (like table_1.column_1 and table_2.column_1) with SELECT <columns>.

Then you would write the name of the first table as FROM table_1.

After that you'd write the name of the second table as LEFT OUTER JOIN table_2 or LEFT JOIN table_2 (omitting the OUTER keyword).

And at the end you'd write the relation to use to match the rows, for example ON table_1.column_A = table_2.column_B. Often the relation is by id, but it can be with any column.

SQL LEFT JOIN Example

Let's say you have a book database in which you have two tables, one with books, the other with authors. To avoid repeating all the author info for each book, that info is in its own table, and the books have only the author_name column.

book_idtitleauthor_namepubl_year
1Uno, nessuno e centomilaLuigi Pirandello1926
2Il visconte dimezzatoItalo Calvino1952
3Le tigri di MompracemEmilio Salgari1900
4Il giorno della civettaLeonardo Sciascia1961
5A ciascuno il suoLeonardo Sciascia1966
6Il fu Mattia PascialLuigi Pirandello1904
7I MalavogliaGiovanni Verga1881
author_idnameyear_of_birthplace_of_birthtrivia
1Luigi Pirandello1867AgrigentoNobel Prize in Literature in 1934
2Giovanni Verga1840Vizziniwas Senator of the Kingdom of Italy from 1920 to 1922
3Italo Svevo1861Triestereal name was Aron Hector Schmitz
4Cesare Pavese1908Santo Stefano BelboNULL
5Giuseppe Tomasi di Lampedusa1896Palermowas prince of Lampedusa from 1934 to 1957

We can join these two tables based on the names of the authors. Using the books table as the left table, you can write the following code to join them:

SELECT books.title AS book_title, books.publ_year, books.author_name, authors.year_of_birth, authors.place_of_birth
   FROM books
   LEFT JOIN authors
   ON books.author_name = authors.name
;

Let's break it down.

In the first line, you choose which columns to show in the final table. It's also the place to decide if some columns will have a different name in the resulting table using AS like with books.title AS book_title.

The second line, FROM books, says which is the first table to consider, also called the left table.

Then the third line, LEFT JOIN authors, says which other table to consider.

ON books.author_name = authors.name says to match the tables using the rows books.author_name and authors.name.

After this query you would get the table as below, where the rows that didn't get info from the authors table just show NULL.

book_titlepubl_yearauthor_nameyear_of_birthplace_of_birth
Uno, nessuno e centomila1926Luigi Pirandello1867Agrigento
Il visconte dimezzato1952Italo CalvinoNULLNULL
Le tigri di Mompracem1900Emilio SalgariNULLNULL
Il giorno della civetta1961Leonardo SciasciaNULLNULL
A ciascuno il suo1966Leonardo SciasciaNULLNULL
Il fu Mattia Pascal1904Luigi Pirandello1867Agrigento
I Malavoglia1881Giovanni Verga1840Vizzini

Note that the authors not present in the books table are not in this joined table. This is because, as I said before, only the unrelated rows from the left table (in this case books) are kept, not those from the right/second table.

A more complex LEFT JOIN example

Let's see another way you can use JOIN together with other SQL features to do some data analysis.

You might want to see how many books from each author are present in the database. You could use the below query to do so:

SELECT authors.name AS author_name,
    SUM(
      CASE
        WHEN books.title LIKE '%'
          THEN 1
        ELSE 0
      END
    ) as number_of_books
  FROM authors
  LEFT JOIN books
  ON books.author_name = authors.name
  GROUP BY authors.name
  ORDER BY number_of_books DESC
;

Code breakdown

Line 1: with SELECT you list the columns you want in the resulting table.

Line 2: SUM is an aggregation function used in conjunction with GROUP BY. The values of the rows that are grouped together are then summed.

Line 3-7: you use the CASE statement to get different results depending on a condition. In this case, a row is counted as 1 if it contains a book title, otherwise it is counted as 0. And here we use LIKE to check if the cell contains any characters (learn more in this article about Contains String).

Line 8: this gives a name of number_of_books to the column that is created for the SUM.

Line 9: the left/first table in this case is authors.

Line 10: the right/second table in this case is books.

Line 11: this joins the two tables using the author names.

Line 12: the rows are grouped by author name - all the rows with the same value in that column will be represented by a single row.

Line 13: we use order by to arrange in descending order using the number of books.

The query will give you the below table. Note that you see here only the authors that are present in the authors table. The authors mentioned in the books table without an entry in the authors table are not present here. This is an effect of the fact that the unrelated rows from the books table were not kept.

author_namenumber_of_books
Luigi Pirandello2
Giovanni Verga1
Cesare Pavese0
Giuseppe Tomasi di Lampedusa0
Italo Svevo0

If the authors table is updated to include all the authors mentioned in the books table, like this:

author_idnameyear_of_birthplace_of_birthtrivia
1Luigi Pirandello1867AgrigentoNobel Prize in Literature in 1934
2Giovanni Verga1840Vizziniwas Senator of the Kingdom of Italy from 1920 to 1922
3Italo Svevo1861Triestereal name was Aron Hector Schmitz
4Cesare Pavese1908Santo Stefano BelboNULL
5Giuseppe Tomasi di Lampedusa1896Palermowas prince of Lampedusa from 1934 to 1957
6Italo Calvino1923Santiago de las VegasNULL
7Emilio Salgari1862VeronaNULL
8Leonardo Sciascia1921RacalmutoNULL

Then the table from the query above would actually give the number of books for all authors.

author_namenumber_of_books
Leonardo Sciascia2
Luigi Pirandello2
Emilio Salgari1
Giovanni Verga1
Giovanni Verga1
Cesare Pavese0
Giuseppe Tomasi di Lampedusa0
Italo Svevo0

Conclusion

In a Relational Database, data should be written only once, so we often end up with multiple tables related to each other. LEFT JOIN is a really useful ally when we need to analyse data and join information from different tables. Enjoy querying your database using this powerful tool.