More SQL Please: UNION and Set Operators

pop out tables

products

| id | name | category | price | remaining_stock | vendor_id | |----|--------------------------|------------|-------|-----------------|-----------| | 1 | Cool Splash Cologne | Toiletries | 49.99 | 25 | 2 | | 2 | More Spice Please Chips | Food | 3.99 | 60 | 1 | | 3 | Night Shift Coffee | Food | 6.99 | 1 | 1 | | 4 | Crunchy Munchy Trail Mix | Food | 8.99 | 5 | 3 | | 5 | Super Glide Floss | Toiletries | 1.99 | 19 | 4 | | 6 | Big Bark Dog Food | Pet Care | 30 | 2 | 2 | | 7 | Soft Purr Cat Food | Pet Care | 30 | 2 | 2 | | 8 | Tiny Fresh Scent Tree | Automotive | 2 | 7 | 4 |

orders

| id | product_id | quantity | date_ordered | date_arrived | |----|------------|----------|---------------------|---------------------| | 1 | 4 | 40 | 2024-06-01 07:58:05 | 2024-06-03 09:04:05 | | 2 | 1 | 20 | 2024-06-14 19:30:10 | 2024-06-26 14:20:10 | | 3 | 5 | 50 | 2024-07-03 01:00:07 | 2024-07-09 12:50:12 | | 4 | 2 | 60 | 2024-07-24 06:06:03 | 2024-07-30 15:05:00 | | 5 | 3 | 60 | 2024-08-15 16:40:07 | 2024-08-19 07:05:00 | | 6 | 6 | 10 | 2024-09-01 08:01:01 | NULL |

archived_orders_2023

| id | product_id | quantity | date_ordered | date_arrived | |----|------------|----------|---------------------|---------------------| | 7 | 1 | 20 | 2023-12-15 10:30:00 | 2023-12-20 14:15:00 | | 8 | 2 | 25 | 2023-11-28 08:45:22 | 2023-12-02 11:20:15 | | 9 | 4 | 40 | 2023-08-22 11:05:15 | 2023-08-29 15:30:00 | | 10 | 8 | 100 | 2023-07-14 09:20:00 | 2023-07-18 10:15:30 | | 11 | 6 | 5 | 2023-06-30 14:55:25 | 2023-07-05 16:40:10 | | 12 | 7 | 10 | 2023-05-12 07:30:00 | 2023-05-16 13:25:45 |

The UNION operator allows you to combine two sets of result rows into one.

A UNION is different than a JOIN. Think of a JOIN as a way to combine columns "horizontally" from different tables, whereas a UNION is a way to stack rows "vertically" on top of one another.

In order for a UNION to succeed, the queries before and after it must have the same number of columns. Most databases require that the data types of the results match up as well. SQLite's dynamic typing feature allows you to mix data types within the same column. See Data Types for more information.

The column names from the first query are used in the resulting set.

This example shows one use case for UNION. Sometimes you will encounter separate tables that have the same structure. In this scenario, orders from the previous year are moved into another table as an archived copy.

This technique can improve database performance with large datasets by limiting the amount of rows the database needs to check for queries against this year's orders.

SELECT product_id, quantity
FROM orders
UNION
SELECT product_id, quantity
FROM archived_orders_2023;
| product_id | quantity | |------------|----------| | 1 | 20 | | 2 | 25 | | 2 | 60 | | 3 | 60 | | 4 | 40 | | 5 | 50 | | 6 | 5 | | 6 | 10 | | 7 | 10 | | 8 | 100 |

UNION ALL is a modifier that preserves duplicates in the resulting set.

UNION ignores duplicate rows by default.

Both tables contain some orders with the same product_id and quantity, so they are treated as duplicates by UNION.

As an aside, notice how the previous example had results ordered by product_id and this example has no meaningful order. This shows us how a database engine will typically return rows in whatever order it decides is most efficient to compute unless we supply an ORDER BY clause.

SELECT product_id, quantity
FROM orders
UNION ALL
SELECT product_id, quantity
FROM archived_orders_2023;
| product_id | quantity | |------------|----------| | 4 | 40 | | 1 | 20 | | 5 | 50 | | 2 | 60 | | 3 | 60 | | 6 | 10 | | 1 | 20 | | 2 | 25 | | 4 | 40 | | 8 | 100 | | 6 | 5 | | 7 | 10 |

UNION can be chained multiple times to keep appending rows to the result from multiple queries.

This example also shows how you can SELECT constants or computed columns and use them in a UNION. This can be a useful technique to include ad-hoc data that doesn't exist in a table.

SELECT product_id, quantity
FROM orders
UNION ALL
SELECT product_id, quantity
FROM archived_orders_2023
UNION ALL
SELECT 10, 100;
| product_id | quantity | |------------|----------| | 4 | 40 | | 1 | 20 | | 5 | 50 | | 2 | 60 | | 3 | 60 | | 6 | 10 | | 1 | 20 | | 2 | 25 | | 4 | 40 | | 8 | 100 | | 6 | 5 | | 7 | 10 | | 10 | 100 |

Ad-hoc data supplied to a UNION can be a useful technique to create separate "utility" tables as well.

Here we construct a table called shelf_life which maps a product category to a number of days of shelf life. Note again that we only need to alias the columns of the first input to the UNION because those column names will be used for all the rows in the result.

Note that this example also demonstrates a Common Table Expression to construct the shelf_life table for later use in the query.

WITH shelf_life AS (
    SELECT 'Food' AS category, 21 AS shelf_life_days
    UNION
    SELECT 'Toiletries', 365*2
    UNION
    SELECT 'Pet Care', 365
    UNION
    SELECT 'Automotive', 365*5
)
SELECT p.name, p.category, sl.shelf_life_days
FROM products p
INNER JOIN shelf_life sl
ON p.category = sl.category;
| name | category | shelf_life_days | |--------------------------|------------|-----------------| | Tiny Fresh Scent Tree | Automotive | 1825 | | More Spice Please Chips | Food | 21 | | Night Shift Coffee | Food | 21 | | Crunchy Munchy Trail Mix | Food | 21 | | Big Bark Dog Food | Pet Care | 365 | | Soft Purr Cat Food | Pet Care | 365 | | Cool Splash Cologne | Toiletries | 730 | | Super Glide Floss | Toiletries | 730 |
The result of a UNION is treated the same as any other SELECT statement, so it can be used with aggregate clauses like GROUP BY. This example shows how we can calculate the total purchased quantity of items by year.
SELECT 2024 AS year, sum(quantity) AS purchase_volume
FROM orders
UNION ALL
SELECT 2023 AS year, sum(quantity)
FROM archived_orders_2023
GROUP BY year;
| year | purchase_volume | |------|-----------------| | 2024 | 240 | | 2023 | 200 |

INTERSECT is an operator available in some databases that is used to return only the rows that are shared by both input sets.

Think of a UNION as being inclusive, or as growing the result set, while an INTERSECT is exclusive, or shrinking the result set. In terms of a Venn Diagram , INTERSECT is equivalent to the area where the two circles overlap.

In this example we show how INTERSECT helps us find out which orders we made in 2024 are identical to orders we made in 2023.

SELECT product_id, quantity
FROM orders
INTERSECT
SELECT product_id, quantity
FROM archived_orders_2023;
| product_id | quantity | |------------|----------| | 1 | 20 | | 4 | 40 |
For reference, INTERSECT is functionally equivalent to a JOIN with a DISTINCT clause to eliminate duplicates.
SELECT DISTINCT o.product_id, o.quantity
FROM orders o
INNER JOIN archived_orders_2023 ao
ON o.product_id = ao.product_id
AND o.quantity = ao.quantity;
| product_id | quantity | |------------|----------| | 4 | 40 | | 1 | 20 |

The EXCEPT operator returns rows from the first set that do not exist in the second set. Think of it as a way to subtract the second set from the first set. Some databases call this operator MINUS.

This example shows all the orders from 2024 that did not have identical orders placed in 2023.

SELECT product_id, quantity
FROM orders
EXCEPT
SELECT product_id, quantity
FROM archived_orders_2023;
| product_id | quantity | |------------|----------| | 2 | 60 | | 3 | 60 | | 5 | 50 | | 6 | 10 |
previous: GROUP BY home next: Subqueries