The UNION
operator allows you to combine two sets of result rows into one.
A UNION
is different than a JOIN
. Think of a JOIN
as a way to combine columns "horizontally" from different tables, whereas a UNION
is a way to stack rows "vertically" on top of one another.
In order for a UNION
to succeed, the queries before and after it must have the same number of columns. Most databases require that the data types of the results match up as well. SQLite's dynamic typing feature allows you to mix data types within the same column. See Data Types for more information.
The column names from the first query are used in the resulting set.
This example shows one use case for UNION
. Sometimes you will encounter separate tables that have the same structure. In this scenario, orders from the previous year are moved into another table as an archived copy.
This technique can improve database performance with large datasets by limiting the amount of rows the database needs to check for queries against this year's orders.
SELECT product_id, quantity
FROM orders
UNION
SELECT product_id, quantity
FROM archived_orders_2023;
UNION ALL
is a modifier that preserves duplicates in the resulting set.
UNION
ignores duplicate rows by default.
Both tables contain some orders with the same product_id
and quantity
, so they are treated as duplicates by UNION
.
As an aside, notice how the previous example had results ordered by product_id
and this example has no meaningful order. This shows us how a database engine will typically return rows in whatever order it decides is most efficient to compute unless we supply an ORDER BY
clause.
SELECT product_id, quantity
FROM orders
UNION ALL
SELECT product_id, quantity
FROM archived_orders_2023;
UNION
can be chained multiple times to keep appending rows to the result from multiple queries.
This example also shows how you can SELECT
constants or computed columns and use them in a UNION
. This can be a useful technique to include ad-hoc data that doesn't exist in a table.
SELECT product_id, quantity
FROM orders
UNION ALL
SELECT product_id, quantity
FROM archived_orders_2023
UNION ALL
SELECT 10, 100;
Ad-hoc data supplied to a UNION
can be a useful technique to create separate "utility" tables as well.
Here we construct a table called shelf_life
which maps a product category to a number of days of shelf life. Note again that we only need to alias the columns of the first input to the UNION
because those column names will be used for all the rows in the result.
Note that this example also demonstrates a Common Table Expression to construct the shelf_life
table for later use in the query.
WITH shelf_life AS (
SELECT 'Food' AS category, 21 AS shelf_life_days
UNION
SELECT 'Toiletries', 365*2
UNION
SELECT 'Pet Care', 365
UNION
SELECT 'Automotive', 365*5
)
SELECT p.name, p.category, sl.shelf_life_days
FROM products p
INNER JOIN shelf_life sl
ON p.category = sl.category;
UNION
is treated the same as any other SELECT
statement, so it can be used with aggregate clauses like GROUP BY
. This example shows how we can calculate the total purchased quantity of items by year.SELECT 2024 AS year, sum(quantity) AS purchase_volume
FROM orders
UNION ALL
SELECT 2023 AS year, sum(quantity)
FROM archived_orders_2023
GROUP BY year;
INTERSECT
is an operator available in some databases that is used to return only the rows that are shared by both input sets.
Think of a UNION
as being inclusive, or as growing the result set, while an INTERSECT
is exclusive, or shrinking the result set. In terms of a Venn Diagram, INTERSECT
is equivalent to the area where the two circles overlap.
In this example we show how INTERSECT
helps us find out which orders we made in 2024 are identical to orders we made in 2023.
SELECT product_id, quantity
FROM orders
INTERSECT
SELECT product_id, quantity
FROM archived_orders_2023;
INTERSECT
is functionally equivalent to a JOIN
with a DISTINCT
clause to eliminate duplicates.SELECT DISTINCT o.product_id, o.quantity
FROM orders o
INNER JOIN archived_orders_2023 ao
ON o.product_id = ao.product_id
AND o.quantity = ao.quantity;
The EXCEPT
operator returns rows from the first set that do not exist in the second set. Think of it as a way to subtract the second set from the first set. Some databases call this operator MINUS
.
This example shows all the orders from 2024 that did not have identical orders placed in 2023.
SELECT product_id, quantity
FROM orders
EXCEPT
SELECT product_id, quantity
FROM archived_orders_2023;