More SQL Please: Common Table Expressions

pop out tables

orders

| id | product_id | quantity | date_ordered | date_arrived | |----|------------|----------|---------------------|---------------------| | 1 | 4 | 40 | 2024-06-01 07:58:05 | 2024-06-03 09:04:05 | | 2 | 1 | 20 | 2024-06-14 19:30:10 | 2024-06-26 14:20:10 | | 3 | 4 | 40 | 2024-07-01 09:01:05 | 2024-07-13 09:01:05 | | 4 | 5 | 50 | 2024-07-03 01:00:07 | 2024-07-09 12:50:12 | | 5 | 2 | 60 | 2024-07-24 06:06:03 | 2024-07-30 15:05:00 | | 6 | 3 | 60 | 2024-08-15 16:40:07 | 2024-08-19 07:05:00 | | 7 | 6 | 10 | 2024-09-01 08:01:01 | NULL | | 8 | 7 | 10 | 2024-09-01 08:05:09 | NULL |

products

| id | name | category | price | remaining_stock | vendor_id | |----|--------------------------|------------|-------|-----------------|-----------| | 1 | Cool Splash Cologne | Toiletries | 49.99 | 25 | 2 | | 2 | More Spice Please Chips | Food | 3.99 | 60 | 1 | | 3 | Night Shift Coffee | Food | 6.99 | 1 | 1 | | 4 | Crunchy Munchy Trail Mix | Food | 8.99 | 5 | 3 | | 5 | Super Glide Floss | Toiletries | 1.99 | 19 | 4 | | 6 | Big Bark Dog Food | Pet Care | 30 | 2 | 2 | | 7 | Soft Purr Cat Food | Pet Care | 30 | 2 | 2 | | 8 | Tiny Fresh Scent Tree | Automotive | 2 | 7 | 4 |

vendors

| id | name | |----|-------------------------| | 1 | Quicky Express Food Co | | 2 | Good Times Distributors | | 3 | Natural Goodness | | 4 | Hyper Logistics | | 5 | Vend-tertainment |

Sometimes it can be useful to construct a particular set of data at the beginning of your SQL statement. Common Table Expressions, or CTEs, allow you to define a temporary table using a SELECT statement that lives only for the duration of your query.

CTEs have similar use cases as subqueries. For instance, they can make queries more readable by separating pre-computations from your main query. They can also help with visualizing the "chunks" piece by piece to better understand the intent behind the query. You can also use a CTE instead of a subquery when you need to refer to the data more than once later on in the query.

This example shows the use of a CTE to compute average order size then compare our current stock levels to the average order size. This might be a more reliable way to determine which products need to be restocked than looking at the remaining stock alone.

WITH avg_order_size AS (
    SELECT o.product_id AS product_id, avg(o.quantity) AS size
    FROM orders o
    GROUP BY o.product_id
)
SELECT 
    name, 
    avg_order_size.size AS avg_order_size, 
    remaining_stock
FROM products p
INNER JOIN avg_order_size
ON p.id = avg_order_size.product_id
WHERE remaining_stock < avg_order_size.size;

You can define as many CTEs as you want in your statement. CTEs can also use values from other CTEs that were defined before themselves. You can also join CTEs together just like a regular table.

Suppose we want to find out which products we are overstocked in and how much extra cash we could have if we managed our orders better.

In this example our first CTE finds the average total retail value of orders for each product. The second CTE finds the total retail value of all the products we have in stock. Then our statement joins these CTEs together and finds the difference between the totals to find out how much retail value is sitting in our overstocked products.

Even though this example is large, CTEs allow us to break up the problem into smaller pieces. Using descriptive aliases can help readers of this query better understand the intent of each CTE and ultimately the intent of the entire query.

If you need to create several complex CTEs for your query consider creating Temporary Tables instead. This technique to setup some transformed data first in temporary tables can be referred to as "staging".

WITH 
order_retail_value AS (
    SELECT 
        o.id AS order_id, 
        p.id AS product_id, 
        avg(o.quantity * p.price) AS avg_total
    FROM orders o
    INNER JOIN products p
    ON o.product_id = p.id
    GROUP BY p.id
),
stock_retail_value AS (
    SELECT 
        p.id AS product_id, 
        p.name AS product, 
        p.remaining_stock * p.price AS total
    FROM products p
)
SELECT 
    s.product,
    round(s.total, 2) AS stock_value,
    round(o.avg_total, 2) AS avg_order_value,
    round(s.total - o.avg_total, 2) AS overstock_value
FROM stock_retail_value s
INNER JOIN order_retail_value o
ON s.product_id = o.product_id
WHERE overstock_value > 0
ORDER BY overstock_value DESC;

CTEs can be useful if you need to pre-compute data once and then reference it multiple times later on.

In this example we pre-calculate the average cost of products across each category. Then we use that value twice; once to display it and once to calculate the difference between the average and the price of an individual product.

You could rewrite this query by using a subquery directly in the JOIN, but defining the CTE at the beginning tends to make your code more organized and readable. This is especially true if you use multiple CTEs in a single query.

WITH category_avg AS (
    SELECT category, avg(price) AS avg
    FROM products
    GROUP BY category
)
SELECT 
    p.name,
    p.price,
    round(category_avg.avg, 2) AS category_avg,
    round(p.price - category_avg.avg, 2) AS diff_from_avg
FROM products p
INNER JOIN category_avg
ON p.category = category_avg.category
ORDER BY abs(diff_from_avg) DESC;
previous: Subqueries home next: CASE