🟢 How to Build a Multi-Step Analysis With CTEs (Step by Step)


Hey Reader,

Most analysts can't answer something like this: "Which of our guides are generating the most revenue, and are we assigning them to the right expeditions?"

This isn't a one-step question. You need to:

  1. Calculate revenue per guide
  2. Look at which expedition types they're assigned to
  3. Compare guide revenue to expedition category performance
  4. Identify mismatches (top guides on underperforming products, or vice versa)

That's four analytical steps. Trying to write this as a single SQL query would be unreadable. CTEs (Common Table Expressions) let you build each step separately and combine them at the end.

What Are CTEs?

CTEs are temporary named result sets you define with the WITH keyword. Think of them as naming each step of your analysis before combining everything at the end.

WITH step_one AS (
    -- First calculation
),
step_two AS (
    -- Second calculation, can reference step_one
)
SELECT ...
FROM step_one
    JOIN step_two ON ...

Each step has a clear name. Each step does one thing. The final SELECT combines them into your answer.

The Full Analysis: Guide Performance

Here's the multi-step analysis for Summit Adventures (the fake adventure tourism company I created to help people learn business analytics):

-- Guide performance and expedition alignment analysis
-- Question: Which guides generate the most revenue, and are they 
-- assigned to the right expedition types?
-- Step 1: Calculate total revenue per guide
WITH guide_revenue AS (
    SELECT 
        g.guide_id,
        g.first_name || ' ' || g.last_name AS guide_name,
        g.specializations,
        g.years_experience,
        COUNT(DISTINCT ga.instance_id) AS trips_assigned,
        SUM(p.amount) AS total_revenue_generated
    FROM guides g
        INNER JOIN guide_assignments ga ON g.guide_id = ga.guide_id
        INNER JOIN expedition_instances ei ON ga.instance_id = ei.instance_id
        INNER JOIN bookings b ON ei.instance_id = b.instance_id
        INNER JOIN payments p ON b.booking_id = p.booking_id
    WHERE p.payment_status = 'completed'
        AND b.status IN ('completed', 'confirmed')
    GROUP BY g.guide_id, g.first_name, g.last_name, 
             g.specializations, g.years_experience
),
-- Step 2: Find each guide's primary expedition type (most assigned)
guide_primary_type AS (
    SELECT DISTINCT ON (g.guide_id)
        g.guide_id,
        e.expedition_type AS primary_type,
        COUNT(*) AS type_count
    FROM guides g
        INNER JOIN guide_assignments ga ON g.guide_id = ga.guide_id
        INNER JOIN expedition_instances ei ON ga.instance_id = ei.instance_id
        INNER JOIN expeditions e ON ei.expedition_id = e.expedition_id
    GROUP BY g.guide_id, e.expedition_type
    ORDER BY g.guide_id, COUNT(*) DESC
),
-- Step 3: Calculate average revenue by expedition type
type_benchmarks AS (
    SELECT 
        e.expedition_type,
        ROUND(AVG(p.amount), 2) AS avg_revenue_per_transaction,
        SUM(p.amount) AS total_type_revenue
    FROM expeditions e
        INNER JOIN expedition_instances ei ON e.expedition_id = ei.expedition_id
        INNER JOIN bookings b ON ei.instance_id = b.instance_id
        INNER JOIN payments p ON b.booking_id = p.booking_id
    WHERE p.payment_status = 'completed'
    GROUP BY e.expedition_type
)
-- Step 4: Combine everything into the final picture
SELECT 
    gr.guide_name,
    gr.specializations,
    gr.years_experience,
    gr.trips_assigned,
    gr.total_revenue_generated,
    gpt.primary_type AS most_assigned_type,
    tb.avg_revenue_per_transaction AS type_avg,
    ROUND(gr.total_revenue_generated / NULLIF(gr.trips_assigned, 0), 2) 
        AS guide_revenue_per_trip,
    CASE 
        WHEN gr.total_revenue_generated / NULLIF(gr.trips_assigned, 0) > 
             tb.avg_revenue_per_transaction * 1.2 
        THEN 'Above Average'
        WHEN gr.total_revenue_generated / NULLIF(gr.trips_assigned, 0) < 
             tb.avg_revenue_per_transaction * 0.8 
        THEN 'Below Average'
        ELSE 'On Track'
    END AS performance_label
FROM guide_revenue gr
    INNER JOIN guide_primary_type gpt ON gr.guide_id = gpt.guide_id
    INNER JOIN type_benchmarks tb ON gpt.primary_type = tb.expedition_type
ORDER BY gr.total_revenue_generated DESC
LIMIT 15;

One Query. Four Steps.

Look at what we just did:

  • Step 1 answers: "How much revenue does each guide generate?"
  • Step 2 answers: "What type of expedition does each guide mostly lead?"
  • Step 3 answers: "What's the benchmark for each expedition type?"
  • Step 4 combines everything: "How does each guide compare to their category?"

Each step is a self-contained calculation. If something looks wrong in the results, you can run each CTE independently to find the issue. Try doing that with a 50-line nested query.

The Power of Thinking with CTEs

CTEs aren't just a SQL feature. They're a thinking framework.

When you face a complex business question, resist the urge to write one massive query. Instead:

  1. Break the question into smaller questions
  2. Name each smaller question (that becomes your CTE name)
  3. Write each step independently
  4. Combine at the end

This maps directly to how experienced analysts think through problems. You're not just writing SQL — you're structuring your analytical reasoning.

Common CTE Patterns

Pattern 1: Calculate then filter

WITH calculations AS (
    SELECT customer_id, SUM(amount) AS total_spent
    FROM payments GROUP BY customer_id
)
SELECT * FROM calculations WHERE total_spent > 5000;

Pattern 2: Multiple dimensions

WITH volume AS (...),  -- How many?
     value AS (...),   -- How much?
     timing AS (...)   -- When?
SELECT ...
FROM volume JOIN value ON ... JOIN timing ON ...;

Pattern 3: Progressive enrichment

WITH raw_data AS (...),          -- Base records
     with_labels AS (...),        -- Add CASE WHEN categories
     with_benchmarks AS (...)     -- Add comparison metrics
SELECT * FROM with_benchmarks;

CTEs vs. Subqueries

Both accomplish similar things. Here's when to use each:

Use CTEs when:

  • You have 3+ logical steps
  • Multiple people will read the query
  • You need to debug step by step
  • The analysis tells a story (most business analysis)

Use subqueries when:

  • It's a quick, one-time calculation
  • You need a single comparison value
  • The logic is simple (1-2 steps)

For most business analysis, CTEs are the better choice. They make complex work readable, and readable work builds trust.

Try This At Your Job

Next time you face a multi-part business question, write down the sub-questions before opening your SQL editor:

  1. What are the individual pieces I need to calculate?
  2. What should each piece be named?
  3. How do the pieces connect?

Then turn each piece into a CTE. The query practically writes itself after that.

Until next time,

Brian (say hi on twitter!)

Brian Graves, creator of Analytics in Action

Say 👋 on X/Twitter, LinkedIn, or book a call with me. You can always reply to these emails. I check them all.


P.S. CTEs are used throughout SQL for Business Impact, but Module 8 (The Gordon Ramsay Blueprint) really showcases how to combine multiple analytical steps into a cohesive investigation. It's about systematic problem-solving under pressure. Check it out at sqlforbusinessimpact.com.

P.P.S. What's the most complex business question you've had to answer with SQL? Reply and share — I'd love to hear how you broke it down (or how you wish you had). I read every response.

Starting With Data

Learn to build analytics projects with SQL, Tableau, Excel, and Python. For data analysts looking to level up their career and complete beginners looking to get started. No fluff. No theory. Just step-by-step tutorials anyone can follow.

Read more from Starting With Data

Hey Reader, Quick question: Could a colleague open your most recent SQL query and understand what it does without asking you a single question? If the answer is "probably not," this newsletter is for you. Writing SQL that works is step one. Writing SQL that someone else can read, trust, and maintain — that's the skill that changes how people see your work. Why This Matters More Than You Think Here's what I've noticed over 15+ years in analytics: the analysts who get promoted aren't always the...

Hey Reader, Imagine you're presenting a finding to your team: "Customer A spent $4,200 with us." The first question from the room: "Is that a lot?" Most analysts learning SQL don't realize that, without context, the number means nothing. You need comparison. Is the average customer spend $500 or $5,000? Is $4,200 in the top 10% or the middle of the pack? This is the "compared to what?" problem, and subqueries solve it elegantly. Numbers without context don't drive decisions. Executives don't...

Hello Reader, Almost every business question boils down to one of two things: "How many?" (COUNT) "How much?" (SUM) How many customers booked this month? How much revenue did we generate? How many trips were cancelled? How much did we lose? Once you're comfortable with COUNT and SUM, you can answer the majority of questions that come your way (like sales analytics using SQL). Let me show you both patterns using a real business scenario. The Scenario Your operations manager asks: "Give me a...