🟢 How to Find Why Customers Cancel (Using SQL Pattern Analysis)


HeyReader,

Most analysts never think to ask: "Why are our customers cancelling?"

Everyone looks at bookings. Everyone reports revenue. But the data hiding in your cancelled orders often tells a more important story than your completed ones.

And analysts that want to move from beginner to intermediate SQL skill level have an unfair advantage here. Because there's a sequence to work through this kind of problem. That's what we're covering here.

At Summit Adventures (the fake adventure tourism company I created to help people learn business analytics), nearly 46% of bookings end up cancelled. That's a staggering number! And exactly the kind of pattern that, once you understand it, leads to real business improvements.

Let me show you how to investigate cancellation patterns step by step.

Step 1: Understand the Scale

Before diving into "why," start with "how much":

-- Cancellation rate overview
SELECT 
    status,
    COUNT(*) AS booking_count,
    ROUND(COUNT(*) *100.0/SUM(COUNT(*)) OVER (), 1) AS percentage
FROM bookings
GROUP BY status
ORDER BY booking_count DESC;

For Summit Adventures, the breakdown looks roughly like:

  • Completed: ~816 bookings (51%)
  • Cancelled: ~750 bookings (47%)
  • In Progress: ~34 bookings (2.1%)

A 47% cancellation rate is a significant business problem. But that number alone doesn't tell you what to do about it. You need to dig deeper.

Step 2: When Do Cancellations Happen?

Timing reveals a lot. Do customers cancel right away, or weeks later?

-- How quickly after booking do customers cancel?
SELECT 
    CASE 
        WHEN cancellation_date::date - booking_date::date <= 1 THEN 'Same day'
        WHEN cancellation_date::date - booking_date::date <= 7 THEN 'Within a week'
        WHEN cancellation_date::date - booking_date::date <= 30 THEN 'Within a month'
        ELSE 'More than a month later'
    END AS cancellation_timing,
    COUNT(*) AS cancellations,
    ROUND(AVG(total_amount), 2) AS avg_booking_value
FROM bookings
WHERE status = 'cancelled'
GROUP BY 
    CASE 
        WHEN cancellation_date::date - booking_date::date <= 1 THEN 'Same day'
        WHEN cancellation_date::date - booking_date::date <= 7 THEN 'Within a week'
        WHEN cancellation_date::date - booking_date::date <= 30 THEN 'Within a month'
        ELSE 'More than a month later'
    END
ORDER BY cancellations DESC;

If most cancellations happen within 24 hours, your checkout process might be creating buyer's remorse.

If they happen weeks later, customers might be finding better alternatives or facing scheduling conflicts.

Step 3: Which Expedition Types Get Cancelled Most?

-- Cancellation rates by expedition type
SELECT 
    e.expedition_type,
    COUNT(*) AS total_bookings,
    COUNT(*) FILTER (WHERE b.status = 'cancelled') AS cancellations,
    ROUND(
        COUNT(*) FILTER (WHERE b.status = 'cancelled') * 100.0 / COUNT(*), 
        1
    ) AS cancellation_rate
FROM bookings b
    INNER JOIN expedition_instances ei ON b.instance_id = ei.instance_id
    INNER JOIN expeditions e ON ei.expedition_id = e.expedition_id
GROUP BY e.expedition_type
ORDER BY cancellation_rate DESC;

This tells you whether specific types of expeditions have higher cancellation rates.

If climbing trips cancel at 55% but cultural tours cancel at 35%, that's actionable information. Maybe climbing trips need better expectation-setting, or maybe the difficulty descriptions need adjustment.

Step 4: Interpret the Cancellation Reasons

This is where qualitative data meets quantitative analysis:

-- Most common cancellation reasons
SELECT 
    cancellation_reason,
    COUNT(*) AS occurrences,
    ROUND(AVG(total_amount), 2) AS avg_lost_value
FROM bookings
WHERE status = 'cancelled'
    AND cancellation_reason IS NOT NULL
GROUP BY cancellation_reason
ORDER BY occurrences DESC
LIMIT 10;

Summit Adventures tracks reasons like "scheduling conflict," "changed plans," "found alternative," and "financial reasons." Each category suggests a different business response:

  • Scheduling conflict → Offer more flexible rebooking options
  • Financial reasons → Consider payment plans or early-bird pricing
  • Found alternative → Competitive analysis needed
  • Changed plans → This is normal attrition; focus elsewhere

Step 5: What's the Revenue Impact?

The final question: how much money is this costing?

-- Revenue lost to cancellations
SELECT 
    e.expedition_type,
    COUNT(*) FILTER (WHERE b.status = 'cancelled') AS cancelled_bookings,
    SUM(CASE WHEN b.status = 'cancelled' THEN b.total_amount ELSE 0 END) AS lost_revenue,
    SUM(CASE WHEN b.status IN ('completed', 'confirmed') 
        THEN b.total_amount ELSE 0 END) AS kept_revenue,
    ROUND(
        SUM(CASE WHEN b.status = 'cancelled' THEN b.total_amount ELSE 0 END) * 100.0 / 
        NULLIF(SUM(b.total_amount), 0), 
        1
    ) AS revenue_lost_pct
FROM bookings b
    INNER JOIN expedition_instances ei ON b.instance_id = ei.instance_id
    INNER JOIN expeditions e ON ei.expedition_id = e.expedition_id
GROUP BY e.expedition_type
ORDER BY lost_revenue DESC;

Now you can tell your leadership: "We're leaving hundreds of thousands of dollars on the table due to cancellations. Here's where the biggest losses are, and here are three things we can do about it."

Much different conversation than just saying "our cancellation rate is 46%" and leaving it at that.

Breaking Down the Sequence

This investigation followed a natural analytical progression:

  1. How much? (scale of the problem)
  2. When? (timing patterns)
  3. Where? (which products/categories)
  4. Why? (stated reasons)
  5. What's the impact? (revenue at stake)

This sequence works for any business problem, not just cancellations. Customer churn, support tickets, employee turnover, product returns.

Same investigative pattern.

Try This At Your Job

Replace "cancellations" with whatever your company tracks that represents lost opportunity:

  • E-commerce: Cart abandonment or returns
  • SaaS: Subscription cancellations or downgrades
  • Retail: Refund requests
  • HR: Employee resignations

The queries adapt but the investigative pattern stays the same.

Start with scale, then timing, then categories, then reasons, then impact.

Until next time,

Brian

Brian Graves, creator of Analytics in Action

Say 👋 on X/Twitter, LinkedIn, or book a call with me. You can always reply to these emails. I check them all.


P.S. This investigative approach (starting with a broad question and narrowing systematically) is at the heart of the Gordon Ramsay Blueprint in Module 8 of SQL for Business Impact. It's about diagnosing problems under pressure. Check it out at sqlforbusinessimpact.com.

P.P.S. What's the equivalent of "cancellations" at your company? What data do you wish you were analyzing but haven't gotten to yet? Hit reply! I'd love to hear what you're working on.

Starting With Data

Learn to build analytics projects with SQL, Tableau, Excel, and Python. For data analysts looking to level up their career and complete beginners looking to get started. No fluff. No theory. Just step-by-step tutorials anyone can follow.

Read more from Starting With Data

Hello Reader, This week's newsletter is a bit different. There's SQL in here, but the real topic is career strategy. Because one of the most common questions I get from readers is some version of: "I've been learning SQL for a few months. How do I prove I can actually do this job?" The honest answer: a portfolio of 2-3 projects that demonstrate business thinking, not just technical syntax. The problem is most portfolios I've reviewed look the same. A Kaggle competition. A tutorial from...

Hello Reader, A common analytics question that sounds simple but isn't: "Which customers have booked the same type of expedition more than once?" You can't answer this with a regular JOIN between two different tables. The information lives in one table. You need to compare rows within that table to find patterns. That's what a self-JOIN does. It joins a table to itself. It sounds unusual, but once you see the pattern, you'll recognize situations where it's exactly what you need. The Business...

Hello Reader, You've built a clean customer report. Locations, booking history, revenue by region. You send it off to whoever asked for it. Ten minutes later: "Why does this report show blank cells in the location column? And why do the totals not add up?" Welcome to the NULL problem. You're going to learn how to solve it once and for all today. In the Summit Adventures database (the fake adventure tourism company I created to help people learn business analytics), 78.2% of customers don't...