🟢 How to Build Customer Segments From Scratch With CASE WHEN + GROUP BY


Hello Reader,

Imagine you're a business analyst at Summit Adventures (the fake adventure tourism company I created to teach real-world business analytics concepts).

Your CEO asks a common question: "Who are our best customers?"

If you pull a list sorted by total spend, you'll get a spreadsheet with hundreds of rows. Not useful because no one acts on a list of 500 customers. And certainly not the CEO.

What they actually need are segments: 3-5 groups with clear labels, clear revenue impact, and clear recommended actions for each. "Our 19 VIP customers generate 10.3% of total revenue" is something a team can work with.

This is where CASE WHEN and GROUP BY combine.

  • CASE WHEN creates the categories.
  • GROUP BY summarizes them.

How to help your team (in the real world)

Gabby from marketing has budget for one targeted campaign next quarter. She needs to know: which customer group offers the best return? She can't market to everyone, so she needs clear segments with numbers.

Step 1: Define Your Segments With CASE WHEN

The first decision is the hardest: what thresholds define each segment?

There's no universal rule, but here's a practical approach. Start with the data:

-- Quick distribution of customer spending
SELECT 
    MIN(total_spent) AS min_spent,
    ROUND(AVG(total_spent)::numeric, 2) AS avg_spent,
    PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY total_spent) AS median_spent,
    PERCENTILE_CONT(0.9) WITHIN GROUP (ORDER BY total_spent) AS p90_spent,
    MAX(total_spent) AS max_spent
FROM (
    SELECT 
        c.customer_id,
        COALESCE(SUM(p.amount), 0) AS total_spent
    FROM customers c
        LEFT JOIN bookings b ON c.customer_id = b.customer_id
            AND b.status IN ('completed', 'confirmed')
        LEFT JOIN payments p ON b.booking_id = p.booking_id
            AND p.payment_status = 'completed'
    GROUP BY c.customer_id
) customer_totals
WHERE total_spent > 0;

This tells you the range and distribution.

Next, you choose breakpoints that create more meaningful groups that will drive actual action.

If the max spend is $19k, let's mark any customers above $10k as VIP and then work down from there.

(You can spend a LOT of time building and optimizing segments properly by a bunch of different criteria)

Step 2: Build and Summarize Segments

-- Customer value segmentation
SELECT 
    CASE 
        WHEN total_spent >= 10000 THEN 'VIP (10K+)'
        WHEN total_spent >= 5000 THEN 'High Value (5K-10K)'
        WHEN total_spent >= 2000 THEN 'Regular (2K-5K)'
        WHEN total_spent > 0 THEN 'Light (Under 2K)'
        ELSE 'No Purchases'
    END AS customer_segment,
    COUNT(*) AS customer_count,
    ROUND(AVG(total_spent)::numeric, 2) AS avg_segment_spend,
    ROUND(SUM(total_spent)::numeric, 2) AS segment_revenue,
    ROUND(
        SUM(total_spent)::numeric / (
            SELECT SUM(amount) FROM payments 
            WHERE payment_status = 'completed'
        ) * 100, 1
    ) AS pct_of_revenue
FROM (
    SELECT 
        c.customer_id,
        COALESCE(SUM(p.amount), 0) AS total_spent
    FROM customers c
        LEFT JOIN bookings b ON c.customer_id = b.customer_id
            AND b.status IN ('completed', 'confirmed')
        LEFT JOIN payments p ON b.booking_id = p.booking_id
            AND p.payment_status = 'completed'
    GROUP BY c.customer_id
) customer_totals
GROUP BY 
    CASE 
        WHEN total_spent >= 10000 THEN 'VIP (10K+)'
        WHEN total_spent >= 5000 THEN 'High Value (5K-10K)'
        WHEN total_spent >= 2000 THEN 'Regular (2K-5K)'
        WHEN total_spent > 0 THEN 'Light (Under 2K)'
        ELSE 'No Purchases'
    END
ORDER BY avg_segment_spend DESC;

Now we're talking!

Let's read the story these numbers tell:

  • 19 VIP customers (1.9% of the customer base) drive a whopping 10.3% of revenue. The most valuable relationships.
  • 98 High Value customers drive nearly 30% of all revenue. This is your core business.
  • 479 customers have never purchased. That's 48% of the customer base sitting at $0. Even converting 5% of this group into Light customers would add roughly $25K in revenue. Big opportunity!

How CASE WHEN + GROUP BY Work Together

The key insight is understanding the division of labor:

CASE WHEN creates the labels. It looks at each row and assigns it to a category:

CASE 
    WHEN total_spent >= 10000 THEN 'VIP (10K+)'
    WHEN total_spent >= 5000 THEN 'High Value (5K-10K)'
    ...
END

GROUP BY collapses those categories into summary rows. It takes 1,000 individual customer records and produces 5 segment totals.

The aggregation functions (COUNT, AVG, SUM) calculate metrics within each group.

Without CASE WHEN, you'd have to create temporary tables or manual filters for each segment. Without GROUP BY, you'd have labeled rows but no summaries. Together, they turn raw data into a strategy document.

Building Better Segments: Three Variations

Variation 1: Multi-dimensional segmentation

Instead of just spending, combine value with recency:

CASE 
    WHEN total_spent >= 5000 AND last_booking >= CURRENT_DATE - INTERVAL '6 months' 
        THEN 'High Value - Active'
    WHEN total_spent >= 5000 AND last_booking < CURRENT_DATE - INTERVAL '6 months' 
        THEN 'High Value - At Risk'
    WHEN total_spent > 0 AND last_booking >= CURRENT_DATE - INTERVAL '6 months' 
        THEN 'Active - Growth Potential'
    WHEN total_spent > 0 
        THEN 'Dormant'
    ELSE 'Never Purchased'
END AS customer_segment

Now "High Value - At Risk" is a retention emergency. Worth more investigation, for sure.

Variation 2: Expedition engagement segments

CASE 
    WHEN expedition_count >= 5 THEN 'Power User (5+ trips)'
    WHEN expedition_count >= 3 THEN 'Regular (3-4 trips)'
    WHEN expedition_count >= 1 THEN 'Tried It (1-2 trips)'
    ELSE 'Never Booked'
END AS engagement_tier

This answers: "How deeply engaged are our customers?"

Variation 3: Ticket size segments

CASE 
    WHEN avg_booking_value >= 3000 THEN 'Premium Buyer'
    WHEN avg_booking_value >= 1500 THEN 'Standard Buyer'
    ELSE 'Budget Buyer'
END AS buyer_type

This tells marketing which pricing tier each customer gravitates toward.

Answering Gabby's Question

Back to the original ask: where should Gabby focus her campaign budget?

  • Option A: Convert "No Purchases" into Light customers. Targeting 479 dormant sign-ups with a first-time booking offer. Even 10% conversion at $1,000 avg = $47,900.
  • Option B: Upgrade "Light" to "Regular." 225 customers averaging $1,057. Moving 20% to the $3,322 average = additional $102K.
  • Option C: Retain "High Value" customers. 98 customers averaging $6,881. Preventing 10% from churning preserves $67K.

The data says Option B (upgrading Light to Regular) has the highest potential return. These are customers who already trust you enough to buy. Getting a second or third booking from them is easier than acquiring new customers or preventing churn.

Common Mistakes to Avoid

Mistake 1: Too many segments

If you have 10 segments, you've created a spreadsheet, not a strategy. Aim for 3-5 groups with clear action items for each.

Mistake 2: Using the same CASE WHEN in SELECT but not GROUP BY

If your CASE WHEN appears in SELECT, the same expression must appear in GROUP BY. Otherwise you'll get an error or unexpected results.

Mistake 3: Overlapping conditions

CASE WHEN evaluates top-to-bottom and stops at the first match. Put your most specific conditions first (highest thresholds), then work down. Overlapping conditions create misclassified rows.

Try This at Your Job

Pick your most important table (customers, orders, users) and build one segmentation:

  1. Choose a metric to segment by (revenue, activity, engagement)
  2. Define 3-5 meaningful breakpoints
  3. Use CASE WHEN + GROUP BY to get counts and totals for each
  4. Calculate the percentage of revenue each segment represents
  5. Write one recommended action per segment

This is one of the most frequently requested analyses in any business. Once you build it, you'll use it constantly.

Until next time,

Brian

Brian Graves, creator of Analytics in Action

Say 👋 on X/Twitter, LinkedIn, or book a call with me. You can always reply to these emails. I check them all.

P.S. Customer segmentation is a core skill in Module 2 (The Marie Kondo Blueprint) and Module 6 (The So What Framework) of SQL for Business Impact. The course teaches you to build segments that drive decisions, not just organize data. Check it out at sqlforbusinessimpact.com.

P.P.S. What's the most useful customer segment you've built at work? Hit reply. I'd love to hear how you're using this pattern. I read every response.

Starting With Data

Learn to build analytics projects with SQL, Tableau, Excel, and Python. For data analysts looking to level up their career and complete beginners looking to get started. No fluff. No theory. Just step-by-step tutorials anyone can follow.

Read more from Starting With Data

Hey Reader -- I'm trying something new this week! Can you take a minute to fill out this quick survey? Thanks in advance! Okay, on to this week's topic: A common question I get from readers is some version of: "This Summit Adventures stuff is great, but I work in healthcare (or SaaS, or retail, or finance). How does this apply to me?" The honest answer: every SQL pattern you've learned in this newsletter translates directly to your industry. The table names change. So do column names. The...

Hello Reader, Here's a scenario that happens more often than anyone admits: You build a report. Your director uses it in a board presentation. A week later, someone notices the customer count includes 58 records with obviously fake email addresses like evan.clarke743@noemail. The report wasn't wrong. The data underneath it was dirty. And now your credibility takes a hit. Bummer. Not because of your SQL skills, but because you didn't check the final product before sending it off. Data quality...

Hey Reader, Last month, we covered LEFT JOIN to find customers who never booked. That's one type of absence — people who never showed up. But there's another kind that's even more expensive: people who showed up, committed, and then left. The dreaded cancellation. Customers who said "yes", then said "actually, never mind". In the Summit Adventures database (the fake adventure tourism company I created to help people learn business analytics), nearly 47% of all bookings end in cancellation....