🟢 Feeling Overwhelmed by Your Data? Use The Marie Kondo Blueprint


Hey Reader -- I'm trying something new this week! Can you take a minute to fill out this quick survey? Thanks in advance!

Okay, on to this week's topic:

Imagine you just got hired at as an analyst for a top-tier adventure tourism company (like Summit Adventures, the fake company I created to teach people like you how to solve real-world business problems with SQL)

You start your first day on the job by opening your company database and seeing 1,600 rows of customer bookings.

  • Varying prices.
  • Different dates.
  • Multiple status values.
  • Five expedition categories.
  • Promo codes scattered throughout.

(I've lost count of how many databases I've had to reverse engineer!)

Your brain says: "This is overwhelming. Where do I even start?"

This is the moment where most analysts freeze up or start dumping data straight into Excel.

I know this feeling well. When talking with other aspiring analysts, "feeling overwhelmed by messy data" is one of the top pain points they face. Many have completed endless tutorials but still freeze when opening a real database with dozens of tables and thousands of rows.

But confident analysts have a framework. They use what I call The Marie Kondo Blueprint - a systematic approach to organizing messy data into categories that actually spark business insights.

Let me show you how it works.

The Framework: Marie Kondo's Question Applied to Data

Marie Kondo is a Japanese organizing consultant, author, and TV presenter.

She asks: "Does this spark joy?"

When you're organizing data, you ask a different question: "Which categories matter for this business decision?"

Not which columns exist. Not what data you have. But which groupings will help you make a decision.

There are 3 simple steps in the Marie Kondo framework.

Step 1: Identify categories

Step 2: Choose metrics

Step 3: Filter the noise

Let's apply the steps to a real-world business problem.

The Business Question

Lindsey, the CEO of Summit Adventures, walked into Monday's meeting with this question:

"I need to understand our expedition performance by category. Which types of expeditions are generating the most revenue?"

She's NOT asking for a data dump, which would be instant overwhelm.

Instead, she wants categories (expedition types) and one metric (revenue).

That's your organizing principle.

The Query: Organized by Category

I ran this query using the Summit Adventure SQL Lab:

-- Revenue performance by expedition category
SELECT 
    e.expedition_type,
    COUNT(DISTINCT ei.instance_id) AS total_trips,
    COUNT(DISTINCT b.booking_id) AS total_bookings,
    SUM(p.amount) AS total_revenue,
    ROUND(AVG(p.amount), 2) AS avg_booking_value
FROM expeditions e
    INNER JOIN expedition_instances ei ON e.expedition_id = ei.expedition_id
    INNER JOIN bookings b ON ei.instance_id = b.instance_id
    INNER JOIN payments p ON b.booking_id = p.booking_id
WHERE p.payment_status = 'completed'
    AND b.status IN ('completed', 'confirmed')
GROUP BY e.expedition_type
ORDER BY total_revenue DESC;

Here's what we found:

  • Cultural expeditions: 72 trips, 139 bookings, $374,134 revenue
  • Photography tours: 97 trips, 193 bookings, $370,333 revenue
  • Hiking expeditions: 63 trips, 125 bookings, $351,951 revenue
  • Safari expeditions: 94 trips, 177 bookings, $348,587 revenue
  • Climbing expeditions: 65 trips, 124 bookings, $298,272 revenue

Cultural expeditions generate the most revenue ($374K) with strong booking volume (139 bookings) and a high average value ($1,959 per booking). Hiking expeditions have the highest average value ($2,095 per booking) despite lower volume, while photography leads in total trips (97) and bookings (193).

This tells Lindsey exactly where to focus:

  • Marketing should promote cultural trips (revenue leader with strong volume)
  • Sales should upsell hiking trips (highest per-customer value at $2,095)
  • Operations should expand photography offerings (highest trip count and booking volume prove demand)

One query. Clear categories. Actionable business insights.

The Marie Kondo Blueprint: Three Steps

There are 3 simple steps to the Marie Kondo Blueprint

  1. Identify categories
  2. Choose metrics
  3. Filter the noise

Let's break down each of the steps.

Step 1: Identify Your Categories

Ask: "What groupings matter for this decision?"

Not every column is a category. In our example:

  • ✅ expedition_type - This is what matters (Lindsey wants to compare types)
  • ❌ booking_date - Not relevant to this question (we're not looking at trends)
  • ❌ customer_id - Too granular for this decision (we don't care about individual customers)

Step 2: Choose Your Metric

What are you measuring within each category?

  • Revenue (SUM of payments)
  • Volume (COUNT of bookings)
  • Average value (AVG of payment amounts)

In business analytics, you usually want all three. They tell different parts of the story.

Step 3: Filter the Noise

Before grouping, remove data that shouldn't be counted:

  • Cancelled bookings (Line 12: status IN ('completed', 'confirmed'))
  • Failed payments (Line 11: payment_status = 'completed')
  • Test data or outliers

This is WHERE you clean up before organizing. Just like Marie Kondo says: handle each item (row) and decide if it belongs in your organized space.

Why This Framework Works Everywhere

The Marie Kondo Blueprint adapts to any business question:

E-commerce: "Which product categories are most profitable?"

GROUPBY product_category

SaaS: "Which subscription tiers generate most revenue?"

GROUPBY subscription_tier

Retail: "Which store locations perform best?"

GROUPBY store_location

The structure stays the same. You're organizing overwhelming data into categories that help you make decisions.

The GROUP BY Moment

Notice Line 14 in our query: GROUP BY e.expedition_type

This is the Marie Kondo organizing moment. It takes 204 individual bookings and organizes them into 5 neat categories.

Without GROUP BY, you see 204 rows of chaos. With GROUP BY, you see 5 rows of clarity.

That's the difference between data dependent (overwhelmed by rows) and data confident (organized by categories).

Common Mistakes to Avoid

Mistake #1: Grouping by too many columns

Many analysts write:

GROUPBY expedition_type, booking_date, customer_id, status

This creates tiny groups instead of meaningful categories.

Start with one grouping dimension. Add more only if the business question requires it.

Mistake #2: Not filtering before grouping

If you don't use WHERE to remove cancelled or failed transactions, your totals will be wrong. Always clean first (WHERE), then organize (GROUP BY).

Mistake #3: Forgetting ORDER BY

Without ordering, your categories appear randomly. Business leaders want to see the top performers first. Always ORDER BY total_revenue DESC or your key metric.

Try This Next Week

Next time someone asks you to analyze data by category:

  1. Ask the Marie Kondo question: "Which categories matter for this decision?"
  2. Pick your metrics: COUNT, SUM, AVG - what tells the story?
  3. Filter the noise: Use WHERE to remove data that shouldn't count
  4. Group and organize: GROUP BY your categories
  5. Sort by impact: ORDER BY your most important metric

You'll turn overwhelming data into organized insights that business leaders can actually use.

This framework is how confident analysts approach every messy dataset. Not with Excel panic, but with a systematic organizing principle.

Until next time,

Brian

P.S. The reason I shared this framework this week is because of what a reader wrote a few weeks back: "I am a product management guy I have some experience of using and making Tableau visuals but out of practice. However I have zero SQL and Python skills I would like to learn from a commercial perspective solving real world business problems. I would like to do SQL first and then Python. Appreciate any advise in terms of coaching or training would be appreciated."

Does this situation sound familiar to you? Let me know in the 2 minute survey.


Starting With Data

Learn to build analytics projects with SQL, Tableau, Excel, and Python. For data analysts looking to level up their career and complete beginners looking to get started. No fluff. No theory. Just step-by-step tutorials anyone can follow.

Read more from Starting With Data

Hey Reader, Imagine you're a marketing analyst and your marketing director walks over and says: "I need our customers broken into spending tiers (VIP, Standard, and Low Value) so I can send different emails to each group." You have the data. But every customer just has a dollar amount. There's no "tier" column in the database. How do you create categories that don't exist yet? This is where CASE WHEN becomes one of the most useful tools in your SQL toolkit. The problem is that raw data...

Hey Reader, This week's newsletter is different. Not a regular "how to solve problem X with SQL" kinda thing. Because I've realized something: you can write perfect queries and still never get the career impact you want. You need "soft skills" like presenting your results to leadership in addition to "technical skills" like using CTEs and WINDOW functions. Let me show you what I mean. The Scenario Wednesday, 2pm. Q2 review meeting. CMO: "Our booking conversion is down. What's causing it?"...

Hello Reader, Imagine this scenario: Your CMO walks into Monday's meeting with a question: "I need to see our top 3 highest-revenue expeditions in EACH category - hiking, climbing, safari, cultural, and photography." Most analysts would write 5 separate queries, one per category. Or export to Excel and manually filter. An hour of work, minimum. Confident analysts write one query using window functions. Takes 3 minutes. Here's the skill that separates intermediate analysts from advanced ones....