|
Hello Reader, A common analytics question that sounds simple but isn't: "Which customers have booked the same type of expedition more than once?" You can't answer this with a regular JOIN between two different tables. The information lives in one table. You need to compare rows within that table to find patterns. That's what a self-JOIN does. It joins a table to itself. It sounds unusual, but once you see the pattern, you'll recognize situations where it's exactly what you need. The Business ProblemMarcus, your VP of Operations, wants to understand customer loyalty at a deeper level than just "repeat customers." He wants to know: which customers are so loyal to a specific expedition type that they've booked it three or more times? This matters because a customer who booked three photography tours might become a brand advocate for that category. They might respond to premium photography packages, refer friends, or provide testimonials. Gold. The Query: Finding Category LoyalistsHere's what I ran against the Summit Adventures database (the fake adventure tourism company I created to help people learn business analytics):
Eight customers have booked the same expedition type three or more times. That's real loyalty. And real business intelligence. Alexandrie Ullrich has spent over $18,600 on hiking alone. That's a customer who should be getting early access to every new hiking expedition, a direct email from a guide, and perhaps an invitation to a premium experience. Now: The Self-JOIN PatternThe query above uses GROUP BY and HAVING — powerful, but it doesn't compare individual records to each other. Let's look at where self-JOINs shine: comparing rows within the same table. Business question: Which guides have worked together on the same expedition instance? This tells operations which guide pairings work well and which guides have experience collaborating.
Lewis Morris has worked with several different guides across multiple trips. Laurent Neveu is another frequent collaborator. These are the guides you'd pair with newcomers because they have experience working in teams. How the Self-JOIN (Actually) WorksThe key is this part:
We're joining
That The Self-JOIN PatternHere's the general structure:
Three components:
When to Use Self-JOINs in BusinessFinding customers in the same city: "Which of our high-value customers live in the same city?" → Useful for local events and meetups. Comparing employees: "Which team members started in the same month?" → Useful for cohort mentoring programs. Product pairings: "Which products are frequently bought together?" → Useful for bundle pricing. Sequential events: "Which bookings happened within 30 days of each other for the same customer?" → Useful for understanding purchase velocity. Common Mistakes to AvoidMistake 1: Forgetting the de-duplication condition Without Mistake 2: Using a self-JOIN when GROUP BY would work If you're counting "how many times did X happen?" That's GROUP BY + HAVING. Self-JOINs are for comparing one row to another row. Make sure you're using the right tool for the right job. Mistake 3: Not limiting results Self-JOINs can produce enormous result sets (every possible pair of rows). Always add filters and LIMIT while developing your query. Try This at Your JobThink about comparisons within a single dataset:
The self-JOIN pattern works whenever you need to find relationships between rows in the same table. Until next time, Brian P.S. Multi-table analysis is the core of Module 4 (The FBI Evidence Board Blueprint) in SQL for Business Impact. The course builds from simple JOINs to complex self-JOINs, giving you the full toolkit for connecting data across any source. Check it out at sqlforbusinessimpact.com. P.P.S. What kind of "same-table comparison" would be most useful at your company? Hit reply and let me know. These are fun puzzles to solve. I read every response. |
Learn to build analytics projects with SQL, Tableau, Excel, and Python. For data analysts looking to level up their career and complete beginners looking to get started. No fluff. No theory. Just step-by-step tutorials anyone can follow.
Hello Reader, This week's newsletter is a bit different. There's SQL in here, but the real topic is career strategy. Because one of the most common questions I get from readers is some version of: "I've been learning SQL for a few months. How do I prove I can actually do this job?" The honest answer: a portfolio of 2-3 projects that demonstrate business thinking, not just technical syntax. The problem is most portfolios I've reviewed look the same. A Kaggle competition. A tutorial from...
Hello Reader, You've built a clean customer report. Locations, booking history, revenue by region. You send it off to whoever asked for it. Ten minutes later: "Why does this report show blank cells in the location column? And why do the totals not add up?" Welcome to the NULL problem. You're going to learn how to solve it once and for all today. In the Summit Adventures database (the fake adventure tourism company I created to help people learn business analytics), 78.2% of customers don't...
Hello Reader, Imagine you're an analyst at Summit Adventures. Monday morning, you get called into a meeting and your VP asks: "How's revenue looking across our expedition types?" You pull up your analysis. You've got clean numbers. You present them: "Cultural expeditions brought in $374K. Photography brought in $370K. Hiking was $352K. Safari was $349K. Climbing was $298K." Silence. Then someone asks: "Okay... but what should we do about it?" You've answered the question accurately. But you...