🟢 How to Document Your SQL Work So Anyone Can Understand It


Hey Reader,

Quick question: Could a colleague open your most recent SQL query and understand what it does without asking you a single question?

If the answer is "probably not," this newsletter is for you.

Writing SQL that works is step one. Writing SQL that someone else can read, trust, and maintain — that's the skill that changes how people see your work.

Why This Matters More Than You Think

Here's what I've noticed over 15+ years in analytics: the analysts who get promoted aren't always the ones who write the most complex queries. They're the ones whose work is so clear that other people can build on it.

When your analysis is easy to follow:

  • Colleagues trust the results without interrogating you
  • Leadership can reference your work in meetings
  • New team members can pick up where you left off
  • You can revisit your own work 6 months later without confusion

When it's not: every analysis becomes a bottleneck that requires you to explain it. That limits your impact.

Three Documentation Habits That Make a Difference

Habit 1: Comment the "Why," Not the "What"

-- BAD: Comments that describe what SQL does (obvious from reading it)-- Select first name and last name from customersSELECT first_name, last_name FROM customers;
-- GOOD: Comments that explain WHY and business context-- Marketing requested VIP list for Q2 retention campaign-- VIP defined as: $5K+ lifetime spend, active in last 6 monthsSELECT 
    c.first_name ||' '|| c.last_name AS customer_name,
    c.email,
    SUM(p.amount) AS lifetime_spend,
    MAX(b.booking_date)::dateAS last_booking
FROM customers c
    INNERJOIN bookings b ON c.customer_id = b.customer_id
    INNERJOIN payments p ON b.booking_id = p.booking_id
WHERE p.payment_status ='completed'AND b.status IN ('completed', 'confirmed')
GROUPBY c.customer_id, c.first_name, c.last_name, c.email
HAVINGSUM(p.amount) >=5000ANDMAX(b.booking_date) >=CURRENT_DATE-INTERVAL'6 months'ORDERBY lifetime_spend DESC;

The "what" is visible in the code itself. The "why" disappears unless you write it down. Six months later, you'll need the why more than the what.

Habit 2: Use CTEs to Tell a Story

Compare these two versions of the same analysis:

Version A: One big query

SELECT e.expedition_type, COUNT(DISTINCT b.booking_id),
SUM(p.amount), ROUND(AVG(p.amount), 2),
COUNT(DISTINCT b.booking_id) FILTER (WHERE b.status = 'cancelled') * 100.0 / COUNT(DISTINCT b.booking_id)
FROM expeditions e INNER JOIN expedition_instances ei
ON e.expedition_id = ei.expedition_id INNER JOIN bookings b
ON ei.instance_id = b.instance_id INNER JOIN payments p
ON b.booking_id = p.booking_id WHERE p.payment_status = 'completed'
GROUP BY e.expedition_type ORDER BY SUM(p.amount) DESC;

Version B: CTEs with clear steps

-- Expedition category performance analysis
-- Requested by: Marcus (Operations)
-- Purpose: Q2 planning - identify which categories to expand
-- Step 1: Calculate revenue and volume per category
WITH category_performance AS (
    SELECT 
        e.expedition_type,
        COUNT(DISTINCT b.booking_id) AS total_bookings,
        SUM(p.amount) AS total_revenue,
        ROUND(AVG(p.amount), 2) AS avg_booking_value
    FROM expeditions e
        INNER JOIN expedition_instances ei ON e.expedition_id = ei.expedition_id
        INNER JOIN bookings b ON ei.instance_id = b.instance_id
        INNER JOIN payments p ON b.booking_id = p.booking_id
    WHERE p.payment_status = 'completed'
    GROUP BY e.expedition_type
),
-- Step 2: Calculate cancellation rates per category
category_cancellations AS (
    SELECT 
        e.expedition_type,
        ROUND(
            COUNT(*) FILTER (WHERE b.status = 'cancelled') * 100.0 / COUNT(*),
            1
        ) AS cancellation_rate
    FROM expeditions e
        INNER JOIN expedition_instances ei ON e.expedition_id = ei.expedition_id
        INNER JOIN bookings b ON ei.instance_id = b.instance_id
    GROUP BY e.expedition_type
)
-- Step 3: Combine for complete picture
SELECT 
    cp.expedition_type,
    cp.total_bookings,
    cp.total_revenue,
    cp.avg_booking_value,
    cc.cancellation_rate
FROM category_performance cp
    INNER JOIN category_cancellations cc ON cp.expedition_type = cc.expedition_type
ORDER BY cp.total_revenue DESC;

Same results. Version B is readable by anyone on your team.

Each CTE has a name that describes what it calculates. The comments label each step. A new analyst joining your team can follow the logic without asking you what it does.

Habit 3: Name Things for Business Users

Column names your manager won't understand:

SELECT 
    cnt_bid,
    sum_amt,
    avg_val,
    exp_typ
FROM ...

Column names anyone can read:

SELECT 
    total_bookings,
    total_revenue,
    avg_booking_value,
    expedition_type
FROM ...

This takes 10 extra seconds when writing the query and saves 10 minutes every time someone reads it. Including you.

A Template for Documented Analysis

Here's a pattern to use for any analysis that might be shared or revisited:

-- ============================================
-- Analysis: [What question this answers]
-- Requested by: [Who asked for it]
-- Date: [When you wrote it]
-- Data source: [Which database/tables]
-- Assumptions: [Any filters or business rules]
-- ============================================
-- Step 1: [Description of first step]
WITH step_one AS (
    SELECT ...
),
-- Step 2: [Description of second step]
step_two AS (
    SELECT ...
)
-- Final output: [What the results show]
SELECT ...
FROM step_one
    JOIN step_two ON ...
ORDER BY ...;

It takes 2 minutes to add this header. That 2-minute investment saves hours of "wait, what does this query do?" conversations.

The Speed of Trust

A documented query isn't just easier to read. It's easier to trust.

When leadership reviews your analysis, clean documentation signals: "This person is thorough. I can rely on their numbers." Messy, uncommented queries signal the opposite — even if the results are correct.

This isn't about perfection. It's about building a reputation for clear, reliable work. Over time, that reputation compounds into bigger projects, more visibility, and greater career impact.

Start Today

Pick your most recent SQL query and apply these three habits:

  1. Add a comment explaining why (not what) the query exists
  2. Break it into named CTEs if it has more than 2 JOINs
  3. Rename columns to plain business language

It won't take long. And the next time someone opens your work, they'll notice the difference.

Until next time,

Brian

Brian Graves, creator of Analytics in Action

Say 👋 on X/Twitter, LinkedIn, or book a call with me. You can always reply to these emails. I check them all.


P.S. Clear, well-structured queries are a running theme throughout SQL for Business Impact. Every module teaches not just the SQL pattern, but how to structure and present it clearly. It's the difference between writing queries and doing analysis. Check it out at sqlforbusinessimpact.com.

P.P.S. What does your team's SQL documentation look like? Are queries well-documented or is everything tribal knowledge? Hit reply — I'm curious how different teams handle this. I read every response.

Starting With Data

Learn to build analytics projects with SQL, Tableau, Excel, and Python. For data analysts looking to level up their career and complete beginners looking to get started. No fluff. No theory. Just step-by-step tutorials anyone can follow.

Read more from Starting With Data

Hey Reader, Last month, we covered LEFT JOIN to find customers who never booked. That's one type of absence — people who never showed up. But there's another kind that's even more expensive: people who showed up, committed, and then left. The dreaded cancellation. Customers who said "yes", then said "actually, never mind". In the Summit Adventures database (the fake adventure tourism company I created to help people learn business analytics), nearly 47% of all bookings end in cancellation....

Hello Reader, This week's newsletter is a bit different. There's SQL in here, but the real topic is career strategy. Because one of the most common questions I get from readers is some version of: "I've been learning SQL for a few months. How do I prove I can actually do this job?" The honest answer: a portfolio of 2-3 projects that demonstrate business thinking, not just technical syntax. The problem is most portfolios I've reviewed look the same. A Kaggle competition. A tutorial from...

Hello Reader, A common analytics question that sounds simple but isn't: "Which customers have booked the same type of expedition more than once?" You can't answer this with a regular JOIN between two different tables. The information lives in one table. You need to compare rows within that table to find patterns. That's what a self-JOIN does. It joins a table to itself. It sounds unusual, but once you see the pattern, you'll recognize situations where it's exactly what you need. The Business...