Every time I speak to sales managers, they share the same complaint. "I have so much sales data, but I don't know what to do with it." Sound familiar? Drowning in numbers:
A relentless barrage of data. Impossible to keep up with. And even if you somehow manage, interpreting that data to make meaningful decisions? A colossal task.

You've considered learning programming, perhaps diving into Python. But even the thought of setting up complicated software, dealing with installations, and debugging code is daunting. It's a complete nightmare. And it doesn't feel worth all the hassle.

Never feel overwhelmed with data again

In this guide, I break down the entire process into 3 manageable steps. The best part? No installations. No fuss. Just pure, actionable steps. Here's what's waiting for you:
Sounds like a dream, right?

The D.A.T.A. Framework

Over the past 15+ years of my data career, I've followed the same 4 steps for every new project. And I call it the D.A.T.A. Framework. Clever, I know.
Don't overcomplicate it! Just keep that framework in mind (like I do) to keep things simple.

btw if you want me to share more about this framework (and others that I use all the time) just let me know!

Okay, let's apply the D.A.T.A. Framework to solve the sales manager's problem with Python.

Remember: there are only two types of data professionals: action-takers and sideliners. This is where you decide which one you are.

Use my Global Superstore Sales Analysis with Python notebook so you can follow along. A free Google Colab account is required, but you can also download the ipynb file.

Step 1: Setup and Configuration

The first step is setting everything up.

Protip: Always ensure you have the latest version of libraries to avoid compatibility issues.
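If you want a quick way to see which library versions your notebook is actually running, here is a minimal sketch. It assumes the analysis relies on pandas and plotly (plotly is mentioned later in this guide; pandas is my assumption), and the helper name `installed_versions` is hypothetical, not something from the notebook.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages):
    """Return {package name: installed version, or None if it is missing}."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


# Assumption: these are the two libraries the analyses rely on.
# Adjust the list to match whatever your own notebook imports.
versions = installed_versions(["pandas", "plotly"])
for name, ver in versions.items():
    print(f"{name}: {ver or 'not installed'}")
```

If something is missing or outdated, running `!pip install --upgrade pandas plotly` in a Colab cell is one common way to update it.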
Step 2: Load the Data and Get a Preview

Next, let's load the Superstore Sales dataset and get a preview. Understanding your data structure is the first step in any analysis. This is the Acquire step of the D.A.T.A. Framework.

Rookie mistake: Not checking the first few rows of your dataset. Always inspect the initial rows to understand your data's structure.
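A rough sketch of the load-and-preview step with pandas. The rows below are made-up sample data so the snippet runs on its own, and the column names (Order Date, Category, Sub-Category, Sales, Profit) are my assumption about the Superstore file. In the notebook you would point `pd.read_csv` at the real CSV instead.

```python
import io

import pandas as pd

# Hypothetical sample rows standing in for the real Superstore CSV.
SAMPLE_CSV = """Order Date,Category,Sub-Category,Sales,Profit
2014-01-03,Furniture,Chairs,250.00,40.00
2014-01-20,Technology,Phones,900.00,210.00
2014-02-11,Office Supplies,Paper,45.50,12.25
2014-03-05,Furniture,Tables,400.00,-60.00
"""

# With a real file this would be pd.read_csv("path/to/your.csv").
df = pd.read_csv(io.StringIO(SAMPLE_CSV))

# Always look at the first few rows before doing anything else.
preview = df.head(3)
print(preview)

# Column names, dtypes, and non-null counts in one summary.
df.info()
```

Notice that `Order Date` comes in as plain text at this point. That matters later for the monthly analysis.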
Here's what it should look like:
Step 3: Analyze the Data

The final step for using Python to build a sales analysis is to actually create the analysis. If you've been following the D.A.T.A. Framework, these are the final two parts. I'll show you how to transform the raw CSV data into something usable, and then we'll build out four different analyses, all in Python.

Analysis 1: Sales Analysis by Category

Visualizing total sales by category provides a high-level overview of where the majority of sales are coming from. Proper labeling in your plots is essential. It makes your charts easily understandable to anyone viewing them.
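Here is one way the category breakdown could look, as a sketch rather than the notebook's exact code. The data is a made-up sample, the `Category` and `Sales` column names are assumptions, and `plot_sales_by_category` is a hypothetical helper. The plotly import sits inside the helper so the aggregation still runs if plotly isn't installed.

```python
import pandas as pd

# Hypothetical sample rows; the notebook uses the full Superstore data.
df = pd.DataFrame({
    "Category": ["Furniture", "Technology", "Office Supplies", "Technology", "Furniture"],
    "Sales": [250.0, 900.0, 45.5, 300.0, 120.0],
})

# Total sales per category, largest first.
sales_by_category = (
    df.groupby("Category", as_index=False)["Sales"]
    .sum()
    .sort_values("Sales", ascending=False)
    .reset_index(drop=True)
)
print(sales_by_category)


def plot_sales_by_category(summary):
    """Bar chart with an explicit title and axis labels (requires plotly)."""
    import plotly.express as px

    return px.bar(
        summary,
        x="Category",
        y="Sales",
        title="Total Sales by Category",
        labels={"Category": "Product Category", "Sales": "Total Sales"},
    )


# In a notebook: plot_sales_by_category(sales_by_category).show()
```

The `title` and `labels` arguments are where the "proper labeling" advice shows up in practice.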
Here's what you should see:
Analysis 2: Monthly Sales Analysis

Breaking down sales on a monthly basis helps in understanding trends, seasonal variations, and anomalies.

Protip: Be sure to convert any date columns into a datetime datatype. Always ensure date-related operations are performed on columns of the correct datatype.
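A minimal sketch of the datetime conversion and monthly rollup, assuming the date column is called `Order Date` and using made-up rows. The real file may store dates in a different format, in which case `pd.to_datetime` may need a `format=` or `dayfirst=` argument.

```python
import pandas as pd

# Hypothetical sample rows with dates stored as text, as they arrive from a CSV.
df = pd.DataFrame({
    "Order Date": ["2014-01-03", "2014-01-20", "2014-02-11", "2014-03-05", "2014-03-28"],
    "Sales": [250.0, 900.0, 45.5, 300.0, 120.0],
})

# Convert the text column to a real datetime dtype before any date logic.
df["Order Date"] = pd.to_datetime(df["Order Date"])

# Group by calendar month and total the sales.
monthly_sales = (
    df.groupby(df["Order Date"].dt.to_period("M"))["Sales"]
    .sum()
    .rename_axis("Month")
    .reset_index()
)

# Turn the monthly periods back into timestamps (first day of each month),
# which charting libraries tend to handle more easily.
monthly_sales["Month"] = monthly_sales["Month"].dt.to_timestamp()
print(monthly_sales)
```

Without the `pd.to_datetime` step, the `.dt` accessor raises an error because the column is still plain text.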
Here's the sales data by month so you can easily spot trends:
Analysis 3: Monthly Sales Over Time by Category

Okay, now we're getting somewhere! Let's combine the insights we've gotten from the first two analyses into one where we can look at monthly trends by sales category.
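Combining the two ideas could look something like this sketch: group by month and category together, then draw one line per category. The sample rows, the column names, and the `plot_monthly_by_category` helper are all assumptions for illustration.

```python
import pandas as pd

# Hypothetical sample rows; the notebook uses the full Superstore data.
df = pd.DataFrame({
    "Order Date": pd.to_datetime(
        ["2014-01-03", "2014-01-20", "2014-02-11", "2014-02-15", "2014-02-28"]
    ),
    "Category": ["Furniture", "Technology", "Furniture", "Technology", "Technology"],
    "Sales": [250.0, 900.0, 45.5, 300.0, 120.0],
})

# Snap each order date to the first day of its month.
df["Month"] = df["Order Date"].dt.to_period("M").dt.to_timestamp()

# One row per (month, category) pair with total sales.
monthly_by_category = df.groupby(["Month", "Category"], as_index=False)["Sales"].sum()
print(monthly_by_category)


def plot_monthly_by_category(summary):
    """Line chart with one line per category (requires plotly)."""
    import plotly.express as px

    return px.line(
        summary,
        x="Month",
        y="Sales",
        color="Category",
        title="Monthly Sales by Category",
        labels={"Month": "Order Month", "Sales": "Total Sales"},
    )


# In a notebook: plot_monthly_by_category(monthly_by_category).show()
```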
And here's what it should look like. Nice!!
Analysis 4: Profit vs Sales Analysis by Subcategory

Visualizing the relationship between profit and sales for each subcategory can reveal which products are the most lucrative. More profit = more cash money!

A scatter plot is particularly useful for this type of analysis as it visually separates high-profit, high-sales products from the rest.
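As a sketch of the subcategory comparison: total up sales and profit per subcategory, then plot one against the other. The rows are made up, the column names (`Sub-Category`, `Sales`, `Profit`) are assumptions, and `plot_profit_vs_sales` is a hypothetical helper.

```python
import pandas as pd

# Hypothetical sample rows; the notebook uses the full Superstore data.
df = pd.DataFrame({
    "Sub-Category": ["Chairs", "Phones", "Tables", "Phones", "Tables"],
    "Sales": [250.0, 900.0, 400.0, 300.0, 200.0],
    "Profit": [40.0, 210.0, -60.0, 90.0, -15.0],
})

# Total sales and total profit for each subcategory.
profit_vs_sales = df.groupby("Sub-Category", as_index=False)[["Sales", "Profit"]].sum()
print(profit_vs_sales)


def plot_profit_vs_sales(summary):
    """Scatter plot with one labeled point per subcategory (requires plotly)."""
    import plotly.express as px

    return px.scatter(
        summary,
        x="Sales",
        y="Profit",
        text="Sub-Category",
        title="Profit vs Sales by Subcategory",
        labels={"Sales": "Total Sales", "Profit": "Total Profit"},
    )


# In a notebook: plot_profit_vs_sales(profit_vs_sales).show()
```

In this sample, Tables has solid sales but negative profit, which is exactly the kind of mismatch a scatter plot makes easy to spot.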
And here's the profit vs. sales scatter plot (created in Python and Plotly from CSV data!)
Great job! If you worked through those steps, you are well on your way to analyzing all sorts of data with Python.

Here's a quick recap of what you learned:

By following the D.A.T.A. Framework, we streamlined the process of deriving actionable insights from raw sales data.

Ideas for next steps
The idea for this newsletter came directly from a reader, just like you! Take 3 minutes to let me know what you want help with next.

Until next time, keep exploring and happy analyzing!

Brian

PS: The final Solving with SQL cohort for 2023 (possibly EVER) starts on Oct 16. If you want to level up your SQL skills by solving real-world business problems alongside other data professionals, then you should definitely register now. I'm planning out 2024 now and this is probably the last time I'll offer cohorts for SQL only.

Whenever you're ready, there are three ways I can help