0
0
Excelspreadsheet~15 mins

Why Power Query transforms messy data in Excel - Business Case Study

Choose your learning style9 modes available
Scenario Mode
👤 Your Role: You are a data analyst at a retail company.
📋 Request: Your manager wants you to clean and organize sales data that is messy and inconsistent so it can be used for accurate reporting.
📊 Data: You have a sales data file with columns for Date, Product, Region, Sales Amount, and Notes. The data has inconsistent date formats, extra spaces, mixed case text, and some missing values.
🎯 Deliverable: A clean and organized table with consistent date format, trimmed text, standardized case, and no missing values in key columns.
Progress0 / 6 steps
Sample Data
DateProductRegionSales AmountNotes
1/5/2024 apple north100First sale
2024-01-06BananaSouth200
07-Jan-2024APPLENorth150Repeat
1/8/24banana southMissing sale
2024/01/09OrangeEast120
10-Jan-2024orangeEast130Good sale
11/01/2024AppleNorth170
2024-01-12BananaSouth210New stock
1
Step 1: Load the messy sales data into Power Query from Excel.
In Excel, select the data range, then go to Data > From Table/Range to open Power Query Editor.
Expected Result
The data appears in Power Query Editor ready for transformation.
2
Step 2: Standardize the Date column to a consistent date format.
Select the Date column, then choose Transform > Data Type > Date.
Expected Result
All dates are converted to Excel date format, e.g., 01/05/2024, 01/06/2024, etc.
3
Step 3: Trim extra spaces from the Product column.
Select the Product column, then choose Transform > Format > Trim.
Expected Result
Product names have no leading or trailing spaces.
4
Step 4: Convert all text in the Product and Region columns to proper case (first letter capital).
Select Product and Region columns, then Transform > Format > Capitalize Each Word.
Expected Result
Product and Region names are standardized, e.g., 'Apple', 'Banana', 'North', 'South'.
5
Step 5: Remove rows where Sales Amount is missing.
Filter the Sales Amount column to remove blank or null values.
Expected Result
Rows with missing sales amounts are removed from the data.
6
Step 6: Close and load the cleaned data back into Excel as a new table.
Click Home > Close & Load to return the cleaned data to Excel.
Expected Result
A new clean table appears in Excel with consistent dates, trimmed and capitalized text, and no missing sales amounts.
Final Result
Date       | Product | Region | Sales Amount | Notes
-----------------------------------------------------
01/05/2024 | Apple   | North  | 100          | First sale
01/06/2024 | Banana  | South  | 200          | 
01/07/2024 | Apple   | North  | 150          | Repeat
01/09/2024 | Orange  | East   | 120          | 
01/10/2024 | Orange  | East   | 130          | Good sale
01/11/2024 | Apple   | North  | 170          | 
01/12/2024 | Banana  | South  | 210          | New stock
Dates are now consistent and easy to sort or filter.
Product and Region names are standardized for accurate grouping.
Rows with missing sales amounts are removed to avoid errors in analysis.
The cleaned data is ready for reliable reporting and charts.
Bonus Challenge

Add a new column in Power Query that categorizes sales as 'High' if Sales Amount is above 150, otherwise 'Low'.

Show Hint
Use Add Column > Conditional Column in Power Query and set condition: Sales Amount > 150 then 'High' else 'Low'.