0
0
Power BIbi_tool~15 mins

Removing duplicates in Power BI - Real Business Scenario

Choose your learning style9 modes available
Scenario Mode
👤 Your Role: You are a sales analyst at a retail company.
📋 Request: Your manager wants a clean sales report without duplicate sales records to ensure accurate revenue analysis.
📊 Data: You have a sales dataset with columns: OrderID, Product, Customer, SalesAmount, and OrderDate. Some orders are duplicated due to system errors.
🎯 Deliverable: Produce a cleaned sales table in Power BI that removes duplicate orders based on OrderID, and create a report showing total sales by Product.
Progress0 / 5 steps
Sample Data
OrderIDProductCustomerSalesAmountOrderDate
1001NotebookAlice252024-05-01
1002PenBob52024-05-02
1003NotebookCharlie252024-05-03
1002PenBob52024-05-02
1004MarkerDiana102024-05-04
1005PenEva52024-05-05
1003NotebookCharlie252024-05-03
1006NotebookFrank252024-05-06
1
Step 1: Load the sales data into Power BI Desktop from your source file.
Use 'Get Data' and select your data source file containing the sales data.
Expected Result
Sales data table loaded with all rows including duplicates.
2
Step 2: Open Power Query Editor to clean the data.
Click 'Transform Data' to open Power Query Editor.
Expected Result
Power Query Editor opens showing the sales data.
3
Step 3: Remove duplicate rows based on the OrderID column.
Select the 'OrderID' column, then click 'Remove Rows' > 'Remove Duplicates'.
Expected Result
Only unique OrderID rows remain; duplicate orders are removed.
4
Step 4: Close and apply the changes to load the cleaned data into Power BI.
Click 'Close & Apply' in Power Query Editor.
Expected Result
Cleaned sales data loaded into Power BI without duplicates.
5
Step 5: Create a report visual to show total sales by Product.
Insert a 'Clustered Column Chart'. Set Axis to 'Product' and Values to 'Sum of SalesAmount'.
Expected Result
Bar chart displays total sales amount for each product without double counting duplicates.
Final Result
Sales Report

Product    | Total Sales
-------------------------
Notebook   | ██████████ 75
Pen        | ████ 10
Marker     | ██ 10

(Bar chart with Notebook highest, Pen and Marker lower)
Duplicate orders were removed to avoid inflating sales numbers.
Notebook is the top-selling product with total sales of 75.
Pen and Marker have lower total sales after cleaning duplicates.
Bonus Challenge

Create a measure that counts the number of unique customers who bought each product after removing duplicates.

Show Hint
Use DAX function DISTINCTCOUNT on the Customer column filtered by Product.