Reduce and aggregate actions
📖 Scenario: You work at a small online store. You have a list of sales records showing the product name and the quantity sold. You want to find out the total quantity sold for each product.
🎯 Goal: Build a Spark program that sums the quantities sold for each product with reduceByKey and prints the aggregated result.
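To make the difference between reducing and aggregating concrete, here is a minimal plain-Python sketch of what Spark's `reduce` and `aggregate` actions compute. It does not use Spark itself: the `aggregate` helper, the quantities, and the two-partition split are all hypothetical stand-ins chosen for illustration.

```python
from functools import reduce

# Hypothetical quantities split into two partitions, as Spark might distribute them.
partitions = [[3, 5, 2], [1, 4]]

def aggregate(parts, zero_value, seq_op, comb_op):
    """Mimic RDD.aggregate: fold each partition, then merge the partial results."""
    partials = [reduce(seq_op, part, zero_value) for part in parts]
    return reduce(comb_op, partials, zero_value)

# The accumulator is a (total_quantity, record_count) pair, a different type
# from the integer elements being aggregated.
total, count = aggregate(
    partitions,
    (0, 0),
    lambda acc, qty: (acc[0] + qty, acc[1] + 1),  # fold one quantity into a partial result
    lambda a, b: (a[0] + b[0], a[1] + b[1]),      # merge two partial results
)

# reduce needs no zero value and returns the same type as the elements.
all_quantities = [qty for part in partitions for qty in part]
grand_total = reduce(lambda a, b: a + b, all_quantities)

print(total, count, grand_total)
```

In PySpark the corresponding calls would be `rdd.aggregate((0, 0), seq_op, comb_op)` and `rdd.reduce(lambda a, b: a + b)` on an RDD of quantities. Both are actions, so they return a value to the driver rather than a new RDD.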
📋 What You'll Learn
Create an RDD with the exact sales data given
Create a variable for the minimum quantity threshold
Use reduceByKey to sum quantities for each product
Print the final aggregated result
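The four steps above can be sketched in plain Python to show the per-key reduction that `reduceByKey` performs. This is an illustration under stated assumptions, not the lesson's solution: the sales records are invented sample data, `reduce_by_key` is a hypothetical helper, and the way `min_quantity` is used (keeping only products whose total meets the threshold) is a guess at the variable's purpose.

```python
from collections import defaultdict
from functools import reduce

# Hypothetical sample data: (product, quantity) pairs. The lesson's exact
# records are not reproduced here.
sales = [("apple", 3), ("banana", 5), ("apple", 2), ("cherry", 1), ("banana", 4)]

# Assumed use of the threshold: keep products whose total is at least this value.
min_quantity = 3

def reduce_by_key(pairs, func):
    """Combine all values that share a key with func, like RDD.reduceByKey."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return {key: reduce(func, values) for key, values in grouped.items()}

# Sum the quantities for each product.
totals = reduce_by_key(sales, lambda a, b: a + b)

# Apply the assumed threshold filter to the per-product totals.
filtered = {product: qty for product, qty in totals.items() if qty >= min_quantity}

# PySpark equivalent, assuming an existing SparkContext named sc:
#   totals_rdd = sc.parallelize(sales).reduceByKey(lambda a, b: a + b)
#   totals_rdd.filter(lambda kv: kv[1] >= min_quantity).collect()

for product, qty in sorted(filtered.items()):
    print(product, qty)
```

Note that in Spark `reduceByKey` is a transformation that returns a new RDD. Nothing is computed until an action such as `collect()` runs.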
💡 Why This Matters
🌍 Real World
Summing sales quantities per product is a common task in retail analytics to understand product performance.
💼 Career
Data scientists and analysts often use reduce and aggregate actions in Spark to process large datasets efficiently.