GROUP and JOIN operations
📖 Scenario: You work at a small online store. You have two data files: one with customer orders and another with customer details. You want to learn how to group orders by customer and then join customer details with their orders.
🎯 Goal: Build a Hadoop MapReduce job that groups orders by customer ID and then joins each group with the matching customer record, producing combined output that shows each customer's name alongside their orders.
📋 What You'll Learn
Create a dataset of orders with customer IDs and order amounts
Create a dataset of customers with customer IDs and names
Write a MapReduce job to group orders by customer ID
Write a MapReduce job to join customer details with their orders
Print the final joined output showing customer names and their orders
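The steps above can be previewed with a small in-memory sketch. It is plain Python that imitates the map, shuffle, and reduce phases, not Hadoop API code, and it assumes a reduce-side inner join in which each record is tagged with its source. The sample records and the names `shuffle`, `group_orders`, and `join_customers_orders` are hypothetical, chosen only for illustration.

```python
from collections import defaultdict

# Hypothetical sample data: (customer_id, amount) and (customer_id, name).
orders = [("c1", 25.0), ("c2", 40.0), ("c1", 15.5), ("c3", 9.99)]
customers = [("c1", "Alice"), ("c2", "Bob")]


def shuffle(pairs):
    """Imitate the shuffle phase: collect every value that shares a key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def group_orders(orders):
    """GROUP: emit (customer_id, amount) pairs, then collect amounts per key."""
    mapped = [(cid, amount) for cid, amount in orders]  # map phase
    return dict(shuffle(mapped))                        # shuffle phase


def join_customers_orders(customers, orders):
    """JOIN: tag records by source, shuffle on customer_id, pair them up."""
    tagged = [(cid, ("C", name)) for cid, name in customers]
    tagged += [(cid, ("O", amount)) for cid, amount in orders]
    result = []
    for cid, values in sorted(shuffle(tagged).items()):  # reduce phase
        names = [v for tag, v in values if tag == "C"]
        amounts = [v for tag, v in values if tag == "O"]
        # Inner join: a key missing from either side produces no output.
        if names and amounts:
            for name in names:
                result.append((cid, name, amounts))
    return result


if __name__ == "__main__":
    print(group_orders(orders))
    for cid, name, amounts in join_customers_orders(customers, orders):
        print(cid, name, amounts)
```

In this sketch, customer `c3` has orders but no customer record, so it appears in the grouped result and is dropped by the inner join. A real Hadoop job would read both datasets from files and let the framework perform the shuffle.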
💡 Why This Matters
🌍 Real World
Grouping and joining data are routine steps in sales analysis, customer segmentation, and business reporting.
💼 Career
Data scientists and engineers often need to group and join large datasets to prepare data for analysis or machine learning.