LOAD, FILTER, and STORE operations
📖 Scenario: You work with a large dataset of customer orders stored in Hadoop. You want to load this data, filter orders with amounts greater than 100, and save the filtered results for further analysis.
🎯 Goal: Build a Hadoop Pig script that loads the orders data, filters orders with amount greater than 100, and stores the filtered data into a new location.
📋 What You'll Learn
- Load data from '/data/orders' with fields order_id, customer_id, and amount
- Create a filter condition to keep only orders where amount > 100
- Store the filtered results into '/data/filtered_orders'

💡 Why This Matters
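The three steps above can be sketched as a single Pig Latin script. The comma delimiter and the field types (int, chararray, double) are assumptions, since the scenario does not specify the file format; adjust them to match your actual data.

```pig
-- Load the orders data; delimiter and schema types are assumed here
orders = LOAD '/data/orders' USING PigStorage(',')
         AS (order_id:int, customer_id:chararray, amount:double);

-- Keep only orders with an amount greater than 100
big_orders = FILTER orders BY amount > 100;

-- Write the filtered relation to a new HDFS location
STORE big_orders INTO '/data/filtered_orders' USING PigStorage(',');
```

Run it with `pig script.pig` (MapReduce mode) or `pig -x local script.pig` to test against local files first.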
🌍 Real World
Filtering large datasets in Hadoop is common for preparing data for analysis or reporting.
💼 Career
Data engineers and analysts use LOAD, FILTER, and STORE operations daily to manage big data pipelines.