Hadoop, data, ~30 mins

Pig Latin basics in Hadoop - Mini Project: Build & Apply

Pig Latin basics
📖 Scenario: You work at a small company that collects sales data. You want to use Pig Latin to analyze this data easily.
🎯 Goal: Learn how to load data, filter it, and display results using Pig Latin.
📋 What You'll Learn
Create a relation with sales data
Add a filter condition
Use a foreach statement to select fields
Display the final filtered data
💡 Why This Matters
🌍 Real World
Pig Latin is used to process large datasets in Hadoop environments for data analysis.
💼 Career
Knowing Pig Latin helps in roles like data engineer or big data analyst working with Hadoop.
1
Create the sales data relation
Create a relation called sales using the LOAD statement with PigStorage and the schema (name:chararray, amount:int). The loaded data must contain exactly these tuples: ("Alice", 300), ("Bob", 150), ("Charlie", 200).
Hint: Use LOAD with PigStorage and define the schema with AS.
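
One way this step might look is sketched below. The file name sales.txt and the comma delimiter are assumptions made for illustration; the exercise does not name an input file, and PigStorage splits on tabs when no delimiter is given.

```pig
-- Assumption: a hypothetical input file sales.txt with one record per line:
--   Alice,300
--   Bob,150
--   Charlie,200
-- PigStorage(',') splits each line on commas; AS attaches field names and types.
sales = LOAD 'sales.txt' USING PigStorage(',') AS (name:chararray, amount:int);
```

If the data were tab-separated instead, PigStorage() with no argument would be enough.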

2
Filter sales greater than 180
Create a relation called big_sales by filtering sales where amount is greater than 180 using the FILTER statement.
Hint: Use FILTER with the condition amount > 180.
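
A minimal sketch of the filter, assuming the sales relation from step 1 already exists with the schema (name:chararray, amount:int):

```pig
-- Keep only the tuples whose amount field is strictly greater than 180.
-- Assumes sales was defined earlier with an int field named amount.
big_sales = FILTER sales BY amount > 180;
```

With the three tuples from step 1, only Alice (300) and Charlie (200) satisfy the condition; Bob (150) is dropped.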

3
Select name and amount from filtered data
Create a relation called result using FOREACH on big_sales to generate only the name and amount fields.
Hint: Use FOREACH with GENERATE to select fields.
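
The projection could be written roughly as follows, assuming big_sales was produced by the filter in step 2:

```pig
-- For each tuple in big_sales, emit a new tuple containing name and amount.
-- Assumes big_sales carries the fields name and amount from the original schema.
result = FOREACH big_sales GENERATE name, amount;
```

Because big_sales has only these two fields, the projection leaves the schema unchanged here. On wider relations, the same statement is how you would drop the columns you do not need.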

4
Display the filtered sales data
Use the DUMP statement to display the contents of the result relation.
Hint: Use DUMP result; to see the filtered data.
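
As a sketch, the final statement is a single line, assuming result was defined in step 3:

```pig
-- DUMP triggers execution of the pipeline and prints each tuple to the console.
-- With the step 1 data, the printed tuples should be (Alice,300) and (Charlie,200);
-- the order in which they appear is not guaranteed.
DUMP result;
```

Pig evaluates lazily: the earlier LOAD, FILTER, and FOREACH statements only build a plan, and nothing runs until a statement such as DUMP or STORE asks for output.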