
Pig Latin basics in Hadoop - Interactive Code Practice

Practice - 5 Tasks
Answer the questions below
1. Fill in the blank (easy)

Complete the code to load data from a file named 'data.txt'.

Hadoop
data = LOAD 'data.txt' USING [1];
A. PigStorage()
B. TextLoader()
C. CsvLoader()
D. JsonLoader()
Common Mistakes
Using JsonLoader() for plain text files.
Forgetting to specify a loader.
Using CsvLoader() when the file is not CSV.
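
The LOAD statement can also declare a schema, which is what lets later statements refer to fields by name. The following is a minimal sketch, assuming data.txt is tab-delimited with four columns; the column names and types in the AS clause are assumptions borrowed from the later tasks, not something this task states. PigStorage() splits on tabs by default, and without an AS clause fields can only be referenced by position ($0, $1, and so on).

```pig
-- Assumed layout of data.txt: name <TAB> age <TAB> department <TAB> salary
-- The AS clause (hypothetical schema) gives each column a name and a type.
data = LOAD 'data.txt' USING PigStorage('\t')
       AS (name:chararray, age:int, department:chararray, salary:double);

-- Print the schema Pig has recorded for the relation.
DESCRIBE data;
```

For a comma-delimited file, the same sketch would pass ',' to PigStorage instead of '\t'.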
2. Fill in the blank (medium)

Complete the code to filter records where the age field is greater than 30.

Hadoop
filtered_data = FILTER data BY [1] > 30;
A. data
B. 30
C. age
D. count
Common Mistakes
Using the dataset name instead of the field name.
Using a number instead of a field name.
Using a field that does not exist.
3. Fill in the blank (hard)

Complete the code to group data by the 'department' field.

Hadoop
grouped = GROUP data BY [1];
A. age
B. department
C. name
D. salary
Common Mistakes
Grouping by a field that is not categorical.
Using a field that does not exist in the data.
Using the dataset name instead of a field.
4. Fill in the blanks (hard)

Fill both blanks to create a new relation with only the 'name' and 'salary' fields.

Hadoop
selected = FOREACH data GENERATE [1], [2];
A. name
B. age
C. salary
D. department
Common Mistakes
Selecting fields not present in the data.
Selecting fields unrelated to the task.
Mixing up field names.
5. Fill in the blanks (hard)

Fill all three blanks to calculate the average salary per department.

Hadoop
grouped = GROUP data BY [1];
avg_salary = FOREACH grouped GENERATE [2], AVG([3]);
A. department
B. group
C. data.salary
D. salary
Common Mistakes
Using 'salary' instead of 'data.salary' inside AVG.
Not using 'group' to output the group key.
Grouping by the wrong field.
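
The five statements practiced above can be read as one small script. The sketch below is illustrative only: it assumes data.txt is tab-delimited and that its columns are name, age, department, and salary (the schema in the AS clause is an assumption, since the tasks never declare one), and the output field names department and mean_salary are hypothetical. After GROUP, each record holds the key in a field called group plus a bag named after the grouped relation, which is why the aggregate is written AVG(data.salary).

```pig
-- Load the file with an assumed schema so fields can be referenced by name.
data = LOAD 'data.txt' USING PigStorage()
       AS (name:chararray, age:int, department:chararray, salary:double);

-- Keep only records whose age field exceeds 30.
filtered_data = FILTER data BY age > 30;

-- Project two fields into a new relation.
selected = FOREACH data GENERATE name, salary;

-- One record per department: (group, {bag of matching data records}).
grouped = GROUP data BY department;

-- Emit the group key and the mean of the salary field inside the bag.
avg_salary = FOREACH grouped GENERATE group AS department,
                                      AVG(data.salary) AS mean_salary;

-- Trigger execution and print the result to the console.
DUMP avg_salary;
```

Running DESCRIBE grouped; before the final FOREACH is a quick way to see the group field and the nested data bag.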