Practice - 5 Tasks
Answer the questions below
1. Fill in the blank
Easy: Complete the code to load data from a file named 'data.txt'.
Hadoop
data = LOAD 'data.txt' USING [1];
Common Mistakes
Using JsonLoader() for plain text files.
Forgetting to specify a loader.
Using CsvLoader() when the file is not CSV.
Explanation
PigStorage() is the default loader in Pig Latin for loading plain text files with fields separated by tabs.
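The completed statement might look like the sketch below. The field names and types in the AS clause are assumptions made for illustration; the task itself does not state a schema.

```pig
-- PigStorage() with no argument splits each line on tab characters.
-- The schema in the AS clause is hypothetical, not given by the task.
data = LOAD 'data.txt' USING PigStorage()
       AS (name:chararray, age:int, department:chararray, salary:double);

-- Print the relation to check that the fields were parsed as expected.
DUMP data;
```

If the file used a different separator, the delimiter can be passed as an argument, for example PigStorage(',') for comma-separated fields.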
2. Fill in the blank
Medium: Complete the code to filter records where the age field is greater than 30.
Hadoop
filtered_data = FILTER data BY [1] > 30;
Common Mistakes
Using the dataset name instead of the field name.
Using a number instead of a field name.
Using a field that does not exist.
Explanation
The FILTER condition refers to the field by name: 'age > 30' keeps only the records whose age value is greater than 30.
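As a rough sketch, the filter could be run end to end as follows, assuming the same hypothetical schema for 'data.txt' (the task does not define one):

```pig
-- Hypothetical schema; 'age' must be declared so it can be referenced by name.
data = LOAD 'data.txt' USING PigStorage()
       AS (name:chararray, age:int, department:chararray, salary:double);

-- Keep only the tuples whose age field is strictly greater than 30.
filtered_data = FILTER data BY age > 30;

DUMP filtered_data;
```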
3. Fill in the blank
Hard: Complete the code to group data by the 'department' field.
Hadoop
grouped = GROUP data BY [1];
Common Mistakes
Grouping by a field that is not categorical.
Using a field that does not exist in the data.
Using the dataset name instead of a field.
Explanation
Grouping is done by the 'department' field to aggregate data per department.
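A minimal sketch of the grouping step, again assuming a hypothetical schema for 'data.txt':

```pig
-- Hypothetical schema, declared so that 'department' can be used as the key.
data = LOAD 'data.txt' USING PigStorage()
       AS (name:chararray, age:int, department:chararray, salary:double);

-- One output tuple per distinct department value.
grouped = GROUP data BY department;

-- Show the schema of the grouped relation.
DESCRIBE grouped;
```

Each tuple in the grouped relation has two fields: 'group', which holds the department value, and a bag named after the input relation ('data') that contains all the original tuples for that department.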
4. Fill in the blank
Hard: Fill both blanks to create a new relation with only the 'name' and 'salary' fields.
Hadoop
selected = FOREACH data GENERATE [1], [2];
Common Mistakes
Selecting fields not present in the data.
Selecting fields unrelated to the task.
Mixing up field names.
Explanation
The FOREACH GENERATE statement selects the 'name' and 'salary' fields from the data.
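For illustration, the projection could be written like this, under the same assumed schema:

```pig
-- Hypothetical schema; only 'name' and 'salary' are used below.
data = LOAD 'data.txt' USING PigStorage()
       AS (name:chararray, age:int, department:chararray, salary:double);

-- Project each tuple down to the two requested fields, in this order.
selected = FOREACH data GENERATE name, salary;

DUMP selected;
```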
5. Fill in the blank
Hard: Fill all three blanks to calculate the average salary per department.
Hadoop
grouped = GROUP data BY [1];
avg_salary = FOREACH grouped GENERATE [2], AVG([3]);
Common Mistakes
Using 'salary' instead of 'data.salary' inside AVG.
Not using 'group' to output the group key.
Grouping by the wrong field.
Explanation
First, group by 'department'. Then, for each group, generate 'group' (the department key) and AVG(data.salary), the average of the salary field taken from the bag of grouped tuples.
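Put together, the two statements might look like the sketch below. The schema and the output aliases 'department' and 'mean_salary' are assumptions added for readability; they are not part of the task.

```pig
-- Hypothetical schema for the input file.
data = LOAD 'data.txt' USING PigStorage()
       AS (name:chararray, age:int, department:chararray, salary:double);

-- Collect all tuples that share the same department into one bag.
grouped = GROUP data BY department;

-- 'group' is the grouping key; 'data.salary' projects the salary column
-- out of the bag so that AVG can aggregate it per department.
avg_salary = FOREACH grouped GENERATE group AS department,
                                      AVG(data.salary) AS mean_salary;

DUMP avg_salary;
```

The bag takes the name of the grouped relation, which is why the aggregate is written as AVG(data.salary) rather than AVG(salary).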