
Why Pig simplifies data transformation in Hadoop - Test Your Understanding

Practice - 5 Tasks
Answer the questions below
1. Fill in the blank (easy)

Complete the code to load data using Pig Latin.

data = LOAD 'input.txt' USING [1] AS (name:chararray, age:int);
A. PigStorage(',')
B. TextLoader()
C. JsonLoader()
D. CsvLoader()
Common Mistakes
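Using JsonLoader, which expects JSON records rather than comma-separated text.
Using TextLoader, which is a built-in loader but reads each line as a single chararray, so it cannot fill the (name, age) schema.
Using CsvLoader, which is not a built-in Pig loader.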
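
To see why the loader choice matters, the sketch below contrasts two built-in loaders. It assumes input.txt holds comma-separated lines (a name followed by an age); the alias `lines` and the field name `line` are invented here for illustration.

```pig
-- PigStorage(',') splits every line on commas, so the two-field schema applies.
data = LOAD 'input.txt' USING PigStorage(',') AS (name:chararray, age:int);

-- TextLoader() does no splitting: each line arrives as one chararray field.
lines = LOAD 'input.txt' USING TextLoader() AS (line:chararray);
```

With the second form, the name and age would still have to be pulled out of `line` in a later step, which is the work PigStorage does at load time.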
2. Fill in the blank (medium)

Complete the code to filter data where age is greater than 30.

filtered = FILTER data BY age [1] 30;
A. ==
B. <
C. >
D. <=
Common Mistakes
Using '<', which keeps ages below 30.
Using '==', which keeps only ages equal to 30.
Using '<=', which keeps ages less than or equal to 30.
3. Fill in the blank (hard)

Complete the code to group data by name.

grouped = GROUP data BY [1];
A. age
B. name
C. salary
D. date
Common Mistakes
Grouping by 'age', which groups records by age rather than by name.
Grouping by 'salary', which is not a field in the schema defined in Task 1.
Grouping by 'date', which is also not a field in that schema.
4. Fill in the blanks (hard)

Fill both blanks to create a new relation with each name and its record count.

result = FOREACH grouped GENERATE [1], COUNT([2]);
A. group
B. data
C. grouped
D. filtered
Common Mistakes
Using 'grouped' instead of 'group' for the group key.
Counting 'grouped', which is the relation itself; the bag of records inside each group is named 'data'.
Using 'filtered', which is a separate relation and not a field of 'grouped'.
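
The names `group` and `data` come from the schema that GROUP produces, which can be inspected with DESCRIBE. The sketch below assumes the relation from Task 1; the output field names `name` and `total` are added here only for readability and are not part of the task.

```pig
data = LOAD 'input.txt' USING PigStorage(',') AS (name:chararray, age:int);
grouped = GROUP data BY name;

-- DESCRIBE reports two fields for grouped (exact formatting varies by Pig version):
--   group : the grouping key, here a chararray holding the name
--   data  : a bag of the original (name, age) tuples that share that key
DESCRIBE grouped;

-- One output row per name: the key itself and the size of its bag.
result = FOREACH grouped GENERATE group AS name, COUNT(data) AS total;
```

The bag always takes the name of the relation that was grouped, so grouping `filtered` instead would produce a bag called `filtered`.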
5. Fill in the blanks (hard)

Fill all three blanks to order the result by group in descending order and store it.

ordered = ORDER result BY [1] [2];
STORE ordered INTO '[3]';
A. group
B. DESC
C. output_folder
D. ASC
Common Mistakes
Ordering by the count instead of by group.
Using ASC, which sorts in ascending order, instead of DESC.
Leaving the output folder blank or giving an invalid path.
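
Put together, the five tasks form one short Pig Latin script. This is a sketch under the quiz's own assumptions: input.txt is comma-separated with a name and an age per line, and output_folder is a placeholder path. On Hadoop the STORE step normally fails if that folder already exists.

```pig
-- Task 1: load comma-separated records with a declared schema.
data = LOAD 'input.txt' USING PigStorage(',') AS (name:chararray, age:int);

-- Task 2: keep only records with age above 30.
-- The later tasks group the unfiltered relation, so filtered is not used below.
filtered = FILTER data BY age > 30;

-- Task 3: collect records that share a name into one bag per name.
grouped = GROUP data BY name;

-- Task 4: emit each name (the group key) with the number of records in its bag.
result = FOREACH grouped GENERATE group, COUNT(data);

-- Task 5: sort by name in descending order and write the result out.
ordered = ORDER result BY group DESC;
STORE ordered INTO 'output_folder';
```

Each statement names an intermediate relation, and Pig turns the whole chain into the underlying Hadoop jobs, which is the simplification the section title refers to.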