Challenge - 5 Problems

Pig vs Hive Mastery

🧠 Conceptual

intermediate

Primary Language Used in Pig and Hive

Which language is primarily used to write scripts in Apache Pig and Apache Hive respectively?

A. Pig Latin for Pig and HiveQL for Hive

B. SQL for Pig and Java for Hive

C. Python for Pig and Pig Latin for Hive

D. HiveQL for Pig and Pig Latin for Hive

🧠 Conceptual

intermediate

Data Processing Model Difference

What is the main difference in data processing models between Pig and Hive?

A. Both use declarative models, but Hive supports more data types

B. Pig uses a procedural data flow model; Hive uses a declarative SQL-like model

C. Both use procedural models, but with different syntax

D. Pig uses a declarative model; Hive uses a procedural model

❓ Predict Output

advanced

Output of Pig Latin Script

What is the output of this Pig Latin script given the input data below?

Input data (file):
1,apple,10
2,banana,20
3,apple,15

Script:
fruit_data = LOAD 'input' USING PigStorage(',') AS (id:int, name:chararray, quantity:int);
apple_data = FILTER fruit_data BY name == 'apple';
total = FOREACH (GROUP apple_data ALL) GENERATE SUM(apple_data.quantity) as total_quantity;
DUMP total;

A. (45)

B. (30)

C. (15)

D. (25)
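
The filter, group-all, sum data flow used in the script above can be traced step by step in plain Python. This is only a rough sketch of the semantics, not how Pig executes the job; the rows and the `pear` filter value are hypothetical and deliberately differ from the question's input.

```python
# Hypothetical rows (id, name, quantity); not the question's input file.
rows = [
    (1, "pear", 4),
    (2, "plum", 7),
    (3, "pear", 6),
]

# FILTER ... BY name == 'pear': keep only the matching tuples.
pear_rows = [row for row in rows if row[1] == "pear"]

# GROUP ... ALL: put every remaining tuple into one bag under a single key.
grouped = [("all", pear_rows)]

# FOREACH ... GENERATE SUM(bag.quantity): emit one output tuple per group.
total = [(sum(quantity for _, _, quantity in bag),) for _, bag in grouped]

# DUMP: print each output tuple.
for record in total:
    print(record)
```

Because GROUP ... ALL produces exactly one group, the result holds a single tuple with a single field, whatever the number of input rows.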

❓ Predict Output

advanced

Hive Query Output for Grouping

Given a Hive table 'fruits' with columns (id INT, name STRING, quantity INT) and data:
1,apple,10
2,banana,20
3,apple,15

What is the output of this HiveQL query?
SELECT name, SUM(quantity) as total_quantity FROM fruits GROUP BY name ORDER BY total_quantity DESC;

A. [('banana', 30), ('apple', 25)]

B. [('banana', 20), ('apple', 25)]

C. [('apple', 25), ('banana', 20)]

D. [('apple', 15), ('banana', 20)]
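
The GROUP BY, SUM, and ORDER BY ... DESC pattern in the query above can be approximated in plain Python as follows. This is a minimal sketch of the query's logic under assumed data, not Hive's execution plan; the rows are hypothetical and differ from the table in the question.

```python
from collections import defaultdict

# Hypothetical rows (id, name, quantity); not the question's table.
rows = [
    (1, "pear", 4),
    (2, "plum", 7),
    (3, "pear", 6),
    (4, "fig", 2),
]

# GROUP BY name with SUM(quantity): accumulate one total per distinct name.
totals = defaultdict(int)
for _id, name, quantity in rows:
    totals[name] += quantity

# ORDER BY total_quantity DESC: largest total first.
result = sorted(totals.items(), key=lambda item: item[1], reverse=True)

for name, total_quantity in result:
    print(name, total_quantity)
```

The query yields one row per distinct name, and the sort is applied to the aggregated totals rather than to the original rows.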

🚀 Application

expert

Choosing Between Pig and Hive for a Task

You have a large dataset with complex data transformations involving multiple steps and custom functions. You want to write scripts that allow step-by-step data manipulation and debugging. Which tool is more suitable?

A. Pig, because it supports procedural scripts with stepwise transformations and custom functions

B. Hive, because it supports SQL-like queries and is easier for analysts

C. Hive, because it is faster for all types of data processing

D. Pig, because it only supports simple queries and no custom functions
