
Sqoop for database imports in Hadoop - Practice Problems & Coding Challenges

Challenge - 5 Problems
Predict Output
intermediate
What is the output of this Sqoop import command?

Consider the following Sqoop command to import data from a MySQL database into HDFS:

sqoop import \
--connect jdbc:mysql://localhost/employees \
--username user \
--password pass \
--table employees \
--target-dir /user/hadoop/employees_data \
--num-mappers 1

What will be the result of running this command?

A. Data from the 'employees' table is imported into HDFS directory '/user/hadoop/employees_data' as SequenceFiles using one mapper.
B. Data from the 'employees' table is imported into HDFS directory '/user/hadoop/employees_data' as Avro files using multiple mappers.
C. Data from the 'employees' table is imported into HDFS directory '/user/hadoop/employees_data' as text files using one mapper.
D. The command will fail because the '--num-mappers' option cannot be set to 1.
💡 Hint

Think about the default file format and the effect of '--num-mappers 1'.
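As background for this hint, here is a sketch of Sqoop's file-format flags. The flag names are from Sqoop's documented import options; the echoed command lines are illustrative only and are not executed.

```shell
# Sqoop's import file-format flags. When no format flag is given,
# the import defaults to delimited text files; SequenceFile and
# Avro output must be requested explicitly.
DEFAULT_FORMAT="delimited text"   # what Sqoop writes with no format flag
echo "sqoop import ... --as-textfile      # delimited text (the default)"
echo "sqoop import ... --as-sequencefile  # Hadoop SequenceFiles"
echo "sqoop import ... --as-avrodatafile  # Avro data files"
echo "With no format flag, Sqoop writes: $DEFAULT_FORMAT"
```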

🧠 Conceptual
intermediate
Which option controls the number of parallel tasks in Sqoop import?

In Sqoop, you want to speed up data import by running multiple parallel tasks. Which command-line option controls how many parallel mappers Sqoop uses during import?

A. --num-mappers
B. --parallel-tasks
C. --mapper-count
D. --task-parallelism
💡 Hint

Look for the option that specifies the number of mappers.
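For reference, Sqoop's parallelism option has both a long and a short spelling. The commands below are assembled as strings for inspection only; the mapper count of 8 is an arbitrary illustrative value.

```shell
# Both spellings set the number of parallel map tasks for the import:
# --num-mappers N (long form) and -m N (short form).
LONG='sqoop import --connect jdbc:mysql://localhost/employees --table employees --num-mappers 8'
SHORT='sqoop import --connect jdbc:mysql://localhost/employees --table employees -m 8'
echo "$LONG"
echo "$SHORT"
```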

🔧 Debug
advanced
Why does this Sqoop import command fail with a connection error?

Given this command:

sqoop import \
--connect jdbc:mysql://localhost:3306/employees \
--username user \
--password pass \
--table employees \
--target-dir /user/hadoop/employees_data

The command fails with a connection refused error. What is the most likely cause?

A. The username or password is incorrect.
B. The MySQL server is not running or not reachable at localhost:3306.
C. The '--table' option is misspelled.
D. The '--target-dir' path is invalid in HDFS.
💡 Hint

Connection refused usually means the server is unreachable.
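A quick way to check this before blaming Sqoop is to probe the database port directly. This is a minimal sketch using bash's /dev/tcp redirection; on a host with no MySQL listener the probe fails the same way Sqoop's JDBC connection does.

```shell
# Probe localhost:3306 before running the import. If nothing is
# listening there, the open fails, which is the same condition that
# surfaces as "connection refused" from Sqoop's JDBC driver.
if (exec 3<>/dev/tcp/localhost/3306) 2>/dev/null; then
  STATUS="reachable"
else
  STATUS="refused"
fi
echo "localhost:3306 is $STATUS"
```

`mysqladmin ping` is another common check when the MySQL client tools are installed.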

📊 Data Output
advanced
How many files are created after this Sqoop import?

Run this Sqoop import command:

sqoop import \
--connect jdbc:mysql://localhost/employees \
--username user \
--password pass \
--table employees \
--target-dir /user/hadoop/employees_data \
--num-mappers 4

Assuming the import succeeds, how many part files will be created in the target directory?

A. 5 part files, one extra for metadata
B. 1 part file, because files are merged
C. 0 files, because data is stored in a database
D. 4 part files, one per mapper
💡 Hint

Each mapper writes one output file.
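The hint can be made concrete with a small simulation of the part-file names a Sqoop text import produces: each of the N mappers writes one file named part-m-00000 through part-m-0000(N-1). The loop below only generates the names; no HDFS is involved.

```shell
# Simulate the part-file names for an import with 4 mappers.
# Real output in the target directory: part-m-00000 .. part-m-00003
# (plus a _SUCCESS marker, which is not a part file).
NUM_MAPPERS=4
FILES=""
i=0
while [ "$i" -lt "$NUM_MAPPERS" ]; do
  FILES="$FILES part-m-$(printf '%05d' "$i")"
  i=$((i + 1))
done
echo "Files:$FILES"
```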

🚀 Application
expert
How do you import only rows matching a condition using Sqoop?

You want to import only employees with salary greater than 50000 from the 'employees' table using Sqoop. Which option should you use to filter rows during import?

A. --where "salary > 50000"
B. --condition salary > 50000
C. --filter salary > 50000
D. --query "SELECT * FROM employees WHERE salary > 50000"
💡 Hint

Look for the option that allows SQL WHERE clause filtering.
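For context, Sqoop documents two ways to filter rows during an import. Both are shown here as strings for inspection, not executed; the target directory and split column are illustrative assumptions, not from the problem.

```shell
# --where appends a WHERE clause to the query Sqoop generates for the
# table; a free-form --query import must instead embed the literal
# $CONDITIONS token so Sqoop can partition the work across mappers.
WHERE_FORM='sqoop import --connect jdbc:mysql://localhost/employees --table employees --where "salary > 50000" --target-dir /user/hadoop/high_earners'
QUERY_FORM='sqoop import --connect jdbc:mysql://localhost/employees --query "SELECT * FROM employees WHERE salary > 50000 AND \$CONDITIONS" --split-by id --target-dir /user/hadoop/high_earners'
echo "$WHERE_FORM"
echo "$QUERY_FORM"
```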