Recall & Review
beginner
What is Sqoop used for in Hadoop?
Sqoop is a tool that transfers data between Hadoop and relational databases. It is mainly used to import data from databases into Hadoop's HDFS and to export data from HDFS back to those databases.
beginner
Which command imports a full table from a database into HDFS using Sqoop?
The basic command is <code>sqoop import --connect <jdbc-connection-string> --table <table-name> --username <user> --password <password> --target-dir <hdfs-directory></code>. This imports the entire table into the specified HDFS directory.
intermediate
What does the <code>--split-by</code> option do in Sqoop import?
The <code>--split-by</code> option tells Sqoop which column to use to split the data for parallel import. Sqoop divides that column's value range among the mappers, so each mapper imports its own slice of rows at the same time, which speeds up the import.
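A minimal sketch of a parallel import, assuming a hypothetical MySQL database <code>shop</code> on <code>db.example.com</code>, a table <code>orders</code> with a numeric <code>order_id</code> column, and a password file stored in HDFS. None of these names come from the card. The script only prints the assembled command so it can be reviewed; uncomment the last line on a machine where Sqoop is installed.

```shell
# All hosts, users, tables, columns, and paths below are hypothetical placeholders.
set -- import \
  --connect "jdbc:mysql://db.example.com:3306/shop" \
  --username report_user \
  --password-file /user/report_user/.db-password \
  --table orders \
  --split-by order_id \
  --num-mappers 4 \
  --target-dir /data/shop/orders

# Print the assembled command for review.
echo "sqoop $*"

# Uncomment to run the import where Sqoop is installed:
# sqoop "$@"
```

Here <code>--num-mappers 4</code> matches Sqoop's default degree of parallelism. A split column whose values are spread evenly keeps the slices balanced across mappers.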
intermediate
How can you import only specific columns from a database table using Sqoop?
Use the <code>--columns</code> option with a comma-separated list of column names. For example, <code>--columns "id,name,age"</code> imports only those columns.
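As a sketch of how <code>--columns</code> fits into a full import command, the example below reuses the <code>id,name,age</code> list from the card and assumes a hypothetical <code>customers</code> table, connection string, user, and HDFS paths. It prints the command instead of running it.

```shell
# Connection details, table name, and paths are hypothetical placeholders.
set -- import \
  --connect "jdbc:mysql://db.example.com:3306/shop" \
  --username report_user \
  --password-file /user/report_user/.db-password \
  --table customers \
  --columns "id,name,age" \
  --target-dir /data/shop/customers_subset

# Print the assembled command for review.
echo "sqoop $*"

# Uncomment to run the import where Sqoop is installed:
# sqoop "$@"
```

The column list is passed as one quoted argument with no spaces after the commas.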
intermediate
What is the role of the <code>--where</code> clause in Sqoop import?
The <code>--where</code> clause lets you filter rows during import by specifying a SQL condition. For example, <code>--where "age > 30"</code> imports only rows where age is greater than 30.
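A sketch of the same filter inside a complete command, assuming a hypothetical <code>customers</code> table with an <code>age</code> column plus placeholder connection details and paths. The script prints the command for review instead of executing it.

```shell
# Connection details, table name, and paths are hypothetical placeholders.
set -- import \
  --connect "jdbc:mysql://db.example.com:3306/shop" \
  --username report_user \
  --password-file /user/report_user/.db-password \
  --table customers \
  --where "age > 30" \
  --target-dir /data/shop/customers_over_30

# Print the assembled command for review. The condition is passed to Sqoop
# as a single argument, although echo shows it without the quotes.
echo "sqoop $*"

# Uncomment to run the import where Sqoop is installed:
# sqoop "$@"
```

Quoting the condition matters: without the quotes the shell would treat <code>></code> as an output redirection.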
What does Sqoop primarily do?
Sqoop is designed to transfer data between Hadoop and relational databases.
Which option specifies the database table to import in Sqoop?
The --table option tells Sqoop which database table to import.
How does Sqoop speed up data import?
Sqoop uses multiple mappers and the --split-by option to parallelize data import.
Which option filters rows during import?
The --where option applies a SQL condition to filter rows during import.
To import only specific columns, which option do you use?
The --columns option lets you specify which columns to import.
Explain how Sqoop imports data from a relational database into Hadoop.
Think about the steps and options needed to move data from a database to Hadoop.
Describe how you can control which data Sqoop imports from a database table.
Consider options that limit or organize the data imported.