Sqoop for database imports in Hadoop - Cheat Sheet & Quick Revision

Recall & Review
beginner
What is Sqoop used for in Hadoop?
Sqoop is a tool that transfers data between Hadoop and relational databases. It is mainly used to import data from a database into HDFS and to export data from HDFS back to a database.
beginner
Which command imports a full table from a database into HDFS using Sqoop?
The command is <code>sqoop import --connect &lt;jdbc-connection-string&gt; --table &lt;table-name&gt; --username &lt;user&gt; --password &lt;password&gt; --target-dir &lt;hdfs-directory&gt;</code>. This imports the entire table into the specified HDFS directory.
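As a minimal sketch of the command above, the script below fills the placeholders with hypothetical values (the JDBC URL, table name, user, and HDFS path are all assumptions, not taken from a real system). It uses `-P`, which prompts for the password, instead of the `--password` value shown in the card, and it only prints the command so it can be reviewed before running.

```shell
#!/bin/sh
# Hypothetical placeholders: replace with your own connection details.
JDBC_URL="jdbc:mysql://db.example.com:3306/shop"
TABLE="customers"
DB_USER="etl_user"
TARGET_DIR="/user/etl/customers"

# Assemble the import arguments. -P prompts for the password at run time
# and is used here instead of passing --password on the command line.
set -- import \
  --connect "$JDBC_URL" \
  --table "$TABLE" \
  --username "$DB_USER" \
  -P \
  --target-dir "$TARGET_DIR"

# Dry run: show the command that would be executed.
IMPORT_CMD="sqoop $*"
echo "$IMPORT_CMD"

# On a machine with Sqoop installed, uncomment to run the import:
# sqoop "$@"
```

Printing the command first is a design choice for this sketch: it lets you check the connection string and target directory without touching the database.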
intermediate
What does the --split-by option do in Sqoop import?
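The <code>--split-by</code> option tells Sqoop which column to use to split the data for parallel import. Sqoop divides that column's range of values among the mappers, so each mapper imports its own slice of the table at the same time. If the option is omitted, Sqoop splits on the table's primary key.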
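The sketch below shows one way `--split-by` might be combined with `--num-mappers` to control parallelism. The table, column name, and ID range are hypothetical, and the arithmetic is only a rough illustration of how a value range is shared among mappers, not Sqoop's exact boundary calculation.

```shell
#!/bin/sh
# Hypothetical placeholders: replace with your own values.
JDBC_URL="jdbc:mysql://db.example.com:3306/shop"
TABLE="orders"
DB_USER="etl_user"
TARGET_DIR="/user/etl/orders"
SPLIT_COLUMN="order_id"   # assumed numeric and evenly distributed
NUM_MAPPERS=4

# Rough illustration only: if order_id ran from 1 to 1000, each of the
# 4 mappers would handle a slice of about this many values.
MIN_ID=1
MAX_ID=1000
VALUES_PER_MAPPER=$(( (MAX_ID - MIN_ID + 1) / NUM_MAPPERS ))
echo "Approximate values per mapper: $VALUES_PER_MAPPER"

# Assemble the parallel import arguments.
set -- import \
  --connect "$JDBC_URL" \
  --table "$TABLE" \
  --username "$DB_USER" \
  -P \
  --target-dir "$TARGET_DIR" \
  --split-by "$SPLIT_COLUMN" \
  --num-mappers "$NUM_MAPPERS"

# Dry run: show the command that would be executed.
SPLIT_CMD="sqoop $*"
echo "$SPLIT_CMD"

# On a machine with Sqoop installed, uncomment to run the import:
# sqoop "$@"
```

A column with heavily skewed values would give some mappers far more rows than others, which is why the sketch assumes an evenly distributed column.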
intermediate
How can you import only specific columns from a database table using Sqoop?
Use the <code>--columns</code> option with a comma-separated list of column names. For example, <code>--columns "id,name,age"</code> imports only those columns.
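A short sketch of a column-limited import, reusing the `id,name,age` list from the card. The connection string, table, user, and target directory are hypothetical placeholders.

```shell
#!/bin/sh
# Hypothetical placeholders: replace with your own values.
JDBC_URL="jdbc:mysql://db.example.com:3306/shop"
TABLE="customers"
DB_USER="etl_user"
TARGET_DIR="/user/etl/customers_slim"

# Column list from the card: comma-separated, no spaces.
COLUMNS="id,name,age"

# Assemble the import arguments, restricting the columns.
set -- import \
  --connect "$JDBC_URL" \
  --table "$TABLE" \
  --username "$DB_USER" \
  -P \
  --target-dir "$TARGET_DIR" \
  --columns "$COLUMNS"

# Dry run: show the command that would be executed.
COLUMNS_CMD="sqoop $*"
echo "$COLUMNS_CMD"

# On a machine with Sqoop installed, uncomment to run the import:
# sqoop "$@"
```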
intermediate
What is the role of the --where clause in Sqoop import?
The <code>--where</code> clause lets you filter rows during import by specifying a SQL condition. For example, <code>--where "age &gt; 30"</code> imports only rows where age is greater than 30.
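The following sketch applies the `age > 30` filter from the card. Because the condition contains spaces and a `>` character, it is kept in a quoted variable so the shell passes it to Sqoop as a single argument. All connection details are hypothetical placeholders.

```shell
#!/bin/sh
# Hypothetical placeholders: replace with your own values.
JDBC_URL="jdbc:mysql://db.example.com:3306/shop"
TABLE="customers"
DB_USER="etl_user"
TARGET_DIR="/user/etl/customers_over_30"

# Row filter from the card. The quotes keep it as one argument and stop
# the shell from treating ">" as an output redirection.
WHERE_CLAUSE="age > 30"

# Assemble the import arguments, with the filter last.
set -- import \
  --connect "$JDBC_URL" \
  --table "$TABLE" \
  --username "$DB_USER" \
  -P \
  --target-dir "$TARGET_DIR" \
  --where "$WHERE_CLAUSE"

# Record the argument count and the final argument for inspection.
ARG_COUNT=$#
for arg in "$@"; do LAST_ARG=$arg; done

# Dry run: print one argument per line so the quoting is visible.
printf '%s\n' sqoop "$@"

# On a machine with Sqoop installed, uncomment to run the import:
# sqoop "$@"
```

Printing one argument per line makes it easy to confirm that the whole condition arrives as a single argument rather than being split on spaces.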
What does Sqoop primarily do?
A. Processes big data inside Hadoop
B. Transfers data between Hadoop and relational databases
C. Visualizes data in Hadoop
D. Manages Hadoop cluster nodes
Which option specifies the database table to import in Sqoop?
A. --table
B. --connect
C. --target-dir
D. --split-by
How does Sqoop speed up data import?
A. By using multiple mappers with --split-by
B. By compressing data
C. By importing only metadata
D. By exporting data instead
Which option filters rows during import?
A. --columns
B. --username
C. --where
D. --target-dir
To import only specific columns, which option do you use?
A. --table
B. --split-by
C. --connect
D. --columns
Explain how Sqoop imports data from a relational database into Hadoop.
Think about the steps and options needed to move data from a database to Hadoop.
Describe how you can control which data Sqoop imports from a database table.
Consider options that limit or organize the data imported.