
Sqoop for database imports in Hadoop - Step-by-Step Execution

Concept Flow - Sqoop for database imports
Start Sqoop Import Command
Connect to Database
Run SQL Query to Extract Data
Convert Data to Hadoop Format
Store Data in HDFS
Import Complete
Sqoop imports data by connecting to a database, extracting data with SQL, converting it, and storing it in Hadoop.
Execution Sample
sqoop import \
  --connect jdbc:mysql://dbhost/dbname \
  --username user \
  --password pass \
  --table employees \
  --target-dir /user/hadoop/employees
This command imports the 'employees' table from MySQL into the /user/hadoop/employees directory in HDFS. Note that --password puts the password in plain text on the command line; --password-file or -P is safer in practice.
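The same import can be wrapped in a small, reusable script. This is a sketch: dbhost, dbname, the password-file path, and the split column `id` are placeholder assumptions, while --password-file, --split-by, and -m are standard Sqoop import options.

```shell
#!/bin/sh
# Sketch of a reusable import wrapper. dbhost, dbname, and the
# password-file path are placeholders, not values from a real cluster.
run_employees_import() {
  # --password-file keeps the password out of the command line and shell
  # history; --split-by (assuming "id" is a numeric primary key) together
  # with -m 4 lets Sqoop fetch with four parallel map tasks instead of one.
  sqoop import \
    --connect jdbc:mysql://dbhost/dbname \
    --username user \
    --password-file /user/hadoop/.db_password \
    --table employees \
    --split-by id \
    -m 4 \
    --target-dir /user/hadoop/employees
}
```

Splitting on a numeric key is what lets Sqoop divide the SELECT into ranges, one per mapper, so the import scales with -m.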
Execution Table
| Step | Action | Details | Result |
|------|--------|---------|--------|
| 1 | Start Sqoop Import | Run sqoop import command | Sqoop process begins |
| 2 | Connect to Database | Connect to jdbc:mysql://dbhost/dbname | Connection established |
| 3 | Authenticate | Use username and password | Authentication successful |
| 4 | Run SQL Query | SELECT * FROM employees | Data rows fetched |
| 5 | Convert Data | Convert rows to Hadoop format (e.g., text files) | Data converted |
| 6 | Store Data | Write data to /user/hadoop/employees in HDFS | Data stored in HDFS |
| 7 | Import Complete | Close connections and finish | Import finished successfully |
💡 Import finishes after data is stored in HDFS and connections close
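After Step 6, the result can be confirmed from the Hadoop side. A minimal sketch, assuming Sqoop's default text output (one part-m-NNNNN file per mapper, plus a _SUCCESS marker) and the target directory used above:

```shell
# Sketch: confirm the import landed in HDFS. The directory matches the
# --target-dir above; part-file names assume Sqoop's default text output.
check_import() {
  # List the imported files (expect _SUCCESS plus part-m-* files)
  hdfs dfs -ls /user/hadoop/employees
  # Count imported rows across all part files
  hdfs dfs -cat /user/hadoop/employees/part-m-* | wc -l
}
```

The row count printed here should match the number of rows in the source employees table if the import completed cleanly.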
Variable Tracker
| Variable | Start | After Step 2 | After Step 4 | After Step 6 | Final |
|----------|-------|--------------|--------------|--------------|-------|
| Connection | None | Connected | Connected | Connected | Closed |
| Data Rows | None | None | Fetched | Stored | Stored |
| HDFS Directory | Empty | Empty | Empty | /user/hadoop/employees | /user/hadoop/employees |
Key Moments - 3 Insights
Why does Sqoop need a database connection before importing?
Sqoop must connect to the database (see Step 2 in the Execution Table) to access and extract the data.
What happens to the data after it is fetched from the database?
After fetching (Step 4), Sqoop converts the data into a Hadoop-compatible format (Step 5) before storing it in HDFS (Step 6).
Why is the target directory important in Sqoop import?
The target directory (tracked in the Variable Tracker) is where the imported data is saved in HDFS for Hadoop to use.
Visual Quiz - 3 Questions
Test your understanding
Looking at the Execution Table, what is the result after Step 3?
A. Authentication successful
B. Data rows fetched
C. Connection established
D. Import finished successfully
💡 Hint
Check the 'Result' column for Step 3 in the Execution Table.
According to the Variable Tracker, what is the state of 'Data Rows' after Step 6?
A. None
B. Stored
C. Converted
D. Fetched
💡 Hint
Look at the 'Data Rows' row under 'After Step 6' in the Variable Tracker.
If the database connection fails at Step 2, what will happen to the import process?
A. Data will be fetched anyway
B. An empty dataset will be stored in HDFS
C. The import will stop before fetching data
D. The import will complete successfully
💡 Hint
Refer to Steps 2 and 4 in the Execution Table to understand the flow dependency.
Concept Snapshot
Sqoop Import Syntax:
sqoop import --connect <jdbc_url> --username <user> --password <pass> --table <table_name> --target-dir <hdfs_path>

Behavior:
Connects to DB, extracts data, converts it, stores in HDFS.

Key Rule:
The target directory must not already exist in HDFS; the import fails if it does (pass --delete-target-dir to remove it first).

Import stops if DB connection fails.
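Because Sqoop's underlying MapReduce job refuses to write into an existing output directory, a re-run needs the directory cleared first. One way to script a repeatable import, using the standard --delete-target-dir option (connection details are the same placeholders used earlier):

```shell
# Sketch: a re-runnable import. --delete-target-dir removes the HDFS
# directory before the job writes, so a second run does not fail with
# an "output directory already exists" error. Connection details are
# placeholders, not values from a real cluster.
reimport_employees() {
  sqoop import \
    --connect jdbc:mysql://dbhost/dbname \
    --username user \
    --password-file /user/hadoop/.db_password \
    --table employees \
    --delete-target-dir \
    --target-dir /user/hadoop/employees
}
```

This makes the import idempotent for full refreshes; for append-style loads, Sqoop's incremental import options are the alternative.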
Full Transcript
Sqoop imports data from a database into Hadoop by running a command that connects to the database using JDBC. It authenticates with username and password, then runs a SQL query to fetch data from the specified table. The data is converted into a Hadoop-friendly format and saved into a target directory in HDFS. The process ends by closing connections. Variables like connection status, data rows, and HDFS directory change state step-by-step during the import. If the connection fails, the import stops early. The target directory is where the data is stored for Hadoop to use.