
Sqoop for database imports in Hadoop - Mini Project: Build & Apply

Importing Data from a Database using Sqoop
📖 Scenario: You work as a data analyst in a company that stores customer data in a MySQL database. You want to analyze this data using Hadoop tools. To do this, you need to import the customer data from the MySQL database into Hadoop's HDFS using Sqoop.
🎯 Goal: Learn, step by step, how to build a Sqoop command that imports a MySQL database table into HDFS.
📋 What You'll Learn
Use Sqoop to connect to a MySQL database
Import a specific table called customers from the database
Specify the target directory in HDFS for the imported data
Use a condition to import only customers from a specific country
Display the imported data files in HDFS
💡 Why This Matters
🌍 Real World
Companies often store data in relational databases but want to analyze it with big data tools like Hadoop. Sqoop transfers that data in bulk from the database into HDFS, so it does not have to be exported and copied by hand.
💼 Career
Data engineers and analysts use Sqoop to import data for analysis, reporting, and building data pipelines.
1
Set up the database connection details
Create variables for the database connection details: db_url with value jdbc:mysql://localhost:3306/companydb, db_user with value root, and db_password with value password123.
Need a hint?

Use three variables named exactly db_url, db_user, and db_password with the given string values.
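A minimal sketch of this step in Python (the language implied by the f-string hint later in the project). The values are the ones given above; hard-coding a password like this is only reasonable for an exercise.

```python
# Connection details for the MySQL source database (values from this step).
db_url = "jdbc:mysql://localhost:3306/companydb"  # JDBC URL: host, port, database
db_user = "root"
db_password = "password123"  # plain-text password, acceptable for practice only
```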

2
Set the target HDFS directory and table name
Create variables table_name with value customers and target_dir with value /user/hadoop/customers_data.
Need a hint?

Define table_name and target_dir variables with the exact values.
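One way this step might look, using the exact values given above:

```python
# Source table in MySQL and destination directory in HDFS.
table_name = "customers"
target_dir = "/user/hadoop/customers_data"
```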

3
Write the Sqoop import command with a condition
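Create a variable sqoop_command that stores the full Sqoop import command as a string. The command should use db_url, db_user, db_password, table_name, and target_dir, and should import only the rows where country = 'USA'. Use the format: sqoop import --connect {db_url} --username {db_user} --password {db_password} --table {table_name} --where "country='USA'" --target-dir {target_dir} --delete-target-dir. The --where option filters rows on the database side, and --delete-target-dir removes an existing target directory so the import can be rerun. Outside of an exercise, prefer Sqoop's -P (prompt) or --password-file option over putting the password on the command line.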
Need a hint?

Use an f-string to build the Sqoop command with the exact options and condition.
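A sketch of how the command string could be assembled. The variables from steps 1 and 2 are repeated here so the snippet runs on its own; splitting the string across several adjacent literals is a readability choice, not something the exercise requires.

```python
# Values from steps 1 and 2, repeated so this snippet is self-contained.
db_url = "jdbc:mysql://localhost:3306/companydb"
db_user = "root"
db_password = "password123"
table_name = "customers"
target_dir = "/user/hadoop/customers_data"

# Adjacent string literals are concatenated by Python into a single string.
# The --where part is a plain literal, so the quotes need only simple escaping.
sqoop_command = (
    f"sqoop import --connect {db_url} --username {db_user} "
    f"--password {db_password} --table {table_name} "
    "--where \"country='USA'\" "
    f"--target-dir {target_dir} --delete-target-dir"
)
```

The double quotes around the condition keep country='USA' together as one argument when a shell parses the command.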

4
Print the Sqoop command to run
Write a print statement to display the sqoop_command variable.
Need a hint?

Use print(sqoop_command) to show the full command.
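A sketch of the final step, with the earlier variables repeated so it runs on its own. The two additions after the print call go beyond what the step asks for and are assumptions about how the command might be used next: shlex.split shows how a shell would break the string into arguments, and list_command is a hypothetical variable holding the hdfs dfs -ls command that would list the imported files.

```python
import shlex

# Values from the earlier steps, repeated so this snippet is self-contained.
db_url = "jdbc:mysql://localhost:3306/companydb"
db_user = "root"
db_password = "password123"
table_name = "customers"
target_dir = "/user/hadoop/customers_data"

sqoop_command = (
    f"sqoop import --connect {db_url} --username {db_user} "
    f"--password {db_password} --table {table_name} "
    "--where \"country='USA'\" "
    f"--target-dir {target_dir} --delete-target-dir"
)

# Step 4: display the command that would be run on a machine with Sqoop installed.
print(sqoop_command)

# Optional: split the string the way a POSIX shell would, e.g. to pass to subprocess.run.
args = shlex.split(sqoop_command)

# Optional (hypothetical name): command to list the imported files in HDFS afterwards.
list_command = f"hdfs dfs -ls {target_dir}"
print(list_command)
```

Nothing here contacts MySQL or Hadoop; the snippet only builds and prints strings, so it can be checked without a cluster.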