
External vs Managed Tables in Hadoop: Hands-On Comparison

📖 Scenario: You are working with Hadoop and Hive to manage data tables. You want to understand the difference between external and managed tables by creating examples of each type.
🎯 Goal: Create one managed table and one external table in Hive. Then, query both tables to see their contents.
📋 What You'll Learn
Create a managed table called managed_employees with columns id (int) and name (string).
Create an external table called external_employees with the same columns, pointing to a specific HDFS location /user/hive/external_employees.
Insert sample data into the managed table.
Query both tables to display their data.
💡 Why This Matters
🌍 Real World
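Data engineers choose between the two table types to control who owns the data. Hive owns the files of a managed table and deletes them when the table is dropped. Dropping an external table removes only the table definition and leaves the files in HDFS.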
💼 Career
Understanding table types is essential for managing data lifecycle and storage in big data jobs.
Step 1: Create a managed table
Write a Hive query to create a managed table called managed_employees with columns id (int) and name (string).
Hint: A managed table is created with CREATE TABLE, without the EXTERNAL keyword.
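
One way to write this step is sketched below. The IF NOT EXISTS guard is an addition so the statement can be rerun safely, and the storage format is left at the Hive default because the exercise does not specify one.

```sql
-- Managed table: no EXTERNAL keyword and no LOCATION clause,
-- so Hive keeps the data under its warehouse directory and owns the files.
CREATE TABLE IF NOT EXISTS managed_employees (
  id   INT,
  name STRING
);
```

Because Hive owns the files, dropping this table also removes its data.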

Step 2: Create an external table
Write a Hive query to create an external table called external_employees with columns id (int) and name (string), stored at HDFS location /user/hive/external_employees.
Hint: Use CREATE EXTERNAL TABLE and specify the LOCATION for external tables.
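
A possible form of the statement is sketched below. The comma-delimited text format is an assumption made for illustration. The exercise fixes only the table name, the columns, and the location.

```sql
-- External table: Hive records only the metadata and reads whatever
-- files sit under LOCATION.
-- Assumed for this sketch: comma-delimited text files (ROW FORMAT, STORED AS).
CREATE EXTERNAL TABLE IF NOT EXISTS external_employees (
  id   INT,
  name STRING
)
ROW FORMAT DELIMITED
  FIELDS TERMINATED BY ','
STORED AS TEXTFILE
LOCATION '/user/hive/external_employees';
```

Dropping this table removes only the metastore entry. The files under /user/hive/external_employees stay in HDFS.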

Step 3: Insert data into the managed table
Write a Hive query to insert two rows into the managed_employees table: (1, 'Alice') and (2, 'Bob').
Hint: Use INSERT INTO TABLE with VALUES to add rows.
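
A minimal sketch of the insert, assuming the managed_employees table from step 1 already exists:

```sql
-- Multi-row VALUES insert (available since Hive 0.14).
INSERT INTO TABLE managed_employees VALUES (1, 'Alice'), (2, 'Bob');
```

Hive runs this as a job that writes new data files into the table's warehouse directory, so it can take noticeably longer than the same statement on a traditional database.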

Step 4: Query both tables
Write two Hive queries that select all rows from the managed_employees and external_employees tables and print the results.
Hint: Use SELECT * FROM table_name to see all rows in a table.
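
The two queries can be sketched as follows. The DESCRIBE FORMATTED statements are an optional addition, not part of the exercise, included as one way to confirm each table's type.

```sql
-- Show the rows of each table.
SELECT * FROM managed_employees;
SELECT * FROM external_employees;

-- Optional: the "Table Type" field in the output reads
-- MANAGED_TABLE or EXTERNAL_TABLE.
DESCRIBE FORMATTED managed_employees;
DESCRIBE FORMATTED external_employees;
```

The first query should return the two rows inserted in step 3. The second returns no rows unless data files already exist under /user/hive/external_employees, since the exercise never inserts into the external table.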