How to Create Table in Hive in Hadoop: Syntax and Example
To create a table in Hive on Hadoop, use the CREATE TABLE statement followed by the table name and the column definitions in parentheses. You can optionally specify the data format and storage location. For example, CREATE TABLE tablename (id INT, name STRING); creates a simple table.
Syntax
The basic syntax to create a table in Hive is:
- CREATE TABLE: Keyword to start table creation.
- table_name: Name of the table you want to create.
- (column_name data_type, ...): List of columns with their data types.
- ROW FORMAT and STORED AS: Optional clauses to define data format.
- LOCATION: Optional path to specify where data files are stored.
```sql
CREATE TABLE table_name (
  column1 INT,
  column2 STRING,
  column3 FLOAT
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
STORED AS TEXTFILE
LOCATION '/user/hive/warehouse/table_name';
```
Example
This example creates a simple Hive table named employees with three columns: id, name, and salary. Because no storage clauses are given, it uses the default storage format and warehouse location.
```sql
CREATE TABLE employees (
  id INT,
  name STRING,
  salary FLOAT
);
```
Output
OK
Time taken: 1.234 seconds
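To confirm that the table was created as intended, a quick check along these lines may help. This is a minimal sketch that assumes the employees table from the example exists in the current database; the exact output depends on your Hive version and configuration.

```sql
-- List tables in the current database that match the given name
SHOW TABLES 'employees';

-- Show the column names and data types
DESCRIBE employees;

-- Show storage details such as location, file format, and table type
DESCRIBE FORMATTED employees;
```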
Common Pitfalls
Common mistakes when creating Hive tables include:
- Forgetting to specify data types for columns.
- Using unsupported data types.
- Not setting the correct file format or delimiter when importing data.
- Trying to create a table that already exists without IF NOT EXISTS.
Always check your syntax and data format to avoid errors.
```sql
-- Wrong: missing data types
CREATE TABLE wrong_table (
  id,
  name
);

-- Right: every column has a data type
CREATE TABLE correct_table (
  id INT,
  name STRING
);
```
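Two of the pitfalls listed above, a mismatched delimiter and a table that already exists, can be handled in one statement. The sketch below is illustrative only: the table name employees_csv and the HDFS path /tmp/employees.csv are hypothetical, and it assumes the file holds comma-separated rows in id, name, salary order.

```sql
-- IF NOT EXISTS skips creation instead of failing when the table is already there.
-- The field delimiter must match the one used in the data file.
CREATE TABLE IF NOT EXISTS employees_csv (
  id INT,
  name STRING,
  salary FLOAT
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
STORED AS TEXTFILE;

-- Hypothetical HDFS path; LOAD DATA INPATH moves the file into the table's directory
LOAD DATA INPATH '/tmp/employees.csv' INTO TABLE employees_csv;

-- Spot-check a few rows after loading
SELECT * FROM employees_csv LIMIT 5;
```

If the spot-check returns unexpected NULL values, a delimiter or data type mismatch between the table definition and the file is a likely cause.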
Quick Reference
| Clause | Description |
|---|---|
| CREATE TABLE | Starts the table creation statement |
| table_name | Name of the table to create |
| (columns) | Defines columns and their data types |
| IF NOT EXISTS | Optional, avoids error if table exists |
| ROW FORMAT | Defines how data is formatted (e.g., DELIMITED) |
| FIELDS TERMINATED BY | Specifies delimiter for fields |
| STORED AS | Specifies file format (e.g., TEXTFILE, ORC) |
| LOCATION | Specifies HDFS path for table data |
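As a rough illustration of how the clauses in this table fit together, the sketch below combines IF NOT EXISTS with a columnar file format. The table name sales_orc and its columns are hypothetical; the ROW FORMAT DELIMITED clause is left out because field delimiters describe text files rather than ORC files.

```sql
-- Hypothetical table stored in ORC format at the default warehouse location
CREATE TABLE IF NOT EXISTS sales_orc (
  order_id INT,
  customer STRING,
  amount DOUBLE
)
STORED AS ORC;
```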
Key Takeaways
Use CREATE TABLE with column names and data types to define a Hive table.
Specify data format and location for better control over data storage.
Use IF NOT EXISTS to avoid an error when the table already exists.
Always check column data types and delimiters to match your data.
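Hive tables are managed by default; adding the EXTERNAL keyword, usually together with a LOCATION clause, creates an external table whose data files Hive leaves in place when the table is dropped.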
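To illustrate the distinction between managed and external tables, the following sketch creates an external table over an existing directory. The table name employees_ext and the path /data/employees are hypothetical, and the directory is assumed to contain comma-delimited text files.

```sql
-- EXTERNAL tells Hive to leave the data files in place on DROP TABLE
CREATE EXTERNAL TABLE IF NOT EXISTS employees_ext (
  id INT,
  name STRING,
  salary FLOAT
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
STORED AS TEXTFILE
LOCATION '/data/employees';

-- The Table Type field in the output should read EXTERNAL_TABLE
DESCRIBE FORMATTED employees_ext;
```

Dropping employees_ext later would remove only the table metadata, which is why external tables suit data that other tools also read.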