How to Use HDFS DFS Commands in Hadoop: Syntax and Examples
Use hdfs dfs commands to interact with the Hadoop Distributed File System (HDFS). These commands let you list, copy, move, delete, and manage files and directories in HDFS from the command line.
Syntax
The basic syntax for hdfs dfs commands is:
hdfs dfs -command [options] [path]
Here, -command is the operation you want to perform, such as -ls to list files or -put to upload files. [options] are extra flags for the command, and [path] is the HDFS file or directory path.
bash
hdfs dfs -ls /user/hadoop
hdfs dfs -put localfile.txt /user/hadoop/
hdfs dfs -cat /user/hadoop/file.txt
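To make the role of [options] more concrete, here is a minimal sketch that pairs three commands with one commonly used flag each. The wrapper `hdfs_dfs`, the `DRY_RUN` variable, and the path `/user/hadoop/archive/logs` are assumptions introduced for illustration and are not part of Hadoop. With `DRY_RUN=1` the script only prints the commands, so it can be tried without a cluster.

```shell
#!/usr/bin/env bash
# DRY_RUN=1 (the default here) prints each command instead of running it,
# so the sketch works without a Hadoop client installed.
DRY_RUN="${DRY_RUN:-1}"

# hdfs_dfs is a hypothetical wrapper, not part of Hadoop.
hdfs_dfs() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "hdfs dfs $*"
  else
    hdfs dfs "$@"
  fi
}

# -command = -ls, option = -R (recurse into subdirectories)
hdfs_dfs -ls -R /user/hadoop
# -command = -mkdir, option = -p (create missing parent directories)
hdfs_dfs -mkdir -p /user/hadoop/archive/logs
# -command = -put, option = -f (overwrite the destination if it exists)
hdfs_dfs -put -f localfile.txt /user/hadoop/
```

Setting `DRY_RUN=0` before running the script would execute the same commands against a real cluster, assuming the hdfs client is on your PATH.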
Example
This example shows how to list files in HDFS, upload a local file, and read a file's content using hdfs dfs commands.
bash
hdfs dfs -ls /user/hadoop
hdfs dfs -put example.txt /user/hadoop/
hdfs dfs -cat /user/hadoop/example.txt
Output
Found 0 items
(example.txt content displayed here)
Common Pitfalls
Common mistakes include:
- Using local file paths instead of HDFS paths or vice versa.
- Not having proper permissions to access or modify HDFS files.
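- Forgetting to specify the full HDFS path starting with /. A relative path resolves against your HDFS home directory (typically /user/<username>), which may not be the location you intended.
- Confusing -put (upload) with -copyFromLocal, which behaves similarly but restricts the source to a local file.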
Always check your local working directory and your HDFS permissions before running commands.
bash
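# Wrong: the source is written as an HDFS path, but -put reads the source from the local file system
hdfs dfs -put /user/hadoop/example.txt /user/hadoop/
# Right: the source is a local file and the destination is an HDFS path
hdfs dfs -put example.txt /user/hadoop/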
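One way to catch the local-versus-HDFS mix-up before it reaches the cluster is to validate the arguments first. The following is a minimal sketch, not a Hadoop feature: `check_put_args` is a hypothetical helper that confirms the source exists on the local file system and that the destination is an absolute path, as recommended above.

```shell
#!/usr/bin/env bash
# check_put_args is a hypothetical helper, not part of Hadoop.
# Returns 0 when the source exists locally and the destination is absolute,
# 1 when the local source is missing, 2 when the destination is relative.
check_put_args() {
  local src="$1" dest="$2"
  if [ ! -f "$src" ]; then
    echo "local source not found: $src" >&2
    return 1
  fi
  case "$dest" in
    /*) return 0 ;;
    *)  echo "HDFS destination is not absolute: $dest" >&2
        return 2 ;;
  esac
}

# Intended use (requires a Hadoop client on PATH):
#   check_put_args example.txt /user/hadoop/ && hdfs dfs -put example.txt /user/hadoop/
```

The check is deliberately strict: relative HDFS destinations are valid, but rejecting them keeps uploads from landing in an unexpected home directory.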
Quick Reference
| Command | Description |
|---|---|
| hdfs dfs -ls /path | List files and directories in HDFS path |
| hdfs dfs -put localfile /path | Upload local file to HDFS |
| hdfs dfs -get /path localdir | Download HDFS file to local directory |
| hdfs dfs -rm /path/file | Remove file from HDFS |
| hdfs dfs -mkdir /path/dir | Create directory in HDFS |
| hdfs dfs -cat /path/file | Display contents of HDFS file |
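The commands in the table can be chained into a single round trip: create a directory, upload a file, inspect it, download it, and clean up. This is a sketch under stated assumptions: the directory `/user/hadoop/demo`, the local names `example.txt` and `downloaded.txt`, and the `step` and `roundtrip` helpers are all hypothetical. By default it only prints the commands it would run.

```shell
#!/usr/bin/env bash
# Hypothetical names used for illustration only.
HDFS_DIR="/user/hadoop/demo"
LOCAL_FILE="example.txt"
DRY_RUN="${DRY_RUN:-1}"   # 1 = print commands, 0 = execute them

# step prints or runs a single hdfs dfs command, depending on DRY_RUN.
step() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "hdfs dfs $*"
  else
    hdfs dfs "$@"
  fi
}

roundtrip() {
  step -mkdir -p "$HDFS_DIR"                            # create the target directory
  step -put "$LOCAL_FILE" "$HDFS_DIR/"                  # upload the local file
  step -ls "$HDFS_DIR"                                  # confirm it arrived
  step -cat "$HDFS_DIR/$LOCAL_FILE"                     # print its contents
  step -get "$HDFS_DIR/$LOCAL_FILE" ./downloaded.txt    # download a copy
  step -rm "$HDFS_DIR/$LOCAL_FILE"                      # remove the HDFS copy
}

roundtrip
```

Running it with `DRY_RUN=0` executes the sequence for real, so the local file must exist and you need write permission on the HDFS directory.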
Key Takeaways
- Use hdfs dfs commands to manage files in Hadoop's HDFS from the command line.
- Always specify full HDFS paths starting with / to avoid confusion.
- Check permissions before running commands to prevent access errors.
- Common commands include -ls, -put, -get, -rm, and -cat.
- Be careful to distinguish between local and HDFS file paths when using commands.