How to Copy a File to HDFS in Hadoop: Simple Commands
To copy a file to HDFS in Hadoop, use the hdfs dfs -put or hdfs dfs -copyFromLocal command followed by the local file path and the target HDFS directory. For example, hdfs dfs -put /local/path/file.txt /hdfs/path/ uploads the file to HDFS.
Syntax
The basic syntax to copy a file from your local system to HDFS is:
hdfs dfs -put <local_source> <hdfs_destination>
hdfs dfs -copyFromLocal <local_source> <hdfs_destination>
Here, <local_source> is the path of the file on your local machine, and <hdfs_destination> is the directory path in HDFS where you want to copy the file.
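Both commands upload local files to HDFS. The one difference is that -copyFromLocal accepts only local file paths as the source, while -put can also read from standard input when the source is given as -.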
bash
hdfs dfs -put /local/path/file.txt /hdfs/path/
hdfs dfs -copyFromLocal /local/path/file.txt /hdfs/path/
Example
This example shows how to copy a file named data.txt from your local directory /home/user/ to the HDFS directory /user/hadoop/.
bash
hdfs dfs -put /home/user/data.txt /user/hadoop/
hdfs dfs -ls /user/hadoop/
Output
Found 1 items
-rw-r--r-- 3 hadoop supergroup 1234 2024-06-01 10:00 /user/hadoop/data.txt
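Reading the -ls listing works for a manual check. In a script, one option is hdfs dfs -test -e, which exits with status 0 when the path exists. The following is a minimal sketch, assuming the hdfs client is on your PATH; the function name hdfs_file_exists is hypothetical and not part of Hadoop.

```bash
# Hypothetical helper (not part of Hadoop): report whether a path exists in HDFS.
# Relies on `hdfs dfs -test -e`, which exits 0 if the path exists.
hdfs_file_exists() {
    local hdfs_path="$1"
    if hdfs dfs -test -e "$hdfs_path"; then
        echo "found: $hdfs_path"
    else
        echo "missing: $hdfs_path"
        return 1
    fi
}
```

With a running cluster, calling hdfs_file_exists /user/hadoop/data.txt after the upload should report the file as found and return 0, so the result can be used directly in an if statement.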
Common Pitfalls
- Copying a file to a non-existent HDFS directory causes an error. Make sure the target directory exists, or create it with hdfs dfs -mkdir.
- Incorrect file paths or typos in the command make it fail with an error such as No such file or directory, so double-check both the local path and the HDFS path.
- Permissions issues can prevent copying files; ensure you have write access to the target HDFS directory.
bash
hdfs dfs -put /home/user/data.txt /user/hadoop/nonexistent_dir/
# Error: No such file or directory

# Correct way:
hdfs dfs -mkdir -p /user/hadoop/nonexistent_dir/
hdfs dfs -put /home/user/data.txt /user/hadoop/nonexistent_dir/
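The pitfalls above can be guarded against in one small wrapper. This is a sketch under stated assumptions rather than a standard tool: the function name put_with_mkdir is hypothetical, and it assumes the hdfs client is on your PATH and that you are allowed to create the target directory.

```bash
# Hypothetical helper (not part of Hadoop): upload a local file to HDFS,
# creating the target directory first.
put_with_mkdir() {
    local local_file="$1"
    local hdfs_dir="$2"

    # Catch a mistyped local path before contacting the cluster.
    if [ ! -f "$local_file" ]; then
        echo "local file not found: $local_file" >&2
        return 1
    fi

    # -p creates missing parent directories and does not fail
    # if the directory already exists.
    hdfs dfs -mkdir -p "$hdfs_dir" || return 1

    # A non-zero exit here often points to a permissions or path problem.
    hdfs dfs -put "$local_file" "$hdfs_dir" || return 1

    echo "uploaded $local_file to $hdfs_dir"
}
```

For example, put_with_mkdir /home/user/data.txt /user/hadoop/nonexistent_dir/ runs the same two commands as the corrected example, but stops at the first step that fails and returns a non-zero status.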
Quick Reference
Here is a quick cheat sheet for copying files to HDFS:
| Command | Description |
|---|---|
| hdfs dfs -put | Copy local file to HDFS |
| hdfs dfs -copyFromLocal | Same as -put, copy local file to HDFS |
| hdfs dfs -mkdir | Create directory in HDFS |
| hdfs dfs -ls | List files in HDFS directory |
Key Takeaways
- Use hdfs dfs -put or hdfs dfs -copyFromLocal to copy files from local to HDFS.
- Ensure the target HDFS directory exists before copying files.
- Check permissions to avoid write access errors when copying files.
- Use hdfs dfs -ls to verify files are copied successfully.
- Both -put and -copyFromLocal commands work the same for uploading local files.