
How to Delete a File in HDFS in Hadoop Quickly

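To delete a file in HDFS, use the hdfs dfs -rm <file_path> command. If the HDFS trash feature is enabled, the file is moved to the user's trash directory and can be restored until the trash is purged; otherwise it is removed from the Hadoop Distributed File System permanently.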

Syntax

The basic syntax to delete a file in HDFS is:

bash
hdfs dfs -rm <file_path>

  • hdfs dfs: The Hadoop file system command interface.
  • -rm: The option to remove a file.
  • <file_path>: The full path of the file in HDFS you want to delete.

For example, hdfs dfs -rm /user/hadoop/file.txt deletes file.txt from the /user/hadoop directory.

Example

This example shows how to delete a file named example.txt located in the HDFS directory /user/hadoop/.

bash
hdfs dfs -rm /user/hadoop/example.txt
Output
Deleted /user/hadoop/example.txt
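
In a script, the exit status of the delete command can decide what happens next. The following is a minimal sketch, assuming hdfs dfs -rm exits with 0 on success and a non-zero status on failure; the function name delete_and_report is hypothetical and not part of Hadoop.

```bash
#!/usr/bin/env bash
# delete_and_report is a hypothetical helper name, not a Hadoop command.
# It runs hdfs dfs -rm and reports the result based on the exit status.
delete_and_report() {
  local path="$1"
  if hdfs dfs -rm "$path"; then
    echo "OK: removed $path"
  else
    echo "FAILED: could not remove $path" >&2
    return 1
  fi
}
```

Calling delete_and_report /user/hadoop/example.txt would run the same delete as the example above and return a non-zero status if the delete fails, so a calling script can stop early.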

Common Pitfalls

  • Deleting a file that does not exist fails with an error like rm: `/path/to/file': No such file or directory.
  • -rmr is deprecated; use -rm -r for recursive deletes, and double-check the path, because it removes a directory and everything under it.
  • Lacking the required permissions causes a permission denied error.
  • Local file system commands such as rm do not operate on HDFS; always use the hdfs dfs prefix for HDFS operations.
bash
hdfs dfs -rm /user/hadoop/nonexistentfile.txt
Output
rm: `/user/hadoop/nonexistentfile.txt': No such file or directory
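
One way to avoid the "No such file or directory" error is to test for the path first. The sketch below assumes the hdfs dfs -test -e check, which exits with 0 when the path exists; the function name hdfs_rm_if_exists is hypothetical.

```bash
#!/usr/bin/env bash
# hdfs_rm_if_exists is a hypothetical helper name, not a Hadoop command.
# It deletes an HDFS file only when the path is present.
hdfs_rm_if_exists() {
  local path="$1"
  # hdfs dfs -test -e exits with 0 if the path exists in HDFS
  if hdfs dfs -test -e "$path"; then
    hdfs dfs -rm "$path"
  else
    echo "Skipping: $path not found in HDFS" >&2
    return 1
  fi
}
```

If a missing file should simply be ignored, hdfs dfs -rm -f <file_path> is a shorter alternative: the -f option suppresses the error message and the failing exit status when the file does not exist.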

Quick Reference

Here is a quick summary of commands related to deleting files in HDFS:

| Command | Description |
|---|---|
| hdfs dfs -rm | Delete a single file from HDFS |
| hdfs dfs -rm -r | Recursively delete a directory and its contents |
| hdfs dfs -ls | List files to verify before deleting |
| hdfs dfs -rm -skipTrash | Delete file immediately without moving to trash |
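
The commands above can be combined into a "list, then delete" routine. This is a sketch under stated assumptions rather than a recommended tool: the function name hdfs_rm_dir and its second argument (yes to bypass the trash) are hypothetical.

```bash
#!/usr/bin/env bash
# hdfs_rm_dir is a hypothetical helper name, not a Hadoop command.
# Usage: hdfs_rm_dir <hdfs_dir> [yes]   (pass "yes" to bypass the trash)
hdfs_rm_dir() {
  local dir="$1"
  local skip_trash="${2:-no}"
  # Show the contents first; stop if the directory cannot be listed
  hdfs dfs -ls "$dir" || return 1
  if [ "$skip_trash" = "yes" ]; then
    # Permanent delete: nothing is moved to the trash
    hdfs dfs -rm -r -skipTrash "$dir"
  else
    hdfs dfs -rm -r "$dir"
  fi
}
```

Keeping -skipTrash behind an explicit argument means the default call can still be undone from the trash when the trash feature is enabled.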

Key Takeaways

  • Use hdfs dfs -rm <file_path> to delete a file in HDFS.
  • Check that the file exists with hdfs dfs -ls before deleting to avoid errors.
  • Use the recursive delete -rm -r only for directories, and with care.
  • Ensure you have the required permissions to delete files in HDFS.
  • Do not confuse local file commands with HDFS commands; always use the hdfs dfs prefix.