
How to List Files in HDFS in Hadoop Quickly

To list files in HDFS, run hdfs dfs -ls followed by an HDFS path. The command prints the files and directories directly under that path, without descending into subdirectories.

Syntax

The basic syntax to list files in HDFS is:

  • hdfs dfs -ls <path>: Lists files and directories at the given HDFS path.
  • -ls is the command to list files.
  • <path> is the HDFS directory or file to list. If you omit it, the command lists your HDFS home directory (typically /user/<username>).
bash
hdfs dfs -ls /user/hadoop/

Example

This example lists the contents of the HDFS directory /user/hadoop/. Each output line shows the permissions, replication factor (- for directories), owner, group, size in bytes, modification date and time, and full path.

bash
hdfs dfs -ls /user/hadoop/
Output
Found 3 items
-rw-r--r--   3 hadoop supergroup       1234 2024-06-01 10:00 /user/hadoop/file1.txt
-rw-r--r--   3 hadoop supergroup       5678 2024-06-02 11:30 /user/hadoop/file2.csv
drwxr-xr-x   - hadoop supergroup          0 2024-06-03 09:45 /user/hadoop/data
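
Because the listing is plain text with fixed columns, it can be post-processed in a script. The following is a minimal sketch, assuming the default column layout from the output above (no -h option) and paths that contain no spaces. The function name hdfs_ls_sizes is hypothetical and not part of Hadoop.

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads `hdfs dfs -ls` output on stdin and prints
# "<size-in-bytes> <path>" for each entry.
# Assumption: default 8-column layout, sizes in bytes, no spaces in paths.
hdfs_ls_sizes() {
  # Entry lines have exactly 8 fields; the "Found N items" header has 3,
  # so it is skipped by the NF test.
  awk 'NF == 8 { print $5, $8 }'
}
```

A typical call would be hdfs dfs -ls /user/hadoop/ | hdfs_ls_sizes, which keeps only the size and path columns.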

Common Pitfalls

Common mistakes when listing files in HDFS include:

  • Specifying a wrong or incomplete HDFS path, which results in a "No such file or directory" error. Relative paths are resolved against your HDFS home directory, not the local working directory.
  • Running ls without the hdfs dfs prefix, which lists the local file system instead of HDFS.
  • Lacking read permission on the directory, which causes a "Permission denied" error.

Always use the full HDFS path and the hdfs dfs -ls command to avoid confusion.

bash
ls /user/hadoop/
# Wrong: This lists local files, not HDFS files

hdfs dfs -ls /wrong/path
# Wrong: Path does not exist in HDFS

hdfs dfs -ls /user/hadoop/
# Right: Correct command and path
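
One way to guard against the wrong-path mistake in a script is to check that the path exists before listing it. The sketch below uses hdfs dfs -test -e, which exits with status 0 when the path exists. The wrapper name safe_hdfs_ls and its messages are hypothetical, chosen only for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper: list an HDFS path only if it exists.
# `hdfs dfs -test -e <path>` exits with 0 when the path exists.
safe_hdfs_ls() {
  local path="${1:-}"
  if [ -z "$path" ]; then
    echo "usage: safe_hdfs_ls <hdfs-path>" >&2
    return 2
  fi
  if hdfs dfs -test -e "$path"; then
    hdfs dfs -ls "$path"
  else
    # The test can also fail when the path is not accessible to this user.
    echo "HDFS path not found (or not accessible): $path" >&2
    return 1
  fi
}
```

Calling safe_hdfs_ls /user/hadoop/ would list the directory as usual, while safe_hdfs_ls /wrong/path would print the message above and return a non-zero status that a calling script can act on.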

Quick Reference

Here is a quick cheat sheet for listing files in HDFS:

| Command | Description |
| --- | --- |
| hdfs dfs -ls /path | List files and directories at /path in HDFS |
| hdfs dfs -ls -R /path | List files recursively in all subdirectories |
| hdfs dfs -ls -h /path | List files with human-readable file sizes |
| hdfs dfs -ls / | List files in the root directory of HDFS |
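
A recursive listing can be long, so it is often summarized rather than read line by line. The following is a rough sketch, assuming sizes are plain byte counts (no -h) and the default column layout. The function name hdfs_ls_summary is hypothetical and not a Hadoop command.

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads `hdfs dfs -ls -R` output on stdin and prints
# the number of files, the number of directories, and the total file size.
# Assumption: the size column (field 5) holds plain byte counts.
hdfs_ls_summary() {
  awk '
    /^-/ { files++; bytes += $5 }   # permission string starts with "-": file
    /^d/ { dirs++ }                 # permission string starts with "d": directory
    END  {
      # %.0f is used for the byte total because some awk implementations
      # limit %d to 32-bit integers.
      printf "files=%d dirs=%d bytes=%.0f\n", files, dirs, bytes
    }
  '
}
```

It would typically be used as hdfs dfs -ls -R /path | hdfs_ls_summary.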

Key Takeaways

  • Use hdfs dfs -ls <path> to list files in HDFS directories.
  • Always specify the correct HDFS path to avoid errors or unexpected results.
  • Do not confuse the local ls command with hdfs dfs -ls.
  • Check permissions if you get "Permission denied" errors when listing files.
  • Use options like -R for recursive listing and -h for human-readable sizes.