
HDFS read and write operations in Hadoop

Introduction

We use HDFS read and write operations to store and access large data files reliably across the many machines of a Hadoop cluster.

When you want to save large data files from your program to a distributed storage system.
When you need to read big data files stored in HDFS for analysis or processing.
When you want to move data between your local computer and the Hadoop cluster.
When you want to check the contents of a file stored in HDFS.
When you want to append new data to an existing file in HDFS.
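The last three situations are often handled from the command line rather than from a program. A minimal sketch using the standard hdfs dfs shell (the file names are illustrative; a running cluster is assumed):

```shell
# Copy a local file into HDFS (move data from your computer to the cluster)
hdfs dfs -put localdata.txt /user/hadoop/localdata.txt

# Print the contents of a file stored in HDFS
hdfs dfs -cat /user/hadoop/localdata.txt

# Append a local file's contents to an existing HDFS file
hdfs dfs -appendToFile moredata.txt /user/hadoop/localdata.txt

# Copy a file from HDFS back to the local machine
hdfs dfs -get /user/hadoop/localdata.txt copy.txt
```

The Java API shown below does the same work programmatically; the shell is convenient for one-off transfers and quick checks.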
Syntax
Java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

public class HDFSReadWrite {
    public static void main(String[] args) throws Exception {
        Configuration configuration = new Configuration();
        FileSystem hdfs = FileSystem.get(configuration);

        // Write operation: create() opens a new file, overwriting any existing one
        Path writePath = new Path("/user/hadoop/example.txt");
        try (OutputStream outputStream = hdfs.create(writePath)) {
            String data = "Hello HDFS!";
            outputStream.write(data.getBytes(StandardCharsets.UTF_8));
        }

        // Read operation: open() returns an input stream over the file's contents
        Path readPath = new Path("/user/hadoop/example.txt");
        try (BufferedReader bufferedReader = new BufferedReader(
                new InputStreamReader(hdfs.open(readPath), StandardCharsets.UTF_8))) {
            String line;
            while ((line = bufferedReader.readLine()) != null) {
                System.out.println(line);
            }
        }
    }
}

The FileSystem object connects to HDFS and lets you read or write files.

Use hdfs.create(path) to write a new file or overwrite an existing one.

Examples
This creates a new file and writes data to it.
Java
// Writing to HDFS when file does not exist
Path newFilePath = new Path("/user/hadoop/newfile.txt");
try (OutputStream outputStream = hdfs.create(newFilePath)) {
    outputStream.write("Data for new file".getBytes());
}
This reads the first line from the existing file.
Java
// Reading from HDFS when file exists
Path existingFilePath = new Path("/user/hadoop/newfile.txt");
try (BufferedReader reader = new BufferedReader(new InputStreamReader(hdfs.open(existingFilePath)))) {
    String line = reader.readLine();
    System.out.println(line);
}
If the file is empty, readLine() returns null on the first call.
Java
// Reading from HDFS when file is empty
Path emptyFilePath = new Path("/user/hadoop/emptyfile.txt");
try (BufferedReader reader = new BufferedReader(new InputStreamReader(hdfs.open(emptyFilePath)))) {
    String line = reader.readLine();
    System.out.println(line); // prints null
}
This overwrites the existing file with new data.
Java
// Writing to HDFS overwrites existing file
Path overwritePath = new Path("/user/hadoop/newfile.txt");
try (OutputStream outputStream = hdfs.create(overwritePath)) {
    outputStream.write("Overwritten data".getBytes());
}
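The examples above create or overwrite files. HDFS can also add data to the end of an existing file through FileSystem.append(). A minimal sketch, reusing the hdfs FileSystem object from the syntax section (the cluster must have append support enabled, which is the default in modern Hadoop):

```java
// Appending to HDFS adds data to the end of an existing file
// without rewriting what is already there
Path appendPath = new Path("/user/hadoop/newfile.txt");
try (OutputStream outputStream = hdfs.append(appendPath)) {
    outputStream.write(" plus appended data".getBytes());
}
```

Note that append() fails if the file does not already exist, so use create() for the first write and append() only afterwards.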
Sample Program

This program writes a text string to a file in HDFS and then reads it back to print on the screen.

Java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

public class HDFSReadWriteExample {
    public static void main(String[] args) throws Exception {
        Configuration configuration = new Configuration();
        FileSystem hdfs = FileSystem.get(configuration);

        Path filePath = new Path("/user/hadoop/sample.txt");

        // Write a text string to the HDFS file, creating or overwriting it
        System.out.println("Writing data to HDFS file...");
        try (OutputStream outputStream = hdfs.create(filePath)) {
            String dataToWrite = "Welcome to HDFS read and write operations!";
            outputStream.write(dataToWrite.getBytes(StandardCharsets.UTF_8));
        }

        // Read the file back and print each line
        System.out.println("Reading data from HDFS file...");
        try (BufferedReader bufferedReader = new BufferedReader(
                new InputStreamReader(hdfs.open(filePath), StandardCharsets.UTF_8))) {
            String line;
            while ((line = bufferedReader.readLine()) != null) {
                System.out.println(line);
            }
        }
    }
}
Output
Writing data to HDFS file...
Reading data from HDFS file...
Welcome to HDFS read and write operations!
Important Notes

Time complexity: read and write times grow linearly with file size and are bounded by network and disk throughput.

Space complexity: Uses memory for buffers during read/write.

Common mistake: forgetting to close an output stream. HDFS does not finalize a file until its stream is closed, so buffered data can be lost; try-with-resources, as in the examples above, closes streams automatically.

Use the write operation to save or overwrite files; use the read operation to access stored data.

Summary

HDFS read and write operations let you store and access big data files across many machines.

Writing creates or overwrites files; reading fetches file contents line by line.

Always close streams to avoid errors and data loss.