0
0
NumPydata~5 mins

Memory-mapped files with np.memmap in NumPy

Choose your learning style9 modes available
Introduction

Memory-mapped files let you work with big data stored on disk as if it were in memory. This helps when your data is too large to fit in your computer's RAM.

You want to process a large dataset that does not fit into your computer's memory.
You want to read or write parts of a large binary file without loading the whole file.
You want to share data between different programs without copying it all into memory.
You want faster access to large arrays stored on disk without loading them fully.
Syntax
NumPy
np.memmap(filename, dtype='float32', mode='r+', offset=0, shape=None, order='C')

filename is the path to the binary file on disk.

mode controls read/write access: 'r' for read-only, 'r+' for read-write, 'w+' to create or overwrite.

Examples
This opens a 1000x1000 float32 array stored in 'data.dat' for reading.
NumPy
import numpy as np

# Open existing file for reading
mmap_array = np.memmap('data.dat', dtype='float32', mode='r', shape=(1000, 1000))
This creates a new file 'new_data.dat', writes numbers 0 to 249999, and saves it to disk.
NumPy
import numpy as np

# Create a new memmap file and write data
mmap_array = np.memmap('new_data.dat', dtype='int32', mode='w+', shape=(500, 500))
mmap_array[:] = np.arange(250000).reshape(500, 500)
mmap_array.flush()
Sample Program

This program creates a memory-mapped file, writes a 3x4 array to it, saves it, then reads it back and prints the array.

NumPy
import numpy as np

# Create a memmap file with shape (3, 4) and int32 data
filename = 'example.dat'
mmap_array = np.memmap(filename, dtype='int32', mode='w+', shape=(3, 4))

# Fill the array with values
mmap_array[:] = np.array([[1, 2, 3, 4],
                          [5, 6, 7, 8],
                          [9, 10, 11, 12]])

# Save changes to disk
mmap_array.flush()

# Open the same file for reading
mmap_read = np.memmap(filename, dtype='int32', mode='r', shape=(3, 4))

# Print the data read from the file
print(mmap_read)
OutputSuccess
Important Notes

Always call flush() to save changes from memory to disk.

Memory-mapped files work best with binary data, not text files.

Be careful with the shape and dtype to match the file's data layout.

Summary

Memory-mapped files let you handle large data on disk like arrays in memory.

Use np.memmap to create or open these files with control over reading and writing.

This helps save memory and speeds up working with big datasets.