NumPydata~5 mins

Memory-mapped files with np.memmap in NumPy - Time & Space Complexity

Choose your learning style9 modes available

Learn Why Deep Visual Try Challenge Project Recall Time

Time Complexity: Memory-mapped files with np.memmap

O(k)

Understanding Time Complexity

When working with large data files, memory-mapped files let us access data without loading it all at once.

We want to know how the time to access data grows as the file size increases.

Scenario Under Consideration

Analyze the time complexity of the following code snippet.

import numpy as np

# Create a memory-mapped file for a large array
filename = 'large_array.dat'
shape = (10000, 10000)

# Open memmap in read-write mode
mmap_array = np.memmap(filename, dtype='float64', mode='w+', shape=shape)

# Access a single element
value = mmap_array[5000, 5000]

# Modify a slice
mmap_array[1000:2000, 1000:2000] = 1.0

# Flush changes to disk
mmap_array.flush()

This code creates a large memory-mapped array, reads one element, modifies a slice, and saves changes.

Identify Repeating Operations

Look at what repeats or takes time in this code.

Primary operation: Accessing and modifying parts of the large array stored on disk.
How many times: Reading one element is a single operation; modifying a slice touches many elements (here 1000 x 1000 = 1,000,000 elements).

How Execution Grows With Input

Accessing one element stays quick no matter the file size.

Modifying a slice grows with the number of elements changed.

Input Size (n x n)	Approx. Operations for Slice Modification
2,000 x 2,000	1,000,000
10,000 x 10,000	1,000,000
50,000 x 50,000	1,000,000

Pattern observation: The time to modify stays roughly constant regardless of the total file size, growing linearly with the number of elements changed (area of the slice).

Final Time Complexity

Time Complexity: O(k)

This means the time grows linearly with the number of elements accessed or modified, not the total file size.

Common Mistake

[X] Wrong: "Accessing any part of a memory-mapped file always takes time proportional to the whole file size."

[OK] Correct: Memory-mapped files let you access small parts quickly without reading the entire file, so time depends on how much you touch, not the full size.

Interview Connect

Understanding how memory-mapped files work helps you handle big data efficiently, a useful skill in many data science tasks.

Self-Check

What if we changed the slice size to the entire array? How would the time complexity change?