
Pandas: How to Convert DataFrame to NumPy Array

Use the pandas DataFrame method .to_numpy() to convert a DataFrame to a NumPy array, like array = df.to_numpy().
📋

Examples

Input: df = pd.DataFrame({'A': [1, 2], 'B': [3, 4]})
Output: [[1 3] [2 4]]
Input: df = pd.DataFrame({'X': [5.5, 6.5], 'Y': [7.5, 8.5]})
Output: [[5.5 7.5] [6.5 8.5]]
Input: df = pd.DataFrame({'A': [1, None], 'B': [None, 4]})
Output: [[ 1. nan] [nan  4.]]
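The third example returns floats even though the inputs were integers. A minimal sketch of why, assuming a standard pandas install where None in a numeric column is stored as NaN (the variable name mask is only for illustration):

```python
import numpy as np
import pandas as pd

# None in a numeric column is stored as NaN, so both columns become float64.
df = pd.DataFrame({'A': [1, None], 'B': [None, 4]})
array = df.to_numpy()

print(array.dtype)        # float64

# Boolean mask that is True wherever the DataFrame had a missing value.
mask = np.isnan(array)
print(mask)
```

If integers are needed downstream, the missing values have to be filled or dropped before converting.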
🧠

How to Think About It

To convert a DataFrame to a NumPy array, think of extracting the raw data without the row and column labels. The .to_numpy() method returns the data as a plain array, which is useful for numerical operations or when working with libraries that require arrays.
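As a rough illustration of "data without labels", the sketch below builds a DataFrame with a custom index and checks what survives the conversion. The index labels 'row1' and 'row2' are made up for this example.

```python
import numpy as np
import pandas as pd

# 'row1' and 'row2' are hypothetical index labels used only for illustration.
df = pd.DataFrame({'A': [1, 2], 'B': [3, 4]}, index=['row1', 'row2'])
array = df.to_numpy()

print(type(array))   # <class 'numpy.ndarray'>
print(array.shape)   # (2, 2): same number of rows and columns as the DataFrame
print(array[0])      # first row, accessed by position because labels are gone
```

Row and column labels stay available on the DataFrame as df.index and df.columns if they are needed later.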
📐

Algorithm

1. Get the pandas DataFrame as input.
2. Call the .to_numpy() method on the DataFrame.
3. Return the resulting NumPy array.
💻

Code

import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6]})
array = df.to_numpy()
print(array)
Output
[[1 4]
 [2 5]
 [3 6]]
🔍

Dry Run

Let's trace converting a simple DataFrame to a NumPy array.

Step 1: Create DataFrame

df = pd.DataFrame({'A': [1, 2], 'B': [3, 4]})

Step 2: Call to_numpy()

array = df.to_numpy()

Step 3: Resulting array

[[1 3] [2 4]]

A | B
1 | 3
2 | 4
💡

Why This Works

Step 1: DataFrame stores data with labels

A pandas DataFrame holds data in rows and columns with labels for both.

Step 2: to_numpy() extracts raw data

The .to_numpy() method removes labels and returns just the data as a NumPy array.

Step 3: Result is easy for numerical work

The resulting array can be used in calculations or passed to libraries that need arrays.
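To make the "numerical work" point concrete, here is a small sketch that runs ordinary NumPy reductions on the converted array. The names col_means and row_sums are chosen for this example only.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6]})
array = df.to_numpy()

# axis=0 reduces down the rows, giving one value per column.
col_means = array.mean(axis=0)

# axis=1 reduces across the columns, giving one value per row.
row_sums = array.sum(axis=1)

print(col_means)
print(row_sums)
```

Because the labels are gone, the columns are identified only by position: index 0 is 'A' and index 1 is 'B'.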

🔄

Alternative Approaches

Using .values attribute
import pandas as pd

df = pd.DataFrame({'A': [1, 2], 'B': [3, 4]})
array = df.values
print(array)
This is the older approach. It returns the same array here, but .to_numpy() is preferred for clarity and future compatibility.
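One practical difference is that .to_numpy() is a method and so accepts arguments, while .values is a plain attribute. The sketch below uses the dtype and na_value parameters of DataFrame.to_numpy(); it assumes a pandas version recent enough to support na_value on DataFrames.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'A': [1, None], 'B': [None, 4]})

# Request a float64 array and replace missing values with 0.0 in one call.
# .values offers no equivalent options because it takes no arguments.
filled = df.to_numpy(dtype='float64', na_value=0.0)
print(filled)
```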
Using numpy.array() on DataFrame
import pandas as pd
import numpy as np

df = pd.DataFrame({'A': [1, 2], 'B': [3, 4]})
array = np.array(df)
print(array)
This also converts the DataFrame to a NumPy array, but it is less explicit than .to_numpy().
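For a simple numeric DataFrame, the three approaches should give the same result. A quick check, with variable names chosen only for this sketch:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'A': [1, 2], 'B': [3, 4]})

from_method = df.to_numpy()   # recommended method
from_attr = df.values         # older attribute
from_numpy = np.array(df)     # NumPy constructor

# array_equal compares both shape and element values.
same = np.array_equal(from_method, from_attr) and np.array_equal(from_method, from_numpy)
print(same)
```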

Complexity: O(n*m) time, O(n*m) space

Time Complexity

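In the general case, converting a DataFrame to a NumPy array copies every element, so time grows with the number of rows (n) times the number of columns (m). When all columns share one dtype, pandas may be able to return a view of the existing data instead of copying it.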

Space Complexity

When a copy is made, a new array is created to hold the data, so space also grows with n*m.
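Whether a copy is actually made can depend on the column dtypes and the pandas version, so code that modifies the array should not rely on the default. A minimal sketch using the copy parameter of .to_numpy(), which guarantees an independent array:

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2], 'B': [3, 4]})

# copy=True asks pandas for a new array that does not share memory with df.
array = df.to_numpy(copy=True)
array[0, 0] = 99

print(array[0, 0])      # the array holds the new value
print(df.loc[0, 'A'])   # the DataFrame still holds the original value
```

The trade-off is the extra O(n*m) memory for the copy, which is usually worth it when the array will be edited in place.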

Which Approach is Fastest?

.to_numpy() and .values are similar in speed; using np.array() may add slight overhead.

Approach | Time | Space | Best For
.to_numpy() | O(n*m) | O(n*m) | Clear, recommended method
.values | O(n*m) | O(n*m) | Legacy code compatibility
np.array(df) | O(n*m) | O(n*m) | When working directly with NumPy
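Speed differences between these approaches are best measured on your own data. The sketch below is one way to do that with the standard timeit module; the 1000 x 10 random frame and the repeat count are arbitrary assumptions, and the numbers will vary by machine and pandas version.

```python
import timeit

import numpy as np
import pandas as pd

# Hypothetical test frame: 1000 rows, 10 float columns.
df = pd.DataFrame(np.random.rand(1000, 10))

# Time 200 calls of each conversion approach.
timings = {
    '.to_numpy()': timeit.timeit(lambda: df.to_numpy(), number=200),
    '.values': timeit.timeit(lambda: df.values, number=200),
    'np.array(df)': timeit.timeit(lambda: np.array(df), number=200),
}

for name, seconds in timings.items():
    print(f'{name}: {seconds:.4f} s for 200 calls')
```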
💡
Always use .to_numpy() for clear and future-proof DataFrame to array conversion.
⚠️
Beginners often write np.array(df.values()), which raises a TypeError because values is an attribute, not a method. Drop the parentheses (df.values) or use df.to_numpy().
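The mistake is easy to reproduce. In this sketch, the try/except block and the error_message variable are only there to show the error without stopping the script:

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2], 'B': [3, 4]})

try:
    # Wrong: df.values is already an ndarray, so the parentheses try to call it.
    df.values()
except TypeError as err:
    error_message = str(err)
    print(error_message)

# Correct: access the attribute without parentheses, or call df.to_numpy().
array = df.values
print(array)
```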