Pandas: How to Convert DataFrame to NumPy Array
Use the pandas DataFrame method .to_numpy() to convert a DataFrame to a NumPy array, like array = df.to_numpy().
Examples
Input
df = pd.DataFrame({'A': [1, 2], 'B': [3, 4]})
Output
[[1 3]
 [2 4]]
Input
df = pd.DataFrame({'X': [5.5, 6.5], 'Y': [7.5, 8.5]})
Output
[[5.5 7.5]
 [6.5 8.5]]
Input
df = pd.DataFrame({'A': [1, None], 'B': [None, 4]})
Output
[[ 1. nan]
 [nan  4.]]
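The last example shows that missing values come out as nan, which forces the whole array to a floating-point dtype. If a different placeholder is wanted, the following is a minimal sketch, assuming a pandas version recent enough to support the na_value argument of to_numpy(). The fill value 0.0 is chosen only for illustration.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, None], 'B': [None, 4]})

# Default conversion: None becomes nan, so the array dtype is float64
array = df.to_numpy()
print(array.dtype)

# Replace missing entries during conversion.
# copy=True requests an array that does not share memory with df.
filled = df.to_numpy(dtype=float, na_value=0.0, copy=True)
print(filled)
```

Choosing the fill value is a modelling decision; 0.0 can distort sums or averages, so nan is often the safer default.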
How to Think About It
To convert a DataFrame to a NumPy array, think of extracting the raw data without the row and column labels. The .to_numpy() method returns the data as a plain array, which is useful for numerical operations or when working with libraries that require arrays.
Algorithm
1. Get the pandas DataFrame as input.
2. Call the .to_numpy() method on the DataFrame.
3. Return the resulting NumPy array.
Code
pandas
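import pandas as pd

# Build a small DataFrame with two integer columns
df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6]})

# Convert it to a NumPy array (labels are dropped)
array = df.to_numpy()
print(array)
Output
[[1 4]
 [2 5]
 [3 6]]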
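The code above uses columns of a single type. When column types differ, to_numpy() has to choose one dtype that can hold every column. The sketch below illustrates this with hypothetical column names (count, price) that are not part of the original example.

```python
import pandas as pd

# Hypothetical mixed-type data: one integer column, one float column
df = pd.DataFrame({'count': [1, 2, 3], 'price': [0.5, 1.5, 2.5]})

# The whole array gets a single dtype able to hold both columns
array = df.to_numpy()
print(array.dtype)
print(array)

# Converting a single column keeps that column's own dtype
counts = df['count'].to_numpy()
print(counts.dtype)
```

If a DataFrame also contains text columns, the common dtype is typically object, which is slower for numerical work, so selecting the numeric columns first is usually worthwhile.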
Dry Run
Let's trace converting a simple DataFrame to a NumPy array.
1. Create DataFrame
df = pd.DataFrame({'A': [1, 2], 'B': [3, 4]})
2. Call to_numpy()
array = df.to_numpy()
3. Resulting array
[[1 3]
 [2 4]]
| A | B |
|---|---|
| 1 | 3 |
| 2 | 4 |
Why This Works
Step 1: DataFrame stores data with labels
A pandas DataFrame holds data in rows and columns with labels for both.
Step 2: to_numpy() extracts raw data
The .to_numpy() method removes labels and returns just the data as a NumPy array.
Step 3: Result is easy for numerical work
The resulting array can be used in calculations or passed to libraries that need arrays.
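The three steps above can be checked directly. This is a small sketch showing that the result is a plain NumPy array without labels and that ordinary NumPy operations apply to it.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6]})
array = df.to_numpy()

# The labels stay on the DataFrame; the array only has positions
print(list(df.columns))        # ['A', 'B']
print(type(array).__name__)    # ndarray
print(array.shape)             # (3, 2): rows, columns

# Ordinary NumPy operations work on the result
col_means = array.mean(axis=0)  # mean of each column
scaled = array * 10             # element-wise arithmetic
print(col_means)
print(scaled)
```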
Alternative Approaches
Using .values attribute
pandas
import pandas as pd

df = pd.DataFrame({'A': [1, 2], 'B': [3, 4]})

# .values is an attribute, so there are no parentheses
array = df.values
print(array)
.values is the older way and returns the same data, but .to_numpy() is preferred for clarity and future compatibility.
Using numpy.array() on DataFrame
pandas
import pandas as pd
import numpy as np

df = pd.DataFrame({'A': [1, 2], 'B': [3, 4]})

# Let NumPy build the array from the DataFrame
array = np.array(df)
print(array)
np.array(df) also converts the DataFrame to a NumPy array, but it is less explicit than .to_numpy().
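To confirm that the alternatives agree with .to_numpy(), here is a short sketch comparing the three results on the same small DataFrame.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'A': [1, 2], 'B': [3, 4]})

via_method = df.to_numpy()     # recommended
via_attribute = df.values      # older attribute
via_numpy = np.array(df)       # NumPy constructor

# All three hold the same values in the same layout
print(np.array_equal(via_method, via_attribute))
print(np.array_equal(via_method, via_numpy))
```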
Complexity: O(n*m) time, O(n*m) space
Time Complexity
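In the general case (for example, when columns have different dtypes), converting a DataFrame to a NumPy array copies every element, so time grows with the number of rows (n) times columns (m). With the default copy=False, pandas may instead return a view of existing data when all columns share one dtype, which avoids the copy.
Space Complexity
When a copy is made, a new array is created to hold the data, so space also grows with n*m.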
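Whether memory is shared can be checked directly. This is a minimal sketch, assuming the copy parameter of to_numpy(), which asks pandas for an array that does not share memory with the DataFrame. Whether the default call returns a view depends on the pandas version and the column dtypes, so the sketch makes no claim about it.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6]})

default_array = df.to_numpy()      # may or may not share memory with df
copied = df.to_numpy(copy=True)    # requested as an independent copy

# The explicit copy never overlaps with the default result
print(np.shares_memory(default_array, copied))

# Changing the copy leaves the DataFrame untouched
copied[0, 0] = 99
print(df.loc[0, 'A'])
```

Using copy=True costs the full O(n*m) time and space, but makes it safe to modify the array in place.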
Which Approach is Fastest?
.to_numpy() and .values are similar in speed; using np.array() may add slight overhead.
| Approach | Time | Space | Best For |
|---|---|---|---|
| .to_numpy() | O(n*m) | O(n*m) | Clear, recommended method |
| .values | O(n*m) | O(n*m) | Legacy code compatibility |
| np.array(df) | O(n*m) | O(n*m) | When working directly with NumPy |
Always use .to_numpy() for clear and future-proof DataFrame to array conversion.
Beginners often try to convert a DataFrame with np.array(df.values()), which raises a TypeError because values is an attribute, not a method. df.values is already a NumPy array, so calling it with parentheses fails.
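The mistake and its fix can be reproduced in a few lines. The sketch below catches the error so the script keeps running.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2], 'B': [3, 4]})

try:
    # Wrong: values is an attribute, so the parentheses try to call an array
    array = df.values()
except TypeError as exc:
    print(f"TypeError: {exc}")

# Right: call the to_numpy() method instead
array = df.to_numpy()
print(array)
```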