
Creating a DataFrame from a NumPy Array in Pandas

Introduction

We use DataFrames to organize data in tables with rows and columns. Creating a DataFrame from a NumPy array helps us turn raw numbers into a clear table format for easy analysis.

This approach is useful when:
You have numerical data in a NumPy array and want to analyze it using pandas.
You want to add row and column labels to your array data for better understanding.
You need to combine array data with other pandas tools like filtering or grouping.
You want to save or visualize your array data in a table format.
Syntax
Pandas
import pandas as pd
import numpy as np

# Create a NumPy array
data_array = np.array([[1, 2], [3, 4]])

# Create DataFrame from the array
df = pd.DataFrame(data_array, columns=['Column1', 'Column2'], index=['Row1', 'Row2'])

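The columns parameter sets the column labels. It needs one label per array column.

The index parameter sets the row labels. It needs one label per array row.

Both parameters are optional. If you leave one out, pandas uses default integer labels starting at 0.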
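To illustrate how labels behave, here is a minimal sketch that builds a DataFrame without columns or index and then assigns labels afterwards. The variable names `df_default` and `default_columns` are made up for this illustration.

```python
import numpy as np
import pandas as pd

data_array = np.array([[1, 2], [3, 4]])

# No columns or index given: pandas assigns default integer labels
df_default = pd.DataFrame(data_array)
default_columns = list(df_default.columns)
default_index = list(df_default.index)
print(df_default)

# Labels can also be assigned after construction
df_default.columns = ["Column1", "Column2"]
df_default.index = ["Row1", "Row2"]
print(df_default)
```

Assigning labels later can be handy when the names are only known after the array has been loaded.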

Examples
An empty array with shape (0, 0) produces an empty DataFrame with no rows or columns.
Pandas
import pandas as pd
import numpy as np

# Empty array
empty_array = np.array([]).reshape(0, 0)
df_empty = pd.DataFrame(empty_array)
print(df_empty)
This creates a DataFrame with one row and one column from a single number.
Pandas
import pandas as pd
import numpy as np

# One element array
one_element_array = np.array([[42]])
df_one = pd.DataFrame(one_element_array, columns=['OnlyColumn'], index=['OnlyRow'])
print(df_one)
This creates a DataFrame with two rows and three columns, showing how to label both.
Pandas
import pandas as pd
import numpy as np

# Array with multiple rows and columns
multi_array = np.array([[10, 20, 30], [40, 50, 60]])
df_multi = pd.DataFrame(multi_array, columns=['A', 'B', 'C'], index=['First', 'Second'])
print(df_multi)
Sample Program

This program shows how to turn a NumPy array of student scores into a labeled DataFrame for easier reading and analysis.

Pandas
import pandas as pd
import numpy as np

# Step 1: Create a NumPy array with sample data
sample_data = np.array([[5, 10, 15], [20, 25, 30], [35, 40, 45]])

# Step 2: Print the original NumPy array
print("Original NumPy array:")
print(sample_data)

# Step 3: Create a DataFrame from the NumPy array
# Naming columns and rows for clarity
data_frame = pd.DataFrame(sample_data, columns=['Math', 'Science', 'English'], index=['Student1', 'Student2', 'Student3'])

# Step 4: Print the DataFrame
print("\nDataFrame created from NumPy array:")
print(data_frame)
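Once the scores are labeled, the usual pandas tools work on them by name. The following is a small sketch, not part of the original program, that reuses the same sample data. The names `scores`, `student2_science`, `subject_means`, and `high_math` are chosen here for illustration.

```python
import numpy as np
import pandas as pd

scores = pd.DataFrame(
    np.array([[5, 10, 15], [20, 25, 30], [35, 40, 45]]),
    columns=["Math", "Science", "English"],
    index=["Student1", "Student2", "Student3"],
)

# Look up a single value by row label and column label
student2_science = scores.loc["Student2", "Science"]

# Average score per subject (one value per column)
subject_means = scores.mean()

# Keep only the rows where the Math score is above 10
high_math = scores[scores["Math"] > 10]

print(student2_science)
print(subject_means)
print(high_math)
```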
Important Notes

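Creating a DataFrame from a NumPy array takes at most O(n*m) time, where n is rows and m is columns, when pandas copies the data. Whether a copy is made depends on the pandas version and the copy parameter, so pass copy=True if the DataFrame must not share memory with the original array.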
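Whether pandas copies the array's values can depend on the pandas version and on the `copy` argument of `pd.DataFrame`. As a minimal sketch, passing `copy=True` explicitly makes the DataFrame independent of the source array. The names `source` and `df_copy` are illustrative.

```python
import numpy as np
import pandas as pd

source = np.array([[1, 2], [3, 4]])

# copy=True asks pandas to store its own copy of the values
df_copy = pd.DataFrame(source, columns=["A", "B"], copy=True)

# Changing the original array afterwards does not affect the DataFrame
source[0, 0] = 99
print(df_copy.loc[0, "A"])  # still the original value, 1
```

Without an explicit `copy=True`, it is safer not to assume either behavior if you plan to modify the array after building the DataFrame.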

It uses extra memory to store row and column labels besides the array data.

Common mistake: passing a different number of column names than the array has columns raises a ValueError. The same applies to index labels and rows.
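A short sketch of this mistake and one way to avoid it. The generated names `col0`, `col1` are placeholders for illustration. Deriving the label count from `arr.shape` keeps labels and data in step.

```python
import numpy as np
import pandas as pd

arr = np.array([[1, 2], [3, 4]])  # 2 rows, 2 columns

try:
    # Three names for a two-column array: pandas rejects this
    pd.DataFrame(arr, columns=["A", "B", "C"])
    error_message = None
except ValueError as exc:
    error_message = str(exc)
    print("ValueError:", error_message)

# Build one name per actual column so the counts always match
column_names = [f"col{i}" for i in range(arr.shape[1])]
df_ok = pd.DataFrame(arr, columns=column_names)
print(df_ok)
```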

Use this method when you want to add labels and use pandas features; if you only need raw numbers, NumPy arrays alone may be enough.
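If you later need the raw numbers again, `DataFrame.to_numpy()` returns the values as a NumPy array without the labels. A minimal sketch, with illustrative variable names:

```python
import numpy as np
import pandas as pd

original = np.array([[1, 2], [3, 4]])
df = pd.DataFrame(original, columns=["A", "B"], index=["Row1", "Row2"])

# Drop the labels and get the plain values back
values = df.to_numpy()
print(values)
```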

Summary

DataFrames organize array data into labeled tables.

You can add row and column names when creating a DataFrame from a NumPy array.

This helps make data easier to understand and analyze with pandas tools.