Structured arrays and DataFrames help organize data with different types in one place. They make it easy to work with complex data like tables.
NumPy structured arrays vs pandas DataFrames
    import numpy as np
    import pandas as pd

    # Structured array creation
    structured_array = np.array(
        [(1, 'Alice', 25), (2, 'Bob', 30)],
        dtype=[('id', 'i4'), ('name', 'U10'), ('age', 'i4')],
    )

    # DataFrame creation
    data_frame = pd.DataFrame({'id': [1, 2], 'name': ['Alice', 'Bob'], 'age': [25, 30]})
Structured arrays use NumPy's dtype to define field (column) names and their fixed types.
DataFrames are from pandas and offer more features for data analysis.
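As a small sketch of what the dtype carries, the snippet below inspects the field names and per-field types of a structured array and compares them with the `dtypes` of an equivalent DataFrame. The variable names (`person_dtype`, `people`) and the sample long name are illustrative assumptions, not part of the original examples.

```python
import numpy as np
import pandas as pd

# Illustrative dtype: three named fields with fixed-size types
person_dtype = np.dtype([('id', 'i4'), ('name', 'U10'), ('age', 'i4')])
people = np.array([(1, 'Alice', 25), (2, 'Bob', 30)], dtype=person_dtype)

# Field names live on the dtype
field_names = people.dtype.names
print(field_names)

# Each field has its own fixed type; 'U10' holds at most 10 characters
print(people.dtype['name'])
print(people.dtype.itemsize)  # bytes per record: 4 + 10 * 4 + 4

# Strings longer than the field width are truncated on assignment
people['name'][0] = 'Alexandrina-Victoria'
truncated_name = str(people['name'][0])
print(truncated_name)

# A DataFrame tracks one dtype per column instead
df = pd.DataFrame({'id': [1, 2], 'name': ['Alice', 'Bob'], 'age': [25, 30]})
print(df.dtypes)
```

The fixed width is the main trade-off here: it keeps every record the same size, but text that does not fit the field is cut off without a warning.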
    import numpy as np

    # Empty structured array
    empty_structured = np.array([], dtype=[('id', 'i4'), ('name', 'U10'), ('age', 'i4')])
    print(empty_structured)
    import numpy as np

    # Structured array with one element
    one_element = np.array([(1, 'Alice', 25)], dtype=[('id', 'i4'), ('name', 'U10'), ('age', 'i4')])
    print(one_element)
    import pandas as pd

    # DataFrame with one row
    one_row_df = pd.DataFrame({'id': [1], 'name': ['Alice'], 'age': [25]})
    print(one_row_df)
    import pandas as pd

    # DataFrame with empty data
    empty_df = pd.DataFrame(columns=['id', 'name', 'age'])
    print(empty_df)
This program shows how to create a structured array, access its data, convert it to a DataFrame, and filter rows in the DataFrame.
    import numpy as np
    import pandas as pd

    # Create a structured array with 3 rows
    structured_array = np.array([
        (1, 'Alice', 25),
        (2, 'Bob', 30),
        (3, 'Charlie', 35)
    ], dtype=[('id', 'i4'), ('name', 'U10'), ('age', 'i4')])

    print('Structured Array:')
    print(structured_array)
    print()

    # Access the 'name' column from structured array
    print('Names from structured array:')
    print(structured_array['name'])
    print()

    # Convert structured array to pandas DataFrame
    data_frame = pd.DataFrame(structured_array)
    print('Converted DataFrame:')
    print(data_frame)
    print()

    # Filter DataFrame for age > 28
    filtered_df = data_frame[data_frame['age'] > 28]
    print('Filtered DataFrame (age > 28):')
    print(filtered_df)
Structured arrays are lightweight: each record is stored compactly with fixed-size fields, so they usually need less memory, but they have far fewer built-in features than DataFrames.
DataFrames provide many tools for data cleaning, filtering, and analysis but typically use more memory, especially for text columns.
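Common mistake: Calling DataFrame methods such as head() directly on a structured array raises an AttributeError, because a structured array is a plain NumPy array. Convert it with pd.DataFrame(...) first.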
Use structured arrays when you need speed and fixed types; use DataFrames for flexible data analysis.
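The notes above can be sketched in a few lines: the snippet compares the memory footprint of the same three records in both containers, then triggers the common mistake of calling a DataFrame method on a structured array. The variable names are illustrative, and the DataFrame byte count depends on the pandas version and string storage, so no exact figure is assumed for it.

```python
import numpy as np
import pandas as pd

records = np.array(
    [(1, 'Alice', 25), (2, 'Bob', 30), (3, 'Charlie', 35)],
    dtype=[('id', 'i4'), ('name', 'U10'), ('age', 'i4')],
)

# Memory: every record takes a fixed 48 bytes (4 + 10 * 4 + 4)
array_bytes = records.nbytes
print('structured array bytes:', array_bytes)

df = pd.DataFrame(records)
# deep=True also counts the contents of object-backed string columns
frame_bytes = int(df.memory_usage(deep=True).sum())
print('DataFrame bytes:', frame_bytes)

# Common mistake: a NumPy array has no DataFrame methods such as head()
error_message = None
try:
    records.head()
except AttributeError as exc:
    error_message = str(exc)
    print('AttributeError:', error_message)

# Boolean masks work on both, so simple filtering needs no conversion
older = records[records['age'] > 28]
print(older['name'])
```

For anything beyond indexing and masking, such as grouping or handling missing values, converting to a DataFrame first is usually the simpler route.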
Structured arrays store data with named fields and fixed types using NumPy.
DataFrames are more powerful tables from pandas with many analysis features.
You can convert between structured arrays and DataFrames to use the best of both.
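A round trip between the two could look like the sketch below. It uses `DataFrame.to_records` and `DataFrame.from_records` from pandas; the `column_dtypes` mapping to `'U10'` is an assumption chosen to match the earlier examples, since text columns otherwise usually come back as generic Python objects.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'id': [1, 2], 'name': ['Alice', 'Bob'], 'age': [25, 30]})

# DataFrame -> record array; index=False leaves the row index out
rec = df.to_records(index=False)
print(rec.dtype.names)

# Optionally request a fixed-width string field for the text column
typed = df.to_records(index=False, column_dtypes={'name': 'U10'})
print(typed.dtype['name'])

# Record array -> DataFrame; field names become column names
df_back = pd.DataFrame.from_records(rec)
print(df_back)
```

The result of `to_records` is a NumPy record array, which supports the same field indexing as a structured array (for example `rec['age']`).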