Set operations help you find common or different items between groups of data. For structured data, this means comparing rows with multiple fields.
Set operations on structured data in NumPy
Introduction
You want to find common records between two tables of data.
You need to find records in one dataset but not in another.
You want to combine two datasets without duplicates.
You want to find records that are unique to each dataset.
Syntax
NumPy
numpy.intersect1d(array1, array2)
numpy.union1d(array1, array2)
numpy.setdiff1d(array1, array2)
numpy.setxor1d(array1, array2)
These functions operate on 1D arrays, so in a 1D structured array each row (record) is treated as a single item.
Structured arrays have named fields, and two rows compare like tuples: field by field.
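As a small illustration of the two points above, the sketch below shows that a structured array of three records is one-dimensional and that its elements compare as whole records. The data and the field names 'id' and 'label' are made up for the example.

```python
import numpy as np

# Illustrative data; the field names 'id' and 'label' are assumptions.
records = np.array([(1, 'A'), (2, 'B'), (3, 'C')],
                   dtype=[('id', 'i4'), ('label', 'U1')])

# Each record is a single element, so the array is 1D with three items.
shape = records.shape
ndim = records.ndim

# Accessing a field by name returns an ordinary 1D array.
ids = records['id']

# Element-wise comparison treats each record as a whole:
# the second record differs in 'label', so it does not match.
other = np.array([(1, 'A'), (2, 'X'), (3, 'C')], dtype=records.dtype)
same = records == other

print(shape, ndim)
print(ids)
print(same)
```

Because every record counts as one element, the 1D set functions can be applied to the array directly.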
Examples
This finds rows that appear in both arrays.
NumPy
import numpy as np

# Define two structured arrays
arr1 = np.array([(1, 'A'), (2, 'B'), (3, 'C')], dtype=[('id', 'i4'), ('label', 'U1')])
arr2 = np.array([(2, 'B'), (3, 'C'), (4, 'D')], dtype=arr1.dtype)

# Find common rows
common = np.intersect1d(arr1, arr2)
print(common)
This finds rows in arr1 that are not in arr2.
NumPy
unique_to_arr1 = np.setdiff1d(arr1, arr2)
print(unique_to_arr1)
This combines both arrays without duplicates.
NumPy
all_unique = np.union1d(arr1, arr2)
print(all_unique)
This finds rows that are in one array or the other but not both.
NumPy
diff = np.setxor1d(arr1, arr2)
print(diff)
Sample Program
This program shows how to use set operations on structured arrays to find common, unique, combined, and different rows.
NumPy
import numpy as np

# Create two structured arrays with fields 'id' and 'score'
arr1 = np.array([(1, 90), (2, 85), (3, 88)], dtype=[('id', 'i4'), ('score', 'i4')])
arr2 = np.array([(2, 85), (3, 88), (4, 92)], dtype=arr1.dtype)

# Find common rows
common = np.intersect1d(arr1, arr2)
print('Common rows:')
print(common)

# Find rows unique to arr1
unique_arr1 = np.setdiff1d(arr1, arr2)
print('\nRows unique to arr1:')
print(unique_arr1)

# Combine all unique rows
all_unique = np.union1d(arr1, arr2)
print('\nAll unique rows combined:')
print(all_unique)

# Find rows in either arr1 or arr2 but not both
diff = np.setxor1d(arr1, arr2)
print('\nRows in either arr1 or arr2 but not both:')
print(diff)
Important Notes
Structured arrays compare rows as whole records, so all fields must match to be considered equal.
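Set operations return sorted results with duplicates removed; structured records are ordered field by field, starting with the first field in the dtype.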
If you want to compare only some fields, extract those fields first.
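The notes above can be sketched as follows, using made-up data. Records that share an 'id' but differ in 'score' are not equal under a whole-record comparison, while comparing only the extracted 'id' field matches them. The `return_indices=True` option of `np.intersect1d` gives the positions needed to recover the full rows; the variable names are illustrative.

```python
import numpy as np

# Illustrative data: id 2 appears in both arrays but with different scores.
dtype = [('id', 'i4'), ('score', 'i4')]
a = np.array([(3, 88), (1, 90), (2, 85)], dtype=dtype)
b = np.array([(2, 80), (3, 88), (4, 92)], dtype=dtype)

# Whole-record comparison: every field must match, so only (3, 88) is common.
whole = np.intersect1d(a, b)

# Field-only comparison: extract 'id' first, then intersect.
# return_indices=True also returns the positions of the matches in each input.
common_ids, idx_a, idx_b = np.intersect1d(a['id'], b['id'], return_indices=True)

# Use the positions to recover the full records from each array.
rows_a = a[idx_a]
rows_b = b[idx_b]

# Results are sorted by 'id' first, then by 'score' to break ties.
merged = np.union1d(a, b)

print(whole)
print(common_ids)
print(rows_a)
print(rows_b)
print(merged)
```

Extracting a field loses the rest of the record, which is why the index arrays are kept here: they let you go back to the original rows after matching on 'id' alone.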
Summary
Set operations help compare structured data by rows.
Use NumPy functions such as intersect1d, union1d, setdiff1d, and setxor1d.
These operations are useful to find common, unique, or different records.