Masked arrays help you work with data that has missing or invalid values. They let you ignore these values in calculations without deleting them.
Masked arrays concept in NumPy
import numpy as np # Create a masked array masked_array = np.ma.array(data, mask=mask_array) # data: normal numpy array or list # mask_array: boolean array where True means the value is masked (ignored)
The mask array must be the same shape as the data array.
Masked values are ignored in calculations like mean, sum, etc.
import numpy as np # Example 1: Masking some values values = np.array([1, 2, 3, 4, 5]) mask = np.array([False, True, False, False, True]) masked_values = np.ma.array(values, mask=mask) print(masked_values)
import numpy as np # Example 2: Masked array with all values valid (no mask) values = np.array([10, 20, 30]) masked_values = np.ma.array(values) print(masked_values)
import numpy as np # Example 3: Masked array with all values masked values = np.array([7, 8, 9]) mask = np.array([True, True, True]) masked_values = np.ma.array(values, mask=mask) print(masked_values)
import numpy as np # Example 4: Masking first and last elements values = np.array([100, 200, 300, 400]) mask = np.array([True, False, False, True]) masked_values = np.ma.array(values, mask=mask) print(masked_values)
This program creates a masked array to ignore negative values in calculations. It prints the original data, the mask, the masked array, and the mean ignoring invalid values.
import numpy as np # Create a normal numpy array with some invalid data data = np.array([10, -1, 20, -999, 30, 40]) # Define a mask where invalid data is True # Here, we consider negative values as invalid mask_invalid = data < 0 # Create a masked array masked_data = np.ma.array(data, mask=mask_invalid) print("Original data:", data) print("Mask for invalid data:", mask_invalid) print("Masked array:", masked_data) # Calculate mean ignoring masked values mean_value = masked_data.mean() print(f"Mean ignoring invalid values: {mean_value}")
Time complexity for creating a masked array is O(n), where n is the number of elements.
Space complexity is O(n) because the mask array stores a boolean for each element.
A common mistake is not matching the mask shape to the data shape, which causes errors.
Use masked arrays when you want to keep invalid data but exclude it from calculations. Use data cleaning if you want to remove invalid data completely.
Masked arrays let you mark data as invalid without deleting it.
They help perform calculations ignoring invalid or missing values.
Always ensure the mask matches the data shape and use masked arrays to keep data integrity.