
np.genfromtxt() for handling missing data in NumPy - Mini Project: Build & Apply

Using np.genfromtxt() to Handle Missing Data
📖 Scenario: You work in a small store that tracks daily sales data in a CSV file. Sometimes, some sales numbers are missing because the cashier forgot to enter them. You want to load this data into Python and handle the missing values properly so you can analyze the sales.
🎯 Goal: Load the sales data (held here as a CSV string that stands in for the file) using np.genfromtxt() so that missing values are loaded as np.nan. Then calculate the average sales, ignoring the missing values.
📋 What You'll Learn
Use np.genfromtxt() to load data from a CSV string with missing values
Set the correct parameters to handle missing data as np.nan
Calculate the average sales ignoring missing values using np.nanmean()
Print the average sales as the final output
💡 Why This Matters
🌍 Real World
Stores, labs, and businesses often collect data with missing entries. Handling missing data correctly is important for accurate analysis.
💼 Career
Data scientists and analysts frequently use np.genfromtxt() to load imperfect datasets and prepare them for analysis.
1
Create the sales data CSV string
Create a variable called sales_data that contains this exact CSV string with missing values represented by empty fields:
10,20,30\n40,,60\n70,80,
Hint: Use triple quotes or escaped newlines to create the multiline string exactly as shown.
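
As a minimal sketch of this step, the string can be written with escaped newlines. The variable name sales_data comes from the step itself; the rows variable is only an illustrative helper for inspecting the result.

```python
# Each "\n" ends one CSV row; an empty field (",," or a trailing ",")
# marks a sales number that was never entered.
sales_data = "10,20,30\n40,,60\n70,80,"

# Illustrative check (not part of the exercise): split into rows to
# confirm there are three of them.
rows = sales_data.splitlines()
print(rows)  # ['10,20,30', '40,,60', '70,80,']
```

A triple-quoted string gives the same value as long as the text starts right after the opening quotes and ends right before the closing quotes.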

2
Set the delimiter configuration
Create a variable called delimiter and set it to the string "," to specify the CSV delimiter.
Hint: The delimiter for CSV files is usually a comma.
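
A small sketch of what the delimiter does to a row with a missing entry. The delimiter variable is the one named in this step; the fields variable and the sample row are assumptions used only for illustration.

```python
# The delimiter tells the parser where one field ends and the next begins.
delimiter = ","

# Illustrative only: splitting the second row by hand shows that two
# adjacent commas produce an empty field, which is the "missing" value.
fields = "40,,60".split(delimiter)
print(fields)  # ['40', '', '60']
```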

3
Load the data using np.genfromtxt() handling missing values
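Use np.genfromtxt() to load the data from sales_data, passing the delimiter variable as the delimiter argument. Set dtype to float and missing_values to an empty string "" so that missing entries become np.nan. (For a float dtype, np.genfromtxt() already fills empty fields with np.nan by default, so missing_values="" mainly makes the intent explicit.) Store the result in a variable called sales_array.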
Hint: Use sales_data.splitlines() to pass the CSV lines to np.genfromtxt(), which accepts a list of strings as its input.
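
One way this step might look, assuming the sales_data and delimiter variables from the earlier steps (they are redefined here so the sketch runs on its own):

```python
import numpy as np

# Redefined from steps 1 and 2 so this block is self-contained.
sales_data = "10,20,30\n40,,60\n70,80,"
delimiter = ","

# np.genfromtxt() accepts a list of lines, so splitlines() lets it
# read the string as if it were a file.
sales_array = np.genfromtxt(
    sales_data.splitlines(),
    delimiter=delimiter,
    dtype=float,
    missing_values="",  # treat empty fields as missing
)

print(sales_array.shape)  # (3, 3)
print(sales_array)        # the two empty fields appear as nan
```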

4
Calculate and print the average sales ignoring missing values
Calculate the average of sales_array ignoring np.nan values using np.nanmean(). Store the result in average_sales. Then print average_sales.
Hint: Use np.nanmean() to ignore np.nan values when calculating the average.
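
A sketch of the final step. To keep it self-contained, sales_array is written out by hand here with the values that step 3 should produce. The plain_mean variable is an assumption added only to contrast np.nanmean() with np.mean().

```python
import numpy as np

# The array that step 3 is expected to produce, written out by hand.
sales_array = np.array([
    [10.0, 20.0, 30.0],
    [40.0, np.nan, 60.0],
    [70.0, 80.0, np.nan],
])

# np.nanmean() skips the nan entries: the seven recorded sales sum to
# 310, so the average is 310 / 7.
average_sales = np.nanmean(sales_array)
print(average_sales)

# For contrast (not part of the exercise): np.mean() propagates nan.
plain_mean = np.mean(sales_array)
print(plain_mean)  # nan
```

The average is taken over the seven recorded values only, so the two missing entries neither count as zero nor turn the result into nan.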