np.genfromtxt() for handling missing data in NumPy - Practice Problems & Coding Challenges

Challenge - 5 Problems
Predict Output
intermediate
Output of np.genfromtxt() with missing values
What is the output of the following code snippet that uses np.genfromtxt() to read data with missing values?
NumPy
import numpy as np
from io import StringIO

data = """
1,2,3
4,,6
7,8,
"""

arr = np.genfromtxt(StringIO(data), delimiter=',', filling_values=-1)
print(arr)
A. ValueError: could not convert string to float: ''
B.
[[ 1.  2.  3.]
 [ 4. -1.  6.]
 [ 7.  8. -1.]]
C.
[[1. 2. 3.]
 [4. nan 6.]
 [7. 8. nan]]
D.
[[1 2 3]
 [4 0 6]
 [7 8 0]]
💡 Hint
Look at how filling_values replaces missing entries.
🧠 Conceptual
intermediate
Understanding the role of filling_values in np.genfromtxt()
What does the filling_values parameter do in np.genfromtxt() when reading a file with missing data?
A. It specifies the value to replace missing data entries during loading.
B. It skips rows that contain any missing data.
C. It converts all data to strings instead of numbers.
D. It raises an error if any missing data is found.
💡 Hint
Think about how missing data can be handled automatically.
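As a small illustration of what filling_values changes, the sketch below loads the same made-up text twice, once with defaults and once with a fill value. The inline data and variable names are assumptions chosen for the example, not part of the challenge.

```python
import numpy as np
from io import StringIO

# Illustrative data: the middle field of the second row is empty.
text = "1,2,3\n4,,6\n"

# Without filling_values, an empty field in a float array is loaded as NaN.
default = np.genfromtxt(StringIO(text), delimiter=",")

# With filling_values, the same field takes the value supplied instead.
filled = np.genfromtxt(StringIO(text), delimiter=",", filling_values=-1)

print(default)
print(filled)
```

Both calls return an array of the same shape; only the value stored in the empty position differs.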
Data Output
advanced
Resulting array shape and content with missing data
Given this CSV data with missing values, what are the shape and content of the NumPy array after loading it with np.genfromtxt(), using delimiter=',' and otherwise default parameters?
NumPy
import numpy as np
from io import StringIO

data = """
10,20,30
40,,60
,80,90
"""

arr = np.genfromtxt(StringIO(data), delimiter=',')
print(arr)
print(arr.shape)
A.
[[10. 20. 30.]
 [40. 0. 60.]
 [0. 80. 90.]]
(3, 3)
B.
[[10 20 30]
 [40 0 60]
 [0 80 90]]
(3, 3)
C. ValueError: could not convert string to float: ''
D.
[[10. 20. 30.]
 [40. nan 60.]
 [nan 80. 90.]]
(3, 3)
💡 Hint
By default, missing values become NaN in float arrays.
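Once missing fields have been loaded as NaN, they can be located and worked around with ordinary NumPy tools. The following is a minimal sketch using np.isnan and np.nanmean; the variable names are illustrative.

```python
import numpy as np
from io import StringIO

text = "10,20,30\n40,,60\n,80,90\n"
arr = np.genfromtxt(StringIO(text), delimiter=",")

# Boolean mask that is True wherever a field was empty in the input.
missing = np.isnan(arr)
n_missing = int(missing.sum())

# Column means computed while ignoring the NaN entries.
col_means = np.nanmean(arr, axis=0)

print(n_missing)
print(col_means)
```

Keeping NaN as the marker, rather than filling with a number, avoids a sentinel value silently skewing later statistics.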
🔧 Debug
advanced
Identify the behavior when loading CSV with missing data as integers
What happens when this code loads CSV data with a missing value using np.genfromtxt() with dtype=int and no filling_values?
NumPy
import numpy as np
from io import StringIO

data = """
1,2,3
4,,6
7,8,9
"""

arr = np.genfromtxt(StringIO(data), delimiter=',', dtype=int)
print(arr)
A. No error; prints the array with -1 in place of the missing value
B. TypeError: unsupported operand type(s) for +: 'int' and 'str'
C. ValueError: invalid literal for int() with base 10: ''
D. SyntaxError: invalid syntax
💡 Hint
np.genfromtxt() does not raise on an empty field; each dtype has a default fill value, and for integer columns it is -1.
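The integer case is easy to check directly. NumPy's user guide lists -1 as the default fill for integer columns, so the sketch below compares that default with an explicit sentinel; the sentinel -999 is an arbitrary choice for illustration.

```python
import numpy as np
from io import StringIO

text = "1,2,3\n4,,6\n7,8,9\n"

# No filling_values: the empty field takes genfromtxt's default fill for ints.
as_int = np.genfromtxt(StringIO(text), delimiter=",", dtype=int)

# An explicit sentinel makes the missing entry easier to spot later on.
explicit = np.genfromtxt(
    StringIO(text), delimiter=",", dtype=int, filling_values=-999
)

print(as_int)
print(explicit)
```

Because an integer array cannot hold NaN, choosing a sentinel that cannot occur in real data is usually safer than relying on the default.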
🚀 Application
expert
Handling mixed missing data types with np.genfromtxt()
You have a CSV file with numeric and string columns, with missing values in both. Which np.genfromtxt() call correctly loads the data, replacing missing numeric values with -999 and missing strings with 'missing'?
NumPy
import numpy as np
from io import StringIO

data = """
1,apple,3.5
2,,
,banana,4.1
"""
A. np.genfromtxt(StringIO(data), delimiter=',', dtype=None, encoding=None, filling_values={0: -999, 1: 'missing', 2: -999})
B. np.genfromtxt(StringIO(data), delimiter=',', dtype=None, encoding=None, filling_values=-999)
C. np.genfromtxt(StringIO(data), delimiter=',', dtype='U10,f8,i4', filling_values=['missing', -999, -999])
D. np.genfromtxt(StringIO(data), delimiter=',', dtype=None, encoding=None, missing_values='', filling_values='missing')
💡 Hint
Use a dictionary for filling_values to specify per-column replacements.
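The dictionary form maps a column index to the fill value for that column. Here is a minimal sketch restricted to numeric columns, with made-up data and fill values. String columns are deliberately left out; how an empty text field is filled is worth verifying against your own NumPy version.

```python
import numpy as np
from io import StringIO

# Illustrative two-column numeric data with one empty field per column.
text = "1,2.5\n2,\n,4.1\n"

arr = np.genfromtxt(
    StringIO(text),
    delimiter=",",
    # Keys are zero-based column indices; values are per-column fills.
    filling_values={0: -999, 1: -1.0},
)

print(arr)
```

Each empty field is replaced by the fill registered for its own column, so different sentinels can coexist in one array.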