Challenge - 5 Problems
Interpolation Master
❓ Predict Output
intermediate
Output of linear interpolation on a DataFrame column
What is the output of the following code snippet that uses linear interpolation to fill missing values in a pandas DataFrame column?
Pandas
import pandas as pd
import numpy as np

df = pd.DataFrame({'A': [1, np.nan, np.nan, 4, 5]})
df['A'] = df['A'].interpolate(method='linear')
print(df['A'].tolist())
💡 Hint
Linear interpolation fills missing values by connecting points with a straight line.
Explanation
The missing values at positions 1 and 2 lie on the straight line between 1 (position 0) and 4 (position 3), so they are filled with 2.0 and 3.0. The printed list is [1.0, 2.0, 3.0, 4.0, 5.0].
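As a cross-check, the same fill values can be reproduced by hand with NumPy's `np.interp`, treating row positions as the x-coordinates. This is a minimal sketch; the variable names (`positions`, `known`, `manual`) are illustrative and not part of the original snippet.

```python
import numpy as np
import pandas as pd

s = pd.Series([1, np.nan, np.nan, 4, 5], dtype="float64")

# The default linear method treats rows as equally spaced.
filled = s.interpolate(method="linear")

# Reproduce the fill manually: interpolate over row positions,
# using only the points whose values are known.
positions = np.arange(len(s))
known = s.notna().to_numpy()
manual = np.interp(positions, positions[known], s[known].to_numpy())

print(filled.tolist())
print(manual.tolist())
```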
❓ Data Output
intermediate
Result of time interpolation on a time-indexed DataFrame
Given a DataFrame indexed by dates with missing values, what is the result of using time-based interpolation?
Pandas
import pandas as pd
import numpy as np

idx = pd.to_datetime(['2024-01-01', '2024-01-02', '2024-01-04', '2024-01-05'])
df = pd.DataFrame({'Value': [10, np.nan, 40, 50]}, index=idx)
df_interpolated = df.interpolate(method='time')
print(df_interpolated['Value'].tolist())
💡 Hint
Time interpolation considers the time difference between index points.
Explanation
The missing value on 2024-01-02 is interpolated based on time distance between 2024-01-01 (10) and 2024-01-04 (40), resulting in 20.0 (1 day out of 3 days: 10 + 30 * (1/3)).
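To see why the index spacing matters, the sketch below compares `method='linear'` (which ignores the dates and treats rows as equally spaced) with `method='time'` on the same data, and recomputes the time-weighted value by hand. The names `by_position`, `by_time`, and `manual` are illustrative assumptions.

```python
import numpy as np
import pandas as pd

idx = pd.to_datetime(["2024-01-01", "2024-01-02", "2024-01-04", "2024-01-05"])
s = pd.Series([10, np.nan, 40, 50], index=idx)

# Positional: the gap is filled with the midpoint of its neighbors.
by_position = s.interpolate(method="linear").tolist()

# Time-aware: the gap is weighted by elapsed time between index entries.
by_time = s.interpolate(method="time").tolist()

# Manual time-weighted calculation for 2024-01-02.
gap_days = (idx[2] - idx[0]).days      # 3
elapsed_days = (idx[1] - idx[0]).days  # 1
manual = 10 + (40 - 10) * elapsed_days / gap_days

print(by_position[1], by_time[1], manual)
```

The positional result sits halfway between 10 and 40, while the time-based result matches the manual calculation up to floating-point rounding.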
🔧 Debug
advanced
Identify the error in interpolation code
What error will this code raise when trying to interpolate missing values in a DataFrame?
Pandas
import pandas as pd
import numpy as np

df = pd.DataFrame({'A': [1, np.nan, 3]})
df['A'] = df['A'].interpolate(method='polynomial', order=2)
print(df)
💡 Hint
Check if the 'order' parameter is correctly passed for polynomial interpolation.
Explanation
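The 'polynomial' method requires the 'order' parameter; if it is omitted, pandas raises a ValueError. Note that the snippet shown does pass order=2, so that specific check is satisfied. Any error it raises would come from elsewhere, for example from SciPy (which this method depends on) being given only two known points for a second-order fit.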
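The 'order' requirement can be observed by calling `interpolate` without it. The following is a minimal sketch in which `order` is deliberately omitted; the `raised` flag is an illustrative helper, not part of the original snippet.

```python
import numpy as np
import pandas as pd

s = pd.Series([1, np.nan, 3], dtype="float64")

try:
    # 'order' is deliberately left out to trigger pandas' validation.
    s.interpolate(method="polynomial")
    raised = False
except ValueError as exc:
    raised = True
    print(f"ValueError: {exc}")
```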
🚀 Application
advanced
Choosing interpolation method for categorical data
You have a DataFrame column with missing categorical values like ['red', NaN, 'blue', NaN, 'green']. Which interpolation method is appropriate to fill missing values?
💡 Hint
Categorical data cannot be interpolated numerically.
Explanation
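Categorical values have no numeric distance between them, so the only sensible fill is to copy a neighboring valid category. Be aware that the numeric methods of interpolate(), including 'nearest', operate on numeric data and do not fill string values; copying the previous or next valid category is done with ffill() or bfill() instead.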
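Copying a neighboring valid category can be sketched with `ffill()` and `bfill()`, which work on non-numeric data. The example reuses the values from the question; it shows one possible approach rather than the only one, and the variable names are illustrative.

```python
import numpy as np
import pandas as pd

colors = pd.Series(["red", np.nan, "blue", np.nan, "green"])

# Forward fill: each gap takes the previous valid category.
forward = colors.ffill().tolist()

# Backward fill: each gap takes the next valid category.
backward = colors.bfill().tolist()

print(forward)   # ['red', 'red', 'blue', 'blue', 'green']
print(backward)  # ['red', 'blue', 'blue', 'green', 'green']
```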
🧠 Conceptual
expert
Effect of limit parameter in interpolation
What is the effect of setting the 'limit' parameter to 1 in pandas interpolate method when filling missing values?
💡 Hint
Think about how 'limit' controls consecutive missing values filled.
Explanation
The 'limit' parameter caps how many consecutive NaNs are filled within each gap. With limit=1, only one NaN per run is filled (the first one, under the default forward direction), so longer runs of NaNs stay partially unfilled.
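The effect is easy to see on a run of three consecutive NaNs. This is a small sketch with made-up values, assuming the default linear method and forward fill direction.

```python
import numpy as np
import pandas as pd

s = pd.Series([1, np.nan, np.nan, np.nan, 5], dtype="float64")

# Without a limit, the whole gap is filled.
full = s.interpolate()

# With limit=1, only the first NaN of the run is filled.
limited = s.interpolate(limit=1)

print(full.tolist())     # [1.0, 2.0, 3.0, 4.0, 5.0]
print(limited.tolist())  # [1.0, 2.0, nan, nan, 5.0]
```

The filled value is still taken from the full straight line between 1 and 5; `limit` only controls which positions are written.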