Challenge - 5 Problems
Resampling Master
❓ Predict Output
Intermediate
Output of resampling with mean aggregation
What is the output of this code snippet that resamples a time series to daily frequency and calculates the mean?
Data Analysis Python
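import pandas as pd

# Four readings, six hours apart, starting at noon on 2024-01-01
# ('6h' is the current spelling; the older '6H' alias is deprecated in recent pandas)
dates = pd.date_range('2024-01-01 12:00', periods=4, freq='6h')
data = pd.Series([10, 20, 30, 40], index=dates)

# Group by calendar day and average the values in each day
result = data.resample('D').mean()
print(result)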
💡 Hint
Think about how resample groups data by day and then averages values within each day.
Explanation
The data has 4 values at 6-hour intervals starting at noon on Jan 1. Resampling by day groups values on Jan 1 (12:00 and 18:00) and Jan 2 (00:00 and 06:00). The mean for Jan 1 is (10+20)/2=15, for Jan 2 is (30+40)/2=35.
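The grouping described above can be checked by hand. The following is a minimal sketch, not part of the original challenge: it rebuilds the same series and compares `resample('D')` with a `groupby` on each timestamp floored to midnight (`DatetimeIndex.normalize`). The variable names `daily_mean` and `by_day` are chosen here for illustration.

```python
import pandas as pd

dates = pd.date_range('2024-01-01 12:00', periods=4, freq='6h')
data = pd.Series([10, 20, 30, 40], index=dates)

# Resample into calendar-day bins and average each bin
daily_mean = data.resample('D').mean()

# Equivalent grouping done by hand: floor every timestamp to midnight
by_day = data.groupby(data.index.normalize()).mean()

print(daily_mean)  # 15.0 for 2024-01-01, 35.0 for 2024-01-02
print(daily_mean.tolist() == by_day.tolist())
```

For this small series both approaches give the same two values; `resample` additionally emits empty bins for days with no data, which a plain `groupby` would skip.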
❓ Data Output
Intermediate
Number of rows after upsampling with forward fill
Given a time series with hourly data for 3 hours, what is the number of rows after upsampling to 30-minute frequency with forward fill?
Data Analysis Python
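import pandas as pd

# Three hourly readings: 00:00, 01:00, 02:00
# ('h' and '30min' replace the older 'H' and '30T' aliases, which are deprecated in recent pandas)
dates = pd.date_range('2024-01-01 00:00', periods=3, freq='h')
data = pd.Series([1, 2, 3], index=dates)

# Upsample to 30-minute steps and carry the last known value forward
upsampled = data.resample('30min').ffill()
print(len(upsampled))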
💡 Hint
Count how many 30-minute intervals fit between the first and last timestamp inclusive.
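Explanation
The original data has 3 points at 00:00, 01:00, and 02:00. Upsampling to 30 minutes creates timestamps at 00:00, 00:30, 01:00, 01:30, and 02:00, which is 5 rows. Forward fill then copies the most recent known value into the new 00:30 and 01:30 slots; it does not change the row count.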
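To see both the new timestamps and the filled values, the series can be printed after upsampling. This is a small illustrative sketch using the same data as the question, with the `'h'` and `'30min'` frequency aliases.

```python
import pandas as pd

dates = pd.date_range('2024-01-01 00:00', periods=3, freq='h')
data = pd.Series([1, 2, 3], index=dates)

upsampled = data.resample('30min').ffill()

# Timestamps created by the upsampling step
print(upsampled.index.strftime('%H:%M').tolist())
# ['00:00', '00:30', '01:00', '01:30', '02:00']

# Values after forward fill: each new slot repeats the previous reading
print(upsampled.tolist())  # [1, 1, 2, 2, 3]
```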
🔧 Debug
Advanced
Identify the error in resampling code
What error does this code raise when trying to resample a DataFrame without a datetime index?
Data Analysis Python
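import pandas as pd

# DataFrame with the default RangeIndex (0, 1, 2), not a datetime index
data = pd.DataFrame({'value': [1, 2, 3]})

# resample() needs a datetime-like index, so this line raises
result = data.resample('D').sum()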
💡 Hint
Check the type of index required for resample to work.
Explanation
Resample requires the DataFrame to have a datetime-like index (DatetimeIndex, TimedeltaIndex, or PeriodIndex). Here the index is the default RangeIndex, so pandas raises a TypeError stating that the index type is not valid for resampling.
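The sketch below reproduces the error and then shows one possible fix. The dates assigned to the index are invented for illustration, since the original snippet does not say which timestamps the rows belong to.

```python
import pandas as pd

data = pd.DataFrame({'value': [1, 2, 3]})

error_message = None
try:
    data.resample('D').sum()
except TypeError as exc:
    # Raised because the index is a RangeIndex, not datetime-like
    error_message = str(exc)
    print(error_message)

# One possible fix (assumed dates): give the frame a DatetimeIndex first
data.index = pd.date_range('2024-01-01', periods=3, freq='D')
result = data.resample('D').sum()
print(result)
```

If the timestamps live in a column instead of the index, `resample` also accepts an `on=` argument naming that column.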
🚀 Application
Advanced
Choosing correct resampling method for downsampling
You have minute-level temperature data and want to get the maximum temperature per hour. Which resampling method and aggregation should you use?
💡 Hint
Think about grouping data into hours and picking the highest temperature.
Explanation
To get the maximum temperature per hour, resample with an hourly frequency ('H', spelled 'h' in pandas 2.2 and later) and aggregate with max(). The other options use either the wrong frequency or the wrong aggregation.
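As a rough illustration of hourly downsampling with `max()`, the sketch below builds a synthetic minute-level series. The readings are made-up integers (0 to 119) standing in for temperatures, and the names `temps` and `hourly_max` are hypothetical.

```python
import pandas as pd

# Synthetic minute-level data: 120 readings covering 00:00 to 01:59
index = pd.date_range('2024-01-01 00:00', periods=120, freq='min')
temps = pd.Series(range(120), index=index)

# Group the minutes into hourly bins and keep the highest value in each
hourly_max = temps.resample('h').max()
print(hourly_max)  # 59 for the 00:00 hour, 119 for the 01:00 hour
```

Swapping `max()` for `min()` or `mean()` changes only the aggregation; the hourly binning stays the same.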
🧠 Conceptual
Expert
Effect of resampling with different label and closed parameters
When resampling time series data with 'label="right"' and 'closed="right"', how are the bins labeled and which side is included in the interval?
💡 Hint
Think about how intervals are defined when closed='right' and how labels align.
Explanation
With label='right' and closed='right', each bin is labeled by its right endpoint, and intervals include the right edge but exclude the left edge. This affects which data points fall into each bin.
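A small comparison makes the effect concrete. The sketch below uses an invented five-point hourly series and 2-hour bins, once with the defaults for this frequency (left-closed, left-labeled) and once with `label='right', closed='right'`.

```python
import pandas as pd

# Five hourly points: 00:00, 01:00, 02:00, 03:00, 04:00
index = pd.date_range('2024-01-01 00:00', periods=5, freq='h')
data = pd.Series([1, 2, 3, 4, 5], index=index)

# Default for '2h': bins [00:00, 02:00), [02:00, 04:00), [04:00, 06:00),
# each labeled by its left edge
default = data.resample('2h').sum()
print(default.tolist())  # [3, 7, 5]

# Right-closed, right-labeled: bins (22:00, 00:00], (00:00, 02:00], (02:00, 04:00],
# each labeled by its right edge
right = data.resample('2h', label='right', closed='right').sum()
print(right.tolist())  # [1, 5, 9]
```

Both results carry the labels 00:00, 02:00, and 04:00, but the sums differ because a point sitting exactly on a bin edge (such as 02:00) moves to the earlier bin when the right side is closed.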