Resampling time series data in Pandas - Step-by-Step Execution

Concept Flow - Resampling time series data

Start with time series data

↓

Choose resampling frequency

↓

Apply resample() method

↓

Aggregate data (mean, sum, etc.)

↓

Get new resampled time series

↓

End

We start with time series data, pick a new time interval, resample using pandas, aggregate values, and get a new summarized time series.

Execution Sample

Pandas

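import pandas as pd

# Create sample data: six hourly readings starting at midnight.
# 'h' is the hourly alias; the uppercase 'H' is deprecated since pandas 2.2.
idx = pd.date_range('2024-01-01', periods=6, freq='h')
data = pd.Series([10, 20, 15, 30, 25, 40], index=idx)

# Resample hourly data to 3-hour intervals and average each interval
resampled = data.resample('3h').mean()
print(resampled)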

This code creates hourly data and resamples it to 3-hour intervals by averaging values.

Execution Table

Step	Action	Input Data	Resample Frequency	Aggregation	Output Data
1	Create hourly time series	[10,20,15,30,25,40]	Hourly	None	2024-01-01 00:00:00 -> 10; 01:00:00 -> 20; 02:00:00 -> 15; 03:00:00 -> 30; 04:00:00 -> 25; 05:00:00 -> 40
2	Choose resample frequency	Hourly data	3 hours	None	Preparing to group data into 3-hour bins
3	Group data into 3-hour bins	Hourly data	3 hours	None	Bin 1: 00:00-02:59 -> [10,20,15]; Bin 2: 03:00-05:59 -> [30,25,40]
4	Aggregate each bin by mean	Bins	3 hours	Mean	Bin 1 mean: (10+20+15)/3 = 15; Bin 2 mean: (30+25+40)/3 ≈ 31.67
5	Create new resampled series	Aggregated means	3 hours	Mean	2024-01-01 00:00:00 -> 15; 2024-01-01 03:00:00 -> 31.67
6	Print resampled data	Resampled series	3 hours	Mean	Output: 2024-01-01 00:00:00 15.000000; 2024-01-01 03:00:00 31.666667; Freq: 3h, dtype: float64 (Freq prints as 3H on pandas versions before 2.2)

💡 All original data has been grouped and aggregated into 3-hour intervals; resampling is complete.
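
Steps 3 and 4 of the table can be reproduced by hand to see what the binning amounts to. The following is a minimal sketch, not a description of how pandas implements resample() internally: it assumes the bins start at midnight (as they do for this data) and uses the illustrative names bin_start and manual, which are not part of the original sample.

```python
import pandas as pd

idx = pd.date_range("2024-01-01", periods=6, freq="h")
data = pd.Series([10, 20, 15, 30, 25, 40], index=idx)

# Step 3: label each timestamp with the start of its 3-hour bin.
bin_start = data.index.floor("3h")

# Step 4: average the values that share a bin label.
manual = data.groupby(bin_start).mean()

# For comparison: the built-in resample on the same data.
resampled = data.resample("3h").mean()

print(manual)
print(resampled)
```

For this data both printouts should show the same two labels and the same two means, which is a quick way to check that the bins in the table match what resample() produces.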

Variable Tracker

Variable	Start	After Step 1	After Step 3	After Step 4	Final
data	None	[10,20,15,30,25,40] hourly indexed	Unchanged (bins [10,20,15] and [30,25,40] exist only inside resample)	Unchanged (bin means 15 and 31.67 not yet assigned)	[10,20,15,30,25,40] hourly indexed
resampled	None	None	None	None	[15.0, 31.67] with 3-hour frequency
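
A short check can confirm how the two variables relate: resample() returns a new series and leaves the original alone. This is a sketch under the same sample data; the name before is introduced here only to hold a copy for comparison.

```python
import pandas as pd

idx = pd.date_range("2024-01-01", periods=6, freq="h")
data = pd.Series([10, 20, 15, 30, 25, 40], index=idx)

# Keep a copy so the original can be compared after resampling.
before = data.copy()

resampled = data.resample("3h").mean()

# The original series still has its six hourly points;
# the resampled series is a separate object with two points.
print(data.equals(before))
print(len(data), len(resampled))
```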

Key Moments - 3 Insights

Why does the resampled series have fewer rows than the original?

What happens if we use sum instead of mean for aggregation?

Why is the new index at 00:00 and 03:00 after resampling?
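
The three questions above can be explored with a few lines of code. The sketch below reuses the sample data; by_mean, by_sum, and right_labeled are illustrative names, and label="right" is shown only to contrast with the default start-of-interval labels.

```python
import pandas as pd

idx = pd.date_range("2024-01-01", periods=6, freq="h")
data = pd.Series([10, 20, 15, 30, 25, 40], index=idx)

# Fewer rows: six hourly points collapse into two 3-hour bins.
by_mean = data.resample("3h").mean()
print(len(data), "->", len(by_mean))

# Sum instead of mean: same bins, different aggregate (10+20+15 and 30+25+40).
by_sum = data.resample("3h").sum()
print(by_sum)

# Labels: by default each bin is named by its start time (00:00, 03:00).
# label="right" names each bin by its end time instead; the values stay the same.
right_labeled = data.resample("3h", label="right").mean()
print(right_labeled)
```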

Visual Quiz - 3 Questions

Test your understanding

Look at step 4 of the Execution Table. What is the mean of the first 3-hour bin?

A. 31.67

B. 15

C. 20

D. 10

Concept Snapshot

Resampling time series data with pandas:
- Use .resample('freq') on a time-indexed Series or DataFrame
- 'freq' is the new time interval (e.g., '3h' for 3 hours; spelled '3H' before pandas 2.2)
- Aggregate grouped data with mean(), sum(), etc.
- Result is a new time series with fewer points (downsampling) or more points (upsampling)
- For hourly-style frequencies such as '3h', index labels default to interval start times
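
The sample on this page only downsamples. As a complement, here is a minimal sketch of the "more points" case, going from hourly to 30-minute steps. The shorter three-point series and the names gaps and filled are assumptions made for illustration; forward fill is one possible way to fill the new slots, not the only one.

```python
import pandas as pd

idx = pd.date_range("2024-01-01", periods=3, freq="h")
data = pd.Series([10, 20, 15], index=idx)

# Upsampling creates timestamps that have no original value.
# asfreq() leaves those new slots as NaN.
gaps = data.resample("30min").asfreq()

# ffill() carries the last known value forward into each new slot.
filled = data.resample("30min").ffill()

print(gaps)
print(filled)
```

When upsampling there is nothing to aggregate, so a fill or interpolation choice takes the place of mean() or sum().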

Full Transcript

This visual execution shows how pandas resamples time series data. We start with hourly data points. We pick a new frequency, here 3 hours. The data is grouped into 3-hour bins. Each bin's values are aggregated by mean. The output is a new series with fewer points, each representing the average over 3 hours. The index labels are the start times of each 3-hour interval. This process helps summarize or change the time scale of data easily.