Pandas · data · ~30 mins

Resampling with groupby for time data in Pandas - Mini Project: Build & Apply

📖 Scenario: You work for a bike rental company. You have a dataset of bike rentals recorded every minute. You want to analyze the total rentals per hour for each bike station.
🎯 Goal: Build a program that groups bike rental data by station and resamples the data to show total rentals per hour for each station.
📋 What You'll Learn
Create a pandas DataFrame with a datetime index, a station column, and a rental count column
Create a variable for the resampling frequency
Use groupby on the station column and resample the data by the frequency
Print the final resampled DataFrame showing total rentals per hour per station
💡 Why This Matters
🌍 Real World
Bike rental companies analyze rental data by time and location to optimize bike availability and station management.
💼 Career
Data analysts and data scientists often resample time series data grouped by categories to find trends and summaries for business decisions.
1
Create the bike rental data
Create a pandas DataFrame called df with two columns, 'station' and 'rentals'. Use a datetime index with these exact timestamps: '2024-06-01 08:00', '2024-06-01 08:01', '2024-06-01 08:02', '2024-06-01 09:00', '2024-06-01 09:01'. The rows should contain exactly these values, in order:
station: 'A', 'A', 'B', 'A', 'B'
rentals: 5, 3, 2, 4, 1
Hint: Use pd.to_datetime to build the datetime index, then create the DataFrame df from the given data with that index.
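
One way to build the data for step 1 is sketched below. The intermediate name timestamps is an assumption introduced for readability; the step only requires that the final DataFrame is called df.

```python
import pandas as pd

# The five timestamps from the step, parsed into a DatetimeIndex.
# "timestamps" is an illustrative name, not one the exercise requires.
timestamps = pd.to_datetime([
    "2024-06-01 08:00",
    "2024-06-01 08:01",
    "2024-06-01 08:02",
    "2024-06-01 09:00",
    "2024-06-01 09:01",
])

# One row per timestamp: which station the rentals belong to, and how many.
df = pd.DataFrame(
    {
        "station": ["A", "A", "B", "A", "B"],
        "rentals": [5, 3, 2, 4, 1],
    },
    index=timestamps,
)

print(df)
```

Resampling needs a datetime-like index (or an explicit datetime column), which is why the timestamps are passed as the index rather than stored as plain strings.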

2
Set the resampling frequency
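Create a variable called freq and set it to the string 'H' to represent hourly resampling. In pandas 2.2 and later, the uppercase 'H' alias is deprecated in favor of the lowercase 'h', so use 'h' if 'H' produces a warning or an error in your environment.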
Hint: Set freq to the string 'H' for hourly resampling.
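
A quick way to confirm what a frequency string means is to convert it to an offset object. The sketch below uses the lowercase 'h' spelling, which is an assumption made here for compatibility with newer pandas releases; the step itself asks for 'H'. The name offset is illustrative.

```python
import pandas as pd
from pandas.tseries.frequencies import to_offset

# Hourly frequency alias. The exercise text uses "H"; lowercase "h" is used
# here as an assumption, since newer pandas releases prefer that spelling.
freq = "h"

# Convert the alias to an offset object to check that it really means "hour".
offset = to_offset(freq)
print(isinstance(offset, pd.offsets.Hour))
```

Keeping the frequency in a variable means the same grouping code can later be rerun at another granularity by changing a single string.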

3
Group by station and resample hourly
Create a new DataFrame called hourly_rentals by grouping df by the 'station' column and resampling the 'rentals' column at the frequency freq. Use sum() to total the rentals in each hour.
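Hint: Use df.groupby('station').resample(freq).sum() to get total rentals per hour per station. To restrict the result to the rental counts, select the column first: df.groupby('station')['rentals'].resample(freq).sum(), which returns a Series indexed by station and hour.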
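
The grouping and resampling step can be sketched as follows. Two choices here are assumptions rather than part of the exercise: the 'rentals' column is selected before resampling so that only the counts are summed, and to_frame() converts the resulting Series back into a DataFrame. The lowercase 'h' alias is likewise assumed for compatibility with newer pandas.

```python
import pandas as pd

# Rebuild the step 1 data so this example runs on its own.
df = pd.DataFrame(
    {"station": ["A", "A", "B", "A", "B"], "rentals": [5, 3, 2, 4, 1]},
    index=pd.to_datetime([
        "2024-06-01 08:00",
        "2024-06-01 08:01",
        "2024-06-01 08:02",
        "2024-06-01 09:00",
        "2024-06-01 09:01",
    ]),
)
freq = "h"  # assumption: lowercase hourly alias; the exercise text uses "H"

# Split rows by station, bucket each station's rows into hourly bins,
# then add up the rentals that fall inside each bin.
hourly_rentals = (
    df.groupby("station")["rentals"]  # assumption: select the column explicitly
    .resample(freq)
    .sum()
    .to_frame()  # assumption: convert the Series result into a DataFrame
)

print(hourly_rentals)
```

The result carries a two-level index: the station first, then the start of each hourly bin. Each station is resampled independently, so a station only gets bins between its own first and last timestamps.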

4
Print the hourly rentals result
Print the hourly_rentals DataFrame to display the total rentals per hour for each station.
Hint: Use print(hourly_rentals) to show the grouped and resampled data.
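
Putting the four steps together, a complete script might look like the sketch below. It assumes the lowercase 'h' alias and an explicit 'rentals' column selection, neither of which the exercise text requires.

```python
import pandas as pd

# Step 1: the rental data, indexed by timestamp.
df = pd.DataFrame(
    {"station": ["A", "A", "B", "A", "B"], "rentals": [5, 3, 2, 4, 1]},
    index=pd.to_datetime([
        "2024-06-01 08:00",
        "2024-06-01 08:01",
        "2024-06-01 08:02",
        "2024-06-01 09:00",
        "2024-06-01 09:01",
    ]),
)

# Step 2: hourly frequency (assumption: "h" instead of the exercise's "H").
freq = "h"

# Step 3: total rentals per station per hour.
# Selecting "rentals" and calling to_frame() are assumptions of this sketch.
hourly_rentals = df.groupby("station")["rentals"].resample(freq).sum().to_frame()

# Step 4: display the result.
print(hourly_rentals)
```

The printed table has one row per station and hour. Because every original rental lands in exactly one hourly bin, the hourly totals add up to the same overall count as the raw data.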