How to Filter Data by Date in pandas: Simple Guide
To filter rows by date in pandas, first ensure your date column is in
datetime format using pd.to_datetime(). Then use boolean indexing with conditions like df[df['date'] >= '2023-01-01'] to select rows matching your date criteria.Syntax
To filter a pandas DataFrame by date, use boolean indexing with a date condition on a datetime column.
df['date_column']: The column with dates.pd.to_datetime(): Converts strings to datetime format.df[condition]: Filters rows where the condition is True.
python
df['date_column'] = pd.to_datetime(df['date_column']) filtered_df = df[df['date_column'] >= 'YYYY-MM-DD']
Example
This example shows how to filter a DataFrame to keep only rows where the date is on or after January 1, 2023.
python
import pandas as pd data = {'date': ['2022-12-31', '2023-01-01', '2023-02-15', '2022-11-20'], 'value': [10, 20, 30, 40]} df = pd.DataFrame(data) # Convert 'date' column to datetime df['date'] = pd.to_datetime(df['date']) # Filter rows where date is >= 2023-01-01 filtered_df = df[df['date'] >= '2023-01-01'] print(filtered_df)
Output
date value
1 2023-01-01 20
2 2023-02-15 30
Common Pitfalls
Common mistakes when filtering by date include:
- Not converting the date column to
datetimetype, causing incorrect filtering. - Using string comparisons without
pd.to_datetime(), which can lead to wrong results. - Forgetting to use the correct date format (YYYY-MM-DD) in conditions.
python
import pandas as pd data = {'date': ['2023-01-01', '2023-02-01'], 'value': [1, 2]} df = pd.DataFrame(data) # Wrong: filtering without datetime conversion wrong_filter = df[df['date'] >= '2023-01-15'] # Right: convert to datetime first df['date'] = pd.to_datetime(df['date']) right_filter = df[df['date'] >= '2023-01-15'] print('Wrong filter result:') print(wrong_filter) print('\nRight filter result:') print(right_filter)
Output
Wrong filter result:
date value
1 2023-02-01 2
Right filter result:
date value
1 2023-02-01 2
Quick Reference
| Action | Code Example |
|---|---|
| Convert column to datetime | df['date'] = pd.to_datetime(df['date']) |
| Filter dates after a day | df[df['date'] > '2023-01-01'] |
| Filter dates between two days | df[(df['date'] >= '2023-01-01') & (df['date'] <= '2023-01-31')] |
| Filter dates before a day | df[df['date'] < '2023-01-01'] |
Key Takeaways
Always convert your date column to datetime type before filtering.
Use boolean indexing with date conditions to filter rows.
Write dates in 'YYYY-MM-DD' format for clear comparisons.
Combine conditions with & for filtering between dates.
Avoid filtering on strings without datetime conversion to prevent errors.