How to Extract Year from Datetime in Pandas Easily
To extract the year from a datetime column in pandas, use the .dt.year accessor on a datetime Series. This returns the year as an integer for each datetime value.
Syntax
Use the .dt.year attribute on a pandas Series with datetime values to get the year part.
- Series.dt: Accessor for datetime properties.
- year: Extracts the year as an integer.
python
df['date_column'].dt.year
Example
This example shows how to create a pandas DataFrame with a datetime column and extract the year from it.
python
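import pandas as pd

# Dates start out as plain strings
data = {'date_column': ['2023-01-15', '2022-07-30', '2021-12-05']}
df = pd.DataFrame(data)

# Convert the strings to datetime values
df['date_column'] = pd.to_datetime(df['date_column'])

# Extract the year into a new column
df['year'] = df['date_column'].dt.year
print(df)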
Output
  date_column  year
0  2023-01-15  2023
1  2022-07-30  2022
2  2021-12-05  2021
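Once the year is a separate integer column, it can be used like any other numeric column. The following is a minimal sketch of filtering rows by year, reusing the sample data from the example; the variable name recent and the 2022 cutoff are chosen only for illustration.

```python
import pandas as pd

# Same sample dates as in the example, converted to datetime up front
df = pd.DataFrame(
    {"date_column": pd.to_datetime(["2023-01-15", "2022-07-30", "2021-12-05"])}
)
df["year"] = df["date_column"].dt.year

# The year column holds integers, so numeric comparisons work directly
recent = df[df["year"] >= 2022]
print(recent)
```

The comparison keeps the 2023 and 2022 rows and drops the 2021 row.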
Common Pitfalls
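A common mistake is calling .dt.year on a column that is not in datetime format. Dates typed as text or read from a file without date parsing are stored as strings, and using .dt on such a column raises an AttributeError ("Can only use .dt accessor with datetimelike values"). Always convert the column with pd.to_datetime() before extracting the year.
Also, avoid string slicing to get the year: it returns strings rather than integers, and it silently gives wrong values when the date format differs from what you expect (for example, 15/01/2023).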
python
import pandas as pd

data = {'date_column': ['2023-01-15', '2022-07-30', '2021-12-05']}
df = pd.DataFrame(data)

# Wrong: extracting year without datetime conversion
# This raises an AttributeError because the column still holds strings
# df['year'] = df['date_column'].dt.year

# Right: convert to datetime first
df['date_column'] = pd.to_datetime(df['date_column'])
df['year'] = df['date_column'].dt.year
print(df)
Output
  date_column  year
0  2023-01-15  2023
1  2022-07-30  2022
2  2021-12-05  2021
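To make both pitfalls concrete, here is a small sketch that triggers the error on a string column and compares string slicing with .dt.year. The variable names error_message, sliced, and years are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({"date_column": ["2023-01-15", "2022-07-30", "2021-12-05"]})

# Pitfall 1: the column still holds strings, so .dt is not available
try:
    df["date_column"].dt.year
    error_message = None
except AttributeError as exc:
    error_message = str(exc)
print(error_message)

# Pitfall 2: slicing happens to work for this format, but yields strings
sliced = df["date_column"].str[:4]
print(sliced.tolist())

# Converting first gives real integers
years = pd.to_datetime(df["date_column"]).dt.year
print(years.tolist())
```

The sliced values look like years but are text, so comparisons such as sliced >= 2022 would not behave like numeric comparisons.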
Quick Reference
Summary tips for extracting year from datetime in pandas:
- Ensure the column is datetime type with pd.to_datetime().
- Use .dt.year to get the year as an integer.
- Do not use string slicing for year extraction.
Key Takeaways
Always convert your column to datetime type before extracting the year.
Use the .dt.year accessor to get the year from datetime values in pandas.
Avoid string slicing to extract year as it is unreliable and error-prone.
The extracted year is returned as an integer for easy analysis.
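One caveat to the integer result is worth a quick sketch: when a column contains missing dates (NaT), pandas returns the years as floats with NaN in the gaps. The example below assumes unparseable input is acceptable and should become NaT via errors="coerce". Converting to the nullable "Int64" dtype is one possible way to keep whole numbers, not the only one.

```python
import pandas as pd

raw = pd.Series(["2023-01-15", "not a date", "2021-12-05"])

# Unparseable entries become NaT instead of raising an error
dates = pd.to_datetime(raw, errors="coerce")

# With NaT present, the years come back as floats (NaN marks the gap)
years_float = dates.dt.year
print(years_float.dtype)

# Optional: use the nullable integer dtype to keep whole-number years
years_nullable = years_float.astype("Int64")
print(years_nullable)
```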