How to Convert to Datetime in Pandas | Data Science Guide
Use pd.to_datetime() to convert a column or values to datetime in pandas, for example: df['date'] = pd.to_datetime(df['date']).
How to Think About It
Use pd.to_datetime(), which can handle strings, lists, or columns. It tries to parse the input into datetime objects automatically, making it easy to work with dates.
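As a rough illustration of the three input kinds mentioned above, the sketch below passes a single string, a list, and a Series to pd.to_datetime(). The variable names are only illustrative, and the exact datetime resolution of the result may vary with the pandas version.

```python
import pandas as pd

# A single string returns a pandas Timestamp.
single = pd.to_datetime("2023-01-01")

# A list of strings returns a DatetimeIndex.
from_list = pd.to_datetime(["2023-01-01", "2023-02-01"])

# A Series (for example, a DataFrame column) returns a Series with a datetime64 dtype.
series = pd.to_datetime(pd.Series(["2023-01-01", "2023-02-01"]))

print(type(single).__name__)     # Timestamp
print(type(from_list).__name__)  # DatetimeIndex
print(pd.api.types.is_datetime64_any_dtype(series))  # True
```

The return type follows the input type, which is why assigning the result of a column conversion back to the DataFrame works without any extra reshaping.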
Code
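import pandas as pd

data = {'date': ['2023-01-01', '2023/02/01', 'March 3, 2023', 'invalid']}
df = pd.DataFrame(data)

# format='mixed' (pandas 2.0+) parses each string on its own; without it,
# pandas 2.x infers one format from the first value and coerces the rest to NaT.
# errors='coerce' turns unparseable values such as 'invalid' into NaT.
df['date'] = pd.to_datetime(df['date'], format='mixed', errors='coerce')
print(df)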
Dry Run
Let's trace converting the 'date' column with mixed date formats and an invalid entry.
Original Data
['2023-01-01', '2023/02/01', 'March 3, 2023', 'invalid']
Apply pd.to_datetime with errors='coerce'
Convert each string to datetime; invalid strings become NaT
Resulting DataFrame
['2023-01-01', '2023-02-01', '2023-03-03', NaT]
| Original String | Converted Datetime |
|---|---|
| 2023-01-01 | 2023-01-01 00:00:00 |
| 2023/02/01 | 2023-02-01 00:00:00 |
| March 3, 2023 | 2023-03-03 00:00:00 |
| invalid | NaT |
Why This Works
Step 1: Automatic Parsing
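pd.to_datetime() recognizes many common date formats automatically, so you often don't need to specify one. In pandas 2.0 and later it infers a single format from the first value and applies it to the whole input, so pass format='mixed' when the strings use different formats.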
Step 2: Handling Errors
Using errors='coerce' converts invalid or unparseable dates to NaT, which means 'Not a Time' or missing date.
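To make the difference concrete, here is a minimal sketch contrasting the default behaviour (errors='raise') with errors='coerce'. The sample values and variable names are made up for illustration.

```python
import pandas as pd

values = ["2023-01-01", "not a date"]

# Default behaviour (errors='raise'): an unparseable value raises a ValueError.
try:
    pd.to_datetime(values, format="%Y-%m-%d")
    raised = False
except ValueError:
    raised = True

# errors='coerce': the unparseable value becomes NaT instead of raising.
coerced = pd.to_datetime(values, format="%Y-%m-%d", errors="coerce")

print(raised)                   # True
print(coerced.isna().tolist())  # [False, True]
```

Because coercion silently replaces bad values, it can be worth counting the NaT entries afterwards (for example with isna().sum()) to see how much data failed to parse.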
Step 3: Assigning Back
You assign the converted datetime values back to your DataFrame column to replace the original strings with datetime objects.
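One reason to assign the result back is that a datetime column supports the .dt accessor. The sketch below assumes a small hypothetical DataFrame with consistently formatted dates; the month and weekday columns are added only for illustration.

```python
import pandas as pd

df = pd.DataFrame({"date": ["2023-01-01", "2023-02-01"]})

# Before conversion the column holds plain strings, so .dt is not available.
df["date"] = pd.to_datetime(df["date"], format="%Y-%m-%d")

# After conversion, the .dt accessor exposes date components.
df["month"] = df["date"].dt.month
df["weekday"] = df["date"].dt.day_name()

print(df["month"].tolist())    # [1, 2]
print(df["weekday"].tolist())  # ['Sunday', 'Wednesday']
```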
Alternative Approaches
# Alternative 1: pass an explicit format when every string follows the same pattern.
import pandas as pd

df = pd.DataFrame({'date': ['2023-01-01', '2023-02-01']})
df['date'] = pd.to_datetime(df['date'], format='%Y-%m-%d')
print(df)
# Alternative 2: parse each string with datetime.strptime via apply (slower, more control).
import pandas as pd
from datetime import datetime

df = pd.DataFrame({'date': ['2023-01-01', '2023-02-01']})
df['date'] = df['date'].apply(lambda x: datetime.strptime(x, '%Y-%m-%d'))
print(df)
Complexity: O(n) time, O(n) space
Time Complexity
The function processes each element once, so time grows linearly with the number of dates.
Space Complexity
It creates a new datetime array of the same size, so space usage is linear.
Which Approach is Fastest?
Specifying the exact format with format= is usually the fastest option because pandas can skip inferring the format.
| Approach | Time | Space | Best For |
|---|---|---|---|
| pd.to_datetime() auto | O(n) | O(n) | General use, unknown formats |
| pd.to_datetime() with format | O(n) | O(n) | Known fixed date formats, faster |
| apply with datetime.strptime | O(n) | O(n) | Custom parsing, slower, more control |
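The comparison above can be checked on your own data with a small timing sketch like the one below. It assumes a hypothetical Series of 10,000 ISO-formatted strings; the helper function names are illustrative, and actual timings depend on the pandas version and the machine, so none are shown here.

```python
import timeit
from datetime import datetime

import pandas as pd

# Hypothetical sample: 10,000 date strings in a single, known format.
dates = pd.Series(pd.date_range("2000-01-01", periods=10_000).strftime("%Y-%m-%d"))

def auto():
    # Let pandas infer the format.
    return pd.to_datetime(dates)

def with_format():
    # Give pandas the exact format up front.
    return pd.to_datetime(dates, format="%Y-%m-%d")

def with_strptime():
    # Parse each element in Python with datetime.strptime.
    return dates.apply(lambda x: datetime.strptime(x, "%Y-%m-%d"))

for name, fn in [("auto", auto), ("format", with_format), ("strptime", with_strptime)]:
    seconds = timeit.timeit(fn, number=3)
    print(f"{name}: {seconds:.4f} s for 3 runs")
```

All three functions produce the same dates, so the choice is mainly about speed and how much control over parsing you need.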
Use errors='coerce' in pd.to_datetime() to safely convert invalid dates to NaT without errors.
Omitting errors='coerce' causes the code to raise an error when invalid date strings are present.