Date and time feature extraction helps us turn dates and times into useful numbers or categories. This makes it easier for machine learning models to understand patterns related to time.
Date and time feature extraction in ML Python
import pandas as pd

# Convert column to datetime type
df['date_column'] = pd.to_datetime(df['date_column'])

# Extract features
df['year'] = df['date_column'].dt.year
df['month'] = df['date_column'].dt.month
df['day'] = df['date_column'].dt.day
df['hour'] = df['date_column'].dt.hour
df['weekday'] = df['date_column'].dt.weekday  # Monday=0, Sunday=6
df['is_weekend'] = df['weekday'] >= 5  # True if Saturday or Sunday
Make sure your date column is in datetime format before extracting features.
You can extract many parts like year, month, day, hour, minute, second, weekday, and more.
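As a small illustration of the "and more" above, the sketch below pulls a few additional parts from a datetime column. The two sample timestamps are made up for this example; the accessors used (`minute`, `second`, `quarter`, `dayofyear`, `day_name()`) are standard pandas `.dt` members.

```python
import pandas as pd

# Hypothetical sample data, chosen only for illustration
df = pd.DataFrame(
    {"date": pd.to_datetime(["2024-06-01 14:30:15", "2024-12-31 23:59:59"])}
)

# Properties are read without parentheses
df["minute"] = df["date"].dt.minute
df["second"] = df["date"].dt.second
df["quarter"] = df["date"].dt.quarter
df["day_of_year"] = df["date"].dt.dayofyear

# day_name() is a method, so it does need parentheses
df["day_name"] = df["date"].dt.day_name()

print(df)
```

Note that most parts are properties, while a few such as `day_name()` are methods.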
df['date'] = pd.to_datetime(df['date'])
df['year'] = df['date'].dt.year
df['month'] = df['date'].dt.month

df['hour'] = df['date'].dt.hour
df['weekday'] = df['date'].dt.weekday

df['is_weekend'] = df['date'].dt.weekday >= 5
This program converts date strings to datetime and extracts year, month, day, hour, weekday, and weekend flag.
import pandas as pd

# Sample data with date strings
data = {'date': ['2024-06-01 14:30:00', '2024-06-02 09:15:00', '2024-06-08 20:45:00']}
df = pd.DataFrame(data)

# Convert to datetime
df['date'] = pd.to_datetime(df['date'])

# Extract features
df['year'] = df['date'].dt.year
df['month'] = df['date'].dt.month
df['day'] = df['date'].dt.day
df['hour'] = df['date'].dt.hour
df['weekday'] = df['date'].dt.weekday
df['is_weekend'] = df['weekday'] >= 5

print(df)
Weekday numbers start at 0 for Monday and end at 6 for Sunday.
Boolean features like 'is_weekend' help models learn special patterns for weekends.
Always check your date format before conversion to avoid errors.
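One way to act on the format advice above is to pass an explicit `format` to `pd.to_datetime` and turn unparseable values into `NaT` with `errors="coerce"`. This is only a sketch: the day-first strings and the bad value are invented for the example, and your data may need a different format string.

```python
import pandas as pd

# Hypothetical raw strings in day/month/year order, plus one bad value
raw = pd.Series(["01/06/2024", "02/06/2024", "not a date"])

# Stating the format avoids silently reading 01/06 as January 6
parsed = pd.to_datetime(raw, format="%d/%m/%Y", errors="coerce")

# Values that did not match the format become NaT; count them
n_bad = parsed.isna().sum()

print(parsed)
print("unparseable rows:", n_bad)
```

Counting the `NaT` values right after conversion makes format problems visible before any features are extracted.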
Date and time features turn raw dates into useful numbers for models.
Common features include year, month, day, hour, weekday, and weekend flags.
These features help models find patterns related to time and improve predictions.
Practice
Solution
Step 1: Understand date features
Date features include parts of a date like year, month, day, hour, and weekday.
Step 2: Identify relevant feature
Among the options, only 'Month' is a part of a date and useful for models.
Final Answer:
Month -> Option C
Quick Check:
Date feature = Month [OK]
- Choosing unrelated features like color or font size
- Confusing date features with unrelated data
'date'?
Solution
Step 1: Recall pandas datetime accessor
To extract weekday, use the .dt accessor followed by .weekday without parentheses.
Step 2: Check each option
df['weekday'] = df['date'].dt.weekday uses .dt.weekday correctly.
df['weekday'] = df['date'].weekday() calls weekday() directly on the series, which is invalid.
df['weekday'] = df['date'].weekday misses .dt.
df['weekday'] = df['date'].dt.weekday() incorrectly uses parentheses after .weekday.
Final Answer:
df['weekday'] = df['date'].dt.weekday -> Option A
Quick Check:
Use .dt.weekday without parentheses [OK]
- Calling weekday() as a method on series
- Missing .dt accessor
- Adding parentheses after .weekday
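The mistakes listed above can be reproduced in a few lines. The following is a minimal sketch with made-up dates; the variable names `called_error` and `missing_dt_error` are introduced here only to record which exception each mistake raises.

```python
import pandas as pd

# Hypothetical data: a Monday and a Saturday
df = pd.DataFrame({"date": pd.to_datetime(["2024-06-03", "2024-06-08"])})

# Correct: .dt.weekday is a property, read without parentheses
weekday = df["date"].dt.weekday

# Mistake: adding parentheses tries to call the resulting Series
called_error = None
try:
    df["date"].dt.weekday()
except TypeError as exc:
    called_error = type(exc).__name__

# Mistake: skipping .dt looks for 'weekday' on the Series itself
missing_dt_error = None
try:
    df["date"].weekday
except AttributeError as exc:
    missing_dt_error = type(exc).__name__

print(weekday.tolist(), called_error, missing_dt_error)
```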
import pandas as pd
df = pd.DataFrame({'date': pd.to_datetime(['2024-06-01 14:30', '2024-06-02 09:15'])})
df['hour'] = df['date'].dt.hour
df['is_weekend'] = df['date'].dt.weekday >= 5
print(df[['hour', 'is_weekend']].to_dict())
What is the printed output?
Solution
Step 1: Extract hour values
The first date has hour 14, second has hour 9, so 'hour' column is {0: 14, 1: 9}.
Step 2: Determine weekend flags
Weekday 5 and 6 are weekend. Dates are 2024-06-01 (Saturday=5) and 2024-06-02 (Sunday=6). Both are weekend, so 'is_weekend' should be True for both.
Step 3: Check code logic
Code uses df['date'].dt.weekday >= 5, which is True for both dates. So 'is_weekend' is {0: True, 1: True}.
Final Answer:
{'hour': {0: 14, 1: 9}, 'is_weekend': {0: True, 1: True}} -> Option B
Quick Check:
Weekend days are 5 or 6, both dates match [OK]
- Assuming weekend is false for Saturday/Sunday
- Mixing hour extraction with weekend logic
- Misreading weekday numbers
df['month'] = df['date'].month
What is the error and how to fix it?
Solution
Step 1: Understand pandas datetime access
Datetime properties like month must be accessed with .dt when working on a pandas Series.
Step 2: Identify error cause
Using df['date'].month tries to get the 'month' attribute of the Series, causing AttributeError.
Step 3: Correct code
Use df['date'].dt.month to extract month correctly.
Final Answer:
AttributeError because .month must be accessed via .dt; fix: df['date'].dt.month -> Option A
Quick Check:
Use .dt.month for pandas datetime columns [OK]
- Missing .dt accessor
- Trying to call .month() as a method
- Not converting column to datetime type
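The last mistake in the list, a column that was never converted, is worth seeing on its own, because `.dt` fails even when the strings look like dates. The sketch below uses invented sample strings and a hypothetical `raised` flag to show the failure and the fix.

```python
import pandas as pd

# Hypothetical column that still holds plain strings
df = pd.DataFrame({"date": ["2024-06-01", "2024-07-15"]})

# .dt is only available on datetime-like data, so this raises AttributeError
raised = False
try:
    df["date"].dt.month
except AttributeError:
    raised = True

# Fix: convert first, then use the .dt accessor
df["date"] = pd.to_datetime(df["date"])
df["month"] = df["date"].dt.month

print(raised, df["month"].tolist())
```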
'timestamp'. You want to create a feature that is 1 if the time is during business hours (9am to 5pm) on weekdays, else 0. Which code correctly creates this feature?
Solution
Step 1: Define business hours range
Business hours are from 9:00 (inclusive) to 17:00 (exclusive), so hour >= 9 and hour < 17.
Step 2: Define weekdays
Weekdays are Monday (0) to Friday (4), so weekday < 5.
Step 3: Combine conditions and convert to int
Use logical AND (&) to combine conditions and convert boolean to int with .astype(int).
Final Answer:
df['business_hours'] = ((df['timestamp'].dt.hour >= 9) & (df['timestamp'].dt.hour < 17) & (df['timestamp'].dt.weekday < 5)).astype(int) -> Option D
Quick Check:
Use inclusive start, exclusive end for hours and weekday < 5 [OK]
- Using >9 instead of >=9
- Including weekend days by using <=5
- Using <=17 instead of <17
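To check the boundary conditions behind these mistakes, the business-hours expression can be run on a handful of edge cases. The timestamps below are made up for illustration; they sit right at the 9:00 and 17:00 boundaries and include one weekend row.

```python
import pandas as pd

# Hypothetical edge cases around the boundaries
df = pd.DataFrame({"timestamp": pd.to_datetime([
    "2024-06-03 08:59",  # Monday, just before opening
    "2024-06-03 09:00",  # Monday, exactly at opening
    "2024-06-03 16:59",  # Monday, last minute of business hours
    "2024-06-03 17:00",  # Monday, exactly at closing
    "2024-06-08 10:00",  # Saturday, mid-morning
])})

hour = df["timestamp"].dt.hour
weekday = df["timestamp"].dt.weekday

# Inclusive start, exclusive end, Monday (0) through Friday (4)
df["business_hours"] = ((hour >= 9) & (hour < 17) & (weekday < 5)).astype(int)

print(df)
```

Each comparison needs its own parentheses because `&` binds more tightly than `>=` and `<` in Python.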
