Model Pipeline - One-hot encoding
One-hot encoding converts categorical values into numbers that a model can work with. Each category becomes its own column holding 1 where the category is present and 0 where it is not.
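To make the mechanism concrete, here is a minimal sketch in plain Python. The helper name `one_hot` and the choice to sort the categories are assumptions made for illustration; libraries may order columns differently.

```python
def one_hot(values):
    """Return (categories, rows); each row has a 1 in the column of its category."""
    # Sort the unique values so the column order is deterministic.
    categories = sorted(set(values))
    # Build one row per input value: 1 in the matching column, 0 elsewhere.
    rows = [
        [1 if value == category else 0 for category in categories]
        for value in values
    ]
    return categories, rows


categories, rows = one_hot(["red", "blue", "green", "blue"])
print(categories)  # ['blue', 'green', 'red']
for row in rows:
    print(row)
```

Each row contains exactly one 1, which is where the name "one-hot" comes from.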
One-hot encoding is a data preprocessing step, not a training process, so there are no per-epoch loss or accuracy values to report.
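Use pandas get_dummies() to perform one-hot encoding on a column.

    import pandas as pd

    colors = ['red', 'blue', 'green', 'blue']
    df = pd.DataFrame({'color': colors})
    encoded = pd.get_dummies(df['color'])
    print(encoded)

pd.get_dummies on a Series creates a DataFrame with one column per unique category, filled with 1s and 0s indicating presence (pandas 2.0 and later shows these as True/False unless dtype=int is passed).

With scikit-learn, OneHotEncoder expects 2-D input with one row per sample, so each value is wrapped in its own list:

    from sklearn.preprocessing import OneHotEncoder

    encoder = OneHotEncoder()
    # fit() needs shape (n_samples, n_features); a flat list of strings raises an error
    encoder.fit([['red'], ['blue'], ['green']])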
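Creating the encoder with OneHotEncoder(handle_unknown='ignore') lets you fit it on the training data and then encode test data safely: a category that was not seen during fitting is encoded as all zeros instead of raising an error.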
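The behavior can be sketched as follows. The training and test values are made up for illustration, and "purple" is deliberately absent from the training data.

```python
from sklearn.preprocessing import OneHotEncoder

# Hypothetical data: "purple" never appears in the training set.
train = [["red"], ["blue"], ["green"]]
test = [["blue"], ["purple"]]

# Unknown categories at transform time are ignored rather than raising an error.
encoder = OneHotEncoder(handle_unknown="ignore")
encoder.fit(train)

# transform() returns a sparse result by default; toarray() makes it dense.
encoded_test = encoder.transform(test).toarray()

print(encoder.categories_[0])  # categories learned from the training data
print(encoded_test)            # the "purple" row contains only zeros
```

With the default handle_unknown='error', the same transform call would fail on "purple", which is why the 'ignore' setting is the safer choice when test data may contain new categories.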