
Label encoding in ML Python - ML Experiment: Train & Evaluate

Experiment - Label encoding
Problem: You have a dataset with categorical labels such as colors: ['red', 'green', 'blue', 'green', 'red']. You want to convert these text labels into numbers so a machine learning model can use them.
Current Metrics: N/A - The problem is about data preprocessing, not model accuracy yet.
Issue: Most machine learning models expect numeric input and cannot work directly with text labels. Without encoding, training will raise errors or the model will perform poorly.
Your Task
Convert the categorical labels into numeric form using label encoding so they can be used in a machine learning model.
Use only label encoding (no one-hot encoding).
Use sklearn's LabelEncoder.
Keep the encoded values in the same order as the labels in the original list.
Solution
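from sklearn.preprocessing import LabelEncoder

# Categorical text labels to encode
labels = ['red', 'green', 'blue', 'green', 'red']

# Learn the unique labels and convert each one to an integer in a single step
encoder = LabelEncoder()
encoded_labels = encoder.fit_transform(labels)

# Convert the NumPy array to a plain list for cleaner printing
print('Original labels:', labels)
print('Encoded labels:', encoded_labels.tolist())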
Imported LabelEncoder from sklearn.preprocessing.
Created a LabelEncoder instance.
Fitted the encoder on the original labels (the fit half of fit_transform).
Transformed the labels into numeric form (the transform half of the same call).
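To check what the encoder learned in the steps above, it can help to inspect the fitted object. The following is a minimal sketch, not part of the original solution: it assumes the same labels list and uses LabelEncoder's classes_ attribute and inverse_transform method. The names mapping and decoded_labels are introduced here only for illustration.

```python
from sklearn.preprocessing import LabelEncoder

labels = ['red', 'green', 'blue', 'green', 'red']

encoder = LabelEncoder()
encoded_labels = encoder.fit_transform(labels)

# classes_ holds the learned categories; the position of each entry is its code
mapping = {str(label): index for index, label in enumerate(encoder.classes_)}
print('Mapping:', mapping)

# inverse_transform maps the integer codes back to the original text labels
decoded_labels = [str(label) for label in encoder.inverse_transform(encoded_labels)]
print('Decoded labels:', decoded_labels)
```

Decoding this way is useful later, when a model predicts integer codes and you want to report the corresponding text labels.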
Results Interpretation

Before encoding: ['red', 'green', 'blue', 'green', 'red']

After encoding: [2, 1, 0, 1, 2]

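Label encoding converts text labels into numbers so models can process them. Each unique label gets a unique integer. LabelEncoder assigns the integers in sorted order of the labels, so here 'blue' becomes 0, 'green' becomes 1, and 'red' becomes 2. The numbers are identifiers only and do not mean that 'red' is greater than 'blue'.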
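The numbering in the result can be reproduced by hand, which may make the idea easier to see. This is a rough plain-Python sketch of the same idea, not sklearn's actual implementation. The names classes, code_for, and manual_encoded are hypothetical and exist only for this example.

```python
labels = ['red', 'green', 'blue', 'green', 'red']

# Collect the unique labels and sort them so the numbering is deterministic
classes = sorted(set(labels))

# Assign each unique label its position in the sorted list
code_for = {label: code for code, label in enumerate(classes)}

# Encode the labels one by one, preserving the original order
manual_encoded = [code_for[label] for label in labels]

print('Classes:', classes)
print('Encoded labels:', manual_encoded)
```

The encoded list has the same length and order as the input; only the representation of each entry changes.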
Bonus Experiment
Try using one-hot encoding on the same labels and compare the results with label encoding.
💡 Hint
Use sklearn's OneHotEncoder and observe how the output changes from single numbers to arrays representing each category.
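One possible starting point for the bonus experiment is sketched below. It assumes the same labels list. The reshape step is needed because OneHotEncoder expects a 2D input (one row per sample, one column per feature), and the call to toarray() converts the encoder's default sparse output to a dense array for printing. The variable names labels_2d, one_hot_encoder, and one_hot are illustrative.

```python
import numpy as np
from sklearn.preprocessing import OneHotEncoder

labels = ['red', 'green', 'blue', 'green', 'red']

# OneHotEncoder expects a 2D array: one row per sample, one column per feature
labels_2d = np.array(labels).reshape(-1, 1)

one_hot_encoder = OneHotEncoder()
# The default output is a sparse matrix; toarray() makes it dense for display
one_hot = one_hot_encoder.fit_transform(labels_2d).toarray()

# categories_ lists the learned categories, one array per input column
print('Categories:', one_hot_encoder.categories_[0].tolist())
print('One-hot rows:', one_hot.astype(int).tolist())
```

Each label becomes a row with a single 1 in the column for its category, so no numeric order between colors is implied.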