
Handling categorical variables in ML Python - Cheat Sheet & Quick Revision

Recall & Review
beginner
What is a categorical variable in machine learning?
A categorical variable holds values that represent categories or groups, such as colors or types of animals, rather than numeric quantities.
beginner
Why can't machine learning models use categorical variables directly?
Most models perform arithmetic on their inputs, so categorical variables must be converted to numbers before training.
beginner
What is one-hot encoding?
One-hot encoding creates one new column per category. Each row gets a 1 in the column for its own category and a 0 in all the others.
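The idea can be sketched in plain Python. This is a minimal illustration, not a library implementation: the function name `one_hot_encode` and the `colors` data are made up for the example, and in practice a library routine would normally do this step.

```python
def one_hot_encode(values):
    """Return (categories, rows); each row has one 0/1 entry per category."""
    # Sort the distinct categories so the column order is deterministic.
    categories = sorted(set(values))
    rows = [
        [1 if value == category else 0 for category in categories]
        for value in values
    ]
    return categories, rows


colors = ["red", "green", "blue", "green"]  # hypothetical example data
categories, encoded = one_hot_encode(colors)

print(categories)  # ['blue', 'green', 'red']
for color, row in zip(colors, encoded):
    print(color, row)
# red [0, 0, 1]
# green [0, 1, 0]
# blue [1, 0, 0]
# green [0, 1, 0]
```

Every row contains exactly one 1, so no category looks larger or smaller than another.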
intermediate
What is label encoding and when is it useful?
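Label encoding assigns a unique integer to each category. It is useful when the categories have a natural order, like small, medium, large, provided the integers follow that order (for example small = 0, medium = 1, large = 2).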
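A small sketch of label encoding for ordered categories is shown here. The names `SIZE_ORDER` and `label_encode` and the sample data are assumptions made for the example; the order is passed in explicitly so the integers match the real ranking.

```python
# Hypothetical size column; the explicit order is an assumption of this example.
SIZE_ORDER = ["small", "medium", "large"]


def label_encode(values, ordered_categories):
    # Map each category to its position in the given order.
    mapping = {category: index for index, category in enumerate(ordered_categories)}
    # A value missing from ordered_categories raises KeyError.
    return [mapping[value] for value in values]


sizes = ["medium", "small", "large", "small"]
encoded_sizes = label_encode(sizes, SIZE_ORDER)
print(encoded_sizes)  # [1, 0, 2, 0]
```

Assigning integers alphabetically instead would give large = 0, medium = 1, small = 2, which does not match the size order, so the explicit list matters.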
intermediate
What problem can arise if you use label encoding on categories without order?
The model may treat the integers as a ranking or a magnitude and conclude that one category is bigger or better than another, which is meaningless for unordered categories.
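The effect can be seen by comparing gaps between encoded values. The sketch below uses made-up color data and hypothetical helper names (`label_gap`, `one_hot_gap`); it only illustrates the idea and is not how any particular model measures similarity.

```python
colors = ["blue", "green", "red"]  # hypothetical unordered categories

# Label encoding: integers assigned alphabetically.
label = {color: index for index, color in enumerate(sorted(colors))}

# One-hot encoding: one 0/1 position per category.
one_hot = {
    color: [1 if color == other else 0 for other in sorted(colors)]
    for color in colors
}


def label_gap(a, b):
    # Absolute difference between the two integer labels.
    return abs(label[a] - label[b])


def one_hot_gap(a, b):
    # Number of positions where the two one-hot vectors differ.
    return sum(x != y for x, y in zip(one_hot[a], one_hot[b]))


print(label_gap("blue", "green"), label_gap("blue", "red"))      # 1 2
print(one_hot_gap("blue", "green"), one_hot_gap("blue", "red"))  # 2 2
```

With label encoding, blue looks twice as far from red as from green, even though the colors have no order. With one-hot encoding, every pair of categories is equally far apart.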
Which method creates new columns for each category with 1s and 0s?
A. Normalization
B. One-hot encoding
C. Label encoding
D. Standardization
When is label encoding most appropriate?
A. For ordered categories like sizes
B. For categories with no order
C. For numerical data only
D. For text data without categories
What is a risk of using label encoding on unordered categories?
A. Model converts categories to text
B. Model ignores the categories
C. Model crashes during training
D. Model treats categories as ordered numbers
Which of these is NOT a way to handle categorical variables?
A. One-hot encoding
B. Label encoding
C. Min-max scaling
D. Frequency encoding
Why do we need to convert categorical variables before training models?
A. Models only understand numbers
B. Categorical data is always missing
C. Models prefer text data
D. Categorical data is too large
Explain how one-hot encoding works and why it is useful for categorical variables.
Describe the difference between label encoding and one-hot encoding and when to use each.