Recall & Review
beginner
What is a categorical variable in machine learning?
A categorical variable takes its values from a limited set of categories or groups, such as colors or types of animals, rather than measuring a numeric quantity.
beginner
Why can't machine learning models use categorical variables directly?
Most models work by doing arithmetic on their inputs, so categorical values must be converted to numbers before training.
beginner
What is one-hot encoding?
One-hot encoding creates one new column per category. Each row gets a 1 in the column for its own category and a 0 in all the others.
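To make the idea concrete, here is a minimal plain-Python sketch of one-hot encoding. The `colors` list is made-up example data, and the variable names are illustrative only. In practice a library helper such as `pandas.get_dummies` or scikit-learn's `OneHotEncoder` would usually do this work.

```python
# Hypothetical example data: one categorical feature with three categories.
colors = ["red", "green", "blue", "green"]

# Sort the unique categories so the column order is deterministic.
categories = sorted(set(colors))  # ['blue', 'green', 'red']

# One row per sample: 1 in the column matching its category, 0 elsewhere.
one_hot = [
    [1 if value == category else 0 for category in categories]
    for value in colors
]

for value, row in zip(colors, one_hot):
    print(value, row)
```

Every row contains exactly one 1, which is where the name "one-hot" comes from.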
intermediate
What is label encoding and when is it useful?
Label encoding assigns a unique integer to each category. It is most useful when the categories have a natural order, like small, medium, large, and the integers are assigned to follow that order.
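A short sketch of label encoding for ordered categories, assuming a hand-written order list. The names `sizes`, `size_order`, and `size_to_code` are hypothetical and chosen only for this example.

```python
# Hypothetical example data: sizes with a natural order.
sizes = ["small", "large", "medium", "small"]

# State the order explicitly so the codes follow it: small < medium < large.
size_order = ["small", "medium", "large"]
size_to_code = {name: code for code, name in enumerate(size_order)}

# Replace each category with its integer code.
encoded = [size_to_code[size] for size in sizes]
print(encoded)  # [0, 2, 1, 0]
```

Spelling out the order matters. An encoder that numbers categories alphabetically, as scikit-learn's `LabelEncoder` does, would give large=0, medium=1, small=2, which does not match the real ordering.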
intermediate
What problem can arise if you use label encoding on categories without order?
The model might treat the numbers as meaningful and assume that one category is larger than another, or closer to it. For unordered categories no such ordering exists.
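The sketch below illustrates the problem using made-up color data. It compares the "distance" between categories under label encoding and under one-hot encoding. The helper functions are hypothetical and exist only for this illustration.

```python
# Hypothetical example data: colors have no natural order.
colors = ["red", "green", "blue"]
ordered = sorted(colors)  # ['blue', 'green', 'red']

# Label encoding: codes follow alphabetical order (blue=0, green=1, red=2).
label_codes = {name: code for code, name in enumerate(ordered)}

# One-hot encoding: one vector per category.
one_hot = {name: [1 if name == other else 0 for other in ordered] for name in colors}

def label_distance(a, b):
    """Absolute difference between two label codes."""
    return abs(label_codes[a] - label_codes[b])

def one_hot_distance(a, b):
    """Sum of absolute differences between two one-hot vectors."""
    return sum(abs(x - y) for x, y in zip(one_hot[a], one_hot[b]))

# Label codes suggest red is farther from blue than from green.
print(label_distance("red", "blue"), label_distance("red", "green"))      # 2 1

# One-hot vectors keep every pair of categories equally far apart.
print(one_hot_distance("red", "blue"), one_hot_distance("red", "green"))  # 2 2
```

A model that relies on numeric differences could pick up the artificial gap in the label codes, which is why one-hot encoding is the usual choice for unordered categories.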
Which method creates new columns for each category with 1s and 0s?
When is label encoding most appropriate?
What is a risk of using label encoding on unordered categories?
Which of these is NOT a way to handle categorical variables?
Why do we need to convert categorical variables before training models?
Explain how one-hot encoding works and why it is useful for categorical variables.
Describe the difference between label encoding and one-hot encoding and when to use each.