Challenge - 5 Problems
Categorical Variable Mastery
🧠 Conceptual
Difficulty: intermediate
Why use one-hot encoding for categorical variables?
Why is one-hot encoding commonly used to handle categorical variables in machine learning?
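For readers who want to see what one-hot encoding does mechanically before answering, here is a minimal library-free sketch. The helper `one_hot_encode` and the sample `colors` list are illustrative names and are not part of the challenge.

```python
def one_hot_encode(values):
    """Return (categories, rows); each row holds a single 1 marking its category."""
    # Sort the unique values so the column order is deterministic.
    categories = sorted(set(values))
    index = {category: i for i, category in enumerate(categories)}
    rows = []
    for value in values:
        row = [0] * len(categories)
        row[index[value]] = 1
        rows.append(row)
    return categories, rows


colors = ["red", "blue", "green", "blue"]
categories, rows = one_hot_encode(colors)

print(categories)  # ['blue', 'green', 'red']
for color, row in zip(colors, rows):
    print(color, row)
# red [0, 0, 1]
# blue [1, 0, 0]
# green [0, 1, 0]
# blue [1, 0, 0]
```

Each category ends up in its own 0/1 column, and every row contains exactly one 1.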
❓ Predict Output
Difficulty: intermediate
Output of label encoding on a categorical list
What is the output of the following Python code using sklearn's LabelEncoder?
ML Python
from sklearn.preprocessing import LabelEncoder

le = LabelEncoder()
categories = ['red', 'blue', 'green', 'blue', 'red']
encoded = le.fit_transform(categories)
print(list(encoded))
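As a hint for reasoning about the snippet above, the following sketch mimics label encoding in plain Python on a different list. It assumes integer codes are assigned by the sorted order of the unique labels, which is how `LabelEncoder` builds its `classes_` attribute. The helper `label_encode` and the `sizes` data are hypothetical.

```python
def label_encode(values):
    """Map each label to its position in the sorted list of unique labels."""
    classes = sorted(set(values))
    code_of = {label: code for code, label in enumerate(classes)}
    return classes, [code_of[value] for value in values]


sizes = ["small", "large", "medium", "large"]
classes, codes = label_encode(sizes)

print(classes)  # ['large', 'medium', 'small']
print(codes)    # [2, 0, 1, 0]
```

Note that the codes follow alphabetical order, not the natural small < medium < large order of the labels.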
❓ Model Choice
Difficulty: advanced
Best model choice for high-cardinality categorical data
You have a dataset with a categorical feature containing 10,000 unique categories. Which model is best suited to handle this feature without extensive preprocessing?
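To get a feel for why cardinality matters in this question, the sketch below counts what one-hot encoding would produce for a feature with 10,000 unique categories. The row count of 50,000 and the `cat_<n>` labels are assumptions made up for illustration; the challenge does not state the dataset size.

```python
n_rows = 50_000          # assumed dataset size, not given in the question
n_categories = 10_000    # from the question

# Hypothetical feature whose values cycle through all 10,000 categories.
feature = [f"cat_{i % n_categories}" for i in range(n_rows)]

one_hot_columns = len(set(feature))        # one column per unique category
dense_cells = n_rows * one_hot_columns     # cells in a dense one-hot matrix
nonzero_cells = n_rows                     # exactly one 1 per row

print(f"one-hot columns: {one_hot_columns:,}")   # one-hot columns: 10,000
print(f"dense cells:     {dense_cells:,}")       # dense cells:     500,000,000
print(f"nonzero cells:   {nonzero_cells:,}")     # nonzero cells:   50,000
```

Under these assumptions only 1 cell in 10,000 is nonzero, which is the kind of blow-up that "extensive preprocessing" refers to.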
❓ Hyperparameter
Difficulty: advanced
Choosing encoding method for tree-based models
Which encoding method is generally preferred for categorical variables when using tree-based models like Random Forest or XGBoost?
❓ Metrics
Difficulty: expert
Evaluating impact of encoding on model performance
You trained two models on the same dataset: Model A uses one-hot encoding for categorical variables, Model B uses target encoding. Both models are gradient boosting classifiers. Model A has 85% accuracy, Model B has 88% accuracy on the test set. What is the most likely explanation?
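For context on what Model B's encoding does, here is a minimal sketch of target (mean) encoding in plain Python. It is an unsmoothed version written for illustration. The function names, the city data, and the fallback to the global mean for unseen categories are all assumptions, not details taken from the challenge.

```python
from collections import defaultdict


def fit_target_encoding(categories, targets):
    """Learn the mean target value per category from training data."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for category, target in zip(categories, targets):
        sums[category] += target
        counts[category] += 1
    means = {category: sums[category] / counts[category] for category in sums}
    global_mean = sum(targets) / len(targets)
    return means, global_mean


def apply_target_encoding(categories, means, global_mean):
    """Replace each category with its learned mean; unseen ones get the global mean."""
    return [means.get(category, global_mean) for category in categories]


# Hypothetical training data: a city feature and a binary label.
train_city = ["paris", "paris", "rome", "rome", "rome", "oslo"]
train_y = [1, 0, 1, 1, 0, 0]
means, global_mean = fit_target_encoding(train_city, train_y)

test_city = ["rome", "oslo", "lima"]  # "lima" never appears in training
encoded = apply_target_encoding(test_city, means, global_mean)

print({city: round(mean, 3) for city, mean in means.items()})
# {'paris': 0.5, 'rome': 0.667, 'oslo': 0.0}
print([round(value, 3) for value in encoded])
# [0.667, 0.0, 0.5]
```

The mapping here is fit on the training rows only. Fitting it on rows that are later used for evaluation would leak the target into the feature, which is worth keeping in mind when comparing the two accuracy figures.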