ML Pythonml~20 mins

Target encoding in ML Python - Practice Problems & Coding Challenges

Choose your learning style10 modes available

Learn Why Deep Model Try Challenge Experiment Recall Metrics

Start learning this pattern below

Jump into concepts and practice - no test required

Recommended

Test this pattern10 questions across easy, medium, and hard to know if this pattern is strong

Challenge - 5 Problems

🎖️

Target Encoding Mastery

Get all challenges correct to earn this badge!

Test your skills under time pressure!

🧠 Conceptual

intermediate

2:00remaining

Why use target encoding instead of one-hot encoding?

Which of the following is the main advantage of using target encoding for categorical variables compared to one-hot encoding?

ATarget encoding always prevents overfitting better than one-hot encoding.

BTarget encoding converts categorical variables into multiple binary columns.

CTarget encoding reduces dimensionality by converting categories into a single numeric value based on the target variable.

DTarget encoding does not require the target variable during training.

Attempts:

2 left

❓ Predict Output

intermediate

2:00remaining

Output of target encoding with smoothing

Given the following data and target encoding with smoothing, what is the encoded value for category 'B'?

ML Python

import pandas as pd

data = pd.DataFrame({'category': ['A', 'B', 'B', 'C', 'A', 'B'], 'target': [1, 0, 1, 0, 1, 1]})

# Global mean of target
global_mean = data['target'].mean()

# Calculate category mean and count
category_stats = data.groupby('category')['target'].agg(['mean', 'count'])

# Smoothing factor
alpha = 2

# Calculate smoothed mean
category_stats['smoothed'] = (category_stats['mean'] * category_stats['count'] + global_mean * alpha) / (category_stats['count'] + alpha)

encoded_value_B = category_stats.loc['B', 'smoothed']
print(round(encoded_value_B, 3))

A0.5

B0.667

C0.75

D0.4

Attempts:

2 left

❓ Model Choice

advanced

2:00remaining

Choosing target encoding for model type

For which type of model is target encoding most beneficial compared to one-hot encoding?

ANeural networks that require one-hot encoded inputs.

BTree-based models like random forests that handle categorical variables natively.

CK-nearest neighbors models that rely on distance metrics.

DLinear regression models that assume numeric input features.

Attempts:

2 left

❓ Metrics

advanced

2:00remaining

Effect of target encoding on model evaluation metrics

What is a common risk when using target encoding without proper cross-validation, and how does it affect evaluation metrics?

AData leakage causing overly optimistic metrics like accuracy or RMSE on validation data.

BUnderfitting leading to poor training accuracy but good validation accuracy.

CIncreased model bias causing metrics to be consistently low on both train and test sets.

DNo effect on metrics because target encoding is independent of the target variable.

Attempts:

2 left

🔧 Debug

expert

2:00remaining

Debugging target encoding implementation error

Consider this code snippet for target encoding. What error will it raise when run?

ML Python

import pandas as pd

data = pd.DataFrame({'cat': ['x', 'y', 'x', 'z'], 'target': [1, 0, 1, 0]})

means = data.groupby('cat')['target'].mean()

# Incorrect: trying to map means to original data without converting to dict
encoded = data['cat'].map(means)

print(encoded)

ANo error; prints encoded values correctly.

BKeyError because 'cat' values are missing in means index.

CTypeError because map expects a dict but gets a Series.

DValueError due to mismatched index during mapping.

Attempts:

2 left

Practice

(1/5)

1. What is the main purpose of target encoding in machine learning?

easy

A. Remove missing values from the dataset

B. Normalize numerical features to a 0-1 scale

C. Create new categorical features by combining existing ones

D. Convert categorical variables into numbers using the average target value

Target encoding in ML Python - Practice Problems & Coding Challenges

Start learning this pattern below

Practice

Solution

Step 1: Understand what target encoding does

Step 2: Compare with other options

Final Answer:

Quick Check:

Solution

Step 1: Identify correct target encoding method

Step 2: Check code correctness

Final Answer:

Quick Check:

Solution

Step 1: Calculate mean target per category from training data

Step 2: Map test categories and fill missing

Final Answer:

Quick Check:

Solution

Step 1: Understand overfitting cause in target encoding

Step 2: Identify mistake in data leakage

Final Answer:

Quick Check:

Solution

Step 1: Understand overfitting from rare categories

Step 2: Apply smoothing to reduce noise

Final Answer:

Quick Check: