Recall & Review

beginner

What is target encoding in machine learning?

Target encoding is a technique that replaces categorical values with the average of the target variable for those categories. It helps models use categorical data effectively.

Click to reveal answer

intermediate

Why do we use target encoding instead of one-hot encoding for high-cardinality features?

Target encoding reduces the number of new features created, avoiding a large increase in data size and helping models learn better from categories with many unique values.

Click to reveal answer

intermediate

How does target encoding help prevent overfitting?

By using techniques like smoothing and cross-validation, target encoding avoids leaking target information from the training set to the model, which helps prevent overfitting.

Click to reveal answer

advanced

What is smoothing in target encoding?

Smoothing blends the category's target mean with the overall target mean to reduce noise from categories with few samples, making the encoding more stable.

Click to reveal answer

beginner

Give a simple example of target encoding for a categorical feature with three categories and their target means.

If a feature has categories A, B, C with target means 0.2, 0.5, and 0.8 respectively, target encoding replaces A with 0.2, B with 0.5, and C with 0.8 in the dataset.

Click to reveal answer

What does target encoding replace a categorical value with?

AA unique integer ID

BA one-hot vector

CThe average target value for that category

DThe frequency of the category

Why is smoothing used in target encoding?

ATo reduce noise from categories with few samples

BTo increase the number of features

CTo speed up encoding

DTo convert numerical data to categorical

Which problem can target encoding help solve better than one-hot encoding?

AHandling missing values

BEncoding high-cardinality categorical features

CScaling numerical features

DReducing dataset size by removing features

What is a risk of using target encoding without precautions?

AModel underfitting

BLoss of categorical information

CSlower training time

DData leakage leading to overfitting

Which method helps avoid overfitting when applying target encoding?

AUsing cross-validation to compute encodings

BIgnoring rare categories

CUsing one-hot encoding instead

DNormalizing the target variable

Explain what target encoding is and why it is useful for categorical data.

Describe how smoothing and cross-validation help make target encoding more reliable.

Practice

(1/5)

1. What is the main purpose of target encoding in machine learning?

easy

A. Remove missing values from the dataset

B. Normalize numerical features to a 0-1 scale

C. Create new categorical features by combining existing ones

D. Convert categorical variables into numbers using the average target value

Target encoding in ML Python - Cheat Sheet & Quick Revision

Start learning this pattern below

Practice

Solution

Step 1: Understand what target encoding does

Step 2: Compare with other options

Final Answer:

Quick Check:

Solution

Step 1: Identify correct target encoding method

Step 2: Check code correctness

Final Answer:

Quick Check:

Solution

Step 1: Calculate mean target per category from training data

Step 2: Map test categories and fill missing

Final Answer:

Quick Check:

Solution

Step 1: Understand overfitting cause in target encoding

Step 2: Identify mistake in data leakage

Final Answer:

Quick Check:

Solution

Step 1: Understand overfitting from rare categories

Step 2: Apply smoothing to reduce noise

Final Answer:

Quick Check: