ColumnTransformer for mixed types in ML Python - Cheat Sheet & Quick Revision

Recall & Review
beginner
What is the purpose of ColumnTransformer in machine learning?

ColumnTransformer helps apply different data transformations to different columns in a dataset. This is useful when columns have mixed types, like numbers and categories.

beginner
How does ColumnTransformer handle numeric and categorical columns differently?

It allows you to specify separate transformers for numeric columns (like scaling) and categorical columns (like one-hot encoding) in one step.
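As a minimal sketch of that one-step setup, the example below scales two numeric columns and one-hot encodes one categorical column. The DataFrame, its column names, and the step names "num" and "cat" are made up for illustration; `sparse_threshold=0` is set only so the result is always a dense NumPy array that is easy to inspect.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Hypothetical toy data: two numeric columns and one categorical column.
df = pd.DataFrame({
    "age": [25, 32, 47, 51],
    "income": [40000, 52000, 88000, 61000],
    "city": ["Paris", "Lima", "Paris", "Oslo"],
})

preprocessor = ColumnTransformer(
    transformers=[
        # (name, transformer, columns it applies to)
        ("num", StandardScaler(), ["age", "income"]),
        ("cat", OneHotEncoder(), ["city"]),
    ],
    sparse_threshold=0,  # always return a dense array instead of a sparse matrix
)

X = preprocessor.fit_transform(df)

# 4 rows; 2 scaled numeric columns followed by 3 one-hot columns (Lima, Oslo, Paris).
print(X.shape)
```

The output columns follow the order of the `transformers` list, so the scaled numeric columns come first and the one-hot columns after them.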

intermediate
Why is it better to use ColumnTransformer instead of transforming columns separately?

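Using ColumnTransformer keeps all preprocessing in one object, so the same fitted transformations are applied to training and new data. This reduces mistakes such as transforming the wrong columns or reassembling them in the wrong order, and it plugs directly into a scikit-learn Pipeline.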
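One way this pipeline integration might look is sketched below. The data, labels, step names, and the choice of LogisticRegression are illustrative assumptions, not part of the original card. `handle_unknown="ignore"` is used so that a category unseen during fitting is encoded as all zeros instead of raising an error.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Hypothetical training data with one numeric and one categorical feature.
df = pd.DataFrame({
    "age": [22, 35, 58, 41, 29, 63],
    "plan": ["basic", "pro", "pro", "basic", "basic", "pro"],
})
y = [0, 1, 1, 0, 0, 1]

model = Pipeline(steps=[
    ("preprocess", ColumnTransformer(transformers=[
        ("num", StandardScaler(), ["age"]),
        ("cat", OneHotEncoder(handle_unknown="ignore"), ["plan"]),
    ])),
    ("classify", LogisticRegression()),
])

# Fitting the pipeline fits the preprocessing and the model together.
model.fit(df, y)

# New rows go through the already-fitted preprocessing before prediction.
new_rows = pd.DataFrame({"age": [30, 60], "plan": ["basic", "enterprise"]})
predictions = model.predict(new_rows)
print(predictions)
```

Because the preprocessing lives inside the pipeline, calling `predict` on raw rows is enough; there is no separate transform step to remember.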

intermediate
What happens if you don't specify remainder='passthrough' in ColumnTransformer?

Columns not listed in transformers are dropped from the output by default (remainder='drop'). Setting remainder='passthrough' keeps those columns unchanged and appends them after the transformed columns.
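The difference can be checked with a small sketch. The DataFrame and its column names below are hypothetical; only "height" is listed in the transformers, so "weight" and "visits" are the remaining columns.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import StandardScaler

# Hypothetical data: only "height" is assigned to a transformer.
df = pd.DataFrame({
    "height": [150.0, 160.0, 170.0],
    "weight": [50.0, 60.0, 70.0],
    "visits": [1, 2, 3],
})

# Default behaviour: unlisted columns are dropped.
dropping = ColumnTransformer([("scale", StandardScaler(), ["height"])])

# remainder="passthrough": unlisted columns are kept as they are.
keeping = ColumnTransformer(
    [("scale", StandardScaler(), ["height"])],
    remainder="passthrough",
)

dropped = dropping.fit_transform(df)
kept = keeping.fit_transform(df)

print(dropped.shape)  # (3, 1): only the scaled "height" column
print(kept.shape)     # (3, 3): scaled "height", then "weight" and "visits"
```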

beginner
Give an example of transformers used for numeric and categorical data in ColumnTransformer.

Numeric: StandardScaler() to scale numbers.
Categorical: OneHotEncoder() to convert categories into binary columns.
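To see what each of these two transformers does on its own, here is a short sketch on made-up arrays (the values and variable names are illustrative only). StandardScaler subtracts the column mean and divides by the standard deviation; OneHotEncoder creates one binary column per category, ordered alphabetically for string categories.

```python
import numpy as np
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Hypothetical numeric column: mean is 30, so the middle value scales to 0.
ages = np.array([[20.0], [30.0], [40.0]])
scaled = StandardScaler().fit_transform(ages)
print(scaled.ravel())

# Hypothetical categorical column with two categories.
colors = np.array([["red"], ["green"], ["red"]])
encoder = OneHotEncoder()
# OneHotEncoder returns a sparse matrix by default; convert it for display.
encoded = encoder.fit_transform(colors).toarray()

print(encoder.categories_[0])  # learned categories: 'green', 'red'
print(encoded)                 # one row per sample, one column per category
```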

What does ColumnTransformer do?
A. Splits data into training and testing sets
B. Trains multiple models at once
C. Visualizes data distributions
D. Applies different transformations to different columns
Which transformer is commonly used for numeric columns?
A. StandardScaler
B. OneHotEncoder
C. LabelEncoder
D. CountVectorizer
What does OneHotEncoder do for categorical data?
A. Converts categories into numbers 1, 2, 3...
B. Removes missing values
C. Creates binary columns for each category
D. Scales categories between 0 and 1
If you want to keep columns not transformed by ColumnTransformer, what parameter do you use?
A. remainder='passthrough'
B. keep='all'
C. drop=False
D. preserve=True
Why is using ColumnTransformer helpful in a machine learning pipeline?
A. It automatically selects the best model
B. It organizes preprocessing for mixed data types
C. It increases model accuracy by itself
D. It visualizes the data
Explain how ColumnTransformer helps when your dataset has both numeric and categorical columns.
Think about how you prepare numbers and categories differently.
Describe what happens if you do not set the remainder parameter in ColumnTransformer.
Consider what happens to columns you forget to mention.