
ColumnTransformer for mixed types in ML Python - Practice Problems & Coding Challenges

Challenge - 5 Problems
Predict Output
intermediate
Output of ColumnTransformer with numeric and categorical features
What is the shape of the transformed data after applying this ColumnTransformer?
ML Python
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import StandardScaler, OneHotEncoder
import numpy as np

X = np.array([[1, 'red'], [2, 'blue'], [3, 'green']])

ct = ColumnTransformer(
    transformers=[
        ('num', StandardScaler(), [0]),
        ('cat', OneHotEncoder(), [1])
    ]
)

X_transformed = ct.fit_transform(X)
print(X_transformed.shape)
A. (3, 4)
B. (3, 3)
C. (3, 5)
D. (3, 2)
💡 Hint
Remember that OneHotEncoder creates one column per unique category.
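To see the hint in action on different data, here is a minimal sketch with a hypothetical two-category column (the array `sizes` is made up for illustration). The width of the encoded output matches the number of categories learned during `fit`, and a ColumnTransformer places that block next to the output of its other transformers.

```python
import numpy as np
from sklearn.preprocessing import OneHotEncoder

# Hypothetical single categorical column with two distinct values.
sizes = np.array([["small"], ["large"], ["small"], ["large"]])

encoder = OneHotEncoder()
encoded = encoder.fit_transform(sizes)  # sparse matrix by default

# categories_ holds one array of learned categories per input column.
n_categories = len(encoder.categories_[0])
print(n_categories, encoded.shape)
```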
Model Choice
intermediate
Choosing transformers for mixed data types
You have a dataset with numeric, categorical, and text columns. Which set of transformers is best to use in a ColumnTransformer?
A. RobustScaler for numeric, OrdinalEncoder for categorical, HashingVectorizer for text
B. StandardScaler for numeric, OneHotEncoder for categorical, TfidfVectorizer for text
C. MinMaxScaler for numeric, LabelEncoder for categorical, CountVectorizer for text
D. Normalizer for numeric, OneHotEncoder for categorical, LabelEncoder for text
💡 Hint
Text data needs vectorization, categorical data needs encoding, numeric data needs scaling.
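Once you have picked an answer, a sketch like the following can be used to try a scaler, an encoder, and a text vectorizer together. The DataFrame and its column names (`price`, `color`, `review`) are hypothetical, and this is only one workable combination. Note that the text column is given as a plain string rather than a list, because scikit-learn's text vectorizers expect a 1-D sequence of documents.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Hypothetical data: one numeric, one categorical, one free-text column.
df = pd.DataFrame({
    "price": [10.0, 20.0, 30.0],
    "color": ["red", "blue", "red"],
    "review": ["good value", "poor value", "good colour"],
})

ct = ColumnTransformer(
    transformers=[
        ("num", StandardScaler(), ["price"]),   # list -> 2-D input
        ("cat", OneHotEncoder(), ["color"]),    # list -> 2-D input
        ("txt", TfidfVectorizer(), "review"),   # string -> 1-D input
    ]
)

features = ct.fit_transform(df)

# Width = scaled columns + one-hot categories + vocabulary size.
vocab_size = len(ct.named_transformers_["txt"].vocabulary_)
print(features.shape, vocab_size)
```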
Hyperparameter
advanced
Effect of 'remainder' parameter in ColumnTransformer
What happens if you set remainder='passthrough' in a ColumnTransformer?
A. An error is raised if columns are missing in transformers
B. Columns not specified are dropped from the output
C. All columns are transformed regardless of specification
D. Columns not specified in transformers are passed through without changes
💡 Hint
Think about how to keep columns unchanged in the output.
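To check your answer, you can compare the output shape under the default `remainder='drop'` with `remainder='passthrough'`. This is a minimal sketch on a hypothetical all-numeric array in which only column 0 is listed in `transformers`.

```python
import numpy as np
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import StandardScaler

# Hypothetical data with three numeric columns; only column 0 is transformed.
X = np.array([[1.0, 10.0, 100.0],
              [2.0, 20.0, 200.0],
              [3.0, 30.0, 300.0]])

shapes = {}
outputs = {}
for remainder in ("drop", "passthrough"):
    ct = ColumnTransformer(
        transformers=[("num", StandardScaler(), [0])],
        remainder=remainder,
    )
    outputs[remainder] = ct.fit_transform(X)
    shapes[remainder] = outputs[remainder].shape
    print(remainder, shapes[remainder])
```

Comparing `outputs["passthrough"][:, 1:]` with `X[:, 1:]` shows what happened to the columns that were not listed.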
🔧 Debug
advanced
Error when fitting ColumnTransformer with incompatible data types
What error, if any, will this code raise, and why?
ML Python
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import StandardScaler, OneHotEncoder
import numpy as np

X = np.array([[1, 'red'], [2, 'blue'], [3, 4]])

ct = ColumnTransformer(
    transformers=[
        ('num', StandardScaler(), [0]),
        ('cat', OneHotEncoder(), [1])
    ]
)

ct.fit(X)
A. TypeError because column 1 has mixed types (strings and integers)
B. TypeError because StandardScaler cannot process strings
C. No error, the code runs successfully
D. IndexError because column 1 does not exist
💡 Hint
Check the data types in the categorical column.
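One way to follow the hint is to inspect what NumPy actually stores before any transformer sees the data. The sketch below builds the same nested list twice, once with the default dtype and once with `dtype=object` (the second variant is an addition for comparison and is not part of the question).

```python
import numpy as np

rows = [[1, "red"], [2, "blue"], [3, 4]]

# Default construction: NumPy picks a single dtype for the whole array.
mixed = np.array(rows)
print(mixed.dtype.kind)               # dtype kind of the resulting array
print(isinstance(mixed[2, 1], str))   # is the last cell stored as a string?

# Comparison variant: dtype=object keeps the original Python objects.
as_object = np.array(rows, dtype=object)
object_types = {type(v).__name__ for v in as_object[:, 1]}
print(object_types)
```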
🧠 Conceptual
expert
Why use ColumnTransformer in a machine learning pipeline?
Which is the best explanation for why ColumnTransformer is useful when working with mixed data types?
A. It replaces the need for feature selection by removing irrelevant columns
B. It automatically detects data types and chooses the best model
C. It allows applying different preprocessing steps to different columns in one step, simplifying pipelines
D. It converts all data to numeric format without any user input
💡 Hint
Think about handling numeric and categorical data differently.
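As a concrete illustration of handling numeric and categorical columns differently, here is a minimal sketch that places a ColumnTransformer in front of a classifier inside a `Pipeline`. The data, the column names (`age`, `city`), and the choice of `LogisticRegression` are all hypothetical.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Hypothetical toy dataset with one numeric and one categorical feature.
df = pd.DataFrame({
    "age": [22, 35, 47, 51, 29, 63],
    "city": ["Oslo", "Rome", "Oslo", "Rome", "Oslo", "Rome"],
})
y = [0, 0, 1, 1, 0, 1]

# Each transformer is applied only to the columns listed for it.
preprocess = ColumnTransformer(
    transformers=[
        ("num", StandardScaler(), ["age"]),
        ("cat", OneHotEncoder(handle_unknown="ignore"), ["city"]),
    ]
)

# Preprocessing and model are fitted together with a single call.
model = Pipeline(steps=[("preprocess", preprocess), ("clf", LogisticRegression())])
model.fit(df, y)
predictions = model.predict(df)
print(predictions)
```

Because the preprocessing lives inside the pipeline, the same fitted scaling and encoding are reapplied automatically whenever `predict` is called on new rows.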