ML Python · ~10 mins

Pipeline best practices in ML Python - Interactive Code Practice

Practice - 5 Tasks
Answer the questions below
1. Fill in the blank (easy)

Complete the code to create a simple pipeline that scales data and fits a model.

ML Python
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression

pipeline = Pipeline([
    ('scaler', StandardScaler()),
    ('classifier', [1]())
])
Options:
A. KNeighborsClassifier
B. LogisticRegression
C. RandomForestClassifier
D. SVC
Common Mistakes
Choosing a model class that is not imported or not suitable for the pipeline step.
Forgetting to instantiate the model with parentheses.
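For reference, a minimal sketch of how a two-step pipeline like this one might be fitted and used once every step is an instantiated, imported estimator. The synthetic dataset from make_classification is an assumption added for illustration and is not part of the exercise.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic data, assumed here only so the example runs on its own.
X, y = make_classification(n_samples=200, n_features=5, random_state=0)

# Each step is a (name, estimator instance) pair; note the parentheses.
pipeline = Pipeline([
    ("scaler", StandardScaler()),
    ("classifier", LogisticRegression()),
])

# fit() scales X, then trains the classifier on the scaled values.
pipeline.fit(X, y)

# predict() applies the already-fitted scaler before classifying.
predictions = pipeline.predict(X[:5])
print(predictions.shape)
```

Calling fit and predict on the pipeline itself, rather than on the individual steps, keeps the scaling and the model in sync.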
2. Fill in the blank (medium)

Complete the code to split data into training and testing sets before building the pipeline.

ML Python
from sklearn.model_selection import [1]

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
Options:
A. KFold
B. cross_val_score
C. GridSearchCV
D. train_test_split
Common Mistakes
Using cross-validation functions instead of splitting data.
Not importing the correct function.
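The split matters because the pipeline should learn its scaling parameters from the training rows only. The following is a sketch under assumptions: the data is synthetic (make_classification), and the pipeline reuses the scaler and classifier from the first task.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic data, assumed for illustration only.
X, y = make_classification(n_samples=100, n_features=5, random_state=0)

# Hold out 20% of the rows; random_state makes the split reproducible.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

pipeline = Pipeline([
    ("scaler", StandardScaler()),
    ("classifier", LogisticRegression()),
])

# Fit on the training rows only, so the test rows stay unseen.
pipeline.fit(X_train, y_train)

# Accuracy on the held-out rows.
test_accuracy = pipeline.score(X_test, y_test)
print(X_train.shape, X_test.shape)  # 80 training rows, 20 test rows
```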
3. Fill in the blank (hard)

Fix the error in the pipeline code by completing the missing import for the feature selection step.

ML Python
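from sklearn.feature_selection import [1]
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

pipeline = Pipeline([
    ('scaler', StandardScaler()),
    # Keep the 10 highest-scoring features
    ('selector', SelectKBest(k=10)),
    ('classifier', LogisticRegression())
])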
Options:
A. VarianceThreshold
B. PCA
C. SelectKBest
D. RFE
Common Mistakes
Using PCA, which is a dimensionality reduction technique from sklearn.decomposition, not a feature selector from sklearn.feature_selection.
Using classes not imported or incompatible with the pipeline.
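To see what the selection step does once the pipeline is fitted, here is a small sketch. It assumes a synthetic dataset with 20 features and passes f_classif explicitly as the scoring function (this is also the default for SelectKBest).

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic data with 20 features, assumed for illustration only.
X, y = make_classification(n_samples=200, n_features=20, random_state=0)

pipeline = Pipeline([
    ("scaler", StandardScaler()),
    # Score each feature against y and keep the 10 best.
    ("selector", SelectKBest(score_func=f_classif, k=10)),
    ("classifier", LogisticRegression()),
])
pipeline.fit(X, y)

# Boolean mask over the 20 input features: True means "kept".
kept = pipeline.named_steps["selector"].get_support()
print(kept.sum())  # number of features passed on to the classifier
```

k must not exceed the number of input features, so k=10 needs at least 10 columns in X.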
4. Fill in the blanks (hard)

Fill both blanks to create a pipeline that scales data and performs cross-validation scoring.

ML Python
from sklearn.model_selection import [1]
from sklearn.preprocessing import [2]
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline

pipeline = Pipeline([
    ('scaler', [2]()),
    ('classifier', LogisticRegression())
])
scores = [1](pipeline, X, y, cv=5)
Options:
A. cross_val_score
B. train_test_split
C. StandardScaler
D. MinMaxScaler
Common Mistakes
Confusing train_test_split with cross_val_score.
Using MinMaxScaler instead of StandardScaler when standardization is needed.
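Passing the whole pipeline to the cross-validation function, instead of scaling X up front, means the scaler is refitted on the training portion of every fold, so no statistics leak from the validation rows. A minimal sketch, assuming synthetic data and StandardScaler as the scaler:

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic data, assumed for illustration only.
X, y = make_classification(n_samples=200, n_features=5, random_state=0)

pipeline = Pipeline([
    ("scaler", StandardScaler()),
    ("classifier", LogisticRegression()),
])

# cv=5 clones and refits the full pipeline five times, once per fold.
scores = cross_val_score(pipeline, X, y, cv=5)

print(len(scores))    # one score per fold
print(scores.mean())  # average accuracy across folds
```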
5. Fill in the blanks (hard)

Fill all three blanks to create a pipeline that imputes missing values, scales features, and fits a classifier.

ML Python
from sklearn.impute import [1]
from sklearn.preprocessing import [2]
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline

pipeline = Pipeline([
    ('imputer', [1]()),
    ('scaler', [2]()),
    ('classifier', LogisticRegression())
])
Options:
A. SimpleImputer
B. StandardScaler
C. MinMaxScaler
D. KNNImputer
Common Mistakes
Using KNNImputer without importing it properly.
Mixing up MinMaxScaler and StandardScaler in this context.
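The order of the steps matters here: imputation comes first so that the scaler and classifier never see missing values. The sketch below is illustrative only. It assumes synthetic data with roughly 10% of entries blanked out at random, and it picks SimpleImputer with the mean strategy and StandardScaler as one possible combination.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic data, assumed for illustration only.
X, y = make_classification(n_samples=200, n_features=5, random_state=0)

# Replace about 10% of the entries with NaN to simulate missing values.
rng = np.random.default_rng(0)
X_missing = X.copy()
X_missing[rng.random(X.shape) < 0.1] = np.nan

pipeline = Pipeline([
    ("imputer", SimpleImputer(strategy="mean")),  # fill NaN with column means
    ("scaler", StandardScaler()),                 # then standardize
    ("classifier", LogisticRegression()),         # then classify
])
pipeline.fit(X_missing, y)

# The learned per-column fill values are stored on the fitted imputer.
fill_values = pipeline.named_steps["imputer"].statistics_
predictions = pipeline.predict(X_missing)
print(fill_values.shape, predictions.shape)
```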