Challenge - 5 Problems
SciPy Pipeline Master
Get all challenges correct to earn this badge!
Test your skills under time pressure!
❓ Predict Output
intermediate · 2:00 remaining
Output of a simple pipeline with StandardScaler and LogisticRegression
What is the output of the following code snippet that creates a pipeline with a scaler and logistic regression, then fits and predicts on a test sample?
Python
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression
import numpy as np

X_train = np.array([[1, 2], [2, 3], [3, 4], [4, 5]])
y_train = np.array([0, 0, 1, 1])

pipeline = Pipeline([
    ('scaler', StandardScaler()),
    ('logreg', LogisticRegression(random_state=0))
])
pipeline.fit(X_train, y_train)

X_test = np.array([[1.5, 2.5]])
prediction = pipeline.predict(X_test)
print(prediction)
Attempts: 2 left
💡 Hint
Think about how StandardScaler transforms the input and how logistic regression predicts based on training labels.
✗ Incorrect
The pipeline first scales the input features; the logistic regression model, trained on the scaled training data, then predicts class 0 for the test input [1.5, 2.5]. The printed output is [0].
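The result can be checked directly. A minimal sketch that re-runs the snippet and also inspects the intermediate scaled value (accessing the scaler via named_steps is an addition, not part of the original snippet):

```python
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression
import numpy as np

X_train = np.array([[1, 2], [2, 3], [3, 4], [4, 5]])
y_train = np.array([0, 0, 1, 1])

pipeline = Pipeline([
    ('scaler', StandardScaler()),
    ('logreg', LogisticRegression(random_state=0))
])
pipeline.fit(X_train, y_train)

# The scaler maps [1.5, 2.5] below each column's training mean (2.5 and 3.5),
# i.e. into the region occupied by the class-0 training samples.
scaled = pipeline.named_steps['scaler'].transform([[1.5, 2.5]])
print(scaled)                           # both values negative (about -0.894)
print(pipeline.predict([[1.5, 2.5]]))   # [0]
```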
❓ Data Output
intermediate · 2:00 remaining
Shape of transformed data after applying PCA in a pipeline
Given the following pipeline that applies PCA to reduce dimensionality, what is the shape of the transformed data after calling transform on X_test?
Python
from sklearn.pipeline import Pipeline
from sklearn.decomposition import PCA
import numpy as np

X_train = np.random.rand(10, 5)
X_test = np.random.rand(3, 5)

pipeline = Pipeline([
    ('pca', PCA(n_components=2))
])
pipeline.fit(X_train)
X_transformed = pipeline.transform(X_test)
print(X_transformed.shape)
Attempts: 2 left
💡 Hint
PCA reduces the number of features to n_components but keeps the number of samples the same.
✗ Incorrect
The transform method returns data with the same number of samples (3) but reduced features (2). So the shape is (3, 2).
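A sketch confirming this shape rule (a seeded generator replaces np.random.rand so the run is reproducible; the shapes are what matter):

```python
from sklearn.pipeline import Pipeline
from sklearn.decomposition import PCA
import numpy as np

rng = np.random.default_rng(0)
X_train = rng.random((10, 5))
X_test = rng.random((3, 5))

# PCA keeps the sample axis unchanged and shrinks the feature axis
# from 5 columns down to n_components=2.
pipeline = Pipeline([('pca', PCA(n_components=2))])
pipeline.fit(X_train)
print(pipeline.transform(X_test).shape)  # (3, 2)
```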
🔧 Debug
advanced · 2:00 remaining
Identify the error in pipeline usage with SciPy function
What error will this code raise when trying to use a SciPy function inside a scikit-learn pipeline step?
Python
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import FunctionTransformer
from scipy.special import expit
import numpy as np

X = np.array([[0, 1], [2, 3]])

pipeline = Pipeline([
    ('sigmoid', FunctionTransformer(expit)),
])
pipeline.fit(X)
output = pipeline.transform(X)
print(output)
Attempts: 2 left
💡 Hint
FunctionTransformer wraps a function to apply it during transform. Check if expit works element-wise.
✗ Incorrect
No error is raised. FunctionTransformer correctly applies the SciPy expit function element-wise during transform, so the output is the sigmoid-transformed array.
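A quick way to see why: FunctionTransformer is stateless by default, so fit() is essentially a no-op and transform() just calls the wrapped function on the array. The sketch below verifies the output matches calling expit directly:

```python
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import FunctionTransformer
from scipy.special import expit
import numpy as np

X = np.array([[0, 1], [2, 3]])

# expit is vectorized, so it applies element-wise to the 2x2 array;
# fit() learns nothing here and transform() simply forwards to expit.
pipeline = Pipeline([('sigmoid', FunctionTransformer(expit))])
pipeline.fit(X)
output = pipeline.transform(X)
print(output)  # same as expit(X); note expit(0) == 0.5
```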
🚀 Application
advanced · 2:00 remaining
Using a custom SciPy statistical function in a pipeline step
You want to add a pipeline step that replaces each feature with its z-score using SciPy's zscore function. Which pipeline step code correctly applies this transformation?
Attempts: 2 left
💡 Hint
zscore needs axis=0 to standardize features column-wise.
✗ Incorrect
Option C correctly wraps scipy.stats.zscore in a FunctionTransformer with axis=0 so that each feature (column) is standardized. Of the incorrect options: one omits the axis argument (zscore does default to axis=0, but relying on the default is fragile when column-wise standardization is the explicit intent); one uses StandardScaler, which behaves similarly but is not a SciPy function as the task requires; and one passes axis=1, which standardizes rows rather than features.
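A sketch of the kind of step the correct option describes — the exact option text is not reproduced above, so the use of kw_args to forward axis=0 is an assumption about its shape:

```python
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import FunctionTransformer
from scipy.stats import zscore
import numpy as np

X = np.array([[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]])

# kw_args forwards axis=0 to zscore, so each column (feature) is
# standardized independently to mean 0 and (population) std 1.
pipeline = Pipeline([
    ('zscore', FunctionTransformer(zscore, kw_args={'axis': 0}))
])
Z = pipeline.fit_transform(X)
print(Z.mean(axis=0))  # ~[0, 0]
print(Z.std(axis=0))   # ~[1, 1]
```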
🧠 Conceptual
expert · 2:00 remaining
Why integrate SciPy functions in scikit-learn pipelines?
What is the main advantage of integrating SciPy functions inside scikit-learn pipelines using FunctionTransformer?
Attempts: 2 left
💡 Hint
Think about how pipelines help organize multiple steps in machine learning workflows.
✗ Incorrect
FunctionTransformer wraps SciPy functions so they can be used as pipeline steps, allowing smooth chaining of preprocessing and modeling steps in scikit-learn.
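For instance, a SciPy function can sit alongside scikit-learn transformers and an estimator in one pipeline, so a single fit/predict call runs the whole chain and the combined object can be cross-validated or grid-searched as one estimator. A minimal sketch with illustrative data:

```python
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import FunctionTransformer, StandardScaler
from sklearn.linear_model import LogisticRegression
from scipy.special import expit
import numpy as np

X = np.array([[0.0, 1.0], [2.0, 3.0], [4.0, 5.0], [6.0, 7.0]])
y = np.array([0, 0, 1, 1])

# The SciPy function becomes just another named step: preprocessing
# and modeling travel together, which prevents train/test leakage and
# keeps the whole workflow reusable as a single object.
clf = Pipeline([
    ('sigmoid', FunctionTransformer(expit)),
    ('scaler', StandardScaler()),
    ('logreg', LogisticRegression(random_state=0)),
])
clf.fit(X, y)
print(clf.predict(X))
```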