
Feature selection methods in ML Python - Practice Problems & Coding Challenges

Challenge - 5 Problems
🧠 Conceptual
intermediate
Which feature selection method uses model coefficients to select important features?

Imagine you have a dataset with many features. You want to pick the most important ones by looking at the model's learned weights. Which feature selection method does this?

A. Filter method using correlation coefficients
B. Wrapper method using recursive feature elimination
C. Dimensionality reduction using PCA
D. Embedded method using Lasso regression coefficients
💡 Hint

Think about methods that select features during model training by shrinking some coefficients to zero.
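As a rough illustration of the hint, the sketch below fits a Lasso model and keeps only the features whose coefficients were not shrunk to zero. The synthetic dataset, the `alpha` value, and the variable names are assumptions chosen for the example, not part of the question.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.feature_selection import SelectFromModel
from sklearn.linear_model import Lasso
from sklearn.preprocessing import StandardScaler

# Synthetic data (assumption): 20 features, only 5 carry signal.
X, y = make_regression(n_samples=200, n_features=20, n_informative=5,
                       noise=5.0, random_state=0)
X = StandardScaler().fit_transform(X)

# The L1 penalty shrinks some coefficients to zero during training;
# SelectFromModel then drops the features with (near-)zero coefficients.
selector = SelectFromModel(Lasso(alpha=1.0)).fit(X, y)
coef = selector.estimator_.coef_
X_selected = selector.transform(X)

print("non-zero coefficients:", np.count_nonzero(coef))
print("shape after selection:", X_selected.shape)
```

Selection happens as a side effect of training the model, which is what distinguishes an embedded method from a filter or wrapper.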

Predict Output
intermediate
What is the number of features selected by this code snippet?

Consider the following Python code using sklearn to select features based on univariate statistical tests.

from sklearn.datasets import load_iris
from sklearn.feature_selection import SelectKBest, f_classif

X, y = load_iris(return_X_y=True)
selector = SelectKBest(score_func=f_classif, k=2)
X_new = selector.fit_transform(X, y)
num_features = X_new.shape[1]
A. 3
B. 4
C. 2
D. 1
💡 Hint

Look at the parameter k in SelectKBest.

Model Choice
advanced
Which model is best suited for embedded feature selection in high-dimensional sparse data?

You have a dataset with thousands of features but only a few samples. You want a model that can select important features while training. Which model is best?

A. Lasso Regression
B. Random Forest Classifier
C. K-Nearest Neighbors
D. Support Vector Machine with RBF kernel
💡 Hint

Think about models that add penalties to reduce coefficients to zero.
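To make the "many features, few samples" setting concrete, here is a minimal sketch on synthetic data with 1000 features and 40 samples. The dataset sizes and the `alpha` value are assumptions for illustration; in practice the penalty strength would usually be tuned, for example by cross-validation.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Lasso

# Synthetic high-dimensional data (assumption): far more features than samples.
X, y = make_regression(n_samples=40, n_features=1000, n_informative=5,
                       noise=1.0, random_state=0)

# The L1 penalty drives most coefficients exactly to zero while fitting.
lasso = Lasso(alpha=1.0, max_iter=10000).fit(X, y)

n_kept = np.count_nonzero(lasso.coef_)
kept_idx = np.flatnonzero(lasso.coef_)
print(f"{n_kept} of {X.shape[1]} features have non-zero coefficients")
```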

Hyperparameter
advanced
What effect does increasing the 'k' parameter in SelectKBest have?

In the SelectKBest feature selection method, what happens if you increase the value of k?

A. More features are selected, possibly including less relevant ones
B. Fewer features are selected, focusing only on the top ones
C. The model automatically tunes k during training
D. The method switches from filter to wrapper approach
💡 Hint

Think about what k controls in feature selection.
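A quick way to see what k controls is to sweep it and watch the output shape. The sketch below does this on the iris dataset, which has 4 features; the loop and the `shapes` dictionary are illustrative additions.

```python
from sklearn.datasets import load_iris
from sklearn.feature_selection import SelectKBest, f_classif

X, y = load_iris(return_X_y=True)  # X has shape (150, 4)

shapes = {}
for k in (1, 2, 3, 4):
    # SelectKBest keeps the k features with the highest univariate scores.
    X_k = SelectKBest(score_func=f_classif, k=k).fit_transform(X, y)
    shapes[k] = X_k.shape
    print(k, X_k.shape)
```

The number of columns equals k in every case, so a larger k simply admits lower-ranked features.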

Metrics
expert
Which metric best evaluates feature selection effectiveness in classification?

You want to measure how well your feature selection improved your classification model. Which metric is most appropriate to compare before and after feature selection?

A. Mean squared error on training data
B. Model accuracy on a validation set
C. Number of features selected
D. Training time of the model
💡 Hint

Think about what shows if the model predicts better with selected features.
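One possible way to run such a before/after comparison is sketched below. The dataset, the classifier, and k=10 are assumptions chosen for the example. Placing the selector inside a pipeline means it is fitted on the training split only, so the validation score is not contaminated by the selection step.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)  # 30 features
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Baseline: all features.
baseline = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
# Candidate: keep the 10 best features by ANOVA F-score (k=10 is arbitrary).
selected = make_pipeline(
    StandardScaler(), SelectKBest(f_classif, k=10), LogisticRegression(max_iter=1000)
)

acc_all = baseline.fit(X_train, y_train).score(X_val, y_val)
acc_sel = selected.fit(X_train, y_train).score(X_val, y_val)
print(f"validation accuracy, all features: {acc_all:.3f}")
print(f"validation accuracy, 10 features:  {acc_sel:.3f}")
```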

Practice

1. Which of the following best describes the purpose of feature selection in machine learning?
easy
A. To choose the most important features to improve model performance
B. To increase the number of features in the dataset
C. To randomly remove features from the dataset
D. To convert features into labels for training

Solution

  1. Step 1: Understand feature selection goal

    Feature selection aims to pick the most useful features that help the model learn better.
  2. Step 2: Evaluate options

    Only option A correctly states that feature selection chooses important features to improve model performance. The other options add features, remove them at random, or confuse features with labels.
  3. Final Answer:

    To choose the most important features to improve model performance -> Option A
  4. Quick Check:

    Feature selection = pick important features [OK]
Hint: Feature selection picks useful features, not random or all [OK]
Common Mistakes:
  • Thinking feature selection adds features
  • Confusing feature selection with feature engineering
  • Believing feature selection changes labels
2. Which Python library provides the SelectKBest feature selection method?
easy
A. pandas
B. scikit-learn
C. numpy
D. matplotlib

Solution

  1. Step 1: Recall common ML libraries

    Scikit-learn is the main library for machine learning tools including feature selection.
  2. Step 2: Match method to library

    SelectKBest is part of scikit-learn's feature_selection module, not pandas, numpy, or matplotlib.
  3. Final Answer:

    scikit-learn -> Option B
  4. Quick Check:

    SelectKBest = scikit-learn [OK]
Hint: SelectKBest is from scikit-learn, not data or plotting libs [OK]
Common Mistakes:
  • Choosing pandas because it handles data
  • Confusing numpy with ML feature tools
  • Selecting matplotlib which is for plotting
3. What will be the output shape of features after applying VarianceThreshold(threshold=0.1) on a dataset with shape (100, 5) where only 3 features have variance above 0.1?
medium
A. (5, 100)
B. (100, 5)
C. (3, 100)
D. (100, 3)

Solution

  1. Step 1: Understand VarianceThreshold effect

    VarianceThreshold removes features whose variance does not exceed the threshold, keeping only those with variance strictly above it.
  2. Step 2: Apply to given data

    Since 3 features have variance above 0.1, only those 3 remain. The number of samples (100) stays the same.
  3. Final Answer:

    (100, 3) -> Option D
  4. Quick Check:

    VarianceThreshold keeps features with variance > threshold [OK]
Hint: Output shape keeps rows, columns = features passing threshold [OK]
Common Mistakes:
  • Confusing rows and columns in shape
  • Assuming all features remain
  • Thinking variance threshold changes sample count
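The scenario in question 3 can be reproduced with a small sketch. The synthetic data below is an assumption built to match the question: three columns with variance near 1 and two columns with variance near 0.01.

```python
import numpy as np
from sklearn.feature_selection import VarianceThreshold

rng = np.random.default_rng(0)
high = rng.normal(scale=1.0, size=(100, 3))  # variance around 1.0
low = rng.normal(scale=0.1, size=(100, 2))   # variance around 0.01
X = np.hstack([high, low])                   # shape (100, 5)

selector = VarianceThreshold(threshold=0.1)
X_reduced = selector.fit_transform(X)

print(selector.variances_.round(3))  # per-feature variances
print(X_reduced.shape)               # (100, 3)
```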
4. Consider this code snippet:
from sklearn.feature_selection import RFE
from sklearn.linear_model import LogisticRegression

model = LogisticRegression()
rfe = RFE(model, n_features_to_select=2)
rfe.fit(X, y)
selected = rfe.transform(X)
print(selected.shape)
If X has shape (50, 4), this code as written prints (50, 2). If the printed shape is (50, 4) instead, what is the likely error?
medium
A. RFE does not reduce features automatically
B. n_features_to_select is greater than number of features
C. RFE was not fitted before transform
D. LogisticRegression model is incompatible with RFE

Solution

  1. Step 1: Understand RFE usage

    RFE repeatedly fits the model and drops the weakest feature until only n_features_to_select remain; transform then keeps those columns.
  2. Step 2: Check given code and output

    With n_features_to_select=2 the output would be (50, 2). An unchanged shape of (50, 4) means no feature was eliminated.
  3. Step 3: Identify cause

    RFE eliminates nothing when n_features_to_select is at least the number of features in X (4 here). Calling transform on an unfitted RFE raises NotFittedError rather than returning the data unchanged, and LogisticRegression works with RFE because it exposes coef_.
  4. Final Answer:

    n_features_to_select is greater than number of features -> Option B
  5. Quick Check:

    n_features_to_select >= number of features = nothing removed [OK]
Hint: An unchanged shape means RFE had nothing to eliminate [OK]
Common Mistakes:
  • Expecting an unfitted RFE to return the data unchanged instead of raising an error
  • Setting n_features_to_select without checking the number of columns in X
  • Thinking model type causes shape issue
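The behaviour of RFE in this question can be checked directly. The sketch below uses a synthetic dataset of shape (50, 4) as a stand-in for X and y, which the question does not define. It compares the output shape for two values of n_features_to_select and shows what happens when transform is called without fit.

```python
from sklearn.datasets import make_classification
from sklearn.exceptions import NotFittedError
from sklearn.feature_selection import RFE
from sklearn.linear_model import LogisticRegression

# Stand-in data (assumption): 50 samples, 4 features.
X, y = make_classification(n_samples=50, n_features=4, n_informative=2,
                           n_redundant=0, random_state=0)

# Asking for 2 features: RFE eliminates the 2 weakest.
rfe_two = RFE(LogisticRegression(), n_features_to_select=2).fit(X, y)
shape_two = rfe_two.transform(X).shape

# Asking for as many features as X has: nothing is eliminated.
rfe_all = RFE(LogisticRegression(), n_features_to_select=4).fit(X, y)
shape_all = rfe_all.transform(X).shape

# Calling transform without fit raises instead of returning X unchanged.
try:
    RFE(LogisticRegression(), n_features_to_select=2).transform(X)
    raised = False
except NotFittedError:
    raised = True

print(shape_two, shape_all, raised)  # (50, 2) (50, 4) True
```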
5. You have a dataset with 10 features, but 4 are highly correlated and 2 have very low variance. Which feature selection approach best improves model simplicity and speed?
hard
A. Apply VarianceThreshold to remove low variance, then use correlation filter to drop correlated features
B. Use RFE with all features and keep all 10
C. Use SelectKBest to pick top 6 features by univariate scores
D. Randomly drop 4 features to reduce dimensionality

Solution

  1. Step 1: Identify problem features

    Low variance features add little info; correlated features add redundancy.
  2. Step 2: Choose method to remove both

    VarianceThreshold removes low variance features; correlation filter removes redundant correlated features.
  3. Step 3: Evaluate options

    Option A combines both filters, so it removes the low-variance features and the redundant correlated ones. SelectKBest scores each feature on its own and can keep several correlated features, while random dropping and keeping all 10 features do not address the problem.
  4. Final Answer:

    Apply VarianceThreshold to remove low variance, then use correlation filter to drop correlated features -> Option A
  5. Quick Check:

    Remove low variance + correlated features = simpler model [OK]
Hint: Combine variance and correlation filters for best feature reduction [OK]
Common Mistakes:
  • Using only one method ignoring other feature issues
  • Randomly dropping features without reason
  • Keeping all features with RFE without reduction
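The two-step approach from question 5 could look roughly like the sketch below. The synthetic DataFrame, the column names, and both thresholds (0.01 for variance, 0.9 for absolute correlation) are assumptions chosen to mirror the question: 4 highly correlated features, 2 low-variance features, and 4 independent ones.

```python
import numpy as np
import pandas as pd
from sklearn.feature_selection import VarianceThreshold

rng = np.random.default_rng(0)
n = 200
base = rng.normal(size=n)

# Hypothetical dataset with 10 features.
data = {f"corr_{i}": base + rng.normal(scale=0.05, size=n) for i in range(4)}
data.update({f"lowvar_{i}": rng.normal(scale=0.01, size=n) for i in range(2)})
data.update({f"indep_{i}": rng.normal(size=n) for i in range(4)})
df = pd.DataFrame(data)

# Step 1: drop features whose variance does not exceed the threshold.
vt = VarianceThreshold(threshold=0.01).fit(df)
df_var = df.loc[:, vt.get_support()]

# Step 2: for each highly correlated pair, drop the later column.
corr = df_var.corr().abs()
upper = corr.where(np.triu(np.ones(corr.shape, dtype=bool), k=1))
to_drop = [col for col in upper.columns if (upper[col] > 0.9).any()]
df_final = df_var.drop(columns=to_drop)

print(df.shape, df_var.shape, df_final.shape)  # (200, 10) (200, 8) (200, 5)
print(list(df_final.columns))
```

Only the upper triangle of the correlation matrix is inspected so that one feature from each correlated group survives rather than all of them being dropped.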