
Recursive feature elimination in ML Python - Practice Problems & Coding Challenges

Challenge - 5 Problems
🧠 Conceptual
intermediate
What is the main goal of Recursive Feature Elimination (RFE)?

Imagine you have many features in your dataset. What does RFE try to do with these features?

A. It tries to select the most important features by removing the least important ones step by step.
B. It creates new features by combining existing ones to improve model accuracy.
C. It randomly removes features to reduce dataset size without considering importance.
D. It increases the number of features by duplicating existing ones to add more data.
💡 Hint

Think about how RFE helps simplify the model by focusing on key features.

Predict Output
intermediate
What is the output shape of X after applying RFE with 3 features?

Given the code below, what will be the shape of X_rfe?

from sklearn.datasets import load_iris
from sklearn.feature_selection import RFE
from sklearn.linear_model import LogisticRegression

iris = load_iris()
X = iris.data
Y = iris.target

model = LogisticRegression(max_iter=200)
rfe = RFE(model, n_features_to_select=3)
rfe = rfe.fit(X, Y)
X_rfe = rfe.transform(X)
print(X_rfe.shape)
A. (4, 150)
B. (150, 4)
C. (3, 150)
D. (150, 3)
💡 Hint

RFE reduces the number of features but keeps the number of samples the same.

Model Choice
advanced
Which model is best suited for RFE in this scenario?

You want to use RFE to select features for a classification task with a small dataset. Which model below is most appropriate to use with RFE?

A. Logistic Regression with L2 regularization
B. K-Nearest Neighbors without feature importance
C. K-Means clustering
D. Principal Component Analysis (PCA)
💡 Hint

RFE needs a model that can provide feature importance or coefficients.
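
The hint above can be checked directly. This is a minimal sketch, assuming the iris data and default estimator settings; the helper name exposes_importance is made up for illustration. It tests whether a fitted estimator has either of the two attributes RFE looks for by default.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)


def exposes_importance(estimator):
    """Hypothetical helper: True if a fitted estimator has coef_ or feature_importances_."""
    return hasattr(estimator, "coef_") or hasattr(estimator, "feature_importances_")


log_reg = LogisticRegression(max_iter=200).fit(X, y)
knn = KNeighborsClassifier().fit(X, y)

log_reg_ok = exposes_importance(log_reg)  # linear model: has coef_
knn_ok = exposes_importance(knn)          # distance-based model: has neither

print("LogisticRegression exposes importance:", log_reg_ok)
print("KNeighborsClassifier exposes importance:", knn_ok)
```

An estimator without either attribute cannot rank features on its own, so RFE has nothing to eliminate by.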

Hyperparameter
advanced
What does the hyperparameter n_features_to_select control in RFE?

In Recursive Feature Elimination, what is the role of n_features_to_select?

A. It specifies the percentage of features to remove at each step.
B. It sets the number of features to keep after elimination.
C. It controls the number of iterations to run the model.
D. It defines the minimum importance score for features to be kept.
💡 Hint

Think about how many features you want to end up with after RFE finishes.
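
As a rough illustration, the sketch below fits RFE on the four-feature iris data and reads back how many features survive through the n_features_ attribute. The helper kept_features is hypothetical. When n_features_to_select is left as None, scikit-learn keeps half of the features.

```python
from sklearn.datasets import load_iris
from sklearn.feature_selection import RFE
from sklearn.linear_model import LogisticRegression

X, y = load_iris(return_X_y=True)  # 150 samples, 4 features


def kept_features(n_features_to_select):
    """Hypothetical helper: fit RFE and report how many features remain."""
    rfe = RFE(LogisticRegression(max_iter=200),
              n_features_to_select=n_features_to_select)
    rfe.fit(X, y)
    return rfe.n_features_


kept_int = kept_features(3)         # keep exactly 3 features
kept_default = kept_features(None)  # None keeps half of the 4 features

print(kept_int, kept_default)
```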

Metrics
expert
How to evaluate if RFE improved model performance?

You applied RFE to select features and trained a model. Which metric comparison best shows if RFE helped?

A. Compare the time taken to train the model without checking accuracy.
B. Compare training loss only on the training data before and after RFE.
C. Compare model accuracy on a test set before and after RFE.
D. Compare the number of features selected without checking model results.
💡 Hint

Think about how to know if the model got better at predicting new data.
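
One way to make that comparison concrete is sketched below, assuming the iris data, a 70/30 split, and two retained features (all of these are illustrative choices, not part of the question). RFE is fitted on the training split only, and both models are scored on the same held-out test split.

```python
from sklearn.datasets import load_iris
from sklearn.feature_selection import RFE
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y
)

# Baseline: all four features.
baseline = LogisticRegression(max_iter=200).fit(X_train, y_train)
acc_all = baseline.score(X_test, y_test)

# RFE: select features using the training data only, then score on the test set.
rfe = RFE(LogisticRegression(max_iter=200), n_features_to_select=2)
rfe.fit(X_train, y_train)
acc_rfe = rfe.score(X_test, y_test)  # reduces X_test to the selected features first

print(f"test accuracy, all features: {acc_all:.3f}")
print(f"test accuracy, after RFE:    {acc_rfe:.3f}")
```

On a dataset this small a single split is noisy, so cross-validated scores would give a more reliable comparison.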

Practice

(1/5)
1. What is the main goal of Recursive Feature Elimination (RFE) in machine learning?
easy
A. To select the most important features by removing less important ones step by step
B. To increase the number of features in the dataset
C. To randomly shuffle the features before training
D. To create new features by combining existing ones

Solution

  1. Step 1: Understand the purpose of RFE

    RFE repeatedly fits a model, ranks the features by coefficient or importance, and drops the weakest ones (one per round by default) until only the requested number remains.
  2. Step 2: Compare options to the purpose

    Only option A, "To select the most important features by removing less important ones step by step", describes this step-by-step removal of less important features.
  3. Final Answer:

    To select the most important features by removing less important ones step by step -> Option A
  4. Quick Check:

    RFE = Stepwise feature removal [OK]
Hint: RFE removes features stepwise to keep the best ones [OK]
Common Mistakes:
  • Thinking RFE adds or creates features
  • Confusing RFE with random feature shuffling
  • Believing RFE increases feature count
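
The stepwise removal described in this solution can be written out by hand. The loop below is a simplified sketch of the idea, not scikit-learn's actual implementation. It assumes the iris data and scores each feature by its squared coefficients summed over classes, and the variable names are illustrative.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

X, y = load_iris(return_X_y=True)

remaining = list(range(X.shape[1]))  # column indices still in play
n_to_keep = 2

while len(remaining) > n_to_keep:
    # Refit on the surviving columns only.
    model = LogisticRegression(max_iter=200).fit(X[:, remaining], y)
    # coef_ has shape (n_classes, n_features); sum squared weights per feature.
    importance = (model.coef_ ** 2).sum(axis=0)
    weakest = int(np.argmin(importance))
    removed = remaining.pop(weakest)
    print("removed feature index", removed)

print("kept feature indices", remaining)
```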
2. Which of the following is the correct way to import Recursive Feature Elimination from scikit-learn in Python?
easy
A. from sklearn.feature_selection import RecursiveFeatureElimination
B. from sklearn.feature_selection import RFE
C. import sklearn.feature_selection.RFE as rfe
D. from sklearn.selection import RFE

Solution

  1. Step 1: Recall the correct import statement

    The class is named RFE and is in sklearn.feature_selection.
  2. Step 2: Match options with correct syntax

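    Option B, from sklearn.feature_selection import RFE, uses the right module and class name. Option A names a class that does not exist, option C applies "import ... as" to a class even though that form only works on modules, and option D uses the non-existent module sklearn.selection.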
  3. Final Answer:

    from sklearn.feature_selection import RFE -> Option B
  4. Quick Check:

    Correct import is 'from sklearn.feature_selection import RFE' [OK]
Hint: Remember: RFE is imported directly from sklearn.feature_selection [OK]
Common Mistakes:
  • Using wrong module name like sklearn.selection
  • Trying to import full name RecursiveFeatureElimination
  • Using incorrect import syntax
3. Given the following Python code using RFE, what will be the output of print(selected_features)?
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.feature_selection import RFE

iris = load_iris()
X, y = iris.data, iris.target
model = LogisticRegression(max_iter=200)
rfe = RFE(model, n_features_to_select=2)
rfe.fit(X, y)
selected_features = rfe.support_
print(selected_features)
medium
A. [ True True False False ]
B. [False True False True ]
C. [ True False True False ]
D. [False False True True ]

Solution

  1. Step 1: Understand RFE output support_

    The support_ attribute is a boolean array showing which features are selected.
  2. Step 2: Run RFE with LogisticRegression on iris dataset

    RFE selects the two most important features, which for iris are the last two features (petal length and petal width), so the output is [False False True True].
  3. Final Answer:

    [False False True True ] -> Option D
  4. Quick Check:

    RFE selects last two iris features = [False False True True] [OK]
Hint: Iris important features are last two; RFE selects those [OK]
Common Mistakes:
  • Assuming first two features are selected
  • Confusing support_ with ranking_
  • Not setting max_iter causing convergence warnings
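
To see the difference between support_ and ranking_ side by side, the following sketch prints both next to the iris feature names. It assumes the same setup as the question. Selected features always get rank 1, and eliminated features get higher ranks the earlier they were dropped.

```python
from sklearn.datasets import load_iris
from sklearn.feature_selection import RFE
from sklearn.linear_model import LogisticRegression

iris = load_iris()
X, y = iris.data, iris.target

rfe = RFE(LogisticRegression(max_iter=200), n_features_to_select=2).fit(X, y)

# support_ is a boolean mask; ranking_ is an integer rank per feature.
for name, keep, rank in zip(iris.feature_names, rfe.support_, rfe.ranking_):
    print(f"{name:20s} selected={keep}  rank={rank}")
```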
4. Identify the error in this RFE usage code and choose the correct fix:
from sklearn.feature_selection import RFE
from sklearn.linear_model import LogisticRegression

model = LogisticRegression()
rfe = RFE(model, n_features_to_select=0)
rfe.fit(X, y)
medium
A. n_features_to_select cannot be zero; set it to a positive integer
B. LogisticRegression must be imported from sklearn.linear_model.linear_model
C. RFE requires a random_state parameter
D. fit method requires sample_weight argument

Solution

  1. Step 1: Check parameter n_features_to_select

    This parameter must be a positive integer, a float in (0, 1] giving the fraction of features to keep, or None. Zero is invalid.
  2. Step 2: Identify correct fix

    Setting n_features_to_select to a positive integer fixes the error.
  3. Final Answer:

    n_features_to_select cannot be zero; set it to a positive integer -> Option A
  4. Quick Check:

    n_features_to_select > 0 required [OK]
Hint: n_features_to_select must be positive, never zero [OK]
Common Mistakes:
  • Setting n_features_to_select to zero
  • Wrong import paths for LogisticRegression
  • Thinking random_state is mandatory for RFE
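
The failure and the fix can be reproduced with a short sketch. It assumes the iris data for X and y (the question leaves them undefined), and try_fit is a hypothetical helper. The exact error message depends on the scikit-learn version, so only the exception type is caught here.

```python
from sklearn.datasets import load_iris
from sklearn.feature_selection import RFE
from sklearn.linear_model import LogisticRegression

X, y = load_iris(return_X_y=True)


def try_fit(n_features_to_select):
    """Hypothetical helper: fit RFE and report whether it succeeded."""
    try:
        rfe = RFE(LogisticRegression(max_iter=200),
                  n_features_to_select=n_features_to_select)
        rfe.fit(X, y)
        return "ok"
    except ValueError as exc:
        # Message text varies between scikit-learn versions.
        return f"ValueError: {str(exc)[:60]}"


failed = try_fit(0)  # invalid: zero features requested
fixed = try_fit(2)   # valid: positive integer

print(failed)
print(fixed)
```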
5. You have a dataset with 20 features and want to use RFE with a Random Forest model to select the top 5 features. Which of the following code snippets correctly applies RFE and outputs the names of the selected features assuming your data is in a pandas DataFrame df and target in y?
hard
A.
   from sklearn.feature_selection import RFE
   from sklearn.ensemble import RandomForestClassifier
   model = RandomForestClassifier()
   rfe = RFE(model, n_features_to_select=5)
   rfe.fit(df, y)
   selected = df.columns[rfe.ranking_ <= 5]
   print(selected.tolist())
B.
   from sklearn.feature_selection import RFE
   from sklearn.ensemble import RandomForestClassifier
   model = RandomForestClassifier()
   rfe = RFE(model, n_features_to_select=5)
   rfe.fit(y, df)
   selected = df.columns[rfe.support_]
   print(selected.tolist())
C.
   from sklearn.feature_selection import RFE
   from sklearn.ensemble import RandomForestClassifier
   model = RandomForestClassifier()
   rfe = RFE(model, n_features_to_select=5)
   rfe.fit(df, y)
   selected = df.columns[rfe.support_]
   print(selected.tolist())
D.
   from sklearn.feature_selection import RFE
   from sklearn.ensemble import RandomForestClassifier
   model = RandomForestClassifier()
   rfe = RFE(model, n_features_to_select=5)
   rfe.fit(df, y)
   selected = df.columns[rfe.ranking_ == 5]
   print(selected.tolist())

Solution

  1. Step 1: Check correct fit method usage

    In fit, the features (df) must be the first argument and the target (y) the second.
  2. Step 2: Select features using support_ boolean mask

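    Use rfe.support_, a boolean mask over the columns, to pick the selected features, then index df.columns with it to get their names. Every selected feature has ranking_ equal to 1, so ranking_ <= 5 and ranking_ == 5 both pick the wrong columns.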
  3. Final Answer:

    Code snippet C correctly fits and selects features using support_ mask -> Option C
  4. Quick Check:

    fit(df, y) + support_ mask = correct feature selection [OK]
Hint: fit(df, y) and use support_ to get selected features [OK]
Common Mistakes:
  • Swapping X and y in fit method
  • Using ranking_ == 5 instead of support_
  • Not converting boolean mask to column names
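
For a runnable version of the correct pattern, the sketch below builds a stand-in DataFrame, since the question does not supply one. The synthetic data from make_classification, the column names feat_0 to feat_19, and the small n_estimators value are all assumptions made to keep the example self-contained and quick.

```python
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import RFE

# Synthetic stand-in for the question's df and y (20 features).
X, y = make_classification(n_samples=200, n_features=20,
                           n_informative=5, random_state=0)
df = pd.DataFrame(X, columns=[f"feat_{i}" for i in range(20)])

model = RandomForestClassifier(n_estimators=25, random_state=0)
rfe = RFE(model, n_features_to_select=5)
rfe.fit(df, y)  # features first, target second

# support_ is a boolean mask aligned with df.columns.
selected = df.columns[rfe.support_].tolist()
print(selected)
```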