Recall & Review
beginner
What is Recursive Feature Elimination (RFE)?
RFE is a feature selection method that repeatedly trains a model, ranks the features by importance, and removes the least important ones until the desired number of features remains.
beginner
Why do we use Recursive Feature Elimination?
We use RFE to improve model performance and reduce complexity by keeping only the most useful features and removing irrelevant or noisy ones.
intermediate
How does RFE decide which features to remove?
RFE trains a model and ranks features by their importance scores (like coefficients or feature importances). It removes the least important features in each step.
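The train, rank, remove loop described in this answer can be sketched by hand. This is only an illustration under assumptions: the helper name manual_rfe is hypothetical, the importance score (squared coefficients summed over classes) is one reasonable choice, and the loop is not scikit-learn's own implementation.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression


def manual_rfe(X, y, n_keep):
    """Hand-rolled elimination loop (illustrative, hypothetical helper)."""
    remaining = list(range(X.shape[1]))  # column indices still in play
    while len(remaining) > n_keep:
        # 1. Train on the features that are still in play.
        model = LogisticRegression(max_iter=1000).fit(X[:, remaining], y)
        # 2. Rank: squared coefficients summed over the classes.
        importance = (model.coef_ ** 2).sum(axis=0)
        # 3. Remove the single least important feature, then repeat.
        weakest = int(np.argmin(importance))
        del remaining[weakest]
    return remaining


X, y = load_iris(return_X_y=True)
kept = manual_rfe(X, y, n_keep=2)
print(kept)  # indices of the two surviving columns
```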
intermediate
What types of models can be used with RFE?
RFE works with models that provide feature importance, such as linear models with coefficients or tree-based models with feature importance scores.
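As a rough illustration, the sketch below runs scikit-learn's RFE once with a linear model (which exposes coef_) and once with a tree-based model (which exposes feature_importances_). The iris dataset, the hyperparameters, and the dictionary labels are assumptions chosen only to keep the example small; the two estimators need not agree on which features they keep.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import RFE
from sklearn.linear_model import LogisticRegression

X, y = load_iris(return_X_y=True)

# Two estimators that expose an importance signal RFE can read.
estimators = {
    "linear (coef_)": LogisticRegression(max_iter=1000),
    "tree-based (feature_importances_)": RandomForestClassifier(
        n_estimators=50, random_state=0
    ),
}

masks = {}
for name, est in estimators.items():
    # support_ is a boolean mask over the original columns.
    masks[name] = RFE(est, n_features_to_select=2).fit(X, y).support_
    print(name, masks[name])
```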
intermediate
What is a common stopping criterion in RFE?
RFE stops when the desired number of features is reached. Cross-validated variants, such as scikit-learn's RFECV, instead choose the number of features that gives the best validation performance.
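For the performance-based criterion, one option in scikit-learn is RFECV. A minimal sketch follows, assuming the iris data and 5-fold cross-validation. Note that RFECV scores every candidate feature count and keeps the best one rather than halting at the first drop in performance.

```python
from sklearn.datasets import load_iris
from sklearn.feature_selection import RFECV
from sklearn.linear_model import LogisticRegression

X, y = load_iris(return_X_y=True)

# Cross-validation decides how many features to keep.
selector = RFECV(
    LogisticRegression(max_iter=1000),
    step=1,                    # remove one feature per iteration
    cv=5,                      # 5-fold cross-validation (assumed setting)
    min_features_to_select=1,
)
selector.fit(X, y)

print(selector.n_features_)  # feature count chosen by cross-validation
print(selector.support_)     # boolean mask of the kept features
```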
What does Recursive Feature Elimination do?
A. Normalizes all features
B. Adds new features to the dataset
C. Randomly selects features
D. Removes features one by one based on importance
RFE removes the least important features step-by-step based on model importance scores.
Which model type is suitable for RFE?
A. Models that provide feature importance
B. Models that only predict labels
C. Models without coefficients
D. Models that do not train
RFE requires models that can rank features by importance, like linear or tree-based models.
When does RFE usually stop removing features?
A. When all features are removed
B. After one iteration
C. When the desired number of features is reached
D. When accuracy drops to zero
RFE stops when it reaches the target number of features; cross-validated variants choose the feature count by performance instead.
What is the main goal of using RFE?
A. To increase dataset size
B. To select the most important features
C. To create new features
D. To shuffle data randomly
RFE aims to keep only the most useful features for better model performance.
Which of these is NOT a step in RFE?
A. Add random noise to features
B. Rank features by importance
C. Remove least important features
D. Train model on current features
RFE does not add noise; it removes features based on importance.
Explain how Recursive Feature Elimination works step-by-step.
Think about training, ranking, removing, and repeating.
Why is feature selection important and how does RFE help with it?
Consider benefits of fewer features and how RFE chooses them.
Practice
1. What is the main goal of Recursive Feature Elimination (RFE) in machine learning?
easy
A. To select the most important features by removing less important ones step by step
B. To increase the number of features in the dataset
C. To randomly shuffle the features before training
D. To create new features by combining existing ones
Solution
Step 1: Understand the purpose of RFE
RFE removes the least important features step by step (one per step by default in scikit-learn) so that only the most useful ones remain.
Step 2: Compare options to the purpose
Only option A, "To select the most important features by removing less important ones step by step", describes this stepwise removal. The other options add, shuffle, or create features.
Final Answer:
To select the most important features by removing less important ones step by step -> Option A
Quick Check:
RFE = Stepwise feature removal [OK]
Hint: RFE removes features stepwise to keep the best ones [OK]
Common Mistakes:
Thinking RFE adds or creates features
Confusing RFE with random feature shuffling
Believing RFE increases feature count
2. Which of the following is the correct way to import Recursive Feature Elimination from scikit-learn in Python?
easy
A. from sklearn.feature_selection import RecursiveFeatureElimination
B. from sklearn.feature_selection import RFE
C. import sklearn.feature_selection.RFE as rfe
D. from sklearn.selection import RFE
Solution
Step 1: Recall the correct import statement
The class is named RFE and is in sklearn.feature_selection.
Step 2: Match options with correct syntax
from sklearn.feature_selection import RFE correctly imports RFE from sklearn.feature_selection.
Final Answer:
from sklearn.feature_selection import RFE -> Option B
Quick Check:
Correct import is 'from sklearn.feature_selection import RFE' [OK]
Hint: Remember: RFE is imported directly from sklearn.feature_selection [OK]
Common Mistakes:
Using wrong module name like sklearn.selection
Trying to import full name RecursiveFeatureElimination
Using incorrect import syntax
3. Given the following Python code using RFE, what will be the output of print(selected_features)?
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.feature_selection import RFE
iris = load_iris()
X, y = iris.data, iris.target
model = LogisticRegression(max_iter=200)
rfe = RFE(model, n_features_to_select=2)
rfe.fit(X, y)
selected_features = rfe.support_
print(selected_features)
medium
A. [ True True False False ]
B. [False True False True ]
C. [ True False True False ]
D. [False False True True ]
Solution
Step 1: Understand RFE output support_
The support_ attribute is a boolean array showing which features are selected.
Step 2: Run RFE with LogisticRegression on iris dataset
With this setup, RFE keeps the two features with the largest coefficients, which for iris are the last two (petal length and petal width), so the output is [False False True True].
Final Answer:
[False False True True ] -> Option D
Quick Check:
RFE selects last two iris features = [False False True True] [OK]
Hint: Iris important features are last two; RFE selects those [OK]
Common Mistakes:
Assuming first two features are selected
Confusing support_ with ranking_
Not setting max_iter causing convergence warnings
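To make the difference between support_ and ranking_ concrete, the sketch below prints both next to the iris feature names. It follows the question's setup except that max_iter is raised to 1000, an assumption made only to avoid convergence warnings. Selected features have rank 1; eliminated features get higher ranks, with the first one removed ranked highest.

```python
from sklearn.datasets import load_iris
from sklearn.feature_selection import RFE
from sklearn.linear_model import LogisticRegression

iris = load_iris()
rfe = RFE(LogisticRegression(max_iter=1000), n_features_to_select=2)
rfe.fit(iris.data, iris.target)

# support_ is a boolean mask; ranking_ is an integer rank per feature.
for name, keep, rank in zip(iris.feature_names, rfe.support_, rfe.ranking_):
    print(f"{name:20s} selected={keep} rank={rank}")
```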
4. Identify the error in this RFE usage code and choose the correct fix:
from sklearn.feature_selection import RFE
from sklearn.linear_model import LogisticRegression
model = LogisticRegression()
rfe = RFE(model, n_features_to_select=0)
rfe.fit(X, y)
medium
A. n_features_to_select cannot be zero; set it to a positive integer
B. LogisticRegression must be imported from sklearn.linear_model.linear_model
C. RFE requires a random_state parameter
D. fit method requires sample_weight argument
Solution
Step 1: Check parameter n_features_to_select
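This parameter must be a positive integer, a float in (0, 1] giving the fraction of features to keep, or None (which keeps half of the features); zero is invalid.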
Step 2: Identify correct fix
Setting n_features_to_select to a positive integer fixes the error.
Final Answer:
n_features_to_select cannot be zero; set it to a positive integer -> Option A
Quick Check:
n_features_to_select > 0 required [OK]
Hint: n_features_to_select must be positive, never zero [OK]
Common Mistakes:
Setting n_features_to_select to zero
Wrong import paths for LogisticRegression
Thinking random_state is mandatory for RFE
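The valid forms of n_features_to_select can be compared side by side. This is a small sketch assuming a reasonably recent scikit-learn release (float fractions are not accepted by older versions) and the four-feature iris data; the counts dictionary is just a convenience for the example.

```python
from sklearn.datasets import load_iris
from sklearn.feature_selection import RFE
from sklearn.linear_model import LogisticRegression

X, y = load_iris(return_X_y=True)  # 4 features

counts = {}
for n in (3, 0.5, None):
    # int: absolute count; float: fraction of features; None: half of them.
    rfe = RFE(LogisticRegression(max_iter=1000), n_features_to_select=n)
    rfe.fit(X, y)
    counts[n] = rfe.n_features_
    print(n, "->", rfe.n_features_)
```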
5. You have a dataset with 20 features and want to use RFE with a Random Forest model to select the top 5 features. Which of the following code snippets correctly applies RFE and outputs the names of the selected features assuming your data is in a pandas DataFrame df and target in y?
hard
A. from sklearn.feature_selection import RFE
from sklearn.ensemble import RandomForestClassifier
model = RandomForestClassifier()
rfe = RFE(model, n_features_to_select=5)
rfe.fit(df, y)
selected = df.columns[rfe.ranking_ <= 5]
print(selected.tolist())
B. from sklearn.feature_selection import RFE
from sklearn.ensemble import RandomForestClassifier
model = RandomForestClassifier()
rfe = RFE(model, n_features_to_select=5)
rfe.fit(y, df)
selected = df.columns[rfe.support_]
print(selected.tolist())
C. from sklearn.feature_selection import RFE
from sklearn.ensemble import RandomForestClassifier
model = RandomForestClassifier()
rfe = RFE(model, n_features_to_select=5)
rfe.fit(df, y)
selected = df.columns[rfe.support_]
print(selected.tolist())
D. from sklearn.feature_selection import RFE
from sklearn.ensemble import RandomForestClassifier
model = RandomForestClassifier()
rfe = RFE(model, n_features_to_select=5)
rfe.fit(df, y)
selected = df.columns[rfe.ranking_ == 5]
print(selected.tolist())
Solution
Step 1: Check correct fit method usage
Features (df) must be first argument, target (y) second in fit.
Step 2: Select features using support_ boolean mask
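rfe.support_ is a boolean mask of the selected features, so indexing df.columns with it returns their names. Every selected feature has ranking_ == 1, so the ranking_ <= 5 and ranking_ == 5 filters in options A and D pick the wrong columns.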
Final Answer:
Code snippet C correctly fits and selects features using support_ mask -> Option C
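The difference between the support_ mask and the ranking_ filters can be checked on synthetic data. The sketch below is illustrative only: the dataset from make_classification, the column names f0 to f19, and the small forest size are all assumptions standing in for the df and y of the question.

```python
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import RFE

# Synthetic stand-in for the 20-feature DataFrame in the question.
X, y = make_classification(
    n_samples=200, n_features=20, n_informative=5, random_state=0
)
df = pd.DataFrame(X, columns=[f"f{i}" for i in range(20)])

rfe = RFE(
    RandomForestClassifier(n_estimators=25, random_state=0),
    n_features_to_select=5,
)
rfe.fit(df, y)  # features first, target second

selected = df.columns[rfe.support_].tolist()            # option C
by_rank_le_5 = df.columns[rfe.ranking_ <= 5].tolist()   # option A
by_rank_eq_5 = df.columns[rfe.ranking_ == 5].tolist()   # option D

print(selected)                               # exactly 5 names
print(len(by_rank_le_5), len(by_rank_eq_5))   # too many, then too few
```

With the default step of 1, the five selected features share rank 1 and each eliminated feature gets its own rank from 2 upward, so ranking_ <= 5 returns nine columns and ranking_ == 5 returns one.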