
Random Forest in Depth in ML Python - Practice Problems & Coding Challenges

Challenge - 5 Problems
🎖️
Random Forest Mastery
Get all challenges correct to earn this badge!
Test your skills under time pressure!
🧠 Conceptual
intermediate
How does a random forest reduce overfitting compared to a single decision tree?

Random forests build many decision trees and combine their results. Why does this help reduce overfitting compared to using just one tree?

A. Because averaging many trees reduces variance and prevents the model from fitting noise in the training data.
B. Because random forests use deeper trees that memorize the training data better.
C. Because random forests only use a single feature for all splits, making the model simpler.
D. Because random forests remove all noisy data points before training.
💡 Hint

Think about how averaging multiple guesses can make the final guess more stable.
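As an aside (not part of the challenge), the variance-reduction effect in option A can be seen empirically by comparing a single unpruned tree with a forest on the same noisy data; the exact scores depend on the random seed, but the averaged ensemble typically generalizes better:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

# Synthetic data with 10% label noise so a single tree can overfit
X, y = make_classification(n_samples=500, n_features=20, flip_y=0.1, random_state=0)

tree = DecisionTreeClassifier(random_state=0)                      # one unpruned tree
forest = RandomForestClassifier(n_estimators=100, random_state=0)  # averaged ensemble

tree_score = cross_val_score(tree, X, y, cv=5).mean()
forest_score = cross_val_score(forest, X, y, cv=5).mean()

# The forest's averaged predictions fit the noise less, so its
# cross-validated accuracy is typically higher than the single tree's
print(f"single tree: {tree_score:.3f}, forest: {forest_score:.3f}")
```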

Predict Output
intermediate
Output of feature importance extraction from a random forest

What is the output of this code snippet that trains a random forest and prints feature importances?

ML Python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

iris = load_iris()
X, y = iris.data, iris.target
model = RandomForestClassifier(n_estimators=10, random_state=42)
model.fit(X, y)
importances = model.feature_importances_
print([round(i, 2) for i in importances])
A. [0.5, 0.2, 0.2, 0.1]
B. [0.25, 0.25, 0.25, 0.25]
C. [0.11, 0.03, 0.44, 0.42]
D. [0.0, 0.0, 0.0, 1.0]
💡 Hint

Feature importances sum to 1 and reflect how useful each feature is for splitting.
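The hint's claim can be checked directly: sklearn's impurity-based `feature_importances_` are non-negative and normalized to sum to 1. A small sketch:

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

iris = load_iris()
model = RandomForestClassifier(n_estimators=10, random_state=42)
model.fit(iris.data, iris.target)

importances = model.feature_importances_
# Impurity-based importances are normalized across features
print(round(float(importances.sum()), 6))  # 1.0
```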

Hyperparameter
advanced
Which hyperparameter controls the number of features considered at each split in a random forest?

When training a random forest, which hyperparameter decides how many features are randomly selected to consider for splitting at each node?

A. max_depth
B. max_features
C. min_samples_split
D. n_estimators
💡 Hint

This parameter controls randomness in feature selection per split.
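For reference, here is how this per-split parameter is passed; the integer value below is just an example (string values like "sqrt" and "log2" are also accepted):

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

X, y = load_iris(return_X_y=True)

# Consider only 2 of the 4 iris features at each candidate split,
# which decorrelates the trees in the ensemble
model = RandomForestClassifier(n_estimators=25, max_features=2, random_state=0)
model.fit(X, y)
print(model.get_params()["max_features"])  # 2
```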

Metrics
advanced
Interpreting out-of-bag (OOB) error in random forests

What does the out-of-bag (OOB) error estimate in a random forest represent?

A. The error on data not used to train each tree, providing an unbiased estimate of test error.
B. The error on the training data used to build each tree.
C. The error on a separate validation dataset provided by the user.
D. The error after pruning the trees to reduce complexity.
💡 Hint

OOB samples are those left out when bootstrapping data for each tree.
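In sklearn the OOB estimate is requested at fit time with `oob_score=True`: each sample is scored only by the trees whose bootstrap sample left it out. A minimal sketch:

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

X, y = load_iris(return_X_y=True)

# oob_score=True evaluates each sample with only the trees
# that did not see it during bootstrapping
model = RandomForestClassifier(n_estimators=100, oob_score=True, random_state=0)
model.fit(X, y)

oob_error = 1.0 - model.oob_score_  # OOB error approximates test error
print(f"OOB accuracy: {model.oob_score_:.3f}")
```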

🔧 Debug
expert
Why does this random forest model raise a ValueError?

Consider this code snippet:

from sklearn.ensemble import RandomForestClassifier
model = RandomForestClassifier(n_estimators=100, max_features='auto')
model.fit(X_train, y_train)

Why does this code raise a ValueError?

A. Because RandomForestClassifier requires max_depth to be set explicitly.
B. Because n_estimators must be less than 50.
C. Because X_train and y_train are not defined.
D. Because 'auto' is not a valid value for max_features in RandomForestClassifier in recent sklearn versions.
💡 Hint

Check the allowed values for max_features in sklearn 1.3+, where the 'auto' alias was removed.
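A corrected version of the snippet, assuming sklearn 1.3 or newer: for classifiers, "sqrt" gives the behavior the removed "auto" alias used to provide. The training data here is a placeholder so the sketch runs on its own (the original question leaves X_train and y_train undefined):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Placeholder data standing in for the question's undefined X_train / y_train
X_train, y_train = make_classification(n_samples=200, n_features=10, random_state=0)

# "sqrt" replaces the removed "auto" alias for classifiers
model = RandomForestClassifier(n_estimators=100, max_features="sqrt")
model.fit(X_train, y_train)
print(model.n_features_in_)  # 10
```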