Practice

1. What is the main advantage of using a random forest over a single decision tree?

easy

A. It reduces overfitting by averaging multiple trees.

B. It always runs faster than a single tree.

C. It requires less data to train.

D. It uses only one feature for splitting.

Solution

Step 1: Understand decision tree limitations
A single decision tree can easily overfit, meaning it learns noise and performs poorly on new data.
Step 2: How random forest improves
Random forest trains many trees, each on a bootstrap sample of the data and with a random subset of features considered at each split, then averages their predictions to reduce overfitting.
Final Answer:
It reduces overfitting by averaging multiple trees. -> Option A
Quick Check:
Random forest reduces overfitting = A [OK]

Hint: Random forest averages trees to avoid overfitting [OK]

Common Mistakes:

Thinking random forest is always faster than one tree
Believing it needs less data than a single tree
Assuming it splits on only one feature
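
The overfitting argument in question 1 can be checked with a small experiment. The sketch below is illustrative only: the synthetic dataset, the 10% label noise, and all variable names are assumptions, not part of the original question. It fits one fully grown tree and one forest on the same split and prints their accuracies so the gap between training and test performance can be compared.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Assumed toy data: 10% of labels are flipped so that memorising noise hurts.
X, y = make_classification(
    n_samples=1000, n_features=20, n_informative=5, flip_y=0.1, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)

# A single unrestricted tree grows until every training point is classified.
tree = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)

# A forest averages 100 such trees, each built on a bootstrap sample.
forest = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_train, y_train)

tree_train_acc = tree.score(X_train, y_train)
tree_test_acc = tree.score(X_test, y_test)
forest_test_acc = forest.score(X_test, y_test)

print("tree   train/test accuracy:", tree_train_acc, tree_test_acc)
print("forest test accuracy:      ", forest_test_acc)
```

On noisy data like this, the single tree typically scores perfectly on the training set and noticeably lower on the test set, while the forest usually holds up better on the test set. The exact numbers depend on the data and the random seed.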

2. Which of the following is the correct way to create a random forest classifier in Python using scikit-learn?

easy

A. from sklearn.ensemble import RandomForestClassifier
   model = RandomForestClassifier(n_estimators=100)

B. from sklearn.tree import RandomForest
   model = RandomForest(100)

C. import randomforest
   model = randomforest.RandomForestClassifier(100)

D. from sklearn.ensemble import RandomForest
   model = RandomForest(n_trees=100)

Solution

Step 1: Identify correct import
The random forest classifier is in sklearn.ensemble as RandomForestClassifier.
Step 2: Check constructor usage
We create it by calling RandomForestClassifier with n_estimators=100, which sets the number of trees.
Final Answer:
from sklearn.ensemble import RandomForestClassifier
model = RandomForestClassifier(n_estimators=100)
-> Option A
Quick Check:
Correct import and parameter = A [OK]

Hint: Use sklearn.ensemble.RandomForestClassifier with n_estimators [OK]

Common Mistakes:

Importing from sklearn.tree instead of sklearn.ensemble
Using wrong class names like RandomForest
Passing wrong parameter names like n_trees
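
As a quick check of the import and constructor from question 2, the following minimal sketch fits the classifier on the iris dataset bundled with scikit-learn. The choice of dataset and the random_state value are assumptions made for illustration.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

# Small built-in dataset, used here only to have something to fit on.
X, y = load_iris(return_X_y=True)

# n_estimators sets how many trees the forest contains.
model = RandomForestClassifier(n_estimators=100, random_state=0)
model.fit(X, y)

# After fitting, the individual trees are available in estimators_.
print("number of trees:", len(model.estimators_))
print("first predictions:", model.predict(X[:3]))
```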

3. Consider this Python code using scikit-learn's random forest:

from sklearn.ensemble import RandomForestClassifier
model = RandomForestClassifier(n_estimators=3, max_depth=2, random_state=42)
X = [[0, 0], [1, 1], [0, 1], [1, 0]]
y = [0, 1, 1, 0]
model.fit(X, y)
preds = model.predict([[0, 0], [1, 1]])
print(list(preds))

What is the output?

medium

A. [0, 0]

B. [0, 1]

C. [1, 0]

D. [1, 1]

Solution

Step 1: Understand training data and labels
Input points [0,0] and [1,1] have labels 0 and 1 respectively.
Step 2: Predict on same points with trained model
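In this data the label always equals the second feature, so a single split is enough to separate the classes. A random forest with 3 trees and max depth 2 will most likely learn that split and predict these training points correctly, although with so few trees and bootstrap sampling this is not strictly guaranteed.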
Final Answer:
[0, 1] -> Option B
Quick Check:
Predictions match training labels = B [OK]

Hint: Predictions on training points usually match labels [OK]

Common Mistakes:

Confusing input order and labels
Assuming random forest predicts opposite labels
Ignoring the effect of max_depth
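
To see how the forest in question 3 arrives at its answer, the sketch below reuses the same data and model and looks inside the fitted forest. The names X_query, per_tree_proba, and avg_proba are illustrative assumptions. It relies on scikit-learn's documented behaviour that a forest's predicted class probabilities are the mean of the individual trees' predicted probabilities.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

X = np.array([[0, 0], [1, 1], [0, 1], [1, 0]])
y = np.array([0, 1, 1, 0])

model = RandomForestClassifier(n_estimators=3, max_depth=2, random_state=42)
model.fit(X, y)

X_query = np.array([[0, 0], [1, 1]])

# Class probabilities from each of the 3 trees, shape (3, 2 samples, 2 classes).
per_tree_proba = np.array([t.predict_proba(X_query) for t in model.estimators_])

# The forest averages the per-tree probabilities and picks the largest one.
avg_proba = per_tree_proba.mean(axis=0)

print("per-tree probabilities:\n", per_tree_proba)
print("averaged probabilities:\n", avg_proba)
print("forest prediction:", list(model.predict(X_query)))
```

Printing the per-tree probabilities makes it visible when one tree disagrees with the others, which is why predictions on training points usually, but not always, match the labels.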

4. You wrote this code but get an error:

from sklearn.ensemble import RandomForestClassifier
model = RandomForestClassifier(n_estimators='100')
model.fit(X_train, y_train)

What is the problem?

medium

A. fit method requires extra parameters.

B. RandomForestClassifier does not have n_estimators parameter.

C. n_estimators should be an integer, not a string.

D. You must import RandomForestRegressor instead.

Solution

Step 1: Check parameter type for n_estimators
n_estimators expects an integer number of trees, not a string.
Step 2: Identify error cause
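Passing '100' as a string does not fail when the model is created, because scikit-learn validates parameters only when fit is called. The error is raised at model.fit(X_train, y_train).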
Final Answer:
n_estimators should be an integer, not a string. -> Option C
Quick Check:
Parameter type mismatch = C [OK]

Hint: Use integer for n_estimators, not string [OK]

Common Mistakes:

Passing numbers as strings in parameters
Confusing classifier and regressor classes
Thinking fit needs extra arguments
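
The failure in question 4 can be reproduced with a few lines. In this sketch the training data is a made-up stand-in for X_train and y_train, and the names bad_model, good_model, and raised are hypothetical. The exact exception class differs between scikit-learn versions, so the code catches both TypeError and ValueError rather than assuming one.

```python
from sklearn.ensemble import RandomForestClassifier

# Assumed placeholder data standing in for X_train / y_train.
X_train = [[0, 0], [1, 1], [0, 1], [1, 0]]
y_train = [0, 1, 1, 0]

# The constructor only stores the value, so no error is raised here.
bad_model = RandomForestClassifier(n_estimators="100")

raised = False
try:
    bad_model.fit(X_train, y_train)  # parameters are validated at fit time
except (TypeError, ValueError) as exc:
    raised = True
    print("fit failed with", type(exc).__name__)

# Fix: convert the string to an integer before passing it in.
good_model = RandomForestClassifier(n_estimators=int("100"), random_state=0)
good_model.fit(X_train, y_train)
print("trees in fixed model:", len(good_model.estimators_))
```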

5. You want to improve your random forest model's accuracy on a complex dataset. Which combination of hyperparameters is best to try first?

hard

A. Set max_depth to 1 and keep n_estimators low

B. Decrease n_estimators and decrease max_depth

C. Increase max_features to total features and decrease n_estimators

D. Increase n_estimators and increase max_depth

Solution

Step 1: Understand effect of n_estimators
More trees (higher n_estimators) usually improve accuracy by reducing variance.
Step 2: Understand effect of max_depth
Increasing max_depth allows trees to learn more complex patterns, improving accuracy on complex data.
Final Answer:
Increase n_estimators and increase max_depth -> Option D
Quick Check:
More trees + deeper trees = better accuracy = D [OK]

Hint: More trees and deeper trees usually improve accuracy [OK]

Common Mistakes:

Reducing the number of trees and the depth, which usually lowers accuracy
Setting max_depth too low, which causes underfitting
Increasing max_features too much, which can cause overfitting
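
Rather than guessing values for n_estimators and max_depth, the two hyperparameters from question 5 can be compared with cross-validation. The following is a minimal sketch using GridSearchCV. The synthetic dataset, the candidate values in param_grid, and the 3-fold setting are assumptions chosen to keep the example fast, not recommended defaults.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

# Assumed toy data in place of the "complex dataset" from the question.
X, y = make_classification(
    n_samples=300, n_features=10, n_informative=5, random_state=0
)

# Candidate values: few vs. more trees, shallow vs. unrestricted depth.
param_grid = {
    "n_estimators": [10, 50],
    "max_depth": [2, None],
}

search = GridSearchCV(
    RandomForestClassifier(random_state=0),
    param_grid,
    cv=3,  # 3-fold cross-validation for each of the 4 combinations
)
search.fit(X, y)

print("best parameters:", search.best_params_)
print("best cross-validated accuracy:", search.best_score_)
```

Which combination wins depends on the data, so the cross-validated score, not the rule of thumb, should decide the final setting.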

Random forest in depth in ML Python - Model Pipeline Trace

Epoch	Loss ↓	Accuracy ↑	Observation
1	0.45	0.75	Initial trees start to learn patterns
2	0.38	0.80	More trees reduce error and improve accuracy
3	0.33	0.83	Model converges with stable improvements
4	0.30	0.85	Small gains as trees refine decisions
5	0.28	0.86	Training stabilizes with low loss and high accuracy
