Random forest in depth in ML Python - Interactive Code Practice

Practice - 5 Tasks
Answer the questions below
1. Fill in the blank
easy

Complete the code to import the RandomForestClassifier from scikit-learn.

ML Python
from sklearn.ensemble import [1]
A. SVC
B. DecisionTreeClassifier
C. KNeighborsClassifier
D. RandomForestClassifier
Common Mistakes
Importing DecisionTreeClassifier instead of RandomForestClassifier
Importing unrelated classifiers like KNeighborsClassifier or SVC
2. Fill in the blank
medium

Complete the code to create a random forest model with 100 trees.

ML Python
model = RandomForestClassifier(n_estimators=[1])
A. 100
B. 10
C. 1000
D. 1
Common Mistakes
Using too few trees like 1 or 10, which gives unstable, high-variance predictions
Using too many trees like 1000, which slows training and usually adds little accuracy
3. Fill in the blank
hard

Complete the code to fit the model on the training data X_train and y_train.

ML Python
model.[1](X_train, y_train)
A. fit
B. predict
C. transform
D. score
Common Mistakes
Using predict instead of fit to train the model
Using transform, which belongs to preprocessing transformers rather than to this classifier
4. Fill in the blank
hard

Fill both blanks to create a dictionary of feature importances and sort it by importance descending.

ML Python
importances = dict(enumerate(model.[1]))
sorted_importances = dict(sorted(importances.items(), key=lambda item: item[[2]], reverse=True))
A. feature_importances_
B. 0
C. 1
D. features_
Common Mistakes
Using 'features_' which does not exist
Sorting by item[0], which sorts by feature index (the key from enumerate), not by importance
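
The task above keys the dictionary by feature index. As a minimal sketch of the same idea with readable keys, the example below pairs each importance with a column name instead. The iris dataset, the seed, and the use of zip are assumptions chosen for illustration; they are not part of the original exercise.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

# Assumption: iris is used only because it ships with scikit-learn.
data = load_iris()
model = RandomForestClassifier(n_estimators=100, random_state=0)
model.fit(data.data, data.target)

# Pair each feature name with its importance, then sort by the value (item[1]).
importances = dict(zip(data.feature_names, model.feature_importances_))
sorted_importances = dict(
    sorted(importances.items(), key=lambda item: item[1], reverse=True)
)

for name, value in sorted_importances.items():
    print(f"{name}: {value:.3f}")
```

Sorting on item[1] orders the pairs by importance; item[0] would order them by key instead.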
5. Fill in the blank
hard

Fill all three blanks to predict on test data X_test, calculate accuracy, and print it.

ML Python
from sklearn.metrics import accuracy_score
predictions = model.[1](X_test)
accuracy = [2](y_test, predictions)
print('Accuracy:', [3])
A. predict
B. accuracy_score
C. accuracy
D. score
Common Mistakes
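Using score for blank [2]: model.score(X_test, y_test) is a method that takes features and labels, whereas accuracy_score(y_test, predictions) compares labels with predictions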
Printing the function name instead of the accuracy variable
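
Putting the five tasks together, a complete run might look like the sketch below. The dataset (iris), the 75/25 split, and the seeds are assumptions added so that the code is self-contained; the original tasks do not say where X_train and y_train come from.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Assumption: iris stands in for the unspecified training data.
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

model = RandomForestClassifier(n_estimators=100, random_state=0)
model.fit(X_train, y_train)             # train on the training split

predictions = model.predict(X_test)     # predict labels for unseen rows
accuracy = accuracy_score(y_test, predictions)
print('Accuracy:', accuracy)

# model.score computes the same accuracy directly from features and labels.
same_accuracy = model.score(X_test, y_test)
```

Both routes give the same number; accuracy_score is the more flexible choice when the predictions are already available.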

Practice

1. What is the main advantage of using a random forest over a single decision tree?
easy
A. It reduces overfitting by averaging multiple trees.
B. It always runs faster than a single tree.
C. It requires less data to train.
D. It uses only one feature for splitting.

Solution

  1. Step 1: Understand decision tree limitations

    A single decision tree can easily overfit, meaning it learns noise and performs poorly on new data.
  2. Step 2: How random forest improves

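    A random forest trains many trees, each on a bootstrap sample of the rows and with a random subset of features considered at each split, then averages their predictions. Averaging lowers variance, which reduces overfitting.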
  3. Final Answer:

    It reduces overfitting by averaging multiple trees. -> Option A
  4. Quick Check:

    Random forest reduces overfitting = A [OK]
Hint: Random forest averages trees to avoid overfitting [OK]
Common Mistakes:
  • Thinking random forest is always faster than one tree
  • Believing it needs less data than a single tree
  • Assuming it splits on only one feature
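
The overfitting argument above can be checked empirically. The following is a rough sketch, assuming a synthetic dataset from make_classification with some randomized labels. The dataset, its size, and the seeds are illustrative choices, and the exact scores will vary with them.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# flip_y=0.1 assigns random labels to about 10% of samples, adding noise.
X, y = make_classification(
    n_samples=600, n_features=20, n_informative=5,
    flip_y=0.1, random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)

# An unpruned single tree versus a forest of 100 trees.
tree = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)
forest = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_train, y_train)

tree_train = tree.score(X_train, y_train)
tree_test = tree.score(X_test, y_test)
forest_train = forest.score(X_train, y_train)
forest_test = forest.score(X_test, y_test)

print(f"tree   train={tree_train:.3f} test={tree_test:.3f}")
print(f"forest train={forest_train:.3f} test={forest_test:.3f}")
```

The unpruned tree memorizes the training split, so the gap between its train and test scores is the overfitting the question refers to. On data like this the forest typically narrows that gap, though this is not guaranteed for every seed.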
2. Which of the following is the correct way to create a random forest classifier in Python using scikit-learn?
easy
A. from sklearn.ensemble import RandomForestClassifier; model = RandomForestClassifier(n_estimators=100)
B. from sklearn.tree import RandomForest; model = RandomForest(100)
C. import randomforest; model = randomforest.RandomForestClassifier(100)
D. from sklearn.ensemble import RandomForest; model = RandomForest(n_trees=100)

Solution

  1. Step 1: Identify correct import

    The random forest classifier is in sklearn.ensemble as RandomForestClassifier.
  2. Step 2: Check constructor usage

    We create it by calling RandomForestClassifier with n_estimators=100 to set number of trees.
  3. Final Answer:

    from sklearn.ensemble import RandomForestClassifier; model = RandomForestClassifier(n_estimators=100) -> Option A
  4. Quick Check:

    Correct import and parameter = A [OK]
Hint: Use sklearn.ensemble.RandomForestClassifier with n_estimators [OK]
Common Mistakes:
  • Importing from sklearn.tree instead of sklearn.ensemble
  • Using wrong class names like RandomForest
  • Passing wrong parameter names like n_trees
3. Consider this Python code using scikit-learn's random forest:
from sklearn.ensemble import RandomForestClassifier
model = RandomForestClassifier(n_estimators=3, max_depth=2, random_state=42)
X = [[0, 0], [1, 1], [0, 1], [1, 0]]
y = [0, 1, 1, 0]
model.fit(X, y)
preds = model.predict([[0, 0], [1, 1]])
print(list(preds))
What is the output?
medium
A. [0, 0]
B. [0, 1]
C. [1, 0]
D. [1, 1]

Solution

  1. Step 1: Understand training data and labels

    Input points [0,0] and [1,1] have labels 0 and 1 respectively.
  2. Step 2: Predict on same points with trained model

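    In this data the label always equals the second feature, so one split on that feature separates the classes. A forest with 3 trees and max depth 2 will usually learn this and predict both points correctly, although each tree is trained on a bootstrap sample, so such a small forest is not guaranteed to.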
  3. Final Answer:

    [0, 1] -> Option B
  4. Quick Check:

    Predictions match training labels = B [OK]
Hint: Predictions on training points usually match labels [OK]
Common Mistakes:
  • Confusing input order and labels
  • Assuming random forest predicts opposite labels
  • Ignoring max_depth effect
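
Because each tree sees a different bootstrap sample, the most reliable way to answer a question like this is to run the code and look at the individual votes. The sketch below reuses the question's model and data; the per-tree loop and the predict_proba call are additions for illustration.

```python
from sklearn.ensemble import RandomForestClassifier

model = RandomForestClassifier(n_estimators=3, max_depth=2, random_state=42)
X = [[0, 0], [1, 1], [0, 1], [1, 0]]
y = [0, 1, 1, 0]
model.fit(X, y)

points = [[0, 0], [1, 1]]
preds = model.predict(points)
proba = model.predict_proba(points)   # averaged class probabilities
print(list(preds))
print(proba)

# Sub-trees are trained on encoded labels and return class indices as floats;
# here the indices coincide with the labels 0 and 1.
for i, tree in enumerate(model.estimators_):
    votes = [int(v) for v in tree.predict(points)]
    print(f"tree {i} votes: {votes}")
```

The forest's prediction is the class with the highest averaged probability, so a single tree that votes differently does not necessarily change the final answer.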
4. You wrote this code but get an error:
from sklearn.ensemble import RandomForestClassifier
model = RandomForestClassifier(n_estimators='100')
model.fit(X_train, y_train)
What is the problem?
medium
A. fit method requires extra parameters.
B. RandomForestClassifier does not have n_estimators parameter.
C. n_estimators should be an integer, not a string.
D. You must import RandomForestRegressor instead.

Solution

  1. Step 1: Check parameter type for n_estimators

    n_estimators expects an integer number of trees, not a string.
  2. Step 2: Identify error cause

    Passing '100' as a string raises an error when fit is called. The constructor only stores the value, so creating the model does not fail by itself.
  3. Final Answer:

    n_estimators should be an integer, not a string. -> Option C
  4. Quick Check:

    Parameter type mismatch = C [OK]
Hint: Use integer for n_estimators, not string [OK]
Common Mistakes:
  • Passing numbers as strings in parameters
  • Confusing classifier and regressor classes
  • Thinking fit needs extra arguments
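
To see where the error surfaces, the sketch below builds the model with the string value and catches the exception at fit. The tiny training set is made up for the example, and the exact exception class and message depend on the scikit-learn version, so both TypeError and ValueError are caught.

```python
from sklearn.ensemble import RandomForestClassifier

# Hypothetical toy data so the example is self-contained.
X_train = [[0, 0], [1, 1], [0, 1], [1, 0]]
y_train = [0, 1, 1, 0]

# The constructor only stores parameters, so nothing fails here.
model = RandomForestClassifier(n_estimators='100')

error = None
try:
    model.fit(X_train, y_train)       # parameter checks run during fit
except (TypeError, ValueError) as exc:
    error = exc
    print(type(exc).__name__, exc)

# The fix: pass an integer.
fixed = RandomForestClassifier(n_estimators=100, random_state=0)
fixed.fit(X_train, y_train)
print(len(fixed.estimators_))
```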
5. You want to improve your random forest model's accuracy on a complex dataset. Which combination of hyperparameters is best to try first?
hard
A. Set max_depth to 1 and keep n_estimators low
B. Decrease n_estimators and decrease max_depth
C. Increase max_features to total features and decrease n_estimators
D. Increase n_estimators and increase max_depth

Solution

  1. Step 1: Understand effect of n_estimators

    More trees (higher n_estimators) usually improve accuracy by reducing variance.
  2. Step 2: Understand effect of max_depth

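    Increasing max_depth allows trees to learn more complex patterns, which can improve accuracy on complex data when the trees were too shallow. Deeper trees also fit more noise, so confirm the gain on validation data.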
  3. Final Answer:

    Increase n_estimators and increase max_depth -> Option D
  4. Quick Check:

    More trees + deeper trees, checked on validation data = D [OK]
Hint: More trees and deeper trees usually improve accuracy [OK]
Common Mistakes:
  • Reducing trees and depth, which usually lowers accuracy on complex data
  • Setting max_depth too low, which causes underfitting
  • Raising max_features to all features, which makes the trees more alike and can increase overfitting
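
Rather than picking values by hand, the two hyperparameters can be compared with cross-validation. The following is a minimal sketch, assuming a synthetic dataset and a small grid; the candidate values, cv=3, and the seeds are illustrative assumptions, not recommendations from the original quiz.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

# Assumption: synthetic data stands in for the "complex dataset".
X, y = make_classification(
    n_samples=400, n_features=20, n_informative=8, random_state=0
)

# Try fewer or more trees, and shallow, medium, or unlimited depth.
param_grid = {
    "n_estimators": [50, 200],
    "max_depth": [2, 8, None],
}
search = GridSearchCV(
    RandomForestClassifier(random_state=0),
    param_grid,
    cv=3,
    scoring="accuracy",
)
search.fit(X, y)

print(search.best_params_, round(search.best_score_, 3))
```

The best combination depends on the data; the cross-validated score decides whether deeper or more numerous trees actually help.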