Practice

1. What is the main idea behind Gradient Boosting (GBM)?

easy

A. Using a single deep neural network for prediction

B. Combining many weak models to create a strong model

C. Clustering data points into groups

D. Reducing data dimensions using PCA

Solution

Step 1: Understand the concept of boosting
Boosting means combining many simple models (weak learners) to improve overall prediction.
Step 2: Identify Gradient Boosting's approach
Gradient Boosting builds models sequentially: each new weak learner (usually a small decision tree) is fit to the errors of the ensemble built so far, and the combined predictions form a strong model.
Final Answer:
Combining many weak models to create a strong model -> Option B
Quick Check:
Boosting = Combining weak models [OK]

Hint: Boosting means many weak models combined [OK]

Common Mistakes:

Confusing boosting with deep learning
Thinking GBM clusters data
Mixing boosting with dimensionality reduction
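
The sequential idea from Question 1 can be sketched by hand for squared-error loss. This is a minimal illustration, not scikit-learn's implementation; the toy data, `learning_rate`, `n_trees`, and `max_depth=2` are assumptions chosen only for the example.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

# Toy regression data (illustrative only).
rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(200, 1))
y = np.sin(X[:, 0]) + rng.normal(scale=0.1, size=200)

learning_rate = 0.1  # assumed value for the sketch
n_trees = 50         # assumed value for the sketch

# Start from a constant prediction: the mean of the targets.
prediction = np.full_like(y, y.mean())
initial_mse = np.mean((y - prediction) ** 2)

trees = []
for _ in range(n_trees):
    # For squared error, the residuals are the negative gradient of the loss
    # (up to a constant factor).
    residuals = y - prediction
    tree = DecisionTreeRegressor(max_depth=2)  # a weak learner
    tree.fit(X, residuals)
    # Add a damped version of the correction to the running prediction.
    prediction += learning_rate * tree.predict(X)
    trees.append(tree)

final_mse = np.mean((y - prediction) ** 2)
print(f"Training MSE before boosting: {initial_mse:.3f}, after: {final_mse:.3f}")
```

Each tree on its own is a poor model; the training error only becomes small once many damped corrections are added together.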

2. Which of the following is the correct way to import GradientBoostingClassifier from scikit-learn?

easy

A. import GradientBoostingClassifier from sklearn

B. from sklearn import GradientBoostingClassifier

C. from sklearn.ensemble import GradientBoostingClassifier

D. import GradientBoostingClassifier from sklearn.ensemble

Solution

Step 1: Recall correct import syntax in Python
Python imports classes or functions using 'from module import class' syntax.
Step 2: Identify the correct module for GradientBoostingClassifier
GradientBoostingClassifier is in sklearn.ensemble, so correct import is from sklearn.ensemble import GradientBoostingClassifier.
Final Answer:
from sklearn.ensemble import GradientBoostingClassifier -> Option C
Quick Check:
Correct import syntax = from sklearn.ensemble import GradientBoostingClassifier [OK]

Hint: Use 'from sklearn.ensemble import GradientBoostingClassifier' [OK]

Common Mistakes:

Using 'import' instead of 'from ... import ...'
Importing from wrong module
Wrong order of import statement
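
As a quick check of Question 2, the sketch below imports the class from `sklearn.ensemble` and shows that the top-level import fails. The alias `Wrong` and the flag `top_level_import_failed` are names made up for the example.

```python
# The class lives in the sklearn.ensemble subpackage.
from sklearn.ensemble import GradientBoostingClassifier

clf = GradientBoostingClassifier()
module_name = type(clf).__module__
print(module_name)  # a module path under sklearn.ensemble

# Importing the class from the top-level package does not work.
try:
    from sklearn import GradientBoostingClassifier as Wrong
    top_level_import_failed = False
except ImportError:
    top_level_import_failed = True

print("Top-level import failed:", top_level_import_failed)
```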

3. What will be the output of the following code snippet?

from sklearn.ensemble import GradientBoostingRegressor
X = [[1], [2], [3], [4]]
y = [2, 4, 6, 8]
gbm = GradientBoostingRegressor(n_estimators=100, learning_rate=0.1)
gbm.fit(X, y)
pred = gbm.predict([[5]])
print(round(pred[0], 1))

medium

A. 9.0

B. 10.0

C. 8.0

D. 6.0

Solution

Step 1: Understand the training data and model
X and y show a linear relation y = 2 * x. The model is GradientBoostingRegressor with 100 trees and learning rate 0.1.
Step 2: Predict for input 5
Decision trees predict a constant value within each leaf, so a tree-based model cannot extrapolate beyond the training data. The input 5 falls into the same leaf as the largest training input, 4, in every tree, so the prediction is approximately 8.0 rather than the linear value 10.0.
Final Answer:
8.0 -> Option C
Quick Check:
Input outside the training range = prediction at the nearest edge = 8.0 [OK]

Hint: Tree boosting cannot extrapolate; predictions outside the training range stay flat [OK]

Common Mistakes:

Expecting the linear trend to continue to 10.0
Ignoring learning rate effect
Confusing classification with regression output
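
To see how the model from Question 3 behaves outside the training range, the sketch below fits the same data and queries the largest training input (4) along with two inputs beyond it. The extra query point 100 and `random_state=0` are additions for illustration.

```python
from sklearn.ensemble import GradientBoostingRegressor

X = [[1], [2], [3], [4]]
y = [2, 4, 6, 8]

gbm = GradientBoostingRegressor(n_estimators=100, learning_rate=0.1, random_state=0)
gbm.fit(X, y)

# 4 is the largest training input; 5 and 100 lie beyond it.
preds = gbm.predict([[4], [5], [100]])
rounded = [round(float(p), 1) for p in preds]
print(rounded)  # all three inputs land in the same leaves, so they share one value
```

Every split threshold lies between training inputs, so any input above 4 follows the same path through each tree as 4 does and receives the same prediction.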

4. Identify the error in this Gradient Boosting code snippet:

from sklearn.ensemble import GradientBoostingClassifier
X = [[0], [1], [2]]
y = [0, 1, 0]
gbm = GradientBoostingClassifier(n_estimators='100')
gbm.fit(X, y)

medium

A. n_estimators should be an integer, not a string

B. X should be a numpy array, not a list

C. GradientBoostingClassifier cannot handle binary targets

D. Missing learning_rate parameter

Solution

Step 1: Check parameter types
n_estimators expects an integer number of trees, but '100' is a string, so fit() raises an error when the parameter is checked.
Step 2: Validate other parts
X as list is acceptable, binary targets are valid, learning_rate is optional with default 0.1.
Final Answer:
n_estimators should be an integer, not a string -> Option A
Quick Check:
Parameter types must match expected types [OK]

Hint: Check parameter types carefully [OK]

Common Mistakes:

Passing numbers as strings
Assuming lists are invalid input
Thinking learning_rate is mandatory
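
The sketch below reproduces the problem from Question 4 and then the fix. The exact exception class depends on the scikit-learn version, so the example catches both `TypeError` and `ValueError`; the flag `error_raised` is a name made up for the example.

```python
from sklearn.ensemble import GradientBoostingClassifier

X = [[0], [1], [2]]
y = [0, 1, 0]

# The constructor only stores the value; the string is rejected during fit().
error_raised = False
try:
    GradientBoostingClassifier(n_estimators="100").fit(X, y)
except (TypeError, ValueError) as exc:
    error_raised = True
    print(f"fit() failed with {type(exc).__name__}")

# With an integer the model trains; plain Python lists are accepted as input
# and learning_rate falls back to its default.
gbm = GradientBoostingClassifier(n_estimators=100)
gbm.fit(X, y)
print(gbm.predict([[1]]))
```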

5. You want to improve a Gradient Boosting model's accuracy but training is very slow. Which combination of hyperparameters is best to try first?

hard

A. Increase n_estimators and decrease learning_rate

B. Increase both n_estimators and learning_rate

C. Set n_estimators to 1 and learning_rate to 0.01

D. Decrease n_estimators and increase learning_rate

Solution

Step 1: Understand hyperparameter effects
More n_estimators means more trees and slower training; higher learning_rate speeds learning but risks overfitting.
Step 2: Balance speed and accuracy
Decreasing n_estimators reduces training time because fewer trees are built; increasing learning_rate makes each tree contribute more, which helps keep accuracy with fewer trees.
Final Answer:
Decrease n_estimators and increase learning_rate -> Option D
Quick Check:
Fewer trees + higher learning rate = faster training [OK]

Hint: Fewer trees + higher learning rate speeds training [OK]

Common Mistakes:

Increasing both n_estimators and learning_rate, which slows training further
Setting n_estimators too low, which hurts accuracy
Setting learning_rate too low, which slows learning
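
The trade-off from Question 5 can be explored with a small timing experiment. This is a sketch under assumptions: the synthetic dataset from `make_classification` and the two hyperparameter settings are chosen for illustration, and the measured times and accuracies will vary by machine and data.

```python
import time

from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split

# Synthetic data, used only to compare the two settings.
X, y = make_classification(n_samples=1000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

configs = {
    "many trees, small steps": {"n_estimators": 200, "learning_rate": 0.05},
    "fewer trees, larger steps": {"n_estimators": 50, "learning_rate": 0.2},
}

results = {}
for name, params in configs.items():
    model = GradientBoostingClassifier(random_state=0, **params)
    start = time.perf_counter()
    model.fit(X_train, y_train)
    elapsed = time.perf_counter() - start
    accuracy = model.score(X_test, y_test)
    results[name] = (elapsed, accuracy)
    print(f"{name}: fit time {elapsed:.2f}s, test accuracy {accuracy:.3f}")
```

Comparing held-out accuracy alongside fit time shows whether the faster setting gives up any accuracy on the data at hand.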

Trees	Loss ↓	Accuracy ↑	Observation
1	0.45	0.65	Initial tree reduces error significantly
10	0.30	0.78	Model improves as more trees are added
50	0.18	0.88	Loss steadily decreases, accuracy rises
100	0.12	0.92	Model converges with good accuracy

Gradient Boosting (GBM) in ML Python - Model Pipeline Trace
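
A trace like the table above can be produced with `staged_predict_proba`, which yields the ensemble's predictions after each added tree. The sketch below is an illustration on assumed synthetic data and measures loss and accuracy on the training set; its numbers will not match the table.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import accuracy_score, log_loss

# Synthetic binary classification data (illustrative only).
X, y = make_classification(n_samples=500, n_features=10, random_state=0)

model = GradientBoostingClassifier(n_estimators=100, learning_rate=0.1, random_state=0)
model.fit(X, y)

checkpoints = {1, 10, 50, 100}  # tree counts to report, as in the table
trace = []
for n_trees, proba in enumerate(model.staged_predict_proba(X), start=1):
    if n_trees in checkpoints:
        loss = log_loss(y, proba)
        # Labels are 0 and 1, so the argmax column index is the predicted class.
        acc = accuracy_score(y, proba.argmax(axis=1))
        trace.append((n_trees, loss, acc))
        print(f"{n_trees:>5}  loss={loss:.3f}  accuracy={acc:.3f}")
```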
