
XGBoost in ML Python - Practice Problems & Coding Challenges

Challenge - 5 Problems
🧠 Conceptual
intermediate
Understanding XGBoost Objective Functions

Which of the following objective functions is not commonly used in XGBoost for classification tasks?

A. reg:squarederror
B. multi:softmax
C. binary:logistic
D. count:poisson
💡 Hint

Think about which objective functions are for regression versus classification.

Predict Output
intermediate
XGBoost Model Training Output

What will be the output of the following Python code snippet using XGBoost?

import xgboost as xgb
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

iris = load_iris()
X_train, X_test, y_train, y_test = train_test_split(iris.data, iris.target, test_size=0.2, random_state=42)
dtrain = xgb.DMatrix(X_train, label=y_train)
dtest = xgb.DMatrix(X_test, label=y_test)
params = {'objective': 'multi:softprob', 'num_class': 3, 'eval_metric': 'mlogloss'}
model = xgb.train(params, dtrain, num_boost_round=10, evals=[(dtest, 'eval')], verbose_eval=False)
preds = model.predict(dtest)
print(len(preds), len(preds[0]))
A. 30
B. 30 1
C. 120 3
D. 30 3
💡 Hint

Check the number of test samples and the shape of prediction output for multi-class softprob.
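
As a rough illustration of the shape that multi:softprob style predictions take, the sketch below uses plain Python lists in place of a real model's output. The probability rows are invented values, not results from the iris model above. The point is only that each sample gets one probability per class, and that taking the arg-max of each row recovers a hard class label.

```python
# Hypothetical softprob-style output: one row per sample, one column per class.
# These numbers are invented for illustration; they are not real model output.
probs = [
    [0.80, 0.15, 0.05],
    [0.10, 0.70, 0.20],
    [0.25, 0.25, 0.50],
]


def argmax(row):
    """Return the index of the largest value in a row."""
    return max(range(len(row)), key=lambda i: row[i])


# Outer length = number of samples, inner length = number of classes.
n_samples, n_classes = len(probs), len(probs[0])

# Arg-max per row turns class probabilities into hard class labels.
labels = [argmax(row) for row in probs]

print(n_samples, n_classes)
print(labels)
```

The same two lengths are what print(len(preds), len(preds[0])) inspects in the question's snippet.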

Model Choice
advanced
Choosing XGBoost Parameters for Imbalanced Data

You have a highly imbalanced binary classification dataset. Which XGBoost parameter setting is best to help the model focus on the minority class?

A. Set max_depth to a very high value like 20
B. Set scale_pos_weight to the ratio of negative to positive samples
C. Set objective to reg:squarederror
D. Set learning_rate to 1.0
💡 Hint

Think about how to handle class imbalance in XGBoost.

Metrics
advanced
Evaluating XGBoost Model with Multi-class Log Loss

After training an XGBoost model with objective='multi:softprob', you get a multi-class log loss (mlogloss) of 0.8 on the test set. What does this value indicate?

A. The model predictions have moderate confidence and some errors
B. The model predictions are random and uninformative
C. The model predictions are very confident and accurate
D. The model predictions perfectly match the true labels
💡 Hint

Lower mlogloss is better; think about what 0.8 means.
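
To put a value like 0.8 in context, it can help to compute multi-class log loss by hand for two reference cases. The sketch below assumes a 3-class problem (the question does not state the number of classes) and uses made-up labels and probabilities. The mlogloss helper is a hypothetical pure-Python function written for this illustration, not XGBoost's own implementation.

```python
import math


def mlogloss(y_true, probs):
    """Mean negative log-probability assigned to the true class."""
    return -sum(math.log(p[y]) for y, p in zip(y_true, probs)) / len(y_true)


# Hypothetical labels for a 3-class problem.
y_true = [0, 1, 2, 1]

# A predictor that always outputs a uniform distribution over 3 classes.
uniform = [[1 / 3, 1 / 3, 1 / 3] for _ in y_true]

# A predictor that puts 0.9 on the correct class every time.
confident = [[0.9 if k == y else 0.05 for k in range(3)] for y in y_true]

baseline = mlogloss(y_true, uniform)   # equals ln(3), about 1.0986
good = mlogloss(y_true, confident)     # equals -ln(0.9), about 0.1054

print(round(baseline, 4), round(good, 4))
```

Under these assumptions, 0.8 falls between the uniform-guess baseline and the confident, correct predictor.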

🔧 Debug
expert
Debugging XGBoost Training Error

Consider this code snippet that raises an error during training:

import xgboost as xgb
import numpy as np

X = np.array([[1, 2], [3, 4], [5, 6]])
y = np.array([0, 1])
dtrain = xgb.DMatrix(X, label=y)
params = {'objective': 'binary:logistic'}
model = xgb.train(params, dtrain, num_boost_round=5)

What is the cause of the error?

A. Missing eval_metric parameter in params
B. Incorrect objective function for binary classification
C. Mismatch between number of samples in X and y labels
D. DMatrix requires labels to be float, not int
💡 Hint

Check the shapes of X and y carefully.
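
One way to act on this hint is to compare the number of feature rows with the number of labels before handing the data to any training call. The sketch below is a minimal example using plain lists. The check_lengths helper is a hypothetical name introduced here and is not part of XGBoost.

```python
def check_lengths(X, y):
    """Raise ValueError if the feature rows and labels differ in count."""
    if len(X) != len(y):
        raise ValueError(f"X has {len(X)} rows but y has {len(y)} labels")


# Same values as in the question, written as plain lists.
X = [[1, 2], [3, 4], [5, 6]]
y = [0, 1]

try:
    check_lengths(X, y)
    message = "ok"
except ValueError as err:
    message = str(err)

print(message)
```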

Practice

1. What is the main purpose of XGBoost in machine learning?
easy
A. To clean and prepare data for analysis
B. To store large datasets efficiently
C. To visualize data trends and patterns
D. To build a model that predicts outcomes from data

Solution

  1. Step 1: Understand XGBoost's role

    XGBoost is a machine learning algorithm used to create predictive models from data.
  2. Step 2: Compare options to XGBoost's function

    Only option D, "To build a model that predicts outcomes from data", describes building a predictive model, which matches XGBoost's purpose.
  3. Final Answer:

    To build a model that predicts outcomes from data -> Option D
  4. Quick Check:

    XGBoost = Predictive modeling [OK]
Hint: XGBoost is for prediction, not data cleaning or storage [OK]
Common Mistakes:
  • Confusing XGBoost with data cleaning tools
  • Thinking XGBoost is for data visualization
  • Assuming XGBoost stores data
2. Which of the following is the correct way to import XGBoost's XGBClassifier in Python?
easy
A. from xgboost import XGBClassifier
B. import XGBoost
C. import xgboost as xgb
D. import xgbboost

Solution

  1. Step 1: Recall correct import syntax

    The common way to use XGBoost's classifier is to import XGBClassifier from xgboost.
  2. Step 2: Check each option

    Option A, 'from xgboost import XGBClassifier', imports the class directly. Option C, 'import xgboost as xgb', is a valid import of the module, but it does not bring XGBClassifier into scope by name; the class would then be reached as xgb.XGBClassifier. Options B and D use incorrect module names.
  3. Final Answer:

    from xgboost import XGBClassifier -> Option A
  4. Quick Check:

    Correct import = from xgboost import XGBClassifier [OK]
Hint: Use 'from xgboost import XGBClassifier' to import model class [OK]
Common Mistakes:
  • Using wrong capitalization in module name
  • Trying to import non-existent modules
  • Misspelling 'xgboost'
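
The two valid import styles discussed above can be compared side by side. This is a small sketch, assuming xgboost may or may not be installed, so the imports are wrapped in a guard; the same_class variable is introduced here only for illustration.

```python
try:
    import xgboost as xgb               # module-level import with the alias xgb
    from xgboost import XGBClassifier   # direct import of the class

    # Both names refer to the same class object.
    same_class = xgb.XGBClassifier is XGBClassifier
except ImportError:
    # xgboost is not installed in this environment.
    same_class = None

print(same_class)
```

With the module-style import, the classifier is written as xgb.XGBClassifier(...); with the direct import it is simply XGBClassifier(...).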
3. What will be the output of this code snippet?
from xgboost import XGBClassifier
model = XGBClassifier(use_label_encoder=False, eval_metric='logloss')
X_train = [[1, 2], [3, 4]]
y_train = [0, 1]
model.fit(X_train, y_train)
preds = model.predict([[1, 2]])
print(preds)
medium
A. [0]
B. [1]
C. [0 1]
D. Error due to missing eval_metric

Solution

  1. Step 1: Understand the training data and labels

    The model is trained on two samples: [1, 2] labeled 0 and [3, 4] labeled 1.
  2. Step 2: Predict on input [1, 2]

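    The input [1, 2] was labeled 0 in training, and the model predicts 0 for it. With only two samples, the trees may be unable to split under default settings, so treat this as a toy illustration rather than proof that the model has learned the pattern.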
  3. Final Answer:

    [0] -> Option A
  4. Quick Check:

    Prediction matches training label [OK]
Hint: Prediction matches closest training label [OK]
Common Mistakes:
  • Expecting prediction to be 1 for input [1, 2]
  • Thinking eval_metric causes error here
  • Confusing output format as list or array
4. Identify the error in this XGBoost code snippet:
from xgboost import XGBClassifier
model = XGBClassifier()
X_train = [[1, 2], [3, 4]]
y_train = [0, 1]
model.fit(X_train, y_train, eval_metric='error')
preds = model.predict([[5, 6]])
print(preds)
medium
A. Missing use_label_encoder=False causes warning
B. eval_metric='error' is invalid for XGBClassifier's fit method
C. X_train should be a numpy array, not a list
D. predict method requires 2D array input, but [[5, 6]] is 1D

Solution

  1. Step 1: Check eval_metric usage in fit()

    In recent XGBoost releases, eval_metric is set when the model is created, and passing it to fit() raises an error. Older releases accepted it in fit() but emitted a deprecation warning.
  2. Step 2: Verify other parts

    X_train as a list works fine, omitting use_label_encoder=False causes at most a warning in older releases rather than an error, and [[5, 6]] is a valid 2D input.
  3. Final Answer:

    eval_metric='error' is invalid for XGBClassifier's fit method -> Option B
  4. Quick Check:

    eval_metric in fit() causes error [OK]
Hint: Set eval_metric when creating model, not in fit() [OK]
Common Mistakes:
  • Passing eval_metric in fit() instead of constructor
  • Thinking list input causes error
  • Ignoring warnings about use_label_encoder
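
The fix described in this solution can be sketched as follows: the metric goes into the constructor, and fit() then receives only the data. The snippet only constructs the model and reads the setting back, and it guards the import in case xgboost is not installed. The configured_metric variable is introduced here for illustration.

```python
try:
    from xgboost import XGBClassifier

    # eval_metric is given to the constructor rather than to fit().
    model = XGBClassifier(eval_metric="error")

    # Read the setting back from the estimator's parameters.
    configured_metric = model.get_params()["eval_metric"]
except ImportError:
    # xgboost is not installed in this environment.
    configured_metric = None

print(configured_metric)
```

After this, training would be a plain model.fit(X_train, y_train) call with no eval_metric argument.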
5. You want to improve your XGBoost model's performance on a classification task with imbalanced classes. Which approach is best to try first?
hard
A. Reduce learning_rate to make training faster
B. Increase max_depth to make trees deeper
C. Use scale_pos_weight to balance positive and negative classes
D. Remove features with missing values

Solution

  1. Step 1: Understand class imbalance problem

    When classes are imbalanced, the model may ignore the smaller class.
  2. Step 2: Choose best method to handle imbalance

    Using scale_pos_weight adjusts the weight given to the positive class, which helps the model learn from the minority class on imbalanced data.
  3. Final Answer:

    Use scale_pos_weight to balance positive and negative classes -> Option C
  4. Quick Check:

    scale_pos_weight = best for imbalance [OK]
Hint: Adjust scale_pos_weight to handle imbalanced classes [OK]
Common Mistakes:
  • Increasing max_depth may cause overfitting
  • Reducing learning_rate slows training, not fixes imbalance
  • Removing features may lose important info
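
A common starting value for scale_pos_weight is the ratio of negative to positive samples. The sketch below computes that ratio for a made-up label list (90 negatives, 10 positives) and places it in a parameter dictionary of the kind passed to xgb.train. The data is hypothetical, and the ratio is a heuristic starting point rather than a guaranteed optimum.

```python
# Hypothetical labels: 90 negatives (0) and 10 positives (1).
y_train = [0] * 90 + [1] * 10

n_pos = sum(1 for label in y_train if label == 1)
n_neg = len(y_train) - n_pos

# Heuristic: ratio of negative to positive samples.
scale_pos_weight = n_neg / n_pos

# Parameter dictionary in the style accepted by xgb.train.
params = {
    "objective": "binary:logistic",
    "scale_pos_weight": scale_pos_weight,
}

print(params["scale_pos_weight"])
```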