Which of the following objective functions is not commonly used in XGBoost for classification tasks?
Think about which objective functions are for regression versus classification.
reg:squarederror is used for regression tasks, not classification. The others are classification or count models.
What will be the output of the following Python code snippet using XGBoost?
import xgboost as xgb
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

iris = load_iris()
X_train, X_test, y_train, y_test = train_test_split(iris.data, iris.target, test_size=0.2, random_state=42)
dtrain = xgb.DMatrix(X_train, label=y_train)
dtest = xgb.DMatrix(X_test, label=y_test)
params = {'objective': 'multi:softprob', 'num_class': 3, 'eval_metric': 'mlogloss'}
model = xgb.train(params, dtrain, num_boost_round=10, evals=[(dtest, 'eval')], verbose_eval=False)
preds = model.predict(dtest)
print(len(preds), len(preds[0]))
Check the number of test samples and the shape of prediction output for multi-class softprob.
The test set has 30 samples (20% of the 150 iris samples). With 'multi:softprob', the model returns one probability per class for each sample, so the prediction array has shape (30, 3) and the code prints "30 3".
You have a highly imbalanced binary classification dataset. Which XGBoost parameter setting is best to help the model focus on the minority class?
Think about how to handle class imbalance in XGBoost.
scale_pos_weight helps balance the positive class weight, improving performance on imbalanced data. Other options are either irrelevant or harmful.
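A minimal sketch of one common heuristic for setting scale_pos_weight, the ratio of negative to positive samples (the label vector here is hypothetical):

```python
import numpy as np

# Hypothetical imbalanced label vector: 95 negatives, 5 positives.
y = np.array([0] * 95 + [1] * 5)

# Common heuristic: scale_pos_weight = (# negatives) / (# positives),
# so errors on the minority positive class weigh more heavily in the loss.
neg, pos = np.sum(y == 0), np.sum(y == 1)
scale_pos_weight = neg / pos
print(scale_pos_weight)  # 19.0

params = {'objective': 'binary:logistic', 'scale_pos_weight': scale_pos_weight}
```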
After training an XGBoost model with objective='multi:softprob', you get a multi-class log loss (mlogloss) of 0.8 on the test set. What does this value indicate?
Lower mlogloss is better; think about what 0.8 means.
A multi-class log loss of 0.8 indicates moderate performance: the model assigns some probability to the correct classes but is not highly confident, or is sometimes wrong. A perfect model would have mlogloss near 0; for comparison, uniform guessing over 3 classes gives mlogloss of ln(3) ≈ 1.10.
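To see what the metric measures, here is a sketch computing mlogloss by hand on a small hypothetical set of predictions: it is the negative mean log of the probability the model assigned to the true class.

```python
import numpy as np

# Hypothetical predicted class probabilities for 3 samples over 3 classes.
probs = np.array([[0.7, 0.2, 0.1],
                  [0.3, 0.5, 0.2],
                  [0.1, 0.2, 0.7]])
y_true = np.array([0, 1, 2])  # true class index for each sample

# mlogloss = -(1/N) * sum of log(probability assigned to the true class)
true_class_probs = probs[np.arange(len(y_true)), y_true]  # [0.7, 0.5, 0.7]
mlogloss = -np.mean(np.log(true_class_probs))
print(round(mlogloss, 4))  # 0.4688
```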
Consider this code snippet that raises an error when run:
import xgboost as xgb
import numpy as np
X = np.array([[1, 2], [3, 4], [5, 6]])
y = np.array([0, 1])
dtrain = xgb.DMatrix(X, label=y)
params = {'objective': 'binary:logistic'}
model = xgb.train(params, dtrain, num_boost_round=5)
What is the cause of the error?
Check the shapes of X and y carefully.
The feature matrix X has 3 rows but y has only 2 labels. XGBoost raises a shape-mismatch error when the DMatrix is created, before training even begins, because each row of features must have exactly one label.