ML Python programming · ~20 mins

Decision tree classifier in ML Python - Practice Problems & Coding Challenges

Challenge - 5 Problems
🧠 Conceptual
intermediate
How does a decision tree classifier decide the best split?

Imagine you want to split your data at a node in a decision tree. What criterion does the tree use to choose the best feature and value to split on?

A. It splits to create the largest possible groups regardless of class labels.
B. It randomly picks any feature and value without considering the data.
C. It chooses the split that maximizes the information gain or reduces impurity the most.
D. It always splits on the feature with the highest numeric value.
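To build intuition for how a split is scored, here is a small standalone sketch (not part of the quiz) that computes impurity reduction using the Gini criterion; the helper names `gini` and `impurity_reduction` are my own, not from any library.

```python
def gini(labels):
    """Gini impurity of a list of class labels: 1 - sum(p_c^2)."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = {}
    for c in labels:
        counts[c] = counts.get(c, 0) + 1
    return 1.0 - sum((cnt / n) ** 2 for cnt in counts.values())

def impurity_reduction(parent, left, right):
    """How much a candidate split lowers the weighted child impurity."""
    n = len(parent)
    weighted = (len(left) / n) * gini(left) + (len(right) / n) * gini(right)
    return gini(parent) - weighted

parent = [0, 0, 1, 1]
# A split that separates the classes perfectly gives the largest reduction:
print(impurity_reduction(parent, [0, 0], [1, 1]))  # 0.5
# A split that leaves both children mixed gives no reduction:
print(impurity_reduction(parent, [0, 1], [0, 1]))  # 0.0
```

The tree evaluates many candidate (feature, threshold) pairs this way and keeps the one with the highest reduction.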
Predict Output
intermediate
Output of training a decision tree on simple data

What is the accuracy printed by this code after training a decision tree on the given data?

Python:
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score

X = [[0], [1], [2], [3]]
y = [0, 0, 1, 1]

model = DecisionTreeClassifier(random_state=42)
model.fit(X, y)
pred = model.predict(X)
acc = accuracy_score(y, pred)
print(acc)
A. 1.0
B. 0.5
C. 0.75
D. 0.0
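If you want to check your answer by running the snippet yourself (requires scikit-learn): the four points are perfectly separable by a single threshold around x = 1.5, so the tree fits its own training data exactly.

```python
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score

X = [[0], [1], [2], [3]]
y = [0, 0, 1, 1]

model = DecisionTreeClassifier(random_state=42)
model.fit(X, y)

# Predicting on the training data itself; one split separates the classes.
acc = accuracy_score(y, model.predict(X))
print(acc)  # 1.0
```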
Hyperparameter
advanced
Effect of max_depth on decision tree complexity

What happens if you set the max_depth parameter of a decision tree classifier to a very small number like 1?

A. The tree will randomly choose splits without considering max_depth.
B. The tree will grow very deep and overfit the data.
C. The tree will ignore the max_depth and grow fully.
D. The tree will be very simple, possibly underfitting the data.
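A quick way to see underfitting from a tiny `max_depth` (requires scikit-learn): on XOR-style data, no single split can separate the classes, so a depth-1 stump stalls while an unrestricted tree fits the training data exactly. The dataset here is my own toy example.

```python
from sklearn.tree import DecisionTreeClassifier

# XOR-like labels: no one threshold on one feature separates the classes.
X = [[0, 0], [0, 1], [1, 0], [1, 1]]
y = [0, 1, 1, 0]

stump = DecisionTreeClassifier(max_depth=1, random_state=0).fit(X, y)
full = DecisionTreeClassifier(random_state=0).fit(X, y)

print(stump.score(X, y))  # 0.5 -- one split cannot help on XOR (underfit)
print(full.score(X, y))   # 1.0 -- the deeper tree memorizes the data
```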
Metrics
advanced
Choosing the right metric for imbalanced classes

You trained a decision tree classifier on a dataset where 95% of samples belong to class A and 5% to class B. Which metric is best to evaluate your model's performance?

A. Precision and recall, because they focus on minority class performance.
B. Accuracy, because it shows overall correct predictions.
C. Mean squared error, because it measures prediction error.
D. Confusion matrix size, because bigger matrices mean better models.
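Here is a small sketch (my own labels, requires scikit-learn) of why accuracy misleads on a 95/5 class split: a model that always predicts the majority class scores 95% accuracy yet never finds a single minority sample.

```python
from sklearn.metrics import accuracy_score, precision_score, recall_score

y_true = [0] * 95 + [1] * 5  # 95% class A (0), 5% class B (1)
y_pred = [0] * 100           # a "model" that always predicts the majority

print(accuracy_score(y_true, y_pred))                     # 0.95 -- looks great
print(recall_score(y_true, y_pred, zero_division=0))      # 0.0  -- misses every B
print(precision_score(y_true, y_pred, zero_division=0))   # 0.0  -- no B predicted
```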
🔧 Debug
expert
Why does this decision tree code raise an error?

What error does this code raise and why?

from sklearn.tree import DecisionTreeClassifier

X = [[1, 2], [3, 4]]
y = [0, 1, 0]

model = DecisionTreeClassifier()
model.fit(X, y)
A. TypeError: DecisionTreeClassifier() got an unexpected keyword argument.
B. ValueError: Found input variables with inconsistent numbers of samples, because X has 2 samples but y has 3 labels.
C. IndexError: list index out of range during fitting.
D. No error, the model fits successfully.