Challenge - 5 Problems

Anomaly Detection Master

🧠 Conceptual

intermediate

What is the main goal of anomaly detection?

In simple terms, what does anomaly detection try to find in data?

A. To find data points that are very different from the rest

B. To find patterns that happen very often in the data

C. To group data points into clusters based on similarity

D. To predict future values based on past data

❓ Predict Output

intermediate

Output of simple anomaly score calculation

Given the following Python code that calculates anomaly scores as the absolute difference from the mean, what is the output list?

ML Python

data = [10, 12, 10, 13, 100]
mean = sum(data) / len(data)
anomaly_scores = [abs(x - mean) for x in data]
print([round(score, 2) for score in anomaly_scores])

A. [18.0, 16.0, 18.0, 15.0, 72.0]

B. [21.0, 19.0, 21.0, 18.0, 69.0]

C. [19.0, 17.0, 19.0, 16.0, 71.0]

D. [19.0, 17.0, 18.0, 16.0, 71.0]
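
The scoring idea in this question (distance from the mean) can be wrapped in small reusable functions. The sketch below is illustrative only: the function names, the `readings` values, and the threshold of 25 are assumptions, and the data is deliberately different from the quiz data.

```python
from statistics import mean


def deviation_scores(values):
    """Return each value's absolute distance from the mean of `values`."""
    center = mean(values)
    return [abs(v - center) for v in values]


def flag_anomalies(values, threshold):
    """Return the values whose deviation score exceeds `threshold`."""
    scores = deviation_scores(values)
    return [v for v, s in zip(values, scores) if s > threshold]


# Hypothetical readings (not the quiz data); the threshold is an arbitrary choice.
readings = [20, 22, 21, 19, 68]
print(deviation_scores(readings))
print(flag_anomalies(readings, threshold=25))
```

Note that a large outlier pulls the mean toward itself, so the scores of ordinary points grow as well. A median-based center is a common alternative when that matters.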

❓ Model Choice

advanced

Best model choice for anomaly detection in high-dimensional data

You have a dataset with many features (dimensions) and want to detect anomalies. Which model is best suited?

A. K-Nearest Neighbors (KNN) based anomaly detection

B. Principal Component Analysis (PCA) based anomaly detection

C. Linear Regression

D. Decision Tree Classifier
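
For readers unfamiliar with how a PCA-based detector scores points, one common formulation is reconstruction error: project each point onto the leading principal component(s) and measure what is left over. The sketch below is a pure-Python illustration under strong assumptions: a single component found by power iteration, and a tiny hypothetical 3-feature dataset. Real high-dimensional data would normally use a library such as scikit-learn's `PCA`.

```python
import math


def center(rows):
    """Subtract the per-feature mean from every row."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    return [[x - m for x, m in zip(row, means)] for row in rows]


def covariance(centered):
    """Sample covariance matrix of already-centered rows."""
    n, d = len(centered), len(centered[0])
    return [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
            for i in range(d)]


def top_component(cov, iterations=100):
    """Approximate the leading eigenvector of `cov` by power iteration.

    Assumes the all-ones start vector is not orthogonal to that eigenvector.
    """
    d = len(cov)
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v


def reconstruction_errors(rows):
    """Distance between each centered row and its projection on the top component."""
    centered = center(rows)
    v = top_component(covariance(centered))
    errors = []
    for r in centered:
        proj = sum(a * b for a, b in zip(r, v))
        residual = [a - proj * b for a, b in zip(r, v)]
        errors.append(math.sqrt(sum(x * x for x in residual)))
    return errors


# Hypothetical data: nine points on the line x = y = z, plus one point off that line.
rows = [[t, t, t] for t in range(1, 10)] + [[8, 2, 5]]
errors = reconstruction_errors(rows)
print(errors.index(max(errors)))  # index of the point with the largest error
```

Points that follow the dominant direction of the data reconstruct almost perfectly, while the off-line point keeps a large residual, which is what the detector uses as its anomaly score.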

❓ Metrics

advanced

Which metric is best to evaluate anomaly detection performance?

You have a model that flags anomalies. Which metric best measures how well it finds true anomalies without too many false alarms?

A. Precision

B. Accuracy

C. Mean Squared Error

D. Recall
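
To make the listed classification metrics concrete, the sketch below computes precision, recall, and accuracy from a confusion count. The labels are hypothetical (100 points, 5 true anomalies, label 1 meaning "anomaly"), and the helper names are illustrative rather than taken from any library.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, fn, tn), treating label 1 as 'anomaly'."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fp, fn, tn


def precision(tp, fp):
    """Share of flagged points that are real anomalies."""
    return tp / (tp + fp) if (tp + fp) else 0.0


def recall(tp, fn):
    """Share of real anomalies that were flagged."""
    return tp / (tp + fn) if (tp + fn) else 0.0


def accuracy(tp, fp, fn, tn):
    """Share of all points labeled correctly."""
    return (tp + tn) / (tp + fp + fn + tn)


# Hypothetical labels: 5 anomalies followed by 95 normal points.
y_true = [1] * 5 + [0] * 95
# A detector that catches 4 of the 5 anomalies and raises 4 false alarms.
y_pred = [1] * 4 + [0] * 1 + [1] * 4 + [0] * 91
# A useless detector that never flags anything.
y_none = [0] * 100

for name, pred in [("detector", y_pred), ("flag nothing", y_none)]:
    tp, fp, fn, tn = confusion_counts(y_true, pred)
    print(name, precision(tp, fp), recall(tp, fn), accuracy(tp, fp, fn, tn))
```

With anomalies this rare, both predictors reach the same accuracy even though one of them never finds an anomaly, so the per-class metrics carry most of the information.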

🔧 Debug

expert

Why does this Isolation Forest model fail to detect anomalies?

You trained an Isolation Forest model but it flags almost all points as normal. What is the most likely cause?

ML Python

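from sklearn.ensemble import IsolationForest

# `data` is assumed to be a 2D array of shape (n_samples, n_features) defined earlier
model = IsolationForest(contamination=0.1, max_samples=100)
model.fit(data)
preds = model.predict(data)  # 1 = normal, -1 = anomaly
print(sum(preds == -1))  # Count anomalies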

A. max_samples is too high, causing overfitting

B. The model needs more trees to detect anomalies

C. The data is not scaled, causing the model to fail

D. The contamination parameter is set too low compared to the actual anomaly rate
