
Data distributions and outliers in ML Python - Practice Problems & Coding Challenges

Challenge - 5 Problems
Problem 1: Conceptual (intermediate)
Understanding the effect of outliers on mean and median

Consider a dataset with values: [10, 12, 14, 15, 100]. Which statement best describes the effect of the outlier (100) on the mean and median?

A. The mean is heavily influenced by the outlier, but the median remains relatively stable.
B. Both mean and median are equally influenced by the outlier.
C. The median is heavily influenced by the outlier, but the mean remains stable.
D. Neither mean nor median is influenced by the outlier.
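
After attempting the question, one way to check your reasoning is to compute both statistics with and without the suspected outlier. The sketch below uses the dataset from the question; the variable names are illustrative only.

```python
import numpy as np

# Values from the question; 100 is the suspected outlier.
data = np.array([10, 12, 14, 15, 100])
trimmed = data[data != 100]  # same data with the outlier removed

mean_with, median_with = float(np.mean(data)), float(np.median(data))
mean_without, median_without = float(np.mean(trimmed)), float(np.median(trimmed))

print(f"with outlier:    mean={mean_with}, median={median_with}")
print(f"without outlier: mean={mean_without}, median={median_without}")
```

Comparing how far each statistic moves between the two lines shows which one is more sensitive to a single extreme value.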

Problem 2: Predict Output (intermediate)
Detecting outliers using the IQR method

What is the output of the following Python code that detects outliers using the interquartile range (IQR)?

import numpy as np

data = np.array([5, 7, 8, 12, 15, 18, 22, 100])

# First and third quartiles (NumPy's default linear interpolation).
Q1 = np.percentile(data, 25)
Q3 = np.percentile(data, 75)
IQR = Q3 - Q1

# Flag values more than 1.5 * IQR below Q1 or above Q3.
outliers = data[(data < Q1 - 1.5 * IQR) | (data > Q3 + 1.5 * IQR)]
print(outliers.tolist())

A. []
B. [5, 7]
C. [100]
D. [22, 100]

Problem 3: Model Choice (advanced)
Choosing a model robust to outliers

You have a dataset with many outliers. Which regression model is best suited to minimize the effect of outliers on predictions?

A. Robust regression using Huber loss
B. Ridge regression with L2 regularization
C. Linear regression using least squares loss
D. Polynomial regression of degree 5
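
To explore how loss functions react to outliers, the sketch below fits two of the listed model types to the same corrupted data using scikit-learn. The synthetic data, the true slope of 2, and the size of the corruption are all assumptions made for illustration; they are not part of the challenge.

```python
import numpy as np
from sklearn.linear_model import HuberRegressor, LinearRegression

# Synthetic data (an assumption for illustration): y = 2x + 1 plus small noise.
rng = np.random.default_rng(0)
X = np.arange(20, dtype=float).reshape(-1, 1)
y = 2.0 * X.ravel() + 1.0 + rng.normal(0.0, 0.5, size=20)

# Corrupt the last three targets so they act as outliers.
y[-3:] += 60.0

ols = LinearRegression().fit(X, y)                # least squares loss
huber = HuberRegressor(max_iter=1000).fit(X, y)   # Huber loss, default epsilon=1.35

ols_slope = float(ols.coef_[0])
huber_slope = float(huber.coef_[0])
print(f"true slope: 2.0, least squares: {ols_slope:.2f}, Huber: {huber_slope:.2f}")
```

Squared error grows quadratically with the residual, so a few large residuals can dominate the fit, whereas the Huber loss grows only linearly beyond its threshold.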

Problem 4: Hyperparameter (advanced)
Adjusting hyperparameters to handle outliers in clustering

When using DBSCAN clustering on data with noise and outliers, which hyperparameter adjustment helps to better identify outliers as noise points?

A. Decrease the epsilon (eps) value to make clusters tighter
B. Decrease the minimum samples (min_samples) to form a cluster
C. Increase the epsilon (eps) value to include more points in clusters
D. Increase the minimum samples (min_samples) to require more points per cluster
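
A hands-on way to study this is to count how many points scikit-learn's DBSCAN labels as noise (label -1) under different settings. The sketch below is exploratory only: the synthetic blobs, the scattered points, the helper `count_noise`, and the parameter grid are all hypothetical choices, and suitable values depend on the density of the real data.

```python
import numpy as np
from sklearn.cluster import DBSCAN

# Synthetic data (an assumption for illustration): two dense blobs plus scattered points.
rng = np.random.default_rng(0)
blob_a = rng.normal(loc=(0.0, 0.0), scale=0.3, size=(50, 2))
blob_b = rng.normal(loc=(5.0, 5.0), scale=0.3, size=(50, 2))
scattered = rng.uniform(low=-3.0, high=8.0, size=(10, 2))
X = np.vstack([blob_a, blob_b, scattered])


def count_noise(eps, min_samples):
    """Return how many points DBSCAN labels as noise (label -1)."""
    labels = DBSCAN(eps=eps, min_samples=min_samples).fit_predict(X)
    return int(np.sum(labels == -1))


# Hypothetical grid: vary one hyperparameter at a time and compare the counts.
for eps, min_samples in [(0.5, 3), (0.5, 10), (1.0, 3), (1.0, 10)]:
    n_noise = count_noise(eps, min_samples)
    print(f"eps={eps}, min_samples={min_samples}: {n_noise} noise points")
```

A point is labeled noise when it is neither a core point nor within eps of one, so both hyperparameters change how many points end up with label -1.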

Problem 5: Metrics (expert)
Evaluating model performance with imbalanced data and outliers

You have a classification dataset with imbalanced classes and some outliers. Which metric is most reliable to evaluate model performance in this scenario?

A. Accuracy
B. F1-score
C. Precision
D. Mean Squared Error (MSE)
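
To see how the classification metrics behave under class imbalance, the sketch below scores a trivial classifier that always predicts the majority class. The 95/5 class split and the predictions are hypothetical and chosen only to make the contrast visible.

```python
from sklearn.metrics import accuracy_score, f1_score, precision_score

# Hypothetical labels: 95 negatives and 5 positives (imbalanced).
y_true = [0] * 95 + [1] * 5
# A trivial classifier that always predicts the majority class.
y_pred = [0] * 100

acc = accuracy_score(y_true, y_pred)
# zero_division=0 returns 0 instead of warning when no positives are predicted.
prec = precision_score(y_true, y_pred, zero_division=0)
f1 = f1_score(y_true, y_pred, zero_division=0)

print(f"accuracy={acc:.2f}, precision={prec:.2f}, f1={f1:.2f}")
```

A classifier that never finds a single positive example can still look strong on one of these metrics, which is worth keeping in mind when choosing how to evaluate a model on imbalanced data.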