
Exploratory data analysis in ML Python - Practice Problems & Coding Challenges

Challenge - 5 Problems
Data output
intermediate
Calculate the mean and median of a dataset

You have a dataset of ages: [22, 25, 29, 24, 30, 22, 28]. What are the mean and median values?

ML Python
import numpy as np
ages = [22, 25, 29, 24, 30, 22, 28]
mean_age = np.mean(ages)
median_age = np.median(ages)
# Round the mean to two decimals so the output matches the answer options
print(f"Mean: {mean_age:.2f}, Median: {median_age}")
A. Mean: 25.0, Median: 25.71
B. Mean: 25.71, Median: 24.0
C. Mean: 24.0, Median: 25.71
D. Mean: 25.71, Median: 25.0
Visualization
intermediate
Identify the correct histogram plot for given data

Given the data points: [1, 2, 2, 3, 3, 3, 4, 4, 5], which histogram correctly shows the frequency of each number?

ML Python
import matplotlib.pyplot as plt
import numpy as np
data = [1, 2, 2, 3, 3, 3, 4, 4, 5]
plt.hist(data, bins=5, edgecolor='black')
plt.show()
A. Bar heights: 1=1, 2=2, 3=3, 4=2, 5=1
B. Bar heights: 1=2, 2=1, 3=3, 4=1, 5=2
C. Bar heights: 1=1, 2=3, 3=2, 4=2, 5=1
D. Bar heights: 1=1, 2=2, 3=2, 4=3, 5=1
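
To check bar heights without opening a plot window, the binning can be inspected numerically. matplotlib's hist bins data with numpy.histogram, so the counts and bin edges below are what the bars would show. This is a minimal sketch; the values list is hypothetical and is not the data from the question.

```python
import numpy as np

# Hypothetical sample, not the data from the question above
values = [0, 1, 1, 2, 4]

# With bins=2 the range [0, 4] is split into two equal-width bins.
# Every bin is half-open [left, right) except the last, which includes its right edge.
counts, edges = np.histogram(values, bins=2)

print(counts.tolist())  # number of values falling in each bin
print(edges.tolist())   # bin boundaries, one more entry than counts
```

Printing the edges is useful because equal-width bins are not necessarily centered on the data values, which affects which bar each value lands in.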
Attempts:
2 left
🧠 Conceptual
advanced
Understanding correlation coefficients

Which statement correctly describes a Pearson correlation coefficient of -0.85 between two variables?

A. Strong positive linear relationship
B. Weak negative linear relationship
C. Strong negative linear relationship
D. No linear relationship
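
For reference, here is a minimal sketch of how a Pearson coefficient is computed in practice. The arrays hours and errors are hypothetical illustration data, not part of the question. The sketch computes r with np.corrcoef and again from the definition (covariance divided by the product of the standard deviations) to show that the two agree.

```python
import numpy as np

# Hypothetical data for illustration only: hours practiced vs. errors made
hours = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
errors = np.array([9.0, 8.0, 6.0, 5.0, 2.0])

# np.corrcoef returns a 2x2 correlation matrix; the off-diagonal entry is r
r = np.corrcoef(hours, errors)[0, 1]

# Same quantity from the definition: sum of products of deviations,
# divided by the square root of the product of summed squared deviations
dx = hours - hours.mean()
dy = errors - errors.mean()
r_manual = (dx * dy).sum() / np.sqrt((dx ** 2).sum() * (dy ** 2).sum())

print(f"r = {r:.3f}, from definition = {r_manual:.3f}")
```

The coefficient always lies between -1 and 1. Its sign gives the direction of the linear trend and its absolute value gives the strength.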
🔧 Debug
advanced
Identify the error in this code for calculating variance

What error, if any, will this code raise?

ML Python
import numpy as np
data = [1, 2, 3, 4, 5]
variance = np.var(data, ddof=1)
print(variance)
A. SyntaxError due to missing parenthesis
B. TypeError because ddof is invalid
C. NameError because variance is undefined
D. No error, prints variance
🚀 Application
expert
Determine the number of clusters from a silhouette score plot

You run k-means clustering on a dataset with k values from 2 to 6. The silhouette scores are: {2: 0.45, 3: 0.52, 4: 0.48, 5: 0.40, 6: 0.35}. Which k should you choose for best cluster separation?

A. k = 2
B. k = 3
C. k = 4
D. k = 5
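
To make concrete what a silhouette score measures, here is a minimal NumPy sketch of the calculation. The function name silhouette_score_sketch and the toy points are hypothetical, and the sketch assumes Euclidean distance and at least two clusters. In practice scikit-learn's sklearn.metrics.silhouette_score would typically be used instead. For each point, a is the mean distance to the other points in its own cluster, b is the smallest mean distance to the points of any other cluster, and the point's score is (b - a) / max(a, b).

```python
import numpy as np

def silhouette_score_sketch(points, labels):
    """Mean silhouette over all points (illustrative; assumes >= 2 clusters)."""
    points = np.asarray(points, dtype=float)
    labels = np.asarray(labels)
    # Pairwise Euclidean distance matrix
    dists = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
    scores = []
    for i in range(len(points)):
        same = labels == labels[i]
        same[i] = False  # exclude the point itself from its own cluster
        if not same.any():
            scores.append(0.0)  # common convention for single-point clusters
            continue
        a = dists[i, same].mean()  # cohesion: mean distance within own cluster
        b = min(                   # separation: nearest other cluster
            dists[i, labels == other].mean()
            for other in np.unique(labels)
            if other != labels[i]
        )
        scores.append((b - a) / max(a, b))
    return float(np.mean(scores))

# Hypothetical toy data: two tight groups that are far apart
toy_points = [[0, 0], [0, 1], [10, 0], [10, 1]]
good = silhouette_score_sketch(toy_points, [0, 0, 1, 1])  # labels follow the groups
bad = silhouette_score_sketch(toy_points, [0, 1, 0, 1])   # labels cut across the groups
print(f"good labelling: {good:.3f}, bad labelling: {bad:.3f}")
```

Scores range from -1 to 1. A labelling that matches the natural groups scores close to 1, while one that splits them scores near or below 0.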