Challenge - 5 Problems
Statistics Uncertainty Master
❓ Predict Output
intermediate
Understanding Confidence Interval Calculation
What is the output of the following code that calculates a 95% confidence interval for a sample mean?
SciPy
import numpy as np
from scipy import stats

np.random.seed(0)
sample = np.random.normal(loc=50, scale=5, size=30)
mean = np.mean(sample)
sem = stats.sem(sample)
conf_int = stats.t.interval(0.95, len(sample)-1, loc=mean, scale=sem)
print(tuple(round(x, 2) for x in conf_int))
💡 Hint
Recall that the confidence interval depends on the sample mean, standard error, and t-distribution quantiles.
The code calculates the 95% confidence interval using the t-distribution for a sample of size 30 drawn from a normal distribution. The output is the lower and upper bounds rounded to two decimals.
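As a rough sketch of what `stats.t.interval` does with these inputs, the same bounds can be rebuilt by hand from the sample mean, the standard error, and the t critical value. The names `t_crit`, `manual_ci`, and `library_ci` are illustrative and not part of the original challenge.

```python
import numpy as np
from scipy import stats

np.random.seed(0)
sample = np.random.normal(loc=50, scale=5, size=30)

n = len(sample)
mean = np.mean(sample)
# Standard error of the mean: sample standard deviation (ddof=1) over sqrt(n)
sem = np.std(sample, ddof=1) / np.sqrt(n)

# Two-sided 95% critical value of the t-distribution with n-1 degrees of freedom
t_crit = stats.t.ppf(0.975, df=n - 1)

# Interval built by hand: mean plus or minus t_crit * sem
manual_ci = (mean - t_crit * sem, mean + t_crit * sem)
# Interval from the library call used in the challenge
library_ci = stats.t.interval(0.95, n - 1, loc=mean, scale=stats.sem(sample))

print(manual_ci)
print(library_ci)
```

Both tuples should agree to floating-point precision, since `stats.sem` also uses `ddof=1` by default.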
❓ Data Output
intermediate
Result of Sampling Distribution Visualization Data
After running this code to simulate the sampling distribution of the mean, what is the mean of the `sample_means` list?
SciPy
import numpy as np

np.random.seed(1)
sample_means = [np.mean(np.random.normal(100, 15, 50)) for _ in range(1000)]
mean_of_means = round(np.mean(sample_means), 2)
print(mean_of_means)
💡 Hint
The sample means should be close to the population mean but may vary slightly due to randomness.
The average of the 1000 sample means is close to the population mean of 100 but not exactly 100 due to sampling variability.
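To get a feel for how far from 100 the result can plausibly land, the following sketch compares the spread of the simulated sample means with the theoretical standard error σ/√n. The variable names (`theoretical_se`, `empirical_se`, `se_of_grand_mean`) are assumptions made for this illustration.

```python
import numpy as np

np.random.seed(1)
pop_mean, pop_sd, n, reps = 100, 15, 50, 1000
sample_means = np.array(
    [np.mean(np.random.normal(pop_mean, pop_sd, n)) for _ in range(reps)]
)

# Theoretical standard error of a single sample mean: sigma / sqrt(n)
theoretical_se = pop_sd / np.sqrt(n)
# Spread actually observed across the simulated sample means
empirical_se = sample_means.std(ddof=1)
# Standard error of the average of all sample means: sigma / sqrt(n * reps)
se_of_grand_mean = pop_sd / np.sqrt(n * reps)

print(round(theoretical_se, 3), round(empirical_se, 3), round(se_of_grand_mean, 3))
```

The theoretical standard error is about 2.12 for one sample mean and about 0.07 for the mean of all 1000, which is why the printed result sits very close to 100 without matching it exactly.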
❓ Visualization
advanced
Interpreting a Histogram of Sample Means
Which option best describes the shape of the histogram generated by this code showing sample means from repeated sampling?
SciPy
import numpy as np
import matplotlib.pyplot as plt

np.random.seed(2)
sample_means = [np.mean(np.random.exponential(scale=1.0, size=40)) for _ in range(1000)]
plt.hist(sample_means, bins=30, color='skyblue', edgecolor='black')
plt.title('Histogram of Sample Means from Exponential Distribution')
plt.xlabel('Sample Mean')
plt.ylabel('Frequency')
plt.show()
💡 Hint
Think about how the Central Limit Theorem affects the distribution of sample means.
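Even though the original data are exponential (right-skewed), the Central Limit Theorem implies that the distribution of sample means is approximately normal (bell-shaped) when the sample size is large enough. With samples of size 40, the histogram is roughly symmetric and centered near the population mean of 1.0, though a slight right skew can remain.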
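One way to check this numerically, rather than by eye, is to compare the skewness of the raw exponential draws with the skewness of the sample means. This is a minimal sketch; it draws all samples in one array instead of a list comprehension, and the names `raw_skew` and `means_skew` are illustrative.

```python
import numpy as np
from scipy import stats

np.random.seed(2)
n, reps = 40, 1000

# Each row is one sample of size n from an exponential distribution
raw = np.random.exponential(scale=1.0, size=(reps, n))
sample_means = raw.mean(axis=1)

# Skewness of the individual observations (theoretical value: 2)
raw_skew = stats.skew(raw.ravel())
# Skewness of the sample means (theoretical value: 2 / sqrt(n))
means_skew = stats.skew(sample_means)

print(round(raw_skew, 2), round(means_skew, 2))
```

The skewness of the means should be far smaller than that of the raw data, consistent with a histogram that looks close to bell-shaped.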
🧠 Conceptual
advanced
Why Use Probability Distributions in Statistics?
Which option best explains why statistics uses probability distributions to quantify uncertainty?
💡 Hint
Think about what uncertainty means in real life and how statistics handles it.
Probability distributions represent how data can vary and help quantify the uncertainty inherent in measurements and predictions.
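As a small illustration of that idea, the sketch below assumes a hypothetical measurement that follows a normal distribution with mean 100 and standard deviation 15 (values chosen only for this example) and uses the distribution to turn "how uncertain is it?" into concrete numbers.

```python
from scipy import stats

# Hypothetical measurement model (assumed for illustration only)
measurement = stats.norm(loc=100, scale=15)

# Probability that a single reading falls within one standard deviation of the mean
p_within = measurement.cdf(115) - measurement.cdf(85)

# Range expected to contain the central 95% of readings
low, high = measurement.ppf([0.025, 0.975])

print(round(p_within, 3))
print(round(low, 1), round(high, 1))
```

Under this assumed model, about 68% of readings fall between 85 and 115, and the central 95% lie between roughly 70.6 and 129.4.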
🔧 Debug
expert
Identify the Error in Confidence Interval Calculation
What error, if any, does this code raise when trying to calculate a 99% confidence interval for a small sample?
SciPy
import numpy as np
from scipy import stats

sample = np.array([5, 7, 8, 6, 9])
mean = np.mean(sample)
sem = stats.sem(sample)
conf_int = stats.t.interval(0.99, len(sample)-1, loc=mean, scale=sem)
print(conf_int)
💡 Hint
Check the degrees of freedom parameter passed to the t.interval function.
The degrees of freedom should be the sample size minus one, and the code passes exactly that (`len(sample)-1`, which is 4). The call is therefore valid: no exception is raised, and the code prints an interval of roughly (3.74, 10.26) around the sample mean of 7.0.
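One way to settle which exception, if any, the snippet raises is to wrap the call in a `try`/`except` and record the outcome. This is a sketch for checking the behavior; the `raised` variable is an addition for illustration and is not part of the original challenge.

```python
import numpy as np
from scipy import stats

sample = np.array([5, 7, 8, 6, 9])
mean = np.mean(sample)
sem = stats.sem(sample)

try:
    # Same call as in the challenge: df = len(sample) - 1 = 4
    conf_int = stats.t.interval(0.99, len(sample) - 1, loc=mean, scale=sem)
    raised = None
except Exception as exc:
    # Record the exception type name if the call fails
    conf_int = None
    raised = type(exc).__name__

print("raised:", raised)
print("interval:", conf_int)
```

With four degrees of freedom the 99% critical value is large (about 4.6), so the interval is wide for such a small sample, but the computation itself completes normally.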