Challenge - 5 Problems

🎖️

Statistics Uncertainty Master

Get all challenges correct to earn this badge!

Test your skills under time pressure!

❓ Predict Output

intermediate

Understanding Confidence Interval Calculation

What is the output of the following code that calculates a 95% confidence interval for a sample mean?

SciPy

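import numpy as np
from scipy import stats

np.random.seed(0)  # fix the seed so the sample is reproducible
sample = np.random.normal(loc=50, scale=5, size=30)
mean = np.mean(sample)
sem = stats.sem(sample)  # standard error of the mean (uses ddof=1)
# 95% t-interval with n - 1 degrees of freedom, centred on the sample mean
conf_int = stats.t.interval(0.95, len(sample)-1, loc=mean, scale=sem)
# float() keeps the printed tuple free of np.float64(...) wrappers on NumPy 2+
print(tuple(round(float(x), 2) for x in conf_int))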

A. (47.50, 52.66)

B. (49.00, 51.00)

C. (48.50, 51.50)

D. (48.01, 52.15)
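
For reference, the interval in this question follows the usual recipe of mean ± t* × SEM. The sketch below works through that recipe by hand and compares it with stats.t.interval. The measurements are hypothetical values chosen only for illustration, not the sample from the question.

```python
import numpy as np
from scipy import stats

# Hypothetical measurements, chosen only for illustration.
sample = np.array([12.1, 11.8, 12.6, 12.3, 11.9, 12.4, 12.0, 12.2])

n = len(sample)
mean = np.mean(sample)
sem = stats.sem(sample)  # sample standard deviation (ddof=1) divided by sqrt(n)
df = n - 1               # degrees of freedom for the t distribution

# Two-sided 95% interval: the critical value is the 97.5th percentile of t(df).
t_crit = stats.t.ppf(0.975, df)
manual = (mean - t_crit * sem, mean + t_crit * sem)

# The same interval from the convenience method used in the question.
library = stats.t.interval(0.95, df, loc=mean, scale=sem)

print("manual: ", manual)
print("library:", library)
```

Both lines should print the same pair of numbers, since the method wraps the same percentile calculation.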

❓ Data Output

intermediate

Mean of a Simulated Sampling Distribution

This code simulates the sampling distribution of the mean. After it runs, what is the mean of the values in sample_means?

SciPy

import numpy as np
np.random.seed(1)
sample_means = [np.mean(np.random.normal(100, 15, 50)) for _ in range(1000)]
mean_of_means = round(np.mean(sample_means), 2)
print(mean_of_means)

A. 99.85

B. 100.0

C. 101.5

D. 98.0
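
As background, the mean of many sample means tends to sit close to the population mean, and their spread tends to sit close to sigma / sqrt(n). The sketch below checks both with a simulation. The seed, the population parameters, and the sample size are assumptions made for illustration, and they differ from those in the question.

```python
import numpy as np

rng = np.random.default_rng(42)  # arbitrary seed, for reproducibility only

# Hypothetical population and sampling plan.
mu, sigma = 10.0, 2.0
n, reps = 25, 2000

# Draw `reps` samples of size `n` at once; each row is one sample.
samples = rng.normal(mu, sigma, size=(reps, n))
sample_means = samples.mean(axis=1)

mean_of_means = sample_means.mean()
spread_of_means = sample_means.std(ddof=1)
theoretical_se = sigma / np.sqrt(n)  # standard error predicted by theory

print("mean of sample means:", mean_of_means)
print("std of sample means: ", spread_of_means)
print("sigma / sqrt(n):     ", theoretical_se)
```

The simulated values only approximate the theoretical ones, and the gap shrinks as reps grows.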

❓ Visualization

advanced

Interpreting a Histogram of Sample Means

Which option best describes the shape of the histogram this code produces from the means of repeated samples?

SciPy

import numpy as np
import matplotlib.pyplot as plt
np.random.seed(2)
sample_means = [np.mean(np.random.exponential(scale=1.0, size=40)) for _ in range(1000)]
plt.hist(sample_means, bins=30, color='skyblue', edgecolor='black')
plt.title('Histogram of Sample Means from Exponential Distribution')
plt.xlabel('Sample Mean')
plt.ylabel('Frequency')
plt.show()

A. The histogram is heavily skewed right, reflecting the original exponential distribution.

B. The histogram shows multiple peaks, indicating a bimodal distribution.

C. The histogram is uniform because sample means are evenly distributed.

D. The histogram is approximately symmetric and bell-shaped, as the Central Limit Theorem predicts.
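
One way to put a number on histogram shape is sample skewness. For the mean of n independent exponential draws, theory gives a skewness of 2 / sqrt(n). The sketch below compares that value with a simulation for a few sample sizes. The seed, the repetition count, and the list of sample sizes are assumptions made for illustration.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)  # arbitrary seed, for reproducibility only
reps = 20000                    # number of simulated samples per sample size

observed = {}
for n in (1, 5, 40, 200):
    # Each row is one sample of size n; take the mean of every row.
    means = rng.exponential(scale=1.0, size=(reps, n)).mean(axis=1)
    observed[n] = float(stats.skew(means))
    theory = 2 / np.sqrt(n)
    print(f"n={n:>3}  observed skewness={observed[n]:.3f}  theory={theory:.3f}")
```

A skewness near 0 indicates a roughly symmetric histogram, while larger positive values indicate a longer right tail.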

🧠 Conceptual

advanced

Why Use Probability Distributions in Statistics?

Which option best explains why statistics uses probability distributions to quantify uncertainty?

A. Because probability distributions provide exact predictions of future events without error.

B. Because probability distributions model the variability and randomness in data, allowing us to estimate uncertainty.

C. Because probability distributions eliminate the need for data collection by assuming fixed values.

D. Because probability distributions simplify data by ignoring variability and focusing on averages.
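
To make the idea concrete, here is a minimal sketch of a distribution turning "how uncertain are we?" into numbers. It assumes a hypothetical measurement modelled as Normal with mean 100 and standard deviation 15. Those parameters are illustrative and do not come from any dataset.

```python
from scipy import stats

# Hypothetical model of a noisy measurement.
model = stats.norm(loc=100, scale=15)

# Probability that a single measurement lands within one standard deviation.
p_within = model.cdf(115) - model.cdf(85)

# Central range expected to contain 95% of measurements.
low, high = model.ppf([0.025, 0.975])

print(f"P(85 < X < 115) = {p_within:.4f}")
print(f"central 95% range: ({low:.1f}, {high:.1f})")
```

The distribution does not say what the next value will be. It says how likely each range of values is, which is what quantifying uncertainty means here.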

🔧 Debug

expert

Identify the Error in Confidence Interval Calculation

What happens when this code calculates a 99% confidence interval for a small sample?

SciPy

import numpy as np
from scipy import stats

sample = np.array([5, 7, 8, 6, 9])
mean = np.mean(sample)
sem = stats.sem(sample)
conf_int = stats.t.interval(0.99, len(sample)-1, loc=mean, scale=sem)
print(conf_int)

A. IndexError: list index out of range

B. TypeError: unsupported operand type(s) for +: 'int' and 'str'

C. ValueError: degrees of freedom must be positive

D. No error, outputs the confidence interval tuple
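
When debugging interval code, it can help to validate inputs explicitly instead of relying on whatever the library does with degenerate data. The sketch below is one possible guard. The helper name t_confidence_interval, its checks, and its error messages are hypothetical and are not part of SciPy. The sample passed to it is also made up.

```python
import numpy as np
from scipy import stats

def t_confidence_interval(sample, confidence=0.95):
    """Hypothetical helper: t-based interval for the mean, with input checks."""
    data = np.asarray(sample, dtype=float)
    # A t-interval needs at least two observations to estimate the spread.
    if data.ndim != 1 or data.size < 2:
        raise ValueError("need a 1-D sample with at least two observations")
    if not 0 < confidence < 1:
        raise ValueError("confidence must be strictly between 0 and 1")
    df = data.size - 1
    low, high = stats.t.interval(confidence, df, loc=data.mean(), scale=stats.sem(data))
    return float(low), float(high)

# Illustrative call with made-up values.
low, high = t_confidence_interval([2.1, 2.4, 1.9, 2.6], confidence=0.99)
print(low, high)
```

Raising a ValueError here is a design choice of this sketch. It turns a silently meaningless result into a failure that is easy to trace.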
