P-values and significance in Data Analysis Python - Practice Problems & Coding Challenges

Challenge - 5 Problems
Predict Output
intermediate
Understanding P-value from a t-test
What is the output of the following Python code that performs a t-test on two small samples?
Data Analysis Python
from scipy.stats import ttest_ind
sample1 = [5, 7, 8, 6, 9]
sample2 = [10, 12, 11, 13, 14]
t_stat, p_value = ttest_ind(sample1, sample2)
print(round(p_value, 3))
A. 0.05
B. 0.001
C. 0.5
D. 0.1
💡 Hint
Think about how different the two samples are and what a small p-value means.
data_output
intermediate
Interpreting significance from p-values in a DataFrame
Given the DataFrame below with p-values from multiple tests, which rows are considered statistically significant at alpha = 0.05?
Data Analysis Python
import pandas as pd
data = {'Test': ['A', 'B', 'C', 'D'], 'p_value': [0.03, 0.07, 0.001, 0.2]}
df = pd.DataFrame(data)
significant = df[df['p_value'] < 0.05]
print(significant['Test'].tolist())
A. ['A', 'B', 'C']
B. ['B', 'D']
C. ['A', 'C']
D. ['C', 'D']
💡 Hint
Remember: a result counts as significant here only when its p-value is below alpha = 0.05.
🧠 Conceptual
advanced
Effect of sample size on p-value
Which statement best explains how increasing sample size affects the p-value in hypothesis testing?
A. Increasing sample size reduces the standard error of the estimate, which tends to produce smaller p-values when a real effect exists.
B. Increasing sample size always increases the p-value, making results less significant.
C. Sample size does not affect the p-value; it depends only on the observed effect size.
D. Increasing sample size always leads to p-values equal to zero.
💡 Hint
Think about how more data affects the certainty of results.
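A minimal sketch of this effect, assuming two hypothetical groups whose means (10.0 and 10.5) and standard deviation (2.0) stay fixed while only the per-group sample size changes. It uses `scipy.stats.ttest_ind_from_stats`, which runs the t-test from summary statistics; the numbers are illustrative and not taken from the challenge.

```python
from scipy.stats import ttest_ind_from_stats

# Hypothetical summary statistics: the observed effect stays the same,
# only the number of observations per group grows.
mean_a, mean_b, std = 10.0, 10.5, 2.0
sample_sizes = [10, 50, 200, 1000]

p_values = []
for n in sample_sizes:
    # Arguments: mean1, std1, nobs1, mean2, std2, nobs2
    result = ttest_ind_from_stats(mean_a, std, n, mean_b, std, n)
    p_values.append(result.pvalue)
    print(f"n = {n:>4} per group -> p = {result.pvalue:.4g}")
```

With the observed difference held fixed, the standard error shrinks roughly as 1/sqrt(n), so the t statistic grows and the p-value falls. In real data the observed difference also varies from sample to sample, so the decrease is a tendency rather than a guarantee.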
🔧 Debug
advanced
Identifying error in p-value calculation code
What error will this code produce when trying to calculate a p-value from a t-test?
Data Analysis Python
from scipy.stats import ttest_ind
sample1 = [1, 2, 3]
sample2 = [4, 5]
t_stat, p_value = ttest_ind(sample1, sample2, equal_var=False)
print(p_value)
A. IndexError due to list indexing
B. ValueError due to unequal sample sizes
C. TypeError because equal_var is not a valid argument
D. No error; prints a valid p-value
💡 Hint
Check the documentation for ttest_ind about sample sizes and parameters.
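For background on the `equal_var` parameter, here is a rough sketch of what `equal_var=False` (Welch's t-test) computes. The two arrays are made-up data of different lengths, and the manual calculation follows the standard Welch statistic with Welch-Satterthwaite degrees of freedom; it is meant as an illustration, not as a copy of SciPy's internals.

```python
import numpy as np
from scipy import stats

# Hypothetical samples with different sizes and spreads
a = np.array([2.1, 2.5, 2.8, 3.0, 3.3, 3.6])
b = np.array([3.9, 4.4, 4.1, 4.8])

# Per-group variance of the mean (sample variance / n)
va = a.var(ddof=1) / a.size
vb = b.var(ddof=1) / b.size

# Welch t statistic and Welch-Satterthwaite degrees of freedom
t_manual = (a.mean() - b.mean()) / np.sqrt(va + vb)
df = (va + vb) ** 2 / (va ** 2 / (a.size - 1) + vb ** 2 / (b.size - 1))

# Two-sided p-value from the t distribution
p_manual = 2 * stats.t.sf(abs(t_manual), df)

# Same test via SciPy
t_scipy, p_scipy = stats.ttest_ind(a, b, equal_var=False)
print(t_manual, t_scipy)
print(p_manual, p_scipy)
```

The manual and library results should agree to floating-point precision, which shows that neither the statistic nor the degrees of freedom require the two groups to have the same size.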
🚀 Application
expert
Choosing significance level for multiple tests
You run 20 independent hypothesis tests each at alpha = 0.05. What is the approximate probability of getting at least one false positive (Type I error) by chance?
A. About 0.64
B. About 0.95
C. About 1.0
D. Exactly 0.05
💡 Hint
Use the formula for the complement of no false positives: 1 - (1 - alpha)^number_of_tests.
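The formula in the hint can be sketched as a small helper. The function name `familywise_error_rate` is hypothetical, and the calculation assumes the tests are independent and that every null hypothesis is true.

```python
def familywise_error_rate(alpha: float, n_tests: int) -> float:
    """Probability of at least one false positive across independent tests."""
    # (1 - alpha) ** n_tests is the chance that no test gives a false positive
    return 1 - (1 - alpha) ** n_tests


# How the chance of at least one false positive grows with the number of tests
for n_tests in (1, 5, 20, 100):
    rate = familywise_error_rate(0.05, n_tests)
    print(f"{n_tests:>3} tests -> {rate:.3f}")
```

With a single test the rate equals alpha itself; it climbs quickly as tests are added, which is why a per-test alpha of 0.05 is usually tightened when many tests are run together.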