
Chi-squared test in SciPy - Step-by-Step Execution

Concept Flow - Chi-squared test
Start with observed data
Calculate expected data
Compute Chi-squared statistic
Calculate p-value
Compare p-value to significance level
Reject H0 if the p-value is below the significance level; otherwise fail to reject H0
The test starts with observed data, calculates expected counts, then computes the Chi-squared statistic and p-value, and finally decides if the difference is significant.
Execution Sample
SciPy
from scipy.stats import chi2_contingency

observed = [[10, 20], [20, 40]]
chi2, p, dof, expected = chi2_contingency(observed)
print(p)
This code runs a Chi-squared test of independence on a 2x2 table of observed counts and prints the p-value, which is 1.0 for this table.
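For reference, a minimal sketch that prints all four values returned by the call rather than only the p-value. It relies on the default `correction=True`, which applies Yates' continuity correction when the table has one degree of freedom; in this table the observed and expected counts coincide, so the correction changes nothing.

```python
from scipy.stats import chi2_contingency

observed = [[10, 20], [20, 40]]

# The call returns the statistic, the p-value, the degrees of freedom,
# and the table of expected counts under independence.
chi2, p, dof, expected = chi2_contingency(observed)

print("statistic:", chi2)
print("p-value:", p)
print("dof:", dof)
print("expected counts:")
print(expected)
```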
Execution Table
| Step | Action | Value/Calculation | Result |
|---|---|---|---|
| 1 | Input observed data | [[10, 20], [20, 40]] | Observed counts set |
| 2 | Calculate row sums | [30, 60] | Row sums computed |
| 3 | Calculate column sums | [30, 60] | Column sums computed |
| 4 | Calculate total sum | 90 | Total count computed |
| 5 | Calculate expected counts | Expected = (row_sum * col_sum) / total | [[10.0, 20.0], [20.0, 40.0]] |
| 6 | Compute Chi-squared statistic | Sum((observed - expected)^2 / expected) | 0.0 |
| 7 | Calculate p-value | Using Chi-squared distribution with dof=1 | 1.0 |
| 8 | Compare p-value to 0.05 | 1.0 > 0.05 | Fail to reject null hypothesis |
| 9 | End | - | Test complete |
💡 p-value is greater than 0.05, so we fail to reject the null hypothesis
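Steps 2 through 7 can be reproduced by hand as a rough sketch with NumPy and the Chi-squared distribution from `scipy.stats`. The variable names follow the Variable Tracker; the use of `np.outer` and `chi2.sf` is an illustrative choice, not necessarily how `chi2_contingency` is implemented internally. The sketch also omits Yates' continuity correction, which `chi2_contingency` applies by default to 2x2 tables; for this table the correction has no effect because observed and expected counts are equal.

```python
import numpy as np
from scipy.stats import chi2

observed = np.array([[10, 20], [20, 40]], dtype=float)

row_sums = observed.sum(axis=1)   # step 2
col_sums = observed.sum(axis=0)   # step 3
total = observed.sum()            # step 4

# Step 5: expected[i, j] = row_sums[i] * col_sums[j] / total
expected = np.outer(row_sums, col_sums) / total

# Step 6: sum of (observed - expected)^2 / expected over all cells
chi2_statistic = ((observed - expected) ** 2 / expected).sum()

# Step 7: dof = (rows - 1) * (cols - 1); the p-value is the upper-tail
# probability of the Chi-squared distribution at the statistic.
dof = (observed.shape[0] - 1) * (observed.shape[1] - 1)
p_value = chi2.sf(chi2_statistic, dof)

print(chi2_statistic, p_value)
```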
Variable Tracker
| Variable | Start | After Step 2 | After Step 3 | After Step 4 | After Step 5 | After Step 6 | After Step 7 | Final |
|---|---|---|---|---|---|---|---|---|
| observed | [[10, 20], [20, 40]] | [[10, 20], [20, 40]] | [[10, 20], [20, 40]] | [[10, 20], [20, 40]] | [[10, 20], [20, 40]] | [[10, 20], [20, 40]] | [[10, 20], [20, 40]] | [[10, 20], [20, 40]] |
| row_sums | N/A | [30, 60] | [30, 60] | [30, 60] | [30, 60] | [30, 60] | [30, 60] | [30, 60] |
| col_sums | N/A | N/A | [30, 60] | [30, 60] | [30, 60] | [30, 60] | [30, 60] | [30, 60] |
| total | N/A | N/A | N/A | 90 | 90 | 90 | 90 | 90 |
| expected | N/A | N/A | N/A | N/A | [[10.0, 20.0], [20.0, 40.0]] | [[10.0, 20.0], [20.0, 40.0]] | [[10.0, 20.0], [20.0, 40.0]] | [[10.0, 20.0], [20.0, 40.0]] |
| chi2_statistic | N/A | N/A | N/A | N/A | N/A | 0.0 | 0.0 | 0.0 |
| p_value | N/A | N/A | N/A | N/A | N/A | N/A | 1.0 | 1.0 |
Key Moments - 2 Insights
Why are the expected counts the same as the observed counts in this example?
Because the second row is exactly twice the first, the row and column proportions match perfectly. The expected counts therefore equal the observed counts, and the Chi-squared statistic is 0 (see the Execution Table, steps 5 and 6).
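To see that this is a property of the proportional table and not of the test in general, here is a small sketch comparing it with a second table. The `skewed` table is a made-up contrast case, not data from this lesson; `expected_freq` is the helper in `scipy.stats.contingency` that computes expected counts from the margins.

```python
import numpy as np
from scipy.stats.contingency import expected_freq

tables = {
    "proportional": np.array([[10, 20], [20, 40]]),  # table from the example
    "skewed": np.array([[10, 20], [20, 10]]),        # hypothetical contrast table
}

statistics = {}
for name, table in tables.items():
    expected = expected_freq(table)
    # Uncorrected Pearson statistic: sum of (observed - expected)^2 / expected
    statistics[name] = ((table - expected) ** 2 / expected).sum()
    print(name, "expected:", expected.tolist(), "statistic:", statistics[name])
```

Only the proportional table gives expected counts identical to the observed ones; the skewed table yields a statistic above zero.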
What does a p-value of 1.0 mean in this test?
A p-value of 1.0 means the data provide no evidence against the null hypothesis; here the observed counts match the expected counts exactly (see the Execution Table, steps 7 and 8).
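As a hedged illustration of why a statistic of 0 maps to a p-value of 1.0: the p-value is the upper-tail probability of the Chi-squared distribution, so it starts at 1.0 and shrinks as the statistic grows. The statistic values below are arbitrary sample points chosen for illustration.

```python
from scipy.stats import chi2

dof = 1
sample_statistics = [0.0, 1.0, 4.0, 9.0]  # arbitrary illustrative values

# sf is the survival function: P(X >= statistic) for X ~ Chi-squared(dof)
p_values = [chi2.sf(s, dof) for s in sample_statistics]
for s, p in zip(sample_statistics, p_values):
    print(f"statistic={s:.1f}  p-value={p:.4f}")

# Statistic at which the p-value equals 0.05 for dof=1
critical_value = chi2.isf(0.05, dof)
print("critical value at 0.05:", critical_value)
```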
Visual Quiz - 3 Questions
Test your understanding
Look at the execution table at step 6. What is the Chi-squared statistic value?
A. 10.0
B. 1.0
C. 0.0
D. 90.0
💡 Hint
Check the 'Compute Chi-squared statistic' row in the Execution Table.
At which step does the code calculate the total sum of observed counts?
A. Step 2
B. Step 4
C. Step 5
D. Step 7
💡 Hint
Look for the step labeled 'Calculate total sum' in the Execution Table.
If the p-value was less than 0.05, what would be the conclusion at step 8?
A. Reject null hypothesis
B. Fail to reject null hypothesis
C. Test is inconclusive
D. Calculate expected counts again
💡 Hint
Refer to the decision logic in the Concept Flow and step 8 of the Execution Table.
Concept Snapshot
Chi-squared test syntax:
from scipy.stats import chi2_contingency
chi2, p, dof, expected = chi2_contingency(observed)

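- observed: 2D array of counts (a contingency table)
- Returns the chi2 statistic, p-value, degrees of freedom, and expected counts
- p < 0.05 means reject the null hypothesis at the 5% significance level
- Tests whether observed counts differ from the counts expected if rows and columns were independent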
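The decision rule in the snapshot can be wrapped in a small function. This is a sketch only: `independence_decision` and its `alpha` parameter are hypothetical names introduced here, and the second table in the usage lines is made-up data. It uses `chi2_contingency` with its defaults, so Yates' continuity correction applies to 2x2 tables.

```python
from scipy.stats import chi2_contingency

def independence_decision(observed, alpha=0.05):
    """Hypothetical helper: apply the p < alpha rule to a contingency table."""
    _, p, _, _ = chi2_contingency(observed)
    return "reject H0" if p < alpha else "fail to reject H0"

# Table from the example: p-value is 1.0
print(independence_decision([[10, 20], [20, 40]]))

# Made-up table whose rows are not proportional
print(independence_decision([[10, 20], [20, 10]]))
```

Making `alpha` a parameter keeps the 0.05 threshold a stated choice instead of a hard-coded constant.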
Full Transcript
The Chi-squared test compares observed counts to expected counts to see whether the differences are statistically significant. We start with observed data, calculate expected counts from the row and column totals, then compute the Chi-squared statistic as the sum, over all cells, of the squared difference between observed and expected counts divided by the expected count. Next, we find the p-value from the Chi-squared distribution with (rows - 1) * (columns - 1) degrees of freedom, which is 1 for a 2x2 table. If the p-value is less than the chosen significance level (0.05 here), we reject the null hypothesis, meaning the observed counts differ significantly from the expected counts. Otherwise, we fail to reject it, meaning the test found no significant difference. In the example, observed and expected counts are equal, so the Chi-squared statistic is 0 and the p-value is 1, indicating no difference.