Chi-squared test in SciPy

Introduction

The Chi-squared test of independence checks whether two categorical variables are related, or whether the differences in the observed counts could plausibly be due to chance. Typical uses:

To see if a new medicine works differently for men and women.
To check if people prefer one brand over another in different cities.
To find out if a coin is fair by counting heads and tails (this is a goodness-of-fit test, which uses scipy.stats.chisquare rather than chi2_contingency).
To test if customer choices depend on age groups.
To analyze if voting preferences differ by region.
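
The coin case above compares one set of counts against expected proportions, so it does not fit the two-variable table used in the rest of this page. A minimal sketch with scipy.stats.chisquare follows; the flip counts are made up for illustration.

```python
from scipy.stats import chisquare

# Hypothetical data: 100 coin flips (counts invented for illustration)
observed = [58, 42]  # heads, tails

# Without f_exp, chisquare assumes every category is equally likely,
# which is exactly the "fair coin" hypothesis (50 heads, 50 tails expected).
statistic, p_value = chisquare(observed)

print(f"Chi2: {statistic:.2f}, p-value: {p_value:.4f}")
```

A large p-value here means the counts are consistent with a fair coin; it does not prove the coin is fair.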
Syntax
SciPy
from scipy.stats import chi2_contingency

chi2, p, dof, expected = chi2_contingency(observed_table)

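observed_table is a 2D list or array of observed counts (a contingency table), with one row per category of the first variable and one column per category of the second.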

The function returns four values: chi-squared statistic, p-value, degrees of freedom, and expected counts.
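
To make the four return values less of a black box, the sketch below recomputes them by hand for a 2x3 table and compares them with what chi2_contingency returns. The counts are illustrative, and the manual_* names are introduced here only for the comparison.

```python
import numpy as np
from scipy.stats import chi2_contingency

observed = np.array([[30, 10, 20], [20, 40, 10]])  # illustrative counts

chi2, p, dof, expected = chi2_contingency(observed)

# Expected count for each cell = row total * column total / grand total
row_totals = observed.sum(axis=1)
col_totals = observed.sum(axis=0)
manual_expected = np.outer(row_totals, col_totals) / observed.sum()

# Degrees of freedom = (rows - 1) * (columns - 1)
manual_dof = (observed.shape[0] - 1) * (observed.shape[1] - 1)

# Statistic = sum over cells of (observed - expected)^2 / expected
manual_chi2 = ((observed - manual_expected) ** 2 / manual_expected).sum()

print(np.allclose(expected, manual_expected), dof == manual_dof)
print(f"{chi2:.4f} vs {manual_chi2:.4f}")
```

The hand-computed statistic matches here because the table is larger than 2x2; for 2x2 tables SciPy applies a continuity correction by default, so the plain formula gives a slightly larger value.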

Examples
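Basic example with a 2x2 table of counts. The second row is exactly twice the first, so the statistic is 0 and the p-value is 1 (no evidence of a relationship).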
SciPy
from scipy.stats import chi2_contingency

observed = [[10, 20], [20, 40]]
chi2, p, dof, expected = chi2_contingency(observed)
print(f"Chi2: {chi2}, p-value: {p}")
Example with a 2x3 table to see how degrees of freedom change. Degrees of freedom equal (rows - 1) x (columns - 1), so this table gives (2 - 1) x (3 - 1) = 2.
SciPy
from scipy.stats import chi2_contingency

observed = [[30, 10, 20], [20, 40, 10]]
chi2, p, dof, expected = chi2_contingency(observed)
print(f"Degrees of freedom: {dof}")
Sample Program

This program tests if preference depends on gender using a 2x2 table.

SciPy
from scipy.stats import chi2_contingency

# Data: Survey results of preference by gender
# Rows: Male, Female
# Columns: Like, Dislike
observed = [[30, 10], [20, 40]]

chi2, p, dof, expected = chi2_contingency(observed)

print(f"Chi-squared statistic: {chi2:.2f}")
print(f"p-value: {p:.4f}")
print(f"Degrees of freedom: {dof}")
print("Expected frequencies:")
for row in expected:
    # Convert NumPy floats to plain Python floats so the list prints cleanly
    print([round(float(x), 2) for x in row])
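
For 2x2 tables, chi2_contingency applies Yates' continuity correction by default (the correction parameter, which defaults to True). The sketch below reruns the sample table with and without it, which explains why the reported statistic is a little smaller than the textbook formula would give.

```python
from scipy.stats import chi2_contingency

observed = [[30, 10], [20, 40]]  # same table as the sample program

# Default behaviour: Yates' continuity correction is applied (2x2 table, dof = 1)
chi2_corrected, p_corrected, _, _ = chi2_contingency(observed)

# Plain Pearson chi-squared statistic, no correction
chi2_plain, p_plain, _, _ = chi2_contingency(observed, correction=False)

print(f"With correction:    chi2 = {chi2_corrected:.2f}")
print(f"Without correction: chi2 = {chi2_plain:.2f}")
```

The correction makes the test slightly more conservative; with counts this large both versions lead to the same conclusion.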
Important Notes

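A small p-value (commonly below 0.05) means the observed counts would be unlikely if the two variables were independent, so you reject the idea that they are unrelated. It does not tell you how strong the relationship is.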
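
As a rough sketch of that decision rule, the helper below compares the p-value with a chosen threshold. The function name is_significant and the default alpha of 0.05 are assumptions made for this example, not part of SciPy.

```python
from scipy.stats import chi2_contingency

def is_significant(observed, alpha=0.05):
    """Hypothetical helper: True if the test rejects independence at level alpha."""
    chi2, p, dof, expected = chi2_contingency(observed)
    return bool(p < alpha)

# Table from the sample program: the rows have clearly different proportions
print(is_significant([[30, 10], [20, 40]]))

# Table from the basic example: the rows are exactly proportional
print(is_significant([[10, 20], [20, 40]]))
```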

The test needs counts, not percentages or averages.
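
One way to see why counts matter: the two hypothetical tables below have identical percentages (60% / 40% in each row) but different sample sizes, and the test treats them very differently. The numbers are invented for illustration.

```python
from scipy.stats import chi2_contingency

# Same row percentages, different numbers of respondents (made-up data)
small_sample = [[6, 4], [4, 6]]      # 20 people in total
large_sample = [[60, 40], [40, 60]]  # 200 people in total

_, p_small, _, _ = chi2_contingency(small_sample)
_, p_large, _, _ = chi2_contingency(large_sample)

print(f"Small sample p-value: {p_small:.4f}")
print(f"Large sample p-value: {p_large:.4f}")
```

Feeding percentages into the function would throw away the sample size, which is the information that separates these two cases.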

Tables should have enough data; very small counts can make results unreliable. A common rule of thumb is that every expected count should be at least 5.
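
A minimal sketch of checking for small counts is shown below. It uses a threshold of 5 on the expected counts, which is a widely used rule of thumb rather than a hard requirement, and falls back to Fisher's exact test (scipy.stats.fisher_exact, 2x2 tables only). The table is hypothetical.

```python
import numpy as np
from scipy.stats import chi2_contingency, fisher_exact

observed = np.array([[3, 1], [1, 4]])  # hypothetical small-sample 2x2 table

_, p_chi2, _, expected = chi2_contingency(observed)

p_exact = None
if (expected < 5).any():
    # Some expected counts are small, so the chi-squared approximation is shaky;
    # Fisher's exact test does not rely on that approximation.
    _, p_exact = fisher_exact(observed)
    print(f"Small expected counts; Fisher's exact p-value: {p_exact:.4f}")
else:
    print(f"Chi-squared p-value: {p_chi2:.4f}")
```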

Summary

The Chi-squared test checks whether two categorical variables are associated.

Use it with tables of counts to find relationships.

Look at the p-value to decide whether the result is statistically significant.