The Kolmogorov-Smirnov (KS) test checks whether two samples come from the same distribution, or whether one sample matches a known distribution.
Kolmogorov-Smirnov test in SciPy
Introduction
Typical uses of the test:
To see whether your sample data fits a normal distribution before using other tests.
To compare two groups of data and check whether they behave similarly.
To test whether a new batch of products matches the quality of an old batch.
To check whether website visitors' behavior changed after a redesign.
To verify whether a random number generator produces numbers that follow a uniform distribution.
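As a rough illustration of the random number generator use case, the sketch below checks output from NumPy's default generator against the uniform distribution on [0, 1]. The seed, the sample size, and the squared sample used for contrast are assumptions chosen for this example.

```python
import numpy as np
from scipy.stats import kstest

# Seeded generator so the run is repeatable (the seed value is arbitrary).
rng = np.random.default_rng(42)
uniform_sample = rng.random(1000)      # floats in [0, 1)

# 'uniform' with no extra arguments means the uniform distribution on [0, 1].
uniform_result = kstest(uniform_sample, 'uniform')

# Squaring pushes values toward 0, so this sample is no longer uniform.
skewed_result = kstest(uniform_sample ** 2, 'uniform')

print(f"generator output: D={uniform_result.statistic:.3f}, p={uniform_result.pvalue:.3f}")
print(f"squared values:   D={skewed_result.statistic:.3f}, p={skewed_result.pvalue:.3g}")
```

The squared sample should produce a much larger statistic and a p-value close to zero, since its values pile up near 0 instead of spreading evenly.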
Syntax
SciPy
from scipy.stats import ks_2samp
result = ks_2samp(data1, data2)

# or for one sample vs a distribution
from scipy.stats import kstest
result = kstest(data, 'distribution_name')
ks_2samp compares two data samples.
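kstest compares one sample to a known distribution, such as 'norm' for the normal distribution. With no extra arguments, 'norm' means the standard normal (mean 0, standard deviation 1).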
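When the reference distribution is not the standard one, kstest accepts the distribution's parameters through its args argument. The sketch below shows the difference on made-up data; the variable names, the seed, and the values 170 and 10 are assumptions for illustration only.

```python
import numpy as np
from scipy.stats import kstest

# Hypothetical measurements drawn from a normal distribution
# with mean 170 and standard deviation 10.
rng = np.random.default_rng(0)
heights = rng.normal(loc=170, scale=10, size=200)

# No parameters given: the sample is compared to the standard normal.
default_result = kstest(heights, 'norm')

# args=(170, 10) passes the mean and standard deviation to the normal CDF.
matched_result = kstest(heights, 'norm', args=(170, 10))

print(f"vs standard normal: D={default_result.statistic:.3f}, p={default_result.pvalue:.3g}")
print(f"vs normal(170, 10): D={matched_result.statistic:.3f}, p={matched_result.pvalue:.3f}")
```

The first call should report a statistic close to 1, because almost none of the standard normal distribution lies anywhere near 170.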
Examples
Compare two small samples to see if they come from the same distribution.
SciPy
from scipy.stats import ks_2samp

sample1 = [1, 2, 3, 4, 5]
sample2 = [2, 3, 4, 5, 6]
result = ks_2samp(sample1, sample2)
print(result)
Check if a sample fits a normal distribution.
SciPy
from scipy.stats import kstest
import numpy as np

sample = np.random.normal(0, 1, 100)
result = kstest(sample, 'norm')
print(result)
Sample Program
This code compares exam scores from two classes to see if their score distributions differ.
SciPy
from scipy.stats import ks_2samp

# Two samples of exam scores
scores_classA = [88, 92, 85, 91, 87, 90, 89]
scores_classB = [78, 82, 80, 79, 81, 77, 83]

# Perform Kolmogorov-Smirnov test
result = ks_2samp(scores_classA, scores_classB)
print(f"KS statistic: {result.statistic:.3f}")
print(f"p-value: {result.pvalue:.3f}")
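To make the reported statistic less of a black box, here is a minimal sketch that recomputes it by hand for the same exam scores. The two-sample KS statistic is the largest vertical gap between the two empirical cumulative distribution functions (ECDFs); the helper variable names below are assumptions for this example, not part of SciPy.

```python
import numpy as np
from scipy.stats import ks_2samp

scores_classA = np.array([88, 92, 85, 91, 87, 90, 89])
scores_classB = np.array([78, 82, 80, 79, 81, 77, 83])

# Evaluate both ECDFs at every observed score.
all_scores = np.sort(np.concatenate([scores_classA, scores_classB]))

# searchsorted(..., side='right') counts how many values are <= each score.
ecdf_a = np.searchsorted(np.sort(scores_classA), all_scores, side='right') / len(scores_classA)
ecdf_b = np.searchsorted(np.sort(scores_classB), all_scores, side='right') / len(scores_classB)

# The KS statistic is the largest absolute gap between the two ECDFs.
manual_statistic = np.max(np.abs(ecdf_a - ecdf_b))
scipy_statistic = ks_2samp(scores_classA, scores_classB).statistic

print(f"manual: {manual_statistic:.3f}, scipy: {scipy_statistic:.3f}")
```

Every score in class B is lower than every score in class A, so the gap reaches 1.0, the largest value the statistic can take.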
Important Notes
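A small p-value (commonly below 0.05) is evidence that the two samples come from different distributions. A large p-value does not prove that the distributions are the same; it only means the test found no significant difference.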
The KS test is sensitive to differences in both the center and shape of distributions.
The test assumes independent observations from continuous distributions. With discrete or heavily tied data, the p-values are only approximate.
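The notes above can be sketched on simulated data. In the example below, one sample differs from the baseline only in its center and another only in its spread, and each comparison is read against a 0.05 threshold. The seed, sample sizes, distribution parameters, and the describe helper are all assumptions made for this illustration.

```python
import numpy as np
from scipy.stats import ks_2samp

ALPHA = 0.05  # conventional threshold; a choice, not a fixed rule

rng = np.random.default_rng(1)
baseline = rng.normal(loc=0.0, scale=1.0, size=500)
shifted = rng.normal(loc=0.5, scale=1.0, size=500)   # different center, same spread
widened = rng.normal(loc=0.0, scale=3.0, size=500)   # same center, different spread

def describe(label, sample_a, sample_b, alpha=ALPHA):
    """Run the two-sample KS test and print a short verdict (hypothetical helper)."""
    result = ks_2samp(sample_a, sample_b)
    verdict = "different" if result.pvalue < alpha else "no significant difference"
    print(f"{label}: D={result.statistic:.3f}, p={result.pvalue:.3g} -> {verdict}")
    return result

shift_result = describe("center shift", baseline, shifted)
spread_result = describe("spread change", baseline, widened)
```

With samples this large, both comparisons should fall well below the threshold, even though the second pair shares the same mean.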
Summary
The Kolmogorov-Smirnov test compares two samples or a sample to a known distribution.
It helps check whether data follows an expected distribution or whether two groups behave similarly.
Look at the p-value to decide whether a difference is statistically significant.