The normal distribution describes how data spreads around an average value. It is useful because many natural measurements, such as heights or measurement errors, approximately follow this pattern.
Normal distribution in SciPy
from scipy.stats import norm

# Placeholder values so the snippet runs; replace with your own
mean, std_dev, x, n = 0.0, 1.0, 0.5, 5

# Create a normal distribution object with fixed loc and scale
dist = norm(loc=mean, scale=std_dev)

# Calculate probability density function (PDF) at x
pdf_value = norm.pdf(x, loc=mean, scale=std_dev)

# Calculate cumulative distribution function (CDF) at x
cdf_value = norm.cdf(x, loc=mean, scale=std_dev)

# Generate random samples
samples = norm.rvs(loc=mean, scale=std_dev, size=n)
loc is the mean (center) of the distribution.
scale is the standard deviation (spread) of the distribution.
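As a rough illustration of how loc and scale are used, the sketch below compares two calling styles: passing loc and scale on every call, and fixing them once in a distribution object. The variable names (dist, direct, frozen) and the values 10 and 2 are chosen only for this example.

```python
from scipy.stats import norm
import numpy as np

# Illustrative values, not taken from any real data set
mean, std_dev = 10.0, 2.0

# Fix loc and scale once; the object remembers them
dist = norm(loc=mean, scale=std_dev)

# Style 1: pass loc and scale on every call
direct = norm.pdf(12, loc=mean, scale=std_dev)

# Style 2: call the method on the object, no loc/scale needed
frozen = dist.pdf(12)

# Both styles should give the same density
print(np.isclose(direct, frozen))

# The object also reports the mean and standard deviation it was built with
print(dist.mean(), dist.std())
```

The object style tends to be convenient when the same mean and standard deviation are reused across many calls.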
from scipy.stats import norm

# Standard normal distribution (mean=0, std=1)
pdf_at_0 = norm.pdf(0)
cdf_at_1 = norm.cdf(1)
from scipy.stats import norm

# Normal distribution with mean=10 and std=2
pdf_at_12 = norm.pdf(12, loc=10, scale=2)
cdf_at_8 = norm.cdf(8, loc=10, scale=2)
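One common use of the CDF is finding the probability that a value falls inside an interval, by subtracting two CDF values. The sketch below does this for the same distribution (mean 10, standard deviation 2) and the interval from 8 to 12, which is one standard deviation on each side of the mean. The variable names p_between and p_standard are made up for this example.

```python
from scipy.stats import norm

mean, std_dev = 10, 2

# P(8 < X <= 12) = CDF(12) - CDF(8)
p_between = norm.cdf(12, loc=mean, scale=std_dev) - norm.cdf(8, loc=mean, scale=std_dev)

# The same interval on the standard normal scale: z from -1 to 1
p_standard = norm.cdf(1) - norm.cdf(-1)

print(f"P(8 < X <= 12) = {p_between:.4f}")
print(f"P(-1 < Z <= 1) = {p_standard:.4f}")
```

Both lines should print about 0.6827, which is the familiar "68% of values lie within one standard deviation of the mean" rule.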
from scipy.stats import norm

# Generate 5 random samples from a normal distribution with mean=5 and std=1.5
samples = norm.rvs(loc=5, scale=1.5, size=5)
print(samples)
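Random samples change on every run. If repeatable results are needed, for example in a test or a tutorial, one option is the random_state argument of norm.rvs. The sketch below assumes a fixed seed of 42, which is an arbitrary choice.

```python
from scipy.stats import norm
import numpy as np

# The same seed produces the same samples on every call
first = norm.rvs(loc=5, scale=1.5, size=5, random_state=42)
second = norm.rvs(loc=5, scale=1.5, size=5, random_state=42)

print(first)
print(np.array_equal(first, second))  # the two arrays match
```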
This program calculates the probability density at 55 and cumulative probability up to 45 for a normal distribution with mean 50 and standard deviation 5. It also generates 10 random samples from this distribution.
from scipy.stats import norm

# Define mean and standard deviation
mean = 50
std_dev = 5

# Calculate PDF at 55
pdf_55 = norm.pdf(55, loc=mean, scale=std_dev)

# Calculate CDF at 45
cdf_45 = norm.cdf(45, loc=mean, scale=std_dev)

# Generate 10 random samples
samples = norm.rvs(loc=mean, scale=std_dev, size=10)

print(f"PDF at 55: {pdf_55:.4f}")
print(f"CDF at 45: {cdf_45:.4f}")
print("Random samples:")
print(samples)
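To see how the samples and the CDF relate, a rough check is to draw many samples and count the fraction that falls at or below 45. With enough samples this fraction should land close to the CDF value. The sample size of 100,000 and the seed of 0 below are assumptions chosen for illustration.

```python
from scipy.stats import norm
import numpy as np

mean, std_dev = 50, 5

# Draw a large, reproducible sample (size and seed are arbitrary choices)
samples = norm.rvs(loc=mean, scale=std_dev, size=100_000, random_state=0)

# Fraction of samples at or below 45
empirical = np.mean(samples <= 45)

# Probability of a value at or below 45 according to the CDF
theoretical = norm.cdf(45, loc=mean, scale=std_dev)

print(f"Empirical fraction: {empirical:.4f}")
print(f"CDF at 45:          {theoretical:.4f}")
```

The two numbers will not match exactly, but the gap usually shrinks as the sample size grows.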
The PDF value is a density, not a probability: it shows how likely values near a point are compared with values near other points, and it can be greater than 1.
The CDF value shows the probability that a value is less than or equal to a point.
Random samples help simulate real data following the normal pattern.
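The difference between a density and a probability can be sketched with a very narrow distribution. Below, a standard deviation of 0.1 is assumed purely for illustration: the density at the center is well above 1, yet the area under the curve (the total probability) is still about 1. The integration uses scipy.integrate.quad over the range -1 to 1, which covers ten standard deviations on each side.

```python
from scipy.stats import norm
from scipy.integrate import quad

# Narrow distribution: mean 0, standard deviation 0.1 (illustrative values)
narrow_peak = norm.pdf(0, loc=0, scale=0.1)

# Area under the PDF from -1 to 1, computed numerically
area, _ = quad(lambda x: norm.pdf(x, loc=0, scale=0.1), -1, 1)

print(f"PDF at 0: {narrow_peak:.4f}")        # about 3.99, greater than 1
print(f"Area under the curve: {area:.4f}")   # about 1
```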
The normal distribution models data around an average with a certain spread.
Use norm.pdf to find the probability density at a point.
Use norm.cdf to find the probability of data being below a point.