Normal distribution in SciPy

Introduction

The normal distribution describes how data spreads around an average value: most values fall near the mean, and values become rarer the farther they lie from it. It is useful because many natural measurements approximately follow this pattern.

Typical uses:
To model heights of people in a population.
To analyze test scores where most students score near the average.
To estimate measurement errors in experiments.
To predict daily temperatures that vary around a mean.
To check whether data fits a normal pattern before further analysis.
Syntax
SciPy
from scipy.stats import norm

# Create a normal distribution object with fixed ("frozen") parameters
dist = norm(loc=mean, scale=std_dev)

# Calculate probability density function (PDF) at x
pdf_value = norm.pdf(x, loc=mean, scale=std_dev)

# Calculate cumulative distribution function (CDF) at x
cdf_value = norm.cdf(x, loc=mean, scale=std_dev)

# Generate random samples
samples = norm.rvs(loc=mean, scale=std_dev, size=n)

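loc is the mean (center) of the distribution. It defaults to 0.

scale is the standard deviation (spread) of the distribution. It defaults to 1 and must be positive.

x is the point, or array of points, at which to evaluate the function, and size is the number of random samples to draw.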
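Calling norm with loc and scale returns a distribution object whose parameters are fixed, so its methods no longer need them. The sketch below compares the two calling styles; the values 10 and 2 are illustrative choices, not part of the syntax.

```python
from scipy.stats import norm

mean, std_dev = 10, 2  # illustrative values (assumption)

# Fix the parameters once, then reuse the object
dist = norm(loc=mean, scale=std_dev)

# Methods on the fixed object take no loc/scale arguments
frozen_pdf = dist.pdf(12)

# Equivalent call that passes the parameters every time
direct_pdf = norm.pdf(12, loc=mean, scale=std_dev)

print(frozen_pdf, direct_pdf)
print(dist.mean(), dist.std())
```

The fixed object is convenient when the same distribution is queried many times.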

Examples
This example uses the standard normal distribution to find the PDF at 0 and CDF at 1.
SciPy
from scipy.stats import norm

# Standard normal distribution (mean=0, std=1)
pdf_at_0 = norm.pdf(0)
cdf_at_1 = norm.cdf(1)
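As a quick check, both values can be reproduced from the closed-form expressions for the standard normal distribution using only Python's math module. This is a sketch for verification, with pdf_formula and cdf_formula as names introduced here.

```python
import math
from scipy.stats import norm

pdf_at_0 = norm.pdf(0)
cdf_at_1 = norm.cdf(1)

# PDF of the standard normal at 0 is 1 / sqrt(2 * pi)
pdf_formula = 1 / math.sqrt(2 * math.pi)

# CDF of the standard normal at z is 0.5 * (1 + erf(z / sqrt(2)))
cdf_formula = 0.5 * (1 + math.erf(1 / math.sqrt(2)))

print(f"{pdf_at_0:.4f} {pdf_formula:.4f}")  # 0.3989 0.3989
print(f"{cdf_at_1:.4f} {cdf_formula:.4f}")  # 0.8413 0.8413
```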
This example calculates PDF and CDF for a normal distribution centered at 10 with spread 2.
SciPy
from scipy.stats import norm

# Normal distribution with mean=10 and std=2
pdf_at_12 = norm.pdf(12, loc=10, scale=2)
cdf_at_8 = norm.cdf(8, loc=10, scale=2)
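The loc and scale arguments are equivalent to standardizing x first with z = (x - loc) / scale. The following sketch illustrates that relationship for the values above; the helper name z_score is hypothetical.

```python
from scipy.stats import norm

loc, scale = 10, 2

def z_score(x):
    # Distance from the mean, measured in standard deviations
    return (x - loc) / scale

# The CDF depends only on the z-score
cdf_at_8 = norm.cdf(8, loc=loc, scale=scale)
cdf_standard = norm.cdf(z_score(8))          # z = -1

# The PDF is the standard PDF at z, divided by scale
pdf_at_12 = norm.pdf(12, loc=loc, scale=scale)
pdf_standard = norm.pdf(z_score(12)) / scale  # z = 1

print(f"{cdf_at_8:.4f} {cdf_standard:.4f}")  # 0.1587 0.1587
print(f"{pdf_at_12:.4f} {pdf_standard:.4f}")  # 0.1210 0.1210
```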
This example shows how to create random data points from a normal distribution.
SciPy
from scipy.stats import norm

# Generate 5 random samples from N(5, 1.5)
samples = norm.rvs(loc=5, scale=1.5, size=5)
print(samples)
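The samples change on every run. When repeatable results are needed, rvs accepts a random_state argument; the sketch below uses 42 as an arbitrary seed chosen for illustration.

```python
from scipy.stats import norm

# random_state=42 is an arbitrary seed (assumption for this sketch)
first = norm.rvs(loc=5, scale=1.5, size=5, random_state=42)
second = norm.rvs(loc=5, scale=1.5, size=5, random_state=42)

print(first)
print((first == second).all())  # True: same seed, same samples
```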
Sample Program

This program calculates the probability density at 55 and cumulative probability up to 45 for a normal distribution with mean 50 and standard deviation 5. It also generates 10 random samples from this distribution.

SciPy
from scipy.stats import norm

# Define mean and standard deviation
mean = 50
std_dev = 5

# Calculate PDF at 55
pdf_55 = norm.pdf(55, loc=mean, scale=std_dev)

# Calculate CDF at 45
cdf_45 = norm.cdf(45, loc=mean, scale=std_dev)

# Generate 10 random samples
samples = norm.rvs(loc=mean, scale=std_dev, size=10)

print(f"PDF at 55: {pdf_55:.4f}")
print(f"CDF at 45: {cdf_45:.4f}")
print("Random samples:")
print(samples)
Important Notes

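The PDF value is a density, not a probability. It shows how concentrated the distribution is around a point and can exceed 1 when the spread is small. Probabilities come from areas under the PDF curve, so the probability of any single exact value is 0.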
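A small sketch can make the difference between density and probability concrete. It assumes a narrow distribution with scale 0.1 and an interval width of 0.01, both chosen only for illustration.

```python
from scipy.stats import norm

# A narrow distribution: mean 0, standard deviation 0.1 (illustrative)
narrow = norm(loc=0, scale=0.1)

# The density at the center is larger than 1, so it cannot be a probability
density_at_0 = narrow.pdf(0)

# Probability of a small interval around 0, from the CDF
width = 0.01
exact = narrow.cdf(width / 2) - narrow.cdf(-width / 2)

# For a small interval, density * width approximates that probability
approx = density_at_0 * width

print(f"density at 0: {density_at_0:.4f}")  # 3.9894
print(f"exact: {exact:.5f}, approx: {approx:.5f}")
```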

The CDF value shows the probability that a value is less than or equal to a point.
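Because the CDF accumulates probability from the left, interval and upper-tail probabilities follow by subtraction. The sketch below reuses the mean 50 and standard deviation 5 from the sample program; norm.sf is SciPy's survival function, equal to 1 - cdf.

```python
from scipy.stats import norm

dist = norm(loc=50, scale=5)

p_at_most_45 = dist.cdf(45)              # P(X <= 45)
p_between = dist.cdf(55) - dist.cdf(45)  # P(45 < X <= 55)
p_above_55 = dist.sf(55)                 # P(X > 55)

print(f"{p_at_most_45:.4f}")  # 0.1587
print(f"{p_between:.4f}")     # 0.6827
print(f"{p_above_55:.4f}")    # 0.1587
```

The middle value is the familiar result that about 68% of a normal distribution lies within one standard deviation of the mean.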

Random samples help simulate real data following the normal pattern. The values differ on each run unless a seed is set.
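With enough samples, the sample mean and standard deviation should land close to loc and scale. The following sketch checks this under an assumed sample size of 100,000 and an arbitrary seed of 0.

```python
from scipy.stats import norm

# Sample size and seed are illustrative assumptions
samples = norm.rvs(loc=50, scale=5, size=100_000, random_state=0)

sample_mean = samples.mean()
sample_std = samples.std(ddof=1)  # ddof=1 gives the sample standard deviation

print(f"sample mean: {sample_mean:.2f}")
print(f"sample std:  {sample_std:.2f}")
```

Smaller sample sizes will drift further from the true parameters.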

Summary

The normal distribution models data around an average with a certain spread.

Use norm.pdf to find the probability density at a point.

Use norm.cdf to find the probability of a value being at or below a point.

Use norm.rvs to generate random samples from the distribution.