
How to Use the IQR Method to Detect Outliers in Python

To apply the IQR (interquartile range) method in Python, calculate the first quartile (Q1) and third quartile (Q3), then compute the IQR as Q3 - Q1. Outliers are values below Q1 - 1.5 * IQR or above Q3 + 1.5 * IQR.

Syntax

The IQR method involves these steps:

  • Calculate Q1 (25th percentile) and Q3 (75th percentile) of your data.
  • Compute IQR = Q3 - Q1.
  • Define lower bound = Q1 - 1.5 * IQR.
  • Define upper bound = Q3 + 1.5 * IQR.
  • Identify outliers as values outside these bounds.
python
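import numpy as np

# Sample input; replace with your own 1-D sequence of numbers.
data = [10, 12, 14, 15, 18, 19, 20, 22, 23, 24, 100]

Q1 = np.percentile(data, 25)  # first quartile (25th percentile)
Q3 = np.percentile(data, 75)  # third quartile (75th percentile)
IQR = Q3 - Q1                 # interquartile range
lower_bound = Q1 - 1.5 * IQR
upper_bound = Q3 + 1.5 * IQR
outliers = [x for x in data if x < lower_bound or x > upper_bound]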
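The steps above can also be wrapped in a small helper so the multiplier is easy to change. This is only a sketch: the function name `iqr_bounds` and the parameter `k` are hypothetical names chosen for illustration, not part of NumPy.

```python
import numpy as np


def iqr_bounds(values, k=1.5):
    """Return the (lower, upper) bounds of the IQR rule.

    `iqr_bounds` and `k` are hypothetical names used for illustration.
    """
    # Passing a list of percentiles returns both quartiles in one call.
    q1, q3 = np.percentile(values, [25, 75])
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


data = [10, 12, 14, 15, 18, 19, 20, 22, 23, 24, 100]
lower, upper = iqr_bounds(data)
outliers = [x for x in data if x < lower or x > upper]
print(outliers)  # [100]
```

Keeping `k` as a parameter makes it easy to try a stricter or looser rule without rewriting the bounds by hand.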

Example

This example shows how to find outliers in a list of numbers using the IQR method in Python.

python
import numpy as np

data = [10, 12, 14, 15, 18, 19, 20, 22, 23, 24, 100]

Q1 = np.percentile(data, 25)
Q3 = np.percentile(data, 75)
IQR = Q3 - Q1

lower_bound = Q1 - 1.5 * IQR
upper_bound = Q3 + 1.5 * IQR

outliers = [x for x in data if x < lower_bound or x > upper_bound]

print("Q1:", Q1)
print("Q3:", Q3)
print("IQR:", IQR)
print("Lower bound:", lower_bound)
print("Upper bound:", upper_bound)
print("Outliers:", outliers)
Output
Q1: 14.5
Q3: 22.5
IQR: 8.0
Lower bound: 2.5
Upper bound: 34.5
Outliers: [100]
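If the data is held in a NumPy array, the same check can be written with a boolean mask instead of a list comprehension. The sketch below reuses the example data; the variable names `is_outlier` and `inliers` are illustrative assumptions.

```python
import numpy as np

values = np.array([10, 12, 14, 15, 18, 19, 20, 22, 23, 24, 100])

q1, q3 = np.percentile(values, [25, 75])
iqr = q3 - q1

# True wherever a value falls outside the lower or upper bound.
is_outlier = (values < q1 - 1.5 * iqr) | (values > q3 + 1.5 * iqr)

outliers = values[is_outlier]
inliers = values[~is_outlier]

print(outliers.tolist())  # [100]
print(inliers.tolist())   # [10, 12, 14, 15, 18, 19, 20, 22, 23, 24]
```

The mask keeps the outliers and the remaining values separate, which is handy when you want to inspect the flagged points before deciding what to do with them.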

Common Pitfalls

Common mistakes when using the IQR method include:

  • Not using the correct percentiles (Q1 = 25th, Q3 = 75th).
  • Forgetting to multiply IQR by 1.5 when calculating bounds.
  • Applying the method to non-numeric data. (The data does not need to be sorted; np.percentile handles the ordering internally.)
  • Misinterpreting outliers as errors instead of potentially important data points.
python
import numpy as np

data = [10, 12, 14, 15, 18, 19, 20, 22, 23, 24, 100]

# Wrong: Using 50th percentile instead of 25th and 75th
Q1_wrong = np.percentile(data, 50)
Q3_wrong = np.percentile(data, 50)
IQR_wrong = Q3_wrong - Q1_wrong

# Correct way
Q1 = np.percentile(data, 25)
Q3 = np.percentile(data, 75)
IQR = Q3 - Q1
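A related case worth checking is missing values. The sketch below assumes missing entries are stored as NaN, which is an assumption about your data rather than something the method requires. It shows that `np.percentile` returns NaN when any input is NaN, while `np.nanpercentile` ignores those entries.

```python
import numpy as np

# Hypothetical sample with one missing value stored as NaN.
data = np.array([10, 12, 14, 15, np.nan, 18, 19, 20, 22, 23, 24, 100])

# np.percentile propagates NaN, so the quartile is NaN as well.
q1_with_nan = np.percentile(data, 25)
print(np.isnan(q1_with_nan))  # True

# np.nanpercentile skips NaN entries when computing percentiles.
q1, q3 = np.nanpercentile(data, [25, 75])
iqr = q3 - q1
lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr

# Comparisons with NaN evaluate to False, so the missing value is not flagged.
outliers = data[(data < lower) | (data > upper)]
print(outliers.tolist())  # [100.0]
```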

Quick Reference

Remember these key points for the IQR method:

  • Q1: 25th percentile
  • Q3: 75th percentile
  • IQR: Q3 - Q1
  • Lower bound: Q1 - 1.5 * IQR
  • Upper bound: Q3 + 1.5 * IQR
  • Outliers: Values outside lower and upper bounds

Key Takeaways

  • Calculate Q1 and Q3 using the 25th and 75th percentiles of your data.
  • Compute IQR as the difference between Q3 and Q1.
  • Outliers are data points below Q1 - 1.5 * IQR or above Q3 + 1.5 * IQR.
  • Use numpy.percentile for easy percentile calculations in Python.
  • Check your data type and distribution before applying the IQR method.