
How to Use a Box Plot to Detect Outliers in Python

Use the matplotlib or seaborn library in Python to create a box plot, which draws outliers as individual points beyond the whiskers. Any data point that falls below the lower whisker or above the upper whisker is flagged as an outlier, so unusual values are easy to spot.

Syntax

A box plot in Python can be created using matplotlib.pyplot.boxplot() or seaborn.boxplot(). The main parameters are:

  • data: The list or array of numbers to plot.
  • vert: (matplotlib) Whether the box plot is vertical (True) or horizontal (False).
  • showfliers: (matplotlib) Whether to show outliers as points (True by default).

Each whisker extends to the most extreme data point that lies within 1.5 times the interquartile range (IQR) of the box edge (Q1 or Q3). Outliers appear as dots beyond the whiskers.
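
The 1.5 × IQR rule can also be checked numerically. The following is a minimal sketch that assumes NumPy is installed and uses np.percentile with its default linear interpolation; the sample list and the names lower_fence and upper_fence are illustrative, not part of either plotting library.

```python
import numpy as np

# Illustrative sample data (an assumption for this sketch)
data = np.array([7, 8, 5, 6, 9, 10, 15, 22, 3, 4, 100, 2, 5, 6])

# First and third quartiles, then the interquartile range
q1, q3 = np.percentile(data, [25, 75])
iqr = q3 - q1

# Points beyond these fences are the ones a box plot draws as outliers
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

outliers = sorted(data[(data < lower_fence) | (data > upper_fence)].tolist())
print("Q1, Q3, IQR:", q1, q3, iqr)
print("Fences:", lower_fence, upper_fence)
print("Outliers:", outliers)
```

For this list, only 22 and 100 fall outside the fences, so those are the values a box plot would draw as separate points.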

python
import matplotlib.pyplot as plt
import seaborn as sns

# Matplotlib syntax
plt.boxplot(data, vert=True, showfliers=True)

# Seaborn syntax
sns.boxplot(x=data)

Example

This example shows how to create a box plot with matplotlib and seaborn to detect outliers in a sample dataset.

python
import matplotlib.pyplot as plt
import seaborn as sns

# Sample data with outliers
data = [7, 8, 5, 6, 9, 10, 15, 22, 3, 4, 100, 2, 5, 6]

# Using matplotlib
plt.figure(figsize=(8, 4))
plt.subplot(1, 2, 1)
plt.boxplot(data, vert=True, showfliers=True)
plt.title('Matplotlib Box Plot')

# Using seaborn
plt.subplot(1, 2, 2)
sns.boxplot(x=data)
plt.title('Seaborn Box Plot')

plt.tight_layout()
plt.show()
Output
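A window opens showing two box plots side by side. The matplotlib plot is vertical, with dots above the upper whisker marking the outliers (22 and 100). The seaborn plot is horizontal because the data is passed as x, so the same outliers appear as dots to the right of the whisker.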
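
To read the flagged values off the plot instead of estimating them by eye, one option is to inspect the dictionary that matplotlib's boxplot() returns. This is a sketch rather than the only approach: it assumes a vertical plot (so the outlier values sit in the y-data) and switches to the non-interactive Agg backend so that no window is needed.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, no window needed
import matplotlib.pyplot as plt

data = [7, 8, 5, 6, 9, 10, 15, 22, 3, 4, 100, 2, 5, 6]

fig, ax = plt.subplots()
artists = ax.boxplot(data)  # dict of the Line2D artists that make up the plot

# "fliers" holds one Line2D per box; in a vertical plot its y-data are the outlier values
flier_values = sorted(float(v) for v in artists["fliers"][0].get_ydata())
print(flier_values)

plt.close(fig)
```

For the sample data this prints 22.0 and 100.0, matching the dots in the matplotlib panel.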

Common Pitfalls

  • Not showing outliers: matplotlib draws outliers by default (showfliers=True); they disappear only when showfliers is set to False.
  • Misinterpreting whiskers: Whiskers do not always mark the min/max. Each one ends at the most extreme data point within 1.5 times the IQR of the box edge; points beyond that are drawn as outliers.
  • Using raw data without cleaning: Extreme outliers can stretch the axis and compress the box, hiding detail in the bulk of the data.

Always check that outliers are shown and make sure you understand what the whiskers represent.
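
To see how the whisker ends differ from the true min/max, the statistics behind the plot can be computed without drawing anything. The sketch below assumes matplotlib.cbook.boxplot_stats, the helper matplotlib uses for its box plot statistics, and reads the "whislo", "whishi", and "fliers" entries of the dictionary it returns; the sample list is illustrative.

```python
from matplotlib import cbook

data = [7, 8, 5, 6, 9, 10, 15, 22, 3, 4, 100, 2, 5, 6]

# boxplot_stats returns one dict of statistics per dataset
stats = cbook.boxplot_stats(data)[0]

print("min / max:   ", min(data), max(data))
print("whisker ends:", stats["whislo"], stats["whishi"])
print("outliers:    ", sorted(stats["fliers"].tolist()))
```

Here the lower whisker coincides with the minimum (2), but the upper whisker stops at 15 while the maximum is 100, because 22 and 100 lie more than 1.5 times the IQR above Q3.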

python
import matplotlib.pyplot as plt

# Wrong: Outliers hidden
plt.boxplot([1, 2, 3, 100], showfliers=False)
plt.title('Outliers Hidden')
plt.show()

# Right: Outliers shown
plt.boxplot([1, 2, 3, 100], showfliers=True)
plt.title('Outliers Shown')
plt.show()
Output
Two plots appear one after the other: the first omits the outlier dot (100), and the second shows it clearly above the upper whisker.
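
For the scale-distortion pitfall, one possible workaround (an assumption of this sketch, not the only remedy) is to keep the outliers but draw the value axis on a log scale so the box is not squashed. This only works when all values are positive, and the quartiles and whiskers are still computed from the raw values.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, no window needed
import matplotlib.pyplot as plt

data = [7, 8, 5, 6, 9, 10, 15, 22, 3, 4, 100, 2, 5, 6]

fig, (ax_linear, ax_log) = plt.subplots(1, 2, figsize=(8, 4))

# Linear axis: the outlier at 100 stretches the scale and compresses the box
ax_linear.boxplot(data)
ax_linear.set_title("Linear scale")

# Log axis: same statistics, but the box stays readable next to the outliers
ax_log.boxplot(data)
ax_log.set_yscale("log")
ax_log.set_title("Log scale")

fig.tight_layout()
# In an interactive session, call plt.show() here to display the figure.
```

Whether a log axis is appropriate depends on the data; removing or separately reporting extreme values is another common choice.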

Quick Reference

| Function | Library | Key Parameter | Purpose |
|---|---|---|---|
| boxplot(data, showfliers=True) | matplotlib.pyplot | showfliers | Show or hide outliers |
| boxplot(x=data) | seaborn | None | Create box plot with outliers shown by default |
| plt.show() | matplotlib.pyplot | None | Display the plot window |

Key Takeaways

  • Use matplotlib or seaborn box plots to visually detect outliers in your data.
  • Outliers appear as points beyond the whiskers, which reach at most 1.5 times the IQR from the box edges.
  • Keep showfliers=True (the matplotlib default) so that outliers remain visible.
  • Whiskers mark the most extreme non-outlier values, which are not necessarily the min/max of the data.
  • Visualizing outliers helps identify unusual data points for cleaning or analysis.