We use categorical scatter plots with jitter to show data points for categories clearly. Jitter adds small random shifts so points don't overlap.
Categorical scatter with jitter in Matplotlib
Introduction
When you want to see individual data points for categories like fruits or colors.
When many points overlap in a category and you want to spread them out to see density.
When comparing groups and you want to show all data points, not just summaries.
When you want a simple way to visualize distribution within categories.
Syntax
Matplotlib
import matplotlib.pyplot as plt
import numpy as np

categories = ['A', 'B', 'C']
values = [5, 7, 6, 8, 5, 7, 6, 9, 5]
category_labels = ['A', 'A', 'B', 'B', 'B', 'C', 'C', 'C', 'C']

# Convert categories to numbers
x = [categories.index(cat) for cat in category_labels]

# Add jitter
jitter = np.random.uniform(-0.1, 0.1, size=len(x))
x_jittered = x + jitter

plt.scatter(x_jittered, values)
plt.xticks(range(len(categories)), categories)
plt.xlabel('Category')
plt.ylabel('Value')
plt.title('Categorical scatter with jitter')
plt.show()
Jitter is a small random number added to category positions to avoid overlap.
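Categories are converted to numeric positions (0, 1, 2, ...) so that jitter can be added to them. plt.scatter also accepts string categories directly, but a string cannot be shifted by a numeric offset.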
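The two steps above (convert categories to numbers, then add jitter) can be wrapped in a small helper. This is only a sketch: the function name jittered_positions and its width and seed parameters are hypothetical, not part of Matplotlib or NumPy, and it assumes NumPy's default_rng generator instead of the np.random.uniform call used above.

```python
import numpy as np

def jittered_positions(labels, categories, width=0.1, seed=None):
    """Map each label to its category index and add uniform jitter.

    Hypothetical helper for illustration; not a Matplotlib or NumPy API.
    """
    # Look up each category's integer position once.
    index_of = {cat: i for i, cat in enumerate(categories)}
    base = np.array([index_of[label] for label in labels], dtype=float)
    # A seeded generator makes the offsets repeatable; seed=None gives fresh ones.
    rng = np.random.default_rng(seed)
    return base + rng.uniform(-width, width, size=base.size)

categories = ['A', 'B', 'C']
category_labels = ['A', 'A', 'B', 'B', 'B', 'C', 'C', 'C', 'C']
x_jittered = jittered_positions(category_labels, categories, width=0.1, seed=0)
# x_jittered can then be passed to plt.scatter(x_jittered, values).
```

Using a dictionary lookup avoids calling categories.index for every point, which may matter when there are many points.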
Examples
This example shows two categories with jitter to spread points horizontally.
Matplotlib
import matplotlib.pyplot as plt
import numpy as np

categories = ['X', 'Y']
values = [1, 2, 3, 4, 5, 6]
category_labels = ['X', 'X', 'Y', 'Y', 'Y', 'X']

x = [categories.index(cat) for cat in category_labels]
jitter = np.random.uniform(-0.05, 0.05, len(x))
x_jittered = x + jitter

plt.scatter(x_jittered, values)
plt.xticks(range(len(categories)), categories)
plt.show()
Here the jitter range is wider (±0.2 instead of the ±0.05 used in the previous example), so points spread further apart and are easier to tell apart.
Matplotlib
import matplotlib.pyplot as plt
import numpy as np

cats = ['Dog', 'Cat', 'Bird']
vals = [3, 5, 2, 4, 6, 7, 3, 5]
labels = ['Dog', 'Dog', 'Cat', 'Cat', 'Bird', 'Bird', 'Bird', 'Dog']

x = [cats.index(c) for c in labels]
jitter = np.random.uniform(-0.2, 0.2, len(x))
x_jittered = x + jitter

plt.scatter(x_jittered, vals, color='green')
plt.xticks(range(len(cats)), cats)
plt.title('Pets values with jitter')
plt.show()
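To see what changing the jitter range does, the sketch below draws offsets for the two widths used in these examples from the same seed. The offsets function is a hypothetical helper, and the use of np.random.default_rng is an assumption made here for repeatability; the examples above call np.random.uniform directly.

```python
import numpy as np

def offsets(width, n, seed=0):
    """Draw n uniform jitter offsets in [-width, width). Hypothetical helper."""
    rng = np.random.default_rng(seed)
    return rng.uniform(-width, width, size=n)

narrow = offsets(0.05, 8)  # range used in the two-category example
wide = offsets(0.2, 8)     # range used in the pets example

# Each offset stays inside its own range.
print(np.abs(narrow).max() <= 0.05)
print(np.abs(wide).max() <= 0.2)
# With the same seed, the wide offsets are the narrow ones scaled by 0.2 / 0.05.
print(np.allclose(wide, 4 * narrow))
```

The width only rescales the random offsets, so it controls how far points may move from their category position and nothing else.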
Sample Program
This program shows how to plot values for color categories with jitter to avoid overlapping points. The jitter is random but fixed by seed for consistent results.
Matplotlib
import matplotlib.pyplot as plt
import numpy as np

# Define categories and values
categories = ['Red', 'Blue', 'Green']
values = [10, 15, 10, 20, 25, 15, 10, 30, 20]
category_labels = ['Red', 'Red', 'Blue', 'Blue', 'Blue', 'Green', 'Green', 'Green', 'Green']

# Convert categories to numeric positions
x = [categories.index(cat) for cat in category_labels]

# Add jitter to x positions
np.random.seed(42)  # For reproducible jitter
jitter = np.random.uniform(-0.15, 0.15, size=len(x))
x_jittered = x + jitter

# Create scatter plot
plt.scatter(x_jittered, values, color='purple')
plt.xticks(range(len(categories)), categories)
plt.xlabel('Color')
plt.ylabel('Value')
plt.title('Categorical scatter plot with jitter')
plt.grid(True, linestyle='--', alpha=0.5)
plt.show()
Important Notes
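Use a small jitter range to keep points near their category. Category positions are 1 unit apart, so a half-width well below 0.5 (the examples here use 0.05 to 0.2) keeps neighboring groups from blending together.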
Setting a random seed helps get the same jitter every time you run the code.
In these examples jitter is applied only to the horizontal (category) position. The y values are plotted unchanged, so the data values themselves are not altered.
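The notes on seeding and range can be checked without drawing a plot. The following is a minimal sketch; seeded_jitter is a hypothetical helper that reuses the np.random.seed and np.random.uniform calls from the sample program with the same seed (42) and range (0.15).

```python
import numpy as np

def seeded_jitter(n, width=0.15, seed=42):
    """Reset the global seed, then draw n jitter offsets. Hypothetical helper."""
    np.random.seed(seed)
    return np.random.uniform(-width, width, size=n)

first = seeded_jitter(9)
second = seeded_jitter(9)

# Same seed, same offsets: the plot looks identical on every run.
same = np.array_equal(first, second)
# Every offset stays within the chosen range around its category position.
within_range = bool(np.all(np.abs(first) <= 0.15))
print(same, within_range)
```

Note that np.random.seed resets NumPy's global random state, so any other code that draws from np.random afterwards is affected as well.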
Summary
Categorical scatter plots show individual points for categories.
Jitter adds small random shifts to avoid overlapping points.
This helps visualize data distribution within categories clearly.