Categorical scatter with jitter in Matplotlib - Step-by-Step Execution

Concept Flow - Categorical scatter with jitter
1. Start with categorical data
2. Assign numeric positions to categories
3. Add small random noise (jitter) to positions
4. Plot points at jittered positions
5. Display scatter plot with spread points
We convert categories to numbers, add small random shifts to avoid overlap, then plot points spread out for clarity.
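The first step, turning categories into numbers, does not have to be done by hand. A minimal sketch, assuming NumPy's `np.unique` is acceptable for the mapping (it orders the labels alphabetically, which may not be the order you want on the axis):

```python
import numpy as np

categories = ['A', 'B', 'A', 'C', 'B']

# np.unique returns the sorted distinct labels; return_inverse gives, for each
# original entry, the index of its label in that sorted array.
labels, positions = np.unique(categories, return_inverse=True)

print(labels.tolist())     # ['A', 'B', 'C']
print(positions.tolist())  # [0, 1, 0, 2, 1]
```

Here `labels` can later serve as the tick labels and `positions` as the numeric x values.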
Execution Sample
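import matplotlib.pyplot as plt
import numpy as np

categories = ['A', 'B', 'A', 'C', 'B']
# Numeric x position of each point: A=0, B=1, C=2
positions = [0, 1, 0, 2, 1]
# One small random horizontal offset per point (different on every run)
jitter = np.random.uniform(-0.1, 0.1, size=len(positions))
# All points share y=1, so only the jitter separates points in the same category
plt.scatter(np.array(positions) + jitter, [1]*len(positions))
# Relabel the numeric ticks with the category names
plt.xticks([0, 1, 2], ['A', 'B', 'C'])
plt.show()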
This code plots categorical points with jitter to spread overlapping points horizontally.
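Because `np.random.uniform` draws new values on every run, the plot looks slightly different each time. If a repeatable figure is wanted, one option is a seeded generator. The sketch below assumes NumPy's `default_rng`; the helper name `make_jitter` and the seed value are illustrative only.

```python
import numpy as np

positions = np.array([0, 1, 0, 2, 1])

def make_jitter(seed, n, width=0.1):
    """Hypothetical helper: n uniform offsets in [-width, width) from a seeded generator."""
    rng = np.random.default_rng(seed)
    return rng.uniform(-width, width, size=n)

first = make_jitter(42, len(positions))
second = make_jitter(42, len(positions))

# The same seed reproduces the same offsets, so the plot is repeatable.
print(np.array_equal(first, second))  # True
```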
Execution Table
| Step | Variable | Value/Action | Result/Effect |
|---|---|---|---|
| 1 | categories | ['A', 'B', 'A', 'C', 'B'] | Original categorical data |
| 2 | positions | [0, 1, 0, 2, 1] | Numeric positions assigned to categories |
| 3 | jitter | random values between -0.1 and 0.1 | Small noise added to positions |
| 4 | positions + jitter | e.g. [0.05, 0.92, -0.07, 2.03, 1.08] | Jittered positions for plotting |
| 5 | plt.scatter | plot points at jittered positions | Scatter plot with spread points |
| 6 | plt.xticks | set x-axis labels to ['A','B','C'] | Categories shown on x-axis |
| 7 | plt.show | display plot | Visual output with jittered categorical scatter |
💡 All points are plotted with jitter to avoid overlap, and the plot is displayed.
Variable Tracker
| Variable | Start | After Step 2 | After Step 3 | After Step 4 | Final |
|---|---|---|---|---|---|
| categories | None | ['A', 'B', 'A', 'C', 'B'] | Same | Same | Same |
| positions | None | [0, 1, 0, 2, 1] | Same | Same | Same |
| jitter | None | None | [-0.07, -0.08, 0.05, 0.03, 0.08] | Same | Same |
| positions + jitter | None | None | None | [-0.07, 0.92, 0.05, 2.03, 1.08] | Same |
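The step 4 value is an element-wise sum of the positions and the jitter. A small check, using the example jitter values from the tracker (the real values differ on every run):

```python
import numpy as np

positions = np.array([0, 1, 0, 2, 1])
# Example jitter values taken from the tracker; actual draws are random.
jitter = np.array([-0.07, -0.08, 0.05, 0.03, 0.08])

# NumPy adds the arrays element by element: 0 + (-0.07), 1 + (-0.08), ...
jittered = positions + jitter

print(np.round(jittered, 2).tolist())  # [-0.07, 0.92, 0.05, 2.03, 1.08]
```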
Key Moments - 3 Insights
Why do we add jitter to the numeric positions?
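In the sample every point has y = 1, so without jitter the two 'A' points (and the two 'B' points) would be drawn at exactly the same spot. Adding jitter shifts each point horizontally by a small random amount, making each one visible. See step 4 of the Execution Table, where the jitter is added to the positions.
How do categories become numbers for plotting?
Each category is assigned a number (0 for 'A', 1 for 'B', 2 for 'C') so Matplotlib can place the points on a numeric axis. See step 2 of the Execution Table.
Why do we set x-axis ticks after plotting?
The points are plotted at numeric positions, so by default the x-axis would show numbers. plt.xticks replaces the ticks at 0, 1 and 2 with the category names. See step 6 of the Execution Table.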
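The overlap problem can be seen without drawing anything by counting how many distinct x positions the points occupy before and after jitter. This is a sketch only; the seed is arbitrary and chosen just to make the run repeatable.

```python
import numpy as np

positions = np.array([0, 1, 0, 2, 1])
rng = np.random.default_rng(0)  # arbitrary seed, for repeatability only
jittered = positions + rng.uniform(-0.1, 0.1, size=len(positions))

# Five points, but only three distinct x positions: with a shared y value,
# points in the same category land exactly on top of each other.
distinct_before = len(np.unique(positions))
# After jitter, each point gets its own x position.
distinct_after = len(np.unique(jittered))

print(distinct_before, distinct_after)
```

With continuous random offsets, the second count should equal the number of points.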
Visual Quiz - 3 Questions
Test your understanding
Look at the Variable Tracker table. What is the value of 'positions + jitter' after step 4?
A. [0, 1, 0, 2, 1]
B. ['A', 'B', 'A', 'C', 'B']
C. [-0.07, 0.92, 0.05, 2.03, 1.08]
D. [-0.1, 0.1, -0.1, 0.1, 0.1]
💡 Hint
Check the 'positions + jitter' row in the Variable Tracker after step 4.
At which step in the Execution Table do we assign numeric positions to categories?
A. Step 2
B. Step 4
C. Step 1
D. Step 6
💡 Hint
Look for the step where the 'positions' variable is set.
If the jitter range were increased to (-0.5, 0.5), how would the scatter plot points change?
A. Points would overlap more
B. Points would be more spread out horizontally
C. Points would move vertically
D. Points would disappear
💡 Hint
Jitter controls the horizontal noise added to positions; see step 3 of the Execution Table.
Concept Snapshot
Categorical scatter with jitter:
- Map categories to numbers (e.g., A=0, B=1)
- Add small random noise (jitter) to numeric positions
- Plot points at jittered positions to avoid overlap
- Set x-axis ticks to category labels
- Result: clear scatter plot showing category distribution
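The snapshot above can be collected into one reusable function. This is a sketch under stated assumptions, not a Matplotlib API: `jitter_scatter` and its `width` and `seed` parameters are hypothetical names, the categories are ordered alphabetically by `np.unique`, and the `Agg` backend is selected only so the example runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend; drop this line to use plt.show()
import matplotlib.pyplot as plt
import numpy as np

def jitter_scatter(ax, categories, values, width=0.1, seed=None):
    """Hypothetical helper: scatter `values` by category with horizontal jitter."""
    # Map categories to numbers (alphabetical order of the distinct labels).
    labels, positions = np.unique(categories, return_inverse=True)
    # Add small random noise to the numeric positions.
    rng = np.random.default_rng(seed)
    x = positions + rng.uniform(-width, width, size=len(positions))
    # Plot at the jittered positions and label the ticks with category names.
    ax.scatter(x, values)
    ax.set_xticks(np.arange(len(labels)))
    ax.set_xticklabels(labels.tolist())
    return x

fig, ax = plt.subplots()
x = jitter_scatter(ax, ['A', 'B', 'A', 'C', 'B'], [1, 1, 1, 1, 1], seed=0)
```

To view the result, call `fig.savefig(...)` with a file name of your choice, or remove the backend line and call `plt.show()`.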
Full Transcript
This visual execution shows how to plot categorical data points with jitter using matplotlib. First, categories are converted to numeric positions. Then, small random noise called jitter is added to these positions to spread points horizontally and avoid overlap. The jittered positions are plotted as a scatter plot. Finally, x-axis ticks are set to show category names for clarity. This method helps visualize overlapping categorical points clearly.
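How much jitter counts as "small" depends on the spacing between categories. In this example the categories sit one unit apart, so any offset smaller than 0.5 keeps each point closer to its own category than to a neighbour. The sketch below checks this by rounding the jittered positions back to the nearest integer; the widths and the seed are illustrative choices.

```python
import numpy as np

positions = np.array([0, 1, 0, 2, 1])
rng = np.random.default_rng(1)  # arbitrary seed, for repeatability only

results = {}
for width in (0.1, 0.4):
    jittered = positions + rng.uniform(-width, width, size=len(positions))
    # Rounding recovers the original category index while |jitter| < 0.5.
    recovered = np.rint(jittered).astype(int)
    results[width] = bool(np.array_equal(recovered, positions))

print(results)  # {0.1: True, 0.4: True}
```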