
KDE overlay concept in Matplotlib - Step-by-Step Execution

Concept Flow - KDE overlay concept
Start with raw data points
Calculate KDE for dataset 1
Calculate KDE for dataset 2
Plot KDE curves on same graph
Visualize overlapping density
Interpret density peaks and overlaps
We start with raw data, compute KDE for each dataset, then plot them together to see how their densities overlap.
Execution Sample
Matplotlib
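import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import gaussian_kde

# Two samples of 100 points each, drawn from N(0, 1) and N(2, 1)
x1 = np.random.normal(0, 1, 100)
x2 = np.random.normal(2, 1, 100)

# Fit a Gaussian KDE to each sample; each result is a callable density estimate
kde1 = gaussian_kde(x1)
kde2 = gaussian_kde(x2)

# Shared grid so both curves are evaluated at the same x positions
x = np.linspace(-4, 6, 200)
plt.plot(x, kde1(x), label='Dataset 1')
plt.plot(x, kde2(x), label='Dataset 2')
plt.legend()
plt.show()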
This code creates two datasets, computes their KDEs, and plots both KDE curves on the same graph to show their density overlap.
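As a quick sanity check on the sample above, the sketch below confirms that each KDE curve behaves like a probability density, meaning the area under it over the plotted range is close to 1. The fixed seed and the use of `scipy.integrate.trapezoid` are assumptions added for illustration; neither appears in the original sample.

```python
import numpy as np
from scipy.integrate import trapezoid
from scipy.stats import gaussian_kde

# Assumption: a fixed seed (not in the original sample) so runs are repeatable.
rng = np.random.default_rng(0)
x1 = rng.normal(0, 1, 100)
x2 = rng.normal(2, 1, 100)

kde1 = gaussian_kde(x1)
kde2 = gaussian_kde(x2)

x = np.linspace(-4, 6, 200)

# Area under each curve over the plotted range, by the trapezoidal rule.
area1 = trapezoid(kde1(x), x)
area2 = trapezoid(kde2(x), x)
print(f"area under KDE 1: {area1:.3f}")
print(f"area under KDE 2: {area2:.3f}")
```

If either area falls well below 1, the x range is probably cutting off part of that curve and should be widened.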
Execution Table
| Step | Action | Data/Variable | Result/Output |
|------|--------|---------------|---------------|
| 1 | Generate dataset 1 | x1 | 100 points from N(0,1) |
| 2 | Generate dataset 2 | x2 | 100 points from N(2,1) |
| 3 | Compute KDE for dataset 1 | kde1 = gaussian_kde(x1) | kde1 is a KDE function |
| 4 | Compute KDE for dataset 2 | kde2 = gaussian_kde(x2) | kde2 is a KDE function |
| 5 | Create x values for plotting | x = np.linspace(-4,6,200) | Array of 200 points from -4 to 6 |
| 6 | Evaluate kde1 on x | kde1(x) | Density values for dataset 1 |
| 7 | Evaluate kde2 on x | kde2(x) | Density values for dataset 2 |
| 8 | Plot kde1 and kde2 | plt.plot | Two KDE curves overlaid |
| 9 | Show plot | plt.show() | Visual graph with KDE overlays |
| 10 | End | - | Execution complete |
💡 All steps completed; KDE overlay plot displayed.
Variable Tracker
| Variable | Start | After Step 1 | After Step 2 | After Step 3 | After Step 4 | After Step 5 | After Step 6 | After Step 7 | Final |
|----------|-------|--------------|--------------|--------------|--------------|--------------|--------------|--------------|-------|
| x1 | None | 100 points N(0,1) | 100 points N(0,1) | 100 points N(0,1) | 100 points N(0,1) | 100 points N(0,1) | 100 points N(0,1) | 100 points N(0,1) | 100 points N(0,1) |
| x2 | None | None | 100 points N(2,1) | 100 points N(2,1) | 100 points N(2,1) | 100 points N(2,1) | 100 points N(2,1) | 100 points N(2,1) | 100 points N(2,1) |
| kde1 | None | None | None | KDE function for x1 | KDE function for x1 | KDE function for x1 | KDE function for x1 | KDE function for x1 | KDE function for x1 |
| kde2 | None | None | None | None | KDE function for x2 | KDE function for x2 | KDE function for x2 | KDE function for x2 | KDE function for x2 |
| x | None | None | None | None | None | Array from -4 to 6 (200 pts) | Array from -4 to 6 (200 pts) | Array from -4 to 6 (200 pts) | Array from -4 to 6 (200 pts) |
| kde1(x) | None | None | None | None | None | None | Density values (dataset 1) | Density values (dataset 1) | Density values (dataset 1) |
| kde2(x) | None | None | None | None | None | None | None | Density values (dataset 2) | Density values (dataset 2) |
Key Moments - 3 Insights
Why do we use the same x values to evaluate both KDEs?
Using the same x values (step 5) ensures both KDE curves align on the same horizontal axis, making the overlay meaningful and comparable (see steps 6 and 7).
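One practical consequence of a shared grid is that the two density arrays can be compared element by element. The sketch below uses that to estimate where the two curves meet. The seed and the idea of searching between the two sample means are assumptions made for this illustration.

```python
import numpy as np
from scipy.stats import gaussian_kde

rng = np.random.default_rng(0)  # assumption: seed added for repeatability
x1 = rng.normal(0, 1, 100)
x2 = rng.normal(2, 1, 100)
kde1, kde2 = gaussian_kde(x1), gaussian_kde(x2)

x = np.linspace(-4, 6, 200)
d1, d2 = kde1(x), kde2(x)  # same grid, so d1[i] and d2[i] both refer to x[i]

# Between the two sample means, find the grid point where the curves are closest.
between = (x > x1.mean()) & (x < x2.mean())
i = np.argmin(np.abs(d1[between] - d2[between]))
crossing = x[between][i]
print(f"curves meet near x = {crossing:.2f}")
```

Had the two KDEs been evaluated on different grids, `d1[i]` and `d2[i]` would describe different x positions and the subtraction would be meaningless.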
What does the KDE function represent after computation?
After steps 3 and 4, kde1 and kde2 are functions that estimate the probability density of the data, not just numbers. They can be evaluated at any x to get density values.
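To make the "function, not numbers" point concrete, here is a small sketch that inspects a fitted `gaussian_kde` object and evaluates it at a few points. The seed is an assumption added for repeatability; `evaluate`, `n`, and `factor` are members of SciPy's `gaussian_kde`.

```python
import numpy as np
from scipy.stats import gaussian_kde

rng = np.random.default_rng(0)  # assumption: seed added for repeatability
x1 = rng.normal(0, 1, 100)
kde1 = gaussian_kde(x1)

# The fitted object is callable: pass one point or many.
at_zero = kde1(0.0)                  # array holding a single density value
at_three = kde1([-1.0, 0.0, 1.0])    # array holding three density values

# Calling the object is shorthand for its evaluate() method.
same = np.allclose(at_three, kde1.evaluate([-1.0, 0.0, 1.0]))

print(type(kde1).__name__)
print(at_zero.shape, at_three.shape)
print(kde1.n, kde1.factor)           # number of data points and bandwidth factor
```

Even a single input point comes back as a one-element array, so index it with `[0]` when a plain float is needed.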
Why do the KDE curves overlap instead of being separate?
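Both KDEs are drawn on the same axes (step 8), and the two samples are centered only two standard deviations apart (means 0 and 2, standard deviation 1). Around x = 1 both densities are clearly above zero, so the curves overlap there, showing the region of shared density.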
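The amount of overlap can also be put into a number. One common choice, used here as an illustration rather than something the original sample computes, is the area under the lower of the two curves. The seed and the use of `scipy.integrate.trapezoid` are assumptions.

```python
import numpy as np
from scipy.integrate import trapezoid
from scipy.stats import gaussian_kde

rng = np.random.default_rng(0)  # assumption: seed added for repeatability
x1 = rng.normal(0, 1, 100)
x2 = rng.normal(2, 1, 100)

x = np.linspace(-4, 6, 200)
d1 = gaussian_kde(x1)(x)
d2 = gaussian_kde(x2)(x)

# At each x, the shared density is the lower of the two curves.
shared = np.minimum(d1, d2)
overlap = trapezoid(shared, x)
print(f"overlap area: {overlap:.3f}")
```

A value near 0 means the two distributions barely share any density; a value near 1 means they are almost indistinguishable.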
Visual Quiz - 3 Questions
Test your understanding
Look at the execution table at step 5. What does the variable x represent?
A. The KDE function for dataset 1
B. The original dataset 1 points
C. An array of points from -4 to 6 used for plotting KDEs
D. The density values of dataset 2
💡 Hint
Refer to step 5 in the execution table, where x is created as linspace(-4,6,200).
At which step do we get the actual density values for dataset 2?
A. Step 6
B. Step 7
C. Step 3
D. Step 2
💡 Hint
Check step 7 in the execution table, where kde2(x) is evaluated.
If we changed the range of x to linspace(-10, 10, 200), how would the execution table change?
A. Step 5's x values would range from -10 to 10 instead of -4 to 6
B. Dataset 1 would have more points
C. KDE functions would change to different distributions
D. The plot would show only one KDE curve
💡 Hint
Look at step 5, where x is defined; changing linspace changes the x values only.
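The last question can be checked directly. The sketch below fits one KDE and evaluates it on both grids, assuming a fixed seed for repeatability. The fitted object is built from the data alone, so only the evaluation points change.

```python
import numpy as np
from scipy.stats import gaussian_kde

rng = np.random.default_rng(0)  # assumption: seed added for repeatability
x1 = rng.normal(0, 1, 100)
kde1 = gaussian_kde(x1)         # fitted once, from the data only

narrow = np.linspace(-4, 6, 200)
wide = np.linspace(-10, 10, 200)

# Sample count and bandwidth factor are fixed at fit time, whatever grid is used later.
print(kde1.n, kde1.factor)

# Both grids give 200 density values; the wide one reaches far into the tails.
dens_narrow = kde1(narrow)
dens_wide = kde1(wide)
print(dens_narrow.shape, dens_wide.shape)
print(dens_wide[0], dens_wide[-1])  # densities at x = -10 and x = 10
```

With the same 200 points spread over a range twice as wide, the grid spacing doubles, so the plotted curve is drawn a little more coarsely even though the underlying estimate is unchanged.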
Concept Snapshot
KDE overlay concept:
- Start with two datasets
- Compute KDE for each using gaussian_kde
- Use same x range to evaluate densities
- Plot both KDE curves on one graph
- Overlapping curves show density similarities
- Useful to compare distributions visually
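The steps in this snapshot can be wrapped into a small reusable function. The sketch below is one possible arrangement, not part of the original sample: `overlay_kdes`, its `num` and `pad` parameters, the light fill under each curve, and the seed are all hypothetical choices made for illustration.

```python
import matplotlib.pyplot as plt
import numpy as np
from scipy.stats import gaussian_kde


def overlay_kdes(datasets, labels, num=200, pad=1.0, ax=None):
    """Plot one KDE curve per dataset on a shared x grid (hypothetical helper)."""
    if ax is None:
        _, ax = plt.subplots()
    # One common grid covering every dataset, padded so the tails are visible.
    lo = min(np.min(d) for d in datasets) - pad
    hi = max(np.max(d) for d in datasets) + pad
    x = np.linspace(lo, hi, num)
    for data, label in zip(datasets, labels):
        density = gaussian_kde(data)(x)
        ax.plot(x, density, label=label)
        ax.fill_between(x, density, alpha=0.2)  # light fill makes overlap easier to see
    ax.set_xlabel("x")
    ax.set_ylabel("Density")
    ax.legend()
    return ax


rng = np.random.default_rng(0)  # assumption: seed added for repeatability
x1 = rng.normal(0, 1, 100)
x2 = rng.normal(2, 1, 100)
ax = overlay_kdes([x1, x2], ["Dataset 1", "Dataset 2"])
```

Call `plt.show()` afterwards to display the figure. Deriving the grid from the data, instead of hard-coding -4 to 6, keeps the helper usable for datasets on other scales.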
Full Transcript
This visual execution traces the KDE overlay concept. We begin by generating two datasets from normal distributions. Then, we compute KDE functions for each dataset. Next, we create a common x array to evaluate both KDEs, ensuring they align on the same axis. We evaluate the KDE functions on this x array to get density values. Finally, we plot both KDE curves on the same graph to visualize where their densities overlap. This helps us compare the shape and spread of the two datasets visually. Key moments include understanding why the same x values are used for both KDEs, what the KDE functions represent, and why the curves overlap on the plot. The quiz checks understanding of variable roles and the effect of changing the x range.