
Trend lines on scatter plots in Matplotlib - Step-by-Step Execution

Concept Flow - Trend lines on scatter plots
Start with data points
Plot scatter points
Calculate trend line coefficients
Create line points using coefficients
Plot trend line on scatter plot
Show plot
We start with data points, plot them as a scatter plot, calculate the trend line, then draw it over the scatter plot.
Execution Sample
Matplotlib
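import matplotlib.pyplot as plt
import numpy as np
# Sample data: five (x, y) points
x = np.array([1,2,3,4,5])
y = np.array([2,3,5,7,11])
# Draw the raw points
plt.scatter(x, y)
# Fit a degree-1 polynomial; coef[0] is the slope, coef[1] the intercept
coef = np.polyfit(x, y, 1)
# Draw the fitted line over the scatter points
plt.plot(x, coef[0]*x + coef[1])
plt.show()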
This code plots points and draws a straight trend line fitting those points.
Execution Table
| Step | Action | Variable/Expression | Result/Value |
| --- | --- | --- | --- |
| 1 | Create x array | x | [1 2 3 4 5] |
| 2 | Create y array | y | [2 3 5 7 11] |
| 3 | Plot scatter points | plt.scatter(x, y) | Scatter plot with 5 points |
| 4 | Calculate trend line coefficients | np.polyfit(x, y, 1) | coef = [2.2, -1.0] |
| 5 | Calculate trend line y values | coef[0]*x + coef[1] | [1.2 3.4 5.6 7.8 10. ] |
| 6 | Plot trend line | plt.plot(x, coef[0]*x + coef[1]) | Line drawn over scatter |
| 7 | Show plot | plt.show() | Scatter plot with trend line displayed |
💡 Plot displayed and execution ends
Variable Tracker
| Variable | Start | After Step 1 | After Step 2 | After Step 4 | After Step 5 | Final |
| --- | --- | --- | --- | --- | --- | --- |
| x | undefined | [1 2 3 4 5] | [1 2 3 4 5] | [1 2 3 4 5] | [1 2 3 4 5] | [1 2 3 4 5] |
| y | undefined | undefined | [2 3 5 7 11] | [2 3 5 7 11] | [2 3 5 7 11] | [2 3 5 7 11] |
| coef | undefined | undefined | undefined | [2.2, -1.0] | [2.2, -1.0] | [2.2, -1.0] |
| trend_y | undefined | undefined | undefined | undefined | [1.2 3.4 5.6 7.8 10. ] | [1.2 3.4 5.6 7.8 10. ] |
Key Moments - 3 Insights
Why does the trend line not pass through all scatter points exactly?
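The trend line is a best-fit line. np.polyfit chooses the slope and intercept that minimize the sum of squared vertical distances between the line and the points, so the line generally does not pass through every point (compare the y values in Step 2 with the trend line values in Step 5 of the Execution Table).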
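To make the gap between the points and the line concrete, the following minimal sketch computes the residuals (actual y minus trend line y) for the sample data. The names `residuals` and `sse` are illustrative and do not appear in the original code. For this data the residuals work out to about 0.8, -0.4, -0.6, -0.8 and 1.0, so none of the five points lies exactly on the line.

```python
import numpy as np

x = np.array([1, 2, 3, 4, 5])
y = np.array([2, 3, 5, 7, 11])

# Degree-1 least-squares fit: returns [slope, intercept]
slope, intercept = np.polyfit(x, y, 1)
trend_y = slope * x + intercept

# Residual = actual y minus the y predicted by the trend line
residuals = y - trend_y

# Sum of squared residuals: the quantity a least-squares fit minimizes
sse = np.sum(residuals ** 2)

print(np.round(residuals, 2))
print(round(float(sse), 2))
```

The residuals sum to zero (up to floating-point error), which is typical of a least-squares line with an intercept: the line balances the points above it against the points below it.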
What do the two numbers in coef represent?
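coef[0] is the slope (the steepness of the line) and coef[1] is the intercept (where the line crosses the y-axis), as shown in Step 4 of the Execution Table. np.polyfit returns coefficients from the highest power to the lowest, which is why the slope comes first.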
Why do we multiply coef[0] by x and add coef[1]?
This calculates y values on the trend line for each x, creating points to draw the line (Step 5 in execution_table).
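As a small sketch of the same calculation, the slope and intercept can be unpacked into named variables, and NumPy's `np.polyval` can evaluate the fitted polynomial instead of writing the formula by hand. The variable names `trend_manual`, `trend_polyval` and `same` are illustrative only.

```python
import numpy as np

x = np.array([1, 2, 3, 4, 5])
y = np.array([2, 3, 5, 7, 11])

# Coefficients are ordered from highest power to lowest: [slope, intercept]
coef = np.polyfit(x, y, 1)
slope, intercept = coef

# Manual evaluation, as in Step 5
trend_manual = slope * x + intercept

# np.polyval evaluates the polynomial described by coef at each x
trend_polyval = np.polyval(coef, x)

# Both approaches give the same y values for the trend line
same = np.allclose(trend_manual, trend_polyval)
print(same)
```

Writing the formula by hand keeps the slope and intercept visible, while `np.polyval` keeps working unchanged if the polynomial degree is raised later.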
Visual Quiz - 3 Questions
Test your understanding
Look at the execution_table at Step 4, what does coef equal?
A. [2.2, -1.0]
B. [1, 0]
C. [0, 1]
D. [5, 11]
💡 Hint
Check the 'Result/Value' column at Step 4 in execution_table.
At which step do we calculate the y values for the trend line?
A. Step 3
B. Step 5
C. Step 2
D. Step 6
💡 Hint
Look for when coef[0]*x + coef[1] is computed in execution_table.
If we change the degree in np.polyfit from 1 to 2, what changes in the execution_table?
A. Scatter points change
B. x array changes
C. coef will have 3 values instead of 2
D. Plot will not show
💡 Hint
Think about polynomial degree and number of coefficients in Step 4.
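To check the last question yourself, a quick sketch like the one below compares the number of coefficients returned for degree 1 and degree 2. The names `coef_line`, `coef_quad`, `x_smooth` and `curve_y` are illustrative. Because a degree-2 fit is a curve, this sketch evaluates it on a denser grid of x values (an assumption made here so that the plotted curve would look smooth).

```python
import numpy as np

x = np.array([1, 2, 3, 4, 5])
y = np.array([2, 3, 5, 7, 11])

coef_line = np.polyfit(x, y, 1)   # degree 1: slope and intercept
coef_quad = np.polyfit(x, y, 2)   # degree 2: one extra coefficient for x**2

# Evaluate the quadratic on 50 evenly spaced x values for a smooth curve
x_smooth = np.linspace(x.min(), x.max(), 50)
curve_y = np.polyval(coef_quad, x_smooth)

print(len(coef_line), len(coef_quad))
```

In general, a polynomial fit of degree n returns n + 1 coefficients.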
Concept Snapshot
Trend lines on scatter plots:
- Use plt.scatter(x, y) to plot points
- Use np.polyfit(x, y, 1) to get line coefficients
- Calculate line y: slope*x + intercept
- Use plt.plot(x, line_y) to draw trend line
- Shows overall data trend visually
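The steps in this snapshot can be gathered into one reusable function. This is a minimal sketch, not the only way to structure it: the function name `plot_with_trend`, the red line color and the legend labels are all choices made here for illustration. It uses Matplotlib's `plt.subplots()` interface and returns the figure so the caller decides when to display it.

```python
import matplotlib.pyplot as plt
import numpy as np


def plot_with_trend(x, y):
    """Scatter x against y and overlay a degree-1 least-squares trend line."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)

    # Fit the line: polyfit returns [slope, intercept] for degree 1
    slope, intercept = np.polyfit(x, y, 1)

    fig, ax = plt.subplots()
    ax.scatter(x, y, label="data")

    # A straight line only needs its two endpoints
    line_x = np.array([x.min(), x.max()])
    ax.plot(line_x, slope * line_x + intercept, color="red",
            label=f"y = {slope:.2f}x {intercept:+.2f}")
    ax.legend()
    return fig, ax, (slope, intercept)


fig, ax, (slope, intercept) = plot_with_trend([1, 2, 3, 4, 5], [2, 3, 5, 7, 11])
# Call plt.show() here to display the figure in an interactive session.
```

Putting the fitted equation in the legend lets a reader see the slope and intercept directly on the plot.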
Full Transcript
We start with data points x and y. We plot these points as a scatter plot using plt.scatter. Then we calculate the best-fit line coefficients using np.polyfit with degree 1, which returns the slope and the intercept. Next, we compute the y values on the trend line by multiplying the slope by x and adding the intercept. We plot this line over the scatter plot using plt.plot. Finally, we show the plot with plt.show. The trend line shows the general direction of the data but does not pass through all points exactly.