
Linear regression with np.polyfit() in NumPy

Introduction

We use linear regression to find a straight line that best fits a set of points.
This helps us understand the relationship between two things.

Typical uses:
Predicting house prices based on size.
Estimating sales based on advertising budget.
Finding the trend in temperature changes over time.
Understanding how study hours affect test scores.
Syntax
NumPy
coefficients = np.polyfit(x, y, degree)

x and y are equal-length arrays (or lists) holding the coordinates of the data points.

degree is the degree of the fitting polynomial; use 1 for a straight line.
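
To make the role of degree concrete, here is a minimal sketch showing that the returned array has degree + 1 entries. The data values are hypothetical and chosen only for illustration.

```python
import numpy as np

# Hypothetical data, chosen only for illustration
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([1.0, 2.9, 5.2, 7.1, 8.8])

# Collect the number of coefficients returned for each degree
lengths = {}
for degree in (1, 2, 3):
    coefficients = np.polyfit(x, y, degree)
    lengths[degree] = len(coefficients)

print(lengths)  # {1: 2, 2: 3, 3: 4}
```

A straight line (degree 1) therefore always yields exactly two numbers.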

Examples
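Fits a line to points (1,2), (2,4), (3,6). The result is an array [slope, intercept]; these points lie exactly on y = 2x, so the slope is 2 and the intercept is 0, up to floating-point rounding.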
NumPy
import numpy as np
coefficients = np.polyfit([1, 2, 3], [2, 4, 6], 1)
Fits a curve of degree 2 (a parabola) to the data points.
NumPy
coefficients = np.polyfit(x, y, 2)
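
The degree-2 call above leaves x and y undefined, so here is a runnable sketch of the same idea. The data is an assumption: points placed exactly on y = x^2 + 1 so the expected answer is easy to see.

```python
import numpy as np

# Assumed data: points generated from y = x^2 + 1
x = np.array([-2.0, -1.0, 0.0, 1.0, 2.0])
y = x ** 2 + 1

# Degree 2 returns three coefficients: a, b, c in a*x^2 + b*x + c
a, b, c = np.polyfit(x, y, 2)

# np.polyval evaluates the fitted polynomial at the given points
y_fit = np.polyval([a, b, c], x)

print(f"a={a:.3f}, c={c:.3f}")
```

The fitted values are approximately a = 1, b = 0, c = 1, and y_fit reproduces y up to rounding.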
Sample Program

This program finds the best fit line for test scores based on hours studied.
It prints the slope and intercept, then shows a plot with points and the line.

NumPy
import numpy as np
import matplotlib.pyplot as plt

# Sample data: hours studied vs test score
x = np.array([1, 2, 3, 4, 5])
y = np.array([2, 4, 5, 4, 5])

# Find line coefficients: slope and intercept
coefficients = np.polyfit(x, y, 1)
slope, intercept = coefficients

# Print the slope and intercept
print(f"Slope: {slope:.2f}")
print(f"Intercept: {intercept:.2f}")

# Create points for the fitted line
x_line = np.linspace(x.min(), x.max(), 100)
y_line = slope * x_line + intercept

# Plot data points and fitted line
plt.scatter(x, y, color='blue', label='Data points')
plt.plot(x_line, y_line, color='red', label='Fitted line')
plt.xlabel('Hours Studied')
plt.ylabel('Test Score')
plt.title('Linear Regression with np.polyfit()')
plt.legend()
plt.show()
Output
Slope: 0.60
Intercept: 2.20
Important Notes

np.polyfit returns coefficients starting with the highest-degree term, so for degree 1 the result is [slope, intercept].

Degree 1 means a straight line (linear regression).

Plotting helps visualize how well the line fits the data.
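
These notes can also be checked numerically. The sketch below reuses the sample program's data, compares np.polyfit with the textbook least-squares formulas for slope and intercept, and computes R^2 as a numeric complement to the plot. The cross-check and the R^2 step are additions for illustration and not part of the original program.

```python
import numpy as np

# Same data as the sample program: hours studied vs test score
x = np.array([1, 2, 3, 4, 5])
y = np.array([2, 4, 5, 4, 5])

# Highest-degree term first: index 0 is the slope, index 1 the intercept
slope, intercept = np.polyfit(x, y, 1)

# Textbook least-squares formulas, used here only as a cross-check
slope_check = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
intercept_check = y.mean() - slope_check * x.mean()

# R^2 = 1 - SS_res / SS_tot; values closer to 1 mean a tighter fit
residuals = y - (slope * x + intercept)
r_squared = 1 - np.sum(residuals ** 2) / np.sum((y - y.mean()) ** 2)

print(f"polyfit: slope={slope:.2f}, intercept={intercept:.2f}")  # slope=0.60, intercept=2.20
print(f"formula: slope={slope_check:.2f}, intercept={intercept_check:.2f}")  # slope=0.60, intercept=2.20
print(f"R^2: {r_squared:.2f}")  # 0.60
```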

Summary

Use np.polyfit(x, y, 1) to find a line that fits your data.

The output gives slope and intercept for the line equation y = slope * x + intercept.

Visualizing the fit helps understand the relationship between variables.
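
Once the line is fitted, the equation y = slope * x + intercept can be used to estimate y for a new x. A minimal sketch, assuming the sample program's data and a hypothetical new value of 6 hours studied:

```python
import numpy as np

# Same data as the sample program
x = np.array([1, 2, 3, 4, 5])
y = np.array([2, 4, 5, 4, 5])

coefficients = np.polyfit(x, y, 1)
slope, intercept = coefficients

# Hypothetical new input, not part of the original data
new_hours = 6

# Two equivalent ways to evaluate the fitted line
predicted_manual = slope * new_hours + intercept
predicted_polyval = np.polyval(coefficients, new_hours)

print(f"Predicted score: {predicted_manual:.2f}")  # 5.80
```

Note that 6 lies outside the range of the data, so this is an extrapolation and should be treated with caution.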