
What is Linear Regression in Python with sklearn

Linear regression is a method for predicting a number by fitting a straight line to data points. In Python, sklearn.linear_model.LinearRegression lets you build a model that learns the relationship between input features and a target value.

How It Works

Linear regression finds the straight line that passes as close as possible to your data points. Imagine you want to guess a child's height based on their age. Linear regression draws the line that best fits the pattern of ages and heights in your data.

This line is chosen so that the sum of the squared vertical distances between the actual points and the line is as small as possible (ordinary least squares). The model learns the slope (how steep the line is) and the intercept (where the line crosses the vertical axis) to make predictions.

Once trained, you can give the model a new age, and it will predict the height by using the line it learned.
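
To make the slope and intercept concrete, here is a minimal sketch that computes them by hand with NumPy, using the standard one-feature least-squares formulas instead of sklearn. The age and height values are small illustrative numbers, and the variable names are chosen only for this sketch.

```python
import numpy as np

# Illustrative data: ages in years, heights in cm
ages = np.array([5, 6, 7, 8, 9], dtype=float)
heights = np.array([110, 115, 120, 125, 130], dtype=float)

# One-feature ordinary least squares:
#   slope     = sum((x - mean_x) * (y - mean_y)) / sum((x - mean_x)^2)
#   intercept = mean_y - slope * mean_x
age_mean = ages.mean()
height_mean = heights.mean()
slope = np.sum((ages - age_mean) * (heights - height_mean)) / np.sum((ages - age_mean) ** 2)
intercept = height_mean - slope * age_mean

# The quantity being minimized: the sum of squared vertical distances
residuals = heights - (slope * ages + intercept)
sum_squared_error = np.sum(residuals ** 2)

# Predicting is just evaluating the line at a new age
predicted_height = slope * 10 + intercept

print(f"slope: {slope:.2f}, intercept: {intercept:.2f}")
print(f"sum of squared errors: {sum_squared_error:.2f}")
print(f"predicted height at age 10: {predicted_height:.2f} cm")
```

Because this toy data lies exactly on a line, the squared error comes out as zero. Real data will leave some error even for the best-fitting line.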


Example

This example shows how to use sklearn's LinearRegression to fit a simple dataset and predict new values.

python
from sklearn.linear_model import LinearRegression
import numpy as np

# Sample data: ages (input) and heights (target)
ages = np.array([[5], [6], [7], [8], [9]])  # 2D array for sklearn
heights = np.array([110, 115, 120, 125, 130])

# Create and train the model
model = LinearRegression()
model.fit(ages, heights)

# Predict height for a new age
new_age = np.array([[10]])
predicted_height = model.predict(new_age)

print(f"Predicted height for age 10: {predicted_height[0]:.2f} cm")
Output
Predicted height for age 10: 135.00 cm
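
After fitting, it can help to look at what the model actually learned. The sketch below refits the same data and reads the `coef_` and `intercept_` attributes, then uses `score` to report R² on the training data. The variable names are assumptions made for this illustration.

```python
from sklearn.linear_model import LinearRegression
import numpy as np

# Same sample data as the example above
ages = np.array([[5], [6], [7], [8], [9]])
heights = np.array([110, 115, 120, 125, 130])

model = LinearRegression().fit(ages, heights)

# coef_ holds one slope per input feature; intercept_ is the constant term
slope = model.coef_[0]
intercept = model.intercept_

# score() returns R^2, the share of variance in the target explained by the line
r_squared = model.score(ages, heights)

print(f"slope: {slope:.2f} cm per year")
print(f"intercept: {intercept:.2f} cm")
print(f"R^2 on training data: {r_squared:.2f}")
```

For this data the learned line is roughly height = 5 × age + 85, which is why the prediction for age 10 is 135 cm.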

When to Use

Use linear regression when you want to predict a continuous number based on one or more input features. It works best when the relationship between inputs and output is roughly a straight line.

Real-world examples include predicting house prices from size, estimating sales based on advertising spend, or projecting a temperature trend over a period in which it changes at a roughly steady rate.
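
The same class handles more than one input feature: each row of the input array is one sample and each column is one feature. Here is a minimal sketch with two features. The house data is hypothetical and was constructed to follow an exact linear rule (price = 1 × size + 10 × bedrooms + 20, in thousands) so that the learned values are easy to check.

```python
from sklearn.linear_model import LinearRegression
import numpy as np

# Hypothetical data: [size in square meters, number of bedrooms]
features = np.array([
    [50, 1],
    [70, 2],
    [80, 2],
    [100, 3],
    [120, 3],
])
# Hypothetical prices in thousands, built from: 1 * size + 10 * bedrooms + 20
prices = np.array([80, 110, 120, 150, 170])

model = LinearRegression().fit(features, prices)

# One coefficient per feature, in column order
size_coef, bedroom_coef = model.coef_
intercept = model.intercept_

# Predict the price of a 90 square meter house with 2 bedrooms
predicted_price = model.predict(np.array([[90, 2]]))[0]

print(f"size coefficient: {size_coef:.2f}")
print(f"bedroom coefficient: {bedroom_coef:.2f}")
print(f"intercept: {intercept:.2f}")
print(f"predicted price: {predicted_price:.2f} thousand")
```

Each coefficient reads as the change in predicted price when that feature increases by one unit and the other feature stays fixed.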

It's simple, fast, and interpretable, making it a great first step for many prediction problems.

Key Points

  • Linear regression models the relationship between inputs and a continuous output with a straight line.
  • It learns one coefficient (slope) per input feature, plus an intercept, to make predictions.
  • Sklearn's LinearRegression makes it easy to train and predict.
  • Best for problems where output changes linearly with inputs.
  • Simple and interpretable, but not suitable for complex nonlinear patterns.
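
The last point can be checked with a quick experiment. The sketch below fits LinearRegression to made-up data that follows a curve (y = x²) and reports R². The data and variable names are assumptions chosen to make the failure obvious.

```python
from sklearn.linear_model import LinearRegression
import numpy as np

# Made-up nonlinear data: y = x^2 for x from -3 to 3
x = np.arange(-3, 4).reshape(-1, 1)
y = (x ** 2).ravel()

model = LinearRegression().fit(x, y)

# R^2 near 1 means the line explains the data well; near 0 means it does not
r_squared = model.score(x, y)

print(f"slope: {model.coef_[0]:.2f}")
print(f"intercept: {model.intercept_:.2f}")
print(f"R^2: {r_squared:.2f}")
```

Because the curve is symmetric around zero, the best straight line is flat at the mean of y, and R² is essentially zero. A low score like this is a hint that a straight line is the wrong shape for the data.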

Key Takeaways

Linear regression predicts continuous values by fitting a straight line to data.
Use sklearn's LinearRegression to easily train and predict in Python.
Best for problems with a linear relationship between inputs and output.
It provides simple, fast, and interpretable models.
Not suitable for complex nonlinear relationships.