Polynomial regression fits curved relationships that a straight line cannot capture. A pipeline runs all the steps together, in order, so you cannot forget or misapply one of them.
Polynomial Regression Pipeline in Python
Introduction
When data points form a curve, not a straight line.
When you want to predict values that change in a non-linear way.
When you want to combine data changes and model training in one simple step.
When you want to avoid repeating data preparation steps manually.
When you want to test different curve degrees easily.
Syntax
Python

from sklearn.pipeline import Pipeline
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression

pipeline = Pipeline([
    ('poly_features', PolynomialFeatures(degree=2)),
    ('linear_regression', LinearRegression())
])

pipeline.fit(X_train, y_train)
predictions = pipeline.predict(X_test)
The pipeline runs its steps in order: first it expands the input into polynomial features, then it fits a linear model on those features.
Change degree to control curve complexity (degree=2 adds squared terms).
Examples
This pipeline fits a cubic curve (degree 3) to the data.
Python
pipeline = Pipeline([
('poly_features', PolynomialFeatures(degree=3)),
('linear_regression', LinearRegression())
])

Degree 1 means no curve, just a straight line (simple linear regression).
Python
pipeline = Pipeline([
('poly_features', PolynomialFeatures(degree=1)),
('linear_regression', LinearRegression())
])

Sample Model
This program creates curved data, fits a polynomial regression model using a pipeline, and shows how well it predicts new points.
Python

import numpy as np
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error

# Create sample data: y = 1 + 2x + 3x^2 + noise
np.random.seed(0)
X = np.linspace(-3, 3, 100).reshape(-1, 1)
y = 1 + 2 * X.flatten() + 3 * X.flatten()**2 + np.random.randn(100) * 3

# Split data into train and test
# (note: X is ordered, so this test set covers only the right-hand end of the range)
X_train, X_test = X[:80], X[80:]
y_train, y_test = y[:80], y[80:]

# Build polynomial regression pipeline with degree 2
pipeline = Pipeline([
    ('poly_features', PolynomialFeatures(degree=2)),
    ('linear_regression', LinearRegression())
])

# Train model
pipeline.fit(X_train, y_train)

# Predict on test data
predictions = pipeline.predict(X_test)

# Calculate mean squared error
mse = mean_squared_error(y_test, predictions)

# Print results
print(f"Mean Squared Error: {mse:.2f}")
print(f"Predictions: {predictions[:5]}")
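Because degree is a parameter of a named step, it can also be tuned automatically instead of edited by hand. A sketch using GridSearchCV on the same synthetic data as the sample model; the step_name__parameter double-underscore syntax is how scikit-learn addresses parameters inside a pipeline:

```python
import numpy as np
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import GridSearchCV

# Same synthetic quadratic data as the sample model
np.random.seed(0)
X = np.linspace(-3, 3, 100).reshape(-1, 1)
y = 1 + 2 * X.flatten() + 3 * X.flatten()**2 + np.random.randn(100) * 3

pipeline = Pipeline([
    ('poly_features', PolynomialFeatures()),
    ('linear_regression', LinearRegression())
])

# 'poly_features__degree' targets the degree parameter of the
# step named 'poly_features' inside the pipeline
param_grid = {'poly_features__degree': [1, 2, 3, 4]}
search = GridSearchCV(pipeline, param_grid, cv=5,
                      scoring='neg_mean_squared_error')
search.fit(X, y)

print(search.best_params_)
```

Cross-validation picks the degree with the lowest estimated test error, so you do not have to eyeball it.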
Important Notes
PolynomialFeatures adds new columns like x², x³ to help model curves.
Using a pipeline avoids mistakes by running all steps together.
Higher degree means more complex curves but can cause overfitting.
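The overfitting risk in the last note can be made concrete by comparing training and test error across degrees. A minimal sketch on synthetic quadratic data (a degree far above the true degree 2 is included on purpose):

```python
import numpy as np
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

# Synthetic quadratic data: y = 1 + 2x + 3x^2 + noise
rng = np.random.RandomState(0)
X = np.linspace(-3, 3, 60).reshape(-1, 1)
y = 1 + 2 * X.ravel() + 3 * X.ravel()**2 + rng.randn(60) * 3

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0)

results = {}
for degree in [1, 2, 10]:
    model = Pipeline([
        ('poly_features', PolynomialFeatures(degree=degree)),
        ('linear_regression', LinearRegression())
    ])
    model.fit(X_train, y_train)
    # Training error always shrinks as degree grows;
    # test error is what reveals overfitting
    train_mse = mean_squared_error(y_train, model.predict(X_train))
    test_mse = mean_squared_error(y_test, model.predict(X_test))
    results[degree] = (train_mse, test_mse)
    print(f"degree={degree}: train MSE={train_mse:.2f}, "
          f"test MSE={test_mse:.2f}")
```

Degree 1 underfits (both errors high), while a very high degree drives training error down without a matching gain on the test set.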
Summary
Polynomial regression fits curved lines to data.
Pipelines combine data changes and model training in one step.
Adjust degree to control curve complexity and fit quality.