Polynomial features in ML Python - ML Experiment: Train & Evaluate

Experiment - Polynomial features

Problem: You want to predict house prices from house size. The current model fits a straight line, so it misses the curve in the data.

Current Metrics: Training R2 score: 0.75, Validation R2 score: 0.70

Issue: The model underfits because a straight line cannot capture the non-linear relationship between house size and price.

Your Task

Improve the model by adding polynomial features to capture the curve, and raise the validation R2 score to at least 0.85.

Use polynomial features of degree 2 only.

Keep the model as a simple linear regression after adding polynomial features.

Do not change the dataset or target variable.
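
For a single input x (house size), the two modelling constraints above amount to fitting the expression sketched below. The symbols \beta_0, \beta_1 and \beta_2 are notation introduced here for illustration and do not appear in the exercise.

```latex
\hat{y} = \beta_0 + \beta_1 x + \beta_2 x^2
```

The model is still a linear regression because the prediction is linear in the coefficients. Only the inputs x and x^2 are non-linear in the original feature.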

Solution

from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import PolynomialFeatures
from sklearn.model_selection import train_test_split
from sklearn.metrics import r2_score
import numpy as np

# Sample synthetic data simulating house size and price
np.random.seed(0)
X = np.random.rand(100, 1) * 100  # House size in square meters
# Price follows a quadratic relation plus noise
y = 50 + 3 * X.flatten() + 0.5 * (X.flatten() ** 2) + np.random.randn(100) * 10

# Split data
X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.2, random_state=42)

# Create polynomial features of degree 2
poly = PolynomialFeatures(degree=2, include_bias=False)
X_train_poly = poly.fit_transform(X_train)
X_val_poly = poly.transform(X_val)

# Train linear regression on polynomial features
model = LinearRegression()
model.fit(X_train_poly, y_train)

# Predict and evaluate
train_pred = model.predict(X_train_poly)
val_pred = model.predict(X_val_poly)

train_r2 = r2_score(y_train, train_pred)
val_r2 = r2_score(y_val, val_pred)

print(f"Training R2 score: {train_r2:.2f}")
print(f"Validation R2 score: {val_r2:.2f}")

Added polynomial features of degree 2 to input data to capture non-linear relationships.

Kept the model as linear regression but trained on transformed polynomial features.

Evaluated model performance using R2 score on training and validation sets.
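
The three steps above can also be chained in one estimator. The sketch below is one possible alternative, not the exercise's official solution: it assumes scikit-learn's `make_pipeline` helper and reuses the synthetic data from the solution. The names `pipeline`, `pipeline_train_r2` and `pipeline_val_r2` are introduced here for illustration.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

# Same synthetic data as in the solution
np.random.seed(0)
X = np.random.rand(100, 1) * 100
y = 50 + 3 * X.flatten() + 0.5 * (X.flatten() ** 2) + np.random.randn(100) * 10

X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.2, random_state=42)

# The pipeline fits PolynomialFeatures on the training data only,
# then applies the same transform before every predict call.
pipeline = make_pipeline(
    PolynomialFeatures(degree=2, include_bias=False),
    LinearRegression(),
)
pipeline.fit(X_train, y_train)

pipeline_train_r2 = r2_score(y_train, pipeline.predict(X_train))
pipeline_val_r2 = r2_score(y_val, pipeline.predict(X_val))
print(f"Training R2 score: {pipeline_train_r2:.2f}")
print(f"Validation R2 score: {pipeline_val_r2:.2f}")
```

Bundling the transform with the model removes the risk of calling `fit_transform` on the validation set by mistake, since the pipeline only ever fits on what is passed to `fit`.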

Results Interpretation

Before: Training R2 = 0.75, Validation R2 = 0.70

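After: Training R2 = 0.95, Validation R2 = 0.88

These are the exercise's reported metrics. The synthetic sample in the solution code has very little noise relative to its quadratic trend, so running that code as written prints scores close to 1.00.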

Adding polynomial features helps the model learn curved relationships in data, reducing underfitting and improving prediction accuracy.
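
To see the underfitting effect in isolation, the sketch below compares a straight-line fit with a degree-2 fit on a hypothetical toy dataset (a symmetric parabola plus noise). This is not the exercise dataset, and the scores are computed in-sample for brevity.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
from sklearn.preprocessing import PolynomialFeatures

# Hypothetical toy data (not the exercise dataset): y = x^2 plus noise
rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(200, 1))
y = X.ravel() ** 2 + rng.normal(scale=0.5, size=200)

# Baseline: a straight line cannot follow the curve
linear_model = LinearRegression().fit(X, y)
linear_r2 = r2_score(y, linear_model.predict(X))

# Same estimator, but trained on the columns [x, x^2]
X_poly = PolynomialFeatures(degree=2, include_bias=False).fit_transform(X)
poly_model = LinearRegression().fit(X_poly, y)
poly_r2 = r2_score(y, poly_model.predict(X_poly))

print(f"Linear R2:   {linear_r2:.2f}")
print(f"Degree-2 R2: {poly_r2:.2f}")
```

The estimator is identical in both fits. Only the feature matrix changes, which is why the gain can be attributed to the added squared term.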

Bonus Experiment

Try polynomial features of degree 3 and observe if the validation score improves or if the model starts to overfit.

💡 Hint

Higher degree polynomials can fit training data better but may cause overfitting. Use validation scores to check.
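
One way to run this check is to loop over degrees and record both scores. The sketch below assumes the synthetic data from the solution. The `scores` dictionary is a name introduced here for illustration.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import PolynomialFeatures

# Same synthetic data as in the solution
np.random.seed(0)
X = np.random.rand(100, 1) * 100
y = 50 + 3 * X.flatten() + 0.5 * (X.flatten() ** 2) + np.random.randn(100) * 10
X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.2, random_state=42)

# Map each degree to its (training R2, validation R2) pair
scores = {}
for degree in (1, 2, 3):
    poly = PolynomialFeatures(degree=degree, include_bias=False)
    X_train_poly = poly.fit_transform(X_train)
    X_val_poly = poly.transform(X_val)
    model = LinearRegression().fit(X_train_poly, y_train)
    scores[degree] = (
        r2_score(y_train, model.predict(X_train_poly)),
        r2_score(y_val, model.predict(X_val_poly)),
    )

for degree, (train_r2, val_r2) in scores.items():
    print(f"degree={degree}: train R2={train_r2:.4f}, val R2={val_r2:.4f}")
```

Training R2 can only stay level or rise as the degree grows, because each model contains the previous one as a special case. A validation score that stalls or drops while the training score keeps rising is the sign of overfitting.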

Practice

(1/5)

1. What is the main purpose of using PolynomialFeatures in machine learning?

easy

A. To create new features by adding powers and combinations of existing features

B. To reduce the number of features in the dataset

C. To normalize the data between 0 and 1

D. To split the dataset into training and testing sets

Solution

Step 1: Understand the role of PolynomialFeatures

Step 2: Compare with other options

Final Answer: A. To create new features by adding powers and combinations of existing features

Quick Check:

Solution

Step 1: Check the correct import statement

Step 2: Verify the degree parameter

Final Answer:

Quick Check:

Solution

Step 1: Understand PolynomialFeatures output with degree=2 and include_bias=False

Step 2: Calculate values for X = [2, 3]

Final Answer:

Quick Check:
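
The question behind these steps is not shown here, so the shape of X = [2, 3] is an assumption. The sketch below works through both plausible readings with `degree=2` and `include_bias=False`.

```python
import numpy as np
from sklearn.preprocessing import PolynomialFeatures

poly = PolynomialFeatures(degree=2, include_bias=False)

# Reading 1: two samples with one feature each -> columns [x, x^2]
one_feature = poly.fit_transform(np.array([[2], [3]]))
print(one_feature)

# Reading 2: one sample with two features -> columns [a, b, a^2, a*b, b^2]
two_features = poly.fit_transform(np.array([[2, 3]]))
print(two_features)
```

With `include_bias=False` neither result contains the constant column of ones.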

Solution

Step 1: Check input type compatibility

Step 2: Verify degree parameter and imports

Final Answer:

Quick Check:
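
A common input-type problem with `PolynomialFeatures` is passing a 1D array where a 2D array is expected. The sketch below assumes that is the kind of error the steps above refer to. The `sizes` values are hypothetical.

```python
import numpy as np
from sklearn.preprocessing import PolynomialFeatures

sizes = np.array([50.0, 80.0, 120.0])  # 1D array, shape (3,)
poly = PolynomialFeatures(degree=2, include_bias=False)

try:
    poly.fit_transform(sizes)
    raised = False
except ValueError:
    # scikit-learn expects a 2D array of shape (n_samples, n_features)
    raised = True

# Reshape to a single-column 2D array before transforming
sizes_2d = sizes.reshape(-1, 1)
sizes_poly = poly.fit_transform(sizes_2d)
print(raised, sizes_poly.shape)
```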

Solution

Step 1: Use formula for number of polynomial features

Step 2: Calculate combinations

Final Answer:

Quick Check:
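
The count of generated features follows from counting monomials: with n input features and degree d there are C(n + d, d) terms including the constant, and one fewer without it. The sketch below cross-checks that formula against scikit-learn. `n_poly_features` is a hypothetical helper written for this illustration, and the choice of 3 inputs is an assumption since the original question is not shown.

```python
from math import comb

import numpy as np
from sklearn.preprocessing import PolynomialFeatures


def n_poly_features(n_features: int, degree: int, include_bias: bool = True) -> int:
    """Count monomials of total degree <= degree in n_features variables."""
    total = comb(n_features + degree, degree)  # includes the constant term
    return total if include_bias else total - 1


# Cross-check the formula against scikit-learn for 3 inputs at degree 2
poly = PolynomialFeatures(degree=2, include_bias=True).fit(np.zeros((1, 3)))
print(n_poly_features(3, 2), poly.n_output_features_)
```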