
Why advanced regression handles non-linearity in ML Python - Why Metrics Matter

Metrics & Evaluation - Why advanced regression handles non-linearity
Which metric matters and WHY

For regression tasks, common metrics are Mean Squared Error (MSE), Root Mean Squared Error (RMSE), and R-squared (R²). These metrics measure how close the predicted values are to the actual values. When handling non-linearity, these metrics help us see if the model captures complex patterns better than simple linear regression.
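For reference, the three metrics can be written as below. These are the standard textbook definitions, not something specific to this page. The symbols are assumptions for illustration: y_i is an actual value, \hat{y}_i the matching prediction, \bar{y} the mean of the actual values, and n the number of samples.

```latex
\mathrm{MSE} = \frac{1}{n}\sum_{i=1}^{n} (y_i - \hat{y}_i)^2,
\qquad
\mathrm{RMSE} = \sqrt{\mathrm{MSE}},
\qquad
R^2 = 1 - \frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2}
```

RMSE is in the same units as the target, which often makes it easier to interpret than MSE.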

Confusion matrix or equivalent visualization

Regression does not use a confusion matrix. Instead, we look at error values. For example, if actual values are [3, 5, 7] and predicted are [2.8, 5.1, 6.9], errors are small, showing good fit. A simple table of actual vs predicted helps visualize this:

Actual:    3.0   5.0   7.0
Predicted: 2.8   5.1   6.9
Error:     0.2  -0.1   0.1
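As a minimal sketch, the metrics for the three values in the table above could be computed with scikit-learn as follows. The variable names are illustrative only.

```python
import numpy as np
from sklearn.metrics import mean_squared_error, r2_score

# Values taken from the actual vs predicted table above
actual = np.array([3.0, 5.0, 7.0])
predicted = np.array([2.8, 5.1, 6.9])

errors = actual - predicted                   # per-sample error (actual - predicted)
mse = mean_squared_error(actual, predicted)   # mean of squared errors
rmse = np.sqrt(mse)                           # same units as the target
r2 = r2_score(actual, predicted)              # share of variance explained

print("Errors:", np.round(errors, 2))
print(f"MSE:  {mse:.4f}")   # 0.0200
print(f"RMSE: {rmse:.4f}")  # 0.1414
print(f"R2:   {r2:.4f}")    # 0.9925
```

All three numbers agree that the fit is close: the typical error is about 0.14 on targets that range from 3 to 7.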
Precision vs Recall tradeoff (or equivalent)

In regression, the tradeoff is between bias and variance. Simple linear regression has high bias and low variance, so it misses non-linear patterns (underfitting). Advanced regression methods (like polynomial regression, decision trees, or kernel methods) reduce bias by fitting curves but can increase variance (overfitting). The goal is to balance this to capture non-linearity without fitting noise.

Example: Predicting house prices that rise sharply after a certain size. Linear regression misses this curve. An advanced regression model fits it better but may overfit if it is too complex.
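The tradeoff can be illustrated with a small experiment. The sketch below uses synthetic data (a noisy quadratic, chosen purely for illustration) and compares polynomial degrees 1, 2, and 15. The degrees, noise level, and seed are all assumptions, and the exact scores will change with them.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

# Synthetic curved data: y = x^2 plus noise (an assumption for this demo)
rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(200, 1))
y = X[:, 0] ** 2 + rng.normal(scale=0.5, size=200)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)

results = {}
for degree in (1, 2, 15):
    # degree=1 is plain linear regression; higher degrees add curvature
    model = make_pipeline(PolynomialFeatures(degree=degree), LinearRegression())
    model.fit(X_train, y_train)
    train_r2 = r2_score(y_train, model.predict(X_train))
    test_r2 = r2_score(y_test, model.predict(X_test))
    results[degree] = (train_r2, test_r2)
    print(f"degree={degree:2d}  train R2={train_r2:.3f}  test R2={test_r2:.3f}")
```

Degree 1 underfits because a straight line cannot follow the curve. Degree 2 matches the shape of the data. A very high degree has more flexibility than the data needs, so any gap between its train and test scores is worth watching.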

What "good" vs "bad" metric values look like

Good regression model:

  • Low MSE or RMSE (errors close to zero)
  • High R² (close to 1), meaning predictions explain most of the variation

Bad regression model:

  • High MSE or RMSE (large errors)
  • Low or negative R². A value near 0 means the model does little better than always predicting the average; a negative value means it does worse

On data with genuinely non-linear structure, advanced regression models usually score better on these metrics than simple linear models, provided the comparison is made on held-out data.
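To make the R² scale concrete, here is a small sketch with made-up numbers showing a score of exactly 0 (always predicting the average), a negative score (worse than the average), and a score near 1.

```python
import numpy as np
from sklearn.metrics import r2_score

y_true = np.array([1.0, 2.0, 3.0])

# Always predicting the average of y_true gives R2 = 0
r2_mean = r2_score(y_true, np.full(3, y_true.mean()))

# A constant prediction far from the average is worse than the average
r2_bad = r2_score(y_true, np.array([3.0, 3.0, 3.0]))

# Predictions close to the actual values give R2 near 1
r2_good = r2_score(y_true, np.array([1.1, 1.9, 3.2]))

print(f"mean baseline: {r2_mean:.2f}")  # 0.00
print(f"bad model:     {r2_bad:.2f}")   # -1.50
print(f"good model:    {r2_good:.2f}")  # 0.97
```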

Common pitfalls in metrics
  • Ignoring non-linearity: Using linear regression on non-linear data leads to poor fit and misleading metrics.
  • Overfitting: Advanced models may fit training data perfectly but fail on new data, causing low test performance.
  • Data leakage: Using future or target information in training inflates metrics falsely.
  • Relying on a single metric: Always check multiple metrics and visualize predictions to understand model behavior.
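One way to guard against several of these pitfalls at once is to wrap preprocessing and the model in a pipeline and score it with cross-validation on more than one metric. The sketch below is one possible setup on synthetic data. The degree, fold count, and data-generating function are assumptions, not recommendations.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures, StandardScaler

# Synthetic non-linear data (illustrative only)
rng = np.random.default_rng(42)
X = rng.uniform(0, 5, size=(150, 1))
y = 3 * np.sin(X[:, 0]) + rng.normal(scale=0.3, size=150)

# Keeping the transformers inside the pipeline means they are refit on each
# training fold, so no statistics from the validation fold leak into training
pipeline = make_pipeline(
    PolynomialFeatures(degree=3, include_bias=False),
    StandardScaler(),
    LinearRegression(),
)

# Check more than one metric; scikit-learn reports RMSE negated, so flip the sign
r2_scores = cross_val_score(pipeline, X, y, cv=5, scoring="r2")
rmse_scores = -cross_val_score(
    pipeline, X, y, cv=5, scoring="neg_root_mean_squared_error"
)

print("R2 per fold:  ", np.round(r2_scores, 3))
print("RMSE per fold:", np.round(rmse_scores, 3))
```

A large spread across folds is itself a warning sign that the model is sensitive to which samples it sees.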
Self-check question

Your advanced regression model has an R² of 0.95 on training data but only 0.60 on test data. Is it good at handling non-linearity? Why or why not?

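Answer: Partly. The high training R² shows the model is flexible enough to fit the non-linear pattern, but the drop to 0.60 on test data indicates overfitting: some of what it learned is noise. It needs better generalization, for example through regularization, a simpler model, or more training data.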
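A train/test gap like the one in this question can be reproduced with a small sketch. It uses an unconstrained decision tree on noisy synthetic data, with a depth-limited tree for comparison. The data, noise level, and depth of 3 are assumptions, so the scores will not match 0.95 and 0.60.

```python
import numpy as np
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor

# Noisy quadratic data (illustrative only)
rng = np.random.default_rng(1)
X = rng.uniform(-3, 3, size=(200, 1))
y = X[:, 0] ** 2 + rng.normal(scale=2.0, size=200)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=1
)

scores = {}
for depth in (None, 3):
    # max_depth=None lets the tree grow until it memorizes the training set
    tree = DecisionTreeRegressor(max_depth=depth, random_state=0)
    tree.fit(X_train, y_train)
    scores[depth] = (
        r2_score(y_train, tree.predict(X_train)),
        r2_score(y_test, tree.predict(X_test)),
    )
    print(f"max_depth={depth}: train R2={scores[depth][0]:.2f}, "
          f"test R2={scores[depth][1]:.2f}")
```

The unconstrained tree scores almost perfectly on training data because it memorizes the noise. Comparing its test score with the shallower tree's shows whether limiting complexity helps generalization on this data.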

Key Result
Advanced regression models improve fit on non-linear data by reducing bias but must balance variance to avoid overfitting.

Practice

(1/5)
1. Why do advanced regression models handle non-linearity better than simple linear regression?
easy
A. Because they only use one feature at a time
B. Because they ignore data points that don't fit a line
C. Because they can model complex curved relationships in data
D. Because they always use fewer data points

Solution

  1. Step 1: Understand simple linear regression limits

    Simple linear regression fits a straight line, so it cannot capture curves or bends in data.
  2. Step 2: Recognize advanced regression capabilities

    Advanced regression models like decision trees or polynomial regression can fit curves and complex patterns.
  3. Final Answer:

    Because they can model complex curved relationships in data -> Option C
  4. Quick Check:

    Advanced regression models handle curves [OK]
Hint: Advanced regression fits curves, not just straight lines [OK]
Common Mistakes:
  • Thinking advanced regression ignores data points
  • Believing advanced regression uses fewer data points
  • Assuming advanced regression only uses one feature
2. Which of the following correctly creates the polynomial features needed for a polynomial regression model in Python using scikit-learn?
easy
A. from sklearn.preprocessing import PolynomialFeatures; poly = PolynomialFeatures(degree=2); X_poly = poly.fit_transform(X)
B. from sklearn.tree import DecisionTreeRegressor; model = DecisionTreeRegressor(); model.fit(X, y)
C. from sklearn.cluster import KMeans; model = KMeans(); model.fit(X)
D. from sklearn.linear_model import LinearRegression; model = LinearRegression(); model.fit(X_poly, y)

Solution

  1. Step 1: Identify polynomial feature creation

    Polynomial regression requires transforming features using PolynomialFeatures to add powers of features.
  2. Step 2: Recognize correct syntax for polynomial transformation

    from sklearn.preprocessing import PolynomialFeatures; poly = PolynomialFeatures(degree=2); X_poly = poly.fit_transform(X) correctly imports PolynomialFeatures and transforms X to X_poly for regression.
  3. Final Answer:

    from sklearn.preprocessing import PolynomialFeatures; poly = PolynomialFeatures(degree=2); X_poly = poly.fit_transform(X) -> Option A
  4. Quick Check:

    PolynomialFeatures creates polynomial features [OK]
Hint: Polynomial regression needs PolynomialFeatures to transform data [OK]
Common Mistakes:
  • Confusing decision tree with polynomial regression
  • Using clustering models for regression tasks
  • Not transforming features before fitting polynomial regression
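The feature transformation is only half of polynomial regression; a linear model still has to be fitted on the transformed features. A minimal end-to-end sketch chains both steps with make_pipeline. The toy data here (squares of 1 to 5) is an assumption for illustration.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

# Toy data following y = x^2
X = np.array([[1], [2], [3], [4], [5]])
y = np.array([1, 4, 9, 16, 25])

# The pipeline applies PolynomialFeatures, then fits LinearRegression on the result
poly_model = make_pipeline(PolynomialFeatures(degree=2), LinearRegression())
poly_model.fit(X, y)

# The same transformation is applied automatically at prediction time
prediction = poly_model.predict([[6]])
print(np.round(prediction, 2))  # [36.]
```

Using a pipeline avoids forgetting to transform new inputs before calling predict.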
3. Given the code below, what will be the output of print(predictions)?
from sklearn.tree import DecisionTreeRegressor
X = [[1], [2], [3], [4], [5]]
y = [1, 4, 9, 16, 25]
model = DecisionTreeRegressor()
model.fit(X, y)
predictions = model.predict([[6]])
print(predictions)
medium
A. [16]
B. [36]
C. [9]
D. [25]

Solution

  1. Step 1: Understand decision tree prediction behavior

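    A decision tree predicts the value stored in the leaf that the input falls into (the mean of the training targets in that leaf). It cannot extrapolate beyond the range of the training data.
  2. Step 2: Check training data and prediction input

    Input 6 is beyond the training maximum of 5, so it lands in the same leaf as the largest training input, 5, whose target is 25. scikit-learn returns this as a float array, so the literal output is [25.].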
  3. Final Answer:

    [25] -> Option D
  4. Quick Check:

    Decision tree predicts closest leaf value = 25 [OK]
Hint: Decision trees do not extrapolate; predict closest known value [OK]
Common Mistakes:
  • Assuming decision tree extrapolates like polynomial regression
  • Expecting exact square of 6 (36) as output
  • Confusing prediction with training labels
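The lack of extrapolation can be checked directly. This sketch reuses the data from the question and asks the tree about inputs well outside the training range. The extra query points are chosen here for illustration.

```python
from sklearn.tree import DecisionTreeRegressor

X = [[1], [2], [3], [4], [5]]
y = [1, 4, 9, 16, 25]

tree = DecisionTreeRegressor(random_state=0)
tree.fit(X, y)

# Every input above the largest training value falls into the same leaf
far_predictions = tree.predict([[6], [10], [100]])
print(far_predictions)  # [25. 25. 25.]

# The same happens below the smallest training value
low_predictions = tree.predict([[0], [-50]])
print(low_predictions)  # [1. 1.]
```

The predictions stay flat outside the training range, which is why tree-based models are a poor choice when the task requires projecting a trend beyond the observed data.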
4. The following code tries to fit a polynomial regression but gives an error. What is the mistake?
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import PolynomialFeatures
X = [[1], [2], [3], [4]]
y = [1, 4, 9, 16]
model = LinearRegression()
X_poly = PolynomialFeatures(degree=2)
model.fit(X_poly, y)
medium
A. LinearRegression cannot fit polynomial data
B. X_poly is a class, not transformed data; need to call fit_transform on X
C. y should be a 2D array, not 1D
D. Degree should be 1 for polynomial features

Solution

  1. Step 1: Identify how PolynomialFeatures is used

    PolynomialFeatures is a transformer class; it needs to be applied to X using fit_transform to create polynomial features.
  2. Step 2: Spot the error in code

    The code assigns the PolynomialFeatures transformer object itself to X_poly rather than the transformed data. model.fit expects a numeric array, not a transformer object.
  3. Final Answer:

    X_poly is a class, not transformed data; need to call fit_transform on X -> Option B
  4. Quick Check:

    Call fit_transform on X before fitting model [OK]
Hint: Call fit_transform on X before fitting model [OK]
Common Mistakes:
  • Passing transformer class instead of transformed data
  • Thinking LinearRegression can't fit polynomial features
  • Misunderstanding y shape requirements
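For comparison, a corrected version of the snippet might look like the sketch below. The variable name poly and the prediction for a new input of 5 are additions for illustration.

```python
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import PolynomialFeatures

X = [[1], [2], [3], [4]]
y = [1, 4, 9, 16]

# Keep the transformer in its own variable, then transform X
poly = PolynomialFeatures(degree=2)
X_poly = poly.fit_transform(X)  # columns: 1, x, x^2

model = LinearRegression()
model.fit(X_poly, y)

# New inputs must go through the same fitted transformer (transform, not fit_transform)
X_new_poly = poly.transform([[5]])
prediction = model.predict(X_new_poly)
print(X_poly.shape)              # (4, 3)
print(round(prediction[0], 2))   # 25.0
```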
5. You have a dataset where the target variable changes in a complex curve with two features. Which approach best handles this non-linearity and why?
hard
A. Polynomial regression of degree 3 can model complex curves with multiple features
B. Simple linear regression will miss the curve
C. Decision tree with max depth 2 is too shallow to capture complexity
D. Dropping features reduces information and won't help non-linearity

Solution

  1. Step 1: Analyze model capabilities for non-linearity

    Simple linear regression cannot model curves; decision tree with low depth may underfit; dropping features loses info.
  2. Step 2: Evaluate polynomial regression for multiple features

    Polynomial regression with degree 3 creates interaction and power terms, capturing complex curves in multiple features.
  3. Final Answer:

    Polynomial regression of degree 3 can model complex curves with multiple features -> Option A
  4. Quick Check:

    Degree 3 polynomial regression models complex curves [OK]
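Hint: Polynomial regression adds power and interaction terms, so it can model curves across multiple features; check test metrics, since higher degrees also raise the risk of overfitting [OK]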
Common Mistakes:
  • Choosing shallow decision trees that underfit
  • Dropping features, which discards information the model needs
  • Using simple linear regression for curved data