Polynomial fitting in NumPy - Time & Space Complexity
When fitting a polynomial to data, we want to know how the time to find the best curve changes as we add more data points or increase the polynomial degree.
We ask: How does the work grow when the input size or polynomial degree grows?
Analyze the time complexity of the following code snippet.
```python
import numpy as np

n = 100                                       # number of data points
x = np.linspace(0, 10, n)                     # n evenly spaced sample points
y = np.sin(x) + np.random.normal(0, 0.1, n)   # noisy signal to fit

d = 3                                         # polynomial degree
coeffs = np.polyfit(x, y, d)                  # fit a degree-d polynomial
```
This code fits a polynomial of degree d to n data points using NumPy's np.polyfit.
- Primary operation: Constructing and solving a system of equations to find polynomial coefficients.
- How many times: The size of the system depends on the polynomial degree d, and all n data points are used to build it.
As the number of data points n grows, the cost of building the system grows linearly. The cost of solving it depends mainly on the polynomial degree d.
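Internally, np.polyfit builds a Vandermonde matrix and solves a linear least-squares problem. A minimal sketch of the equivalent computation, using np.vander and np.linalg.lstsq (roughly what polyfit does, minus its internal column scaling):

```python
import numpy as np

n, d = 100, 3
x = np.linspace(0, 10, n)
y = np.sin(x) + np.random.normal(0, 0.1, n)

# Building the n x (d+1) Vandermonde matrix costs O(n*d) work.
A = np.vander(x, d + 1)                        # columns: x^d, ..., x^1, x^0

# The least-squares solve on an n x (d+1) matrix costs O(n*d^2 + d^3).
coeffs, *_ = np.linalg.lstsq(A, y, rcond=None)
print(A.shape)       # (100, 4): only d+1 unknowns, no matter how large n is
```

Note that the number of unknowns is d + 1 regardless of n, which is why the expensive factorization step depends on the degree, not on the amount of data.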
| Input Size (n) | Polynomial Degree (d) | Approx. Operations |
|---|---|---|
| 10 | 3 | About 10 (data) + solving 4x4 system |
| 100 | 3 | About 100 (data) + solving 4x4 system |
| 1000 | 3 | About 1000 (data) + solving 4x4 system |
Pattern observation: Increasing n adds work linearly, while the least-squares solve contributes roughly d^2 work per data point plus a d^3 term that depends only on the degree.
Time Complexity: O(n d^2 + d^3)
This means the time grows linearly with the number of data points n (scaled by the square of the degree), plus a d^3 term for solving the small final system.
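You can see the degree term in isolation by holding n fixed and raising d. A rough timing sketch (exact numbers depend on your machine; the interval [-1, 1] is used because the monomial basis stays better conditioned there):

```python
import time
import numpy as np

n = 2_000
x = np.linspace(-1, 1, n)        # fixed number of data points
y = np.sin(3 * x)

times = {}
for d in (5, 10, 20):
    t0 = time.perf_counter()
    np.polyfit(x, y, d)
    times[d] = time.perf_counter() - t0

for d, t in times.items():
    print(f"d={d:>2}: {t:.5f}s")
```

Even though n never changes, the cost rises with d: the least-squares system being solved gets wider and its factorization more expensive.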
[X] Wrong: "The time only depends on the number of data points n."
[OK] Correct: The polynomial degree d affects the size of the system to solve, which can be costly even if n is small.
Understanding how polynomial fitting scales helps you explain trade-offs in data modeling and algorithm choices clearly and confidently.
"What if we used a fixed small polynomial degree but increased the number of data points greatly? How would the time complexity change?"