Non-linear curve fitting in SciPy - Time & Space Complexity
When fitting a curve that is not a straight line, the computer tries many guesses to find the best fit.
We want to know how the time needed grows as we give more data points.
Analyze the time complexity of the following code snippet.
import numpy as np
from scipy.optimize import curve_fit
def model_func(x, a, b):
return a * np.exp(b * x)
xdata = np.linspace(0, 4, 50)
ydata = model_func(xdata, 2.5, 1.3) + 0.2 * np.random.normal(size=xdata.size)
popt, pcov = curve_fit(model_func, xdata, ydata)
This code fits an exponential curve to data points by trying to find the best parameters.
Identify the loops, recursion, array traversals that repeat.
- Primary operation: Repeated evaluation of the model function and adjustment of parameters.
- How many times: Many iterations until the best fit is found, depending on data size and convergence.
As the number of data points increases, the fitting process takes longer because it must consider more points each time it tests parameters.
| Input Size (n) | Approx. Operations |
|---|---|
| 10 | Low number of function evaluations |
| 100 | About 10 times more evaluations |
| 1000 | About 100 times more evaluations |
Pattern observation: The time grows roughly linearly with the number of data points multiplied by the number of iterations, which may depend on convergence.
Time Complexity: O(n × k), where n is the number of data points and k is the number of iterations until convergence.
This means if you double the data points, the time to fit the curve roughly doubles, assuming the number of iterations stays constant.
[X] Wrong: "The fitting time grows linearly with the number of data points because it just looks at each point once."
[OK] Correct: The fitting process repeats many times, each time using all data points, so the total work grows with both the number of points and the number of iterations.
Understanding how curve fitting time grows helps you explain performance in real data analysis tasks and shows you can think about algorithm costs clearly.
"What if we used a simpler linear model instead of a non-linear one? How would the time complexity change?"