Fitting custom models in SciPy - Time & Space Complexity
When fitting custom models with SciPy, it is important to understand how the fitting time grows as the amount of data increases.
We want to know how the number of calculations changes when we fit models to more data points.
Analyze the time complexity of the following code snippet.
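import numpy as np
from scipy.optimize import curve_fit

# Exponential model: y = a * exp(b * x)
def model_func(x, a, b):
    return a * np.exp(b * x)

# 50 evenly spaced points on the true curve, plus Gaussian noise
xdata = np.linspace(0, 4, 50)
ydata = model_func(xdata, 2.5, 1.3) + 0.2 * np.random.normal(size=xdata.size)

# popt holds the fitted (a, b); pcov is their estimated covariance
popt, pcov = curve_fit(model_func, xdata, ydata)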
This code fits a custom exponential model to 50 noisy data points using SciPy's curve_fit function, which performs nonlinear least-squares optimization.
Identify the loops, recursion, and array traversals that repeat.
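- Primary operation: Evaluating the model function on the whole data array, once per trial set of parameters.
- How many times: The optimizer calls the model function repeatedly until it converges, often tens to a few hundred times. The count depends on the starting guess and on the number of parameters (the Jacobian is estimated by finite differences unless one is supplied), not directly on the number of data points.
As the number of data points increases, each of those calls has to evaluate the model on every point, so the cost of a single call grows with n.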
| Input Size (n) | Approx. Operations |
|---|---|
| 10 | Few hundred evaluations × 10 points ≈ thousands |
| 100 | Few hundred evaluations × 100 points ≈ tens of thousands |
| 1000 | Few hundred evaluations × 1000 points ≈ hundreds of thousands |
Pattern observation: The total work grows roughly linearly with the number of data points.
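One way to check this pattern is to count how often the optimizer actually calls the model. The sketch below is illustrative only: the helper name count_model_calls, the seed, and the chosen sizes are assumptions, not part of the original snippet. It wraps the same exponential model in a closure that increments a counter on every call.

```python
import numpy as np
from scipy.optimize import curve_fit


def count_model_calls(n, seed=0):
    """Fit the exponential model to n points and count model evaluations.

    Hypothetical helper for illustration; the seed and noise level are
    assumptions chosen to mirror the original snippet.
    """
    rng = np.random.default_rng(seed)
    calls = 0

    # Explicit (x, a, b) signature so curve_fit can infer two parameters.
    def model_func(x, a, b):
        nonlocal calls
        calls += 1  # one call evaluates the model on all n points
        return a * np.exp(b * x)

    xdata = np.linspace(0, 4, n)
    ydata = 2.5 * np.exp(1.3 * xdata) + 0.2 * rng.normal(size=n)
    popt, _ = curve_fit(model_func, xdata, ydata)
    return calls, popt


call_counts = {}
for n in (50, 500, 5000):
    calls, popt = count_model_calls(n)
    call_counts[n] = calls
    print(f"n={n:5d}  model calls={calls:4d}  points touched={calls * n}")
```

If the call counts stay in a similar range across the three sizes, the growth in "points touched" comes almost entirely from n, which is what the linear pattern predicts. The exact counts depend on the SciPy version and on the noise, so treat them as indicative.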
Time Complexity: O(n)
This means the time to fit the model grows roughly in direct proportion to the number of data points, provided the number of parameters and the number of optimizer iterations stay about the same.
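The linear trend can also be observed empirically by timing fits at several data sizes. This is a minimal sketch, not a rigorous benchmark: the helper time_fit, the sizes, and the repeat count are assumptions made for illustration, and wall-clock numbers will vary by machine.

```python
import time

import numpy as np
from scipy.optimize import curve_fit


def model_func(x, a, b):
    return a * np.exp(b * x)


def time_fit(n, repeats=3, seed=0):
    """Return the best-of-`repeats` wall-clock time (seconds) for one fit.

    Hypothetical helper for illustration only.
    """
    rng = np.random.default_rng(seed)
    xdata = np.linspace(0, 4, n)
    ydata = model_func(xdata, 2.5, 1.3) + 0.2 * rng.normal(size=n)

    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        curve_fit(model_func, xdata, ydata)
        best = min(best, time.perf_counter() - start)
    return best


# Each size is 10x the previous one, so a roughly linear cost would show up
# as roughly 10x longer fits once n is large enough.
timings = {n: time_fit(n) for n in (1_000, 10_000, 100_000)}
for n, seconds in timings.items():
    print(f"n={n:7d}  fit time={seconds * 1e3:9.2f} ms")
```

For small n the fixed overhead of curve_fit tends to dominate, so the timings may look almost flat at first; the proportional growth usually becomes visible only at the larger sizes.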
[X] Wrong: "The fitting time depends only on the number of parameters, not the data size."
[OK] Correct: Each fitting step evaluates the model on all data points, so more data means more work even if the number of parameters stays the same.
Understanding how fitting time grows helps you explain performance in real projects and shows you can reason about algorithm costs beyond just writing code.
"What if the model function was very complex and took longer to compute per point? How would that affect the overall time complexity?"