0
0
Data Analysis Pythondata~5 mins

Polynomial features in Data Analysis Python - Time & Space Complexity

Choose your learning style9 modes available
Time Complexity: Polynomial features
O(n * m^2)
Understanding Time Complexity

We want to understand how the time to create polynomial features grows as the input data size and feature count increase.

How does the work change when we add more data or more features?

Scenario Under Consideration

Analyze the time complexity of the following code snippet.

from sklearn.preprocessing import PolynomialFeatures

poly = PolynomialFeatures(degree=2, include_bias=False)
X_poly = poly.fit_transform(X)

This code creates new features by combining original features up to degree 2 for each data point.

Identify Repeating Operations

Identify the loops, recursion, array traversals that repeat.

  • Primary operation: Generating all combinations of features up to degree 2 for each data point.
  • How many times: For each of the n data points, the code creates about m + m(m+1)/2 new features, where m is the number of original features.
How Execution Grows With Input

As the number of data points (n) grows, the work grows linearly because each point is processed once. But as the number of features (m) grows, the number of new features grows roughly with the square of m.

Input Size (n)Approx. Operations (for fixed m=10)
10~10 * (10 + 55) = 650
100~100 * 65 = 6,500
1000~1000 * 65 = 65,000

Pattern observation: Operations grow linearly with data points but quadratically with features.

Final Time Complexity

Time Complexity: O(n * m^2)

This means the time grows linearly with the number of data points and roughly with the square of the number of features.

Common Mistake

[X] Wrong: "The time only depends on the number of data points, not the number of features."

[OK] Correct: Because polynomial features combine original features, the number of new features grows quickly with more original features, increasing the work.

Interview Connect

Understanding how feature transformations affect time helps you explain trade-offs clearly and shows you can think about scaling data processing steps.

Self-Check

"What if we increased the polynomial degree from 2 to 3? How would the time complexity change?"