Data Analysis Python · ~5 mins

Scaling and normalization concepts in Data Analysis Python - Time & Space Complexity

Time Complexity: Scaling and normalization concepts
O(n)
Understanding Time Complexity

When we scale or normalize data, we transform its values onto a common range or scale.

We want to know how the time to do this changes as the data size grows.

Scenario Under Consideration

Analyze the time complexity of the following code snippet.


import numpy as np

def min_max_scale(data):
    min_val = np.min(data)   # one pass over the array
    max_val = np.max(data)   # a second pass
    scaled = (data - min_val) / (max_val - min_val)  # a third, element-wise pass
    return scaled

sample_data = np.array([10, 20, 30, 40, 50])
scaled_data = min_max_scale(sample_data)  # array([0., 0.25, 0.5, 0.75, 1.])

This code rescales a NumPy array of numbers to the range [0, 1] using min-max scaling.
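One practical caveat: if every value in the array is identical, `max_val - min_val` is zero and the division fails. A guarded variant (a sketch; mapping constant data to zeros is just one reasonable convention) still runs in O(n):

```python
import numpy as np

def min_max_scale_safe(data):
    min_val = np.min(data)
    max_val = np.max(data)
    if max_val == min_val:
        # All values identical: the range is zero, so avoid dividing by zero
        return np.zeros_like(data, dtype=float)
    return (data - min_val) / (max_val - min_val)

print(min_max_scale_safe(np.array([7, 7, 7])))        # [0. 0. 0.]
print(min_max_scale_safe(np.array([10, 20, 30])))     # [0.  0.5 1. ]
```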

Identify Repeating Operations

Identify the loops, recursion, or array traversals that repeat.

  • Primary operation: Scanning the data array to find minimum and maximum values.
  • How many times: Each element is visited three times: once to find the min, once to find the max, and once to compute its scaled value.
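The three passes can be made visible with a plain-Python version that counts element visits explicitly (`min_max_scale_counted` is a hypothetical name for this sketch; NumPy performs the same work internally in compiled code):

```python
def min_max_scale_counted(data):
    ops = 0
    min_val = data[0]
    for x in data:            # pass 1: find the minimum
        if x < min_val:
            min_val = x
        ops += 1
    max_val = data[0]
    for x in data:            # pass 2: find the maximum
        if x > max_val:
            max_val = x
        ops += 1
    scaled = []
    for x in data:            # pass 3: rescale each element
        scaled.append((x - min_val) / (max_val - min_val))
        ops += 1
    return scaled, ops

scaled, ops = min_max_scale_counted([10, 20, 30, 40, 50])
print(ops)     # 15 → 3 passes over 5 elements
print(scaled)  # [0.0, 0.25, 0.5, 0.75, 1.0]
```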
How Execution Grows With Input

As the data size grows, the number of operations grows roughly in direct proportion.

Input Size (n) | Approx. Operations
10             | About 30 (3 passes over 10 elements)
100            | About 300 (3 passes over 100 elements)
1000           | About 3000 (3 passes over 1000 elements)

Pattern observation: The operations increase linearly as the input size increases.
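You can check this linear pattern empirically by timing the function on growing inputs (a rough sketch; exact timings vary by machine, so expect the ratio between consecutive rows to be near 10, not exact):

```python
import time
import numpy as np

def min_max_scale(data):
    min_val = np.min(data)
    max_val = np.max(data)
    return (data - min_val) / (max_val - min_val)

for n in (10_000, 100_000, 1_000_000):
    data = np.random.rand(n)
    start = time.perf_counter()
    min_max_scale(data)
    elapsed = time.perf_counter() - start
    print(f"n={n:>9}: {elapsed:.6f} s")  # roughly 10x slower per 10x more data
```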

Final Time Complexity

Time Complexity: O(n)

This means the time to scale the data grows in direct proportion to the number of data points.

Common Mistake

[X] Wrong: "Scaling data takes constant time no matter how big the data is."

[OK] Correct: Every data point must be processed, so time grows as data grows.

Interview Connect

Understanding how data scaling time grows helps you explain efficiency when preparing data for models.

Self-Check

"What if we used a scaling method that requires sorting the data? How would the time complexity change?"
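As a hint for the self-check: a rank-based scaling method must sort the data, and sorting dominates at O(n log n), so the overall complexity rises from O(n) to O(n log n). A sketch using `np.argsort` (the function name `rank_scale` and the tie-handling behavior are illustrative choices, not a standard API):

```python
import numpy as np

def rank_scale(data):
    # np.argsort performs a sort: O(n log n) dominates the remaining O(n) work
    order = np.argsort(data)
    ranks = np.empty_like(order)
    ranks[order] = np.arange(len(data))  # position of each element in sorted order
    return ranks / (len(data) - 1)       # map ranks onto [0, 1]

print(rank_scale(np.array([30, 10, 50, 20, 40])))
# [0.5  0.   1.   0.25 0.75]
```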