0
0
Data Analysis Pythondata~5 mins

String cleaning (strip, lower, replace) in Data Analysis Python - Time & Space Complexity

Choose your learning style9 modes available
Time Complexity: String cleaning (strip, lower, replace)
O(n * m)
Understanding Time Complexity

We want to understand how the time to clean strings grows as we handle more data.

How does the work change when we clean many strings or longer strings?

Scenario Under Consideration

Analyze the time complexity of the following code snippet.

def clean_strings(strings):
    cleaned = []
    for s in strings:
        s = s.strip()
        s = s.lower()
        s = s.replace(' ', '_')
        cleaned.append(s)
    return cleaned

This code cleans a list of strings by removing spaces at ends, making all letters lowercase, and replacing spaces inside with underscores.

Identify Repeating Operations

Identify the loops, recursion, array traversals that repeat.

  • Primary operation: Looping over each string in the list.
  • How many times: Once for each string in the input list.
  • Inside the loop, string methods operate on each string's characters.
  • The dominant work depends on the total number of characters processed.
How Execution Grows With Input

As we add more strings or longer strings, the work grows roughly with the total characters.

Input Size (n strings)Approx. Operations (characters processed)
10 strings, avg 5 chars~50
100 strings, avg 5 chars~500
1000 strings, avg 5 chars~5000

Pattern observation: The work grows linearly with the total number of characters across all strings.

Final Time Complexity

Time Complexity: O(n * m)

This means the time grows with the number of strings (n) times the average length of each string (m).

Common Mistake

[X] Wrong: "String cleaning takes the same time no matter how long the strings are."

[OK] Correct: Each string method looks at every character, so longer strings take more time to process.

Interview Connect

Understanding how string operations scale helps you explain your code's efficiency clearly and confidently.

Self-Check

"What if we used a regular expression to replace spaces instead of the replace method? How would the time complexity change?"