str.len() for string length in Pandas - Time & Space Complexity
We want to understand how the time to find string lengths grows as we have more data.
How does the work change when we ask pandas to get lengths of many strings?
Analyze the time complexity of the following code snippet.
import pandas as pd
# Create a Series of strings
s = pd.Series(['apple', 'banana', 'cherry', 'date', 'elderberry'])
# Get length of each string
lengths = s.str.len()
This code creates a Series of words and finds the length of each word using pandas.
Identify the loops, recursion, or array traversals that repeat.
- Primary operation: Checking the length of each string in the Series.
- How many times: Once for each string in the Series (one pass through all items).
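To make the single pass concrete, here is a minimal sketch that spells out the same work as an explicit Python loop and compares it with `s.str.len()`. The name `manual_lengths` is introduced only for illustration; this shows the idea of one length check per element, not how pandas is implemented internally.

```python
import pandas as pd

s = pd.Series(['apple', 'banana', 'cherry', 'date', 'elderberry'])

# Illustrative manual equivalent: visit each element once and call len() on it.
manual_lengths = []
for word in s:
    manual_lengths.append(len(word))

# The .str accessor produces the same values in one call.
lengths = s.str.len()

print(manual_lengths)    # [5, 6, 6, 4, 10]
print(lengths.tolist())  # [5, 6, 6, 4, 10]
```

Both versions touch every string exactly once, which is why the count of length checks equals the number of items.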
As the number of strings grows, the time to find all lengths grows in proportion to it.
| Input Size (n) | Approx. Operations |
|---|---|
| 10 | 10 length checks |
| 100 | 100 length checks |
| 1000 | 1000 length checks |
Pattern observation: The work grows in direct proportion to the number of strings; doubling the number of strings doubles the work.
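The pattern in the table can be checked empirically. The sketch below uses a hypothetical helper, `time_str_len`, which is not part of pandas, to time `s.str.len()` on Series of increasing size. Actual timings depend on the machine and the pandas version, so treat the numbers as a rough indication and not a benchmark.

```python
import time
import pandas as pd

def time_str_len(n):
    """Hypothetical helper: return (lengths, elapsed seconds) for n short strings."""
    s = pd.Series(['apple'] * n)
    start = time.perf_counter()
    lengths = s.str.len()
    elapsed = time.perf_counter() - start
    return lengths, elapsed

# Each step multiplies n by 10; the elapsed time should grow by roughly the same factor.
for n in (10_000, 100_000, 1_000_000):
    lengths, elapsed = time_str_len(n)
    print(f"n={n:>9,}: {elapsed:.6f} s")
```

Small inputs are often dominated by fixed overhead, so the linear trend tends to show up more clearly at the larger sizes.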
Time Complexity: O(n)
This means the time to get string lengths grows linearly with n, the number of strings in the Series.
[X] Wrong: "Getting string lengths is instant no matter how many strings there are."
[OK] Correct: Each string must be checked once, so more strings mean more work and more time.
Knowing how string operations scale helps you explain your code choices clearly and shows that you understand how data size affects performance.
"What if we used s.str.len() on a Series with very long strings instead of many short strings? How would the time complexity change?"
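One way to explore this question is to time the same number of strings at two very different lengths. The sketch below assumes an object-dtype Series of ordinary Python `str` values; in CPython, `len()` on a `str` reads a stored length and does not scan the characters, so under that assumption the string length should have little effect. Other string storage backends may behave differently. The helper `time_len` and the variable names are illustrative only.

```python
import time
import pandas as pd

def time_len(series):
    """Hypothetical helper: return (lengths, elapsed seconds) for series.str.len()."""
    start = time.perf_counter()
    result = series.str.len()
    return result, time.perf_counter() - start

n = 100_000
# Assumption: object dtype, so every element is a plain Python str.
short_strings = pd.Series(['a' * 5] * n, dtype=object)
long_strings = pd.Series(['a' * 5_000] * n, dtype=object)

short_lengths, short_time = time_len(short_strings)
long_lengths, long_time = time_len(long_strings)

print(f"short strings: {short_time:.6f} s")
print(f"long strings:  {long_time:.6f} s")
```

If the two timings come out similar, the cost is driven by the number of strings and not by how long each one is, which keeps the analysis at O(n) for this setup.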