Series indexing and selection in Data Analysis Python - Time & Space Complexity
When working with a Series, we often pick values by their position or label. Understanding how long this takes helps us write faster code.
We want to know: How does the time to get items change as the Series grows?
Analyze the time complexity of the following code snippet.
import pandas as pd
s = pd.Series(range(1000))
# Select a single value by position
value = s.iloc[500]
# Select multiple values by label slice (both endpoints are included)
subset = s.loc[100:200]
This code creates a Series with the default integer index (labels 0 to 999), selects one value by position, and selects a range of values by label.
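One detail worth checking is how many values each kind of slice returns, since that count is the k used in the analysis. The sketch below assumes the same default integer index as the snippet; `by_label` and `by_position` are illustrative names, not part of the original code.

```python
import pandas as pd

s = pd.Series(range(1000))

# Label-based slicing with .loc includes both endpoints: labels 100..200
by_label = s.loc[100:200]

# Position-based slicing with .iloc excludes the stop: positions 100..199
by_position = s.iloc[100:200]

print(len(by_label))     # 101
print(len(by_position))  # 100
```

So the label slice in the snippet selects k = 101 items, not 100.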
Identify the loops, recursion, or array traversals that repeat.
- Primary operation: Accessing elements by position or by label.
- How many times: A single access happens once; a slice covers as many elements as it contains (k).
Getting one item by position takes about the same time no matter the Series size.
Getting a slice of items takes time roughly proportional to how many items are in the slice.
| Series Size (n) | Approx. Operations for a single item | Approx. Operations for a slice (example slice size k equal to the value shown) |
|---|---|---|
| 10 | 1 | 5 |
| 100 | 1 | 20 |
| 1000 | 1 | 50 |
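Pattern observation: Single-item access stays about the same as n grows; slice cost grows with the slice size k, not with the Series size n.
Time Complexity: O(1) for a single item, O(k) for a slice of size k
This means picking one item is very fast no matter how big the Series is, but picking many items takes longer as you ask for more. Treat O(k) as the cost of copying or processing the selected items; for a contiguous slice, pandas can often return a view without copying, so the slicing step itself may be cheaper in practice.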
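These claims can be checked empirically. The following is a rough sketch using the standard-library `timeit` module; the helper names `time_single_access` and `time_slice_sum` are hypothetical, and the absolute numbers will vary by machine and pandas version. The slice is summed so that all k items are actually touched.

```python
import timeit

import pandas as pd


def time_single_access(n, repeats=200):
    """Average seconds per s.iloc[n // 2] lookup on a Series of length n."""
    s = pd.Series(range(n))
    total = timeit.timeit(lambda: s.iloc[n // 2], number=repeats)
    return total / repeats


def time_slice_sum(n, k, repeats=200):
    """Average seconds to slice k items by position and sum them."""
    s = pd.Series(range(n))
    start = (n - k) // 2
    total = timeit.timeit(lambda: s.iloc[start:start + k].sum(), number=repeats)
    return total / repeats


# Single-item access: the Series size n changes, the work should not
for n in (1_000, 100_000, 1_000_000):
    print(f"n={n:>9}: single access ~ {time_single_access(n):.2e} s")

# Slice plus sum: n is fixed, the slice size k changes
for k in (10, 1_000, 100_000):
    print(f"k={k:>9}: slice + sum   ~ {time_slice_sum(1_000_000, k):.2e} s")
```

If the analysis holds, the first group of timings should stay roughly flat, while the second should grow once k is large enough to outweigh pandas' fixed per-call overhead.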
[X] Wrong: "Selecting any item from a Series always takes longer if the Series is bigger."
[OK] Correct: Single-item access by position is a direct array lookup, and a label lookup on a unique index is typically hash-based, so both stay fast even if the Series grows large.
Knowing how fast you can get data from a Series helps you choose the right method and write code that runs smoothly in real projects.
"What if we changed from selecting by label slice to selecting by a list of random labels? How would the time complexity change?"
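One way to start exploring this question is to run the selection and inspect the result. This is a minimal sketch, assuming the same default integer index; the seed and the choice of 50 labels are arbitrary values picked for illustration.

```python
import random

import pandas as pd

s = pd.Series(range(1000))

# Hypothetical experiment: pick k = 50 distinct labels in random order
random.seed(0)
labels = random.sample(range(1000), k=50)

# Passing a list to .loc selects exactly those labels
picked = s.loc[labels]

print(len(picked))                      # 50
print(picked.index.tolist() == labels)  # True: the result follows the list order
```

Each of the k labels has to be located and its value gathered into a new Series, so the work still scales with k. Unlike a contiguous slice, though, the result is not a single window onto the original data.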