# Feast feature store basics in MLOps - Time & Space Complexity
When using Feast, a feature store for machine learning, it's important to understand how feature-retrieval time grows as the number of entities or features increases. In other words: how does the system's response time change when we ask for more data?
Analyze the time complexity of the following Feast feature retrieval code.
```python
from feast import FeatureStore

# Point the store at a local feature repository.
store = FeatureStore(repo_path="./feature_repo")

entity_rows = [
    {"customer_id": 1},
    {"customer_id": 2},
    # ... more entities
]

features = store.get_online_features(
    # Note: newer Feast releases name this parameter `features` instead of
    # `feature_refs`; the reference format "view:feature" is the same.
    feature_refs=["customer_features:age", "customer_features:total_orders"],
    entity_rows=entity_rows,
).to_dict()
```
This code fetches specific features for multiple customers from the Feast online store.
Look at what repeats when fetching features.
- Primary operation: Retrieving features for each entity row.
- How many times: Once per entity in the list (number of customers).
As you ask for features for more customers, the time to get all features grows roughly in direct proportion.
| Input Size (n) | Approx. Operations |
|---|---|
| 10 | 10 feature retrievals |
| 100 | 100 feature retrievals |
| 1000 | 1000 feature retrievals |
Pattern observation: Doubling the number of customers roughly doubles the work.
Time Complexity: O(n)
This means the time to fetch features grows linearly with the number of entities requested.
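The linear pattern can be simulated without a running Feast deployment. The sketch below uses a hypothetical in-memory stand-in for the online store (`ONLINE_STORE` and `get_online_features_mock` are illustrative names, not Feast APIs) and counts one lookup per entity row; real backends such as Redis or DynamoDB add network latency to each lookup, but the per-entity pattern is the same.

```python
# Hypothetical in-memory stand-in for a Feast online store: a dict keyed
# by entity ID. Illustrative only, not a real Feast backend.
ONLINE_STORE = {
    cid: {"age": 30 + cid % 40, "total_orders": cid * 3}
    for cid in range(1, 1001)
}

def get_online_features_mock(feature_names, entity_rows):
    """Fetch features for each entity row, counting one lookup per row."""
    lookups = 0
    result = {name: [] for name in feature_names}
    for row in entity_rows:
        record = ONLINE_STORE[row["customer_id"]]  # one store lookup
        lookups += 1
        for name in feature_names:
            result[name].append(record[name])
    return result, lookups

for n in (10, 100, 1000):
    rows = [{"customer_id": i} for i in range(1, n + 1)]
    _, ops = get_online_features_mock(["age", "total_orders"], rows)
    print(n, ops)  # lookup count grows in direct proportion to n
```

Doubling `entity_rows` doubles the lookup count, matching the table above.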
[X] Wrong: "Fetching features for multiple entities happens instantly regardless of count."
[OK] Correct: Each entity requires a separate lookup, so more entities mean more work and longer time.
Understanding how data retrieval scales helps you design efficient ML pipelines and reason clearly about system performance.
"What if we batch entities differently or cache features? How would the time complexity change?"
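One way to explore that question: batching typically shrinks the constant factor (one network round trip instead of n) but leaves the asymptotic cost at O(n), while caching makes repeated lookups for the same entities nearly free even though a cold request is still O(n). The sketch below is a hedged illustration of a read-through cache in front of a slow per-entity lookup; the names (`slow_lookup`, `cached_lookup`) are hypothetical, not Feast APIs.

```python
# Illustrative read-through cache, not a Feast feature. `slow_lookup`
# stands in for a network round trip to an online store.
backend_calls = 0

def slow_lookup(customer_id):
    global backend_calls
    backend_calls += 1  # count each simulated round trip
    return {"age": 30, "total_orders": customer_id * 3}

cache = {}

def cached_lookup(customer_id):
    # On a miss, fetch from the backend and remember the result.
    if customer_id not in cache:
        cache[customer_id] = slow_lookup(customer_id)
    return cache[customer_id]

# First pass: every entity misses the cache, so n backend calls.
for cid in range(1, 101):
    cached_lookup(cid)
first_pass_calls = backend_calls

# Second pass over the same entities: all cache hits, no new backend calls.
for cid in range(1, 101):
    cached_lookup(cid)
print(first_pass_calls, backend_calls)  # repeats cost no extra round trips
```

The request is still O(n) in the number of entities, but the dominant per-entity cost shifts from a network round trip to a local dictionary hit. The trade-off is staleness: cached feature values can lag behind the online store.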