Process Flow - Feature stores concept
Flow diagram: Data Sources → Feature Extraction → Feature Store → Training → Model
Data flows from the sources into feature extraction. The resulting features are stored centrally in the feature store, which feeds both model training and model serving.
1. Extract raw data
2. Compute features
3. Store features in feature store
4. Retrieve features for training
5. Retrieve features for serving

| Step | Action | Input | Output | Notes |
|---|---|---|---|---|
| 1 | Extract raw data | Raw data sources | Raw data batch | Collect data from databases or logs |
| 2 | Compute features | Raw data batch | Feature vectors | Transform raw data into meaningful features |
| 3 | Store features | Feature vectors | Feature store updated | Features saved for reuse |
| 4 | Retrieve for training | Feature store | Training dataset | Features fetched for model training |
| 5 | Retrieve for serving | Feature store | Features for live prediction | Features fetched in real time for inference |
| 6 | End | - | - | Process complete, features ready for ML lifecycle |
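
The five steps in the table can be sketched with a toy in-memory store. This is only an illustration under stated assumptions: `InMemoryFeatureStore`, `extract_raw_data`, `compute_features`, and the sample user records are hypothetical names and data invented for this sketch, not the API of any real feature store product.

```python
from dataclasses import dataclass, field


@dataclass
class InMemoryFeatureStore:
    """Toy feature store (hypothetical): maps an entity id to its feature dict."""
    _rows: dict = field(default_factory=dict)

    def put(self, entity_id, features):
        # Save a copy so later changes by the caller do not alter stored values.
        self._rows[entity_id] = dict(features)

    def get(self, entity_id):
        # Single-entity lookup, as a serving path would do.
        return dict(self._rows[entity_id])

    def get_many(self, entity_ids):
        # Multi-entity lookup, as a training job would do.
        return [self.get(entity_id) for entity_id in entity_ids]


def extract_raw_data():
    # Step 1: stand-in for reading from a database or log files (made-up data).
    return [
        {"user_id": "u1", "purchases": [20.0, 30.0]},
        {"user_id": "u2", "purchases": [5.0]},
    ]


def compute_features(record):
    # Step 2: transform one raw record into a feature vector.
    purchases = record["purchases"]
    return {
        "purchase_count": len(purchases),
        "avg_purchase": sum(purchases) / len(purchases),
    }


store = InMemoryFeatureStore()
raw_batch = extract_raw_data()

# Step 3: store the computed features for reuse.
for record in raw_batch:
    store.put(record["user_id"], compute_features(record))

# Step 4: retrieve features for training.
training_rows = store.get_many(["u1", "u2"])

# Step 5: retrieve features for serving a single live prediction.
serving_row = store.get("u1")
```

Both retrieval paths read from the same stored values, so the features for `u1` seen at training time match those seen at serving time.
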
| Variable | Start | After Step 1 | After Step 2 | After Step 3 | After Step 4 | After Step 5 | Final |
|---|---|---|---|---|---|---|---|
| Raw data | None | Collected batch | Used for feature computation | N/A | N/A | N/A | N/A |
| Feature vectors | None | None | Computed features | Stored in feature store | Retrieved for training | Retrieved for serving | Used by model |
| Feature store | Empty | Empty | Empty | Updated with features | Provides features | Provides features | Central feature repository |
Feature stores centralize feature data for ML. Each feature is computed once and stored for reuse, which avoids recomputing it for every consumer. The same store supports both training and live serving, which keeps feature values consistent between the two and makes the pipeline more efficient.
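
The compute-once, read-many idea can be shown with a call counter. This is a minimal sketch that uses a plain dictionary in place of a feature store; `compute_feature`, `compute_calls`, and the `item_7` key are hypothetical names chosen for illustration.

```python
compute_calls = 0  # counts how often the feature logic actually runs


def compute_feature(raw_value):
    # Stand-in for feature logic that would be costly to repeat.
    global compute_calls
    compute_calls += 1
    return {"value_squared": raw_value ** 2}


# A plain dict plays the role of the feature store in this sketch.
feature_store = {}

# Write path: the feature is computed once and stored.
feature_store["item_7"] = compute_feature(3)

# Read paths: training and serving only look the value up.
training_features = feature_store["item_7"]
serving_features = feature_store["item_7"]
```

After both reads, `compute_calls` is still 1, and training and serving see the identical value, because neither consumer reimplements or reruns the feature logic.
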