
Online vs offline feature stores in MLOps - CLI Comparison

Introduction
Feature stores manage the data features used in machine learning. An online feature store provides low-latency access to fresh feature values for real-time predictions. An offline feature store holds historical feature data for training and batch processing. Use a feature store in situations such as:
When you need to serve real-time predictions with the latest data in a web app or mobile app
When you want to train machine learning models using historical data stored in a data warehouse
When you want to keep feature data consistent between training and serving environments
When you want to reduce data engineering work by centralizing feature management
When you want to monitor feature data quality and freshness over time
Commands
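The mlflow feature-store commands in this section are illustrative: the standard MLflow CLI does not ship a feature-store subcommand, so adapt the commands and flags to the feature store tool you actually use. This command creates an offline feature store backed by Parquet files at /data/offline_features. Offline stores hold historical feature data for training.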
Terminal
mlflow feature-store create-offline-store --name example_offline_store --type parquet --path /data/offline_features
Expected Output
Created offline feature store 'example_offline_store' with type 'parquet' at path '/data/offline_features'
--name - Sets the name of the offline feature store
--type - Specifies the storage type for offline features
--path - Defines the storage location for offline features
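Because an offline store keeps history, training code can ask what a feature value was at the time a label was recorded. The following is a minimal in-memory sketch of that idea, not the behavior of any particular tool: OfflineStore, write, and as_of are hypothetical names, and a real store would persist rows to Parquet rather than hold them in memory.

```python
from bisect import bisect_right
from collections import defaultdict


class OfflineStore:
    """Hypothetical in-memory stand-in for an offline store: keeps every version."""

    def __init__(self):
        # entity_id -> parallel lists of timestamps and feature rows, sorted by time
        self._times = defaultdict(list)
        self._rows = defaultdict(list)

    def write(self, entity_id, timestamp, features):
        # Insert while keeping each entity's history ordered by timestamp.
        idx = bisect_right(self._times[entity_id], timestamp)
        self._times[entity_id].insert(idx, timestamp)
        self._rows[entity_id].insert(idx, dict(features))

    def as_of(self, entity_id, timestamp):
        # Latest row at or before `timestamp`; None if nothing was known yet.
        idx = bisect_right(self._times.get(entity_id, []), timestamp)
        return self._rows[entity_id][idx - 1] if idx else None


store = OfflineStore()
store.write("user_1", 100, {"purchases_7d": 2})
store.write("user_1", 200, {"purchases_7d": 5})
```

Calling store.as_of("user_1", 150) returns the row written at time 100, which is the value a model would have seen at that moment. Looking up values this way keeps future information out of training data.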
This command creates an online feature store using Redis for fast, low-latency access to features during model serving.
Terminal
mlflow feature-store create-online-store --name example_online_store --type redis --host 127.0.0.1 --port 6379
Expected Output
Created online feature store 'example_online_store' with type 'redis' at 127.0.0.1:6379
--name - Sets the name of the online feature store
--type - Specifies the storage type for online features
--host - Defines the Redis server host
--port - Defines the Redis server port
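One common way to lay out an online store in Redis is a single hash per entity, keyed by feature group and entity ID. The sketch below assumes that layout; feature_key, write_features, and read_features are hypothetical helpers, and FakeRedis is an in-memory stand-in so the example runs without a server. The helpers rely only on hset(name, mapping=...) and hgetall(name), which the redis-py client also provides.

```python
class FakeRedis:
    """In-memory stand-in exposing the two hash calls used below."""

    def __init__(self):
        self._hashes = {}

    def hset(self, name, mapping):
        # Redis stores hash values as strings, so mirror that here.
        self._hashes.setdefault(name, {}).update(
            {field: str(value) for field, value in mapping.items()}
        )

    def hgetall(self, name):
        return dict(self._hashes.get(name, {}))


def feature_key(feature_group, entity_id):
    # Hypothetical key scheme: "<feature group>:<entity id>"
    return f"{feature_group}:{entity_id}"


def write_features(client, feature_group, entity_id, features):
    # Overwrites fields in place: the online store keeps only the latest values.
    client.hset(feature_key(feature_group, entity_id), mapping=features)


def read_features(client, feature_group, entity_id):
    return client.hgetall(feature_key(feature_group, entity_id))


client = FakeRedis()
write_features(client, "user_features", "user_1", {"purchases_7d": 2})
write_features(client, "user_features", "user_1", {"purchases_7d": 5})
```

After the second write, only the newest value remains, which is the main contrast with an offline store. To try this against a real server, a redis.Redis(host="127.0.0.1", port=6379, decode_responses=True) client could be passed in place of FakeRedis.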
This command ingests historical user feature data into the offline feature store from a Parquet file for training use.
Terminal
mlflow feature-store ingest --store example_offline_store --feature-group user_features --file user_features.parquet
Expected Output
Ingested 10000 records into feature group 'user_features' in offline store 'example_offline_store'
--store - Specifies which feature store to ingest data into
--feature-group - Defines the feature group name
--file - Specifies the data file to ingest
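An ingestion step usually validates rows before appending them and reports how many were accepted. The following sketch shows that shape with plain dictionaries standing in for Parquet records. The ingest function, the REQUIRED_COLUMNS schema, and the column names are assumptions made for illustration.

```python
REQUIRED_COLUMNS = {"user_id", "event_time"}  # assumed schema for illustration


def ingest(store, feature_group, records):
    """Append valid records to store[feature_group]; return (ingested, rejected)."""
    table = store.setdefault(feature_group, [])
    ingested = rejected = 0
    for record in records:
        # Reject rows that lack the entity key or timestamp, or have a null key.
        if not REQUIRED_COLUMNS.issubset(record) or record["user_id"] is None:
            rejected += 1
            continue
        table.append(dict(record))
        ingested += 1
    return ingested, rejected


offline_store = {}
rows = [
    {"user_id": "user_1", "event_time": 100, "purchases_7d": 2},
    {"user_id": None, "event_time": 120, "purchases_7d": 1},
    {"event_time": 130, "purchases_7d": 4},
]
result = ingest(offline_store, "user_features", rows)
```

Here result is (1, 2): one row is stored and two are rejected. Reporting both counts makes silent data loss easier to spot than a single "ingested" total.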
This command starts serving the user_features feature group from the online feature store for real-time model predictions.
Terminal
mlflow feature-store serve --store example_online_store --feature-group user_features
Expected Output
Serving feature group 'user_features' from online store 'example_online_store' on port 8080
--store - Specifies which online feature store to serve from
--feature-group - Defines the feature group to serve
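At serving time, the model needs features in a fixed order, and some entities may have no stored values yet. A minimal sketch of such a lookup follows. The feature names, defaults, and get_feature_vector are hypothetical, and a plain dictionary stands in for the online store.

```python
FEATURE_ORDER = ["purchases_7d", "avg_basket_value"]  # assumed feature names
DEFAULTS = {"purchases_7d": 0, "avg_basket_value": 0.0}

# Plain dict standing in for the online store: entity_id -> latest feature values
online_store = {"user_1": {"purchases_7d": 5, "avg_basket_value": 31.5}}


def get_feature_vector(store, entity_id):
    """Return features in model input order, filling gaps with defaults."""
    row = store.get(entity_id, {})
    return [row.get(name, DEFAULTS[name]) for name in FEATURE_ORDER]


known = get_feature_vector(online_store, "user_1")
unknown = get_feature_vector(online_store, "user_999")
```

Fixing the order in one place keeps the serving path aligned with the column order used in training. Explicit defaults make the behavior for new users a deliberate choice rather than an error.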
Key Concept

If you remember nothing else from this pattern, remember: offline feature stores hold historical data for training, while online feature stores provide fast access to fresh data for real-time predictions.

Common Mistakes
Mistake: Using the online feature store for batch training data ingestion
Why it is a problem: Online stores are optimized for low-latency lookups, not large-scale batch storage, so bulk loads can cause performance issues
Fix: Use the offline feature store to ingest and store historical data for training
Mistake: Not keeping feature definitions consistent between online and offline stores
Why it is a problem: This causes training-serving skew, where models see different feature values during training and prediction
Fix: Define features once and use the same definitions in both stores
Mistake: Serving features from the offline store in real-time applications
Why it is a problem: Offline stores have higher latency and may hold stale data, causing slow or inaccurate predictions
Fix: Serve features from the online store for real-time prediction needs
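The advice to define features once can be sketched as a single function that both the training path and the serving path call. The names below (purchases_in_window, build_training_row, build_serving_row) and the seven-unit window are assumptions for illustration, not part of any specific feature store.

```python
def purchases_in_window(events, now, window=7):
    """Count purchase timestamps in the half-open window (now - window, now]."""
    return sum(1 for t in events if now - window < t <= now)


def build_training_row(events, label_time):
    # Offline path: compute the feature as of the label's timestamp.
    return {"purchases_7d": purchases_in_window(events, label_time)}


def build_serving_row(events, request_time):
    # Online path: the same definition, evaluated at request time.
    return {"purchases_7d": purchases_in_window(events, request_time)}


events = [1, 5, 9, 12]  # purchase timestamps in arbitrary time units
training_row = build_training_row(events, 12)
serving_row = build_serving_row(events, 12)
```

Because both paths share one definition, the same inputs at the same point in time yield the same feature value, which removes one common source of training-serving skew.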
Summary
Create offline feature stores to hold historical data for training machine learning models.
Create online feature stores to serve fresh features quickly for real-time predictions.
Ingest data into offline stores for batch processing and serve features from online stores during model serving.