Lambda Architecture with Batch and Streaming Data
📖 Scenario: You work for a retail company that collects sales data from its stores. The data arrives in two ways: large files uploaded daily (batch data) and live sales events streamed in real time (streaming data). You want to combine both to get a complete view of sales.
🎯 Goal: Build a simple Lambda architecture example that processes batch sales data and streaming sales data separately, then combines their results to get total sales per product.
📋 What You'll Learn
Create a batch dataset of sales with product names and quantities
Create a streaming dataset of sales events with product names and quantities
Write batch processing logic to sum quantities per product
Write streaming processing logic to sum quantities per product
Combine batch and streaming results to get total sales per product
Print the combined total sales
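The steps above can be sketched in plain Python as follows. This is a minimal illustration, not the exercise's reference solution: the variable names, the helper `sum_per_product`, and the sample products and quantities are all assumptions made for this example, and the "stream" is simulated with an in-memory list rather than a real event source.

```python
from collections import Counter

# Hypothetical sample data: product names and quantities are invented
# for illustration and do not come from the exercise.
batch_sales = [
    {"product": "apple", "quantity": 10},
    {"product": "banana", "quantity": 5},
    {"product": "apple", "quantity": 7},
]

# Simulated stream: a real speed layer would consume events one at a time
# from a message queue; a list stands in for that here.
streaming_sales = [
    {"product": "banana", "quantity": 2},
    {"product": "cherry", "quantity": 4},
    {"product": "apple", "quantity": 1},
]


def sum_per_product(sales):
    """Return a Counter mapping each product to its total quantity."""
    totals = Counter()
    for sale in sales:
        totals[sale["product"]] += sale["quantity"]
    return totals


# Batch layer: aggregate the daily file.
batch_totals = sum_per_product(batch_sales)

# Speed layer: aggregate the live events seen so far.
streaming_totals = sum_per_product(streaming_sales)

# Serving layer: merge both views. Counter addition sums matching keys
# (and keeps only positive totals, which holds for sales quantities).
combined_totals = batch_totals + streaming_totals

for product, total in sorted(combined_totals.items()):
    print(f"{product}: {total}")
```

Using the same aggregation function for both layers keeps the two views directly comparable, so merging them is a simple per-product sum.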
💡 Why This Matters
🌍 Real World
Retail companies often collect sales data both in batches (daily reports) and as streams (live transactions). Combining the two gives sales figures that are both complete and up to date.
💼 Career
Data engineers and data scientists use Lambda architecture to process data at scale, combining batch and real-time results for analytics and reporting.