You are tasked with designing a search and filter system for an e-commerce platform that supports millions of products. Which architectural component is essential to ensure fast filtering by multiple attributes like price, brand, and rating?
Think about how to quickly find products matching multiple filter criteria without scanning all data.
Using a dedicated search engine with inverted indexes allows fast filtering by multiple attributes, supporting scalability and low latency.
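As a rough illustration of why this works, the sketch below builds a tiny in-memory inverted index and answers a multi-attribute filter by intersecting posting sets. It is not how any particular search engine is implemented. The catalog, the field names, and the helpers `build_index` and `filter_products` are hypothetical, and price is assumed to be pre-bucketed into bands so it can be treated like any other attribute value.

```python
from collections import defaultdict

# Hypothetical sample catalog; field names and values are illustrative only.
PRODUCTS = {
    1: {"brand": "acme", "rating": 4, "price_band": "0-50"},
    2: {"brand": "acme", "rating": 5, "price_band": "50-100"},
    3: {"brand": "zenith", "rating": 4, "price_band": "0-50"},
    4: {"brand": "acme", "rating": 4, "price_band": "50-100"},
}


def build_index(products):
    """Map each (attribute, value) pair to the set of product IDs that have it."""
    index = defaultdict(set)
    for product_id, attrs in products.items():
        for attribute, value in attrs.items():
            index[(attribute, value)].add(product_id)
    return index


def filter_products(index, **criteria):
    """Intersect the posting sets for every requested attribute value."""
    postings = [index.get(pair, set()) for pair in criteria.items()]
    if not postings:
        return set()
    # Start from the smallest set so the intersection does the least work.
    postings.sort(key=len)
    result = set(postings[0])
    for posting in postings[1:]:
        result &= posting
    return result


INDEX = build_index(PRODUCTS)
MATCHES = filter_products(INDEX, brand="acme", rating=4)
print(sorted(MATCHES))
```

Each filter touches only the posting sets for the requested values, so the cost depends on the size of those sets rather than on the size of the whole catalog.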
Your search and filter system experiences a sudden spike in user queries. Which approach best helps maintain low latency and high availability?
Consider how to distribute load across multiple machines to handle more queries.
Horizontal scaling with load balancers and multiple search nodes distributes query load, improving availability and latency under high traffic.
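The dispatch side of this idea can be sketched as a round-robin balancer that skips unhealthy nodes. This is a minimal sketch under stated assumptions: the class `RoundRobinBalancer`, its methods, and the node names are hypothetical, and a production load balancer would also handle health checks, timeouts, and retries.

```python
import itertools


class RoundRobinBalancer:
    """Cycle through search nodes, skipping any that are marked unhealthy."""

    def __init__(self, nodes):
        self.nodes = list(nodes)
        self.healthy = set(self.nodes)
        self._cycle = itertools.cycle(self.nodes)

    def mark_down(self, node):
        self.healthy.discard(node)

    def mark_up(self, node):
        if node in self.nodes:
            self.healthy.add(node)

    def next_node(self):
        # Try each node at most once per call before giving up.
        for _ in range(len(self.nodes)):
            node = next(self._cycle)
            if node in self.healthy:
                return node
        raise RuntimeError("no healthy search nodes available")


# Hypothetical node names, used only for illustration.
balancer = RoundRobinBalancer(["search-1", "search-2", "search-3"])
first_pass = [balancer.next_node() for _ in range(3)]
balancer.mark_down("search-2")
after_failure = [balancer.next_node() for _ in range(4)]
print(first_pass, after_failure)
```

Queries keep flowing to the remaining nodes after one is marked down, which is the availability benefit the explanation refers to.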
Which statement correctly describes the tradeoff between real-time indexing and batch indexing in a search and filter system?
Think about update speed versus system resource demands.
Real-time indexing makes updates searchable quickly but requires more resources and a more complex design; batch indexing is simpler but makes updates visible less frequently.
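The difference in update visibility can be illustrated with two toy indexers. This is a simplified sketch, not a real indexing pipeline: the classes `RealTimeIndexer` and `BatchIndexer` and their counters are hypothetical, and only a single `brand` attribute is indexed.

```python
from collections import defaultdict


class RealTimeIndexer:
    """Apply every update to the index immediately."""

    def __init__(self):
        self.index = defaultdict(set)
        self.writes = 0

    def update(self, product_id, brand):
        self.index[brand].add(product_id)
        self.writes += 1  # one index write per update


class BatchIndexer:
    """Buffer updates and apply them together when flush() is called."""

    def __init__(self):
        self.index = defaultdict(set)
        self.pending = []
        self.flushes = 0

    def update(self, product_id, brand):
        self.pending.append((product_id, brand))  # not searchable yet

    def flush(self):
        for product_id, brand in self.pending:
            self.index[brand].add(product_id)
        self.pending.clear()
        self.flushes += 1


realtime = RealTimeIndexer()
batch = BatchIndexer()
for product_id in (1, 2, 3):
    realtime.update(product_id, "acme")
    batch.update(product_id, "acme")

visible_before_flush = set(batch.index.get("acme", set()))
batch.flush()
visible_after_flush = set(batch.index.get("acme", set()))
```

The real-time indexer performs three index writes and is searchable after each one; the batch indexer performs a single flush, but nothing is searchable until that flush runs.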
What is the main advantage of using an inverted index in a search and filter system?
Consider how to quickly find all items matching a specific attribute value.
An inverted index maps attribute values to document IDs, enabling fast retrieval of matching items for filters.
You have 10 million products, each with 5 filterable attributes. Each attribute has on average 100 unique values. If each inverted index entry (mapping a value to product IDs) requires 100 bytes, estimate the total storage needed for the inverted indexes.
Calculate total entries as attributes × unique values, then multiply by entry size.
Total entries = 5 attributes × 100 values = 500 entries. Applying the hint literally gives 500 entries × 100 bytes = 50,000 bytes (about 50 KB), but that covers only the fixed per-entry overhead, not the product IDs each entry points to. The posting lists dominate the storage. If every product has one value per attribute, the lists hold 10 million × 5 = 50 million product IDs, or about 200 MB at an assumed 4 bytes per ID. The worst case, in which every posting list holds every product, is 5 attributes × 100 values × 10 million products × 4 bytes = 20 GB. This is large, and the question simplifies it to 50 GB as the best estimate.
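The arithmetic behind this estimate can be checked with a short script. It is a sketch under two assumptions that are not part of the question: product IDs take 4 bytes each, and every product has exactly one value per attribute.

```python
PRODUCTS = 10_000_000
ATTRIBUTES = 5
VALUES_PER_ATTRIBUTE = 100
ENTRY_BYTES = 100      # per-entry size given in the question
PRODUCT_ID_BYTES = 4   # assumption: 32-bit product IDs

# Number of (attribute, value) entries in the index.
entries = ATTRIBUTES * VALUES_PER_ATTRIBUTE

# Fixed overhead if each entry costs ENTRY_BYTES regardless of list length.
entry_overhead_bytes = entries * ENTRY_BYTES

# Posting lists when each product appears once per attribute.
posting_bytes = PRODUCTS * ATTRIBUTES * PRODUCT_ID_BYTES

# Upper bound: every posting list contains every product.
worst_case_bytes = entries * PRODUCTS * PRODUCT_ID_BYTES

print(f"entries:         {entries}")
print(f"entry overhead:  {entry_overhead_bytes / 1e3:.0f} KB")
print(f"posting lists:   {posting_bytes / 1e6:.0f} MB")
print(f"worst case:      {worst_case_bytes / 1e9:.0f} GB")
```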