
Search and filter design in LLD - System Design Exercise

Design: Search and Filter System
This design covers the backend search and filter functionality, including data storage, indexing, query processing, and API design. The frontend UI and data ingestion pipelines are out of scope.
Functional Requirements
FR1: Allow users to search items by keywords
FR2: Support filtering results by multiple attributes (e.g., category, price range, rating)
FR3: Return results with low latency (p99 < 300ms)
FR4: Handle up to 5,000 concurrent search requests
FR5: Support pagination of results
FR6: Allow sorting results by relevance or attribute values
Non-Functional Requirements
NFR1: System must be highly available (99.9% uptime)
NFR2: Search results should be consistent within 5 seconds of data updates
NFR3: Support up to 1 million items in the catalog
NFR4: API response size should be limited to 1MB
Think Before You Design
Questions to Ask
❓ Question 1: Which attributes must be filterable, and can multiple filters be combined in one query?
❓ Question 2: How fresh must results be after an item is added or updated?
❓ Question 3: What are the expected catalog size and peak query volume?
❓ Question 4: Is relevance ranking required, or is sorting by attribute values sufficient?
❓ Question 5: How should pagination behave for large or empty result sets?
Key Components
Search index (e.g., inverted index, attribute indexes)
Query parser and filter processor
API layer for search requests
Data storage for items and attributes
Cache layer for popular queries
Design Patterns
Inverted index for keyword search
Bitmap indexes or B-trees for attribute filtering
Pagination and sorting strategies
Caching frequently requested queries
Eventual consistency for index updates
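The inverted-index and attribute-filter patterns above can be sketched as a toy in-memory model (an illustrative sketch, not a production engine; the item fields mirror FR2's category, price, and rating filters):

```python
from collections import defaultdict

# Toy catalog: item ID -> attributes (category/price/rating match FR2's filters).
ITEMS = {
    1: {"title": "red running shoes", "category": "footwear", "price": 80, "rating": 4.5},
    2: {"title": "blue running jacket", "category": "apparel", "price": 120, "rating": 4.0},
    3: {"title": "trail running shoes", "category": "footwear", "price": 95, "rating": 4.8},
}

# Inverted index: keyword -> set of item IDs whose title contains that keyword.
inverted = defaultdict(set)
for item_id, item in ITEMS.items():
    for word in item["title"].split():
        inverted[word].add(item_id)

def search(keywords, category=None, price_range=None, min_rating=None):
    """AND-combine keyword posting sets, then apply attribute filters."""
    # Intersect posting sets for all keywords (empty query matches nothing here).
    ids = set.intersection(*(inverted.get(w, set()) for w in keywords)) if keywords else set()
    results = []
    for item_id in ids:
        item = ITEMS[item_id]
        if category and item["category"] != category:
            continue
        if price_range and not (price_range[0] <= item["price"] <= price_range[1]):
            continue
        if min_rating and item["rating"] < min_rating:
            continue
        results.append(item_id)
    return sorted(results)

# "running shoes" in category footwear under $100 -> items 1 and 3
print(search(["running", "shoes"], category="footwear", price_range=(0, 100)))
```

A real engine (e.g. Lucene) stores postings on disk and intersects them lazily, but the keyword-to-IDs mapping and post-filtering steps are the same shape.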
Reference Architecture
Client
  |
  v
Search API Layer
  |
  v
Query Processor --> Cache Layer
  |
  v
Search Index (Inverted + Attribute Indexes)
  |
  v
Data Storage (Relational DB or NoSQL)
  |
  v
Data Update Pipeline (out of scope)
Components
Search API Layer
RESTful API with JSON
Receives search and filter requests, validates input, and returns results
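A request and response for this layer might look like the following (illustrative shapes only; field names such as `filters` and `next_cursor` are assumptions, not a fixed contract):

```json
{
  "request": {
    "q": "running shoes",
    "filters": {"category": "footwear", "price": {"min": 0, "max": 100}, "min_rating": 4.0},
    "sort": "relevance",
    "page_size": 20
  },
  "response": {
    "results": [{"id": 3, "title": "trail running shoes", "price": 95, "rating": 4.8}],
    "next_cursor": "eyJpZCI6M30",
    "total": 2
  }
}
```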
Query Processor
Custom service or search engine query parser
Parses user queries, applies filters, sorts and paginates results
Cache Layer
Redis or Memcached
Caches popular query results to reduce latency and load
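The cache-aside pattern with a TTL can be sketched as follows (a minimal in-memory stand-in for Redis/Memcached; `TTLCache` and `cached_search` are illustrative names, not a library API):

```python
import time

class TTLCache:
    """In-memory stand-in for Redis/Memcached query-result caching."""
    def __init__(self, ttl_seconds=60):
        self.ttl = ttl_seconds
        self.store = {}  # key -> (expiry_timestamp, value)

    def get(self, key):
        entry = self.store.get(key)
        if entry is None:
            return None
        expiry, value = entry
        if time.monotonic() > expiry:   # expired entry counts as a miss
            del self.store[key]
            return None
        return value

    def set(self, key, value):
        self.store[key] = (time.monotonic() + self.ttl, value)

def cached_search(cache, key, run_query):
    """Cache-aside: return cached results, or compute and populate on a miss."""
    results = cache.get(key)
    if results is None:
        results = run_query()     # fall through to the search index
        cache.set(key, results)
    return results

cache = TTLCache(ttl_seconds=5)
cached_search(cache, "q=shoes", lambda: [1, 3])        # miss: runs the query
again = cached_search(cache, "q=shoes", lambda: [])    # hit: cached [1, 3]
print(again)
```

The TTL bounds staleness, which pairs naturally with NFR2's 5-second consistency window.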
Search Index
Elasticsearch or Apache Lucene
Stores inverted index for keywords and attribute indexes for filters
Data Storage
PostgreSQL or MongoDB
Stores raw item data and attributes for indexing and retrieval
Request Flow
1. Client sends search request with keywords and filters to Search API Layer
2. API Layer validates request and forwards to Query Processor
3. Query Processor checks Cache Layer for existing results
4. If cache miss, Query Processor queries Search Index with keywords and filters
5. Search Index returns matching item IDs sorted by relevance or attribute
6. Query Processor fetches item details from Data Storage if needed
7. Results are paginated and returned to API Layer
8. API Layer sends results back to client
9. Data updates trigger asynchronous reindexing to keep Search Index fresh
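Steps 2 through 8 above can be sketched as one orchestration function (the `StubIndex`, `StubDB`, and `StubCache` classes are hypothetical placeholders for Elasticsearch, PostgreSQL, and Redis respectively):

```python
class StubIndex:
    def search(self, query, filters):
        # Pretend the index resolved keywords + filters to relevance-ranked IDs.
        return [3, 1, 2]

class StubDB:
    def fetch(self, item_id):
        # Hydrate full item details from data storage (step 6).
        return {"id": item_id, "title": f"item {item_id}"}

class StubCache:
    def __init__(self):
        self.store = {}
    def get(self, key):
        return self.store.get(key)
    def set(self, key, value):
        self.store[key] = value

def handle_search(query, filters, page, page_size, cache, index, db):
    cache_key = (query, tuple(sorted(filters.items())), page, page_size)
    cached = cache.get(cache_key)
    if cached is not None:
        return cached                            # step 3: cache hit
    item_ids = index.search(query, filters)      # steps 4-5: ranked item IDs
    start = (page - 1) * page_size               # step 7: paginate the ID list
    page_ids = item_ids[start:start + page_size]
    results = [db.fetch(i) for i in page_ids]    # step 6: fetch item details
    cache.set(cache_key, results)                # populate cache for next time
    return results

page1 = handle_search("shoes", {"category": "footwear"}, 1, 2,
                      StubCache(), StubIndex(), StubDB())
print([r["id"] for r in page1])  # [3, 1]
```

Note that pagination happens on the ID list before hydration, so a page of 20 items only triggers 20 detail fetches regardless of how many items matched.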
Database Schema
Entities:
- Item: id (PK), title, description, category, price, rating, other attributes
Indexes:
- Inverted index on title and description keywords
- Attribute indexes on category, price range, rating
Relationships:
- One-to-many from Item to attributes
- Indexes maintain mappings from keywords/attributes to item IDs
Scaling Discussion
Bottlenecks
Search Index query latency under high concurrent load
Cache misses causing heavy load on Search Index and Data Storage
Index update delays causing stale search results
Large result sets causing high memory and network usage
Solutions
Shard Search Index horizontally by item ID or attribute ranges
Increase cache size and use query result caching with TTL
Use incremental and asynchronous index updates with near real-time indexing
Implement efficient pagination and limit maximum page size to reduce payload
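The efficient-pagination point can be sketched as cursor (keyset) pagination with a capped page size (a toy sketch over an in-memory ID list; a real store would resume with `WHERE id > cursor` instead of scanning):

```python
MAX_PAGE_SIZE = 50  # cap payload size (per NFR4) regardless of the requested size

def paginate(sorted_ids, cursor=None, page_size=20):
    """Keyset pagination: resume after the last-seen ID instead of using OFFSET."""
    page_size = min(page_size, MAX_PAGE_SIZE)
    # Toy resume-by-scan; a database would seek directly via WHERE id > cursor.
    start = sorted_ids.index(cursor) + 1 if cursor is not None else 0
    page = sorted_ids[start:start + page_size]
    # A full page implies there may be more results; return its last ID as cursor.
    next_cursor = page[-1] if len(page) == page_size else None
    return page, next_cursor

ids = list(range(1, 8))                            # 7 matching IDs, already sorted
page1, c1 = paginate(ids, page_size=3)             # -> [1, 2, 3], cursor 3
page2, c2 = paginate(ids, cursor=c1, page_size=3)  # -> [4, 5, 6], cursor 6
page3, c3 = paginate(ids, cursor=c2, page_size=3)  # -> [7], cursor None
print(page1, page2, page3)
```

Unlike OFFSET-based pagination, keyset cursors stay cheap for deep pages and remain stable when items are inserted ahead of the cursor.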
Interview Tips
Time: Spend 10 minutes clarifying requirements and constraints, 20 minutes designing components and data flow, 10 minutes discussing scaling and trade-offs, 5 minutes summarizing.
Clarify search and filter requirements including attributes and freshness
Explain choice of search index and attribute filtering techniques
Describe caching strategy to reduce latency
Discuss data flow from client request to search results
Highlight scaling challenges and solutions for high concurrency
Mention trade-offs between consistency and availability