HLDsystem_design~25 mins

Pagination patterns (cursor, offset) in HLD - System Design Exercise

Choose your learning style9 modes available

Learn Why Deep Arch Practice Challenge Design Recall Scale

Design: Pagination System for Large Data Sets

Design covers backend pagination API and data retrieval patterns. UI and client implementation are out of scope.

Functional Requirements

FR1: Support efficient retrieval of data in pages for large datasets

FR2: Provide two pagination methods: offset-based and cursor-based

FR3: Allow clients to request any page of data with a fixed page size

FR4: Ensure consistent ordering of data across pages

FR5: Handle data changes (inserts, deletes) gracefully during pagination

FR6: Support high read throughput with low latency (p99 < 150ms)

FR7: Allow pagination for datasets with millions of records

Non-Functional Requirements

NFR1: System must handle 10,000 concurrent pagination requests

NFR2: API response latency p99 under 150ms

NFR3: Availability target 99.9% uptime (max 8.77 hours downtime/year)

NFR4: Data consistency: eventual consistency acceptable for cursor pagination

NFR5: Offset pagination must support random page access

NFR6: Cursor pagination must support forward-only navigation

Think Before You Design

Questions to Ask

❓ Question 1

❓ Question 2

❓ Question 3

❓ Question 4

❓ Question 5

❓ Question 6

Key Components

API layer to accept pagination parameters

Database or storage system with indexing support

Cache layer to speed up repeated queries

Pagination logic module implementing offset and cursor methods

Ordering and filtering mechanisms

Design Patterns

Offset-based pagination using LIMIT and OFFSET

Cursor-based pagination using unique, ordered keys

Keyset pagination for efficient forward navigation

Caching frequently accessed pages

Handling data changes with snapshot or versioning

Reference Architecture

Client
  |
  | HTTP Request with pagination params
  v
API Server
  |-- Validate pagination params
  |-- Choose pagination method (offset or cursor)
  |-- Query DB with pagination logic
  |-- Return paged data + next cursor or page info
  v
Database
  |-- Indexed on ordering key(s)
  |-- Supports LIMIT/OFFSET and key-based queries
  
Cache (optional)
  |-- Stores popular pages or cursors

Components

API Server

REST/GraphQL API

Receives pagination requests, validates parameters, and returns paged data

Database

Relational DB (e.g., PostgreSQL) or NoSQL with indexing

Stores data and supports efficient queries with LIMIT/OFFSET or key-based filters

Cache Layer

Redis or Memcached

Caches frequently requested pages or cursor results to reduce DB load

Pagination Logic Module

Backend service code

Implements offset and cursor pagination algorithms and manages ordering

Request Flow

1. Client sends request with pagination parameters: page number and size for offset, or cursor and size for cursor pagination.

2. API Server validates parameters and determines which pagination method to use.

3. For offset pagination, API Server queries the database using LIMIT and OFFSET clauses.

4. For cursor pagination, API Server queries the database using WHERE clauses on the ordering key greater than the cursor value, with LIMIT.

5. Database returns the requested page of data ordered by the specified key.

6. API Server constructs response including data and next page info: next offset or next cursor value.

7. Response sent back to client for display or further navigation.

8. Optionally, popular pages or cursor results are cached to speed up future requests.

Database Schema

Entity: DataRecord - id (primary key, unique identifier) - created_at (timestamp, indexed for ordering) - other_data_fields Relationships: - None required for pagination; focus is on single large table Indexes: - Index on created_at for cursor pagination - Primary key index on id for uniqueness This schema supports ordering by created_at or id for pagination.

Scaling Discussion

Bottlenecks

Offset pagination becomes slow with large OFFSET values due to scanning and skipping rows

Cursor pagination requires stable ordering keys and may not support backward navigation

Database query load increases with high concurrency and large datasets

Cache invalidation complexity when underlying data changes

API server CPU and memory limits under high request volume

Solutions

Use cursor pagination for large datasets and sequential access to avoid OFFSET performance issues

Implement composite cursors with multiple keys to ensure stable ordering

Add caching layer for popular pages or cursors to reduce DB load

Use read replicas to distribute database read traffic

Implement rate limiting and request batching at API server

Use database partitioning or sharding for very large datasets

Design cache invalidation strategies based on data update patterns

Interview Tips

Time: Spend 10 minutes clarifying requirements and constraints, 20 minutes designing the architecture and data flow, 10 minutes discussing scaling and trade-offs, and 5 minutes summarizing.

Explain difference between offset and cursor pagination with pros and cons

Discuss how ordering and consistency affect pagination design

Describe how the system handles large datasets and high concurrency

Mention caching strategies and database indexing importance

Highlight trade-offs between random access and efficient navigation

Show awareness of real-world challenges like data changes during pagination