0
0
HLDsystem_design~25 mins

Pagination patterns (cursor, offset) in HLD - System Design Exercise

Choose your learning style9 modes available
Design: Pagination System for Large Data Sets
Design covers backend pagination API and data retrieval patterns. UI and client implementation are out of scope.
Functional Requirements
FR1: Support efficient retrieval of data in pages for large datasets
FR2: Provide two pagination methods: offset-based and cursor-based
FR3: Allow clients to request any page of data with a fixed page size
FR4: Ensure consistent ordering of data across pages
FR5: Handle data changes (inserts, deletes) gracefully during pagination
FR6: Support high read throughput with low latency (p99 < 150ms)
FR7: Allow pagination for datasets with millions of records
Non-Functional Requirements
NFR1: System must handle 10,000 concurrent pagination requests
NFR2: API response latency p99 under 150ms
NFR3: Availability target 99.9% uptime (max 8.77 hours downtime/year)
NFR4: Data consistency: eventual consistency acceptable for cursor pagination
NFR5: Offset pagination must support random page access
NFR6: Cursor pagination must support forward-only navigation
Think Before You Design
Questions to Ask
❓ Question 1
❓ Question 2
❓ Question 3
❓ Question 4
❓ Question 5
❓ Question 6
Key Components
API layer to accept pagination parameters
Database or storage system with indexing support
Cache layer to speed up repeated queries
Pagination logic module implementing offset and cursor methods
Ordering and filtering mechanisms
Design Patterns
Offset-based pagination using LIMIT and OFFSET
Cursor-based pagination using unique, ordered keys
Keyset pagination for efficient forward navigation
Caching frequently accessed pages
Handling data changes with snapshot or versioning
Reference Architecture
Client
  |
  | HTTP Request with pagination params
  v
API Server
  |-- Validate pagination params
  |-- Choose pagination method (offset or cursor)
  |-- Query DB with pagination logic
  |-- Return paged data + next cursor or page info
  v
Database
  |-- Indexed on ordering key(s)
  |-- Supports LIMIT/OFFSET and key-based queries
  
Cache (optional)
  |-- Stores popular pages or cursors
  
Components
API Server
REST/GraphQL API
Receives pagination requests, validates parameters, and returns paged data
Database
Relational DB (e.g., PostgreSQL) or NoSQL with indexing
Stores data and supports efficient queries with LIMIT/OFFSET or key-based filters
Cache Layer
Redis or Memcached
Caches frequently requested pages or cursor results to reduce DB load
Pagination Logic Module
Backend service code
Implements offset and cursor pagination algorithms and manages ordering
Request Flow
1. Client sends request with pagination parameters: page number and size for offset, or cursor and size for cursor pagination.
2. API Server validates parameters and determines which pagination method to use.
3. For offset pagination, API Server queries the database using LIMIT and OFFSET clauses.
4. For cursor pagination, API Server queries the database using WHERE clauses on the ordering key greater than the cursor value, with LIMIT.
5. Database returns the requested page of data ordered by the specified key.
6. API Server constructs response including data and next page info: next offset or next cursor value.
7. Response sent back to client for display or further navigation.
8. Optionally, popular pages or cursor results are cached to speed up future requests.
Database Schema
Entity: DataRecord - id (primary key, unique identifier) - created_at (timestamp, indexed for ordering) - other_data_fields Relationships: - None required for pagination; focus is on single large table Indexes: - Index on created_at for cursor pagination - Primary key index on id for uniqueness This schema supports ordering by created_at or id for pagination.
Scaling Discussion
Bottlenecks
Offset pagination becomes slow with large OFFSET values due to scanning and skipping rows
Cursor pagination requires stable ordering keys and may not support backward navigation
Database query load increases with high concurrency and large datasets
Cache invalidation complexity when underlying data changes
API server CPU and memory limits under high request volume
Solutions
Use cursor pagination for large datasets and sequential access to avoid OFFSET performance issues
Implement composite cursors with multiple keys to ensure stable ordering
Add caching layer for popular pages or cursors to reduce DB load
Use read replicas to distribute database read traffic
Implement rate limiting and request batching at API server
Use database partitioning or sharding for very large datasets
Design cache invalidation strategies based on data update patterns
Interview Tips
Time: Spend 10 minutes clarifying requirements and constraints, 20 minutes designing the architecture and data flow, 10 minutes discussing scaling and trade-offs, and 5 minutes summarizing.
Explain difference between offset and cursor pagination with pros and cons
Discuss how ordering and consistency affect pagination design
Describe how the system handles large datasets and high concurrency
Mention caching strategies and database indexing importance
Highlight trade-offs between random access and efficient navigation
Show awareness of real-world challenges like data changes during pagination