| Users | What Changes? |
|---|---|
| 100 users | Single load balancer with sticky sessions enabled; one or two app servers; sessions stored in server memory. |
| 10,000 users | More app servers added; load balancer manages sticky sessions via cookies; session affinity maintained; memory usage grows. |
| 1,000,000 users | Multiple load balancers with consistent hashing or session-aware routing; session replication or centralized session store needed; risk of uneven load due to sticky sessions. |
| 100,000,000 users | Global load balancing with geo-distribution; session data stored in distributed cache or database; sticky sessions cause scaling and failover challenges; need session statelessness or token-based sessions. |
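At the smaller scales above, sticky routing can be as simple as hashing a session cookie to a fixed server pool. A minimal sketch, assuming a hypothetical pool of named app servers (the names and pool are illustrative, not from any real deployment):

```python
import hashlib

# Hypothetical app-server pool; names are illustrative only.
SERVERS = ["app-1", "app-2", "app-3"]

def route(session_id: str, servers: list) -> str:
    """Map the same session id to the same server (cookie-based affinity)."""
    digest = hashlib.sha256(session_id.encode()).hexdigest()
    return servers[int(digest, 16) % len(servers)]
```

The same session id always lands on the same server as long as the pool is unchanged; removing or adding a server remaps most sessions, which is exactly why consistent hashing or an external session store becomes necessary at larger scale.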
## Sticky Sessions in HLD: Scalability & System Analysis
The first bottleneck is application-server memory and the load balancer's ability to maintain session affinity. As users grow, servers fill up with session data and the load balancer struggles to route requests evenly, causing hot spots, uneven load, and potential downtime when a server holding live sessions fails.
- Centralized Session Store: Move sessions to a fast shared cache like Redis to avoid server memory overload and allow any server to handle requests.
- Session Replication: Replicate session data across app servers so any server can take over on failover, at the cost of increased network and memory usage.
- Stateless Sessions: Use tokens (e.g., JWT) so servers do not store session data, removing sticky session dependency.
- Load Balancer Improvements: Use consistent hashing or advanced routing to better distribute load while maintaining session affinity.
- Horizontal Scaling: Add more app servers and load balancers to distribute traffic.
- Geo-Distributed Architecture: Use regional load balancers and session stores to reduce latency and improve availability.
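The stateless-sessions option above can be sketched with a signed, JWT-style token. This is a minimal illustration using only the standard library; the secret, claim names, and TTL are assumptions for the example, and a real system would use a vetted JWT library:

```python
import base64, hashlib, hmac, json, time

SECRET = b"demo-secret"  # assumption: a shared signing key, illustrative only

def issue_token(user_id: str, ttl_s: int = 3600) -> str:
    """Encode session claims in the token itself, so servers store nothing."""
    payload = json.dumps({"sub": user_id, "exp": int(time.time()) + ttl_s}).encode()
    body = base64.urlsafe_b64encode(payload).decode()
    sig = hmac.new(SECRET, body.encode(), hashlib.sha256).hexdigest()
    return body + "." + sig

def verify_token(token: str):
    """Return the claims if the signature is valid and unexpired, else None."""
    body, sig = token.rsplit(".", 1)
    expected = hmac.new(SECRET, body.encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(sig, expected):
        return None  # tampered token
    claims = json.loads(base64.urlsafe_b64decode(body))
    if claims["exp"] < time.time():
        return None  # expired token
    return claims
```

Because every server holding the key can verify the token, the load balancer no longer needs session affinity at all; any server can serve any request.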
At 1M users with 10 requests per user per minute, that is about 166,667 requests per second (10,000,000 / 60). If a single app server can handle roughly 5,000 concurrent requests, about 34 servers are needed (166,667 / 5,000 ≈ 34, before adding headroom for spikes and failover). Session data per user might be ~1KB, so 1M sessions require ~1GB of memory in the cache. Network bandwidth must support both session replication traffic and user requests, potentially hundreds of MB/s. A centralized session store adds operational cost but reduces per-server memory needs.
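The back-of-envelope numbers above can be checked with a few lines; the per-server and per-session figures are the same assumptions stated in the text:

```python
users = 1_000_000
req_per_user_per_min = 10        # assumption from the estimate above
rps = users * req_per_user_per_min / 60
servers = -(-rps // 5_000)       # ceiling division: ~5,000 concurrent requests/server
cache_gb = users * 1 / 1_000_000 # ~1 KB session per user -> GB of cache

print(round(rps), int(servers), cache_gb)  # ≈ 166667 requests/s, 34 servers, 1.0 GB
```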
Start by explaining what sticky sessions are and why they are used. Then discuss how they work at small scale and what breaks as users grow. Identify the bottleneck clearly. Propose solutions step-by-step, weighing pros and cons. Mention trade-offs like complexity vs performance. Finally, discuss how to measure and monitor session-related metrics.
Your database handles 1000 QPS. Traffic grows 10x. What do you do first?
Answer: First add caching and read replicas to absorb read traffic and reduce database load; vertical scaling and sharding come later, since they are costlier and more disruptive.
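The caching step can be illustrated with a tiny read-through cache. This is a sketch, not a production pattern: the "database" is a stand-in dict with a query counter, and the TTL is arbitrary:

```python
import time

class ReadThroughCache:
    """Serve reads from memory; only fall through to the DB on a miss or expiry."""

    def __init__(self, db, ttl_s: float = 60.0):
        self.db = db            # stand-in for the real database
        self.ttl_s = ttl_s
        self.store = {}         # key -> (value, expires_at)
        self.db_hits = 0        # how many queries actually reached the DB

    def get(self, key):
        entry = self.store.get(key)
        if entry and entry[1] > time.time():
            return entry[0]     # cache hit: no DB query
        self.db_hits += 1
        value = self.db[key]    # cache miss: query the DB once
        self.store[key] = (value, time.time() + self.ttl_s)
        return value
```

Repeated reads of a hot key cost one database query instead of many, which is exactly how a cache buys time before sharding is needed.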