| Scale | Users | Key Changes |
|---|---|---|
| Small | 100 users | Single microservice instances, simple DB, minimal caching, direct client-server communication |
| Medium | 10,000 users | Multiple microservice instances, load balancers, caching layers (Redis), read replicas for DB, CDN for static content |
| Large | 1 million users | Horizontal scaling of microservices, sharded databases, distributed caches, advanced CDN usage, message queues for async tasks |
| Very Large | 100 million users | Global data centers, geo-distributed microservices, multi-region DB clusters with sharding, heavy use of CDNs, event-driven architecture, autoscaling |
Spotify architecture overview in Microservices - Scalability & System Analysis
At around 10,000 to 100,000 concurrent users, the database becomes the first bottleneck. Spotify's metadata and user data queries increase, causing latency and throughput issues. The single database instance struggles with read/write loads, especially for personalized playlists and recommendations.
- Database Scaling: Use read replicas to offload read queries, and shard user data by region or user ID to distribute load.
- Caching: Implement Redis or Memcached to cache frequently accessed data like playlists and song metadata.
- Microservices: Horizontally scale microservices behind load balancers to handle increased API requests.
- CDN: Use Content Delivery Networks to serve static content like album art and audio files closer to users, reducing latency and bandwidth usage.
- Message Queues: Use Kafka or RabbitMQ for asynchronous processing like recommendations and analytics to smooth peak loads.
- Global Distribution: Deploy services and databases in multiple regions to reduce latency and improve fault tolerance.
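As a rough illustration of the sharding and caching items above, here is a minimal in-memory Python sketch. The names `shard_for_user` and `PlaylistStore` are hypothetical, plain dicts stand in for the database shards and for Redis or Memcached, and none of this reflects Spotify's actual implementation.

```python
import hashlib


def shard_for_user(user_id: str, num_shards: int) -> int:
    """Map a user ID to a shard index with a hash that is stable across processes."""
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


class PlaylistStore:
    """Hypothetical store: dicts stand in for DB shards and for the cache."""

    def __init__(self, num_shards: int) -> None:
        self.shards = [dict() for _ in range(num_shards)]
        self.cache = {}
        self.db_reads = 0  # counts how often a read falls through to a shard

    def save(self, user_id: str, playlist: list) -> None:
        shard = self.shards[shard_for_user(user_id, len(self.shards))]
        shard[user_id] = playlist
        # Invalidate so the next read does not return stale data.
        self.cache.pop(user_id, None)

    def load(self, user_id: str):
        # Cache-aside: try the cache first, fall back to the owning shard.
        if user_id in self.cache:
            return self.cache[user_id]
        self.db_reads += 1
        shard = self.shards[shard_for_user(user_id, len(self.shards))]
        playlist = shard.get(user_id)
        if playlist is not None:
            self.cache[user_id] = playlist
        return playlist
```

In this sketch a repeated `load` for the same user touches the shard only once; later reads are served from the cache until `save` invalidates the entry.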
At 1 million users, assuming 10% active concurrently, about 100,000 concurrent connections need handling.
- API requests: ~500,000 QPS (assuming 5 requests/user/second peak)
- Database: Needs to handle ~50,000 QPS (writes + reads), requiring sharding and replicas
- Cache: Must support ~200,000 ops/sec for hot data
- Bandwidth: Audio streaming at 160 kbps per user, with all 100,000 concurrent users streaming -> ~16 Gbps total bandwidth
- Storage: Petabytes of audio files stored across distributed object storage
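The arithmetic behind the concurrency, API, and bandwidth figures above can be checked with a short Python sketch. The function name `estimate_capacity` and its parameter names are hypothetical. The inputs are the assumptions stated in the list (10% concurrency, 5 requests per user per second, 160 kbps per stream), and the bandwidth line assumes every concurrent user is streaming.

```python
def estimate_capacity(total_users: int = 1_000_000,
                      concurrent_percent: int = 10,
                      requests_per_user_per_sec: int = 5,
                      audio_kbps: int = 160) -> dict:
    """Back-of-the-envelope capacity numbers from the stated assumptions."""
    concurrent_users = total_users * concurrent_percent // 100
    api_qps = concurrent_users * requests_per_user_per_sec
    # 1 Gbps = 1,000,000 kbps
    bandwidth_gbps = concurrent_users * audio_kbps / 1_000_000
    return {
        "concurrent_users": concurrent_users,
        "api_qps": api_qps,
        "bandwidth_gbps": bandwidth_gbps,
    }


estimate = estimate_capacity()
print(estimate)
```

Changing any single input, for example a lower per-user request rate, shows how sensitive the API and bandwidth totals are to that assumption.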
Start by outlining Spotify's core components: user service, music catalog, streaming service, recommendation engine. Discuss scaling each component separately. Identify bottlenecks like DB and bandwidth early. Propose solutions like caching, sharding, and CDNs. Always justify why a solution fits the bottleneck. Use real numbers to show understanding.
Your database handles 1000 QPS. Traffic grows 10x. What do you do first?
Answer: Add read replicas to distribute read queries and reduce load on the primary database. Also, implement caching for frequent queries to reduce DB hits.
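One way to picture the read-replica part of this answer is a small router that sends reads to replicas and everything else to the primary. This is a simplified sketch: `ReplicaRouter` is a hypothetical name, the connection targets are plain strings, and treating any statement that starts with `SELECT` as a read is an assumption that real systems refine (for example, to handle replication lag or reads inside transactions).

```python
import itertools


class ReplicaRouter:
    """Hypothetical router: reads rotate across replicas, writes go to the primary."""

    def __init__(self, primary: str, replicas: list) -> None:
        self.primary = primary
        # Round-robin over replicas; None means there are no replicas yet.
        self._replicas = itertools.cycle(replicas) if replicas else None

    def route(self, sql: str) -> str:
        is_read = sql.lstrip().upper().startswith("SELECT")
        if is_read and self._replicas is not None:
            return next(self._replicas)
        return self.primary


router = ReplicaRouter("primary-db", ["replica-1", "replica-2"])
print(router.route("SELECT name FROM playlists"))   # goes to a replica
print(router.route("UPDATE playlists SET name = 'x'"))  # goes to the primary
```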
Practice
Solution
Step 1: Understand microservices purpose
Microservices split an app into small parts, each handling a specific task.
Step 2: Connect to Spotify's needs
Spotify uses this to make the app scalable and easier to maintain by isolating tasks.
Final Answer:
To separate different tasks for better scalability and maintenance -> Option B
Quick Check:
Microservices = Separate tasks for scalability [OK]
- Thinking microservices reduce memory usage directly
- Believing microservices avoid APIs
- Assuming microservices reduce server count
Solution
Step 1: Identify common microservice communication
Microservices usually communicate via APIs or message queues for loose coupling.
Step 2: Match with Spotify's design
Spotify uses APIs and message queues to keep services independent and responsive.
Final Answer:
APIs and message queues -> Option A
Quick Check:
Microservices communicate via APIs/message queues [OK]
- Choosing direct database access which breaks service independence
- Selecting shared memory which is uncommon in distributed systems
- Picking FTP which is unrelated to microservice communication
Solution
Step 1: Understand service responsibilities
The playlist service manages playlists and updates its own data store.
Step 2: Recognize inter-service communication
After updating, it informs other services like recommendations via messages.
Final Answer:
The playlist service updates its database and sends a message to the recommendation service -> Option D
Quick Check:
Playlist service updates DB + notifies others [OK]
- Assuming direct DB access across services
- Thinking UI triggers backend updates
- Believing manual refresh is needed for updates
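The flow in this solution (the playlist service writes to its own store, then notifies the recommendation service through a message) can be sketched in Python as follows. The class names, the event shape, and the use of the standard library `queue.Queue` as an in-process stand-in for a broker such as Kafka or RabbitMQ are all assumptions made for illustration.

```python
import queue


class PlaylistService:
    """Owns playlist data; other services never read this store directly."""

    def __init__(self, events: queue.Queue) -> None:
        self._playlists = {}
        self._events = events

    def add_track(self, user_id: str, track_id: str) -> None:
        # 1. Update the service's own data store.
        self._playlists.setdefault(user_id, []).append(track_id)
        # 2. Publish an event instead of calling other services' databases.
        self._events.put(
            {"type": "playlist_updated", "user_id": user_id, "track_id": track_id}
        )


class RecommendationService:
    """Learns about playlist changes only through events."""

    def __init__(self, events: queue.Queue) -> None:
        self._events = events
        self.seen_tracks = {}

    def drain(self) -> int:
        """Process all pending events and return how many were handled."""
        handled = 0
        while True:
            try:
                event = self._events.get_nowait()
            except queue.Empty:
                return handled
            if event["type"] == "playlist_updated":
                self.seen_tracks.setdefault(event["user_id"], set()).add(
                    event["track_id"]
                )
            handled += 1
```

Because the two services share only the queue, either one could be redeployed or scaled without the other knowing about its storage.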
Solution
Step 1: Identify cause of inconsistent updates
Without message queues, updates may be lost or not delivered reliably.
Step 2: Understand Spotify's architecture best practices
Spotify uses message queues to ensure reliable communication and consistency.
Final Answer:
Message queues are not used, causing lost updates -> Option C
Quick Check:
Missing message queues = lost updates [OK]
- Blaming shared database without evidence
- Confusing synchronous APIs with update loss
- Assuming deployment location causes data inconsistency
Solution
Step 1: Identify best practice for new feature in microservices
Adding a new microservice keeps responsibilities separate and scalable.
Step 2: Use message queues for live data
Consuming live events via message queues fits asynchronous, decoupled design.
Final Answer:
Create a new recommendation microservice that consumes live activity events via message queues -> Option A
Quick Check:
New microservice + message queues = best fit [OK]
- Embedding logic in UI breaks separation
- Using monolithic DB reduces scalability
- FTP is outdated and slow for live data
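A new recommendation microservice of the kind described in this answer might look roughly like the Python sketch below. Everything here is hypothetical: the `LiveRecommendationService` name, the `track_played` event type, the "most played tracks" ranking, and the `deque` that stands in for a real message queue subscription.

```python
from collections import Counter, defaultdict, deque


class LiveRecommendationService:
    """Hypothetical consumer that ranks tracks by how often a user played them."""

    def __init__(self) -> None:
        self._plays = defaultdict(Counter)

    def handle(self, event: dict) -> None:
        # Ignore event types this service does not care about.
        if event.get("type") == "track_played":
            self._plays[event["user_id"]][event["track_id"]] += 1

    def top_tracks(self, user_id: str, n: int = 3) -> list:
        return [track for track, _ in self._plays[user_id].most_common(n)]


# A deque stands in for the live activity stream delivered by the broker.
activity_queue = deque([
    {"type": "track_played", "user_id": "u1", "track_id": "t1"},
    {"type": "track_played", "user_id": "u1", "track_id": "t2"},
    {"type": "track_played", "user_id": "u1", "track_id": "t1"},
    {"type": "playlist_updated", "user_id": "u1", "track_id": "t9"},
])

service = LiveRecommendationService()
while activity_queue:
    service.handle(activity_queue.popleft())

print(service.top_tracks("u1", n=1))
```

The service keeps its own aggregated state and depends only on the event format, which is what lets it be added without changing the existing services.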
