Design: Spotify Music Streaming Service
This design covers backend microservices, data storage, streaming delivery, and user-facing APIs. It does not cover detailed frontend UI design or DRM encryption specifics.
Functional Requirements
FR1: Allow users to search and play music tracks instantly
FR2: Support personalized playlists and recommendations
FR3: Handle millions of concurrent users streaming music
FR4: Provide offline playback for subscribed users
FR5: Allow users to follow artists and share playlists
FR6: Support multiple devices and platforms (mobile, desktop, web)
FR7: Ensure high availability and low latency for streaming
FR8: Provide analytics on user listening behavior
Non-Functional Requirements
NFR1: Scale to 100 million monthly active users
NFR2: API response time p99 under 200ms for search and playback requests
NFR3: Streaming latency under 1 second from request to playback start
NFR4: Availability target of 99.9% uptime
NFR5: Data consistency for user playlists and subscriptions
NFR6: Support global distribution with data centers in multiple regions
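To make the availability target in NFR4 concrete, the downtime budget it implies can be worked out directly. The sketch below is only illustrative arithmetic; the function name is hypothetical, and a 365-day year and 30-day month are assumed.

```python
def downtime_budget_hours(availability_pct: float, period_hours: float = 365 * 24) -> float:
    """Return the downtime (in hours) allowed by an availability target over a period."""
    return period_hours * (1 - availability_pct / 100)


# 99.9% availability over an assumed 365-day year and 30-day month.
yearly_hours = downtime_budget_hours(99.9)
monthly_minutes = downtime_budget_hours(99.9, period_hours=30 * 24) * 60

print(f"{yearly_hours:.2f} hours/year, {monthly_minutes:.1f} minutes/month")
# prints "8.76 hours/year, 43.2 minutes/month"
```

A budget of under nine hours per year has to absorb planned maintenance as well as incidents, which is one reason multi-region deployment (NFR6) matters.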
Think Before You Design
Key Components
User Service for authentication and profiles
Music Catalog Service for metadata management
Streaming Service for audio delivery
Playlist Service for user playlists
Recommendation Service for personalized suggestions
Search Service for fast music lookup
Analytics Service for user behavior tracking
Content Delivery Network (CDN) for global streaming
Cache layers for metadata and popular tracks
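One common way to build the metadata cache layer is cache-aside: the caller checks the cache first and only falls back to the source of truth on a miss. The following is a minimal in-memory sketch, not a production cache; the class name, the TTL default, and the `fake_catalog_lookup` loader are all assumptions made for illustration.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class CacheAside:
    """Cache-aside lookup with a TTL; the caller supplies the loader for misses."""

    def __init__(self, load_from_source: Callable[[str], Optional[dict]], ttl_seconds: float = 300.0):
        self._load = load_from_source
        self._ttl = ttl_seconds
        self._store: Dict[str, Tuple[float, dict]] = {}  # key -> (expiry time, value)

    def get(self, key: str) -> Optional[dict]:
        entry = self._store.get(key)
        if entry is not None and entry[0] > time.monotonic():
            return entry[1]                      # cache hit
        value = self._load(key)                  # miss or expired: read the source of truth
        if value is not None:
            self._store[key] = (time.monotonic() + self._ttl, value)
        return value

    def invalidate(self, key: str) -> None:
        """Drop a key after the underlying record changes."""
        self._store.pop(key, None)


lookups = []

def fake_catalog_lookup(track_id: str) -> dict:
    # Hypothetical stand-in for a Music Catalog Service query.
    lookups.append(track_id)
    return {"track_id": track_id, "title": "Example Track"}

cache = CacheAside(fake_catalog_lookup)
cache.get("t1")
cache.get("t1")   # served from cache; the loader is not called again
```

In a real deployment the dictionary would typically be replaced by a shared cache such as Redis or Memcached, and writes to the catalog would call `invalidate`.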
Design Patterns
Microservices architecture with API Gateway
Event-driven communication for updates and analytics
CQRS for separating read and write workloads
Cache-aside pattern for metadata caching
Circuit breaker for service resilience
Data partitioning and sharding for scale
Use of CDN for content distribution
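Of the patterns listed, the circuit breaker is the easiest to misdescribe in an interview, so a small sketch may help. This is a simplified version under stated assumptions: the class name, thresholds, and injectable clock are illustrative, and real libraries add features such as per-exception policies and metrics.

```python
import time


class CircuitOpenError(RuntimeError):
    """Raised when the breaker rejects a call without trying the downstream service."""


class CircuitBreaker:
    def __init__(self, failure_threshold=3, reset_timeout=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self._clock = clock
        self._failures = 0
        self._opened_at = None  # None means the circuit is closed

    def call(self, func, *args, **kwargs):
        if self._opened_at is not None:
            if self._clock() - self._opened_at < self.reset_timeout:
                raise CircuitOpenError("circuit open; failing fast")
            # Timeout elapsed: half-open, so let this one trial call through.
        try:
            result = func(*args, **kwargs)
        except Exception:
            self._failures += 1
            if self._opened_at is not None or self._failures >= self.failure_threshold:
                self._opened_at = self._clock()  # open (or re-open) the circuit
            raise
        # Success closes the circuit and clears the failure count.
        self._failures = 0
        self._opened_at = None
        return result
```

A caller would wrap each downstream request, for example `breaker.call(fetch_recommendations, user_id)` where `fetch_recommendations` is whatever client function it already uses, and treat `CircuitOpenError` as a signal to serve a fallback such as cached recommendations.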
Reference Architecture
                      +------------------------+
                      |      User Devices      |
                      | (Mobile, Desktop, Web) |
                      +-----------+------------+
                                  |
                                  v
                      +------------------------+
                      |      API Gateway       |
                      +-----------+------------+
                                  |
         +------------------------+------------------------+
         |                        |                        |
+----------------+       +----------------+       +----------------+
|  User Service  |       | Search Service |       |  Playlist Svc  |
+----------------+       +----------------+       +----------------+
         |                        |                        |
         v                        v                        v
+----------------+       +----------------+       +----------------+
| Auth & Profile |       | Music Catalog  |       | Recommendation |
+----------------+       +----------------+       +----------------+
         |                        |                        |
         +-----------+------------+------------+-----------+
                     |                         |
                     v                         v
            +----------------+        +----------------+
            | Streaming Svc  |        | Analytics Svc  |
            +----------------+        +----------------+
                     |
                     v
            +----------------+
            |Content Delivery|
            | Network (CDN)  |
            +----------------+
Components
API Gateway (Nginx / Envoy): Entry point for all client requests, routes to appropriate microservices, handles authentication and rate limiting
User Service (Spring Boot / Node.js microservice): Manages user authentication, profiles, subscriptions, and device management
Search Service (Elasticsearch): Provides fast search capabilities over music metadata
Music Catalog Service (PostgreSQL / Cassandra): Stores music metadata like tracks, albums, artists, genres
Playlist Service (MongoDB / DynamoDB): Manages user playlists, sharing, and collaborative editing
Recommendation Service (Python microservice with ML models): Generates personalized music recommendations based on user behavior
Streaming Service (Custom streaming servers with HLS/DASH): Delivers audio streams to users with low latency
Analytics Service (Kafka + Hadoop / Spark): Collects and processes user listening data for insights and recommendations
Content Delivery Network (CDN) (Akamai / Cloudflare): Caches and delivers audio content globally to reduce latency
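The API Gateway is described as handling rate limiting. One widely used approach is a token bucket per client; the sketch below shows the idea under assumptions of its own (class name, parameters, and the injectable clock are illustrative, and this is not how Nginx or Envoy is configured in practice).

```python
import time


class TokenBucket:
    """Allow bursts up to `capacity` requests, refilled at `rate_per_sec` tokens per second."""

    def __init__(self, rate_per_sec: float, capacity: float, clock=time.monotonic):
        self.rate = rate_per_sec
        self.capacity = capacity
        self._clock = clock
        self._tokens = float(capacity)   # start with a full bucket
        self._last = clock()

    def allow(self, cost: float = 1.0) -> bool:
        now = self._clock()
        # Refill according to elapsed time, never exceeding capacity.
        self._tokens = min(self.capacity, self._tokens + (now - self._last) * self.rate)
        self._last = now
        if self._tokens >= cost:
            self._tokens -= cost
            return True
        return False                     # caller would typically respond with HTTP 429
```

A gateway would keep one bucket per user or API key, for example in a dictionary keyed by user ID on a single node or in a shared store when several gateway instances run behind a load balancer.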
Request Flow
1. User opens app and authenticates via API Gateway to User Service
2. User searches for a song; request routed to Search Service querying Elasticsearch
3. Search Service fetches metadata from Music Catalog Service if needed
4. User selects a track to play; Streaming Service is requested via API Gateway
5. Streaming Service fetches audio files from CDN or origin storage
6. User creates or modifies playlists via Playlist Service
7. Recommendation Service updates suggestions based on user activity events
8. Analytics Service consumes event streams for user behavior analysis
9. CDN caches popular audio content close to user location for fast delivery
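Steps 1, 2, 4, and 6 all pass through the API Gateway, which has to decide which service owns each request. A rough sketch of prefix-based routing follows; the URL paths and service names are hypothetical, chosen only to mirror the flow above, and the path is assumed to exclude any query string.

```python
# Hypothetical route table: path prefix -> backend service name.
ROUTES = {
    "/auth": "user-service",
    "/search": "search-service",
    "/stream": "streaming-service",
    "/playlists": "playlist-service",
}


def route(path: str) -> str:
    """Return the backend service for a request path using longest-prefix matching."""
    matches = [prefix for prefix in ROUTES if path == prefix or path.startswith(prefix + "/")]
    if not matches:
        raise LookupError(f"no route for {path}")
    return ROUTES[max(matches, key=len)]


print(route("/auth/login"))             # prints "user-service"
print(route("/search/tracks"))          # prints "search-service"
print(route("/stream/track-42"))        # prints "streaming-service"
print(route("/playlists/7/tracks"))     # prints "playlist-service"
```

Matching on whole path segments (prefix plus "/") avoids accidentally routing a path such as "/searching" to the search service.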
Database Schema
Entities:
- User: user_id (PK), email, password_hash, subscription_status, created_at
- Device: device_id (PK), user_id (FK), device_type, last_active
- Track: track_id (PK), title, artist_id (FK), album_id (FK), duration, genre
- Artist: artist_id (PK), name, bio
- Album: album_id (PK), title, artist_id (FK), release_date
- Playlist: playlist_id (PK), user_id (FK), name, is_public
- Playlist_Track: playlist_id (FK), track_id (FK), position
- Listening_Event: event_id (PK), user_id (FK), track_id (FK), timestamp, device_id (FK)
Relationships:
- User to Device: 1 to many
- Artist to Album: 1 to many
- Album to Track: 1 to many
- User to Playlist: 1 to many
- Playlist to Track: many to many via Playlist_Track
- User to Listening_Event: 1 to many
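The many-to-many relationship between playlists and tracks can be tried out with an in-memory SQLite database. This sketch covers only a subset of the entities above (artist and album references are omitted for brevity); the column types, the UNIQUE constraint on email, and the composite primary key on Playlist_Track are assumptions, since the schema above does not specify them.

```python
import sqlite3

SCHEMA = """
CREATE TABLE user (
    user_id INTEGER PRIMARY KEY,
    email TEXT NOT NULL UNIQUE,
    subscription_status TEXT NOT NULL
);
CREATE TABLE track (
    track_id INTEGER PRIMARY KEY,
    title TEXT NOT NULL,
    duration INTEGER NOT NULL              -- seconds (assumed unit)
);
CREATE TABLE playlist (
    playlist_id INTEGER PRIMARY KEY,
    user_id INTEGER NOT NULL REFERENCES user(user_id),
    name TEXT NOT NULL,
    is_public INTEGER NOT NULL DEFAULT 0
);
CREATE TABLE playlist_track (
    playlist_id INTEGER NOT NULL REFERENCES playlist(playlist_id),
    track_id INTEGER NOT NULL REFERENCES track(track_id),
    position INTEGER NOT NULL,
    PRIMARY KEY (playlist_id, position)    -- assumed: one track per position
);
"""

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")   # SQLite does not enforce foreign keys by default
conn.executescript(SCHEMA)

conn.execute("INSERT INTO user VALUES (1, 'a@example.com', 'premium')")
conn.executemany("INSERT INTO track VALUES (?, ?, ?)", [(10, "Song A", 180), (11, "Song B", 240)])
conn.execute("INSERT INTO playlist VALUES (100, 1, 'Morning', 0)")
conn.executemany("INSERT INTO playlist_track VALUES (100, ?, ?)", [(11, 1), (10, 2)])

# Read a playlist back in order by joining through the link table.
rows = conn.execute(
    "SELECT t.title FROM playlist_track pt "
    "JOIN track t ON t.track_id = pt.track_id "
    "WHERE pt.playlist_id = ? ORDER BY pt.position",
    (100,),
).fetchall()
print(rows)  # prints [('Song B',), ('Song A',)]
```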
Scaling Discussion
Bottlenecks
API Gateway becoming a single point of failure under high load
Search Service latency increasing with growing music catalog
Streaming Service bandwidth and server capacity limits
Database write contention on Playlist and Listening_Event tables
Recommendation Service model training and inference delays
Analytics pipeline lag with large event volumes
Solutions
Deploy multiple API Gateway instances behind a load balancer with health checks
Partition search index by genre or region; use replicas for read scaling
Use CDN aggressively to offload streaming traffic; autoscale streaming servers
Shard databases by user ID; use write-optimized stores for event data
Use incremental and distributed model training; cache recommendations
Implement real-time streaming analytics with scalable frameworks like Apache Flink
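Sharding by user ID usually comes down to a stable function from user ID to shard number. The sketch below uses a hash followed by a modulo; the function name and shard count are illustrative assumptions.

```python
import hashlib
from collections import Counter


def shard_for_user(user_id: str, num_shards: int) -> int:
    """Map a user ID to a shard index in [0, num_shards) using a stable hash."""
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


# Spread 10,000 synthetic user IDs across 8 shards and count how many land on each.
distribution = Counter(shard_for_user(f"user-{i}", 8) for i in range(10_000))
print(sorted(distribution.items()))
```

A cryptographic hash is used here rather than Python's built-in `hash`, which is randomized per process for strings and so would not give a stable mapping. Plain modulo sharding remaps most keys when the shard count changes; consistent hashing is a common alternative when shards are added or removed often.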
Interview Tips
Time: Spend 10 minutes clarifying requirements and constraints, 20 minutes designing components and data flow, 10 minutes discussing scaling and trade-offs, 5 minutes summarizing.
Explain microservices choice for modularity and independent scaling
Discuss caching and CDN use to reduce latency for streaming
Highlight separation of concerns: search, streaming, recommendations
Mention data consistency and eventual consistency trade-offs
Address global distribution and multi-region deployment
Talk about monitoring, logging, and fault tolerance strategies
Practice
1. What is the main reason Spotify uses microservices in its architecture?
easy
A. To avoid using APIs between components
B. To separate different tasks for better scalability and maintenance
C. To make the app use less memory on devices
D. To reduce the number of servers needed
Solution
Step 1: Understand microservices purpose
Microservices split an app into small parts, each handling a specific task.
Step 2: Connect to Spotify's needs
Spotify uses this to make the app scalable and easier to maintain by isolating tasks.
Final Answer:
To separate different tasks for better scalability and maintenance -> Option B
Quick Check:
Microservices = Separate tasks for scalability [OK]
Hint: Microservices split tasks for easier scaling and updates [OK]
2. Which communication method is commonly used between Spotify's microservices?
easy
A. APIs and message queues
B. FTP file transfers
C. Shared memory
D. Direct database access
Solution
Step 1: Identify common microservice communication
Microservices usually communicate via APIs or message queues for loose coupling.
Step 2: Match with Spotify's design
Spotify uses APIs and message queues to keep services independent and responsive.
Final Answer:
APIs and message queues -> Option A
Quick Check:
Microservices communicate via APIs/message queues [OK]
Hint: Microservices talk via APIs or message queues, not direct DB [OK]
Common Mistakes:
Choosing direct database access which breaks service independence
Selecting shared memory which is uncommon in distributed systems
Picking FTP which is unrelated to microservice communication
3. Consider a microservice that handles user playlists. If it receives a request to add a song, what is the likely flow in Spotify's architecture?
medium
A. The playlist service waits for the user to refresh the app manually
B. The playlist service directly modifies the recommendation service's database
C. The playlist service sends the request to the user interface to update
D. The playlist service updates its database and sends a message to the recommendation service
Solution
Step 1: Understand service responsibilities
The playlist service manages playlists and updates its own data store.
Step 2: Recognize inter-service communication
After updating, it informs other services like recommendations via messages.
Final Answer:
The playlist service updates its database and sends a message to the recommendation service -> Option D
Quick Check:
Playlist service updates DB + notifies others [OK]
Hint: Services update own data, notify others via messages [OK]
Common Mistakes:
Assuming direct DB access across services
Thinking UI triggers backend updates
Believing manual refresh is needed for updates
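The flow in the answer (update your own store, then notify others with a message) can be sketched in a few lines. Here Python's in-process `queue.Queue` stands in for a real message broker, and both service classes and the event shape are hypothetical simplifications.

```python
import queue


class PlaylistService:
    """Owns playlist data; other services learn about changes only through events."""

    def __init__(self, events: queue.Queue):
        self._playlists = {}
        self._events = events

    def add_track(self, playlist_id: str, track_id: str) -> None:
        # 1. Update this service's own data store.
        self._playlists.setdefault(playlist_id, []).append(track_id)
        # 2. Publish an event instead of touching another service's database.
        self._events.put({"type": "track_added", "playlist_id": playlist_id, "track_id": track_id})


class RecommendationService:
    """Consumes events and keeps its own view of user activity."""

    def __init__(self):
        self.seen_tracks = set()

    def handle(self, event: dict) -> None:
        if event["type"] == "track_added":
            self.seen_tracks.add(event["track_id"])


events = queue.Queue()
playlists = PlaylistService(events)
recommendations = RecommendationService()

playlists.add_track("p1", "track-42")

# In production a consumer would run continuously; here the queue is simply drained once.
while not events.empty():
    recommendations.handle(events.get())
```

The playlist service never imports or calls the recommendation service, which is the loose coupling that options B and C in the question violate.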
4. A developer notices that Spotify's microservices sometimes fail to update user data consistently. What is a likely cause in the architecture?
medium
A. APIs are synchronous, causing delays
B. Services are directly sharing the same database without coordination
C. Message queues are not used, causing lost updates
D. Microservices are deployed on the same server
Solution
Step 1: Identify cause of inconsistent updates
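Without a durable message queue, an update sent while the receiving service is down or overloaded can be lost instead of retried.
Step 2: Understand Spotify's architecture best practices
Message queues hold each event until a consumer acknowledges it, so every service eventually applies the update and the data converges.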
Final Answer:
Message queues are not used, causing lost updates -> Option C
Quick Check:
Missing message queues = lost updates [OK]
Hint: Lost updates often mean missing message queues [OK]
Common Mistakes:
Blaming shared database without evidence
Confusing synchronous APIs with update loss
Assuming deployment location causes data inconsistency
5. Spotify wants to add a new feature that recommends songs based on live user activity. Which architectural change fits best with their microservices approach?
hard
A. Create a new recommendation microservice that consumes live activity events via message queues
B. Add the recommendation logic directly inside the user interface code
C. Store all live activity data in a single monolithic database accessed by all services
D. Use FTP to transfer live activity logs to the recommendation service hourly
Solution
Step 1: Identify best practice for new feature in microservices
Adding a new microservice keeps responsibilities separate and scalable.
Step 2: Use message queues for live data
Consuming live events via message queues fits asynchronous, decoupled design.
Final Answer:
Create a new recommendation microservice that consumes live activity events via message queues -> Option A
Quick Check:
New microservice + message queues = best fit [OK]
Hint: New features get own microservice, use message queues for live data [OK]