| Scale | Users | Transactions per Second (TPS) | Key Changes |
|---|---|---|---|
| Small | 100 users | 1-5 TPS | Single payment gateway, single app server, simple DB setup |
| Medium | 10,000 users | 50-200 TPS | Multiple app servers behind load balancer, payment gateway failover, DB read replicas |
| Large | 1,000,000 users | 1,000-5,000 TPS | Horizontal scaling of app servers, sharded DB, caching, multiple payment gateways, async processing |
| Very Large | 100,000,000 users | 50,000+ TPS | Microservices architecture, global load balancing, multi-region DB clusters, advanced fraud detection, CDN for static content |
Payment integration architecture in HLD - Scalability & System Analysis
At small to medium scale, the database is the first bottleneck. Payment transactions require strong consistency and ACID properties, so the DB must handle many writes and reads reliably. As TPS grows beyond a few thousand, the DB can become overwhelmed by write locks and transaction volume.
At larger scales, the application servers handling payment processing and communication with external payment gateways become bottlenecks due to CPU and network limits.
- Database scaling: Use read replicas for read-heavy queries, implement sharding by user or transaction ID to distribute writes, and optimize indexes.
- Application scaling: Horizontally scale app servers behind load balancers to handle more concurrent payment requests.
- Caching: Cache non-sensitive data like payment method metadata to reduce DB load.
- Payment gateway: Integrate multiple payment gateways with failover and load balancing to avoid single points of failure.
- Asynchronous processing: Use message queues for tasks that do not need to block the payment response, such as sending receipts or running follow-up fraud checks, to reduce request latency.
- Security and compliance: Ensure PCI DSS compliance and encrypt sensitive data to maintain trust and avoid penalties.
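
Two of the ideas above, sharding by user ID and payment gateway failover, can be sketched in a few lines of Python. This is a minimal illustration, not a production design: the shard count, the `GatewayError` exception, the stub gateway functions, and the `(amount_cents, idempotency_key)` call shape are all assumptions made for the example and do not correspond to any real gateway SDK.

```python
import hashlib

NUM_SHARDS = 8  # assumed shard count, for illustration only


def shard_for(user_id: str, num_shards: int = NUM_SHARDS) -> int:
    """Map a user ID to a shard index using a stable hash."""
    # hashlib produces the same digest in every process, unlike the
    # built-in hash(), which is randomized per process for strings.
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


class GatewayError(Exception):
    """Hypothetical error meaning the gateway did NOT take the charge."""


def flaky_gateway(amount_cents: int, idempotency_key: str) -> dict:
    """Stub gateway that is always down (stands in for a real client)."""
    raise GatewayError("primary gateway unavailable")


def healthy_gateway(amount_cents: int, idempotency_key: str) -> dict:
    """Stub gateway that always succeeds (stands in for a real client)."""
    return {"status": "ok", "amount_cents": amount_cents, "key": idempotency_key}


def charge_with_failover(gateways, amount_cents: int, idempotency_key: str):
    """Try each (name, charge_fn) pair in order; return the first success."""
    last_error = None
    for name, charge in gateways:
        try:
            return name, charge(amount_cents, idempotency_key)
        except GatewayError as exc:
            # Only fail over on errors known to mean "not charged".
            # An ambiguous timeout could otherwise lead to a double charge.
            last_error = exc
    raise GatewayError("all gateways failed") from last_error


if __name__ == "__main__":
    print("user-1001 -> shard", shard_for("user-1001"))
    gateways = [("primary", flaky_gateway), ("backup", healthy_gateway)]
    print(charge_with_failover(gateways, 2500, "order-42"))
```

Note that modulo-based sharding forces most keys to move when the shard count changes; schemes such as consistent hashing or a lookup table reduce that cost, at the price of extra complexity.
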
- At 1,000 TPS, expect ~86 million transactions per day.
- Each transaction record ~1 KB, so daily storage ~86 GB; monthly (30 days) ~2.6 TB.
- Network bandwidth depends on payload size; assume 2 KB per transaction -> 2 MB/s at 1,000 TPS.
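- Database must handle ~1,000 writes/sec plus reads; as a rough rule of thumb, a well-tuned single PostgreSQL instance can handle on the order of ~5,000 QPS, though the real figure depends heavily on hardware, schema, and query mix.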
- App servers: each can handle ~1,000 concurrent connections; scale horizontally as needed.
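
The back-of-the-envelope figures above can be reproduced with a short script. This is a sketch under the same assumptions as the list (1 KB per stored record, 2 KB per request payload), using decimal units (1 KB = 1,000 bytes) and a 30-day month; the function name and parameters are made up for this example.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def estimate_capacity(tps, record_bytes=1_000, payload_bytes=2_000, days_per_month=30):
    """Rough daily/monthly storage and bandwidth for a sustained TPS."""
    tx_per_day = tps * SECONDS_PER_DAY
    storage_gb_per_day = tx_per_day * record_bytes / 1e9
    return {
        "transactions_per_day": tx_per_day,
        "storage_gb_per_day": storage_gb_per_day,
        "storage_tb_per_month": storage_gb_per_day * days_per_month / 1e3,
        "bandwidth_mb_per_s": tps * payload_bytes / 1e6,
    }


if __name__ == "__main__":
    for key, value in estimate_capacity(1_000).items():
        print(f"{key}: {value:,.2f}")
```

At 1,000 TPS this gives 86.4 million transactions and about 86.4 GB per day, roughly 2.6 TB per month, and 2 MB/s of payload traffic. These numbers assume the peak rate is sustained around the clock, so they are an upper bound for a system whose average load is lower than its peak.
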
Start by clarifying the expected transaction volume and latency requirements. Then identify the main components: app servers, database, payment gateways. Discuss bottlenecks at each scale and propose targeted solutions like caching, sharding, and horizontal scaling. Emphasize security and compliance. Use real numbers to justify your choices.
Your database handles 1,000 QPS. Traffic grows 10x to 10,000 QPS. What do you do first?
Answer: Start by adding read replicas to offload read queries from the primary, then implement sharding to distribute the write load. Caching and asynchronous processing further reduce DB pressure.
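
To see why read replicas are usually the first move, it helps to split the load into reads and writes. The sketch below is a deliberately simplified model: the 80% read share is an assumed workload mix, the 5,000 QPS per-node capacity reuses the rough single-instance figure from the estimates, and the model ignores replication overhead and headroom. The function name is hypothetical.

```python
import math


def plan_db_nodes(total_qps, read_fraction, node_capacity_qps):
    """Estimate read replicas and write shards for a given load split."""
    read_qps = total_qps * read_fraction
    write_qps = total_qps - read_qps
    return {
        "read_qps": read_qps,
        "write_qps": write_qps,
        # Replicas absorb all reads in this simplified model.
        "read_replicas": math.ceil(read_qps / node_capacity_qps),
        # Writes stay on one primary until they exceed its capacity.
        "write_shards": max(1, math.ceil(write_qps / node_capacity_qps)),
    }


if __name__ == "__main__":
    print(plan_db_nodes(total_qps=10_000, read_fraction=0.8, node_capacity_qps=5_000))
```

Under these assumptions, 10,000 QPS splits into 8,000 reads and 2,000 writes: two read replicas cover the reads, and a single primary still has room for the writes, so sharding can wait. A write-heavy mix would change that conclusion, which is why clarifying the read/write ratio comes before picking a remedy.
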
