You need to design a system that stores and retrieves transaction histories for millions of users efficiently. Which architectural choice best supports fast writes and reads while scaling horizontally?
Think about how to handle large data volumes and many users simultaneously.
Distributed NoSQL databases with sharding allow horizontal scaling and fast access by partitioning data by user ID, which suits transaction history storage.
Your system records 10 million transactions daily. Each transaction record is 1 KB. Estimate the storage needed for 1 year of data, considering 20% overhead for indexing and metadata.
Calculate total raw data size, then add 20% overhead.
10 million transactions/day * 1 KB = 10 GB/day. For 365 days = 3.65 TB raw. Adding 20% overhead: 3.65 TB * 1.2 = 4.38 TB.
Your transaction history system must be highly available but also provide accurate data. Which tradeoff approach is best?
Consider the CAP theorem and the system's need for availability.
Eventual consistency allows the system to remain available during network issues, accepting temporary stale data, which suits high availability needs.
Which set of components is essential for a robust transaction history system that supports querying, storage, and data integrity?
Think about components that ensure scalability, accessibility, and correctness.
A load balancer distributes requests, a distributed database stores data reliably, a query API allows access, and data validation ensures integrity.
Describe the correct sequence of steps when a client requests their transaction history from a distributed system.
Authentication must happen before routing the request.
The client request first reaches the API gateway, which authenticates the user, then routes the request to the correct service, which queries the database and returns data back through the gateway to the client.