Experiment - Load balancing for AI services
Problem: You have an AI service that handles many user requests for predictions. Currently, all requests go to a single server, so responses are slow and some requests fail when traffic is high.
Current Metrics: Average response time: 1200 ms; Request failure rate: 15%; Throughput: 50 requests/second
Issue: A single server handles all inference traffic, so it becomes a bottleneck: latency climbs and requests start failing under high load. Distributing requests across multiple servers with a load balancer should reduce both response time and the failure rate.
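As a starting point for the experiment, a minimal sketch of round-robin load balancing: requests are assigned to a pool of servers in rotating order, so no single server absorbs all the traffic. The server names here are placeholders, and real deployments would typically use a proxy such as NGINX or a cloud load balancer rather than application-level code.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Assign incoming requests to servers in rotating (round-robin) order."""

    def __init__(self, servers):
        if not servers:
            raise ValueError("need at least one server")
        self._pool = cycle(list(servers))

    def next_server(self):
        # Each call returns the next server in the rotation.
        return next(self._pool)


# Hypothetical pool of three AI inference servers.
balancer = RoundRobinBalancer(["ai-server-1", "ai-server-2", "ai-server-3"])
assignments = [balancer.next_server() for _ in range(6)]
print(assignments)
# Six requests spread evenly: each server receives exactly two.
```

Round-robin is the simplest strategy; if the prediction requests vary widely in cost, a least-connections or latency-aware policy would balance load more evenly.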