Which of the following is the primary advantage of deploying a machine learning model via an API?
Think about how APIs let different programs talk to each other.
APIs provide a way for applications to send data to the model and get predictions without exposing the model's internal code or requiring local installation.
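This idea can be sketched as a framework-agnostic prediction endpoint. The names `model` and `predict_handler` below are illustrative assumptions, not part of any specific library; in a real deployment the handler would be wrapped by a web framework such as Flask or FastAPI.

```python
# Minimal sketch of a prediction endpoint's logic (hypothetical names).

def model(features):
    # Stand-in for a trained model: returns one score per input value.
    return [x * 0.1 for x in features]

def predict_handler(request_json):
    # The client only sends JSON and receives JSON back; the model's
    # internals are never exposed and nothing is installed client-side.
    features = request_json["input"]
    return {"prediction": model(features)}

result = predict_handler({"input": [1, 2, 3]})
```

The client's view is just the request/response contract: send `{"input": [...]}`, receive `{"prediction": [...]}`.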
Given the following Python code snippet calling a deployed ML model API, what is the printed output?
import requests

response = requests.post('https://api.example.com/predict', json={'input': [1, 2, 3]})
print(response.json())
Assume the API is working correctly and returns prediction probabilities.
The API returns the prediction probabilities as a JSON body, and response.json() parses that body into a Python object, so the printed output is a dictionary, such as one with a 'prediction' key mapping to a list of probability floats.
When deploying a machine learning model via an API, which timeout setting is most important to ensure good user experience without overloading the server?
Think about balancing responsiveness and server health.
A low (short) timeout rejects slow requests early, keeping the server responsive to other clients and preventing stuck or slow calls from accumulating and overloading it.
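The server-side half of this trade-off can be sketched with a per-request time budget. The function names and the 0.05 s budget below are illustrative assumptions; the point is that a slow prediction is cut off early instead of tying up the worker indefinitely. (On the client side, the analogous knob is `requests.post(..., timeout=...)`.)

```python
# Sketch: enforcing a per-request timeout so slow predictions fail fast.
import time
from concurrent.futures import ThreadPoolExecutor, TimeoutError

def slow_predict(features):
    time.sleep(0.3)          # simulate a stuck or slow model call
    return {"prediction": features}

def predict_with_timeout(features, budget_s=0.05):
    with ThreadPoolExecutor(max_workers=1) as pool:
        future = pool.submit(slow_predict, features)
        try:
            return future.result(timeout=budget_s)
        except TimeoutError:
            # Reject early; the client gets a fast, explicit failure
            # instead of a hung connection.
            return {"error": "prediction timed out"}

result = predict_with_timeout([1, 2, 3])
```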
You deployed a model via API and collected these latency times (in milliseconds) for 5 requests: [120, 150, 130, 200, 170]. What is the average latency?
Calculate the sum of all latencies and divide by the number of requests.
Sum is 120+150+130+200+170 = 770 ms. Divide by 5 requests = 154 ms average latency.
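The arithmetic above can be verified with a one-liner; the latency list is taken directly from the question.

```python
# Average latency over the 5 recorded requests (values from the question).
latencies_ms = [120, 150, 130, 200, 170]
average_ms = sum(latencies_ms) / len(latencies_ms)
# average_ms is 154.0
```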
After deploying your ML model as an API, clients report a 500 Internal Server Error when sending valid requests. Which of the following is the most likely cause?
500 errors usually mean server-side problems, not client mistakes.
A 500 error indicates a server-side failure, typically a missing model file or an unhandled exception raised during prediction, rather than a problem with the client's input or the request URL.
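This failure mode can be sketched as follows: the client's request is perfectly valid, but a server-side fault (here, a simulated missing model file) surfaces as a 500. The handler and file name are hypothetical.

```python
# Sketch: a valid request still produces a 500 when the server-side
# model artifact is missing (names are illustrative).

def load_model(path):
    raise FileNotFoundError(path)   # simulate a missing model file

def predict_handler(request_json):
    try:
        model = load_model("model.pkl")
        return 200, {"prediction": model(request_json["input"])}
    except Exception as exc:
        # The client's input was fine; the failure is server-side.
        return 500, {"error": f"internal error: {exc}"}

status, body = predict_handler({"input": [1, 2, 3]})
```

Debugging therefore starts with the server logs and deployment artifacts, not with the client's payload.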