When using a machine learning API that enforces rate limits, what is the best practice to avoid hitting the limit?
Think about how to respect the server's capacity to handle requests.
APIs with rate limits require clients to space out requests to avoid errors. Implementing delays or backoff strategies helps stay within limits.
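One common backoff strategy is exponential backoff with jitter: wait longer after each consecutive failure, plus a small random offset so many clients do not retry in lockstep. A minimal sketch (the function names, the generic exception handling, and the `base_delay` parameter are illustrative, not tied to any specific API client):

```python
import random
import time

def call_with_backoff(call, max_retries=5, base_delay=1.0):
    """Retry `call` with exponential backoff and jitter.

    `call` is any zero-argument function that raises an exception
    when the API rejects the request (e.g. a rate-limit error).
    """
    for attempt in range(max_retries):
        try:
            return call()
        except Exception:
            # Wait base_delay, 2*base_delay, 4*base_delay, ...
            # plus random jitter, before retrying.
            delay = base_delay * (2 ** attempt) + random.uniform(0, base_delay)
            time.sleep(delay)
    raise RuntimeError('Gave up after max_retries attempts')
```

In practice you would catch only the API client's rate-limit exception rather than a bare `Exception`, and honor a `Retry-After` header if the API provides one.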
What will be the output of the following Python code snippet that calls a machine learning API with error handling?
import time

class APIError(Exception):
    pass

def call_api():
    raise APIError('Rate limit exceeded')

try:
    call_api()
except APIError as e:
    print(f'Error caught: {e}')
    time.sleep(1)
    print('Retrying...')
Look at what happens inside the except block.
The exception is caught by the except block, which prints 'Error caught: Rate limit exceeded', waits 1 second, and then prints 'Retrying...'.
You have a machine learning model API with a strict rate limit of 5 requests per second. Which strategy best handles this limit while maximizing throughput?
Consider how to keep requests within the allowed rate without wasting time.
Queuing requests and sending them at the allowed rate maximizes throughput without triggering errors.
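The queuing strategy can be sketched as a simple pacer that spaces calls evenly at the allowed rate instead of bursting (the function name and the callable-based queue are illustrative assumptions):

```python
import time
from collections import deque

def drain_queue(requests, rate=5):
    """Send queued requests at no more than `rate` per second.

    `requests` is a deque of zero-argument callables; `rate`
    matches the API's limit (5 requests/second in the scenario
    above). Spacing calls evenly keeps throughput at the maximum
    the limit allows without triggering rate-limit errors.
    """
    interval = 1.0 / rate
    results = []
    while requests:
        start = time.monotonic()
        results.append(requests.popleft()())
        # Sleep off whatever remains of this call's time slot.
        remaining = interval - (time.monotonic() - start)
        if remaining > 0:
            time.sleep(remaining)
    return results
```

A production version would typically use a token bucket so short bursts are allowed while the long-run average stays within the limit.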
You monitor your ML API calls and see the following metrics over 1000 requests: 950 successful, 30 rate limit errors, 20 timeout errors. What is the error rate percentage?
Error rate = (number of errors / total requests) * 100
Total errors = 30 + 20 = 50. Error rate = (50/1000)*100 = 5%.
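The arithmetic above can be checked in a couple of lines:

```python
total_requests = 1000
errors = 30 + 20              # rate-limit errors + timeout errors
error_rate = errors / total_requests * 100
print(error_rate)             # 5.0 (percent)
```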
Given the following Python code snippet that calls an ML API, which option correctly identifies the bug causing the program to crash?
import time

rate_limit = 3
calls = 0
start_time = time.time()

def call_api():
    global calls, start_time
    if calls >= rate_limit:
        elapsed = time.time() - start_time
        time.sleep(1 - elapsed)
        calls = 0
        start_time = time.time()
    calls += 1
    print('API call made')

for _ in range(5):
    call_api()
Check the calculation inside time.sleep().
If more than 1 second has elapsed by the time the limit is reached, 1 - elapsed is negative, and time.sleep() raises ValueError: sleep length must be non-negative.
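One way to fix the bug is to clamp the sleep duration at zero, so the window-reset logic is preserved but time.sleep() never receives a negative value:

```python
import time

rate_limit = 3
calls = 0
start_time = time.time()

def call_api():
    global calls, start_time
    if calls >= rate_limit:
        elapsed = time.time() - start_time
        # Clamp at zero: if the 1-second window has already passed,
        # there is nothing left to wait for.
        time.sleep(max(0.0, 1 - elapsed))
        calls = 0
        start_time = time.time()
    calls += 1
    print('API call made')

for _ in range(5):
    call_api()
```

With this guard, the loop makes three calls, waits out the remainder of the 1-second window, resets the counter, and completes the remaining two calls without crashing.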