When running agents asynchronously, the key metrics are throughput (how many tasks finish per unit of time), latency (how long each task takes to complete), and success rate (the fraction of tasks that finish without failing). Together they show whether the system is fast, responsive, and reliable. The accuracy of the agent's output should also be measured, because a task that finishes is not necessarily a task that finished correctly.
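As a rough illustration of how these three numbers can be collected, here is a minimal sketch using Python's `asyncio`. The `run_task` coroutine is a hypothetical stand-in for a real agent call (it just sleeps and fails for one task ID), and the names `timed` and `measure` are assumptions made for this example.

```python
import asyncio
import time


async def run_task(task_id: int) -> bool:
    """Hypothetical agent task: sleeps briefly and fails when task_id is 3."""
    await asyncio.sleep(0.01)
    if task_id == 3:
        raise RuntimeError("simulated failure")
    return True


async def timed(task_id: int) -> tuple[bool, float]:
    """Run one task and return (succeeded, latency in seconds)."""
    start = time.perf_counter()
    try:
        await run_task(task_id)
        ok = True
    except Exception:
        ok = False
    return ok, time.perf_counter() - start


async def measure(task_ids: list[int]) -> dict[str, float]:
    """Run all tasks concurrently and compute the three metrics."""
    wall_start = time.perf_counter()
    outcomes = await asyncio.gather(*(timed(t) for t in task_ids))
    wall = time.perf_counter() - wall_start
    latencies = [latency for _, latency in outcomes]
    return {
        "throughput_per_s": len(outcomes) / wall,
        "mean_latency_s": sum(latencies) / len(latencies),
        "success_rate": sum(ok for ok, _ in outcomes) / len(outcomes),
    }


metrics = asyncio.run(measure([1, 2, 3, 4, 5]))
print(metrics)
```

Failures are caught inside `timed` so that a failed task still contributes a latency sample and counts against the success rate instead of aborting the whole batch.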
Async agent execution in Agentic AI - Model Metrics & Evaluation
Async Agent Task Results:
| Task ID | Status | Result Correct? |
|---------|----------|-----------------|
| 1 | Success | Yes |
| 2 | Success | No |
| 3 | Failed | N/A |
| 4 | Success | Yes |
| 5 | Success | Yes |
Summary:
- Total tasks: 5
- Success: 4
- Failures: 1
- Correct results: 3
Metrics:
- Success Rate = 4/5 = 0.8
- Accuracy (on success) = 3/4 = 0.75
- Overall Accuracy = 3/5 = 0.6
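The same arithmetic can be reproduced in a few lines of Python. This is only a sketch: the dictionary layout and the `summarize` helper are assumptions for illustration, with the five rows copied from the table above.

```python
# Rows copied from the task results table; "correct" is None for failed tasks.
tasks = [
    {"id": 1, "status": "Success", "correct": True},
    {"id": 2, "status": "Success", "correct": False},
    {"id": 3, "status": "Failed", "correct": None},
    {"id": 4, "status": "Success", "correct": True},
    {"id": 5, "status": "Success", "correct": True},
]


def summarize(tasks):
    """Compute success rate, accuracy on successes, and overall accuracy."""
    if not tasks:
        raise ValueError("need at least one task")
    total = len(tasks)
    succeeded = [t for t in tasks if t["status"] == "Success"]
    correct = sum(1 for t in succeeded if t["correct"])
    return {
        "success_rate": len(succeeded) / total,
        # Accuracy among tasks that finished; 0.0 if nothing finished.
        "accuracy_on_success": correct / len(succeeded) if succeeded else 0.0,
        # Failed tasks count as incorrect here.
        "overall_accuracy": correct / total,
    }


summary = summarize(tasks)
print(summary)
```

Reporting both accuracy figures matters: accuracy on successes (0.75) looks better than overall accuracy (0.6) only because the failed task is left out of the denominator.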
In async agent execution, precision is the fraction of completed tasks that are actually correct. Recall is the fraction of all tasks that should be done that the system completes correctly.
Example: If the agent completes many tasks quickly but some are wrong, precision drops. If it completes only a few tasks but all of them are correct, precision is high but recall is low.
Choosing between speed and correctness depends on the use case. For urgent tasks, higher recall (completing more of the required tasks) may be better. For critical tasks, higher precision (correct results) matters more.
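To make the tradeoff concrete, the sketch below computes both values under the definitions above. The `precision_recall` helper and the two scenarios (a fast agent and a careful agent, each facing 100 required tasks) use made-up numbers chosen purely for illustration.

```python
def precision_recall(completed: int, correct: int, required: int) -> tuple[float, float]:
    """Return (precision, recall) for a batch of agent tasks.

    completed: tasks the agent finished
    correct:   finished tasks whose result was correct
    required:  all tasks that should have been done
    """
    if correct > completed:
        raise ValueError("correct cannot exceed completed")
    precision = correct / completed if completed else 0.0
    recall = correct / required if required else 0.0
    return precision, recall


# Hypothetical fast agent: finishes all 100 tasks, but only 60 are right.
fast = precision_recall(completed=100, correct=60, required=100)

# Hypothetical careful agent: finishes only 20 tasks, all of them right.
careful = precision_recall(completed=20, correct=20, required=100)

print("fast:", fast)
print("careful:", careful)
```

In this toy comparison the fast agent has lower precision (0.6) and the careful agent has much lower recall (0.2), which is the pattern the example above describes.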
Good: Success rate above 90%, accuracy above 85%, low latency (tasks finish quickly), and high throughput (many tasks done per second).
Bad: Success rate below 70%, accuracy below 60%, high latency (slow task completion), and low throughput (few tasks done).
Good metrics mean the async agent is fast, reliable, and produces correct results. Bad metrics mean delays, failures, or wrong outputs.
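These thresholds can be turned into a simple health check. The sketch below is one possible reading, not a standard rule: it covers only success rate and accuracy (the text gives no numeric cutoffs for latency or throughput), treats a single value below a "bad" cutoff as enough to flag the run, and labels anything in between "borderline". The `classify` function and that middle label are assumptions for this example.

```python
def classify(success_rate: float, accuracy: float) -> str:
    """Label a run using the good/bad thresholds given in the text."""
    # Bad: success rate below 70% or accuracy below 60% (assumed: either one suffices).
    if success_rate < 0.70 or accuracy < 0.60:
        return "bad"
    # Good: success rate above 90% and accuracy above 85%.
    if success_rate > 0.90 and accuracy > 0.85:
        return "good"
    # Everything between the two bands (label is an assumption of this sketch).
    return "borderline"


print(classify(0.95, 0.90))
print(classify(0.80, 0.75))
print(classify(0.65, 0.90))
```

The five-task example earlier (success rate 0.8, accuracy on success 0.75) lands in the borderline band under this reading.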
- Ignoring failed tasks: Only measuring successful tasks can hide failure problems.
- Data leakage: Using future information to evaluate current tasks inflates accuracy.
- Overfitting: The agent may perform well on test tasks but fail on new ones.
- Latency spikes: Average latency hides occasional very slow tasks, so look at high percentiles as well as the mean.
- Throughput vs quality tradeoff: Maximizing speed may reduce accuracy.
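The latency-spike pitfall is easy to see with numbers. The sketch below uses invented latencies (95 tasks at 0.1 s and 5 tasks at 5.0 s) and a simple nearest-rank percentile; the `percentile` helper is written for this example rather than taken from a library.

```python
import math


def percentile(values: list[float], pct: int) -> float:
    """Nearest-rank percentile: smallest value with at least pct% of samples at or below it."""
    if not values or not 0 < pct <= 100:
        raise ValueError("need values and 0 < pct <= 100")
    ordered = sorted(values)
    rank = math.ceil(pct * len(ordered) / 100)  # 1-based rank
    return ordered[rank - 1]


# Invented data: most tasks are fast, a few are very slow.
latencies = [0.1] * 95 + [5.0] * 5

mean_latency = sum(latencies) / len(latencies)
p50 = percentile(latencies, 50)
p99 = percentile(latencies, 99)

print(f"mean={mean_latency:.3f}s p50={p50}s p99={p99}s")
```

Here the mean (about 0.345 s) describes neither the typical task (0.1 s) nor the slow tail (5.0 s), which is why percentiles are usually reported alongside it.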
Your async agent has a 98% success rate but only 12% recall on critical tasks. Is it good for production? Why or why not?
Answer: No, it is not good. Although most tasks finish successfully, the agent correctly completes only 12% of the critical tasks it should, so 88% of them are missed (low recall). This means important work is not done, which can cause serious problems.
