When an agent chooses tools to solve tasks, the key metric is the accuracy of the tool's output: the fraction of calls in which the chosen tool gives the right answer.
Latency (response time) also matters, because a slow tool delays the agent's response.
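Accuracy and latency can be measured together by running a tool over a small set of labeled cases. The following is a minimal sketch, assuming a tool is a plain function from a query string to an answer string and that correctness means exact equality with the expected answer. The names `evaluate_tool` and `shout` are hypothetical and exist only for illustration.

```python
import time
from typing import Callable, Sequence, Tuple


def evaluate_tool(
    tool: Callable[[str], str],
    cases: Sequence[Tuple[str, str]],
) -> Tuple[float, float]:
    """Return (accuracy, mean latency in seconds) over labeled cases.

    Assumption: an answer counts as correct only if it equals the
    expected answer exactly.
    """
    if not cases:
        raise ValueError("cases must not be empty")
    correct = 0
    total_seconds = 0.0
    for query, expected in cases:
        start = time.perf_counter()
        answer = tool(query)
        total_seconds += time.perf_counter() - start
        if answer == expected:
            correct += 1
    return correct / len(cases), total_seconds / len(cases)


# Hypothetical toy tool: upper-cases its input.
def shout(text: str) -> str:
    return text.upper()


# Two of the three expected answers match, so accuracy is 2/3.
accuracy, mean_latency = evaluate_tool(
    shout, [("a", "A"), ("b", "B"), ("c", "x")]
)
```

Exact-match scoring is the simplest choice; a real evaluation might need a looser comparison, depending on the task.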
For some tasks, precision and recall matter, namely when the tool filters or detects specific items. For example, if the agent picks a spam detection tool, high precision means few legitimate messages are mislabeled as spam, and high recall means little spam goes undetected.
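To make the two metrics concrete, here is a small sketch that computes precision and recall from boolean labels, where `True` stands for "spam". The function name `precision_recall` and the sample data are made up for illustration. Returning 0.0 when a denominator is zero is an assumption of this sketch, not a universal convention.

```python
from typing import Sequence, Tuple


def precision_recall(
    predicted: Sequence[bool], actual: Sequence[bool]
) -> Tuple[float, float]:
    """Compute (precision, recall) for a binary detector."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    pairs = list(zip(predicted, actual))
    tp = sum(p and a for p, a in pairs)          # spam correctly flagged
    fp = sum(p and not a for p, a in pairs)      # legitimate mail flagged
    fn = sum((not p) and a for p, a in pairs)    # spam that was missed
    # Assumption: report 0.0 when the metric is undefined.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Illustrative labels only: 1 true positive, 1 false positive,
# 2 false negatives.
predicted = [True, True, False, False, False]
actual = [True, False, False, True, True]
precision, recall = precision_recall(predicted, actual)
```

With this sample data, precision is 1/2 and recall is 1/3: the detector rarely flags messages, and it misses most of the spam.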
Overall, the agent should select tools that balance accuracy, speed, and task-specific metrics to perform well.
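One way to express this balance is a weighted score per tool. The sketch below is only an illustration under stated assumptions: the 0.7 accuracy weight, the 2-second latency budget, the linear speed term, and all tool names and numbers are hypothetical, and a real agent would tune or replace them for its task.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class ToolStats:
    name: str
    accuracy: float   # fraction of correct answers, in [0, 1]
    latency_s: float  # mean seconds per call


def score(
    stats: ToolStats,
    accuracy_weight: float = 0.7,   # assumed weight, not a standard value
    latency_budget_s: float = 2.0,  # assumed acceptable latency
) -> float:
    """Blend accuracy with a speed term that falls to 0 at the budget."""
    speed = max(0.0, 1.0 - stats.latency_s / latency_budget_s)
    return accuracy_weight * stats.accuracy + (1.0 - accuracy_weight) * speed


def select_tool(candidates: Sequence[ToolStats]) -> ToolStats:
    """Pick the candidate with the highest blended score."""
    if not candidates:
        raise ValueError("candidates must not be empty")
    return max(candidates, key=score)


# Hypothetical measurements for three candidate tools.
candidates = [
    ToolStats("slow_but_accurate", accuracy=0.95, latency_s=1.8),
    ToolStats("fast_but_sloppy", accuracy=0.70, latency_s=0.1),
    ToolStats("balanced", accuracy=0.90, latency_s=0.5),
]
best = select_tool(candidates)
```

With these numbers the scores are about 0.695, 0.775, and 0.855, so `best` is the `balanced` tool. A task-specific metric such as precision could be added as a further weighted term in the same way.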