When multiple tools or models run at the same time, the key metric is throughput: the number of tasks or requests completed per unit of time, for example requests per second. It shows how well the system handles parallel work.
Another important metric is latency, the time taken to complete a single task from start to finish. Lower latency means faster responses.
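Both metrics can be measured in one run. The following is a minimal sketch, assuming tasks are plain Python callables executed on a thread pool; the helper names `timed_call` and `measure` and the sleep-based workload are hypothetical and serve only as illustration. Latency here covers a task's execution time only, not the time it spends waiting in the queue.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def timed_call(task):
    """Run one task and return its latency in seconds."""
    start = time.perf_counter()
    task()
    return time.perf_counter() - start


def measure(tasks, max_workers=4):
    """Run tasks in parallel; return (throughput in tasks/second, per-task latencies)."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        latencies = list(pool.map(timed_call, tasks))
    elapsed = time.perf_counter() - start
    return len(tasks) / elapsed, latencies


if __name__ == "__main__":
    # Hypothetical workload: eight tasks that each sleep for 50 ms.
    tasks = [lambda: time.sleep(0.05)] * 8
    throughput, latencies = measure(tasks)
    print(f"throughput: {throughput:.1f} tasks/s")
    print(f"mean latency: {sum(latencies) / len(latencies) * 1000:.1f} ms")
```

Changing `max_workers` and rerunning shows how the two metrics move independently: adding workers can raise throughput while per-task latency stays roughly the same.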
We also watch resource utilization to confirm that the system uses CPU, memory, and other resources efficiently without becoming overloaded.
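Resource utilization can be approximated with the standard library alone. This is a rough sketch rather than a full monitoring setup: the helper name `profile` is hypothetical, CPU utilization is estimated as process CPU time divided by wall-clock time, and memory is the peak tracked by `tracemalloc`, which only sees allocations made through Python's allocator.

```python
import time
import tracemalloc


def profile(task):
    """Run task once; return (cpu_utilization, peak_bytes)."""
    tracemalloc.start()
    wall_start = time.perf_counter()
    cpu_start = time.process_time()
    task()
    cpu_used = time.process_time() - cpu_start
    wall_used = time.perf_counter() - wall_start
    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    # process_time() sums CPU time across all threads of the process,
    # so this ratio can exceed 1.0 when several cores are busy.
    utilization = cpu_used / wall_used if wall_used > 0 else 0.0
    return utilization, peak


if __name__ == "__main__":
    # Hypothetical CPU-bound task used only for illustration.
    utilization, peak = profile(lambda: sum(i * i for i in range(1_000_000)))
    print(f"CPU utilization: {utilization:.0%}, peak memory: {peak / 1024:.1f} KiB")
```

A ratio near zero suggests the task mostly waits (on I/O or sleep), while a ratio near or above one suggests it is CPU-bound.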