A/B testing compares two models to see which one works better in real life. The key metrics depend on the goal. For example, if the goal is to increase clicks, use click-through rate (CTR). If it is to improve sales, use conversion rate. These metrics show the real impact of each model on user behavior.
Besides these, statistical significance (like p-value) is important to know if the difference is real or just by chance.