When you tune a pipeline with GridSearchCV, the goal is to find the hyperparameter settings that generalize best to unseen data. The metric to optimize depends on your problem:
- Accuracy if classes are balanced and you want overall correctness.
- Precision if false alarms are costly (e.g., spam detection).
- Recall if missing a positive case is costly (e.g., disease detection).
- F1 score if you want a balance between precision and recall.
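GridSearchCV evaluates every parameter combination in the grid with this metric (passed through its scoring parameter), averages the score over the cross-validation folds, and picks the combination with the best mean score. If you do not set scoring, it falls back to the estimator's own score method, which is accuracy for classifiers.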
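
As a rough illustration of how the metric is passed in, here is a minimal sketch. The synthetic dataset, the step names ("scale", "clf"), the choice of LogisticRegression, and the grid of C values are all assumptions made up for the example, not part of the question; swap in your own pipeline and grid.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic, mildly imbalanced binary data (illustrative only).
X, y = make_classification(
    n_samples=400, n_features=10, weights=[0.8, 0.2], random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, stratify=y, random_state=0
)

# Hypothetical two-step pipeline: scaling, then a classifier.
pipe = Pipeline([
    ("scale", StandardScaler()),
    ("clf", LogisticRegression(max_iter=1000)),
])

# Pipeline parameters are addressed as <step name>__<parameter name>.
param_grid = {"clf__C": [0.01, 0.1, 1.0, 10.0]}

# scoring="f1" makes the search rank candidates by F1 instead of accuracy.
search = GridSearchCV(pipe, param_grid, scoring="f1", cv=5)
search.fit(X_train, y_train)

print("best params:", search.best_params_)
print("mean CV F1 of best params:", search.best_score_)
# With the default refit=True, score() uses the same metric on held-out data.
print("test F1:", search.score(X_test, y_test))
```

Changing scoring to "accuracy", "precision", or "recall" is enough to optimize for one of the other metrics listed above. Because the scaler sits inside the pipeline, it is refit on each training fold, so the cross-validated score is not inflated by information from the validation fold.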