Experiment - Pipeline best practices
Problem:You have a machine learning pipeline that preprocesses data and trains a model. The pipeline runs but the model's validation accuracy is lower than expected and training takes longer than necessary.
Current Metrics:Training accuracy: 92%, Validation accuracy: 75%, Training time: 120 seconds
Issue:The pipeline is not optimized. Data preprocessing steps are repeated unnecessarily, causing longer training time. Also, the model may be overfitting due to lack of proper data splitting and scaling inside the pipeline.