Recall & Review
beginner
What is the purpose of a train-test split in machine learning?
It is used to divide data into two parts: one for training the model and one for testing its performance on new, unseen data.
Click to reveal answer
beginner
What typical ratio is used for train-test split?
A common ratio is 70% to 80% of data for training and 20% to 30% for testing.
Click to reveal answer
beginner
Why should the test set be separate from the training set?
To check if the model can generalize well to new data and avoid overfitting to the training data.
Click to reveal answer
beginner
What Python library and function is commonly used for train-test splitting?
The scikit-learn library with the function train_test_split from sklearn.model_selection.
Click to reveal answer
intermediate
How does random_state parameter affect train-test split?
It sets a seed for random shuffling so the split is reproducible and consistent across runs.
Click to reveal answer
What is the main goal of splitting data into train and test sets?
Which of these is a common train-test split ratio?
What does the random_state parameter do in train_test_split?
Why should the test set not be used during training?
Which Python function is used to split data into train and test sets?
Explain why train-test split is important in machine learning.
Describe how you would perform a train-test split using Python.