Training data preparation in Prompt Engineering / GenAI - Cheat Sheet & Quick Revision

Recall & Review
beginner
What is training data preparation in machine learning?
Training data preparation is the process of cleaning, organizing, and formatting raw data so that a machine learning model can learn from it effectively.
beginner
Why do we need to clean data before training a model?
Cleaning data fixes or removes errors, handles missing values, and resolves inconsistencies that could otherwise confuse the model and reduce its accuracy.
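A minimal sketch of two common cleaning steps, dropping rows with missing values and dropping exact duplicates. The function name `clean_rows`, the sample records, and the choice to treat `None` and empty strings as missing are assumptions for illustration only. Real pipelines often impute missing values instead of dropping the row.

```python
def clean_rows(rows):
    """Drop rows with missing values and exact duplicates, keeping first-seen order."""
    seen = set()
    cleaned = []
    for row in rows:
        # Assumption: None and empty strings count as missing values.
        if any(value is None or value == "" for value in row.values()):
            continue
        # Build a hashable key so exact duplicate rows can be detected.
        key = tuple(sorted(row.items()))
        if key in seen:
            continue
        seen.add(key)
        cleaned.append(row)
    return cleaned


# Hypothetical raw records with one missing age, one empty city, one duplicate.
raw = [
    {"age": 34, "city": "Pune"},
    {"age": None, "city": "Delhi"},
    {"age": 34, "city": "Pune"},
    {"age": 27, "city": ""},
    {"age": 51, "city": "Goa"},
]
cleaned = clean_rows(raw)
print(cleaned)
```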
intermediate
What is feature scaling and why is it important?
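Feature scaling rescales numeric features to a comparable range (for example, min-max normalization to [0, 1] or standardization to zero mean and unit variance). This keeps large-valued features from dominating and helps gradient-based and distance-based models train faster and perform better.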
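One way to picture this is min-max scaling, which maps each feature to the range [0, 1]. The sketch below is only one of several scaling methods, and the helper name `min_max_scale` and the sample values are illustrative assumptions.

```python
def min_max_scale(values):
    """Rescale a list of numbers linearly so the smallest is 0.0 and the largest is 1.0."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant feature carries no range information; map every value to 0.0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical features on very different scales.
incomes = [20000, 50000, 80000]
ages = [20, 35, 50]

scaled_incomes = min_max_scale(incomes)
scaled_ages = min_max_scale(ages)
print(scaled_incomes, scaled_ages)
```

After scaling, both features span the same range, so neither dominates purely because of its units. In practice the minimum and maximum are usually computed on the training set only and then reused for validation and test data.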
beginner
Explain the difference between training, validation, and test data.
Training data is used to fit the model. Validation data is used to tune the model’s settings (hyperparameters) and compare candidate models. Test data is held out until the end to check how well the final model works on new, unseen data.
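A rough sketch of a three-way split using only the standard library. The 70/15/15 proportions, the fixed seed, and the function name `split_dataset` are assumptions chosen for illustration, not a recommended recipe.

```python
import random


def split_dataset(examples, train_frac=0.7, val_frac=0.15, seed=0):
    """Shuffle the examples, then cut them into train, validation, and test lists."""
    shuffled = list(examples)
    # A seeded generator makes the split reproducible between runs.
    random.Random(seed).shuffle(shuffled)
    n_train = round(len(shuffled) * train_frac)
    n_val = round(len(shuffled) * val_frac)
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    # Whatever remains after train and validation becomes the test set.
    test = shuffled[n_train + n_val:]
    return train, val, test


train, val, test = split_dataset(range(100))
print(len(train), len(val), len(test))
```

Each example lands in exactly one of the three sets, so the test set stays unseen during training and tuning.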
intermediate
What is data augmentation and when is it used?
Data augmentation creates new training examples by modifying existing data, like flipping images. It is used to increase data size and improve model robustness.
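The image-flipping example can be sketched with a tiny image stored as a nested list of pixel values. The names `flip_horizontal` and `augment_with_flips`, the 2x3 image, and the label are hypothetical. Flipping is only appropriate when it does not change the label (it would be a poor choice for text or digits, for instance).

```python
def flip_horizontal(image):
    """Return a mirrored copy of the image by reversing every row of pixels."""
    return [row[::-1] for row in image]


def augment_with_flips(dataset):
    """Keep each (image, label) pair and add a horizontally flipped copy with the same label."""
    augmented = []
    for image, label in dataset:
        augmented.append((image, label))
        augmented.append((flip_horizontal(image), label))
    return augmented


# Hypothetical 2x3 grayscale image with a made-up label.
image = [
    [1, 2, 3],
    [4, 5, 6],
]
dataset = [(image, "cat")]
augmented = augment_with_flips(dataset)
print(len(augmented))
```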
Which step is NOT part of training data preparation?
A. Cleaning missing values
B. Splitting data into sets
C. Training the model
D. Scaling features
Why do we split data into training, validation, and test sets?
A. To evaluate model performance fairly
B. To remove errors from data
C. To make the dataset smaller
D. To speed up data cleaning
What does feature scaling do?
A. Adds new data points
B. Removes missing data
C. Splits data into groups
D. Changes data to a similar range
Data augmentation is mainly used to:
A. Create more training examples
B. Clean data errors
C. Split data into sets
D. Scale features
Which of these is a common data cleaning task?
A. Normalizing features
B. Removing duplicates
C. Splitting data
D. Training the model
Describe the key steps involved in preparing training data for a machine learning model.
Think about what you do to raw data before feeding it to a model.
Explain why splitting data into training, validation, and test sets is important.
Consider how you check if a model works well on new data.