
Pre-training and Fine-tuning in Prompt Engineering / GenAI: A Model Pipeline Trace

Model Pipeline - Pre-training and fine-tuning concept

This pipeline shows how a large model first learns general knowledge from a massive, general-purpose dataset (pre-training), then adapts to a specific task using a much smaller dataset (fine-tuning).

Data Flow - 5 Stages
Stage 1: Raw Data Collection
  Input:   N/A
  Step:    Gather large, diverse text data
  Output:  1,000,000 samples x variable-length text
  Example: "The sun rises in the east."
Stage 2: Preprocessing
  Input:   1,000,000 samples x variable-length text
  Step:    Clean text, tokenize into words or subwords
  Output:  1,000,000 samples x 50 tokens
  Example: ["The", "sun", "rises", "in", "the", "east", "."]
Stage 3: Pre-training
  Input:   1,000,000 samples x 50 tokens
  Step:    Train a large model to predict missing words
  Output:  Model with learned general language patterns
  Example: Model predicts the missing word "east" in the sentence
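The "predict missing words" objective works by hiding a token and training the model to recover it. A minimal sketch of how one such training example is constructed (the model itself is omitted):

```python
import random

def make_masked_example(tokens, mask_token="[MASK]"):
    # Hide one randomly chosen token; the hidden token is the
    # training target the model must learn to predict.
    i = random.randrange(len(tokens))
    masked = list(tokens)
    target = masked[i]
    masked[i] = mask_token
    return masked, i, target

tokens = ["The", "sun", "rises", "in", "the", "east", "."]
masked, pos, target = make_masked_example(tokens)
print(masked, "-> predict", repr(target), "at position", pos)
```

Repeating this over a million sentences gives the model millions of fill-in-the-blank exercises, which is how it absorbs general language patterns.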
Stage 4: Fine-tuning Data Preparation
  Input:   10,000 samples x 50 tokens
  Step:    Prepare a smaller, task-specific dataset
  Output:  10,000 samples x 50 tokens
  Example: "Is this review positive?" with label "Yes"
Stage 5: Fine-tuning
  Input:   10,000 samples x 50 tokens
  Step:    Train the model further on task data
  Output:  Model adapted to the specific task
  Example: Model learns to classify sentiment correctly
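Fine-tuning typically keeps the pre-trained body frozen (or nearly so) and trains a small task head on top. The sketch below stands in a hand-written two-feature "encoder" for the pre-trained model and trains a logistic-regression head on it; every name, word list, and example review here is illustrative.

```python
import math

# Stand-in for a frozen pre-trained encoder: maps text to a 2-d feature
# vector (counts of illustrative positive/negative words). A real
# pipeline would use the pre-trained transformer's output instead.
POS, NEG = {"good", "great", "love"}, {"bad", "awful", "hate"}

def encode(text):
    words = text.lower().split()
    return [sum(w in POS for w in words), sum(w in NEG for w in words)]

def sigmoid(z):
    return 1 / (1 + math.exp(-z))

def fine_tune(data, epochs=200, lr=0.5):
    # Train only the task head (weights w, bias b) with gradient descent;
    # the encoder above stays fixed, mirroring frozen pre-trained layers.
    w, b = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for text, label in data:
            x = encode(text)
            err = sigmoid(w[0] * x[0] + w[1] * x[1] + b) - label
            w = [w[0] - lr * err * x[0], w[1] - lr * err * x[1]]
            b -= lr * err
    return w, b

data = [("great movie i love it", 1), ("awful film i hate it", 0),
        ("good plot", 1), ("bad acting", 0)]
w, b = fine_tune(data)
print("positive score:", sigmoid(w[0] * 1 + w[1] * 0 + b))
```

Because the head is tiny relative to the frozen body, this step needs far less data and compute than pre-training, which is the whole point of the two-step approach.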
Training Trace - Epoch by Epoch
Loss
2.3 |  *
1.5 |      *
0.9 |           *
0.6 |                *
0.5 |                  *
0.3 |                      *
    +-----------------------
       1    5    10   15 16   20
                Epochs
Epoch | Loss ↓ | Accuracy ↑ | Observation
    1 |  2.3   |   0.10     | High loss, low accuracy as the model starts learning
    5 |  1.5   |   0.45     | Loss decreasing, accuracy improving steadily
   10 |  0.9   |   0.75     | Model captures general language patterns well
   15 |  0.6   |   0.85     | Pre-training converging with good accuracy
   16 |  0.5   |   0.88     | Fine-tuning starts on task-specific data
   20 |  0.3   |   0.95     | Fine-tuning improves task accuracy significantly
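The chart above can be regenerated from the table values with a few lines of Python; this is just a rendering sketch of the same trace, useful when you want to eyeball a loss curve in a terminal.

```python
trace = [(1, 2.3), (5, 1.5), (10, 0.9), (15, 0.6), (16, 0.5), (20, 0.3)]

for epoch, loss in trace:
    # Bar length is proportional to the loss, mirroring the chart above.
    print(f"epoch {epoch:>2}: loss {loss:.1f} " + "#" * round(loss * 10))
```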
Prediction Trace - 5 Layers
Layer 1: Input Tokenization
Layer 2: Embedding Layer
Layer 3: Pre-trained Transformer Layers
Layer 4: Fine-tuned Classification Head
Layer 5: Output Decision
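The five-layer prediction trace can be sketched as a toy forward pass. Every component below is a deliberately simplified stand-in (a hand-written embedding table, mean pooling in place of transformer layers, a two-weight head), chosen only to make the data flow through the five layers concrete.

```python
# Illustrative stand-ins, not a real model:
EMBED = {"great": [1.0, 0.0], "movie": [0.2, 0.2], "awful": [0.0, 1.0]}
HEAD_W = [1.0, -1.0]  # "fine-tuned" head: positive vs. negative direction

def predict(text):
    tokens = text.lower().split()                       # Layer 1: tokenization
    vecs = [EMBED.get(t, [0.0, 0.0]) for t in tokens]   # Layer 2: embedding lookup
    pooled = [sum(v[i] for v in vecs) / len(vecs)       # Layer 3: encoder stand-in
              for i in range(2)]                        #          (mean pooling)
    score = sum(w * x for w, x in zip(HEAD_W, pooled))  # Layer 4: classification head
    return "positive" if score > 0 else "negative"      # Layer 5: output decision

print(predict("great movie"))  # → positive
```

In a real model, Layer 3 would be dozens of attention blocks whose weights come from pre-training, and Layer 4 would be the small head trained during fine-tuning; the shape of the pipeline is the same.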
Model Quiz - 3 Questions
Test your understanding
Q1. What is the main purpose of pre-training in this pipeline?
  A) Learn general language patterns from large data
  B) Train on a small task-specific dataset
  C) Clean and tokenize the text data
  D) Make final predictions on new sentences
Key Insight
Pre-training helps the model learn broad knowledge from large data, making it ready to quickly adapt during fine-tuning on smaller, specific tasks. This two-step approach saves time and improves accuracy.