Key models overview (GPT, DALL-E, Stable Diffusion) in Prompt Engineering / GenAI - Model Pipeline Trace

Model Pipeline - Key models overview (GPT, DALL-E, Stable Diffusion)

This pipeline traces how three popular generative AI models work: GPT generates text, DALL-E generates images from a text prompt, and Stable Diffusion generates images by starting from random noise and gradually denoising it under the guidance of a text prompt.

Data Flow - 4 Stages
1. Input Text
   Input: 1 sentence
   Operation: User provides a text prompt
   Output: 1 sentence
   Example: "A cat sitting on a sunny windowsill"
2. GPT Text Generation
   Input: 1 sentence
   Operation: Generate a text continuation using a language model
   Output: 1 paragraph
   Example: "The cat basked in the warm sunlight, purring softly as it watched birds outside."
3. DALL-E Text to Image
   Input: 1 sentence
   Operation: Convert the text prompt into an image using a transformer-based model
   Output: 256 x 256 pixel image
   Example: Image of a cat on a sunny windowsill
4. Stable Diffusion Image Generation
   Input: Random noise (512 x 512 pixels)
   Operation: Iteratively denoise, guided by the text prompt, to create an image
   Output: 512 x 512 pixel image
   Example: Clear image of a cat on a sunny windowsill
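
The iterative denoising in stage 4 can be illustrated with a toy loop. This is only a sketch under strong assumptions: the function name `toy_denoise`, the `strength` parameter, and the use of a known target vector are all hypothetical stand-ins. A real Stable Diffusion model predicts the noise with a trained neural network conditioned on the text prompt; it never has direct access to the final image.

```python
import random

def toy_denoise(target, steps=20, strength=0.3, seed=0):
    """Start from random noise and move a little closer to `target` each step.

    `target` is a list of floats standing in for "what the prompt describes".
    Using it directly is a cheat for illustration only; a real diffusion
    model estimates the noise with a neural network instead.
    """
    rng = random.Random(seed)
    # Stage input: pure Gaussian noise with the same size as the target.
    x = [rng.gauss(0.0, 1.0) for _ in target]
    errors = []
    for _ in range(steps):
        # Stand-in for the model's noise prediction at this step.
        predicted_noise = [xi - ti for xi, ti in zip(x, target)]
        # Remove a fraction of the predicted noise.
        x = [xi - strength * ni for xi, ni in zip(x, predicted_noise)]
        # Track mean squared distance to the target after the update.
        errors.append(sum((xi - ti) ** 2 for xi, ti in zip(x, target)) / len(target))
    return x, errors

if __name__ == "__main__":
    image, errors = toy_denoise([0.2, 0.8, 0.5, 0.1])
    print(f"error after first step: {errors[0]:.4f}")
    print(f"error after last step:  {errors[-1]:.6f}")
```

Each pass removes part of the remaining noise, so the error shrinks step by step, which mirrors how the noisy input gradually becomes a clear image.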
Training Trace - Epoch by Epoch
Loss
2.3 |****
1.2 |************
0.7 |******************
0.4 |**********************
     1    5    10   20  Epochs
Epoch  Loss ↓  Accuracy ↑  Observation
1      2.3     0.10        High loss and low accuracy as model starts learning basic patterns
5      1.2     0.45        Loss decreases, accuracy improves as model learns language/image features
10     0.7     0.70        Model shows good understanding, generating coherent text or images
20     0.4     0.85        Loss low and accuracy high, model converges well on training data
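
The trend in the trace above can be checked with a few lines of arithmetic. The sketch below copies the four (epoch, loss, accuracy) rows from the trace; the helper names `is_improving` and `relative_loss_drop` are hypothetical and exist only for this illustration.

```python
# (epoch, loss, accuracy) rows copied from the training trace.
TRACE = [(1, 2.3, 0.10), (5, 1.2, 0.45), (10, 0.7, 0.70), (20, 0.4, 0.85)]

def is_improving(trace):
    """Return True if loss strictly falls and accuracy strictly rises."""
    losses = [loss for _, loss, _ in trace]
    accuracies = [acc for _, _, acc in trace]
    loss_falls = all(b < a for a, b in zip(losses, losses[1:]))
    accuracy_rises = all(b > a for a, b in zip(accuracies, accuracies[1:]))
    return loss_falls and accuracy_rises

def relative_loss_drop(trace):
    """Fraction by which the loss fell between the first and last row."""
    first_loss, last_loss = trace[0][1], trace[-1][1]
    return (first_loss - last_loss) / first_loss

if __name__ == "__main__":
    print("improving:", is_improving(TRACE))
    print(f"relative loss drop: {relative_loss_drop(TRACE):.1%}")
```

For these rows the loss falls from 2.3 to 0.4, a drop of roughly 83 percent, while accuracy rises at every recorded epoch.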
Prediction Trace - 6 Layers
Layer 1: Text Input
Layer 2: GPT Transformer Layers
Layer 3: DALL-E Text Encoder
Layer 4: DALL-E Image Decoder
Layer 5: Stable Diffusion Noise Input
Layer 6: Stable Diffusion Iterative Denoising
Model Quiz - 3 Questions
Test your understanding
Which model generates text continuations from a prompt?
A. Stable Diffusion
B. DALL-E
C. GPT
D. None of the above
Key Insight
GPT, DALL-E, and Stable Diffusion generate content in different ways. GPT predicts text one token at a time, DALL-E uses a transformer-based model to turn a text prompt into an image, and Stable Diffusion starts from random noise and iteratively denoises it under the guidance of the text prompt. Following each model's data flow and training behavior makes it easier to see how generative AI produces text and images.
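
The step-by-step prediction attributed to GPT can be sketched as a loop that appends one token at a time. This is a toy illustration, not how GPT is implemented: the `NEXT_TOKEN` lookup table and the `generate` function are hypothetical, and the table looks only at the last token, whereas a real language model scores every possible next token using the whole preceding context.

```python
# Hypothetical lookup table standing in for a trained language model.
NEXT_TOKEN = {
    "the": "cat",
    "cat": "basked",
    "basked": "in",
    "in": "sunlight",
}

def generate(prompt_tokens, max_new_tokens=4):
    """Extend a non-empty list of tokens one token at a time."""
    tokens = list(prompt_tokens)
    for _ in range(max_new_tokens):
        # "Predict" the next token from the most recent one.
        next_token = NEXT_TOKEN.get(tokens[-1])
        if next_token is None:
            # The toy model has no continuation, so generation stops.
            break
        tokens.append(next_token)
    return tokens

if __name__ == "__main__":
    print(" ".join(generate(["the"])))
```

The key point is the loop structure: each new token is fed back in as context for the next prediction, which is what "step-by-step" means here.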