Key models overview (GPT, DALL-E, Stable Diffusion) in Prompt Engineering / GenAI - Practice Problems & Coding Challenges

Challenge - 5 Problems

Conceptual
intermediate
What is the primary function of GPT models?

GPT models are designed mainly to:

A. Enhance audio signals for speech clarity
B. Generate and understand human-like text
C. Create detailed images from text descriptions
D. Predict stock market trends using numerical data
Hint

Think about what GPT stands for and what it is famous for.

Predict Output
intermediate
Output of a Stable Diffusion image generation step

Given a text prompt, what type of output does Stable Diffusion produce?

A. A sound clip related to the prompt
B. A text summary describing the prompt
C. A high-resolution image matching the prompt
D. A 3D model file of the described object
Hint

Stable Diffusion is known for creating visuals from words.

Model Choice
advanced
Choosing the right model for text-to-image generation

You want to create an artistic image from a text description. Which model is best suited?

A. Stable Diffusion
B. GPT
C. DALL-E
D. BERT
Hint

Consider which models specialize in image creation from text.

Metrics
advanced
Evaluating GPT model text generation quality

Which metric is commonly used to measure how well GPT models generate coherent and relevant text?

A. Fréchet Inception Distance
B. Inception Score
C. Mean Squared Error
D. BLEU score
Hint

Think about metrics used in natural language processing.

Debug
expert
Why might a DALL-E model fail to generate an image from a prompt?

Given a valid text prompt, which of the following is the most likely cause for DALL-E not producing an image?

A. The prompt contains unsupported special characters causing tokenization errors
B. The model is trained only on audio data, so it cannot generate images
C. The prompt is too short, so the model refuses to run
D. The model requires GPU but is running on CPU, which is unsupported
Hint

Consider how text inputs are processed before image generation.

Practice

1. Which model is mainly used to generate human-like text?
easy
A. GPT
B. DALL-E
C. Stable Diffusion
D. None of the above

Solution

  1. Step 1: Understand GPT's purpose

    GPT is designed to generate and understand human-like text.
  2. Step 2: Compare with other models

    DALL-E and Stable Diffusion create images, not text.
  3. Final Answer:

    GPT -> Option A
  4. Quick Check:

    Text generation = GPT [OK]
Hint: Text output? Think GPT first. [OK]
Common Mistakes:
  • Mistaking DALL-E for a text model
  • Thinking Stable Diffusion generates text
  • Choosing 'None of the above'
2. Which of the following is the correct way to describe DALL-E's function?
easy
A. It generates text based on images.
B. It compresses images for storage.
C. It creates images from text descriptions.
D. It translates text from one language to another.

Solution

  1. Step 1: Identify DALL-E's main function

    DALL-E creates images from text prompts given by users.
  2. Step 2: Eliminate incorrect options

    It does not generate text, translate languages, or compress images.
  3. Final Answer:

    It creates images from text descriptions. -> Option C
  4. Quick Check:

    Text to image = DALL-E [OK]
Hint: DALL-E = text to image creator. [OK]
Common Mistakes:
  • Thinking DALL-E generates text
  • Confusing with translation models
  • Assuming it compresses images
3. Given the following code snippet using a model, what type of output should you expect?
# Pseudocode: load_model is a hypothetical helper that returns a loaded model object
model = load_model('Stable Diffusion')
input_text = 'A sunny beach with palm trees'
output = model.generate(input_text)
medium
A. A photo-realistic image of a sunny beach
B. A summary of the text input
C. A written story about a beach
D. An error because Stable Diffusion cannot generate output

Solution

  1. Step 1: Identify Stable Diffusion's output type

    Stable Diffusion generates images from text prompts.
  2. Step 2: Match input and output

    Input is a text description; output will be an image matching that description.
  3. Final Answer:

    A photo-realistic image of a sunny beach -> Option A
  4. Quick Check:

    Text input + Stable Diffusion = Image output [OK]
Hint: Stable Diffusion turns words into pictures. [OK]
Common Mistakes:
  • Expecting text output
  • Thinking it summarizes text
  • Assuming it causes an error
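The input and output types in question 3 can be sketched with a stand-in class. This is a minimal illustration, not a real Stable Diffusion API: `FakeTextToImageModel`, `Image`, and the `generate` signature are hypothetical names chosen for this example. The only point it makes is that a text-to-image model takes a string and returns image data rather than text.

```python
from dataclasses import dataclass


@dataclass
class Image:
    """Hypothetical container for a generated picture (not a real library type)."""
    width: int
    height: int
    prompt: str


class FakeTextToImageModel:
    """Hypothetical stand-in for a text-to-image model such as Stable Diffusion."""

    def generate(self, prompt: str, width: int = 512, height: int = 512) -> Image:
        if not prompt.strip():
            raise ValueError("prompt must not be empty")
        # A real model would run a diffusion process here; this stub only
        # records the request so the input/output types are visible.
        return Image(width=width, height=height, prompt=prompt)


model = FakeTextToImageModel()
output = model.generate("A sunny beach with palm trees")
print(type(output).__name__)  # the result is an Image object, not a str
```

With a real library the returned object would hold pixel data, but the shape of the call (text in, image out) stays the same.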
4. You tried to use GPT to create an image by running this code:
# Pseudocode: load_model is a hypothetical helper that returns a loaded model object
model = load_model('GPT')
input_text = 'A cat sitting on a sofa'
output = model.generate_image(input_text)
What is the main problem here?
medium
A. The input text is too short for GPT to understand.
B. GPT cannot generate images; it only generates text.
C. The method name should be generate_text, not generate_image.
D. There is no problem; the code will work fine.

Solution

  1. Step 1: Understand GPT's capabilities

    GPT is designed to generate text, not images.
  2. Step 2: Analyze the method call

    Calling generate_image on GPT is invalid because GPT lacks image generation ability.
  3. Final Answer:

    GPT cannot generate images; it only generates text. -> Option B
  4. Quick Check:

    GPT = text only, no images [OK]
Hint: GPT does text, not images. [OK]
Common Mistakes:
  • Thinking GPT can create images
  • Believing only the method name is wrong
  • Ignoring model capability limits
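The capability mismatch in question 4 can be sketched as a simple check before calling a method. This is an illustrative sketch under stated assumptions: `FakeTextModel`, `generate_text`, and `try_generate_image` are hypothetical names, and the stub does not call any real GPT API. It shows one way an app might detect that a text-only model offers no image method instead of failing at call time.

```python
class FakeTextModel:
    """Hypothetical stand-in for a text-only language model such as GPT."""

    def generate_text(self, prompt: str) -> str:
        # A real model would predict tokens; this stub just echoes the prompt.
        return f"Story about: {prompt}"


def try_generate_image(model, prompt: str):
    """Return an image if the model supports it, otherwise a short explanation."""
    method = getattr(model, "generate_image", None)
    if method is None:
        return f"{type(model).__name__} has no image generation capability"
    return method(prompt)


gpt_like = FakeTextModel()
result = try_generate_image(gpt_like, "A cat sitting on a sofa")
print(result)
```

Renaming the call to `generate_text` would make the code run, but it would still return text, which is why the capability of the model is the main problem rather than the method name.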
5. You want to build an app that lets users type a prompt to generate a story and then see an image illustrating it. Which combination of models should you use?
hard
A. Use GPT for image generation and DALL-E for text generation.
B. Use DALL-E to generate the story and GPT to create the image.
C. Use Stable Diffusion for both story and image generation.
D. Use GPT to generate the story and Stable Diffusion to create the image.

Solution

  1. Step 1: Identify model roles for text and image

    GPT is best for generating human-like text stories.
  2. Step 2: Identify model for image creation

    Stable Diffusion creates images from text descriptions, which makes it a good fit for illustrating the story.
  3. Final Answer:

    Use GPT to generate the story and Stable Diffusion to create the image. -> Option D
  4. Quick Check:

    Text by GPT + Image by Stable Diffusion = App [OK]
Hint: Text with GPT, images with Stable Diffusion. [OK]
Common Mistakes:
  • Swapping roles of GPT and DALL-E
  • Using one model for both tasks
  • Confusing image and text generation roles
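The two-model app from question 5 can be sketched as a small pipeline. This is a minimal sketch, assuming each model is wrapped in a plain function: `write_story`, `draw_image`, and `story_with_illustration` are hypothetical names, and both model functions are stubs rather than real GPT or Stable Diffusion calls. It shows the order of the steps: the text model runs first, and its output becomes the prompt for the image model.

```python
from typing import Callable, Tuple


def write_story(prompt: str) -> str:
    """Hypothetical stand-in for a GPT call that returns story text."""
    return f"Once upon a time, {prompt}."


def draw_image(description: str) -> bytes:
    """Hypothetical stand-in for a Stable Diffusion call that returns image bytes."""
    return f"<image of: {description}>".encode("utf-8")


def story_with_illustration(
    prompt: str,
    text_model: Callable[[str], str] = write_story,
    image_model: Callable[[str], bytes] = draw_image,
) -> Tuple[str, bytes]:
    """Generate a story from the prompt, then an image that illustrates the story."""
    story = text_model(prompt)   # step 1: text generation
    image = image_model(story)   # step 2: image generation from the story text
    return story, image


story, image = story_with_illustration("a fox found a lantern")
print(story)
print(len(image), "bytes of placeholder image data")
```

Passing the models in as functions keeps the pipeline independent of any one provider, so the image step could be swapped for another text-to-image model such as DALL-E without changing the flow.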