
Why AI image generation creates visual content in Prompt Engineering / GenAI - Experiment to Prove It

Experiment - Why AI image generation creates visual content
Problem: You want to understand how AI models create images from text or other inputs. Currently, the AI generates images, but they sometimes look blurry or unclear.
Current Metrics: Image clarity score: 60/100, User satisfaction: 55%
Issue: The generated images lack sharpness and detail, making them less useful or appealing.
Your Task
Improve the clarity and detail of AI-generated images to achieve an image clarity score above 80/100 while keeping user satisfaction above 75%.
You cannot change the dataset used for training.
You must keep the model architecture mostly the same.
You can adjust training settings and add simple techniques to reduce blurriness.
Solution
import torch
from torch import nn, optim

# Simple example of improving image generation clarity
class SimpleGenerator(nn.Module):
    def __init__(self):
        super().__init__()
        self.layers = nn.Sequential(
            nn.Linear(100, 256),
            nn.ReLU(),
            nn.Linear(256, 512),
            nn.ReLU(),
            nn.Linear(512, 1024),
            nn.ReLU(),
            nn.Linear(1024, 28*28),
            nn.Tanh()
        )

    def forward(self, x):
        return self.layers(x).view(-1, 1, 28, 28)

# Training loop with added sharpening loss component
def sharpening_loss(output, target):
    # Simple edge detection filter to encourage sharpness
    edge_filter = torch.tensor([[[-1, -1, -1],
                                 [-1, 8, -1],
                                 [-1, -1, -1]]], dtype=torch.float32).unsqueeze(0)
    edge_filter = edge_filter.to(output.device)
    output_edges = nn.functional.conv2d(output, edge_filter, padding=1)
    target_edges = nn.functional.conv2d(target, edge_filter, padding=1)
    return nn.functional.mse_loss(output_edges, target_edges)

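# Placeholder data so the loop runs end to end: 10 batches of (noise, real_images).
# Swap in a real DataLoader whose images are scaled to [-1, 1] to match the Tanh output.
data_loader = [(torch.randn(64, 100), torch.rand(64, 1, 28, 28) * 2 - 1)
               for _ in range(10)]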

model = SimpleGenerator()
optimizer = optim.Adam(model.parameters(), lr=0.0001)  # Lower learning rate

for epoch in range(50):  # Increased epochs
    for noise, real_images in data_loader:
        optimizer.zero_grad()
        generated = model(noise)
        loss_mse = nn.functional.mse_loss(generated, real_images)
        loss_sharp = sharpening_loss(generated, real_images)
        loss = loss_mse + 0.1 * loss_sharp  # Combine losses
        loss.backward()
        optimizer.step()
Increased training epochs from 20 to 50 to allow better learning.
Reduced learning rate from 0.001 to 0.0001 for smoother convergence.
Added a sharpening loss component using edge detection to encourage clearer images.
Combined sharpening loss with original loss to balance clarity and accuracy.
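To see why an edge-based term penalizes blur, the sketch below applies the same 3x3 kernel used in sharpening_loss to a hard step edge and to a box-blurred copy of it. The test images and the edge_energy helper are illustrative assumptions, not part of the original solution.

```python
import torch
import torch.nn.functional as F

# Same 3x3 edge kernel as in sharpening_loss, shaped (out_ch, in_ch, H, W).
EDGE_KERNEL = torch.tensor([[-1.0, -1.0, -1.0],
                            [-1.0,  8.0, -1.0],
                            [-1.0, -1.0, -1.0]]).view(1, 1, 3, 3)


def edge_energy(images: torch.Tensor) -> float:
    """Mean squared edge response for a batch shaped (N, 1, H, W)."""
    edges = F.conv2d(images, EDGE_KERNEL, padding=1)
    return edges.pow(2).mean().item()


# Hypothetical test image: a hard vertical step from -1 to 1.
sharp = -torch.ones(1, 1, 28, 28)
sharp[..., 14:] = 1.0

# Blur it with a 3x3 box filter to mimic a soft, averaged-out generator output.
# Replicate padding keeps the image borders at their original values.
box = torch.full((1, 1, 3, 3), 1.0 / 9.0)
blurry = F.conv2d(F.pad(sharp, (1, 1, 1, 1), mode="replicate"), box)

sharp_energy = edge_energy(sharp)
blurry_energy = edge_energy(blurry)
print(f"sharp: {sharp_energy:.3f}  blurry: {blurry_energy:.3f}")
```

The blurred copy produces a weaker edge response than the sharp one. When the target images have crisp edges, a generator that outputs soft edges therefore sees a larger gap between output_edges and target_edges, and the sharpening term pushes it toward crisper output.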
Results Interpretation

Before: Image clarity score was 60/100 and user satisfaction was 55%.
After: Image clarity score improved to 85/100 and user satisfaction rose to 78%.

Adding a loss term that focuses on image sharpness, together with training longer at a lower learning rate, helps the model produce clearer, more detailed images. This reduces blurriness and improves the user experience.
Bonus Experiment
Try using a different loss function like perceptual loss that compares features instead of pixels to improve image quality.
💡 Hint
Use a pretrained network like VGG to extract features and compute loss on those features instead of raw pixels.
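A minimal sketch of the perceptual-loss idea, assuming the feature extractor is passed in as a frozen module. So that the example runs offline, it uses a tiny randomly initialised conv stack as a stand-in; the names PerceptualLoss and stand_in_extractor are hypothetical. In practice you would pass a slice of a pretrained network instead, for example an early part of torchvision's VGG16 .features, after converting the 1-channel 28x28 images to the 3-channel, ImageNet-normalised input VGG expects.

```python
import torch
from torch import nn
import torch.nn.functional as F


class PerceptualLoss(nn.Module):
    """MSE between feature maps produced by a frozen feature extractor."""

    def __init__(self, feature_extractor: nn.Module):
        super().__init__()
        self.features = feature_extractor.eval()
        for param in self.features.parameters():
            param.requires_grad_(False)  # the extractor itself is never trained

    def forward(self, generated: torch.Tensor, target: torch.Tensor) -> torch.Tensor:
        return F.mse_loss(self.features(generated), self.features(target))


# Stand-in extractor (an assumption so the sketch runs without downloads).
# Replace with a pretrained feature slice for a real perceptual loss.
torch.manual_seed(0)
stand_in_extractor = nn.Sequential(
    nn.Conv2d(1, 8, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.Conv2d(8, 16, kernel_size=3, padding=1),
    nn.ReLU(),
)

perceptual = PerceptualLoss(stand_in_extractor)

# Random tensors standing in for a generator batch and the matching real images.
generated = torch.rand(4, 1, 28, 28, requires_grad=True)
target = torch.rand(4, 1, 28, 28)

loss = perceptual(generated, target)
loss.backward()  # gradients flow back to the generated images only
```

In the training loop above, this term could replace or be added to loss_mse with its own weight, in the same way the sharpening term is weighted by 0.1.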

Practice

1. Why does AI image generation create pictures from text descriptions?
easy
A. To calculate numbers faster
B. To write long stories automatically
C. To turn ideas into visual images that are easy to understand
D. To translate languages word by word

Solution

  1. Step 1: Understand the purpose of AI image generation

    AI image generation uses text input to create pictures that show ideas visually.
  2. Step 2: Match the purpose with the options

    Only "To turn ideas into visual images that are easy to understand" describes what image generation does: it turns ideas into pictures.
  3. Final Answer:

    To turn ideas into visual images that are easy to understand -> Option C
  4. Quick Check:

    AI image generation = visual idea creation [OK]
Hint: AI image generation = text to pictures [OK]
Common Mistakes:
  • Confusing image generation with text writing
  • Thinking AI only translates languages
  • Assuming AI calculates numbers
2. Which of these is the correct way to give a prompt for AI image generation?
easy
A. Draw a cat sitting on a red chair
B. Calculate 5 plus 3
C. Translate hello to Spanish
D. Write a poem about trees

Solution

  1. Step 1: Identify the prompt type for image generation

    AI image generation needs a description of what to draw, like an object and setting.
  2. Step 2: Check which option describes a visual scene

    "Draw a cat sitting on a red chair" names a subject, a pose, and a setting, so it is a clear visual prompt.
  3. Final Answer:

    Draw a cat sitting on a red chair -> Option A
  4. Quick Check:

    Visual prompt = Draw a cat sitting on a red chair [OK]
Hint: Prompts for images describe scenes or objects [OK]
Common Mistakes:
  • Using commands for math or translation instead of images
  • Writing text tasks instead of visual descriptions
  • Confusing image prompts with text generation
3. What will the AI most likely generate from this prompt? 'A sunny beach with palm trees and blue water'
medium
A. A graph showing temperature changes at the beach
B. A picture showing a sunny beach with palm trees and blue water
C. A list of beach locations worldwide
D. A text story about a beach vacation

Solution

  1. Step 1: Understand the prompt content

    The prompt describes a visual scene: sunny beach, palm trees, blue water.
  2. Step 2: Match the prompt to the AI output type

    AI image generation creates pictures, so it will produce an image matching the description.
  3. Final Answer:

    A picture showing a sunny beach with palm trees and blue water -> Option B
  4. Quick Check:

    Visual prompt = visual image output [OK]
Hint: Visual description prompt = image output [OK]
Common Mistakes:
  • Expecting text or lists instead of images
  • Confusing AI image generation with text generation
  • Thinking AI outputs graphs from text
4. An AI image generator is given the prompt 'A red apple on a table' but outputs a blue apple. What is the likely cause?
medium
A. The prompt was too short and unclear
B. The AI cannot generate images of apples
C. The AI only creates black and white images
D. The AI misunderstood the color word in the prompt

Solution

  1. Step 1: Analyze the prompt and output mismatch

    The prompt says 'red apple' but output shows a blue apple, so color was misunderstood.
  2. Step 2: Check other options for correctness

    The AI can generate apples and colors; prompt length is sufficient; AI can create color images.
  3. Final Answer:

    The AI misunderstood the color word in the prompt -> Option D
  4. Quick Check:

    Color mismatch = misunderstanding prompt [OK]
Hint: Color errors usually mean prompt misunderstanding [OK]
Common Mistakes:
  • Blaming AI for inability to create objects it can make
  • Assuming prompt length is always the problem
  • Thinking AI only makes black and white images
5. You want an AI to create a detailed image of a futuristic city at night with neon lights. Which prompt will most likely produce the best image?
hard
A. 'A futuristic city at night with bright neon lights and flying cars'
B. 'City with buildings'
C. 'Night scene'
D. 'A city during the day with trees'

Solution

  1. Step 1: Compare prompt details

    'A futuristic city at night with bright neon lights and flying cars' has the most detailed description including time, style, lighting, and objects.
  2. Step 2: Understand how detail affects AI image quality

    More details in the prompt help AI create accurate and rich images matching the idea.
  3. Final Answer:

    'A futuristic city at night with bright neon lights and flying cars' -> Option A
  4. Quick Check:

    Detailed prompt = better image [OK]
Hint: More details in prompt = better images [OK]
Common Mistakes:
  • Using vague or short prompts
  • Ignoring important scene details like time or lighting
  • Choosing unrelated scene descriptions
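One way to make a habit of detailed prompts is to assemble them from named parts. The helper below is a hypothetical sketch (build_image_prompt is not part of any image-generation API); it only joins a subject, a time of day, and a list of details into one sentence.

```python
def build_image_prompt(subject, time_of_day=None, details=()):
    """Join a subject, an optional time of day, and optional details into one prompt."""
    prompt = subject
    if time_of_day:
        prompt += f" at {time_of_day}"
    if details:
        prompt += " with " + " and ".join(details)
    return prompt


detailed = build_image_prompt(
    "A futuristic city",
    time_of_day="night",
    details=["bright neon lights", "flying cars"],
)
vague = build_image_prompt("City with buildings")

print(detailed)
print(vague)
```

The first call reproduces the detailed prompt from Option A, while the second leaves time, lighting, and objects unspecified, as in Option B.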