
Image-to-image transformation in Prompt Engineering / GenAI - ML Experiment: Train & Evaluate

Experiment - Image-to-image transformation
Problem: We want to transform grayscale images into color images using a neural network.
Current Metrics: Training loss: 0.02, Validation loss: 0.15, Training MAE: 0.02, Validation MAE: 0.35
Issue: The model is overfitting: training error is very low, but validation error is much higher.
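One way to make the diagnosis concrete is to compare the training and validation errors directly. The sketch below uses the metrics reported above; the `generalization_gap` helper and the 0.05 threshold are illustrative assumptions, not part of the exercise.

```python
# Metrics reported for the current model (taken from the text above).
current = {"train_loss": 0.02, "val_loss": 0.15, "train_mae": 0.02, "val_mae": 0.35}


def generalization_gap(metrics, name):
    """Return validation minus training value for a metric such as 'mae'."""
    return metrics[f"val_{name}"] - metrics[f"train_{name}"]


# Hypothetical threshold: treat a MAE gap above 0.05 as a sign of overfitting.
GAP_THRESHOLD = 0.05

mae_gap = generalization_gap(current, "mae")
is_overfitting = mae_gap > GAP_THRESHOLD
print(f"MAE gap: {mae_gap:.2f}, overfitting: {is_overfitting}")
```

A large positive gap means the model fits the training images far better than unseen ones, which is the pattern the task asks you to reduce.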
Your Task
Reduce overfitting so that validation MAE drops to at most 0.20 while training MAE stays above 0.10.
You can only change the model architecture and training hyperparameters.
Do not change the dataset or input/output format.
Solution
import tensorflow as tf
from tensorflow.keras import layers, models
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Define data augmentation
train_datagen = ImageDataGenerator(
    rescale=1./255,
    rotation_range=20,
    width_shift_range=0.1,
    height_shift_range=0.1,
    horizontal_flip=True,
    validation_split=0.2
)

# Load data (placeholder, replace with actual data loading)
# X_train, y_train = load_grayscale_and_color_images()

# Create model with dropout
model = models.Sequential([
    layers.Input(shape=(64, 64, 1)),
    layers.Conv2D(64, (3,3), activation='relu', padding='same'),
    layers.Dropout(0.3),
    layers.Conv2D(64, (3,3), activation='relu', padding='same'),
    layers.MaxPooling2D((2,2)),
    layers.Conv2D(128, (3,3), activation='relu', padding='same'),
    layers.Dropout(0.3),
    layers.UpSampling2D((2,2)),
    layers.Conv2D(3, (3,3), activation='sigmoid', padding='same')
])

model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=0.0005),
              loss='mse',
              metrics=['mae'])

# Use early stopping
early_stop = tf.keras.callbacks.EarlyStopping(monitor='val_loss', patience=5, restore_best_weights=True)

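# Fit model with data augmentation.
# Caution: flow(X_train, y_train) rescales and transforms only X_train and
# returns y_train unchanged, so the color targets must receive the identical
# scaling and geometric transform (for example, from a second generator with
# the same arguments and the same seed) before the pairs are used for training.
# history = model.fit(
#     train_datagen.flow(X_train, y_train, subset='training', batch_size=32),
#     validation_data=train_datagen.flow(X_train, y_train, subset='validation'),
#     epochs=50,
#     callbacks=[early_stop]
# )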

# For demonstration, assume after training:
new_metrics = 'Training loss: 0.05, Validation loss: 0.08, Training MAE: 0.12, Validation MAE: 0.18'
Added dropout layers after convolution layers to reduce overfitting.
Implemented data augmentation to increase data variety.
Lowered learning rate from 0.001 to 0.0005 for smoother training.
Added early stopping to prevent over-training.
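To see what `patience=5` and `restore_best_weights=True` mean in practice, here is a minimal pure-Python sketch of the stopping rule. The validation-loss values are made up for illustration, and the function simplifies the Keras callback (it ignores `min_delta` and other options).

```python
def early_stopping_epoch(val_losses, patience):
    """Simulate patience-based early stopping on a list of validation losses.

    Returns (stop_epoch, best_epoch), both 0-indexed. Training stops once the
    loss has failed to improve for `patience` consecutive epochs.
    """
    best_loss = float("inf")
    best_epoch = -1
    wait = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            best_loss, best_epoch, wait = loss, epoch, 0
        else:
            wait += 1
            if wait >= patience:
                return epoch, best_epoch
    # Ran out of epochs before patience was exhausted.
    return len(val_losses) - 1, best_epoch


# Hypothetical validation losses: improvement stalls after epoch 3.
val_losses = [0.30, 0.20, 0.12, 0.08, 0.09, 0.10, 0.09, 0.11, 0.12, 0.13]
stop_epoch, best_epoch = early_stopping_epoch(val_losses, patience=5)
print(stop_epoch, best_epoch)  # 8 3
```

With `restore_best_weights=True`, the model would end up with the weights from `best_epoch`, not from the epoch where training stopped.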
Results Interpretation

Before: Training MAE 0.02, Validation MAE 0.35, high overfitting.

After: Training MAE 0.12, Validation MAE 0.18, better generalization.

Adding dropout and data augmentation helps the model generalize: the gap between training and validation MAE shrinks from 0.33 to 0.06, and validation MAE falls from 0.35 to 0.18.
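One detail matters when augmenting image-to-image data: any geometric transform has to be applied identically to the input and its target, otherwise the pairs no longer line up. A minimal NumPy sketch of a paired horizontal flip follows; the function name, array shapes, and random data are assumptions for illustration.

```python
import numpy as np


def paired_random_flip(gray_batch, color_batch, rng, p=0.5):
    """Flip each (input, target) pair left-right together with probability p.

    gray_batch:  (N, H, W, 1) array, color_batch: (N, H, W, 3) array.
    """
    gray_out = gray_batch.copy()
    color_out = color_batch.copy()
    flip = rng.random(len(gray_batch)) < p  # one decision per pair
    # Reverse the width axis (axis 2 of the batch) for the selected pairs.
    gray_out[flip] = gray_batch[flip][:, :, ::-1, :]
    color_out[flip] = color_batch[flip][:, :, ::-1, :]
    return gray_out, color_out, flip


rng = np.random.default_rng(0)
color = rng.random((4, 8, 8, 3))
gray = color.mean(axis=-1, keepdims=True)  # stand-in grayscale inputs

gray_aug, color_aug, flipped = paired_random_flip(gray, color, rng)
# The augmented input is still the grayscale version of the augmented target.
print(np.allclose(gray_aug, color_aug.mean(axis=-1, keepdims=True)))
```

Because one flip decision is drawn per pair and applied to both arrays, each augmented input remains aligned with its color target.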
Bonus Experiment
Try using a U-Net architecture for image-to-image transformation and compare the results.
💡 Hint
U-Net uses skip connections that help preserve details during transformation, often improving output quality.
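The effect of a skip connection can be illustrated without a full network. The NumPy sketch below pools and then upsamples a small array, which loses detail, and then concatenates the original features alongside the upsampled ones, as a U-Net skip connection would. The 4x4 array and the nearest-neighbour upsampling are simplifying assumptions; a real U-Net uses learned convolutions at each stage.

```python
import numpy as np

# A 4x4 single-channel "feature map" with distinct values.
x = np.arange(16, dtype=float).reshape(4, 4, 1)

# Encoder step: 2x2 max pooling halves the resolution.
# The reshape splits rows and columns into (block, offset) pairs.
pooled = x.reshape(2, 2, 2, 2, 1).max(axis=(1, 3))  # shape (2, 2, 1)

# Decoder step: nearest-neighbour upsampling restores the size, not the detail.
upsampled = pooled.repeat(2, axis=0).repeat(2, axis=1)  # shape (4, 4, 1)
print(np.array_equal(upsampled, x))  # False: detail was lost in pooling

# Skip connection: concatenate the encoder features along the channel axis,
# so later layers see both the coarse and the full-resolution information.
merged = np.concatenate([upsampled, x], axis=-1)  # shape (4, 4, 2)
print(merged.shape, np.array_equal(merged[..., 1:], x))  # (4, 4, 2) True
```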

Practice

1.

What is the main goal of image-to-image transformation in AI?

easy
A. To change an input image into a different output image automatically
B. To classify images into categories
C. To detect objects inside an image
D. To generate text from an image

Solution

  1. Step 1: Understand the purpose of image-to-image transformation

    This technique changes one image into another, like coloring or style transfer.
  2. Step 2: Compare with other image tasks

    Classification, detection, and text generation are different tasks, not image transformation.
  3. Final Answer:

    To change an input image into a different output image automatically -> Option A
  4. Quick Check:

    Image-to-image transformation = change image [OK]
Hint: Image-to-image means input image changes to output image [OK]
Common Mistakes:
  • Confusing transformation with classification
  • Thinking it detects objects instead of changing images
  • Mixing it up with text generation from images
2.

Which of the following is the correct way to describe an image-to-image model's input and output?

Input: ?
Output: ?

easy
A. Input: Image, Output: Image
B. Input: Text, Output: Image
C. Input: Image, Output: Text
D. Input: Number, Output: Image

Solution

  1. Step 1: Identify input type for image-to-image models

    These models take an image as input to transform it.
  2. Step 2: Identify output type for image-to-image models

    The output is also an image, changed in style, color, or content.
  3. Final Answer:

    Input: Image, Output: Image -> Option A
  4. Quick Check:

    Input and output both images [OK]
Hint: Both input and output are images in image-to-image tasks [OK]
Common Mistakes:
  • Confusing input as text or numbers
  • Thinking output is text instead of image
  • Mixing input/output types
3.

Consider this simplified Python code using a model for image-to-image transformation:

input_image = load_image('sketch.png')
output_image = model.transform(input_image)
save_image(output_image, 'colorized.png')
print(type(output_image))

What will be printed?

medium
A. <class 'str'>
B. <class 'numpy.ndarray'>
C. <class 'PIL.Image.Image'>
D. Error: model.transform is not defined

Solution

  1. Step 1: Understand typical output type of image-to-image models

    The snippet does not define model.transform, so the answer rests on convention: Python image models commonly return pixel data as a NumPy array.
  2. Step 2: Check code for output type

    Assuming model.transform follows that convention, it returns a numpy.ndarray, not a PIL Image or a string.
  3. Final Answer:

    <class 'numpy.ndarray'> -> Option B
  4. Quick Check:

    Model output image = numpy array [OK]
Hint: Model outputs image arrays, not strings or PIL objects [OK]
Common Mistakes:
  • Assuming output is a string filename
  • Confusing PIL Image with numpy array
  • Expecting error without context
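The snippet in this question relies on helpers that are not defined. The sketch below fills them in with hypothetical stand-ins (`load_image`, `IdentityModel`, and its `transform` method are all assumptions) to show that the printed type depends on what `transform` returns.

```python
import numpy as np


def load_image(path):
    """Hypothetical loader: returns a 64x64 grayscale image as a NumPy array."""
    return np.zeros((64, 64, 1), dtype=np.float32)


class IdentityModel:
    """Hypothetical model whose transform() returns a 3-channel NumPy array."""

    def transform(self, image):
        # Repeat the single gray channel three times as a placeholder "colorization".
        return np.repeat(image, 3, axis=-1)


model = IdentityModel()
output_image = model.transform(load_image("sketch.png"))
print(type(output_image))  # <class 'numpy.ndarray'>
print(output_image.shape)  # (64, 64, 3)
```

If `transform` returned a PIL image instead, the same `print` call would report `PIL.Image.Image`, which is why the question only has a definite answer under the NumPy convention.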
4.

Look at this code snippet for image-to-image transformation:

def transform_image(model, img_path):
    img = load_image(img_path)
    result = model.transform(img)
    return result

output = transform_image(my_model, 12345)
print(type(output))

What is the main error here?

medium
A. The function returns None instead of an image
B. The model.transform method does not exist
C. The image path should be a string, not a number
D. The print statement is missing parentheses

Solution

  1. Step 1: Check the argument passed to load_image

    load_image expects a file path string, but 12345 is a number, causing an error.
  2. Step 2: Verify other code parts

    model.transform and print syntax are correct; function returns result properly.
  3. Final Answer:

    The image path should be a string, not a number -> Option C
  4. Quick Check:

    Image path must be string [OK]
Hint: File paths must be strings, not numbers [OK]
Common Mistakes:
  • Thinking model.transform is missing
  • Ignoring argument type for image path
  • Confusing print syntax in Python 3
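A defensive version of the function can reject a non-path argument before it reaches the loader. In the sketch below, `load_image` and `EchoModel` are hypothetical stand-ins for the undefined helpers in the question.

```python
import os


def load_image(path):
    """Hypothetical loader stand-in: records the path it was asked to load."""
    return {"source": os.fspath(path)}


class EchoModel:
    """Hypothetical model stand-in whose transform() returns its input."""

    def transform(self, image):
        return image


def transform_image(model, img_path):
    # Accept str or os.PathLike (e.g. pathlib.Path); reject numbers and others.
    if not isinstance(img_path, (str, os.PathLike)):
        raise TypeError(f"img_path must be a path string, got {type(img_path).__name__}")
    return model.transform(load_image(img_path))


try:
    transform_image(EchoModel(), 12345)
    error_message = None
except TypeError as exc:
    error_message = str(exc)
print(error_message)  # img_path must be a path string, got int
```

Checking the type up front gives a clear message at the call site instead of an error from deep inside the image loader.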
5.

You want to build an image-to-image model that converts black-and-white sketches into colored images. Which approach is best?

A dataset has pairs of sketches and their colored versions.

hard
A. Train a text-to-image model with sketch descriptions
B. Use unsupervised clustering on sketches only
C. Apply image classification on sketches
D. Train a supervised model using paired sketch and color images

Solution

  1. Step 1: Identify the task type

    Converting sketches to colored images is a paired image-to-image translation task.
  2. Step 2: Choose the right training method

    Supervised learning with paired data (sketch and color image) is best to learn direct mapping.
  3. Step 3: Evaluate other options

    Unsupervised clustering, text-to-image, and classification do not fit this paired transformation task.
  4. Final Answer:

    Train a supervised model using paired sketch and color images -> Option D
  5. Quick Check:

    Paired data needs supervised training [OK]
Hint: Use paired images for supervised training in image-to-image tasks [OK]
Common Mistakes:
  • Choosing unsupervised methods without paired data
  • Confusing text-to-image with image-to-image
  • Using classification instead of transformation
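As a toy illustration of supervised training on paired data, the sketch below fits a per-pixel linear map from gray values to RGB values with least squares. This is a deliberately simple stand-in for a neural network; the synthetic data and the tint vector are invented for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic paired data: each "sketch" pixel has a matching color pixel.
true_tint = np.array([0.9, 0.5, 0.2])  # hypothetical gray -> RGB mapping
gray = rng.random((1000, 1))           # inputs, shape (pixels, 1)
color = gray * true_tint               # targets, shape (pixels, 3)

# Supervised fit: find W minimizing ||gray @ W - color||^2 over the pairs.
W, *_ = np.linalg.lstsq(gray, color, rcond=None)

predicted = gray @ W
mae = np.abs(predicted - color).mean()
print(W.round(3), mae < 1e-9)
```

The fit only works because every input has a known target. Without the paired color images there would be nothing to measure the error against.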