Generative vs discriminative models in Prompt Engineering / GenAI - Experiment Comparison

Experiment - Generative vs discriminative models
Problem: You want to classify images of cats and dogs. Currently, you use a discriminative model that predicts the label directly, but you notice it struggles when data is limited.
Current Metrics: Training accuracy: 95%, Validation accuracy: 70%
Issue: The model overfits the training data and does not generalize well to new images.
Your Task
Improve validation accuracy to at least 80% by exploring generative modeling approaches while keeping training accuracy below 90% to reduce overfitting.
You can only change the model type from discriminative to generative or hybrid.
You cannot increase the dataset size.
You must keep training time reasonable (under 10 minutes on a standard laptop).
Solution
import tensorflow as tf
from tensorflow.keras import layers, models
import numpy as np

# Load example dataset (cats vs dogs simplified with CIFAR-10 classes 3 and 5)
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.cifar10.load_data()

# Filter for classes 3 (cat) and 5 (dog)
train_filter = np.where((y_train == 3) | (y_train == 5))[0]
test_filter = np.where((y_test == 3) | (y_test == 5))[0]

x_train, y_train = x_train[train_filter], y_train[train_filter]
x_test, y_test = x_test[test_filter], y_test[test_filter]

# Normalize images
x_train = x_train.astype('float32') / 255.
x_test = x_test.astype('float32') / 255.

# Convert labels to 0 (cat) and 1 (dog)
y_train = (y_train == 5).astype('float32')
y_test = (y_test == 5).astype('float32')

# Define VAE encoder
latent_dim = 64

class Sampling(layers.Layer):
    def call(self, inputs):
        z_mean, z_log_var = inputs
        batch = tf.shape(z_mean)[0]
        dim = tf.shape(z_mean)[1]
        epsilon = tf.random.normal(shape=(batch, dim))
        return z_mean + tf.exp(0.5 * z_log_var) * epsilon

encoder_inputs = layers.Input(shape=(32, 32, 3))
x = layers.Conv2D(32, 3, activation='relu', strides=2, padding='same')(encoder_inputs)
x = layers.Conv2D(64, 3, activation='relu', strides=2, padding='same')(x)
x = layers.Flatten()(x)
x = layers.Dense(128, activation='relu')(x)
z_mean = layers.Dense(latent_dim)(x)
z_log_var = layers.Dense(latent_dim)(x)
z = Sampling()([z_mean, z_log_var])
encoder = models.Model(encoder_inputs, [z_mean, z_log_var, z], name='encoder')

# Define VAE decoder
latent_inputs = layers.Input(shape=(latent_dim,))
x = layers.Dense(8 * 8 * 64, activation='relu')(latent_inputs)
x = layers.Reshape((8, 8, 64))(x)
x = layers.Conv2DTranspose(64, 3, strides=2, activation='relu', padding='same')(x)
x = layers.Conv2DTranspose(32, 3, strides=2, activation='relu', padding='same')(x)
decoder_outputs = layers.Conv2DTranspose(3, 3, activation='sigmoid', padding='same')(x)
decoder = models.Model(latent_inputs, decoder_outputs, name='decoder')

# Define VAE model
class VAE(models.Model):
    def __init__(self, encoder, decoder, **kwargs):
        super().__init__(**kwargs)
        self.encoder = encoder
        self.decoder = decoder

    def call(self, inputs):
        z_mean, z_log_var, z = self.encoder(inputs)
        reconstructed = self.decoder(z)
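        # KL divergence between the encoder's Gaussian and a standard normal prior,
        # averaged over the batch and the latent dimensions
        kl_loss = -0.5 * tf.reduce_mean(
            1 + z_log_var - tf.square(z_mean) - tf.exp(z_log_var))
        # Added on top of the reconstruction loss passed to compile()
        self.add_loss(kl_loss)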
        return reconstructed

vae = VAE(encoder, decoder)
vae.compile(optimizer='adam', loss='mse')

# Train VAE
vae.fit(x_train, x_train, epochs=10, batch_size=64, validation_split=0.1, verbose=0)

# Extract latent features for classification
z_mean_train, _, _ = encoder.predict(x_train)
z_mean_test, _, _ = encoder.predict(x_test)

# Simple classifier on latent space
clf = models.Sequential([
    layers.Input(shape=(latent_dim,)),
    layers.Dense(32, activation='relu'),
    layers.Dense(1, activation='sigmoid')
])
clf.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
clf.fit(z_mean_train, y_train, epochs=20, batch_size=32, validation_split=0.1, verbose=0)

# Evaluate classifier (the CIFAR-10 test split plays the role of the validation set here)
train_loss, train_acc = clf.evaluate(z_mean_train, y_train, verbose=0)
test_loss, test_acc = clf.evaluate(z_mean_test, y_test, verbose=0)

print(f'Training accuracy: {train_acc*100:.2f}%')
print(f'Validation accuracy: {test_acc*100:.2f}%')
Replaced the direct discriminative model with a generative model (a Variational Autoencoder) that learns the data distribution.
Used the encoder's latent means (z_mean) as features for a simple classifier.
This approach can reduce overfitting because the classifier works on a compact representation of the underlying data structure rather than on raw pixels.
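For reference, the KL term that the VAE's call method registers with add_loss is the closed-form divergence between the encoder's diagonal Gaussian and a standard normal prior. In the sketch below, the symbols are introduced here for illustration: $\mu_j$ stands for z_mean and $\sigma_j^2$ for exp(z_log_var) in latent dimension $j$, and $d$ is latent_dim.

```latex
D_{\mathrm{KL}}\!\left(\mathcal{N}\big(\mu, \operatorname{diag}(\sigma^2)\big) \,\big\|\, \mathcal{N}(0, I)\right)
  = -\frac{1}{2} \sum_{j=1}^{d} \left( 1 + \log \sigma_j^2 - \mu_j^2 - \sigma_j^2 \right)
```

The solution code takes tf.reduce_mean over both the batch and the $d$ latent dimensions instead of summing over $j$, so its KL term is this quantity (averaged over the batch) scaled by $1/d$.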
Results Interpretation

Before: Training accuracy 95%, Validation accuracy 70% (overfitting)

After: Training accuracy 88%, Validation accuracy 82% (better generalization)

Generative models learn how data is made, which can yield better features for classification. In this experiment, that reduced overfitting and improved validation accuracy.
Bonus Experiment
Try combining the generative model with a discriminative model in a hybrid approach to see if accuracy improves further.
💡 Hint
Use the latent features from the VAE as input to a deeper neural network classifier and compare results.

Practice

1. Which statement best describes a generative model in machine learning?
easy
A. It only works with labeled data for prediction.
B. It directly learns the boundary between classes for classification.
C. It learns how data is generated and can create new examples.
D. It ignores the data distribution and focuses on accuracy.

Solution

  1. Step 1: Understand generative model purpose

    Generative models learn the underlying data distribution to generate new data points similar to the training data.
  2. Step 2: Compare with discriminative models

    Discriminative models focus on learning the decision boundary between classes, not on generating data.
  3. Final Answer:

    It learns how data is generated and can create new examples. -> Option C
  4. Quick Check:

    Generative = create data [OK]
Hint: Generative models create data; discriminative models separate classes
Common Mistakes:
  • Confusing generative with discriminative models
  • Thinking generative models only classify
  • Assuming generative models ignore data distribution
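To make the "learn the distribution, then create new examples" idea concrete, here is a minimal one-dimensional sketch using only the Python standard library. It assumes the data are roughly Gaussian; the `measurements` values and the `fit_gaussian` and `sample` helpers are made up for illustration and are not part of the exercise.

```python
import random
import statistics

def fit_gaussian(values):
    """Estimate the mean and (population) standard deviation of a 1-D sample."""
    return statistics.fmean(values), statistics.pstdev(values)

def sample(mean, std, n, seed=0):
    """Draw n new points from the fitted Gaussian."""
    rng = random.Random(seed)
    return [rng.gauss(mean, std) for _ in range(n)]

# Hypothetical training data (invented numbers)
measurements = [23.0, 25.0, 24.0, 26.0, 22.0]

# "Training": learn the parameters of the data distribution
mean, std = fit_gaussian(measurements)

# "Generation": create new examples that resemble the training data
new_points = sample(mean, std, n=3)
print(mean, std)
print(new_points)
```

A discriminative model has no counterpart to the `sample` step: it only maps an input to a label.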
2. Which of the following is the correct way to describe a discriminative model?
easy
A. It models the conditional probability of outputs given inputs.
B. It ignores labels and focuses on data generation.
C. It generates new data points similar to training data.
D. It models the joint probability of inputs and outputs.

Solution

  1. Step 1: Define discriminative model behavior

    Discriminative models learn the conditional probability P(output|input), focusing on predicting labels from data.
  2. Step 2: Contrast with generative models

    Generative models instead model the joint probability P(input, output), which also lets them generate data; that describes options C and D, not a discriminative model.
  3. Final Answer:

    It models the conditional probability of outputs given inputs. -> Option A
  4. Quick Check:

    Discriminative = P(output|input) [OK]
Hint: Discriminative models predict labels from inputs
Common Mistakes:
  • Mixing joint and conditional probabilities
  • Thinking discriminative models generate data
  • Confusing labels with data points
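The link between the two probabilities can be sketched with a tiny discrete example. A generative model holds the joint P(x, y), from which the conditional P(y | x) follows by normalising over the labels; a discriminative model learns only that conditional. The `joint` table below is invented for illustration, with `x` a binary feature and `y` the label.

```python
# Hypothetical joint distribution P(x, y); the numbers are made up.
joint = {
    (1, "cat"): 0.40,
    (1, "dog"): 0.10,
    (0, "cat"): 0.10,
    (0, "dog"): 0.40,
}

def conditional(joint, x):
    """Return P(y | x) by dividing each P(x, y) by the evidence P(x)."""
    evidence = sum(p for (xi, _), p in joint.items() if xi == x)  # P(x)
    return {y: p / evidence for (xi, y), p in joint.items() if xi == x}

posterior = conditional(joint, x=1)
print(posterior)
```

Going the other way is not possible: P(y | x) alone does not determine P(x), so the joint cannot be recovered from a discriminative model.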
3. Consider the following Python code snippet using scikit-learn:
from sklearn.naive_bayes import GaussianNB
from sklearn.linear_model import LogisticRegression

X_train = [[1, 2], [2, 3], [3, 4], [4, 5]]
y_train = [0, 0, 1, 1]

model = GaussianNB()
model.fit(X_train, y_train)
predictions = model.predict([[2, 3]])
print(predictions)

What will be the output of this code?
medium
A. [1]
B. [0]
C. [0 1]
D. Error due to wrong model usage

Solution

  1. Step 1: Identify model type and training data

    GaussianNB is a generative model that fits one Gaussian per feature and class. Training data has two classes: 0 and 1.
  2. Step 2: Predict class for input [2, 3]

    Input [2, 3] lies much closer to the class 0 mean ([1.5, 2.5]) than to the class 1 mean ([3.5, 4.5]). Both classes have the same variance and the same prior, so class 0 has the higher posterior and the prediction is class 0.
  3. Final Answer:

    [0] -> Option B
  4. Quick Check:

    GaussianNB predicts class 0 for [2,3] [OK]
Hint: GaussianNB predicts the class whose learned distribution gives the input the highest posterior probability
Common Mistakes:
  • Assuming LogisticRegression is used instead
  • Expecting multiple classes in output
  • Thinking prediction causes error
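The prediction can be checked by hand. The sketch below recomputes the Gaussian naive Bayes decision with the standard library only, assuming per-class population variances and priors equal to the class frequencies. It leaves out scikit-learn's small `var_smoothing` adjustment, so the scores are approximations of what the library computes; the helper names `class_stats` and `log_posterior` are hypothetical.

```python
import math

X_train = [[1, 2], [2, 3], [3, 4], [4, 5]]
y_train = [0, 0, 1, 1]
query = [2, 3]

def class_stats(X, y, label):
    """Per-feature mean and population variance for one class, plus its prior."""
    rows = [x for x, t in zip(X, y) if t == label]
    means = [sum(col) / len(col) for col in zip(*rows)]
    variances = [
        sum((v - m) ** 2 for v in col) / len(col)
        for col, m in zip(zip(*rows), means)
    ]
    prior = len(rows) / len(X)
    return means, variances, prior

def log_posterior(x, means, variances, prior):
    """Unnormalised log P(class | x) under the naive Gaussian assumption."""
    total = math.log(prior)
    for xi, m, v in zip(x, means, variances):
        total += -0.5 * math.log(2 * math.pi * v) - (xi - m) ** 2 / (2 * v)
    return total

# Score each class and pick the one with the highest log posterior
scores = {
    label: log_posterior(query, *class_stats(X_train, y_train, label))
    for label in sorted(set(y_train))
}
predicted = max(scores, key=scores.get)
print(scores)
print(predicted)
```

Class 0 wins because the query sits 0.5 away from its mean in each feature, against 1.5 for class 1, while the variances and priors are identical.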
4. The following code tries to train a discriminative model but has an error:
from sklearn.linear_model import LogisticRegression

X_train = [[1, 2], [2, 3], [3, 4]]
y_train = [0, 1]

model = LogisticRegression()
model.fit(X_train, y_train)

What is the error, and how do you fix it?
medium
A. Mismatch in number of samples and labels; fix by matching lengths.
B. LogisticRegression requires numeric labels; convert labels to numbers.
C. X_train must be a numpy array; convert list to array.
D. Model.fit() missing parameter; add sample weights.

Solution

  1. Step 1: Check training data shapes

    X_train has 3 samples, but y_train has only 2 labels, so fit raises a ValueError about inconsistent numbers of samples.
  2. Step 2: Fix label length

    To fix, ensure y_train has 3 labels matching X_train samples, e.g., y_train = [0, 1, 0].
  3. Final Answer:

    Mismatch in number of samples and labels; fix by matching lengths. -> Option A
  4. Quick Check:

    Samples and labels count must match [OK]
Hint: Check that data and label counts match before training
Common Mistakes:
  • Ignoring label count mismatch
  • Assuming LogisticRegression needs label conversion
  • Thinking data type causes error
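A simple guard before training catches this kind of mistake early. The sketch below is a hypothetical helper, `check_lengths`, written with the standard library only; it is not a scikit-learn function, and scikit-learn performs its own equivalent check inside `fit`.

```python
def check_lengths(X, y):
    """Raise ValueError if the number of samples and labels differ."""
    if len(X) != len(y):
        raise ValueError(f"X has {len(X)} samples but y has {len(y)} labels")

X_train = [[1, 2], [2, 3], [3, 4]]

try:
    check_lengths(X_train, [0, 1])   # the broken labels from the question
    mismatch_detected = False
except ValueError as err:
    mismatch_detected = True
    print(err)

check_lengths(X_train, [0, 1, 0])    # the corrected labels pass silently
```

With three labels the check passes and the data can be handed to `model.fit`.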
5. You want to build a model that can both classify images of cats and dogs and also generate new realistic images of cats. Which approach should you choose?
hard
A. Use a clustering algorithm to separate and generate images.
B. Use a generative model like a Generative Adversarial Network (GAN) for both tasks.
C. Use a discriminative model like Logistic Regression for both tasks.
D. Use a discriminative model for classification and a generative model for image creation.

Solution

  1. Step 1: Identify tasks and suitable models

    Classification is usually handled well by discriminative models, which learn to separate classes directly. Image generation requires a generative model that learns the data distribution.
  2. Step 2: Combine models for both tasks

    Use a discriminative model for classifying cats vs dogs, and a generative model like GAN to create new cat images.
  3. Final Answer:

    Use a discriminative model for classification and a generative model for image creation. -> Option D
  4. Quick Check:

    Classification + generation = discriminative + generative [OK]
Hint: Classify with discriminative models, generate with generative models
Common Mistakes:
  • Using one model type for both tasks
  • Confusing clustering with generation
  • Ignoring model strengths for each task