Translation in Prompt Engineering / GenAI - ML Experiment: Train & Evaluate

Experiment - Translation
Problem: Build a machine translation model that translates English sentences into French.
Current Metrics: Training accuracy: 98%, Validation accuracy: 65%, Training loss: 0.05, Validation loss: 1.2
Issue: The model is overfitting: training accuracy is very high but validation accuracy is low, indicating poor generalization.
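The diagnosis above can be checked numerically. The following is a minimal sketch that uses the metrics reported for this experiment; the `max_gap` threshold of 0.10 is a hypothetical cutoff chosen for illustration, not a standard value.

```python
# Metrics reported for the baseline run in this experiment.
train_acc, val_acc = 0.98, 0.65
train_loss, val_loss = 0.05, 1.2


def generalization_gap(train_acc: float, val_acc: float) -> float:
    """Return how far validation accuracy trails training accuracy."""
    return train_acc - val_acc


def looks_overfit(train_acc, val_acc, train_loss, val_loss, max_gap=0.10):
    """Heuristic: a large accuracy gap plus validation loss above training loss.

    max_gap is a hypothetical threshold used only for this illustration.
    """
    gap = generalization_gap(train_acc, val_acc)
    return gap > max_gap and val_loss > train_loss


gap = generalization_gap(train_acc, val_acc)
print(f"accuracy gap: {gap:.2f}")
print("overfitting suspected:", looks_overfit(train_acc, val_acc, train_loss, val_loss))
```

A gap of 33 percentage points together with a validation loss far above the training loss is the pattern the task asks you to reduce.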
Your Task
Reduce overfitting so that validation accuracy improves to at least 80% while keeping training accuracy below 90%.
You can only modify the model architecture and training hyperparameters.
Do not change the dataset or preprocessing steps.
Solution
import tensorflow as tf
from tensorflow.keras.layers import Input, LSTM, Dense
from tensorflow.keras.models import Model
from tensorflow.keras.callbacks import EarlyStopping

# Sample data loading and preprocessing assumed done
# Define model with dropout and reduced complexity
input_dim = 10000  # vocabulary size
output_dim = 10000
embedding_dim = 256
latent_dim = 128  # reduced from 256

# Encoder
encoder_inputs = Input(shape=(None,))
encoder_embedding = tf.keras.layers.Embedding(input_dim, embedding_dim)(encoder_inputs)
encoder_lstm = LSTM(latent_dim, return_state=True, dropout=0.3)
encoder_outputs, state_h, state_c = encoder_lstm(encoder_embedding)

# Decoder
decoder_inputs = Input(shape=(None,))
decoder_embedding = tf.keras.layers.Embedding(output_dim, embedding_dim)(decoder_inputs)
decoder_lstm = LSTM(latent_dim, return_sequences=True, return_state=True, dropout=0.3)
decoder_outputs, _, _ = decoder_lstm(decoder_embedding, initial_state=[state_h, state_c])
decoder_dense = Dense(output_dim, activation='softmax')
decoder_outputs = decoder_dense(decoder_outputs)

model = Model([encoder_inputs, decoder_inputs], decoder_outputs)

model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=0.001),
    loss='sparse_categorical_crossentropy',
    metrics=['accuracy'],
)

# Early stopping callback
early_stop = EarlyStopping(monitor='val_loss', patience=3, restore_best_weights=True)

# Assume X_train_encoder, X_train_decoder, y_train, X_val_encoder, X_val_decoder, y_val are prepared
# model.fit([X_train_encoder, X_train_decoder], y_train, epochs=30, batch_size=64, validation_data=([X_val_encoder, X_val_decoder], y_val), callbacks=[early_stop])
Reduced LSTM units from 256 to 128 to lower model complexity.
Added dropout of 0.3 in LSTM layers to reduce overfitting.
Added early stopping to stop training when validation loss stops improving.
Set the Adam learning rate to 0.001 explicitly. This is the Keras default, so it counts as a reduction only if the baseline used a higher rate.
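The early stopping rule in the list above can be traced by hand. This is a simplified sketch of patience-based stopping in plain Python, not the Keras implementation; the validation-loss curve is hypothetical and was not measured in this experiment.

```python
def early_stopping_epoch(val_losses, patience=3):
    """Return (stop_epoch, best_epoch) as 0-based epoch indices.

    Training stops once validation loss has failed to improve for
    `patience` consecutive epochs. best_epoch marks the epoch whose
    weights would be restored.
    """
    best_loss = float("inf")
    best_epoch = -1
    wait = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            # New best: remember it and reset the patience counter.
            best_loss, best_epoch, wait = loss, epoch, 0
        else:
            wait += 1
            if wait >= patience:
                return epoch, best_epoch
    # Ran out of epochs without triggering the stop condition.
    return len(val_losses) - 1, best_epoch


# Hypothetical validation-loss curve for illustration only.
val_losses = [1.20, 0.80, 0.55, 0.45, 0.47, 0.50, 0.52, 0.60]
stop_epoch, best_epoch = early_stopping_epoch(val_losses, patience=3)
print(f"stopped at epoch {stop_epoch}, restoring weights from epoch {best_epoch}")
```

With this curve the loss bottoms out at epoch 3, three epochs pass without improvement, and training stops at epoch 6, which is why restoring the best weights matters.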
Results Interpretation

Before: Training accuracy 98%, Validation accuracy 65%, Training loss 0.05, Validation loss 1.2

After: Training accuracy 88%, Validation accuracy 82%, Training loss 0.25, Validation loss 0.45

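Dropout, early stopping, a smaller model, and a learning rate of 0.001 together reduce overfitting: the gap between training and validation accuracy shrinks from 33 points (98% vs. 65%) to 6 points (88% vs. 82%), and validation loss falls from 1.2 to 0.45. Both targets are met, since validation accuracy is above 80% and training accuracy is below 90%.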
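To make the dropout part of this result concrete, here is a small sketch of inverted dropout in plain Python. It illustrates the general technique only. In Keras, the `dropout` argument of `LSTM` applies this kind of masking to the layer's inputs, and the function name and the list-based representation here are assumptions made for the example.

```python
import random


def inverted_dropout(values, rate=0.3, training=True, rng=None):
    """Zero each value with probability `rate` and rescale the survivors.

    Survivors are divided by (1 - rate) so the expected activation is
    unchanged. With training=False the input passes through untouched.
    """
    if not training or rate == 0.0:
        return list(values)
    rng = rng or random.Random()
    keep = 1.0 - rate
    return [v / keep if rng.random() < keep else 0.0 for v in values]


activations = [1.0] * 10
rng = random.Random(0)  # fixed seed so the sketch is repeatable
dropped = inverted_dropout(activations, rate=0.3, rng=rng)
print(dropped)
print(inverted_dropout(activations, training=False))
```

Because a different random subset of units is silenced at every step, the network cannot rely on any single unit to memorize training examples.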
Bonus Experiment
Try using a transformer-based model for translation instead of LSTM to see if it improves accuracy further.
💡 Hint
Build a Transformer from Keras attention layers such as tf.keras.layers.MultiHeadAttention, or fine-tune one of Hugging Face's pretrained translation models on your dataset.
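As a starting point for the pretrained route, the sketch below wraps a Hugging Face translation pipeline in a helper function. It covers inference only, not fine-tuning, and assumes the `transformers` package and a backend such as PyTorch are installed. The function name `translate_en_to_fr` is hypothetical.

```python
def translate_en_to_fr(sentences):
    """Translate a list of English sentences with a pretrained pipeline.

    The import lives inside the function so this file can be loaded
    even when the `transformers` package is not installed.
    """
    from transformers import pipeline  # requires `pip install transformers`

    # With no model argument, the library selects a default checkpoint
    # for this task and downloads it on first use.
    translator = pipeline("translation_en_to_fr")
    results = translator(sentences)  # one dict per input sentence
    return [item["translation_text"] for item in results]


# Example call (downloads a model the first time it runs):
# print(translate_en_to_fr(["The weather is nice today."]))
```

Comparing this baseline against the LSTM on the same validation sentences shows how much a pretrained model helps before any fine-tuning.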

Practice

1. What is the main purpose of a translation model in AI?
easy
A. To change text from one language to another automatically
B. To generate images from text descriptions
C. To recognize faces in photos
D. To sort numbers in a list

Solution

  1. Step 1: Understand the function of translation models

    Translation models convert text from one language to another automatically.
  2. Step 2: Compare with other AI tasks

    Other options describe different AI tasks like image generation or face recognition, not translation.
  3. Final Answer:

    To change text from one language to another automatically -> Option A
  4. Quick Check:

    Translation = language conversion [OK]
Hint: Translation means changing languages automatically [OK]
Common Mistakes:
  • Confusing translation with image generation
  • Thinking translation sorts data
  • Mixing translation with face recognition
2. Which of the following correctly loads a pre-trained translation model in Python using Hugging Face Transformers?
easy
A. model = pipeline('image-classification')
B. model = pipeline('speech-recognition')
C. model = pipeline('text-generation')
D. model = pipeline('translation_en_to_fr')

Solution

  1. Step 1: Identify the pipeline for translation

    The correct pipeline for English to French translation is 'translation_en_to_fr'.
  2. Step 2: Check other pipeline types

    Other options are for different tasks like image classification, text generation, or speech recognition, not translation.
  3. Final Answer:

    model = pipeline('translation_en_to_fr') -> Option D
  4. Quick Check:

    Translation pipeline = 'translation_en_to_fr' [OK]
Hint: Use 'translation_en_to_fr' for English to French translation [OK]
Common Mistakes:
  • Using wrong pipeline name
  • Confusing translation with image tasks
  • Calling text generation instead of translation
3. Given the following Python code using a translation model, which output is most likely? (The exact wording depends on the model checkpoint.)
from transformers import pipeline
translator = pipeline('translation_en_to_de')
result = translator('Hello, how are you?')
print(result[0]['translation_text'])
medium
A. Ciao, come stai?
B. Bonjour, comment ça va?
C. Hallo, wie geht es dir?
D. Hola, ¿cómo estás?

Solution

  1. Step 1: Identify the translation direction

    The pipeline is 'translation_en_to_de', which means English to German translation.
  2. Step 2: Translate the input text

    'Hello, how are you?' translates to 'Hallo, wie geht es dir?' in German.
  3. Final Answer:

    Hallo, wie geht es dir? -> Option C
  4. Quick Check:

    English to German translation = Hallo, wie geht es dir? [OK]
Hint: Check language codes: en_to_de means English to German [OK]
Common Mistakes:
  • Choosing French or Spanish output
  • Ignoring language direction
  • Assuming output is same as input
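The language-direction check from this question can be written as a small helper. This is an illustrative sketch: `parse_translation_task` and the `LANGUAGE_NAMES` table are hypothetical names, not part of the Transformers library.

```python
def parse_translation_task(task: str):
    """Split a 'translation_xx_to_yy' task name into (source, target) codes."""
    parts = task.split("_")
    if len(parts) != 4 or parts[0] != "translation" or parts[2] != "to":
        raise ValueError(f"not a translation_xx_to_yy task name: {task!r}")
    return parts[1], parts[3]


# Hypothetical lookup table for readable language names.
LANGUAGE_NAMES = {"en": "English", "de": "German", "fr": "French", "es": "Spanish"}

src, tgt = parse_translation_task("translation_en_to_de")
print(f"{LANGUAGE_NAMES[src]} -> {LANGUAGE_NAMES[tgt]}")
```

Reading the two codes in order (source first, target second) rules out the French, Spanish, and Italian options immediately.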
4. You wrote this code to translate English to Spanish but get an error:
from transformers import pipeline
translator = pipeline('translation_en_to_es')
result = translator('Good morning')
print(result['translation_text'])
What is the error, and how do you fix it?
medium
A. Accessing result as dict instead of list; use result[0]['translation_text']
B. Wrong pipeline name; should be 'translation_en_to_fr'
C. Missing model download; add download=True parameter
D. print statement syntax error; use print result['translation_text']

Solution

  1. Step 1: Understand the output format of pipeline

    The pipeline returns a list of dicts, so result is a list, not a dict, and indexing it with a string key raises a TypeError.
  2. Step 2: Correct the access to translation text

    Access the first element with result[0], then get 'translation_text' key.
  3. Final Answer:

    Accessing result as dict instead of list; use result[0]['translation_text'] -> Option A
  4. Quick Check:

    Pipeline output is list of dicts [OK]
Hint: Pipeline returns list; access first item before keys [OK]
Common Mistakes:
  • Treating output as dict directly
  • Using wrong pipeline name
  • Incorrect print syntax
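The mistake and its fix can be reproduced without downloading a model. The sketch below uses a hand-written stand-in for the pipeline's return value; the Spanish text is illustrative, not actual model output.

```python
# Stand-in for what a translation pipeline returns for one input string:
# a list holding one dict per input. The translation text is illustrative.
result = [{"translation_text": "Buenos días"}]

try:
    print(result["translation_text"])  # the mistake from the question
except TypeError as exc:
    error_message = str(exc)
    print("TypeError:", error_message)

# Correct access: take the first list element, then read the dict key.
text = result[0]["translation_text"]
print(text)
```

The same pattern applies when several sentences are passed at once: iterate over the list and read `translation_text` from each dict.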
5. You want to build a program that translates a list of English sentences to French and then back to English to check accuracy. Which approach is best?
hard
A. Translate sentences manually without AI models
B. Use two pipelines: 'translation_en_to_fr' then 'translation_fr_to_en' on each sentence
C. Use 'translation_en_to_de' pipeline followed by 'translation_de_to_en'
D. Use only 'translation_en_to_fr' pipeline twice on each sentence

Solution

  1. Step 1: Identify correct translation directions

    To translate English to French and back, use 'translation_en_to_fr' then 'translation_fr_to_en'.
  2. Step 2: Avoid wrong language pairs

    The German pipelines translate into the wrong language, and running 'translation_en_to_fr' twice feeds French text to an English-to-French model instead of translating it back.
  3. Step 3: Manual translation is inefficient and error-prone

    Pipelines automate the round trip, so every sentence is checked the same way.
  4. Final Answer:

    Use two pipelines: 'translation_en_to_fr' then 'translation_fr_to_en' on each sentence -> Option B
  5. Quick Check:

    Back translation needs correct language pairs [OK]
Hint: Use matching forward and backward pipelines for accuracy check [OK]
Common Mistakes:
  • Using wrong language pairs
  • Repeating same pipeline twice
  • Ignoring AI automation
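The round-trip check described in this question can be sketched with pluggable translator functions. To keep the example runnable offline, the two translators below are toy dictionary lookups, which are hypothetical stand-ins for real pipelines. The similarity score from `difflib` is one simple choice and only a rough proxy for translation quality.

```python
from difflib import SequenceMatcher


def round_trip(sentences, forward, backward):
    """Translate each sentence forward and back.

    Returns a list of (original, back_translation, similarity) tuples.
    """
    report = []
    for sentence in sentences:
        back = backward(forward(sentence))
        score = SequenceMatcher(None, sentence.lower(), back.lower()).ratio()
        report.append((sentence, back, score))
    return report


# Toy dictionary translators so the sketch runs without any model.
EN_FR = {"good morning": "bonjour", "thank you": "merci"}
FR_EN = {"bonjour": "hello", "merci": "thank you"}


def toy_en_to_fr(text):
    return EN_FR[text.lower()]


def toy_fr_to_en(text):
    return FR_EN[text.lower()]


report = round_trip(["Good morning", "Thank you"], toy_en_to_fr, toy_fr_to_en)
for original, back, score in report:
    print(f"{original!r} -> {back!r} (similarity {score:.2f})")
```

With real pipelines, each translator would wrap a call such as `translator(text)[0]['translation_text']`. Depending on the library version, the reverse direction may need an explicit `model=` argument if no default checkpoint exists for that language pair.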