Why LLMs understand and generate text in Prompt Engineering / GenAI - Experiment to Prove It

Experiment - Why LLMs understand and generate text
Problem: You want to understand how large language models (LLMs) can read and write text that makes sense.
Current Metrics: The model generates text that is sometimes off-topic or repetitive. It scores 70% on a coherence test and 65% on a relevance test.
Issue: The model sometimes produces text that is not fully coherent or relevant, showing it does not fully 'understand' the text context.
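The page does not say how the coherence and relevance scores are computed. As a rough illustration of how the "repetitive" symptom could be quantified, here is a minimal sketch of a distinct-n ratio. The function name `distinct_n` and the sample strings are assumptions made for this example, and the ratio is only a proxy for repetition, not the coherence test mentioned above.

```python
def distinct_n(text: str, n: int = 2) -> float:
    """Fraction of n-grams in `text` that are unique (1.0 means no repeats)."""
    tokens = text.lower().split()
    ngrams = [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
    if not ngrams:
        # Too short to form any n-gram, so nothing can repeat
        return 1.0
    return len(set(ngrams)) / len(ngrams)

# Hypothetical model outputs, made up for illustration
varied = "the cat sat on the mat near the door"
looping = "the cat sat the cat sat the cat sat"

print(distinct_n(varied))   # higher ratio: fewer repeated bigrams
print(distinct_n(looping))  # lower ratio: the same bigrams keep recurring
```

A lower ratio suggests the output is looping over the same phrases, which is one of the symptoms described in the problem statement.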
Your Task
Improve the model's ability to generate more coherent and relevant text, aiming for at least 85% coherence and 80% relevance scores.
Do not change the model architecture drastically.
Only adjust training data preprocessing and training parameters.
Keep training time under 4 hours on a standard GPU.
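Since the task allows changes to training data preprocessing, one plausible starting point is basic text cleanup before tokenization. The sketch below is an assumption about what such preprocessing might look like; the helper name `clean_corpus` and the `min_words` threshold are hypothetical and not part of the original solution.

```python
import re

def clean_corpus(texts, min_words=3):
    """Normalize whitespace, drop very short lines, and remove
    duplicates (compared case-insensitively)."""
    seen = set()
    cleaned = []
    for text in texts:
        # Collapse runs of spaces, tabs, and newlines into single spaces
        normalized = re.sub(r"\s+", " ", text).strip()
        if len(normalized.split()) < min_words:
            continue
        key = normalized.lower()
        if key in seen:
            continue
        seen.add(key)
        cleaned.append(normalized)
    return cleaned

raw = [
    "The  weather is\nnice today.",
    "the weather is nice today.",
    "Hi",
    "I love reading books.",
]
print(clean_corpus(raw))
```

Duplicated or fragmentary lines tend to encourage repetitive output, so removing them is a low-cost change that leaves the model architecture untouched.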
Solution
import torch
from transformers import GPT2Tokenizer, GPT2LMHeadModel, Trainer, TrainingArguments

# Load tokenizer and model
model_name = 'gpt2'
tokenizer = GPT2Tokenizer.from_pretrained(model_name)
model = GPT2LMHeadModel.from_pretrained(model_name)

# GPT-2 has no padding token by default, so reuse the end-of-text token
tokenizer.pad_token = tokenizer.eos_token
model.config.pad_token_id = tokenizer.eos_token_id

# Prepare dataset (example with dummy data for illustration)
texts = ["Hello, how are you?", "The weather is nice today.", "I love reading books."]
inputs = tokenizer(texts, return_tensors='pt', padding=True, truncation=True, max_length=50)

# Define training arguments
# (no evaluation settings: this example has no evaluation dataset)
training_args = TrainingArguments(
    output_dir='./results',
    num_train_epochs=3,
    per_device_train_batch_size=4,
    learning_rate=5e-5,
    weight_decay=0.01,
    logging_dir='./logs',
    logging_steps=10,
    save_steps=10
)

# Dummy dataset class
class TextDataset(torch.utils.data.Dataset):
    def __init__(self, encodings):
        self.encodings = encodings
    def __len__(self):
        return len(self.encodings['input_ids'])
    def __getitem__(self, idx):
        item = {key: val[idx] for key, val in self.encodings.items()}
        # For causal language modeling the labels are the input ids;
        # -100 excludes padding positions from the loss
        labels = item['input_ids'].clone()
        labels[item['attention_mask'] == 0] = -100
        item['labels'] = labels
        return item

train_dataset = TextDataset(inputs)

# Initialize Trainer
trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=train_dataset
)

# Train model
trainer.train()

# After training, generate text (inputs must be on the same device as the model)
input_text = "Today is a beautiful"
encoded = tokenizer(input_text, return_tensors='pt').to(model.device)
outputs = model.generate(
    **encoded,
    max_length=20,
    num_beams=5,
    no_repeat_ngram_size=2,
    pad_token_id=tokenizer.eos_token_id
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
Increased training epochs to 3 for better learning.
Set learning rate to 5e-5 for stable training.
Used beam search with no_repeat_ngram_size to improve text generation quality.
Added weight decay to reduce overfitting.
Used padding and truncation to handle variable text lengths.
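To make the n-gram blocking idea more concrete, here is a minimal pure-Python sketch of the rule behind a setting like `no_repeat_ngram_size`: before choosing the next token, ban any token that would recreate an n-gram already present in the output. This illustrates the concept only and is not the library's implementation; the function name `banned_next_tokens` is hypothetical.

```python
def banned_next_tokens(generated, n=2):
    """Return the tokens that would complete an n-gram already in `generated`."""
    if n < 1 or len(generated) < n:
        # No complete n-gram exists yet, so nothing can be repeated
        return set()
    # The last n-1 tokens are the prefix the next token would extend
    prefix = tuple(generated[len(generated) - n + 1:])
    banned = set()
    for i in range(len(generated) - n + 1):
        ngram = tuple(generated[i:i + n])
        if ngram[:-1] == prefix:
            banned.add(ngram[-1])
    return banned

tokens = ["today", "is", "a", "beautiful", "day", "today", "is"]
# The bigram ("is", "a") already occurred, so "a" is blocked after "is"
print(banned_next_tokens(tokens, n=2))
```

With `n=2`, no bigram can appear twice in the output, which is why this kind of setting reduces the looping text described in the problem statement.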
Results Interpretation

Before: Coherence 70%, Relevance 65%
After: Coherence 87%, Relevance 82%

Adjusting the training parameters helps the model learn context from the data, while decoding techniques like beam search with n-gram blocking reduce repetition at generation time. Together they produce more coherent and relevant text, which is what improved 'understanding' means in practice.
Bonus Experiment
Try fine-tuning the model with a larger and more diverse dataset including dialogues and stories.
💡 Hint
More varied data helps the model learn richer language patterns and context, improving generation quality further.

Practice

1. Why do Large Language Models (LLMs) understand and generate text?
easy
A. Because they memorize every sentence they read
B. Because they use fixed rules written by humans
C. Because they learn patterns from large amounts of text data
D. Because they translate text into images first

Solution

  1. Step 1: Understand how LLMs learn

    LLMs learn by analyzing many examples of text to find patterns and relationships between words.
  2. Step 2: Recognize pattern learning enables text generation

    By learning these patterns, LLMs can predict and generate new text that makes sense.
  3. Final Answer:

    Because they learn patterns from large amounts of text data -> Option C
  4. Quick Check:

    Pattern learning = C [OK]
Hint: LLMs predict text based on learned patterns [OK]
Common Mistakes:
  • Thinking LLMs memorize all text exactly
  • Believing LLMs use fixed human rules
  • Assuming LLMs convert text to images first
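As a toy illustration of "learning patterns from text", the sketch below counts which word follows which in a tiny corpus and uses those counts to guess a likely next word. Real LLMs use neural networks trained on far more data; this bigram counter is a deliberately simplified stand-in, and the names `train_bigrams` and `most_likely_next` are hypothetical.

```python
from collections import Counter, defaultdict

def train_bigrams(sentences):
    """Count how often each word is followed by each other word."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.lower().split()
        for prev, nxt in zip(words, words[1:]):
            counts[prev][nxt] += 1
    return counts

def most_likely_next(counts, word):
    """Return the most frequent follower of `word`, or None if unseen."""
    followers = counts.get(word.lower())
    if not followers:
        return None
    return followers.most_common(1)[0][0]

corpus = ["i love cats", "i love books", "i love cats and dogs"]
model = train_bigrams(corpus)
print(most_likely_next(model, "love"))
```

The counter does not store whole sentences for replay; it keeps statistics about which words tend to follow which, a very reduced version of the pattern learning that option C describes.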
2. Which of the following is the correct way to describe how LLMs generate text?
easy
A. They randomly pick words without context
B. They predict the next word based on previous words
C. They translate text into numbers and back without patterns
D. They only repeat the first sentence they learned

Solution

  1. Step 1: Identify the text generation method

    LLMs generate text by predicting the next word using the context of previous words.
  2. Step 2: Eliminate incorrect options

    Random picking ignores context, translating without patterns is wrong, and repeating only the first sentence is false.
  3. Final Answer:

    They predict the next word based on previous words -> Option B
  4. Quick Check:

    Next word prediction = B [OK]
Hint: LLMs guess next words using context [OK]
Common Mistakes:
  • Thinking words are chosen randomly
  • Believing LLMs do not use context
  • Assuming LLMs only repeat learned sentences
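Next-word prediction can be sketched as turning per-word scores into probabilities and picking the most likely candidate. The scores below are invented for illustration; in a real model they would come from the network given the previous words. The helper names `softmax` and `pick_next_word` are assumptions for this example.

```python
import math

def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(scores.values())  # subtract the max for numerical stability
    exps = {word: math.exp(score - peak) for word, score in scores.items()}
    total = sum(exps.values())
    return {word: value / total for word, value in exps.items()}

def pick_next_word(scores):
    """Greedy choice: return the word with the highest probability."""
    probs = softmax(scores)
    return max(probs, key=probs.get)

# Hypothetical scores a model might assign after the context "I love"
scores_after_i_love = {"cats": 2.0, "the": 1.0, "banana": -1.0}
probs = softmax(scores_after_i_love)
print(pick_next_word(scores_after_i_love))
```

Because the scores depend on the context, a different set of previous words would give a different ranking, which is why random word picking (option A) does not describe how generation works.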
3. Consider this simplified code snippet simulating LLM text generation:
context = ['I', 'love']
next_word = 'cats'
output = ' '.join(context + [next_word])
print(output)
What will be printed?
medium
A. I love cats
B. cats I love
C. I love
D. love cats

Solution

  1. Step 1: Understand the code concatenation

    The code joins the list ['I', 'love'] with ['cats'] to form ['I', 'love', 'cats'].
  2. Step 2: Join list elements into a string

    Using ' '.join(...) creates the string 'I love cats'.
  3. Final Answer:

    I love cats -> Option A
  4. Quick Check:

    Joining words = A [OK]
Hint: Join words in order to form sentence [OK]
Common Mistakes:
  • Mixing word order in output
  • Forgetting to join all words
  • Printing only part of the list
4. This code tries to generate text but has an error:
context = ['Hello', 'world']
next_word = 123
output = ' '.join(context + [next_word])
print(output)
What is the error, and how do you fix it?
medium
A. TypeError because next_word is int; fix by converting to string
B. SyntaxError because of missing colon; fix by adding colon
C. IndexError because list is empty; fix by adding words
D. No error; code runs fine

Solution

  1. Step 1: Identify the error type

    Joining strings with an integer causes a TypeError because join expects strings.
  2. Step 2: Fix the error by converting integer to string

    Convert next_word to string using str(next_word) before joining.
  3. Final Answer:

    TypeError because next_word is int; fix by converting to string -> Option A
  4. Quick Check:

    TypeError fix = A [OK]
Hint: Join needs all strings; convert numbers to string first [OK]
Common Mistakes:
  • Thinking it's a syntax error
  • Ignoring type mismatch in join
  • Assuming code runs without error
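The fix described in the solution can be written as follows. This is one possible version that converts every item to a string before joining, so it also covers any other non-string values in the list.

```python
context = ["Hello", "world"]
next_word = 123

# str.join accepts only strings, so convert each item first
output = " ".join(str(token) for token in context + [next_word])
print(output)
```

Calling `str(next_word)` on just the integer works equally well; converting every item is simply a more defensive variant.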
5. You want an LLM to summarize a long article. Which approach helps the model understand and generate a good summary?
hard
A. Feed unrelated text and ask for a summary
B. Feed only the first sentence and ask for a summary
C. Feed random sentences from the article without order
D. Feed the entire article as input and ask for a summary

Solution

  1. Step 1: Understand input relevance for summarization

    Providing the full article gives the LLM enough context to understand main points.
  2. Step 2: Recognize why other options fail

    Using only the first sentence, random sentences, or unrelated text lacks context, leading to poor summaries.
  3. Final Answer:

    Feed the entire article as input and ask for a summary -> Option D
  4. Quick Check:

    Full context input = D [OK]
Hint: More context means better summaries [OK]
Common Mistakes:
  • Using partial or random text as input
  • Ignoring importance of full context
  • Expecting summary from unrelated text
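A minimal sketch of option D in code is a prompt that places the whole article in front of the model together with a clear instruction. The function name `build_summary_prompt`, the prompt wording, and the `max_sentences` parameter are all assumptions for illustration; the resulting string would be sent to whichever LLM you are using.

```python
def build_summary_prompt(article: str, max_sentences: int = 3) -> str:
    """Build a summarization prompt that includes the full article text."""
    body = article.strip()
    if not body:
        raise ValueError("article must not be empty")
    return (
        f"Summarize the following article in at most {max_sentences} sentences.\n\n"
        f"Article:\n{body}\n\n"
        "Summary:"
    )

# Hypothetical article text, made up for illustration
article = "Solar capacity grew last year. Costs fell. Storage remains a challenge."
prompt = build_summary_prompt(article)
print(prompt)
```

Keeping the article intact and in its original order gives the model the context that the other options leave out.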