Text generation models create new sentences instead of copying existing ones. Why is this possible?
Think about how the model understands language structure rather than memorizing exact sentences.
Text generation models learn the rules and patterns of language from data. They use this knowledge to predict the next word in a sequence, which lets them create new, unique sentences rather than just repeating what they saw.
Given a trained language model that predicts the next word, what is the output of this code snippet?
import random

vocab = ['hello', 'world', 'machine', 'learning']
# Simulate next-word prediction probabilities
probs = [0.1, 0.7, 0.1, 0.1]
next_word = random.choices(vocab, weights=probs, k=1)[0]
print(next_word)
random.choices picks based on weights but can pick any word.
The code uses random.choices with weights to pick the next word. "world" has the highest probability (0.7), so it is the most likely output, but any of the other words can still be selected, so the output is not guaranteed to be "world".
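To see the weighted sampling in action, a quick sketch (the repetition count and seed are chosen here purely for illustration) draws many samples and tallies them:

```python
import random
from collections import Counter

random.seed(0)  # fixed seed for reproducibility
vocab = ['hello', 'world', 'machine', 'learning']
probs = [0.1, 0.7, 0.1, 0.1]

# Draw 1000 samples so the weights become visible in the counts
samples = random.choices(vocab, weights=probs, k=1000)
counts = Counter(samples)
print(counts)
```

With weights of 0.7 versus 0.1, roughly 70% of the draws come back as "world", while the other words still appear occasionally, which is exactly why a single run of the original snippet can print any of the four words.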
You want to build a system that writes creative stories with varied vocabulary and style. Which model type is best?
Consider which model can capture complex language patterns and creativity.
Transformer models learn deep language patterns and long-range context, enabling creative and varied text generation. Markov chains only condition on the last word or two, and rule-based systems are limited to hand-written templates, so both are far less flexible.
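For contrast, a minimal first-order Markov chain (the tiny corpus here is invented for illustration) shows why such models struggle with coherence: each word is chosen only from the words that followed the previous word in the training text, with no longer-range context.

```python
import random
from collections import defaultdict

random.seed(0)  # fixed seed for reproducibility
# Toy training corpus (hypothetical)
corpus = "the cat sat on the mat and the cat ran".split()

# Record, for each word, the words that ever followed it
transitions = defaultdict(list)
for prev, nxt in zip(corpus, corpus[1:]):
    transitions[prev].append(nxt)

# Generate by repeatedly sampling a recorded successor;
# fall back to a random corpus word if none exists
word = 'the'
generated = [word]
for _ in range(5):
    word = random.choice(transitions.get(word, corpus))
    generated.append(word)
print(' '.join(generated))
```

The output is locally plausible (each adjacent pair occurred in the corpus) but has no global plan, which is the limitation transformers address with attention over the full context.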
Which metric best measures how well a text generation model produces fluent and coherent sentences?
Think about metrics that compare generated text to human-written examples.
BLEU score measures n-gram overlap between generated text and human-written reference texts, making it a common proxy for fluency and coherence. The other options do not evaluate text quality directly.
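BLEU's core ingredient is clipped n-gram precision against a reference. A minimal unigram-only sketch (a deliberate simplification: full BLEU also combines higher-order n-grams and applies a brevity penalty) looks like this:

```python
from collections import Counter

def unigram_precision(candidate, reference):
    # Count candidate words that appear in the reference,
    # clipped by how often each word occurs in the reference,
    # then divide by the candidate length.
    cand, ref = candidate.split(), reference.split()
    ref_counts = Counter(ref)
    overlap = sum(min(c, ref_counts[w]) for w, c in Counter(cand).items())
    return overlap / len(cand)

score = unigram_precision("the cat sat on the mat",
                          "the cat is on the mat")
print(score)  # 5 of 6 candidate words match -> ~0.833
```

A higher score means more word overlap with the reference; full BLEU extends the same idea to bigrams, trigrams, and 4-grams.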
Consider this code snippet generating text word by word. It produces repetitive phrases like "the the the...". What is the most likely cause?
import random

vocab = ['the', 'cat', 'sat', 'on', 'mat']
probs = [0.9, 0.025, 0.025, 0.025, 0.025]
output = []
for _ in range(5):
    next_word = random.choices(vocab, weights=probs, k=1)[0]
    output.append(next_word)
print(' '.join(output))
Look at the probabilities assigned to each word.
The high probability (0.9) assigned to "the" makes it very likely to be chosen on every step, producing repetitive output like "the the the...". Balancing the probabilities, or flattening them with a sampling temperature, helps generate varied text.
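One common remedy is temperature scaling, which flattens a peaked distribution before sampling. A small sketch (the temperature value 2.0 is chosen here for illustration; this is the probability-space equivalent of dividing logits by the temperature):

```python
# Original, heavily peaked distribution from the snippet above
probs = [0.9, 0.025, 0.025, 0.025, 0.025]

# Raise each probability to the power 1/T (T > 1 flattens),
# then renormalize so the values sum to 1 again
T = 2.0
scaled = [p ** (1 / T) for p in probs]
total = sum(scaled)
softened = [p / total for p in scaled]
print(softened)
```

After scaling, "the" is still the most likely word but no longer dominates, so repeated sampling produces far less repetitive output.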