
Why text generation creates content in NLP - Experiment to Prove It

Experiment - Why text generation creates content
Problem: You want to understand how a text generation model creates new content based on input prompts.
Current Metrics: The model generates text that is often repetitive and sometimes irrelevant to the prompt.
Issue: The model tends to repeat phrases and lacks diversity in generated content, showing limited creativity.
Your Task
Improve the text generation model so it produces more diverse and relevant content without losing coherence.
Keep the model architecture the same (a simple LSTM-based text generator).
Only adjust training parameters and sampling methods during generation.
Solution
import numpy as np
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense, Embedding

# Sample text data
text = "hello world hello machine learning world hello ai world"

# Create character mapping
chars = sorted(list(set(text)))
char_to_idx = {c:i for i,c in enumerate(chars)}
idx_to_char = {i:c for i,c in enumerate(chars)}

# Prepare sequences
seq_length = 5
sequences = []
next_chars = []
for i in range(len(text) - seq_length):
    sequences.append(text[i:i+seq_length])
    next_chars.append(text[i+seq_length])

X = np.zeros((len(sequences), seq_length), dtype=int)
y = np.zeros((len(sequences), len(chars)), dtype=int)
for i, seq in enumerate(sequences):
    for t, char in enumerate(seq):
        X[i, t] = char_to_idx[char]
    y[i, char_to_idx[next_chars[i]]] = 1

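# Build model: embedding -> LSTM -> softmax over the character vocabulary
model = Sequential([
    Embedding(len(chars), 10),  # input_length is deprecated in recent Keras releases and is not needed here
    LSTM(50),
    Dense(len(chars), activation='softmax')
])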
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])

# Train model
model.fit(X, y, epochs=50, batch_size=8, verbose=0)

# Sample the next character index using temperature scaling and optional top-k filtering
def sample(preds, temperature=1.0, top_k=None):
    preds = np.asarray(preds).astype('float64')
    # Temperature below 1 sharpens the distribution; above 1 flattens it
    preds = np.log(preds + 1e-8) / temperature
    exp_preds = np.exp(preds)
    preds = exp_preds / np.sum(exp_preds)
    if top_k is not None:
        # Zero out everything below the k-th largest probability, then renormalize
        indices_to_remove = preds < np.sort(preds)[-top_k]
        preds[indices_to_remove] = 0
        preds = preds / np.sum(preds)
    # Draw one index according to the adjusted probabilities
    probas = np.random.multinomial(1, preds, 1)
    return np.argmax(probas)

# Generate text
seed_text = "hello"
generated = seed_text
for _ in range(50):
    input_seq = [char_to_idx[c] for c in generated[-seq_length:]]
    input_seq = np.array(input_seq).reshape(1, seq_length)
    preds = model.predict(input_seq, verbose=0)[0]
    next_index = sample(preds, temperature=0.8, top_k=3)
    next_char = idx_to_char[next_index]
    generated += next_char

print("Generated text:", generated)
Added a temperature parameter (0.8 in the generation loop) to control how random the sampling is.
Implemented top-k sampling (k = 3) so that only the most probable characters can be chosen.
Trained the model for 50 epochs with a batch size of 8.
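To make the effect of temperature concrete, the sketch below applies the same log-then-divide rescaling used in the solution's sample function to a made-up three-way distribution. The probabilities and the helper name apply_temperature are illustrative assumptions, not values taken from the trained model.

```python
import math

def apply_temperature(probs, temperature):
    """Rescale a probability distribution by dividing log-probabilities by temperature."""
    logits = [math.log(p + 1e-8) / temperature for p in probs]
    peak = max(logits)  # subtract the max before exponentiating for numerical stability
    exps = [math.exp(value - peak) for value in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical next-character probabilities (not from the trained model).
probs = [0.7, 0.2, 0.1]

for t in (0.5, 1.0, 1.5):
    scaled = apply_temperature(probs, t)
    print(t, [round(p, 3) for p in scaled])
```

A temperature below 1 moves more probability onto the top candidate, while a temperature above 1 spreads it across the alternatives.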
Results Interpretation

Before: The model generated repetitive and less relevant text.

After: With temperature and top-k sampling, the text is more varied and coherent.

Adjusting sampling settings such as temperature and top-k changes how the next character is picked from the model's predicted distribution, so output diversity can be tuned without changing the model architecture.
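The before/after comparison here is qualitative. One way to put a number on repetition is a distinct-n ratio: the share of unique character n-grams among all n-grams in the output. The sketch below is an assumption about how this could be measured, not part of the original experiment; the function name distinct_n and the two sample strings are made up for illustration.

```python
def distinct_n(text, n=3):
    """Return unique n-grams divided by total n-grams (0.0 if the text is too short)."""
    ngrams = [text[i:i + n] for i in range(len(text) - n + 1)]
    if not ngrams:
        return 0.0
    return len(set(ngrams)) / len(ngrams)

# Hypothetical outputs standing in for "before" and "after" generations.
repetitive = "hello hello hello hello"
varied = "hello world machine ai"

print(distinct_n(repetitive))
print(distinct_n(varied))
```

A ratio near 1 means almost every n-gram is new, while a low ratio signals the looping behaviour described in the problem statement. The metric says nothing about coherence, so it would need to be read alongside the generated text itself.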
Bonus Experiment
Try using nucleus (top-p) sampling instead of top-k to see if it improves text diversity further.
💡 Hint
Nucleus sampling selects from the smallest set of candidate tokens (characters, in this experiment) whose cumulative probability reaches a threshold p, balancing diversity and coherence.
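As a starting point for the bonus experiment, here is a minimal sketch of nucleus sampling in plain Python. The function name sample_top_p, the default p = 0.9, and the example distribution are assumptions for illustration; the function accepts any sequence of probabilities, so it should also work on the array returned by model.predict.

```python
import random

def sample_top_p(probs, p=0.9):
    """Sample an index from the smallest set of candidates whose
    cumulative probability reaches p (nucleus sampling)."""
    # Rank candidate indices from most to least probable.
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)

    # Keep adding candidates until their cumulative mass reaches p.
    nucleus, cumulative = [], 0.0
    for i in ranked:
        nucleus.append(i)
        cumulative += probs[i]
        if cumulative >= p:
            break

    # Sample inside the nucleus; random.choices normalizes the weights itself.
    weights = [probs[i] for i in nucleus]
    return random.choices(nucleus, weights=weights, k=1)[0]

# Hypothetical distribution over four characters (not from the trained model).
probs = [0.5, 0.3, 0.15, 0.05]
picks = {sample_top_p(probs, p=0.75) for _ in range(200)}
print(sorted(picks))
```

Unlike top-k, the number of candidates kept here varies from step to step: a confident prediction leaves only one or two candidates, while a flat distribution leaves many.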

Practice

1. What is the main reason text generation models create new content?
easy
A. They predict the next word based on previous words
B. They copy sentences from a fixed list
C. They randomly select words without context
D. They translate text from one language to another

Solution

  1. Step 1: Understand how text generation works

    Text generation models use previous words to predict the next word, creating new sentences.
  2. Step 2: Compare options with this understanding

    Only "They predict the next word based on previous words" describes this process correctly; the others describe unrelated or incorrect methods.
  3. Final Answer:

    They predict the next word based on previous words -> Option A
  4. Quick Check:

    Next word prediction = A [OK]
Hint: Text generation predicts next words, not copy or random picks [OK]
Common Mistakes:
  • Thinking text is copied from a list
  • Believing words are chosen randomly
  • Confusing generation with translation
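The idea behind Option A can be illustrated without a neural network. The sketch below is a toy bigram model, offered as an illustration rather than the method used in the experiment: it records which word follows which in a tiny made-up corpus, then builds a sentence by repeatedly sampling a next word given the previous one. The corpus and the function name generate are assumptions.

```python
import random
from collections import defaultdict

# Tiny made-up corpus, split into words.
corpus = "the cat sat on the mat and the dog sat on the rug".split()

# Record every word observed immediately after each word.
followers = defaultdict(list)
for prev, nxt in zip(corpus, corpus[1:]):
    followers[prev].append(nxt)

def generate(start, length=6, seed=None):
    """Generate up to `length` words by sampling from observed followers."""
    rng = random.Random(seed)
    words = [start]
    for _ in range(length - 1):
        options = followers.get(words[-1])
        if not options:  # no observed follower, so stop early
            break
        words.append(rng.choice(options))
    return " ".join(words)

print(generate("the", seed=0))
```

Because each step samples from what followed the previous word, the result can be a sentence such as "the dog sat on the mat" that never appears verbatim in the corpus, which is the sense in which prediction produces new content.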
2. Which of the following is the correct way to start generating text using a model like GPT-2?
easy
A. model.train(start_text)
B. model.generate(start_text)
C. model.predict_label(start_text)
D. model.translate(start_text)

Solution

  1. Step 1: Identify the function for text generation

    Text generation uses a method like generate to produce new text from a start.
  2. Step 2: Eliminate unrelated functions

    train is for learning, predict_label is for classification, and translate is for language translation.
  3. Final Answer:

    model.generate(start_text) -> Option B
  4. Quick Check:

    Text generation method = generate [OK]
Hint: Use generate() to create text, not train() or translate() [OK]
Common Mistakes:
  • Confusing training with generating
  • Using classification methods for generation
  • Mixing translation with generation
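3. Given this Python code using a simplified text generation wrapper whose generate method takes and returns plain text (lower-level library APIs often work with token IDs instead):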
start_text = 'Once upon a time'
output = model.generate(start_text, max_length=10)
print(output)

What is the expected output type?
medium
A. A list of numbers representing word indexes
B. An error because max_length is invalid
C. A string containing a sentence starting with 'Once upon a time'
D. A boolean indicating success or failure

Solution

  1. Step 1: Understand the generate function output

    In this simplified interface, the generate function returns the generated text as a string starting with the input.
  2. Step 2: Analyze the code snippet

    It prints the output, which should be a string sentence starting with 'Once upon a time'.
  3. Final Answer:

    A string containing a sentence starting with 'Once upon a time' -> Option C
  4. Quick Check:

    Output type = string sentence [OK]
Hint: Generate outputs text strings, not lists or booleans [OK]
Common Mistakes:
  • Expecting numeric lists instead of text
  • Assuming max_length causes errors
  • Thinking output is a success flag
4. This code tries to generate text but raises an error:
start = 'Hello'
output = model.generate(start, max_len=20)
print(output)

What is the likely cause of the error?
medium
A. The parameter name should be max_length, not max_len
B. The start text must be a list, not a string
C. The model.generate method does not exist
D. The print statement is missing parentheses

Solution

  1. Step 1: Check parameter names for generate()

    The correct parameter to limit output length is max_length, not max_len.
  2. Step 2: Verify other code parts

    Start text as string is valid, model.generate exists, and print uses parentheses correctly.
  3. Final Answer:

    The parameter name should be max_length, not max_len -> Option A
  4. Quick Check:

    Correct param name = max_length [OK]
Hint: Use exact parameter names like max_length to avoid errors [OK]
Common Mistakes:
  • Using wrong parameter names
  • Thinking input must be a list
  • Ignoring Python 3 print syntax
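The failure mode in this question can be reproduced with any Python function that declares its keyword parameters explicitly. The sketch below uses a hypothetical stand-in for model.generate that merely truncates its input; it is not a real model or library API, and a real library may raise a different exception type for an unrecognized argument.

```python
def generate(start_text, max_length=20):
    """Hypothetical stand-in for model.generate; it only truncates the prompt."""
    return start_text[:max_length]

try:
    generate("Hello", max_len=20)  # misspelled keyword argument
except TypeError as err:
    print("TypeError:", err)

# The correctly spelled keyword is accepted.
print(generate("Hello", max_length=20))
```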
5. You want to generate a story summary using a text generation model. Which approach best explains why the model creates new content rather than copying existing text?
hard
A. The model translates the original story into another language and back
B. The model searches a database for exact matching summaries and returns them
C. The model randomly selects words from a dictionary without context
D. The model predicts each next word based on learned patterns, creating unique sentences

Solution

  1. Step 1: Understand text generation for summaries

    Models generate summaries by predicting next words using learned language patterns, not copying exact text.
  2. Step 2: Evaluate options based on this understanding

    Only "The model predicts each next word based on learned patterns, creating unique sentences" describes this predictive generation; the others describe copying, random selection, or translation.
  3. Final Answer:

    The model predicts each next word based on learned patterns, creating unique sentences -> Option D
  4. Quick Check:

    Generation = prediction of next words [OK]
Hint: Generation predicts words, it doesn't copy or translate [OK]
Common Mistakes:
  • Thinking generation copies exact text
  • Confusing generation with translation
  • Assuming random word selection