Hard · Model Choice · Q15 of 15
NLP - Text Generation
You want to build a trigram language model to predict the next word given two previous words. Which approach best handles the problem of unseen trigrams in your training data?
A. Only use unigram probabilities for all predictions
B. Ignore unseen trigrams and assign zero probability
C. Use smoothing techniques like Kneser-Ney smoothing
D. Increase the training data size without smoothing
Step-by-Step Solution
  1. Step 1: Understand the unseen trigram problem

    Trigrams absent from the training data receive zero probability under a pure maximum-likelihood estimate, which makes the model assign zero probability to any sentence containing them.
  2. Step 2: Identify solution to zero probability issue

    Smoothing techniques like Kneser-Ney adjust probabilities to handle unseen cases effectively.
  3. Step 3: Evaluate other options

    Ignoring unseen trigrams (Option B) keeps the zero-probability problem; using only unigram probabilities (Option A) discards the two-word context entirely; and adding more data (Option D) reduces but never eliminates sparsity, since the space of possible trigrams grows cubically with vocabulary size.
  4. Final Answer:

    Use smoothing techniques like Kneser-Ney smoothing -> Option C
  5. Quick Check:

    Smoothing fixes zero probabilities for unseen trigrams. [OK]
Quick Trick: When a question mentions unseen n-grams or zero probabilities, the answer almost always involves smoothing. [OK]
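The steps above can be sketched in code. The snippet below is a minimal illustration, not a full Kneser-Ney implementation: it uses add-k smoothing (a simpler technique from the same family) to show how an unseen trigram gets zero probability under the raw maximum-likelihood estimate but a small nonzero probability once smoothing is applied. The toy corpus and function names are invented for the example.

```python
from collections import Counter

def train_trigram_counts(tokens):
    """Count trigrams and their bigram contexts in a token list."""
    trigrams = Counter(zip(tokens, tokens[1:], tokens[2:]))
    bigrams = Counter(zip(tokens, tokens[1:]))
    return trigrams, bigrams

def mle_prob(w1, w2, w3, trigrams, bigrams):
    """Unsmoothed maximum-likelihood estimate: zero for unseen trigrams."""
    ctx = bigrams[(w1, w2)]
    return trigrams[(w1, w2, w3)] / ctx if ctx else 0.0

def add_k_prob(w1, w2, w3, trigrams, bigrams, vocab_size, k=0.5):
    """Add-k smoothing: every trigram gets a nonzero probability."""
    return (trigrams[(w1, w2, w3)] + k) / (bigrams[(w1, w2)] + k * vocab_size)

# Toy corpus: "the cat ran" never appears as a trigram.
corpus = "the cat sat on the mat the cat ate the fish".split()
tri, bi = train_trigram_counts(corpus)
V = len(set(corpus))

print(mle_prob("the", "cat", "ran", tri, bi))        # 0.0 -- the unseen-trigram problem
print(add_k_prob("the", "cat", "ran", tri, bi, V))   # small but nonzero
```

Kneser-Ney improves on add-k by subtracting a fixed discount from seen counts and redistributing that mass according to how many distinct contexts each word appears in, which is why it is the standard choice in practice.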
Common Mistakes:
  • Assigning zero probability to unseen trigrams
  • Ignoring context by using only unigrams
  • Relying solely on more data without smoothing
