Challenge - 5 Problems
N-gram Mastery
Get all challenges correct to earn this badge!
Test your skills under time pressure!
🧠 Conceptual
Difficulty: intermediate
Understanding the purpose of N-gram models
What is the main purpose of using an N-gram language model in natural language processing?
💡 Hint
Think about what N-gram models estimate about word sequences.
✓ Answer
N-gram models estimate the probability of a word given the previous N-1 words, helping predict the next word in a sequence.
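To make the idea concrete, here is a minimal sketch of a bigram next-word predictor built from MLE counts; the toy corpus and the helper name `next_word_probs` are illustrative, not part of the challenge.

```python
from collections import Counter

# Toy corpus for illustration only
corpus = "i love machine learning i love natural language processing".split()

# Count bigrams and their left-context unigrams:
# MLE estimate: P(next | prev) = count(prev, next) / count(prev)
bigrams = Counter(zip(corpus, corpus[1:]))
unigrams = Counter(corpus[:-1])

def next_word_probs(prev):
    """Return the MLE distribution over the next word given the previous word."""
    return {w2: c / unigrams[prev] for (w1, w2), c in bigrams.items() if w1 == prev}

print(next_word_probs("love"))  # {'machine': 0.5, 'natural': 0.5}
```

Given "love", the model splits probability between the two continuations it has seen, which is exactly the "predict the next word" behavior described above.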
❓ Predict Output
Difficulty: intermediate
Output of a bigram probability calculation
Given the sentence: 'I love machine learning', and the bigram counts: {('I', 'love'): 3, ('love', 'machine'): 2, ('machine', 'learning'): 4}, what is the bigram probability P('machine'|'love') using maximum likelihood estimation?
```python
# Maximum likelihood estimate: P('machine' | 'love') = count(('love', 'machine')) / count('love')
bigram_counts = {('I', 'love'): 3, ('love', 'machine'): 2, ('machine', 'learning'): 4}
unigram_counts = {'I': 3, 'love': 4, 'machine': 4}
prob = bigram_counts[('love', 'machine')] / unigram_counts['love']
print(prob)  # 0.5
```
💡 Hint
Divide the count of the bigram by the count of the first word in the bigram.
✓ Answer
The bigram probability P('machine'|'love') = count('love machine') / count('love') = 2 / 4 = 0.5
❓ Hyperparameter
Difficulty: advanced
Choosing the value of N in N-gram models
Which of the following is a common trade-off when increasing the value of N in an N-gram language model?
💡 Hint
Think about how longer sequences affect data requirements and sparsity.
✓ Answer
Increasing N captures more context, but the number of possible N-grams grows rapidly, so most are rarely or never observed. This data sparsity means the model needs far more training data (option B).
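The sparsity effect can be seen on a tiny corpus: as N grows, a larger fraction of the observed N-grams are singletons (seen only once). This is an illustrative sketch; the sentence is made up.

```python
from collections import Counter

# Toy corpus for illustration
tokens = "the cat sat on the mat and the cat ran".split()

results = {}
for n in range(1, 4):
    # All n-grams of length n, as tuples
    ngrams = Counter(zip(*(tokens[i:] for i in range(n))))
    # Singletons are n-grams observed exactly once
    singletons = sum(1 for c in ngrams.values() if c == 1)
    results[n] = (len(ngrams), singletons)
    print(n, len(ngrams), singletons)
```

Even on ten tokens, every trigram is a singleton, while several unigrams repeat; this is the trade-off the question asks about.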
❓ Metrics
Difficulty: advanced
Evaluating N-gram language models with perplexity
If an N-gram language model has a perplexity of 50 on a test set, what does this indicate about the model's performance?
💡 Hint
Lower perplexity means better prediction; higher means worse.
✓ Answer
Perplexity measures how well a model predicts held-out data; lower is better. A perplexity of 50 means that, at each step, the model is on average as uncertain as if it were choosing uniformly among 50 equally likely words.
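A small sketch of the computation: perplexity is the exponential of the average negative log-probability per word. The per-word probabilities below are made up for illustration.

```python
import math

# Per-word probabilities assigned by a hypothetical model to a test sequence
test_probs = [0.02, 0.02, 0.02]

# Perplexity = exp( -(1/N) * sum(log p_i) )
log_prob = sum(math.log(p) for p in test_probs)
perplexity = math.exp(-log_prob / len(test_probs))
print(perplexity)  # ~50: uniform choice among 50 words at each step
```

Assigning each word probability 1/50 yields a perplexity of 50, matching the interpretation above.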
🔧 Debug
Difficulty: expert
Identifying the error in smoothing implementation
Consider this code snippet for add-one smoothing in a bigram model:
```python
bigram_counts = {('the', 'cat'): 3, ('cat', 'sat'): 2}
unigram_counts = {'the': 5, 'cat': 4}
vocab_size = 10
word1 = 'cat'
word2 = 'sat'
prob = (bigram_counts.get((word1, word2), 0) + 1) / (unigram_counts[word1] + vocab_size)
print(prob)
```
What error will this code raise when calculating the probability for the bigram ('sat', 'on')?
💡 Hint
Check if all words used as keys exist in unigram_counts.
✓ Answer
Because 'sat' is not a key in unigram_counts, accessing unigram_counts[word1] raises a KeyError.
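One defensive fix, using the same toy counts, is to apply .get with a default of 0 to the context count as well, so unseen context words fall back to the pure smoothing term; the helper name `add_one_prob` is illustrative.

```python
bigram_counts = {('the', 'cat'): 3, ('cat', 'sat'): 2}
unigram_counts = {'the': 5, 'cat': 4}
vocab_size = 10

def add_one_prob(word1, word2):
    """Add-one smoothed bigram probability; unseen words default to count 0."""
    bigram = bigram_counts.get((word1, word2), 0)
    context = unigram_counts.get(word1, 0)  # avoids the KeyError for unseen words
    return (bigram + 1) / (context + vocab_size)

print(add_one_prob('sat', 'on'))  # (0 + 1) / (0 + 10) = 0.1
```

With this change, ('sat', 'on') gets the fully smoothed probability 1/10 instead of crashing, while seen bigrams such as ('cat', 'sat') still use their observed counts.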