Challenge - 5 Problems
N-grams Master
❓ Predict Output
Intermediate
What is the output of this code generating bigrams?
Given the code below that generates bigrams from a sentence, what is the output?
NLP
sentence = "I love machine learning"
words = sentence.split()
bigrams = [(words[i], words[i+1]) for i in range(len(words)-1)]
print(bigrams)
💡 Hint
Remember bigrams are pairs of consecutive words.
Explanation
The code splits the sentence into words and pairs each word with the next one, producing consecutive word pairs (bigrams). The output is [('I', 'love'), ('love', 'machine'), ('machine', 'learning')].
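The same pairing idea extends to any n. The following is a minimal sketch, not part of the original challenge; the helper name `ngrams` is an assumption chosen for illustration.

```python
def ngrams(words, n):
    """Return every run of n consecutive items in words as a tuple."""
    # Build n shifted views of the list and zip them together.
    # zip stops at the shortest view, so no index can run past the end.
    return list(zip(*(words[i:] for i in range(n))))


words = "I love machine learning".split()
print(ngrams(words, 2))
# [('I', 'love'), ('love', 'machine'), ('machine', 'learning')]
print(ngrams(words, 3))
# [('I', 'love', 'machine'), ('love', 'machine', 'learning')]
```

Because the result is built with zip, asking for an n larger than the sentence simply returns an empty list.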
🧠 Conceptual
Intermediate
Which statement best describes the purpose of n-grams in text processing?
Choose the best description of why n-grams are used in natural language processing.
💡 Hint
Think about how n-grams help capture word relationships.
Explanation
N-grams group consecutive words into short sequences so that models can learn local context and word order, both of which matter for understanding meaning.
❓ Hyperparameter
Advanced
Choosing the right n for n-grams
If you want to capture longer phrases but avoid very sparse data, which n-gram size is usually the best choice?
💡 Hint
Longer n-grams capture more context but can cause data sparsity.
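Explanation
Bigrams (n = 2) often provide a good balance between capturing context and keeping data manageable. The number of possible n-grams grows roughly as V^n for a vocabulary of V words, so with larger n most n-grams appear only once or never in the training data.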
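The sparsity trade-off can be seen even on a tiny text. This is an illustrative sketch only: the sentence and the helper name `ngram_counts` are made up for the example and are not part of the challenge.

```python
from collections import Counter


def ngram_counts(words, n):
    """Count how often each n-gram occurs in words."""
    return Counter(tuple(words[i:i + n]) for i in range(len(words) - n + 1))


# Toy text, invented for this example.
words = "the cat sat on the mat and the cat ate the fish".split()

for n in (1, 2, 3):
    counts = ngram_counts(words, n)
    # N-grams seen exactly once give a model very little evidence.
    seen_once = sum(1 for c in counts.values() if c == 1)
    print(f"n={n}: {len(counts)} distinct, {seen_once} seen only once")
```

As n grows, a larger share of the n-grams occurs only once, which is the sparsity the hint refers to.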
❓ Metrics
Advanced
Evaluating n-gram language models
Which metric is commonly used to evaluate the quality of an n-gram language model?
💡 Hint
This metric measures uncertainty in predicting text sequences.
Explanation
Perplexity measures how surprised a model is by the test data; lower perplexity means better prediction.
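Perplexity is the exponential of the average negative log-probability the model assigns to each test token. The sketch below computes it for a bigram model with add-one (Laplace) smoothing. The function names and the toy training text are assumptions made for illustration, and the code assumes every test word also appears in the training text.

```python
import math
from collections import Counter


def train_bigram(words):
    """Collect context counts, bigram counts, and vocabulary size."""
    contexts = Counter(words[:-1])            # how often each word starts a bigram
    bigrams = Counter(zip(words, words[1:]))  # how often each pair occurs
    return contexts, bigrams, len(set(words))


def perplexity(test_words, contexts, bigrams, vocab_size):
    """Perplexity of an add-one smoothed bigram model on test_words."""
    pairs = list(zip(test_words, test_words[1:]))
    if not pairs:
        raise ValueError("need at least two test words")
    log_prob = 0.0
    for prev, cur in pairs:
        # Add-one smoothing keeps unseen pairs from getting probability zero.
        p = (bigrams[(prev, cur)] + 1) / (contexts[prev] + vocab_size)
        log_prob += math.log(p)
    return math.exp(-log_prob / len(pairs))


# Toy training text, invented for this example.
train = "the cat sat on the mat the cat ate".split()
contexts, bigrams, vocab_size = train_bigram(train)

likely = perplexity("the cat sat".split(), contexts, bigrams, vocab_size)
unlikely = perplexity("mat on sat".split(), contexts, bigrams, vocab_size)
print(likely, unlikely)  # the sequence seen in training scores lower
```

The word sequence that matches the training text gets the lower perplexity, which is what "less surprised" means in practice.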
🔧 Debug
Expert
Does this n-gram code raise an error?
Consider this code snippet that generates trigrams. Does it raise an IndexError, and why or why not?
NLP
sentence = "Data science is fun"
words = sentence.split()
trigrams = [(words[i], words[i+1], words[i+2]) for i in range(len(words)-2)]
print(trigrams)
💡 Hint
Check the range limit for accessing words[i+2].
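Explanation
As written, the snippet does not raise an IndexError: range(len(words)-2) stops at i = len(words)-3, so words[i+2] never goes past the last word, and the code prints [('Data', 'science', 'is'), ('science', 'is', 'fun')]. An IndexError would occur only if the range ran further, for example range(len(words)-1) or range(len(words)).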
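One way to check the range limit is to run the snippet's bound next to a deliberately wrong one. This is a small sketch for comparison; the second comprehension is a hypothetical buggy variant, not the code from the challenge.

```python
words = "Data science is fun".split()

# Bound from the snippet: the last start index is len(words) - 3,
# so words[i + 2] is at most the final word.
trigrams = [(words[i], words[i + 1], words[i + 2]) for i in range(len(words) - 2)]
print(trigrams)
# [('Data', 'science', 'is'), ('science', 'is', 'fun')]

# Hypothetical buggy variant: the range runs one step too far,
# so the last iteration asks for words[4] in a 4-word list.
raised = False
try:
    bad = [(words[i], words[i + 1], words[i + 2]) for i in range(len(words) - 1)]
except IndexError as exc:
    raised = True
    print("IndexError:", exc)
```

For n-grams in general, the safe bound is range(len(words) - n + 1).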