Practice

1. What does an n-gram language model primarily do?

easy

A. Predict the next word based on previous words

B. Translate text from one language to another

C. Generate images from text descriptions

D. Detect the sentiment of a sentence

Solution

Step 1: Understand the purpose of n-gram models
An n-gram model estimates how likely a word is given the n-1 words that come before it.
Step 2: Identify the main function
They use previous words to guess the next word in a sentence.
Final Answer:
Predict the next word based on previous words -> Option A
Quick Check:
N-gram models predict next word = A [OK]

Hint: N-grams predict next word from previous words [OK]

Common Mistakes:

Confusing n-gram with translation models
Thinking n-grams generate images
Mixing up sentiment analysis with n-grams
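
To make the "predict the next word" idea concrete, here is a minimal sketch of a bigram predictor built from raw counts. The toy corpus and the helper names train_bigram_model and predict_next are hypothetical, chosen only for illustration; a real model would be trained on far more text.

```python
from collections import Counter, defaultdict

def train_bigram_model(tokens):
    """Map each word to a Counter of the words that follow it."""
    followers = defaultdict(Counter)
    for prev, nxt in zip(tokens, tokens[1:]):
        followers[prev][nxt] += 1
    return followers

def predict_next(followers, word):
    """Return the most frequent follower of `word`, or None if unseen."""
    if word not in followers:
        return None
    return followers[word].most_common(1)[0][0]

# Hypothetical toy corpus, used only for illustration.
corpus = "i love ai and i love nlp and i love ai".split()
model = train_bigram_model(corpus)

# "love" is followed by "ai" twice and "nlp" once, so "ai" wins.
print(predict_next(model, "love"))
```

The prediction is simply the follower seen most often in the corpus; a word that was never seen as a context returns None.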

2. Which of the following is the correct way to represent a bigram from the sentence 'I love AI'?

easy

A. ('AI', 'love')

B. ('I', 'love')

C. ('love', 'AI', 'I')

D. ('I', 'AI')

Solution

Step 1: Understand bigrams
Bigrams are pairs of consecutive words in a sentence.
Step 2: Extract bigrams from 'I love AI'
The pairs are ('I', 'love') and ('love', 'AI'). Of the options, only ('I', 'love') is one of these pairs.
Final Answer:
('I', 'love') -> Option B
Quick Check:
Bigram = consecutive word pairs = B [OK]

Hint: Bigrams are pairs of consecutive words [OK]

Common Mistakes:

Including three words instead of two
Mixing word order in pairs
Selecting non-consecutive words
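
The extraction in the solution above can be sketched in a few lines. The function name bigrams and the whitespace tokenization are assumptions made for this example, not part of the question.

```python
def bigrams(sentence):
    """Return the consecutive word pairs of a whitespace-tokenized sentence."""
    words = sentence.split()
    # Pair each word with the one that follows it; zip stops at the shorter input.
    return list(zip(words, words[1:]))

pairs = bigrams("I love AI")
print(pairs)  # [('I', 'love'), ('love', 'AI')]
```

Because the pairs keep the original word order and only join neighbours, ('AI', 'love') and ('I', 'AI') never appear.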

3. Given the sentence 'the cat sat on the mat', what is the count of the trigram ('the', 'cat', 'sat')?

medium

A. 0

B. 2

C. 1

D. 3

Solution

Step 1: Identify trigrams in the sentence
Trigrams are sequences of three consecutive words. The trigrams are: ('the', 'cat', 'sat'), ('cat', 'sat', 'on'), ('sat', 'on', 'the'), ('on', 'the', 'mat').
Step 2: Count the trigram ('the', 'cat', 'sat')
This trigram appears once at the start of the sentence.
Final Answer:
1 -> Option C
Quick Check:
Trigram count = 1 [OK]

Hint: Count exact three-word sequences in order [OK]

Common Mistakes:

Counting non-consecutive words
Confusing bigrams with trigrams
Overcounting repeated words
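
The count can be checked with a short sketch. The helper ngrams is a hypothetical name introduced here; it slides a window of n words across the sentence, and collections.Counter tallies the results.

```python
from collections import Counter

def ngrams(words, n):
    """Return every run of n consecutive words as a tuple."""
    return [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]

words = "the cat sat on the mat".split()
trigram_counts = Counter(ngrams(words, 3))

print(len(trigram_counts))                    # 4 distinct trigrams
print(trigram_counts[("the", "cat", "sat")])  # 1
```

Although 'the' occurs twice in the sentence, the full three-word sequence occurs only once, so the count is 1.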

4. Consider this Python code snippet to generate bigrams from a list of words:

words = ['hello', 'world', 'hello']
bigrams = [(words[i], words[i+1]) for i in range(len(words))]

What error will this code produce?

medium

A. No error, code runs correctly

B. SyntaxError: invalid syntax

C. TypeError: unsupported operand type(s)

D. IndexError: list index out of range

Solution

Step 1: Analyze the loop range
The loop runs from 0 to len(words)-1, which is 0 to 2 for 3 words.
Step 2: Check index access inside loop
At i=2, words[i+1] tries to access words[3], which is out of range, causing IndexError.
Final Answer:
IndexError: list index out of range -> Option D
Quick Check:
Index i+1 runs past the end of the list = D [OK]

Hint: Check loop range when accessing i+1 index [OK]

Common Mistakes:

Using full length in range causing out-of-bounds
Assuming no error without testing
Confusing syntax errors with runtime errors
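
As a sketch of the failure and one possible fix: the block below first reproduces the error inside a try/except, then stops the loop one index early so that words[i + 1] always exists. The variable names broken, fixed, and error_message are introduced here for illustration.

```python
words = ['hello', 'world', 'hello']

# Original loop: i reaches 2, so words[i + 1] asks for words[3].
try:
    broken = [(words[i], words[i + 1]) for i in range(len(words))]
except IndexError as exc:
    error_message = str(exc)
    print("IndexError:", error_message)

# Fix: iterate only up to the second-to-last index.
fixed = [(words[i], words[i + 1]) for i in range(len(words) - 1)]
print(fixed)  # [('hello', 'world'), ('world', 'hello')]

# Equivalent form that avoids index arithmetic altogether.
assert fixed == list(zip(words, words[1:]))
```

The zip form is often preferred because it cannot run past the end of the list.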

5. You want to build a trigram model from a text corpus but notice many rare trigrams cause sparse data issues. Which technique can help improve your model's predictions?

hard

A. Use smoothing methods like Laplace smoothing

B. Increase the n in n-gram to 5-grams

C. Remove all trigrams that appear less than 10 times

D. Ignore the problem and use raw counts

Solution

Step 1: Understand sparse data in n-gram models
Most possible trigrams appear rarely or never in the corpus, so their counts are zero or too low to give reliable predictions.
Step 2: Identify smoothing techniques
Laplace (add-one) smoothing adds a small count to every n-gram, so unseen n-grams no longer get zero probability and predictions improve.
Final Answer:
Use smoothing methods like Laplace smoothing -> Option A
Quick Check:
Smoothing reduces sparse data issues = A [OK]

Hint: Apply smoothing to handle rare n-grams [OK]

Common Mistakes:

Increasing n, which makes sparsity worse
Removing rare n-grams, which discards useful information
Ignoring sparsity, which leads to poor predictions
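
A minimal sketch of Laplace smoothing for trigrams, assuming the common add-alpha form P(w3 | w1, w2) = (count(w1, w2, w3) + alpha) / (count(w1, w2) + alpha * V), where V is the vocabulary size. The function name laplace_trigram_prob, the alpha parameter, and the toy sentence are assumptions made for this example.

```python
from collections import Counter

def laplace_trigram_prob(tokens, w1, w2, w3, alpha=1.0):
    """Add-alpha estimate of P(w3 | w1, w2) from a list of tokens."""
    vocab_size = len(set(tokens))
    trigram_counts = Counter(zip(tokens, tokens[1:], tokens[2:]))
    # Count each (w1, w2) only where it is followed by a third word.
    context_counts = Counter((a, b) for a, b, _ in trigram_counts.elements())
    numerator = trigram_counts[(w1, w2, w3)] + alpha
    denominator = context_counts[(w1, w2)] + alpha * vocab_size
    return numerator / denominator

tokens = "the cat sat on the mat".split()  # toy corpus with 5 distinct words

seen = laplace_trigram_prob(tokens, "the", "cat", "sat")    # (1 + 1) / (1 + 5)
unseen = laplace_trigram_prob(tokens, "the", "cat", "mat")  # (0 + 1) / (1 + 5)
print(seen, unseen)
```

The unseen trigram gets a small nonzero probability instead of zero, and the probabilities over all possible third words still sum to 1 for a given context.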
