Challenge - 5 Problems

❓ Predict Output

intermediate

What is the output of this code generating bigrams?

Given the code below that generates bigrams from a sentence, what is the output?

NLP

sentence = "I love machine learning"
words = sentence.split()
bigrams = [(words[i], words[i+1]) for i in range(len(words)-1)]
print(bigrams)

A. [('I', 'love'), ('love', 'machine'), ('learning', 'machine')]

B. [('I', 'love'), ('love', 'machine'), ('machine', 'learning')]

C. [('love', 'I'), ('machine', 'love'), ('learning', 'machine')]

D. [('I', 'machine'), ('love', 'learning')]
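
The list comprehension in this question pairs each word with the word that follows it. A minimal sketch of the same idea generalized to any n is shown here. The helper name `ngrams` and the sample sentence are illustrative assumptions, not part of the quiz, and the sentence is deliberately different from the one in the question.

```python
def ngrams(words, n):
    """Return the n-grams of a token list as a list of tuples."""
    # Build n staggered views of the list (words[0:], words[1:], ...).
    # zip stops at the shortest view, so no index runs past the end.
    return list(zip(*(words[i:] for i in range(n))))


words = "the cat sat down".split()
print(ngrams(words, 2))
# [('the', 'cat'), ('cat', 'sat'), ('sat', 'down')]
```

A list of k words yields k - n + 1 n-grams, and an empty list when k is smaller than n.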

🧠 Conceptual

intermediate

Which statement best describes the purpose of n-grams in text processing?

Choose the best description of why n-grams are used in natural language processing.

A. They capture sequences of words to understand context and word order.

B. They remove stop words to reduce noise in text data.

C. They translate text from one language to another.

D. They count the total number of characters in a text.
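
To see what information n-grams carry that single-word counts do not, here is a small sketch. The two sentences are made-up examples, and `Counter` from the standard library stands in for a real feature extractor.

```python
from collections import Counter

a = "dog bites man".split()
b = "man bites dog".split()

# Unigram counts ignore order, so the two sentences look identical.
print(Counter(a) == Counter(b))  # True

# Bigram counts keep adjacent-word order, so the sentences differ.
bigrams_a = Counter(zip(a, a[1:]))
bigrams_b = Counter(zip(b, b[1:]))
print(bigrams_a == bigrams_b)  # False
```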

❓ Hyperparameter

advanced

Choosing the right n for n-grams

If you want to capture longer phrases but avoid very sparse data, which n-gram size is usually the best choice?

A. Unigrams (n=1) because they are simple and cover all words.

B. Four-grams (n=4) because they capture the most detailed phrases.

C. Bigrams (n=2) because they balance context and data sparsity.

D. Trigrams (n=3) because longer sequences always improve accuracy.
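
Data sparsity can be made concrete by counting how many n-grams occur only once as n grows. The sketch below uses a tiny made-up corpus and a hypothetical helper `ngram_counts`. Real corpora behave the same way in kind, but the exact numbers here mean nothing beyond this toy text.

```python
from collections import Counter

# Toy corpus (an illustrative assumption, not data from the quiz).
text = "the cat sat on the mat and the cat sat on the rug"
words = text.split()


def ngram_counts(tokens, n):
    """Count every run of n consecutive tokens."""
    return Counter(zip(*(tokens[i:] for i in range(n))))


singletons_by_n = {}
for n in range(1, 5):
    counts = ngram_counts(words, n)
    # An n-gram seen only once gives the model almost no evidence.
    singletons_by_n[n] = sum(1 for c in counts.values() if c == 1)
    print(f"n={n}: {len(counts)} distinct, {singletons_by_n[n]} seen only once")
```

On this text the number of n-grams seen only once rises with every step from n=1 to n=4, which is the trade-off the question asks about: a larger n captures longer phrases but leaves fewer repeated examples to learn from.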

❓ Metrics

advanced

Evaluating n-gram language models

Which metric is commonly used to evaluate the quality of an n-gram language model?

A. Perplexity, which measures how well the model predicts a sample.

B. Accuracy, which counts correct word predictions only.

C. Mean Squared Error, used for regression tasks.

D. F1 Score, used for classification balance.
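
For readers who have not computed perplexity by hand, here is a rough sketch for a bigram model. It assumes add-one (Laplace) smoothing and a closed vocabulary that includes the test words. Both are simplifications chosen for brevity, and the function name `bigram_perplexity` is hypothetical.

```python
import math
from collections import Counter


def bigram_perplexity(train_tokens, test_tokens):
    """Perplexity of an add-one-smoothed bigram model on test_tokens."""
    pairs = list(zip(test_tokens, test_tokens[1:]))
    if not pairs:
        raise ValueError("test_tokens needs at least two tokens")

    # Assumption: closed vocabulary covering both training and test words.
    vocab = set(train_tokens) | set(test_tokens)
    # How often each word appears as the first word of a training bigram.
    context = Counter(train_tokens[:-1])
    bigram = Counter(zip(train_tokens, train_tokens[1:]))

    log_prob = 0.0
    for prev, word in pairs:
        # Add-one smoothing keeps unseen bigrams from getting probability 0.
        p = (bigram[(prev, word)] + 1) / (context[prev] + len(vocab))
        log_prob += math.log(p)

    # Perplexity is the exponential of the average negative log-probability.
    return math.exp(-log_prob / len(pairs))


train = "the cat sat on the mat".split()
seen = "the cat sat on the mat".split()
shuffled = "mat the on sat cat the".split()

print(bigram_perplexity(train, seen))
print(bigram_perplexity(train, shuffled))
```

Lower perplexity means the model found the text less surprising, so the sentence in its training order scores lower than the shuffled one.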

🔧 Debug

expert

Why does this n-gram code raise an error?

Consider this code snippet to generate trigrams. Why does it raise an IndexError?

NLP

sentence = "Data science is fun"
words = sentence.split()
trigrams = [(words[i], words[i+1], words[i+2]) for i in range(len(words))]
print(trigrams)

A. The split method does not create a list of words.

B. The print statement is missing parentheses.

C. Tuples cannot have three elements in Python.

D. The range goes too far, causing words[i+2] to exceed list length.
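
The bound passed to `range` decides whether `words[i+2]` stays inside the list. A minimal sketch of a bound that stays in range is shown below, next to an off-by-one bound that does not. The helper `trigrams` and the sample sentence are illustrative assumptions.

```python
def trigrams(words):
    """Return all runs of three consecutive words."""
    # The last valid start index is len(words) - 3, so the range stops at
    # len(words) - 2. Any larger bound makes words[i + 2] run off the end.
    return [(words[i], words[i + 1], words[i + 2]) for i in range(len(words) - 2)]


words = "n-grams are easy to build".split()
print(trigrams(words))
print(trigrams(["too", "short"]))  # [] because the range is empty

# An off-by-one bound reproduces the IndexError:
try:
    [(words[i], words[i + 1], words[i + 2]) for i in range(len(words) - 1)]
except IndexError as exc:
    print("IndexError:", exc)
```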

Practice

1. What is an n-gram in natural language processing?

easy

A. A random selection of n words from a text

B. A single word repeated n times

C. A sentence with n words

D. A group of n consecutive words in a text

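Solution

Step 1: Understand the definition of n-gram

An n-gram is a contiguous sequence of n items, here words, taken from a text in their original order.

Step 2: Compare options with definition

Only option D describes consecutive words. Option A drops the requirement that the words be adjacent, option B repeats a single word, and option C describes the length of a sentence.

Final Answer:

D. A group of n consecutive words in a text

Quick Check:

The bigrams (n=2) of "I love NLP" are ("I", "love") and ("love", "NLP").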
