Natural language processing basics in Intro to Computing - Practice Problems & Coding Challenges

Challenge - 5 Problems
🧠 Conceptual
intermediate
What does tokenization do in NLP?

In natural language processing, what is the main purpose of tokenization?

A. It converts text into numerical vectors directly.
B. It translates text from one language to another.
C. It removes all punctuation from the text.
D. It splits text into smaller pieces like words or sentences.
💡 Hint

Think about how you break a sentence into parts before understanding it.
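
To make the hint concrete, here is a minimal sketch of breaking text into parts using only Python's standard re module. The helper names tokenize_words and tokenize_sentences are hypothetical, and the rules are deliberately naive (abbreviations such as "Dr." would be split incorrectly); real NLP libraries use more careful rules.

```python
import re

def tokenize_words(text):
    # Hypothetical helper: each run of letters, digits, or apostrophes is one word token.
    return re.findall(r"[A-Za-z0-9']+", text)

def tokenize_sentences(text):
    # Naive rule: split on whitespace that follows '.', '!' or '?'.
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]

sample = "Tokenization is simple. It splits text into pieces!"
words = tokenize_words(sample)
sentences = tokenize_sentences(sample)

print(words)      # ['Tokenization', 'is', 'simple', 'It', 'splits', 'text', 'into', 'pieces']
print(sentences)  # ['Tokenization is simple.', 'It splits text into pieces!']
```

The same input yields different pieces depending on the unit chosen (words or sentences), which is why the unit is usually decided before any later processing step.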

🔍 Analysis
intermediate
Output of simple word count code

What is the output of this Python code that counts word frequencies?

from collections import Counter
text = 'apple banana apple orange banana apple'
word_counts = Counter(text.split())
print(word_counts)
A. Counter({'apple': 3, 'banana': 2, 'orange': 1})
B. Counter({'banana': 3, 'apple': 2, 'orange': 1})
C. {'apple': 3, 'banana': 2, 'orange': 1}
D. Counter({'apple': 1, 'banana': 1, 'orange': 1})
💡 Hint

Count how many times each word appears in the text.
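
As a separate illustration of how Counter behaves, the sketch below counts words in a different, made-up sentence. The sample sentence is an assumption chosen only for this example; Counter, its zero default for missing keys, and most_common are part of the standard collections module.

```python
from collections import Counter

# Hypothetical sample sentence, chosen only for illustration.
sentence = "the cat saw the dog and the bird"

# split() with no arguments breaks the string on whitespace.
counts = Counter(sentence.split())

print(counts["the"])          # 3
print(counts["fish"])         # 0 (missing words count as zero, no KeyError)
print(counts.most_common(1))  # [('the', 3)]
```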

Model Choice
advanced
Best model for sentiment analysis

You want to build a model to classify movie reviews as positive or negative. Which model type is best suited for this task?

A. Convolutional Neural Network (CNN) designed for image recognition
B. K-Means clustering for grouping similar reviews
C. Recurrent Neural Network (RNN) or Transformer model for text sequences
D. Linear regression predicting review length
💡 Hint

Think about models that understand sequences of words.

Metrics
advanced
Choosing the right metric for imbalanced text classification

You trained a spam detection model where spam messages are rare. Which metric is best to evaluate your model's performance?

A. Precision and recall, to balance false positives and false negatives
B. Accuracy, because it shows overall correct predictions
C. Mean squared error, to measure prediction errors
D. Confusion matrix size, to count total classes
💡 Hint

Think about what matters when one class is much smaller than the other.
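
A small worked example may help here. The sketch below uses made-up labels (5 spam messages out of 100) and a deliberately useless baseline that labels everything as not spam. The function name precision_recall and the data are assumptions for illustration only.

```python
def precision_recall(y_true, y_pred, positive=1):
    # Count true positives, false positives, and false negatives for the positive class.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    # Report 0.0 when a denominator is zero (a common convention, assumed here).
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall

# Made-up labels: 1 = spam, 0 = not spam; only 5 of 100 messages are spam.
y_true = [1] * 5 + [0] * 95
# Baseline "model" that never predicts spam.
y_pred = [0] * 100

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
precision, recall = precision_recall(y_true, y_pred)

print(accuracy)   # 0.95
print(precision)  # 0.0
print(recall)     # 0.0
```

The baseline catches no spam at all, yet it still scores 95% on one of these numbers, which is the situation the hint is pointing at.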

🔍 Analysis
expert
Does this text preprocessing code raise an error?

Consider this Python code snippet for lowercasing and removing punctuation from text. Does it raise an error, and if not, what does it print?

import string
text = 'Hello, World!'
clean_text = text.lower().translate(str.maketrans('', '', string.punctuation))
print(clean_text)
A. Yes: lower() cannot be chained with translate(), causing an AttributeError
B. No: str.maketrans('', '', string.punctuation) builds the translation table that translate() expects, so the code prints hello world
C. No, but string.punctuation is empty, so translate() does nothing and the punctuation stays
D. Yes: print() is missing parentheses, causing a SyntaxError
💡 Hint

Check what the translate() method needs as input.
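
The hint can be explored with a short sketch of how str.maketrans and translate() fit together. It relies only on documented standard-library behavior: the three-argument form of str.maketrans returns a dictionary that maps each character of the third argument to None, and translate() deletes characters mapped to None. The variable names are chosen for illustration.

```python
import string

# Three-argument form: every character in the third argument is mapped to None,
# which translate() interprets as "delete this character".
table = str.maketrans('', '', string.punctuation)

sample = 'Hello, World!'  # same sample string as in the question
cleaned = sample.lower().translate(table)

print(type(table).__name__)  # dict
print(table[ord(',')])       # None
print(cleaned)               # hello world
```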