Recall & Review
beginner
What is the main purpose of BERT tokenization using WordPiece?
To split words into smaller subword units so that rare or unknown words can be represented as combinations of known pieces, improving the model's understanding of language.
beginner
How does WordPiece handle unknown words during tokenization?
It breaks unknown words into smaller known subword pieces using a greedy longest-match-first strategy: starting at the beginning of the word, it repeatedly takes the longest piece found in the vocabulary until the whole word is covered, allowing the model to represent new words from familiar parts.
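The greedy longest-match-first procedure described above can be sketched in a few lines of Python. This is a simplified illustration, not BERT's actual implementation, and the tiny vocabulary used in the comment is made up for the example:

```python
def wordpiece_tokenize(word, vocab, unk="[UNK]", max_len=100):
    """Greedy longest-match-first WordPiece tokenization (simplified sketch).

    At each step, find the longest substring (prefixed with '##' when not
    at the start of the word) that exists in the vocabulary. If no piece
    matches, the whole word becomes the unknown token.
    """
    if len(word) > max_len:
        return [unk]
    tokens = []
    start = 0
    while start < len(word):
        end = len(word)
        piece = None
        while start < end:
            candidate = word[start:end]
            if start > 0:
                candidate = "##" + candidate  # mark continuation pieces
            if candidate in vocab:
                piece = candidate
                break
            end -= 1  # shrink the candidate until a vocab entry matches
        if piece is None:
            return [unk]
        tokens.append(piece)
        start = end
    return tokens

# Example with a toy vocabulary:
# wordpiece_tokenize("unhappiness", {"un", "##happi", "##ness"})
# yields ["un", "##happi", "##ness"]
```

With a toy vocabulary like `{"un", "##happi", "##ness"}`, the word "unhappiness" is covered entirely by known pieces, while a word containing no matching pieces falls back to the unknown token.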
beginner
Why does WordPiece add '##' before some tokens?
The '##' symbol marks that the token is a continuation of a previous token and not a standalone word, helping the model know how subwords connect to form full words.
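Because '##' marks continuation pieces, the original words can be recovered by attaching each '##' piece to the token before it. A minimal sketch of this rejoining step (the function name is ours, not a real library API):

```python
def detokenize(tokens):
    """Rejoin WordPiece tokens: each '##' piece attaches to the previous token."""
    words = []
    for tok in tokens:
        if tok.startswith("##") and words:
            words[-1] += tok[2:]  # strip the marker and glue onto the prior word
        else:
            words.append(tok)  # a standalone word or word-initial piece
    return words

# detokenize(["play", "##ing", "a", "game"]) yields ["playing", "a", "game"]
```

This shows why the marker matters: without it, the model (and any downstream code) could not tell whether "ing" starts a new word or continues the previous one.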
intermediate
Explain the difference between a word and a WordPiece token in BERT tokenization.
A word is a complete unit of language, while a WordPiece token can be a full word or a smaller part of a word. WordPiece tokens allow BERT to handle rare or new words by breaking them into known pieces.
intermediate
What is the advantage of using WordPiece tokenization over simple word-level tokenization?
WordPiece reduces the vocabulary size and handles rare or new words better by splitting them into subwords, which helps the model learn more efficiently and generalize to unseen words.
What does the '##' symbol indicate in WordPiece tokens?
The '##' symbol shows that the token continues from the previous token, indicating it is part of the same word.
Why does BERT use WordPiece tokenization instead of splitting only by spaces?
WordPiece helps BERT handle rare or new words by splitting them into known subword units.
If the word 'unhappiness' is unknown, how might WordPiece tokenize it?
WordPiece might break it into pieces such as 'un', '##happi', and '##ness', with '##' marking the continuation pieces.
What is a key benefit of having a smaller vocabulary with WordPiece?
A smaller vocabulary reduces model size and helps it learn better by reusing subword units.
Which of these is NOT true about WordPiece tokenization?
WordPiece often splits words into multiple tokens, especially for rare or unknown words.
Describe how BERT's WordPiece tokenization works and why it is useful.
Think about how breaking words into smaller parts helps the model.
Explain the role of the '##' symbol in WordPiece tokens and give an example.
Consider how subwords connect to form full words.