Recall & Review
beginner
What does BERT stand for in NLP?
BERT stands for Bidirectional Encoder Representations from Transformers. It is a model designed to understand the context of words in both directions in a sentence.
beginner
What are the two main tasks used in BERT pre-training?
The two main tasks are:<br>1. Masked Language Modeling (MLM): Randomly masks about 15% of the input tokens and trains the model to predict the original tokens.<br>2. Next Sentence Prediction (NSP): Trains the model to decide whether one sentence actually follows another in the original text.
intermediate
Why is BERT called 'bidirectional'?
Because BERT looks at the words before and after a target word at the same time during training. This helps it understand the full context, unlike older models that read text only left-to-right or right-to-left.
beginner
Explain Masked Language Modeling (MLM) in simple terms.
MLM is like a fill-in-the-blank game. Some words in a sentence are hidden, and BERT tries to guess those missing words using the surrounding words. This helps BERT learn word meanings and context.
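The fill-in-the-blank idea above can be sketched in plain Python. This is a simplified illustration of BERT's masking rule (of the ~15% of tokens selected, 80% become [MASK], 10% become a random token, 10% are left unchanged), not the actual BERT tokenizer or training code; whole words stand in for WordPiece subwords, and the small `vocab` list is a made-up placeholder:

```python
import random

def mask_tokens(tokens, mask_prob=0.15, vocab=None, seed=0):
    """BERT-style masking sketch: select ~mask_prob of tokens; of those,
    80% become [MASK], 10% become a random vocab token, 10% stay as-is.
    The model is trained to recover the original token at each selected
    position; labels[i] is None at positions that are not predicted."""
    rng = random.Random(seed)
    vocab = vocab or ["the", "cat", "sat", "dog", "ran"]  # toy vocabulary
    masked = list(tokens)
    labels = [None] * len(tokens)
    for i, tok in enumerate(tokens):
        if rng.random() < mask_prob:
            labels[i] = tok            # the model must predict this token
            roll = rng.random()
            if roll < 0.8:
                masked[i] = "[MASK]"   # 80%: replace with [MASK]
            elif roll < 0.9:
                masked[i] = rng.choice(vocab)  # 10%: random token
            # remaining 10%: keep the original token
    return masked, labels

sentence = "the quick brown fox jumps over the lazy dog".split()
masked, labels = mask_tokens(sentence, seed=42)
```

Keeping 10% of the selected tokens unchanged matters: at fine-tuning time there is no [MASK] token, so the model must learn useful representations for ordinary tokens too.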
intermediate
What is the purpose of Next Sentence Prediction (NSP) in BERT pre-training?
NSP teaches BERT to model relationships between sentences. It learns to predict whether one sentence actually follows another in the original text, which helps on downstream tasks such as question answering and natural language inference.
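How NSP training pairs are built can be sketched as follows. This is a minimal illustration of the 50/50 sampling scheme, not BERT's actual data pipeline; the function name `make_nsp_pairs` and the tiny example documents are made up for the sketch:

```python
import random

def make_nsp_pairs(documents, seed=0):
    """Build Next Sentence Prediction pairs: for each consecutive pair
    (A, B) in a document, keep B half the time (label "IsNext") and
    swap in a random sentence half the time (label "NotNext")."""
    rng = random.Random(seed)
    all_sentences = [s for doc in documents for s in doc]
    pairs = []
    for doc in documents:
        for a, b in zip(doc, doc[1:]):
            if rng.random() < 0.5:
                pairs.append((a, b, "IsNext"))
            else:
                pairs.append((a, rng.choice(all_sentences), "NotNext"))
    return pairs

docs = [
    ["The cat sat down.", "It fell asleep.", "The sun set."],
    ["Stocks rose today.", "Analysts were surprised."],
]
pairs = make_nsp_pairs(docs, seed=1)
```

Classifying each pair as IsNext/NotNext forces the model to capture sentence-level coherence, which single-sentence MLM alone does not teach.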
What does BERT use to understand the context of words?
BERT reads sentences in both directions to understand full context.
In Masked Language Modeling, what does BERT try to predict?
MLM hides some words and BERT predicts those missing words.
What is the goal of Next Sentence Prediction in BERT?
NSP helps BERT learn if one sentence logically follows another.
Why is BERT pre-trained before fine-tuning on specific tasks?
Pre-training helps BERT learn language patterns useful for many tasks.
Which architecture does BERT use?
BERT is based on the Transformer encoder architecture.
Describe the two main pre-training tasks of BERT and why they are important.
Think about how BERT learns words and sentence order.
Explain why BERT's bidirectional approach helps it understand language better than previous models.
Consider how knowing words before and after helps guess meaning.