Recall & Review

beginner

What does BERT stand for in NLP?

BERT stands for Bidirectional Encoder Representations from Transformers. It is a language representation model that uses the context on both the left and the right of each word in a sentence.

beginner

What are the two main tasks used in BERT pre-training?

The two main tasks are:
1. Masked Language Modeling (MLM): Randomly masks about 15% of the input tokens and trains the model to predict the original tokens.
2. Next Sentence Prediction (NSP): Trains the model to predict whether the second sentence actually follows the first in the original text.
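
Both tasks are trained on the same packed input: a sentence pair joined with the special tokens [CLS] and [SEP], plus segment ids that mark which sentence each token belongs to. The sketch below illustrates that layout on word-level tokens; the function name `build_bert_input` is hypothetical, and a real pipeline would use a WordPiece tokenizer rather than whole words.

```python
def build_bert_input(tokens_a, tokens_b):
    """Pack two token lists into one BERT-style input sequence.

    Returns (tokens, segment_ids), where segment id 0 marks sentence A
    (including [CLS] and the first [SEP]) and 1 marks sentence B
    (including the final [SEP]).
    """
    tokens = ["[CLS]"] + tokens_a + ["[SEP]"] + tokens_b + ["[SEP]"]
    segment_ids = [0] * (len(tokens_a) + 2) + [1] * (len(tokens_b) + 1)
    return tokens, segment_ids


tokens, segment_ids = build_bert_input(["my", "dog"], ["he", "barks"])
print(tokens)       # ['[CLS]', 'my', 'dog', '[SEP]', 'he', 'barks', '[SEP]']
print(segment_ids)  # [0, 0, 0, 0, 1, 1, 1]
```

MLM predictions are made at the masked token positions, while the NSP prediction is read from the [CLS] position.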

Click to reveal answer

intermediate

Why is BERT called 'bidirectional'?

Because BERT's self-attention lets every token attend to the words both before and after it at the same time. This helps it understand the full context, unlike older models that read text only left-to-right or right-to-left.
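
One way to picture the difference is through the attention mask. The sketch below is a simplified illustration, not BERT's actual implementation: the helper names `attention_mask` and `visible_context` are hypothetical, and tokens are plain words.

```python
def attention_mask(n, bidirectional):
    """mask[i][j] is True when position i may attend to position j."""
    return [[bidirectional or j <= i for j in range(n)] for i in range(n)]


def visible_context(tokens, target, bidirectional):
    """List the other tokens that the target position is allowed to see."""
    mask = attention_mask(len(tokens), bidirectional)
    return [tok for j, tok in enumerate(tokens) if mask[target][j] and j != target]


tokens = ["the", "bank", "of", "the", "river"]
print(visible_context(tokens, 1, bidirectional=False))  # ['the']
print(visible_context(tokens, 1, bidirectional=True))   # ['the', 'of', 'the', 'river']
```

With the left-to-right mask, "bank" sees only "the"; with the bidirectional mask it also sees "river", which is what disambiguates its meaning.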

beginner

Explain Masked Language Modeling (MLM) in simple terms.

MLM is like a fill-in-the-blank game. Some words in a sentence are hidden, and BERT tries to guess those missing words using the surrounding words. This helps BERT learn word meanings and context.
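
The fill-in-the-blank setup can be sketched in a few lines. This is a minimal illustration under stated assumptions: `mask_tokens` and its parameters are hypothetical names, tokens are whole words, and each token is selected independently with probability 0.15 (the original BERT recipe selects about 15% of positions per sequence). Of the selected tokens, 80% become [MASK], 10% become a random token, and 10% stay unchanged, following the BERT paper.

```python
import random

MASK = "[MASK]"


def mask_tokens(tokens, vocab, mask_prob=0.15, seed=0):
    """Return (corrupted, labels) for masked language modeling.

    labels[i] holds the original token at positions the model must
    predict, and None everywhere else.
    """
    rng = random.Random(seed)
    corrupted, labels = [], []
    for tok in tokens:
        if rng.random() < mask_prob:
            labels.append(tok)  # the model must recover this token
            roll = rng.random()
            if roll < 0.8:
                corrupted.append(MASK)               # 80%: replace with [MASK]
            elif roll < 0.9:
                corrupted.append(rng.choice(vocab))  # 10%: random token
            else:
                corrupted.append(tok)                # 10%: keep unchanged
        else:
            labels.append(None)  # not selected, no prediction needed
            corrupted.append(tok)
    return corrupted, labels


tokens = "the cat sat on the mat".split()
corrupted, labels = mask_tokens(tokens, vocab=tokens, mask_prob=0.15, seed=0)
print(corrupted)
print(labels)
```

The training loss is computed only at positions where the label is not None.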

intermediate

What is the purpose of Next Sentence Prediction (NSP) in BERT pre-training?

NSP teaches BERT to model relationships between sentences. Given a pair of sentences, it learns to predict whether the second one actually follows the first in the source text, which is intended to help sentence-pair tasks such as question answering and natural language inference.
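
Building NSP training pairs can be sketched as follows. The function name `make_nsp_pairs` is hypothetical, and the sketch assumes at least two documents with two or more sentences in the corpus: half the time sentence B is the true next sentence, and otherwise it is a random sentence drawn from a different document.

```python
import random


def make_nsp_pairs(documents, seed=0):
    """Build (sentence_a, sentence_b, is_next) examples.

    documents: list of documents, each a list of sentence strings.
    Requires at least two documents so negatives can come from another one.
    """
    rng = random.Random(seed)
    pairs = []
    for doc_idx, doc in enumerate(documents):
        # Sentences from every other document serve as negative candidates.
        others = [s for j, d in enumerate(documents) if j != doc_idx for s in d]
        for i in range(len(doc) - 1):
            if rng.random() < 0.5:
                pairs.append((doc[i], doc[i + 1], True))          # true next sentence
            else:
                pairs.append((doc[i], rng.choice(others), False))  # random sentence
    return pairs


docs = [
    ["I went to the store.", "I bought some milk.", "Then I walked home."],
    ["Penguins cannot fly.", "They are strong swimmers."],
]
for sent_a, sent_b, is_next in make_nsp_pairs(docs, seed=0):
    print(is_next, "|", sent_a, "|", sent_b)
```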

What does BERT use to understand the context of words?

A. Bidirectional reading of sentences

B. Only left-to-right reading

C. Only right-to-left reading

D. Random word order

In Masked Language Modeling, what does BERT try to predict?

A. The topic of the text

B. The next sentence

C. Hidden words in a sentence

D. The length of the sentence

What is the goal of Next Sentence Prediction in BERT?

A. Predict the next word in a sentence

B. Predict if one sentence follows another

C. Predict the sentiment of a sentence

D. Predict the length of a paragraph

Why is BERT pre-trained before fine-tuning on specific tasks?

A. To avoid training

B. To memorize answers

C. To reduce model size

D. To learn general language understanding

Which architecture does BERT use?

A. Transformer Encoder

B. Convolutional Neural Network

C. Support Vector Machine

D. Recurrent Neural Network

Describe the two main pre-training tasks of BERT and why they are important.

Explain why BERT's bidirectional approach helps it understand language better than previous models.

Practice

1. What are the two main tasks used during BERT pre-training?

easy

A. Text Classification and Named Entity Recognition

B. Masked Language Model and Next Sentence Prediction

C. Part-of-Speech Tagging and Dependency Parsing

D. Sentiment Analysis and Machine Translation

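Solution

Step 1: Understand BERT pre-training tasks
BERT is pre-trained with two tasks: Masked Language Modeling (MLM) and Next Sentence Prediction (NSP).

Step 2: Match tasks to options
Only option B lists both of them. The other options name downstream NLP tasks, not pre-training tasks.

Final Answer:
Option B: Masked Language Model and Next Sentence Prediction

Quick Check:
BERT pre-training = MLM + NSP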
