Practice - 5 Tasks

Answer the questions below

1. Fill in the blank

easy

Complete the code to import the BERT tokenizer from the transformers library.

NLP

from transformers import [1]

A. Tokenizer

B. AutoTokenizer

C. BertTokenizer

D. BertModel

2. Fill in the blank

medium

Complete the code to load the pretrained BERT tokenizer for 'bert-base-uncased'.

NLP

tokenizer = BertTokenizer.[1]('bert-base-uncased')

A. load

B. from_pretrained

C. init

D. tokenize

3. Fill in the blank

hard

Complete the code to split the sentence into tokens using the tokenizer.

NLP

tokens = tokenizer.[1]('Hello, how are you?')

A. tokenize

B. split

C. parse

D. encode

4. Fill in the blank

hard

Fill both blanks to encode the input text into a dictionary of token ids and attention mask, then read the token ids from it.

NLP

encoded_input = tokenizer('[1]', return_tensors='pt', padding=True, truncation=True)
input_ids = encoded_input['[2]']

A. Hello, how are you?

B. input_ids

C. attention_mask

D. tokens

5. Fill in the blank

hard

Fill all three blanks to decode token ids back to the original text without special tokens.

NLP

decoded_text = tokenizer.[1](encoded_input['[2]'][0], skip_special_tokens=[3])

A. decode

B. input_ids

C. True

D. False

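The quantities that Tasks 4 and 5 work with (token ids, an attention mask, and decoding with special tokens skipped) can be illustrated without downloading a model. The sketch below is a toy stand-in written in plain Python, not the transformers implementation: `VOCAB`, `toy_encode`, and `toy_decode` are hypothetical names, the ids are made up, and real BERT ids and whitespace handling will differ.

```python
# Toy stand-in for a tokenizer's encode/decode round trip.
# VOCAB, toy_encode and toy_decode are hypothetical names, not library APIs.
VOCAB = ["[PAD]", "[CLS]", "[SEP]", "[UNK]", "hello", ",", "how", "are", "you", "?"]
TOKEN_TO_ID = {token: idx for idx, token in enumerate(VOCAB)}
SPECIAL_TOKENS = {"[PAD]", "[CLS]", "[SEP]"}


def toy_encode(tokens, max_len):
    """Wrap tokens in [CLS]/[SEP], map them to ids, and pad to max_len."""
    unk_id = TOKEN_TO_ID["[UNK]"]
    ids = [TOKEN_TO_ID["[CLS]"]]
    ids += [TOKEN_TO_ID.get(token, unk_id) for token in tokens]
    ids.append(TOKEN_TO_ID["[SEP]"])
    padding = max(0, max_len - len(ids))
    return {
        "input_ids": ids + [TOKEN_TO_ID["[PAD]"]] * padding,
        # 1 marks a real token, 0 marks padding the model should ignore.
        "attention_mask": [1] * len(ids) + [0] * padding,
    }


def toy_decode(ids, skip_special_tokens=True):
    """Map ids back to tokens, optionally dropping special tokens."""
    tokens = [VOCAB[i] for i in ids]
    if skip_special_tokens:
        tokens = [t for t in tokens if t not in SPECIAL_TOKENS]
    return " ".join(tokens)


encoded = toy_encode(["hello", ",", "how", "are", "you", "?"], max_len=10)
print(encoded["input_ids"])       # [1, 4, 5, 6, 7, 8, 9, 2, 0, 0]
print(encoded["attention_mask"])  # [1, 1, 1, 1, 1, 1, 1, 1, 0, 0]
print(toy_decode(encoded["input_ids"]))  # hello , how are you ?
```

The attention mask has the same length as the id list, which is why padded positions can be ignored later without changing the shape of the input.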

Practice

(1/5)

1. What is the main purpose of BERT's WordPiece tokenization?

easy

A. To split words into smaller known pieces for better handling of unknown words

B. To translate text into another language

C. To remove stop words from sentences

D. To convert text into numerical vectors directly

BERT tokenization (WordPiece) in NLP - Interactive Code Practice


Solution

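Step 1: Understand WordPiece tokenization

WordPiece breaks a word that is not in the vocabulary into smaller subword pieces that are in the vocabulary.

Step 2: Identify the purpose of this splitting

Because rare or unseen words can be assembled from known pieces, the model can still represent them instead of falling back to a single unknown token.

Final Answer:

To split words into smaller known pieces for better handling of unknown words (Option A)

Quick Check:

WordPiece = subword splitting that copes with unknown words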
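To make the subword splitting from question 1 concrete, here is a minimal sketch of greedy longest-match-first splitting, the strategy usually described for WordPiece. It is an illustration under stated assumptions and not BERT's actual tokenizer: `TOY_VOCAB` and `wordpiece_split` are hypothetical names, the vocabulary is invented for the example, and input is assumed to be a single lowercase word.

```python
# Toy vocabulary: hypothetical, chosen only for illustration.
TOY_VOCAB = {"[UNK]", "play", "##ing", "un", "##break", "##able", "foot", "##ball"}


def wordpiece_split(word, vocab=TOY_VOCAB, unk="[UNK]"):
    """Greedy longest-match-first split of one lowercase word."""
    pieces = []
    start = 0
    while start < len(word):
        end = len(word)
        match = None
        # Shrink the candidate from the right until it is in the vocabulary.
        while start < end:
            candidate = word[start:end]
            if start > 0:
                # Pieces that continue a word carry a "##" prefix.
                candidate = "##" + candidate
            if candidate in vocab:
                match = candidate
                break
            end -= 1
        if match is None:
            # No piece fits here, so the whole word maps to the unknown token.
            return [unk]
        pieces.append(match)
        start = end
    return pieces


print(wordpiece_split("playing"))      # ['play', '##ing']
print(wordpiece_split("unbreakable"))  # ['un', '##break', '##able']
print(wordpiece_split("zebra"))        # ['[UNK]']
```

With this toy vocabulary, "unbreakable" is never stored as a whole word, yet it is still represented exactly by three known pieces. Only a word with no usable pieces at all, such as "zebra" here, falls back to `[UNK]`.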

Solution

Step 1: Understand WordPiece token format

Step 2: Analyze the options

Final Answer:

Quick Check:

Solution

Step 1: Tokenize 'Playing'

Step 2: Tokenize 'football'

Step 3: Check remaining words

Final Answer:

Quick Check:

Solution

Step 1: Check token continuation rules

Step 2: Analyze given tokens

Final Answer:

Quick Check:

Solution

Step 1: Understand unknown word handling

Step 2: Analyze 'unbreakable'

Step 3: Check other tokens

Final Answer:

Quick Check: