Challenge - 5 Problems
Hugging Face Integration Master
Get all challenges correct to earn this badge!
Test your skills under time pressure!
❓ Predict Output
intermediate · 2:00 remaining
Output of Hugging Face tokenizer usage
What is the output of the following code snippet that uses a Hugging Face tokenizer to tokenize a sentence?
PyTorch
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained('bert-base-uncased')
inputs = tokenizer('Hello world!', return_tensors='pt')
print(inputs['input_ids'].tolist())
Attempts: 2 left
💡 Hint
Remember that BERT tokenizers add special tokens at the start and end.
✗ Incorrect
The BERT tokenizer adds a [CLS] token (101) at the start and a [SEP] token (102) at the end. 'Hello' and 'world' map to 7592 and 2088, and '!' to 999, so the printed output is [[101, 7592, 2088, 999, 102]].
❓ Model Choice
intermediate · 2:00 remaining
Choosing the correct Hugging Face model for text classification
You want to perform sentiment analysis on movie reviews using Hugging Face. Which model is best suited for this task?
Attempts: 2 left
💡 Hint
Look for a model fine-tuned for sentiment analysis.
✗ Incorrect
Option A is a DistilBERT model fine-tuned on the SST-2 sentiment dataset, which makes it the best choice for sentiment analysis of movie reviews.
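A minimal sketch of loading such a model with the transformers pipeline API. The option text is not shown above, so the checkpoint name below (distilbert-base-uncased-finetuned-sst-2-english, the widely used SST-2 DistilBERT) is an assumption, not a quote from the quiz:

```python
from transformers import pipeline

# Assumed checkpoint: DistilBERT fine-tuned on SST-2 for sentiment analysis.
classifier = pipeline(
    "sentiment-analysis",
    model="distilbert-base-uncased-finetuned-sst-2-english",
)

result = classifier("A gripping film with outstanding performances.")[0]
print(result["label"], round(result["score"], 3))  # label is POSITIVE or NEGATIVE
```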
❓ Hyperparameter
advanced · 2:00 remaining
Effect of batch size on Hugging Face model training
When fine-tuning a Hugging Face transformer model, what is the main effect of increasing the batch size?
Attempts: 2 left
💡 Hint
Think about computational resources and speed.
✗ Incorrect
Increasing the batch size processes more samples per step, which speeds up training but uses more memory. It does not guarantee better accuracy, nor does it change the learning rate automatically.
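The tradeoff can be illustrated with plain arithmetic (pure Python, no training involved): a larger batch means fewer optimizer steps per epoch, while per-step memory grows roughly in proportion to the batch size. The dataset size below is hypothetical:

```python
import math

def steps_per_epoch(num_samples: int, batch_size: int) -> int:
    """Number of optimizer steps needed to see every sample once."""
    return math.ceil(num_samples / batch_size)

num_samples = 50_000  # hypothetical dataset size
for batch_size in (16, 64):
    print(batch_size, steps_per_epoch(num_samples, batch_size))
# 16 -> 3125 steps per epoch; 64 -> 782 steps per epoch (fewer, larger steps).
```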
❓ Metrics
advanced · 2:00 remaining
Correct metric for evaluating Hugging Face text generation
Which metric is most appropriate to evaluate the quality of text generated by a Hugging Face language model?
Attempts: 2 left
💡 Hint
Consider metrics used in machine translation and text generation.
✗ Incorrect
The BLEU score measures how closely generated text matches reference text and is a common metric for evaluating text generation quality.
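As a rough illustration of what BLEU computes, here is a minimal sentence-level sketch: clipped n-gram precision up to bigrams combined with a brevity penalty. Real evaluations use a full implementation such as sacrebleu; this is only a teaching sketch:

```python
import math
from collections import Counter

def ngrams(tokens, n):
    """All contiguous n-grams of a token list, as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def modified_precision(candidate, reference, n):
    """Fraction of candidate n-grams found in the reference, with counts clipped."""
    cand = Counter(ngrams(candidate, n))
    ref = Counter(ngrams(reference, n))
    clipped = sum(min(count, ref[gram]) for gram, count in cand.items())
    total = sum(cand.values())
    return clipped / total if total else 0.0

def bleu(candidate, reference, max_n=2):
    """Geometric mean of n-gram precisions, scaled by a brevity penalty."""
    precisions = [modified_precision(candidate, reference, n)
                  for n in range(1, max_n + 1)]
    if min(precisions) == 0.0:
        return 0.0
    geo_mean = math.exp(sum(math.log(p) for p in precisions) / max_n)
    # Penalize candidates shorter than the reference.
    brevity_penalty = min(1.0, math.exp(1 - len(reference) / len(candidate)))
    return brevity_penalty * geo_mean
```

An identical candidate scores 1.0, while a 3-token prefix of a 6-token reference keeps perfect precisions but is scaled down to exp(-1) ≈ 0.368 by the brevity penalty.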
🔧 Debug
expert · 2:00 remaining
Identifying error in Hugging Face model loading code
What error will this code raise when trying to load a Hugging Face model and tokenizer?
PyTorch
from transformers import AutoModel, AutoTokenizer

model = AutoModel.from_pretrained('bert-base-uncased')
tokenizer = AutoTokenizer.from_pretrained('bert-base-uncased')
inputs = tokenizer('Test input', return_tensors='pt')
outputs = model(inputs)
Attempts: 2 left
💡 Hint
Check how the model expects inputs.
✗ Incorrect
The model expects input tensors as keyword arguments (input_ids, attention_mask, etc.), but here the entire tokenizer output is passed as a single positional argument, causing a TypeError. The fix is to unpack the dict: outputs = model(**inputs).
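The binding issue can be shown with a plain-Python stand-in for the model's forward signature (hypothetical, no transformers needed; the real model takes tensors, but the keyword-unpacking fix is the same):

```python
def forward(input_ids=None, attention_mask=None):
    """Stand-in for a transformers forward(): expects keyword arguments."""
    if not isinstance(input_ids, list):
        raise TypeError(
            f"input_ids should be a list of ids, got {type(input_ids).__name__}"
        )
    return len(input_ids)

inputs = {"input_ids": [101, 7592, 102], "attention_mask": [1, 1, 1]}

print(forward(**inputs))   # correct: each entry binds to its keyword parameter
try:
    forward(inputs)        # the bug: the whole dict binds to input_ids
except TypeError as exc:
    print("TypeError:", exc)
```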