Practice - 5 Tasks

Answer the questions below

1fill in blank

easy

Complete the code to load a GPT-4V vision-language model.

Prompt Engineering / GenAI

model = GPT4VModel.from_pretrained([1])

Drag options to blanks, or click blank then click option'

A"gpt4v-text"

B"gpt4v-base"

C"gpt4v-vision"

D"gpt4v-audio"

Attempts:

3 left

2fill in blank

medium

Complete the code to preprocess an image for GPT-4V input.

Prompt Engineering / GenAI

processed_image = processor.preprocess([1])

Drag options to blanks, or click blank then click option'

Araw_text

Bimage_path

Caudio_clip

Dvideo_frame

Attempts:

3 left

3fill in blank

hard

Fix the error in the code to generate a caption from an image using GPT-4V.

Prompt Engineering / GenAI

outputs = model.generate([1])

Drag options to blanks, or click blank then click option'

Aaudio_input

Braw_text

Cvideo_input

Dprocessed_image

Attempts:

3 left

4fill in blank

hard

Fill both blanks to create a dictionary of images and their features.

Prompt Engineering / GenAI

features = {img: [1] for img in images if img [2] None}

Drag options to blanks, or click blank then click option'

Amodel.extract_features(img)

Bmodel.generate_caption(img)

Cis not

D==

Attempts:

3 left

5fill in blank

hard

Fill all three blanks to filter images and generate captions for valid inputs.

Prompt Engineering / GenAI

captions = [model.generate_caption([1]) for [2] in images if [3] is not None]

Drag options to blanks, or click blank then click option'

Aimg

Bimage

Attempts:

3 left

Practice

(1/5)

1. What is the main capability of vision-language models like GPT-4V?

easy

A. They understand and generate responses based on both images and text.

B. They only process text data without images.

C. They only analyze images without any text understanding.

D. They translate languages without any image input.

Vision-language models (GPT-4V) in Prompt Engineering / GenAI - Interactive Code Practice

Start learning this pattern below

Practice

Solution

Step 1: Understand the model's input types

Step 2: Recognize the model's output capabilities

Final Answer:

Quick Check:

Solution

Step 1: Identify the prompt that asks for image description

Step 2: Eliminate unrelated commands

Final Answer:

Quick Check:

Solution

Step 1: Understand the prompt and image input

Step 2: Predict the model's response

Final Answer:

Quick Check:

Solution

Step 1: Check required inputs for vision-language query

Step 2: Identify missing argument

Final Answer:

Quick Check:

Solution

Step 1: Understand the task requirements

Step 2: Choose the prompt that requests object listing and counting

Step 3: Eliminate other options

Final Answer:

Quick Check: