Practice - 5 Tasks

Answer the questions below

1. Fill in the blank

easy

Complete the code to load text data for a multimodal model.

Prompt Engineering / GenAI

text_data = load_[1]('data/text_samples.txt')

A. text

B. audio

C. video

D. image

2. Fill in the blank

medium

Complete the code to combine image and audio features for multimodal input.

combined_features = concatenate([image_features, [1]_features])

A. text

B. video

C. audio

D. sensor
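
The combination step in this task can be sketched in plain Python. The quiz does not say which library `concatenate` comes from, so the helper below is a hypothetical stand-in that joins 1-D feature vectors end to end, and the feature values are made up for illustration. Array libraries offer the same operation, for example `numpy.concatenate`.

```python
from itertools import chain


def concatenate(feature_vectors):
    """Join several 1-D feature vectors end to end (hypothetical stand-in)."""
    return list(chain.from_iterable(feature_vectors))


# Made-up feature vectors; real ones would come from modality-specific encoders.
image_features = [0.2, 0.7, 0.1]
audio_features = [0.9, 0.4]

# The combined vector keeps the image values first, then the audio values.
combined_features = concatenate([image_features, audio_features])
print(len(combined_features))  # 5
```

The combined length is the sum of the two input lengths, so a downstream model sees one vector whose first part describes the image and whose second part describes the audio.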

3. Fill in the blank

hard

Fix the error in the code that processes multimodal inputs by selecting the correct modality.

if modality == '[1]':
    process_text(data)
elif modality == 'image':
    process_image(data)
else:
    process_audio(data)

A. video

B. audio

C. sensor

D. text

4. Fill in the blank

hard

Fill both blanks to create a dictionary mapping modalities to their processing functions.

processors = {
    'text': [1],
    'image': [2]
}

A. process_text

B. process_audio

C. process_image

D. process_video
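
A dictionary of processing functions is usually used as a dispatch table: look up the function by modality name, then call it. The sketch below shows one way that might look. The bodies of `process_text` and `process_image` and the `process` helper are hypothetical placeholders, since the quiz only names the functions.

```python
def process_text(data):
    """Placeholder text handler: lowercases the input string."""
    return f"text:{data.lower()}"


def process_image(data):
    """Placeholder image handler: reports the size of the raw bytes."""
    return f"image:{len(data)} bytes"


# Map each modality name to the function object (no parentheses: not called yet).
processors = {
    "text": process_text,
    "image": process_image,
}


def process(modality, data):
    """Look up the handler for a modality and apply it to the data."""
    try:
        handler = processors[modality]
    except KeyError:
        raise ValueError(f"unsupported modality: {modality!r}") from None
    return handler(data)


print(process("text", "Hello"))  # text:hello
```

Compared with an if/elif chain, adding a modality here means adding one dictionary entry, and an unknown modality raises an error instead of silently falling through to a default branch.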

5. Fill in the blank

hard

Fill all three blanks to create a multimodal input pipeline combining text, image, and audio.

inputs = {
    'text': [1],
    'image': [2],
    'audio': [3]
}
combined = combine_features([inputs['text'], inputs['image'], inputs['audio']])

A. text_features

B. image_features

C. audio_features

D. video_features
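
One detail of a pipeline like this is that the order of the list passed to the combiner fixes where each modality sits in the result. The sketch below illustrates that under stated assumptions: `combine_features` is a hypothetical combiner that flattens the vectors in order, and the feature values are invented.

```python
def combine_features(feature_list):
    """Hypothetical combiner: flattens per-modality vectors in the given order."""
    combined = []
    for features in feature_list:
        combined.extend(features)
    return combined


# Invented feature vectors of different lengths, one per modality.
text_features = [0.1, 0.2]
image_features = [0.3, 0.4, 0.5]
audio_features = [0.6]

inputs = {
    "text": text_features,
    "image": image_features,
    "audio": audio_features,
}
combined = combine_features([inputs["text"], inputs["image"], inputs["audio"]])

# Record the (start, end) slice each modality occupies in the combined vector.
offsets = {}
start = 0
for name in ("text", "image", "audio"):
    end = start + len(inputs[name])
    offsets[name] = (start, end)
    start = end

print(offsets["image"])  # (2, 5)
```

Keeping the order fixed between training and inference matters, because a model that learned to read image values at positions 2 to 4 would misread the vector if the modalities were combined in a different order.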

Practice

(1/5)

1. Why do multimodal AI models combine text, images, and audio?

easy

A. To understand information better by using different types of data together

B. Because text alone is always enough for understanding

C. To make the model run faster without extra data

D. To avoid using any visual or sound information

Why multimodal combines text, image, and audio in Prompt Engineering / GenAI - Test Your Understanding

Solution

Step 1: Understand what multimodal means

Step 2: Why combine different data types?

Final Answer:

Quick Check:

Solution

Step 1: Define multimodal input

Step 2: Match the correct description

Final Answer:

Quick Check:

Solution

Step 1: Identify data types in the video

Step 2: Understand multimodal model behavior

Final Answer:

Quick Check:

Solution

Step 1: Analyze model output behavior

Step 2: Identify possible cause

Final Answer:

Quick Check:

Solution

Step 1: Understand the goal

Step 2: Choose best approach

Final Answer:

Quick Check: