Prompt Engineering / GenAI · ~20 mins

Streaming responses to users in Prompt Engineering / GenAI - Practice Problems & Coding Challenges

Challenge - 5 Problems
🎖️
Streaming Response Master
Get all challenges correct to earn this badge!
Test your skills under time pressure!
🧠 Conceptual · intermediate
Why use streaming responses in AI chat applications?

Which of the following best explains why streaming responses are used in AI chat applications?

A. To send parts of the response as they are generated, reducing wait time and improving user experience.
B. To store the response on the server for later retrieval by the user.
C. To send the entire response only after the AI finishes generating it, ensuring completeness.
D. To compress the response data to save bandwidth during transmission.
💡 Hint

Think about how users feel when they wait for a long answer to appear all at once.
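To build intuition for this one, here is a small sketch (my own illustration, not part of the challenge) that simulates delivering a reply chunk by chunk and measures when the first chunk becomes visible versus when the whole reply is done:

```python
import time

def stream_tokens(tokens, delay=0.05):
    """Yield tokens one at a time, simulating per-token generation latency."""
    for tok in tokens:
        time.sleep(delay)
        yield tok

tokens = ["Streaming", " lets", " users", " read", " as", " text", " arrives."]

start = time.time()
first_token_at = None
parts = []
for tok in stream_tokens(tokens):
    if first_token_at is None:
        # With streaming, something is on screen after ~one token's delay.
        first_token_at = time.time() - start
    parts.append(tok)
# Without streaming, nothing would appear until this point.
total = time.time() - start

print(f"first token after {first_token_at:.2f}s, full reply after {total:.2f}s")
```

The gap between those two timestamps is exactly the wait a non-streaming UI would impose before showing anything.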

Predict Output · intermediate
Output of streaming response simulation code

What will be printed by the following Python code simulating streaming AI responses?

import time
responses = ['Hello', ', ', 'how ', 'can ', 'I ', 'help ', 'you?']
for part in responses:
    print(part, end='', flush=True)
    time.sleep(0.1)
print('\nDone')
A. Hello, how can I help you?\nDone
B. Hello\n, \nhow \ncan \nI \nhelp \nyou?\nDone
C. Hello, how can I help you? Done
D. Hello, how can I help you?
💡 Hint

Look at how print uses end='' and flush=True.
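If the keyword arguments are unfamiliar, this side experiment (separate from the problem above) captures `print`'s behavior in a string buffer so you can inspect exactly what `end=''` does to the output:

```python
import io

# end='' suppresses print's trailing newline, so successive calls join
# on one line; flush=True forces each chunk out of the buffer immediately
# (it matters for real terminals, not for an in-memory StringIO).
buf = io.StringIO()
for part in ['Hello', ', ', 'world']:
    print(part, end='', file=buf, flush=True)
print(file=buf)           # a bare print() emits just the newline
print('Done', file=buf)

print(repr(buf.getvalue()))
```

`repr` makes the newlines visible, which is the key to telling the answer options apart.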

Model Choice · advanced
Choosing a model architecture for streaming text generation

Which model architecture is best suited for generating streaming text responses token-by-token in real time?

A. Autoencoder trained for data compression
B. Convolutional Neural Network (CNN) trained for image classification
C. Recurrent Neural Network (RNN) or Transformer decoder that generates tokens sequentially
D. Feedforward Neural Network with fixed-size input and output
💡 Hint

Think about models that generate sequences one piece at a time.
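The shape of sequential generation can be seen in a toy autoregressive loop. This sketch uses a hand-written bigram table as a stand-in for a learned RNN or Transformer decoder; the point is only the loop structure, where each token is conditioned on the previous one and is available for streaming the moment it is produced:

```python
# Toy "model": a bigram lookup table instead of learned weights.
bigram = {
    '<s>': 'how', 'how': 'can', 'can': 'I', 'I': 'help', 'help': 'you?',
}

def generate(start='<s>', max_tokens=10):
    token, out = start, []
    while token in bigram and len(out) < max_tokens:
        token = bigram[token]   # next token depends only on what came before
        out.append(token)       # each token can be streamed immediately
    return out

print(' '.join(generate()))  # → how can I help you?
```

A fixed-size feedforward network has no such loop: it maps one input to one output in a single shot, which is why it cannot stream token by token.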

Metrics · advanced
Evaluating streaming response quality

Which metric is most appropriate to evaluate the quality of streaming text responses from an AI model?

A. Silhouette score for clustering quality
B. Confusion matrix of classification labels
C. Mean Squared Error (MSE) between predicted and true token embeddings
D. BLEU score comparing generated tokens to reference text
💡 Hint

Consider metrics that compare generated text to expected text.
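For a feel of how text-overlap metrics work, here is a deliberately simplified, unigram-only flavor of BLEU's modified precision (real BLEU adds higher-order n-grams and a brevity penalty; this helper and its name are mine, written for illustration):

```python
from collections import Counter

def unigram_precision(candidate, reference):
    """Fraction of candidate tokens that also appear in the reference,
    with each token's count clipped by its count in the reference."""
    cand, ref = Counter(candidate.split()), Counter(reference.split())
    overlap = sum(min(n, ref[w]) for w, n in cand.items())
    return overlap / max(sum(cand.values()), 1)

score = unigram_precision('how can I help', 'how can I help you')
print(score)  # every candidate token appears in the reference → 1.0
```

Clustering scores and confusion matrices have no notion of "generated text versus reference text", which is the comparison this metric family is built around.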

🔧 Debug · expert
Identifying the cause of delayed streaming output

An AI chat app uses streaming to send tokens as they are generated. However, users report that the entire response appears only after a long delay. Which is the most likely cause?

A. The model generates tokens in parallel and streams them immediately.
B. The model generates tokens one by one but the server buffers output and sends it only after completion.
C. The client displays tokens as soon as they arrive from the server.
D. The network connection is very fast and stable.
💡 Hint

Think about where buffering might happen in the streaming pipeline.
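The buffering failure mode is easy to reproduce in miniature. This sketch (my own, using plain generators rather than any particular web framework) contrasts a server that forwards each chunk as it exists with one that joins everything first, so the client's first byte arrives only after generation completes:

```python
def generate_chunks():
    """Stand-in for a model emitting tokens one at a time."""
    for chunk in ['Hel', 'lo ', 'wor', 'ld']:
        yield chunk

def streaming_server():
    for chunk in generate_chunks():
        yield chunk                      # forwarded as soon as it exists

def buffered_server():
    body = ''.join(generate_chunks())    # waits for the full response
    yield body                           # one big chunk at the very end

streamed = list(streaming_server())
buffered = list(buffered_server())
print(len(streamed), 'chunks vs', len(buffered), 'chunk')  # → 4 chunks vs 1 chunk
```

In a real stack the same collapse can happen in a framework response buffer, a reverse proxy, or gzip middleware, so the fix is to find and disable whichever layer is accumulating the output.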