Prompt Engineering / GenAI · ~20 mins

Audio transcription (Whisper) in Prompt Engineering / GenAI - Practice Problems & Coding Challenges

Challenge - 5 Problems
🎖️ Badge: Whisper Transcription Master (awarded for answering all five challenges correctly)
Problem 1: Predict Output (intermediate)
Output of Whisper transcription code snippet
What is the output of the following Python code that uses Whisper to transcribe a short audio file?
import whisper
model = whisper.load_model('small')
result = model.transcribe('audio_sample.wav')
print(result['text'])
A) "[Noise] Unintelligible speech detected."
B) "Hello, this is a test audio for transcription."
C) SyntaxError: invalid syntax
D) FileNotFoundError: [Errno 2] No such file or directory: 'audio_sample.wav'
💡 Hint
The model transcribes the audio content into text stored in the 'text' key of the result dictionary.
Problem 2: Model Choice (intermediate)
Choosing the right Whisper model for fast transcription
You want to transcribe a large number of short audio clips quickly with reasonable accuracy. Which Whisper model should you choose?
A) tiny
B) large-v2
C) medium
D) large
💡 Hint
Smaller models run faster but are less accurate; pick the smallest model whose accuracy is acceptable for your task.
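The speed/accuracy trade-off can be sketched with the approximate figures from the openai/whisper README (parameter counts and relative decoding speed vs. the large model); actual throughput depends on your hardware, so treat these numbers as rough guides, not benchmarks:

```python
# Approximate figures from the openai/whisper README:
# name: (parameters in millions, approx. relative speed vs. 'large')
MODELS = {
    "tiny":   (39,   32),
    "base":   (74,   16),
    "small":  (244,  6),
    "medium": (769,  2),
    "large":  (1550, 1),
}

def fastest_model(models=MODELS):
    """Return the model name with the highest relative speed."""
    return max(models, key=lambda name: models[name][1])

print(fastest_model())  # tiny
```

For bulk transcription of short clips, 'tiny' (or 'base') maximizes throughput at a tolerable accuracy cost.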
Problem 3: Hyperparameter (advanced)
Effect of temperature parameter in Whisper transcription
In Whisper's transcribe method, what is the effect of increasing the 'temperature' parameter from 0.0 to 1.0?
A) Decreases transcription speed significantly
B) Improves transcription accuracy by reducing errors
C) Increases randomness in transcription, possibly producing more diverse but less stable text
D) Switches the model to a different language automatically
💡 Hint
Temperature controls randomness in text generation.
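The effect of temperature can be demonstrated without Whisper at all: it rescales token scores before softmax sampling. This self-contained sketch (toy logits, not real Whisper decoder output) shows that temperature 0 behaves greedily while temperature 1.0 spreads samples across tokens:

```python
import math
import random

def sample_with_temperature(logits, temperature, rng):
    """Sample an index from softmax(logits / temperature).

    temperature 0 means greedy decoding (always the argmax);
    higher values flatten the distribution and increase randomness,
    which is the same knob Whisper's decoder exposes.
    """
    if temperature <= 0:
        return max(range(len(logits)), key=lambda i: logits[i])
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    weights = [e / total for e in exps]
    return rng.choices(range(len(logits)), weights=weights, k=1)[0]

rng = random.Random(0)
logits = [2.0, 1.0, 0.5]  # toy token scores; index 0 is the "best" token
greedy = {sample_with_temperature(logits, 0.0, rng) for _ in range(100)}
hot = {sample_with_temperature(logits, 1.0, rng) for _ in range(100)}
print(sorted(greedy))  # [0] -- greedy always picks the argmax
print(len(hot) > 1)    # higher temperature visits multiple tokens
```

This is why raising temperature from 0.0 to 1.0 yields more diverse but less stable transcriptions (answer C).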
Problem 4: Metrics (advanced)
Evaluating Whisper transcription quality
Which metric is most appropriate to measure the accuracy of Whisper's transcriptions compared to ground truth text?
A) Word Error Rate (WER)
B) Mean Squared Error (MSE)
C) Accuracy Score
D) F1 Score
💡 Hint
This metric counts word-level differences between predicted and true text.
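WER is the word-level edit distance (substitutions + deletions + insertions) divided by the number of reference words. A minimal implementation, using standard Levenshtein dynamic programming (libraries like jiwer do this for you in practice):

```python
def wer(reference, hypothesis):
    """Word Error Rate: word-level Levenshtein distance / reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    # dp[i][j] = edit distance between ref[:i] and hyp[:j]
    dp = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        dp[i][0] = i
    for j in range(len(hyp) + 1):
        dp[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j] + 1,         # deletion
                           dp[i][j - 1] + 1,         # insertion
                           dp[i - 1][j - 1] + cost)  # substitution / match
    return dp[len(ref)][len(hyp)] / max(len(ref), 1)

print(wer("hello this is a test", "hello this is test"))  # 0.2 (1 deletion / 5 words)
```

Unlike MSE, accuracy, or F1, WER directly counts word-level transcription mistakes against the ground truth, which is why it is the standard metric for speech recognition.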
Problem 5: 🔧 Debug (expert)
Debugging a Whisper transcription error
You run the following code but get a RuntimeError: CUDA out of memory. What is the best way to fix this error?
import whisper
model = whisper.load_model('large')
result = model.transcribe('long_audio.wav')
print(result['text'])
A) Run the code without a GPU by setting device='cpu' in load_model()
B) Increase the batch size to process more audio at once
C) Use a higher temperature value in transcribe()
D) Switch to a smaller model like 'medium' or 'small' to reduce GPU memory usage
💡 Hint
Large models use more GPU memory; smaller models use less.
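One practical pattern for this failure is to try models largest-first and fall back on out-of-memory errors. A sketch of that logic, with the loader injected as a parameter so it can run without a GPU: in real use `load_fn` would be `whisper.load_model`, and `fake_load` below is a hypothetical stand-in that simulates a GPU too small for 'large' and 'medium':

```python
def load_with_fallback(load_fn, candidates=("large", "medium", "small", "base", "tiny")):
    """Try models largest-first; on an out-of-memory RuntimeError,
    fall back to the next smaller model."""
    for name in candidates:
        try:
            return name, load_fn(name)
        except RuntimeError as err:
            if "out of memory" not in str(err).lower():
                raise  # unrelated error: don't mask it
    raise RuntimeError("no model fits in available memory")

# Simulated loader: pretend only 'small' and below fit on the GPU.
def fake_load(name):
    if name in ("large", "medium"):
        raise RuntimeError("CUDA out of memory")
    return f"<model {name}>"

name, model = load_with_fallback(fake_load)
print(name)  # small
```

Moving to CPU (option A) also avoids the error but is far slower; dropping to a smaller model (option D) keeps GPU speed while fitting in memory.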