Challenge - 5 Problems
Bidirectional LSTM Master
Get all challenges correct to earn this badge!
Test your skills under time pressure!
❓ Predict Output
Difficulty: intermediate
Output shape of Bidirectional LSTM layer
Consider the following Keras code snippet that creates a Bidirectional LSTM layer. What is the shape of the output tensor after passing an input batch of shape (32, 10, 8) through this layer?
from tensorflow.keras.layers import Bidirectional, LSTM
from tensorflow.keras.models import Sequential

model = Sequential()
model.add(Bidirectional(LSTM(16, return_sequences=False), input_shape=(10, 8)))
output = model.layers[0].output_shape
💡 Hint
Remember that Bidirectional doubles the units of the LSTM output when return_sequences=False.
✓ Explanation
The LSTM has 16 units. Bidirectional wraps it and concatenates forward and backward outputs, doubling the units to 32. Since return_sequences=False, output shape is (batch_size, 32). Batch size is None (unspecified).
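The doubling can be sketched with plain NumPy (a minimal sketch; the zero arrays stand in for the final hidden states of the forward and backward LSTM passes):

```python
import numpy as np

batch, units = 32, 16
# Stand-ins for the final hidden state of each directional pass
h_forward = np.zeros((batch, units))
h_backward = np.zeros((batch, units))
# Bidirectional's default merge_mode='concat' joins them on the last axis
merged = np.concatenate([h_forward, h_backward], axis=-1)
print(merged.shape)  # (32, 32)
```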
❓ Model Choice
Difficulty: intermediate
Choosing Bidirectional LSTM for sequence classification
You want to build a model to classify movie reviews as positive or negative based on the text. Which model architecture below best uses a Bidirectional LSTM for this task?
💡 Hint
For classification, the LSTM should output a single vector per sample, not a sequence.
✓ Explanation
Option D uses a Bidirectional LSTM with return_sequences=False, producing one vector per input sequence, which is suitable for classification. Options A and C output sequences, which is not ideal here. Option B places a Dense layer before the LSTM, which is not standard.
❓ Hyperparameter
Difficulty: advanced
Effect of return_sequences in Bidirectional LSTM
What is the effect of setting return_sequences=True in a Bidirectional LSTM layer in Keras?
💡 Hint
Think about whether the output keeps the time dimension or not.
✓ Explanation
When return_sequences=True, the LSTM returns the hidden state for every time step, so the output shape is (batch_size, time_steps, units*2) for Bidirectional. If False, it returns only the last hidden state (batch_size, units*2).
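The two output shapes can be sketched with NumPy (a minimal sketch using the shapes from the earlier question, units=16 so the concatenated width is 32):

```python
import numpy as np

batch, time_steps, units = 32, 10, 16
# return_sequences=True keeps the time axis: one concatenated state per step
seq_output = np.zeros((batch, time_steps, 2 * units))
# return_sequences=False keeps only the last step's concatenated state
last_output = seq_output[:, -1, :]
print(seq_output.shape, last_output.shape)  # (32, 10, 32) (32, 32)
```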
❓ Metrics
Difficulty: advanced
Interpreting training metrics of Bidirectional LSTM
You train a Bidirectional LSTM model for sentiment analysis. After 10 epochs, training accuracy is 95% but validation accuracy is 60%. What does this indicate?
💡 Hint
High training accuracy but low validation accuracy usually means the model memorizes training data.
✓ Explanation
High training accuracy with low validation accuracy means the model learned the training data too well but fails to generalize to new data, a classic sign of overfitting.
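This diagnosis can be captured in a hypothetical helper (the function name and the 0.15 gap threshold are assumptions for illustration, not a standard API):

```python
# Hypothetical helper: flag overfitting from the train/validation accuracy gap.
def looks_overfit(train_acc, val_acc, gap_threshold=0.15):
    """True when training accuracy exceeds validation accuracy by more than the threshold."""
    return (train_acc - val_acc) > gap_threshold

print(looks_overfit(0.95, 0.60))  # True: a 35-point gap signals overfitting
```

Common remedies include dropout, early stopping, and more training data.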
🔧 Debug
Difficulty: expert
Debugging shape mismatch in Bidirectional LSTM model
You have this Keras model code snippet:
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Embedding, Bidirectional, LSTM, Dense

model = Sequential()
model.add(Embedding(input_dim=1000, output_dim=64, input_length=20))
model.add(Bidirectional(LSTM(32)))
model.add(Dense(10, activation='softmax'))
model.compile(optimizer='adam', loss='categorical_crossentropy')
# You try to train with labels of shape (batch_size, 20, 10) and get a shape error.
What is the cause of the shape error?
💡 Hint
Check the output shape of the Bidirectional LSTM and the shape of the labels.
✓ Explanation
The Bidirectional LSTM without return_sequences=True outputs (batch_size, 64), i.e. 32 units × 2 directions. The Dense layer then outputs (batch_size, 10). The labels, however, have shape (batch_size, 20, 10), one label per time step; this rank mismatch causes the error. To train per time step, set return_sequences=True and wrap the Dense layer in TimeDistributed; to classify whole sequences, supply labels of shape (batch_size, 10).
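The mismatch can be sketched with NumPy shapes alone (a minimal sketch; batch size 4 is an arbitrary assumption, and the zero arrays stand in for model output and labels):

```python
import numpy as np

batch, time_steps, num_classes = 4, 20, 10
# Model output with return_sequences=False: one prediction per sequence
model_output = np.zeros((batch, num_classes))        # rank 2: (4, 10)
# Labels supplied per time step
labels = np.zeros((batch, time_steps, num_classes))  # rank 3: (4, 20, 10)
print(model_output.shape == labels.shape)  # False: ranks differ, hence the error
# Fix A: per-sequence labels of shape (batch, num_classes)
# Fix B: return_sequences=True plus TimeDistributed(Dense(10)), so the model
#        output becomes (batch, time_steps, num_classes) and matches the labels
```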