Consider a bidirectional RNN layer in TensorFlow with units=32 and return_sequences=True. If the input shape is (batch_size, timesteps, features), what will be the output shape of this layer?
Remember that a bidirectional RNN concatenates outputs from forward and backward passes.
A bidirectional RNN with 32 units runs two RNNs (forward and backward), each producing 32 outputs per timestep. These outputs are concatenated, giving 64 features per timestep. Since return_sequences=True, the timesteps dimension is kept, so the output shape is (batch_size, timesteps, 64).
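A minimal sketch that checks this shape directly (the batch size of 2, 5 timesteps, and 10 features are arbitrary example values):

```python
import numpy as np
import tensorflow as tf

# Example input: batch of 2 sequences, 5 timesteps, 10 features each
inputs = tf.constant(np.random.random((2, 5, 10)), dtype=tf.float32)

# Bidirectional wrapper around a 32-unit RNN; return_sequences=True
# keeps the timesteps dimension in the output
layer = tf.keras.layers.Bidirectional(
    tf.keras.layers.SimpleRNN(32, return_sequences=True)
)
output = layer(inputs)

# Forward and backward outputs (32 each) are concatenated -> 64 features
print(output.shape)  # (2, 5, 64)
```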
What is the output of the following TensorFlow code snippet?
```python
import tensorflow as tf
import numpy as np

inputs = tf.constant(np.random.random((1, 5, 10)), dtype=tf.float32)
layer = tf.keras.layers.Bidirectional(
    tf.keras.layers.LSTM(4, return_sequences=False)
)
output = layer(inputs)
print(output.shape)
```
Check the return_sequences parameter and how bidirectional layers concatenate outputs.
The LSTM has 4 units, and with return_sequences=False it returns only the last timestep's output. The bidirectional wrapper concatenates the forward and backward outputs, doubling the feature size to 8. With a batch size of 1, the printed shape is (1, 8).
You want to build a model to tag each word in a sentence with its part of speech. Which bidirectional RNN layer configuration is best suited for this task?
Think about whether you need output for each timestep or just one output per sequence.
For sequence tagging, you need an output for every word (timestep). Using return_sequences=True ensures the model outputs a vector for each timestep. Bidirectional GRU is efficient and suitable. Options with return_sequences=False output only one vector per sequence, which is not suitable for tagging each word.
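A sketch of such a tagger, assuming hypothetical sizes (vocabulary of 1000 words, 20 POS tags, sentences padded to 30 tokens):

```python
import tensorflow as tf

# Hypothetical sizes for illustration
vocab_size, num_tags, max_len = 1000, 20, 30

inputs = tf.keras.Input(shape=(max_len,))
x = tf.keras.layers.Embedding(vocab_size, 64)(inputs)
# return_sequences=True: one output vector per word, as tagging requires
x = tf.keras.layers.Bidirectional(
    tf.keras.layers.GRU(32, return_sequences=True)
)(x)
# Dense is applied to the last axis, scoring each tag for each word
outputs = tf.keras.layers.Dense(num_tags, activation='softmax')(x)
model = tf.keras.Model(inputs, outputs)

print(model.output_shape)  # (None, 30, 20): one tag distribution per word
```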
What is the most likely effect of doubling the number of units in a bidirectional RNN layer on model training?
More units mean more parameters to learn.
Doubling units increases the number of parameters, which increases model capacity and training time. This can improve accuracy if the model was underfitting, but may also risk overfitting.
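The growth in parameter count can be measured directly. A sketch, assuming a hypothetical helper and arbitrary input sizes (8 features, 10 timesteps):

```python
import tensorflow as tf

def bi_lstm_params(units, features=8, timesteps=10):
    """Parameter count of a single bidirectional LSTM layer."""
    inputs = tf.keras.Input(shape=(timesteps, features))
    x = tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(units))(inputs)
    return tf.keras.Model(inputs, x).count_params()

# Each direction has 4 * (units * (features + units) + units) weights,
# so doubling units roughly quadruples the recurrent weight matrices
print(bi_lstm_params(32))  # 10496
print(bi_lstm_params(64))  # 37376
```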
Given the following code, what is the cause of the error?
```python
import tensorflow as tf

inputs = tf.keras.Input(shape=(10, 8))
# Bidirectional LSTM with 16 units
x = tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(16))(inputs)
outputs = tf.keras.layers.Dense(1)(x)
model = tf.keras.Model(inputs, outputs)
model.compile(optimizer='adam', loss='mse')
# Trying to train with input shape (32, 10, 8) and target shape (32, 10, 1)
x_train = tf.random.normal((32, 10, 8))
y_train = tf.random.normal((32, 10, 1))
model.fit(x_train, y_train, epochs=1)
```
Check if the model output shape matches the target shape.
The bidirectional LSTM without return_sequences=True returns only the last timestep's output, of shape (batch_size, 32) (16 units x 2 directions). The target has shape (batch_size, 10, 1), which expects an output for each timestep, so the loss cannot be computed. Setting return_sequences=True makes the LSTM output a full sequence whose shape matches the target.
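A corrected sketch of the snippet with return_sequences=True applied:

```python
import tensorflow as tf

inputs = tf.keras.Input(shape=(10, 8))
# return_sequences=True -> output shape (batch, 10, 32), one vector per timestep
x = tf.keras.layers.Bidirectional(
    tf.keras.layers.LSTM(16, return_sequences=True)
)(inputs)
# Dense(1) acts on the last axis, giving (batch, 10, 1) to match y_train
outputs = tf.keras.layers.Dense(1)(x)
model = tf.keras.Model(inputs, outputs)
model.compile(optimizer='adam', loss='mse')

x_train = tf.random.normal((32, 10, 8))
y_train = tf.random.normal((32, 10, 1))
model.fit(x_train, y_train, epochs=1, verbose=0)  # now trains without error
print(model.output_shape)  # (None, 10, 1)
```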