NLP - Sequence Models for NLP
Find the bug in this GRU implementation snippet:
```python
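import torch
import torch.nn as nn

gru = nn.GRU(input_size=30, hidden_size=50, batch_first=True)
input_tensor = torch.randn(4, 10, 30)  # (batch, seq_len, input_size)
hidden = torch.randn(1, 4, 50)  # (num_layers, batch, hidden_size)
output, hidden = gru(input_tensor, hidden)
print(output.shape)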
```
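
When checking a snippet like this for shape bugs, it can help to write down the shapes that the PyTorch documentation gives for `nn.GRU` inputs and outputs. The helper below is a hypothetical, torch-free sketch: the name `expected_gru_shapes` and its parameters are assumptions made for illustration and are not part of PyTorch. It assumes the documented convention that `batch_first` affects only the input and output tensors, while the hidden state is always `(num_directions * num_layers, batch, hidden_size)`.

```python
def expected_gru_shapes(batch, seq_len, input_size, hidden_size,
                        num_layers=1, bidirectional=False, batch_first=False):
    """Return the shapes nn.GRU is documented to expect and produce.

    Hypothetical helper for illustration; it does not import or call PyTorch.
    """
    d = 2 if bidirectional else 1  # number of directions
    if batch_first:
        input_shape = (batch, seq_len, input_size)
        output_shape = (batch, seq_len, d * hidden_size)
    else:
        input_shape = (seq_len, batch, input_size)
        output_shape = (seq_len, batch, d * hidden_size)
    # batch_first does not apply to the hidden state.
    hidden_shape = (d * num_layers, batch, hidden_size)
    return {"input": input_shape, "h_0": hidden_shape,
            "output": output_shape, "h_n": hidden_shape}


# Same settings as the snippet above.
shapes = expected_gru_shapes(batch=4, seq_len=10, input_size=30,
                             hidden_size=50, batch_first=True)
print(shapes)
```

For the snippet's settings this yields an input of `(4, 10, 30)`, a hidden state of `(1, 4, 50)`, and an output of `(4, 10, 50)`, which can be compared dimension by dimension against the tensors the snippet constructs.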
