Closed
According to the docs, nn.LSTM outputs:
output : A (seq_len x batch x hidden_size) tensor containing the output features (h_t) from the last layer of the RNN, for each t
h_n : A (num_layers x batch x hidden_size) tensor containing the hidden state for t=seq_len
c_n : A (num_layers x batch x hidden_size) tensor containing the cell state for t=seq_len
However, if the LSTM is initialized as bidirectional (bidirectional=True), what you actually get is:
output : A (seq_len x batch x hidden_size * num_directions) tensor containing the output features (h_t) from the last layer of the RNN, for each t
h_n : A (num_layers * num_directions x batch x hidden_size) tensor containing the hidden state for t=seq_len
c_n : A (num_layers * num_directions x batch x hidden_size) tensor containing the cell state for t=seq_len
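For reference, a minimal sketch that reproduces the shapes listed above. The sizes (seq_len=5, batch=3, input_size=10, hidden_size=20, num_layers=2) are arbitrary values picked for illustration, not anything from the docs:

```python
import torch
import torch.nn as nn

# Arbitrary illustrative sizes (assumptions, chosen only for this example).
seq_len, batch, input_size, hidden_size, num_layers = 5, 3, 10, 20, 2
num_directions = 2  # bidirectional=True gives two directions

lstm = nn.LSTM(input_size, hidden_size, num_layers, bidirectional=True)
x = torch.randn(seq_len, batch, input_size)  # default layout: (seq_len, batch, input_size)

output, (h_n, c_n) = lstm(x)

print(tuple(output.shape))  # (seq_len, batch, hidden_size * num_directions)
print(tuple(h_n.shape))     # (num_layers * num_directions, batch, hidden_size)
print(tuple(c_n.shape))     # (num_layers * num_directions, batch, hidden_size)
```

With bidirectional=False the same script gives the shapes from the current docs, since num_directions is then 1.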
It'd be nice to make this explicit in the docs.
Also, I was wondering whether the default concatenation behaviour can be changed (to merging the two directions, for instance)?
These would all be nice features to have.
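In the meantime, merging can be done by hand on the concatenated output. The sketch below is a workaround, not an nn.LSTM option, and it assumes "merging" means summing or averaging the forward and backward features. The sizes and the names merged_sum and merged_mean are made up for the example:

```python
import torch
import torch.nn as nn

# Arbitrary illustrative sizes (assumptions, chosen only for this example).
seq_len, batch, input_size, hidden_size = 5, 3, 10, 20

lstm = nn.LSTM(input_size, hidden_size, num_layers=1, bidirectional=True)
output, _ = lstm(torch.randn(seq_len, batch, input_size))

# The last dimension holds [forward features, backward features] concatenated,
# so split it into an explicit direction axis of size 2.
directions = output.reshape(seq_len, batch, 2, hidden_size)
forward_out = directions[:, :, 0]   # (seq_len, batch, hidden_size)
backward_out = directions[:, :, 1]  # (seq_len, batch, hidden_size)

# Two possible "merge" modes (hypothetical names).
merged_sum = forward_out + backward_out
merged_mean = directions.mean(dim=2)

print(tuple(merged_sum.shape))  # (seq_len, batch, hidden_size)
```

Either merge brings the feature dimension back to hidden_size, so downstream layers do not need to account for num_directions.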