Variable-length mini-batches in bidirectional RNNs #1811

@root20

As I understand it, PackedSequence enables the use of variable-length mini-batches without gradients flowing through the zero-padded frames.
I am curious whether this also holds for bidirectional RNNs (bi-RNNs).
If the backward direction is implemented by reversing the sequences, I think the reversal should exclude the zero-padding.
If the zero-padding is included in the reversal, then in the bi-RNN output, padding frames will be concatenated with non-padding frames for the shorter sequences in a mini-batch.
This differs from the behaviour of a bi-RNN with a mini-batch of size 1.

Example of wrong case (where ● is a non-padding frame; ○ is zero-padding)>
(before reversal)
●●●●○○○○
●●●●●●○○
●●●●●●●○
(after reversal)
○○○○●●●●
○○●●●●●●
○●●●●●●●

Example of correct case (only the non-padding frames are reversed; zero-padding stays at the end)>
(before reversal)
●●●●○○○○
●●●●●●○○
●●●●●●●○
(after reversal)
●●●●○○○○
●●●●●●○○
●●●●●●●○
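
To make the two cases above concrete, here is a minimal pure-Python sketch. It is not how any library implements bi-RNNs; the function names `reverse_naive` and `reverse_length_aware` and the `PAD` value are hypothetical and exist only for illustration. Frames are numbered so that the order after reversal is visible.

```python
PAD = 0  # hypothetical padding value, for illustration only


def reverse_naive(batch):
    """Reverse each padded row in full, so the padding moves to the front."""
    return [row[::-1] for row in batch]


def reverse_length_aware(batch, lengths):
    """Reverse only the first `n` valid frames of each row; padding stays at the end."""
    return [row[:n][::-1] + row[n:] for row, n in zip(batch, lengths)]


# Same shape as the diagrams: three sequences of lengths 4, 6 and 7, padded to 8.
batch = [
    [1, 2, 3, 4, PAD, PAD, PAD, PAD],
    [1, 2, 3, 4, 5, 6, PAD, PAD],
    [1, 2, 3, 4, 5, 6, 7, PAD],
]
lengths = [4, 6, 7]

naive = reverse_naive(batch)                  # wrong case: padding comes first
aware = reverse_length_aware(batch, lengths)  # correct case: padding stays last

for row in naive:
    print("naive:", row)
for row in aware:
    print("aware:", row)
```

In the naive version, a backward RNN would consume the padding frames before any real frame of the shorter sequences. In the length-aware version, each sequence is reversed as if it were processed alone, which matches the mini-batch-of-size-1 behaviour.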

I don't really know how bi-RNNs are implemented, but I could not find the part that uses the length information to handle this problem.

Thank you.
