From Slack discussions, it's clear that we need to wrap THNN's TemporalConvolution into conv1d, probably hidden behind a kwarg. The layout for convs needs to be B x C x T (batch x channels x time).
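
To make the layout point concrete, here is a rough, dependency-free sketch of what B x C x T means for a 1D conv. It assumes the wrapped kernel wants time-major input (B x T x C), as Lua Torch's nn.TemporalConvolution does, so a wrapper would have to transpose somewhere. The names `btc_to_bct` and `conv1d_bct` are made up for illustration and are not part of THNN or PyTorch; this is a naive reference, not the actual binding.

```python
# Hypothetical helpers for illustration only; these names do not exist
# in THNN or PyTorch.

def btc_to_bct(x):
    """Transpose a nested list from B x T x C to B x C x T."""
    return [[list(channel) for channel in zip(*sample)] for sample in x]


def conv1d_bct(x, weight):
    """Naive 1D cross-correlation (stride 1, no padding, no bias).

    x:      B x C_in x T
    weight: C_out x C_in x K
    returns B x C_out x (T - K + 1)
    """
    out = []
    for sample in x:
        t_len = len(sample[0])
        sample_out = []
        for kernel in weight:
            k_len = len(kernel[0])
            row = []
            for t in range(t_len - k_len + 1):
                acc = 0.0
                # Sum over input channels and kernel taps.
                for c, taps in enumerate(kernel):
                    for k, w in enumerate(taps):
                        acc += w * sample[c][t + k]
                row.append(acc)
            sample_out.append(row)
        out.append(sample_out)
    return out


# One sample, T = 4 time steps, C = 2 channels, stored time-major (B x T x C).
x_btc = [[[1.0, 10.0], [2.0, 20.0], [3.0, 30.0], [4.0, 40.0]]]
x_bct = btc_to_bct(x_btc)            # now B x C x T
weight = [[[1.0, 1.0], [0.5, 0.5]]]  # C_out = 1, C_in = 2, K = 2
y = conv1d_bct(x_bct, weight)        # B x C_out x (T - K + 1) = 1 x 1 x 3
```

With B x C x T, each channel's time series is contiguous in the innermost dimension, which matches the N x C x ... convention of the 2D convs. Whether the transpose happens inside the wrapper or is left to the caller is still an open design question.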