Results: 11 comments of Anant Gupta

Hi, should I add the `resume training` feature? I believe the option to save at intervals is already present (via `model_save_interval`).

Currently, yes. However, it would be straightforward to pad the inputs to account for dynamic sequence lengths.
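A minimal sketch of what such padding could look like (the `pad_batch` helper and the PAD id of 0 are assumptions for illustration, not code from the repo): right-pad each variable-length token-id sequence so the batch becomes one fixed-size array.

```python
import numpy as np

# Hypothetical helper: pad variable-length token-id sequences with a
# PAD id (0 here, an assumption) up to the longest sequence in the batch.
def pad_batch(sequences, pad_id=0):
    max_len = max(len(s) for s in sequences)
    return np.array([s + [pad_id] * (max_len - len(s)) for s in sequences])

batch = pad_batch([[4, 7, 2], [9, 1]])
# batch.shape == (2, 3); the shorter sequence is right-padded with 0
```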

When convolving the inputs, the zero-padding added to the top rows of the input layer ensures that a hidden state does not contain information from future words.
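The idea can be sketched in 1-D (this is an illustrative example, not the repo's exact code): left-padding the sequence by `kernel_width - 1` zeros before a "valid" convolution makes the output at position t depend only on inputs at positions ≤ t, i.e. a causal convolution.

```python
import numpy as np

# Illustrative causal convolution: zeros are added on the "past" side
# only, so position t never sees future words.
def causal_conv1d(x, kernel):
    k = len(kernel)
    padded = np.concatenate([np.zeros(k - 1), x])  # pad the start, not the end
    return np.array([padded[t:t + k] @ kernel for t in range(len(x))])

x = np.array([1.0, 2.0, 3.0])
y = causal_conv1d(x, np.array([0.5, 0.5]))
# y[0] = 0.5*0 + 0.5*1 = 0.5 -> computed from x[0] and padding only
```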

@ruotianluo Zero padding is used in every layer to keep the layer size the same: https://github.com/anantzoid/Language-Modeling-GatedCNN/blob/master/model.py#L62 The zero padding I referred to in the above comment is the extra padding required...

Since there is no pooling, the height and width of the output layer remain the same. The depth is also kept constant for each layer, but can be modified to...
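Why the spatial size stays constant follows from the standard convolution output formula (a quick sanity check, assuming stride 1 and symmetric "same"-style padding p = (k - 1) // 2 for odd kernel size k):

```python
# out = (n + 2p - k) // s + 1; with s = 1 and p = (k - 1) // 2 (odd k),
# the output spatial size equals the input size, so no pooling is needed
# for the shapes to line up across layers.
def conv_out_size(n, k, p, s=1):
    return (n + 2 * p - k) // s + 1

assert conv_out_size(32, 3, 1) == 32   # 3x3 kernel, padding 1
assert conv_out_size(32, 5, 2) == 32   # 5x5 kernel, padding 2
```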

That's an interesting observation. However, the paper mentions that X (the input to any hidden layer h) has dimension N×m, and this input could be either word embeddings...

#119 addresses some of the mentioned issues.

You can pass the `activation_fn` in `conv2d` as `relu`, `sigmoid`, `tanh`, etc. The `binarize` function converts pixel intensities to binary values. Try declaring an array: `ar = np.asarray([[0.12,1.2], [1.1, 0.5]])` and...

I'm not sure why the memory blows up, but a quick fix is to reduce the batch size.

There's a reproducibility issue on Task 19 as well. Running the code for 53041 global steps reports: "Best step: 37440 with accuracy = 0.412", i.e., an error of roughly 0.59, while the repo...