dropout in place incompatible with max pooling #117

It took me several hours to finally find this problem.
In my own implementation of dropout in cuda-convnet, I randomly drop half of the nodes during training and multiply the activations by one half at test time. In Caffe, the surviving nodes are multiplied by two during training, and nothing is done at test time.
The two approaches are equivalent in expectation, but they behave differently when dropout is applied in place to the output of a max pooling layer. The backward pass of the max pooling layer needs the layer's own output, and in-place dropout overwrites that output with values scaled by a factor of two. For dropout, this can be resolved by leaving the kept nodes unscaled during training and multiplying by one half at test time instead.
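
To make the failure concrete, here is a minimal pure-Python sketch, not the actual Caffe or cuda-convnet code. It assumes a 1-D max pooling layer whose backward pass keeps no argmax mask and instead finds the max position by comparing each input with the stored output. All function names are made up for illustration.

```python
def max_pool_forward(x, k):
    """Non-overlapping 1-D max pooling with window size k."""
    return [max(x[i:i + k]) for i in range(0, len(x), k)]


def max_pool_backward(x, y, dy, k):
    """Route each output gradient to the inputs that equal the stored output.

    Assumption: the max position is recovered by comparing x against y,
    so y must still hold the values written by max_pool_forward.
    """
    dx = [0.0] * len(x)
    for j, (out, grad) in enumerate(zip(y, dy)):
        for i in range(j * k, min((j + 1) * k, len(x))):
            if x[i] == out:
                dx[i] += grad
    return dx


def dropout_in_place(y, mask, scale):
    """Overwrite y: dropped nodes become 0, kept nodes are multiplied by scale."""
    for j, keep in enumerate(mask):
        y[j] = y[j] * scale if keep else 0.0


def dropout_backward(dy, mask, scale):
    """Gradient of the dropout above with respect to its input."""
    return [g * scale if keep else 0.0 for g, keep in zip(dy, mask)]


def pooled_input_grad(x, mask, scale, k=2):
    """Forward pool -> in-place dropout, then backward with unit top gradient."""
    y = max_pool_forward(x, k)
    dropout_in_place(y, mask, scale)  # clobbers the pooling output
    dy = dropout_backward([1.0] * len(y), mask, scale)
    return max_pool_backward(x, y, dy, k)


x = [1.0, 3.0, 2.0, 5.0, 4.0, 0.5]
mask = [True, False, True]  # fixed mask so the result is reproducible

# Kept nodes unscaled in training (rescale at test time instead).
unscaled = pooled_input_grad(x, mask, scale=1.0)
# Kept nodes multiplied by two in training.
scaled = pooled_input_grad(x, mask, scale=2.0)

print(unscaled)  # [0.0, 1.0, 0.0, 0.0, 1.0, 0.0]
print(scaled)    # [0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
```

With `scale=1.0` the kept outputs still match their inputs, and the dropped outputs receive zero gradient from dropout anyway, so the comparison in the pooling backward pass still works. With `scale=2.0` no input equals the overwritten output, and the gradient is silently lost.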

Any ideas on how to prevent an in-place operation when the data is needed in the backward pass? In cuda-convnet, there is a useAct flag that indicates the activation data will be needed by the layer later and should not be overwritten.
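
One possible direction, sketched in Python under stated assumptions: each layer declares whether its backward pass reads its own output (in the spirit of the useAct flag), and a setup step gives a later in-place layer a separate output blob when it would overwrite such data. `LayerSpec`, `needs_output_in_backward`, and `resolve_in_place` are hypothetical names and do not exist in Caffe or cuda-convnet.

```python
from dataclasses import dataclass


@dataclass
class LayerSpec:
    name: str
    bottom: str  # input blob name
    top: str     # output blob name; equal to bottom means "in place"
    needs_output_in_backward: bool = False  # hypothetical useAct-like flag


def resolve_in_place(layers):
    """Return a copy of the layer list where unsafe in-place tops get a fresh blob."""
    protected = set()  # blobs some earlier layer must read back unmodified
    alias = {}         # original blob name -> name currently holding its data
    resolved = []
    for layer in layers:
        bottom = alias.get(layer.bottom, layer.bottom)
        if layer.top == layer.bottom:
            if bottom in protected:
                # Writing in place would clobber data needed in backward.
                top = layer.name + "_out"
                alias[layer.top] = top
            else:
                top = bottom
        else:
            top = layer.top
        if layer.needs_output_in_backward:
            protected.add(top)
        resolved.append(
            LayerSpec(layer.name, bottom, top, layer.needs_output_in_backward)
        )
    return resolved


net = [
    LayerSpec("pool1", "conv1", "pool1", needs_output_in_backward=True),
    LayerSpec("drop1", "pool1", "pool1"),  # requested in place
    LayerSpec("fc1", "pool1", "fc1"),
]

for layer in resolve_in_place(net):
    print(layer.name, layer.bottom, "->", layer.top)
```

In this sketch the dropout layer ends up writing to `drop1_out` rather than `pool1`, and `fc1` reads from `drop1_out`, at the cost of one extra buffer. In-place layers that follow a layer without the flag are left untouched.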

