New Accuracy Layer on GPU interferes with training #5981

@vlomonaco

Issue summary

Using the "Accuracy" layer in the training net on GPU breaks training. The layer appears to interfere with the gradient: the loss explodes quickly and the train/test accuracies stall at 1.

Steps to reproduce

  1. Download the latest version of Caffe (commit 691febc)
  2. Compile it with this Makefile.config
  3. Run a network with Accuracy layer in the TRAIN phase.
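
For step 3, a minimal sketch of what such a layer definition might look like in the net prototxt. The blob names `ip2` and `label` are assumptions borrowed from the LeNet example and are not taken from the network used in this report:

```prototxt
# Accuracy layer with no include rule, so it is also instantiated
# in the TRAIN phase (the configuration that triggers the problem).
# "ip2" and "label" are placeholder blob names.
layer {
  name: "accuracy"
  type: "Accuracy"
  bottom: "ip2"
  bottom: "label"
  top: "accuracy"
}
```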

My system configuration

Operating system: Ubuntu 14.04.5 LTS
CUDA version (if applicable): 7.0
CUDNN version (if applicable): 4.7
BLAS: libblas.so.3
Python version: 3.5

How to fix it

Any one of these three workarounds fixes it:

  • Remove the Accuracy layer from the training net; there are no problems when it is restricted to phase: TEST.
  • Change the back-end to CPU.
  • Roll back to before commit 62e0c85 (which I suspect caused the issue).
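
The first two workarounds can be sketched as config fragments. As above, the blob names `ip2` and `label` are assumed for illustration only:

```prototxt
# Net prototxt: restrict the Accuracy layer to the TEST phase
# so it is not part of the training net.
layer {
  name: "accuracy"
  type: "Accuracy"
  bottom: "ip2"
  bottom: "label"
  top: "accuracy"
  include {
    phase: TEST
  }
}
```

```prototxt
# Solver prototxt: run on the CPU back-end instead of the GPU.
solver_mode: CPU
```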
