I'm training the seq2seq example from PyTorch's tutorial site on a GTX 1070, but according to nvidia-smi, GPU utilization only sits at around 75%.
I'm using Python 3.6 and CUDA 8.0 with Nvidia's proprietary drivers.
Any idea what could be causing this?
Could it be that the tutorial doesn't batch properly, so the GPU isn't being fed work quickly enough to stay busy?
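For reference, here's roughly what I think the tutorial's training step boils down to. This is a simplified sketch I wrote from memory, not the actual tutorial code, and all the module names, sizes, and tensors below are made up for illustration. The point is that it trains on one sentence pair at a time and calls the GRU once per token in a Python loop, so maybe the GPU just gets a stream of tiny kernels with interpreter overhead in between:

```python
# Sketch of a per-sentence, per-token training step (names/sizes are made up).
import torch
import torch.nn as nn

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

hidden_size, vocab_size = 256, 1000
embedding = nn.Embedding(vocab_size, hidden_size).to(device)
encoder_gru = nn.GRU(hidden_size, hidden_size).to(device)
decoder_gru = nn.GRU(hidden_size, hidden_size).to(device)
out_proj = nn.Linear(hidden_size, vocab_size).to(device)

criterion = nn.NLLLoss()
optimizer = torch.optim.SGD(
    list(embedding.parameters()) + list(encoder_gru.parameters())
    + list(decoder_gru.parameters()) + list(out_proj.parameters()),
    lr=0.01,
)

def train_pair(input_ids, target_ids):
    """One optimizer step on a single sentence pair -- no batching."""
    optimizer.zero_grad()
    hidden = torch.zeros(1, 1, hidden_size, device=device)

    # Encoder: one token per GRU call, so each CUDA kernel is tiny.
    for t in range(input_ids.size(0)):
        emb = embedding(input_ids[t]).view(1, 1, -1)
        _, hidden = encoder_gru(emb, hidden)

    # Decoder: same pattern, one token at a time (teacher forcing).
    loss = torch.tensor(0.0, device=device)
    decoder_input = target_ids[0].view(1)
    for t in range(1, target_ids.size(0)):
        emb = embedding(decoder_input).view(1, 1, -1)
        output, hidden = decoder_gru(emb, hidden)
        logits = torch.log_softmax(out_proj(output[0]), dim=1)
        loss = loss + criterion(logits, target_ids[t].view(1))
        decoder_input = target_ids[t].view(1)

    loss.backward()
    optimizer.step()
    return loss.item()

# Fake sentence pair just to exercise the loop.
src = torch.randint(0, vocab_size, (12,), device=device)
tgt = torch.randint(0, vocab_size, (10,), device=device)
print(train_pair(src, tgt))
```

If that's the issue, would batching sentence pairs (or at least packing sequences) be the usual fix, or is 75% utilization simply expected for a model this small?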