Bug in newly added multi-gpu code #2922
Closed
Description
Simple networks segfault under multi-GPU training when they have no learnable parameters.
In parallel.cpp these lines are critical:
CUDA_CHECK(cudaMalloc(&data_, size_ * sizeof(Dtype)));
[...]
CUDA_CHECK(cudaMalloc(&diff_, size_ * sizeof(Dtype)));
[...]
CUDA_CHECK(cudaMalloc(&parent_grads_, size_ * sizeof(Dtype)));
If the net has no learnable parameters, size_ will be 0 and cudaMalloc will leave data_, diff_, and parent_grads_ as null pointers.
I currently work around this by adding 1 byte to each allocation size, but there should be a better fix.
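One possible direction for a cleaner fix, sketched here only as an illustration and not as the project's actual patch, is to treat a zero parameter count as a valid state: skip the allocation and make later code check the count before touching the buffer. Everything in this sketch is hypothetical. `fake_device_malloc` is a host-side stand-in for cudaMalloc so the example runs without a GPU, and it assumes the behavior described above (a zero-byte request yields a null pointer). `alloc_params` and `zero_params` are invented names, not Caffe functions.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Stand-in for cudaMalloc (assumption: a zero-byte request succeeds and
// yields a null pointer, as described in this issue). Returns 0 on success.
int fake_device_malloc(void** ptr, std::size_t bytes) {
  *ptr = (bytes == 0) ? nullptr : std::malloc(bytes);
  return (bytes != 0 && *ptr == nullptr) ? 1 : 0;
}

// Hypothetical helper: allocate a parameter buffer only when count > 0.
// A net without learnable parameters ends up with a null buffer, on purpose.
template <typename Dtype>
bool alloc_params(Dtype** buf, std::size_t count) {
  assert(buf != nullptr);
  *buf = nullptr;
  if (count == 0) {
    return true;  // nothing to allocate; an empty buffer is not an error
  }
  void* raw = nullptr;
  if (fake_device_malloc(&raw, count * sizeof(Dtype)) != 0) {
    return false;  // real allocation failure
  }
  *buf = static_cast<Dtype*>(raw);
  return true;
}

// Hypothetical consumer: every user of the buffer guards on the count,
// so a null buffer is never dereferenced. Returns the elements written.
template <typename Dtype>
std::size_t zero_params(Dtype* buf, std::size_t count) {
  if (count == 0 || buf == nullptr) {
    return 0;
  }
  for (std::size_t i = 0; i < count; ++i) {
    buf[i] = Dtype(0);
  }
  return count;
}
```

Compared with padding each allocation by 1 byte, this keeps the allocated size honest, at the cost of requiring every code path that reads data_, diff_, or parent_grads_ to tolerate an empty buffer.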