Closed
Labels
high priority · module: cudnn (Related to torch.backends.cudnn, and CuDNN support) · module: multithreading (Related to issues that occur when running on multiple CPU threads) · module: windows (Windows support for PyTorch) · triaged (This issue has been looked at by a team member, and triaged and prioritized into an appropriate module)
Description
🐛 Bug
When using a JIT model from two threads at the same time for the first time, I get a crash in getCudnnHandle().
PoolWindow::reserve appears to access a vector out of range.
The responsible code was added in #14861.
Backtrace:
msvcp140d.dll!std::_Debug_message(const wchar_t * message, const wchar_t * file, unsigned int line) Line 9 C++
> caffe2_gpu.dll!std::vector<std::_List_unchecked_iterator<std::_List_val<std::_List_simple_types<std::pair<int const ,cudnnContext * __ptr64> > > >,std::allocator<std::_List_unchecked_iterator<std::_List_val<std::_List_simple_types<std::pair<int const ,cudnnContext * __ptr64> > > > > >::operator[](const unsigned __int64 _Pos) Line 1796 C++
caffe2_gpu.dll!std::_Hash<std::_Umap_traits<int,cudnnContext * __ptr64,std::_Uhash_compare<int,std::hash<int>,std::equal_to<int> >,std::allocator<std::pair<int const ,cudnnContext * __ptr64> >,0> >::_Vec_lo(unsigned __int64 _Bucket) Line 822 C++
caffe2_gpu.dll!std::_Hash<std::_Umap_traits<int,cudnnContext * __ptr64,std::_Uhash_compare<int,std::hash<int>,std::equal_to<int> >,std::allocator<std::pair<int const ,cudnnContext * __ptr64> >,0> >::_Begin(unsigned __int64 _Bucket) Line 841 C++
caffe2_gpu.dll!std::_Hash<std::_Umap_traits<int,cudnnContext * __ptr64,std::_Uhash_compare<int,std::hash<int>,std::equal_to<int> >,std::allocator<std::pair<int const ,cudnnContext * __ptr64> >,0> >::lower_bound(const int & _Keyval) Line 647 C++
caffe2_gpu.dll!std::_Hash<std::_Umap_traits<int,cudnnContext * __ptr64,std::_Uhash_compare<int,std::hash<int>,std::equal_to<int> >,std::allocator<std::pair<int const ,cudnnContext * __ptr64> >,0> >::find(const int & _Keyval) Line 630 C++
caffe2_gpu.dll!at::native::`anonymous namespace'::PoolWindow::reserve(int device) Line 88 C++
caffe2_gpu.dll!at::native::getCudnnHandle() Line 149 C++
caffe2_gpu.dll!at::native::setCuDNNStreamToCurrent() Line 13 C++
caffe2_gpu.dll!at::native::cudnn_convolution(const at::Tensor & input_t, const at::Tensor & weight_t, const at::Tensor & bias_t, c10::ArrayRef<__int64> padding, c10::ArrayRef<__int64> stride, c10::ArrayRef<__int64> dilation, __int64 groups, bool benchmark, bool deterministic) Line 930 C++
caffe2_gpu.dll!at::CUDAFloatType::cudnn_convolution(const at::Tensor & self, const at::Tensor & weight, const at::Tensor & bias, c10::ArrayRef<__int64> padding, c10::ArrayRef<__int64> stride, c10::ArrayRef<__int64> dilation, __int64 groups, bool benchmark, bool deterministic) Line 5315 C++
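For context, the backtrace is consistent with concurrent, unsynchronized access to the device-to-handle map that PoolWindow manages: MSVC's debug iterator checks (the _Debug_message frame above) fire when one thread rehashes an unordered_map while another is inside find(). The sketch below is a simplified illustration of that pattern and of a mutex guard that avoids it; it is not the actual PyTorch source, and HandlePool/Handle are placeholder names.
#include <mutex>
#include <unordered_map>

struct Handle {};  // stand-in for cudnnContext*

class HandlePool {
 public:
  // Without the lock, two threads calling reserve() at once can race:
  // one thread's emplace() may rehash the buckets while another thread
  // is inside find(), which is what the debug iterator checks flag.
  Handle* reserve(int device) {
    std::lock_guard<std::mutex> guard(mutex_);
    auto it = handles_.find(device);
    if (it != handles_.end()) {
      return it->second;
    }
    Handle* handle = new Handle();
    handles_.emplace(device, handle);
    return handle;
  }

 private:
  std::mutex mutex_;
  std::unordered_map<int, Handle*> handles_;
};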
To Reproduce
Steps to reproduce the behavior:
- Load a JIT model once (on one thread)
- After the model is loaded, call forward on it twice simultaneously (from separate threads)
- Observe the crash shown in the backtrace above
Called from two threads at the same time:
static std::once_flag model_flag;
std::call_once(model_flag, [&modelFile]() {
    model = torch::jit::load(modelFile, torch::kCUDA);
});
// All works fine up until here.
model->forward({torch::randn({1, 3, 1024, 1024}, torch::kCUDA)});
// Crashes when called from a different thread than the one the model was created on.
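For completeness, a minimal self-contained version of the repro is sketched below. It assumes a TorchScript file named model.pt (a placeholder) and uses the libtorch 1.1-era API where torch::jit::load returns a shared_ptr to a script::Module.
#include <torch/script.h>
#include <thread>

int main() {
  // Load once on the main thread; this part works fine.
  auto model = torch::jit::load("model.pt", torch::kCUDA);

  auto run = [&model]() {
    // Both threads reach the first cuDNN convolution at roughly the same
    // time, which triggers the crash in getCudnnHandle().
    model->forward({torch::randn({1, 3, 1024, 1024}, torch::kCUDA)});
  };

  std::thread t1(run), t2(run);
  t1.join();
  t2.join();
  return 0;
}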
Expected behavior
Forwarding should work from any thread once the model has been loaded on one thread, without having to wait some arbitrary time period to avoid the race condition.
Environment
- PyTorch Version (e.g., 1.0): 1.1.0-pre
- OS (e.g., Linux): Windows 64-bit
- How you installed PyTorch (conda, pip, source): nightly
- Build command you used (if compiling from source): N/A
- Python version: 3.7
- CUDA/cuDNN version: 10.0/7.5
- GPU models and configuration: GTX1060
Workarounds
Other models of mine run through a dedicated model thread that batches their inputs; those work fine because that thread ensures they are never run simultaneously.
This specific model takes tensors of different sizes, which I am unable to batch together.
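As a simpler fallback, the same serialization can be done with a plain mutex around forward(). This is only a sketch of the workaround idea described above; the names forward_mutex and guarded_forward are illustrative and not part of the issue.
#include <torch/script.h>
#include <mutex>
#include <vector>

// Serialize all forward() calls so only one thread touches the model
// (and hence the cuDNN handle pool) at a time.
std::mutex forward_mutex;

torch::jit::IValue guarded_forward(
    const std::shared_ptr<torch::jit::script::Module>& model,
    std::vector<torch::jit::IValue> inputs) {
  std::lock_guard<std::mutex> lock(forward_mutex);
  return model->forward(std::move(inputs));
}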