As reported in #5353.

> F0303 23:23:01.868011 30044 parallel.cpp:135] Check failed: result == ncclSuccess (2 vs. 0) system error
> *** Check failure stack trace: ***