distributed/distributed_c10d.py", line 1489, in barrier
RuntimeError: NCCL error in: /pytorch/torch/lib/c10d/ProcessGroupNCCL.cpp:410, unhandled system error, NCCL version 2.4.8
Judging by TensorFlow 1.13, there no longer seems to be an API like tf.contrib.nccl.allsum.
= [dev_grads[dev][var_idx][0] for dev in devices]
g = tf.contrib.nccl.all_sum(g)
I checked the official TensorFlow website, and it seems that…
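For readers unsure what the all_sum call in the fragment above is meant to compute, here is a minimal pure-Python sketch of all-reduce sum semantics: every participant receives the element-wise sum of all inputs. It is not the TensorFlow or NCCL implementation. The helper name `all_sum_reference`, the device names, and the `(grad, var)` layout of `dev_grads` are assumptions made for illustration; the layout is only suggested by the `[0]` index in the fragment.

```python
from typing import Dict, List, Tuple


def all_sum_reference(tensors: List[List[float]]) -> List[List[float]]:
    """Hypothetical stand-in: return one summed copy per input tensor."""
    if not tensors:
        return []
    length = len(tensors[0])
    if any(len(t) != length for t in tensors):
        raise ValueError("all inputs must have the same shape")
    # Element-wise sum across all per-device tensors.
    total = [sum(values) for values in zip(*tensors)]
    # Each device gets its own copy of the summed result.
    return [list(total) for _ in tensors]


# Assumed layout: dev_grads[dev][var_idx] is a (gradient, variable_name) pair.
devices = ["gpu0", "gpu1"]
dev_grads: Dict[str, List[Tuple[List[float], str]]] = {
    "gpu0": [([1.0, 2.0], "w")],
    "gpu1": [([3.0, 4.0], "w")],
}

var_idx = 0
g = [dev_grads[dev][var_idx][0] for dev in devices]
g = all_sum_reference(g)
print(g)  # [[4.0, 6.0], [4.0, 6.0]]
```

Any replacement for the missing API would need to preserve this contract: one output per device, each holding the same summed gradient.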