/usr/bin/ld: failed to convert GOTPCREL relocation; relink with --no-relax #39968

Description

@zasdfgbnm

I am seeing the following error

caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/THC/torch_cuda_generated_THCStorage.cu.o: in function `__device_stub__ZN6thrust8cuda_cub4core13_kernel_agentINS0_14__parallel_for16ParallelForAgentINS0_6__fill7functorINS_10device_ptrIN3c107complexIdEEEESA_EElEESC_lEEvT0_T1_(thrust::cuda_cub::__fill::functor<thrust::device_ptr<c10::complex<double> >, c10::complex<double> >&, long)':
/home/gaoxiang/cuda11/include/cuda_runtime.h:209:(.text+0x7d4): additional relocation overflows omitted from the output
/usr/bin/ld: failed to convert GOTPCREL relocation; relink with --no-relax
collect2: error: ld returned 1 exit status
ninja: build stopped: subcommand failed.

when building with CUDA 11 on my Arch Linux box.

There was no problem with CUDA 10.2 on the same Arch Linux machine, nor with CUDA 11 inside NVIDIA's NGC container.

After some searching, this seems to be due to the generated code being too large for the default code model (see code model; thanks @mcarilli for pointing out the issue). MXNet hit the same problem: apache/mxnet#17045
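
The error message itself suggests one stopgap: disable linker relaxation, so that ld keeps the GOT indirection instead of trying (and failing) to convert it. Below is a minimal sketch of forwarding that flag through CMake's standard cache variables; this is untested here, and it works around the symptom rather than the code size itself:

# Untested sketch: forward --no-relax to ld through the compiler
# driver for every link that CMake performs.
cmake .. \
  -DCMAKE_EXE_LINKER_FLAGS="-Wl,--no-relax" \
  -DCMAKE_SHARED_LINKER_FLAGS="-Wl,--no-relax"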

A likely cause is that we generate code for too many architectures by default:

--     CUDA include path   : /home/gaoxiang/cuda11/include
--     NVCC executable     : /home/gaoxiang/cuda11/bin/nvcc
--     NVCC flags          : -DONNX_NAMESPACE=onnx_torch;-gencode;arch=compute_35,code=sm_35;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_52,code=sm_52;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_61,code=sm_61;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_70,code=compute_70;-gencode;arch=compute_75,code=compute_75;-Xcudafe;--diag_suppress=cc_clobber_ignored;-Xcudafe;--diag_suppress=integer_sign_change;-Xcudafe;--diag_suppress=useless_using_declaration;-Xcudafe;--diag_suppress=set_but_not_used;-Xcudafe;--diag_suppress=field_without_dll_interface;-Xcudafe;--diag_suppress=base_class_has_different_dll_interface;-Xcudafe;--diag_suppress=dll_interface_conflict_none_assumed;-Xcudafe;--diag_suppress=dll_interface_conflict_dllexport_assumed;-Xcudafe;--diag_suppress=implicit_return_from_non_void_function;-Xcudafe;--diag_suppress=unsigned_compare_with_zero;-Xcudafe;--diag_suppress=declared_but_not_referenced;-Xcudafe;--diag_suppress=bad_friend_decl;-std=c++14;-Xcompiler;-fPIC;--expt-relaxed-constexpr;--expt-extended-lambda;-Wno-deprecated-gpu-targets;--expt-extended-lambda;-gencode;arch=compute_35,code=sm_35;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_52,code=sm_52;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_61,code=sm_61;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_70,code=compute_70;-gencode;arch=compute_75,code=compute_75;-Xcompiler;-fPIC;-DCUDA_HAS_FP16=1;-D__CUDA_NO_HALF_OPERATORS__;-D__CUDA_NO_HALF_CONVERSIONS__;-D__CUDA_NO_HALF2_OPERATORS__
--     CUDA host compiler  : /usr/bin/gcc-8
--     NVCC --device-c     : OFF
--     USE_TENSORRT        : OFF

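One way to test this hypothesis is to build for a single architecture and see whether the link succeeds. TORCH_CUDA_ARCH_LIST is PyTorch's standard override for the gencode list; the value below is only an example for a Volta-only build:

# Restrict code generation to one architecture to shrink torch_cuda.
TORCH_CUDA_ARCH_LIST="7.0" python setup.py develop
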
I am wondering whether it is possible to add -mcmodel=medium during linking (I tried for a few hours but couldn't figure out how).
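
For reference, the most direct attempt I can think of is below. This is an untested sketch: it assumes the build honors the standard CMake cache variables, and -mcmodel=medium generally has to reach the compile step of every object (for .cu files via -Xcompiler, so nvcc passes it through to the host compiler), not just the final link:

# Untested: apply -mcmodel=medium to all host-compiled objects,
# including the host-side portions of CUDA translation units.
cmake .. \
  -DCMAKE_CXX_FLAGS="-mcmodel=medium" \
  -DCMAKE_CUDA_FLAGS="-Xcompiler -mcmodel=medium"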

cc @ezyang @gchanan @zou3519 @malfet @ngimel

Metadata

Assignees

No one assigned

    Labels

    high priority
    module: build (Build system issues)
    module: cuda (Related to torch.cuda, and CUDA support in general)
    triaged (This issue has been looked at by a team member, and triaged and prioritized into an appropriate module)

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests
