Segfault in nightly binaries while creating a tensor from a list if numpy is missing #66353

@ptrblck

🐛 Bug

PyTorch segfaults when NumPy is not installed while executing:

import torch
torch.tensor([torch.tensor(0.), torch.tensor(1.)])
<stdin>:1: UserWarning: Failed to initialize NumPy: numpy.core.multiarray failed to import (Triggered internally at  ../torch/csrc/utils/tensor_numpy.cpp:68.)
Segmentation fault (core dumped)

The backtrace points to recursive_store:

Program received signal SIGSEGV, Segmentation fault.
0x00007ffff5254097 in torch::utils::(anonymous namespace)::recursive_store(char*, c10::ArrayRef<long>, c10::ArrayRef<long>, long, c10::ScalarType, int, _object*) ()
   from /home/pbialecki/miniforge3/envs/nightly_pip_cuda111/lib/python3.8/site-packages/torch/lib/libtorch_python.so
(gdb) bt
#0  0x00007ffff5254097 in torch::utils::(anonymous namespace)::recursive_store(char*, c10::ArrayRef<long>, c10::ArrayRef<long>, long, c10::ScalarType, int, _object*) ()
   from /home/pbialecki/miniforge3/envs/nightly_pip_cuda111/lib/python3.8/site-packages/torch/lib/libtorch_python.so
#1  0x00007ffff52562bd in torch::utils::(anonymous namespace)::internal_new_from_data(c10::TensorOptions, c10::ScalarType, c10::optional<c10::Device>, _object*, bool, bool, bool, bool) ()
   from /home/pbialecki/miniforge3/envs/nightly_pip_cuda111/lib/python3.8/site-packages/torch/lib/libtorch_python.so
#2  0x00007ffff525a0e6 in torch::utils::tensor_ctor(c10::DispatchKey, c10::ScalarType, _object*, _object*) ()
   from /home/pbialecki/miniforge3/envs/nightly_pip_cuda111/lib/python3.8/site-packages/torch/lib/libtorch_python.so
#3  0x00007ffff4f9d211 in torch::autograd::THPVariable_tensor(_object*, _object*, _object*) ()
   from /home/pbialecki/miniforge3/envs/nightly_pip_cuda111/lib/python3.8/site-packages/torch/lib/libtorch_python.so
#4  0x0000555555683914 in cfunction_call_varargs (kwargs=<optimized out>, args=<optimized out>, 
    func=0x7ffef8b14e00)

After installing NumPy, the code works fine.
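
For anyone who wants to check this without taking down their own interpreter, here is a minimal sketch that runs the snippet in a child process and reports whether it died on a signal. The helper names `numpy_available` and `run_isolated` are made up for illustration and are not part of PyTorch. It assumes a POSIX system, where a negative return code means the child was killed by that signal. Note that `find_spec` only tells you whether a `numpy` package can be located, not whether importing it would succeed.

```python
import importlib.util
import signal
import subprocess
import sys

# The two-line reproduction from this report.
REPRO = "import torch\ntorch.tensor([torch.tensor(0.), torch.tensor(1.)])\n"


def numpy_available() -> bool:
    """Return True if a `numpy` package can be located, without importing it."""
    return importlib.util.find_spec("numpy") is not None


def run_isolated(snippet: str) -> str:
    """Run `snippet` in a child interpreter and describe how it exited."""
    proc = subprocess.run(
        [sys.executable, "-c", snippet], capture_output=True, text=True
    )
    if proc.returncode < 0:
        # On POSIX, a negative return code is the negated signal number.
        return "killed by " + signal.Signals(-proc.returncode).name
    return "exit code " + str(proc.returncode)


if __name__ == "__main__":
    print("numpy available:", numpy_available())
    print("repro:", run_isolated(REPRO))
```

Based on the behaviour described above, an affected nightly without NumPy should presumably report that the child was killed by SIGSEGV, and a clean exit once NumPy is installed.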

CC @seemethere @malfet

Environment

PyTorch version: 1.11.0.dev20211008+cu111
Is debug build: False
CUDA used to build PyTorch: 11.1
ROCM used to build PyTorch: N/A

OS: Ubuntu 20.04.2 LTS (x86_64)
GCC version: (Ubuntu 9.3.0-17ubuntu1~20.04) 9.3.0
Clang version: Could not collect
CMake version: version 3.16.3
Libc version: glibc-2.31

Python version: 3.8.10 | packaged by conda-forge | (default, May 11 2021, 07:01:05) [GCC 9.3.0] (64-bit runtime)
Python platform: Linux-5.8.0-53-generic-x86_64-with-glibc2.10
Is CUDA available: True
CUDA runtime version: 11.3.109
GPU models and configuration: GPU 0: NVIDIA GeForce RTX 3090
Nvidia driver version: 465.19.01
cuDNN version: Probably one of the following:
/usr/local/cuda-11.3/targets/x86_64-linux/lib/libcudnn.so.8.2.0
/usr/local/cuda-11.3/targets/x86_64-linux/lib/libcudnn_adv_infer.so.8.2.0
/usr/local/cuda-11.3/targets/x86_64-linux/lib/libcudnn_adv_train.so.8.2.0
/usr/local/cuda-11.3/targets/x86_64-linux/lib/libcudnn_cnn_infer.so.8.2.0
/usr/local/cuda-11.3/targets/x86_64-linux/lib/libcudnn_cnn_train.so.8.2.0
/usr/local/cuda-11.3/targets/x86_64-linux/lib/libcudnn_ops_infer.so.8.2.0
/usr/local/cuda-11.3/targets/x86_64-linux/lib/libcudnn_ops_train.so.8.2.0
HIP runtime version: N/A
MIOpen runtime version: N/A

Versions of relevant libraries:
[pip3] torch==1.11.0.dev20211008+cu111
[conda] torch 1.11.0.dev20211008+cu111 pypi_0 pypi

cc @ezyang @gchanan @zou3519 @bdhirsh @jbschlosser @seemethere @malfet @mruberry

Labels

high priority
module: binaries (Anything related to official binaries that we release to users)
module: crash (Problem manifests as a hard crash, as opposed to a RuntimeError)
module: regression (It used to work, and now it doesn't)
module: tensor creation
triaged (This issue has been looked at by a team member, and triaged and prioritized into an appropriate module)