@XuehaiPan
Collaborator

@XuehaiPan XuehaiPan commented Dec 14, 2024

Stack from ghstack (oldest at bottom):

Before this change:

$ make setup-env-cuda PYTHON="${HOMEBREW_PREFIX}/bin/python3.12"
$ source venv/bin/activate
$ python3 -c 'import torch'
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/home/PanXuehai/Projects/pytorch/torch/__init__.py", line 379, in <module>
    from torch._C import *  # noqa: F403
    ^^^^^^^^^^^^^^^^^^^^^^
ImportError: libcudnn.so.9: cannot open shared object file: No such file or directory
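
A quick way to check the same failure without importing `torch` is to ask the dynamic loader for the library directly. This is a minimal sketch, not part of the PR: the helper name `can_load_shared_library` is hypothetical, and `ctypes.CDLL` does not consult the rpath embedded in torch's own shared libraries, so it only reports whether `LD_LIBRARY_PATH` or the system loader cache would find the library.

```python
import ctypes


def can_load_shared_library(name: str) -> bool:
    """Return True if the dynamic loader can open `name`, False otherwise."""
    try:
        ctypes.CDLL(name)
    except OSError:
        # Raised when the loader cannot find or open the shared object.
        return False
    return True


if __name__ == "__main__":
    # libcudnn.so.9 is the library named in the traceback above.
    for lib in ("libcudnn.so.9",):
        status = "loadable" if can_load_shared_library(lib) else "NOT found"
        print(f"{lib}: {status}")
```

Since the loader reads `LD_LIBRARY_PATH` when the process starts, the variable has to be set before Python is launched for this check to reflect it.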

This PR adds `site-packages/nvidia/**/lib` to `LD_LIBRARY_PATH` in the `venv/bin/activate` script so that NVIDIA PyPI packages can be loaded correctly.
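
The idea can be sketched as follows. This is an illustration under assumptions, not the code in this PR: the function names `find_nvidia_lib_dirs` and `prepend_to_ld_library_path` are hypothetical, and the directory layout in the demo is made up to mirror `site-packages/nvidia/**/lib`.

```python
import os
import tempfile
from pathlib import Path


def find_nvidia_lib_dirs(site_packages: Path) -> list[Path]:
    """Return every `lib` directory under site-packages/nvidia, sorted."""
    nvidia_root = site_packages / "nvidia"
    if not nvidia_root.is_dir():
        return []
    return sorted(p for p in nvidia_root.glob("**/lib") if p.is_dir())


def prepend_to_ld_library_path(lib_dirs: list[Path], current: str = "") -> str:
    """Prepend lib_dirs to an LD_LIBRARY_PATH value, skipping duplicates."""
    existing = [entry for entry in current.split(os.pathsep) if entry]
    merged: list[str] = []
    for entry in [*map(str, lib_dirs), *existing]:
        if entry not in merged:
            merged.append(entry)
    return os.pathsep.join(merged)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        # Fake layout standing in for venv/lib/pythonX.Y/site-packages.
        site_packages = Path(tmp) / "venv" / "lib" / "python3.12" / "site-packages"
        for package in ("cudnn", "cublas"):
            (site_packages / "nvidia" / package / "lib").mkdir(parents=True)
        dirs = find_nvidia_lib_dirs(site_packages)
        print(prepend_to_ld_library_path(dirs, "/usr/local/lib"))
```

The resulting string is what an activation script would export as `LD_LIBRARY_PATH`; entries already present are kept once and stay behind the NVIDIA directories.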

See also:

- #141837

cc @ZainRizvi @kit1980 @huydhn @clee2000

[ghstack-poisoned]
@pytorch-bot

pytorch-bot bot commented Dec 14, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/143262

Note: Links to docs will display an error until the docs builds have been completed.

⏳ No Failures, 50 Pending

As of commit de9d701 with merge base a10b765:
💚 Looks good so far! There are no failures yet. 💚

UNSTABLE - The following jobs are marked as unstable, possibly due to flakiness on trunk:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@pytorch-bot pytorch-bot bot added the topic: not user facing topic category label Dec 14, 2024
XuehaiPan added a commit to XuehaiPan/pytorch that referenced this pull request Dec 15, 2024
… tool

Before this change:

```console
$ make setup-env-cuda PYTHON="${HOMEBREW_PREFIX}/bin/python3.12"
$ source venv/bin/activate
$ python3 -c 'import torch'
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/home/PanXuehai/Projects/pytorch/torch/__init__.py", line 379, in <module>
    from torch._C import *  # noqa: F403
    ^^^^^^^^^^^^^^^^^^^^^^
ImportError: libcudnn.so.9: cannot open shared object file: No such file or directory
```

This PR adds `site-packages/nvidia/**/lib` to `LD_LIBRARY_PATH` so that NVIDIA PyPI packages can be loaded correctly.

See also:

- pytorch#141837

ghstack-source-id: b4d4b78
Pull Request resolved: pytorch#143262
[ghstack-poisoned]
XuehaiPan added a commit to XuehaiPan/pytorch that referenced this pull request Dec 15, 2024
… tool

ghstack-source-id: ba7bdd3
Pull Request resolved: pytorch#143262
[ghstack-poisoned]
@XuehaiPan XuehaiPan added the module: devx Related to PyTorch contribution experience (HUD, pytorchbot) label Dec 24, 2024
[ghstack-poisoned]
XuehaiPan added a commit that referenced this pull request Dec 24, 2024
… tool

ghstack-source-id: 4a1b1ab
Pull Request resolved: #143262
XuehaiPan added a commit to XuehaiPan/pytorch that referenced this pull request Dec 24, 2024
… tool

ghstack-source-id: 4a1b1ab
Pull Request resolved: pytorch#143262
@albanD
Collaborator

albanD commented Dec 25, 2024

This shouldn't be needed? My understanding is that we build the appropriate rpath into our shared libraries so they can find the NVIDIA packages without any `LD_LIBRARY_PATH` setting (which users will never have locally).

@XuehaiPan
Collaborator Author

> This shouldn't be needed? My understanding is that we build the appropriate rpath into our shared libraries so they can find the NVIDIA packages without any `LD_LIBRARY_PATH` setting (which users will never have locally).

For a normal installation from PyPI, this is not needed, because the `torch` package and its `nvidia-*` dependencies are both installed into `site-packages`. See also:

This PR fixes the dynamic-linker search path for the in-tree installation created by the nightly development tool. It does not affect most users.
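
To illustrate why the rpath works for a `site-packages` install but not for an in-tree one, here is a rough sketch. It assumes the rpath entries are relative to `$ORIGIN` in the form `$ORIGIN/../../nvidia/<package>/lib`; the exact entries, the library file name, and all paths below are illustrative assumptions, and `resolve_rpath_entry` is a hypothetical helper.

```python
import posixpath


def resolve_rpath_entry(entry: str, library_path: str) -> str:
    """Expand $ORIGIN to the directory holding library_path and normalize."""
    origin = posixpath.dirname(library_path)
    return posixpath.normpath(entry.replace("$ORIGIN", origin))


# Assumed rpath entry, for illustration only.
RPATH_ENTRY = "$ORIGIN/../../nvidia/cudnn/lib"

if __name__ == "__main__":
    # Installed from PyPI: torch and nvidia-* are siblings in site-packages.
    print(resolve_rpath_entry(
        RPATH_ENTRY,
        "/venv/lib/python3.12/site-packages/torch/lib/libtorch_cuda.so",
    ))
    # In-tree nightly checkout: torch/ lives in the repository, so the same
    # entry points at the repository root instead of site-packages.
    print(resolve_rpath_entry(
        RPATH_ENTRY,
        "/home/user/pytorch/torch/lib/libtorch_cuda.so",
    ))
```

In the first case the entry lands in `site-packages/nvidia/cudnn/lib`; in the second it lands in a `nvidia/` directory next to the checkout's `torch/`, which does not exist, so the loader has to find the libraries some other way.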

[ghstack-poisoned]
XuehaiPan added a commit that referenced this pull request Dec 26, 2024
… tool

ghstack-source-id: a1c9d21
Pull Request resolved: #143262
Contributor

@malfet malfet left a comment

Please move the typo fixes/refactors to a separate PR.
Also, the change to the environment variable proposed in this PR feels wrong to me: if I'm not mistaken, the PyPI packages are sometimes missing header files, and this change does not update the system include search path, does it?
I.e., you are building with one set of headers but then dynamically loading other libraries.

[ghstack-poisoned]
@XuehaiPan
Copy link
Collaborator Author

> If I'm not mistaken, the PyPI packages are sometimes missing header files, and this change does not update the system include search path, does it?

The goal of the nightly tool is to let contributors run and test Python code without building from source when they only want to contribute Python code. Since the nightly installation is an in-tree git repository, all torch headers are in the local directory. The NVIDIA headers are under `venv/lib/pythonX.Y/site-packages/nvidia/cu*/include`.
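
For anyone who wants to see which NVIDIA header directories a given venv provides, a small sketch along these lines may help. The helper names are hypothetical and the `-I` formatting is only one possible way to consume the result; nothing here is done by the nightly tool itself.

```python
import tempfile
from pathlib import Path


def find_nvidia_include_dirs(site_packages: Path) -> list[Path]:
    """Return the nvidia/cu*/include directories under site-packages, sorted."""
    return sorted(
        p for p in site_packages.glob("nvidia/cu*/include") if p.is_dir()
    )


def as_include_flags(include_dirs: list[Path]) -> list[str]:
    """Format directories as -I compiler flags."""
    return [f"-I{d}" for d in include_dirs]


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        # Fake layout standing in for venv/lib/pythonX.Y/site-packages.
        site_packages = Path(tmp) / "site-packages"
        for package in ("cudnn", "cublas"):
            (site_packages / "nvidia" / package / "include").mkdir(parents=True)
        for flag in as_include_flags(find_nvidia_include_dirs(site_packages)):
            print(flag)
```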

@github-actions
Contributor

Looks like this PR hasn't been updated in a while so we're going to go ahead and mark this as Stale.
Feel free to remove the Stale label if you feel this was a mistake.
If you are unable to remove the Stale label please contact a maintainer in order to do so.
If you want the bot to never mark this PR stale again, add the no-stale label.
Stale pull requests will automatically be closed after 30 days of inactivity.

[ghstack-poisoned]
XuehaiPan added a commit that referenced this pull request Mar 27, 2025
… tool

ghstack-source-id: 79e38fd
Pull Request resolved: #143262
[ghstack-poisoned]
XuehaiPan added a commit that referenced this pull request Mar 31, 2025
… tool

ghstack-source-id: 424d923
Pull Request resolved: #143262
[ghstack-poisoned]
XuehaiPan added a commit that referenced this pull request Apr 1, 2025
… tool

ghstack-source-id: e25e05f
Pull Request resolved: #143262
Contributor

@malfet malfet left a comment

Sure, if lint is green...

@malfet
Contributor

malfet commented Apr 1, 2025

@pytorchbot merge -f "Lint + MPS are green"

@pytorchmergebot
Collaborator

Merge started

Your change will be merged immediately since you used the force (-f) flag, bypassing any CI checks (ETA: 1-5 minutes). Please use -f as last resort and instead consider -i/--ignore-current to continue the merge ignoring current failures. This will allow currently pending tests to finish and report signal before the merge.

amathewc pushed a commit to amathewc/pytorch that referenced this pull request Apr 17, 2025
… tool (pytorch#143262)

Pull Request resolved: pytorch#143262
Approved by: https://github.com/malfet
@github-actions github-actions bot deleted the gh/XuehaiPan/211/head branch May 12, 2025 02:20

Labels

Merged, module: devx, no-stale, open source, Stale, topic: not user facing

6 participants