
Conversation

@alykhantejani
Contributor

Seed all cuda devices when torch.manual_seed is called (issue #1615)
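
For readers landing here later, a minimal sketch of what this change means in practice (written against today's API; the seed value and tensor shapes are arbitrary):

```python
import torch

# After this PR, one call seeds the CPU generator and every visible CUDA device.
torch.manual_seed(42)

# Previously, the CUDA generators had to be seeded separately as well:
# torch.cuda.manual_seed_all(42)

cpu_draw = torch.rand(3)                     # reproducible across runs
if torch.cuda.is_available():
    gpu_draw = torch.rand(3, device="cuda")  # now also reproducible
```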

@alykhantejani
Contributor Author

Hmm, a stray Conv.cpp change seems to have snuck into this push - let me fix that.

@alykhantejani
Contributor Author

OK, it's fixed - sorry about that.

@apaszke merged commit 5f1a16a into pytorch:master Jun 10, 2017
@Kaixhin
Contributor

Kaixhin commented Jun 20, 2017

What was the rationale behind this? I can't set random seeds in my CPU multiprocessing code because it's now trying to set CUDA seeds. Seeding everything at once might be more convenient than controlling CUDA seeds manually, but this behaviour seems like a bug to me.

  File "/home/ka709/ferry/test.py", line 13, in test
    torch.manual_seed(args.seed + rank)
  File "/home/ka709/miniconda3/lib/python3.6/site-packages/torch/__init__.py", line 133, in manual_seed
    torch.cuda.manual_seed_all(seed)
  File "/home/ka709/miniconda3/lib/python3.6/site-packages/torch/cuda/random.py", line 37, in manual_seed_all
    _lazy_init()
  File "/home/ka709/miniconda3/lib/python3.6/site-packages/torch/cuda/__init__.py", line 83, in _lazy_init
    "Cannot re-initialize CUDA in forked subprocess. " + msg)
RuntimeError: Cannot re-initialize CUDA in forked subprocess. To use CUDA with multiprocessing, you must use the 'spawn' start method
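
For anyone else who hits this: the error message itself names the workaround. A minimal sketch (hypothetical worker function and seed, mirroring the traceback above) that uses the 'spawn' start method so each child process is allowed to initialize CUDA on its own:

```python
import torch
import torch.multiprocessing as mp

def test(rank, seed):
    # Under 'spawn' the child starts fresh, so torch.manual_seed may lazily
    # initialize CUDA here without the re-initialization error above.
    torch.manual_seed(seed + rank)
    print(rank, torch.rand(1))

if __name__ == "__main__":
    mp.set_start_method("spawn")  # the default on Linux is 'fork'
    processes = [mp.Process(target=test, args=(rank, 123)) for rank in range(4)]
    for p in processes:
        p.start()
    for p in processes:
        p.join()
```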

@alykhantejani
Contributor Author

Bug report filed here: #1856

jjsjann123 pushed a commit to jjsjann123/pytorch that referenced this pull request Jun 22, 2022
pytorchmergebot pushed a commit that referenced this pull request Jul 13, 2022
Syncing nvfuser devel branch to upstream master. https://github.com/csarofeen/pytorch/

Code changes include:

- TransformPropagator refactor: switched to Dijkstra instead of exhaustive enumeration of all possible paths to reduce compilation time on transform propagation (see the sketch after this list);
- Indexing refactor: remove reference tensor creation in all tensor indexing logic (#1690)
- (more) generic grouped grid reduction kernel;
- Minor parser/fuser patches:
  1. zero-dim tensor reduction support
  2. no-op binary removal within fused graph
  3. expand supported in fusion
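
As an aside, a hedged sketch of the algorithmic idea behind the TransformPropagator change (generic Dijkstra over a toy graph, not nvfuser's actual code): instead of enumerating every path through the tensor DAG, propagation reaches each tensor once, along its cheapest path from the starting tensor.

```python
import heapq

def cheapest_propagation_paths(graph, source):
    """Generic Dijkstra: minimal total edge cost from `source` to every node.

    `graph` maps a node to a list of (neighbor, cost) pairs; cost stands in
    for how expensive propagating a transform along that edge would be.
    """
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist.get(node, float("inf")):
            continue  # stale heap entry
        for neighbor, cost in graph.get(node, []):
            nd = d + cost
            if nd < dist.get(neighbor, float("inf")):
                dist[neighbor] = nd
                heapq.heappush(heap, (nd, neighbor))
    return dist

# Toy DAG: t0 -> t1 -> t3 and t0 -> t2 -> t3, with unequal edge costs.
print(cheapest_propagation_paths(
    {"t0": [("t1", 1), ("t2", 4)], "t1": [("t3", 1)], "t2": [("t3", 1)]},
    "t0",
))  # {'t0': 0, 't1': 1, 't2': 4, 't3': 2}
```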

Squashed commits to work around the GitHub API.
Commits that are actually in this PR from the devel branch:

```
a054b3e Refactor TransormPropagator to allow specifying a position and propagating to part of the DAG (#1775)
d67e1cd Indexing refactor stage 1: remove reference tensor creation in all tensor indexing logic (#1690)
1b65299 Issue 1770 (#1774)
35b0427 Avoid compilation errors like below: (#1773)
452c773 Ignore reductions of zero-dim tensors per PyTorch conventions (#1771)
31d6c56 TransformPropagator refactor (#1769)
570c5a8 Merge pull request #1767 from csarofeen/upstream_merge_0621
9d6c3d8 merging upstream 61305cd
0ed815f New TransformPropagator algorithm (#1763)
6c19520 no-op binary removal (#1764)
ec7fa41 Proper propagation of IterType (#1762)
b263562 Fix dimensionality check (#1759)
2d6343f More generic grouped grid reduction kernel (#1740)
64e2b56 [nvfuser] prevent spamming warning message (#77777) (#1758)
0c43162 [nvFuser] Improving bitwise ops support (#77158) (#1757)
b93a147 Parser expand (#1754)
```

RUN_TORCHBENCH: nvfuser
Pull Request resolved: #80355
Approved by: https://github.com/davidberard98
facebook-github-bot pushed a commit that referenced this pull request Jul 13, 2022

Pull Request resolved: #80355

Reviewed By: qihqi

Differential Revision: D37573400

Pulled By: davidberard98

fbshipit-source-id: 52ab68d89ec01ef61f69f5abeb18c9d3a312aa64
