Prioritize reentrant tasks and execute them recursively until close to limit #22397
Conversation
```cpp
namespace torch { namespace autograd { namespace python {

PythonEngine::PythonEngine() {
  max_recursion_depth_ = 0.1 * Py_GetRecursionLimit();
```
A little note explaining how we got the magic number 0.1 would be appreciated here :)
So, I guess, now the constructor for PythonEngine must be GIL protected. Have we checked this invariant is respected in all the use sites?
A formally written out test plan would be very useful here, since there are no test suite changes (and indeed, it's a bit hard to think of how you would actually go about writing unit tests; probably only manual tests are possible here.)
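As a rough illustration of the thread above, the cap can be computed from the interpreter's recursion limit. This is a hypothetical Python analog of the C++ constructor in the diff (the helper name is an assumption; the 0.1 factor is the one under discussion):

```python
import sys

def max_recursion_depth(fraction=0.1):
    # Mirrors max_recursion_depth_ = 0.1 * Py_GetRecursionLimit():
    # stay well under the interpreter limit to leave headroom for the
    # Python frames a user's backward() will itself consume.
    return int(fraction * sys.getrecursionlimit())

sys.setrecursionlimit(1000)   # CPython's default limit
print(max_recursion_depth())  # 100
```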
Don't forget to re-request review when you need it!
```cpp
if (current_depth >= max_recursion_depth_) {
  // See Note [Reentrant backwards]
  // If we have reached the max depth, switch to a different thread
  add_thread_pool_task(&graph_task);
```
What will happen if we exceed thread_pool_shared_->graphtasks_queue_.size() here?
std::queue should resize dynamically, or am I misunderstanding your question?
I suggest having a clearly defined constant for how deep we can go with recursion.

But in that case, we will always create new threads even if there is enough stack space to run on a single thread.
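A minimal pure-Python sketch of the behavior being debated (all names here are stand-ins, not the PR's API): tasks recurse in-place until a depth cap and only then get queued for another thread, and the queue grows dynamically just like `std::queue`:

```python
import queue

MAX_DEPTH = 3              # stand-in for max_recursion_depth_
offloaded = queue.Queue()  # like std::queue, resizes as needed

def run_task(depth):
    # Simulates a reentrant backward that spawns one nested task per level.
    if depth >= MAX_DEPTH:
        # Past the cap: hand the task to the thread-pool queue instead
        # of recursing further (cf. add_thread_pool_task in the diff).
        offloaded.put(depth)
        return
    run_task(depth + 1)    # still under the cap: recurse in-place

run_task(0)
print(offloaded.qsize())   # 1: only the task that hit the cap was offloaded
```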
```python
with torch.enable_grad():
    ctx.x = Variable(x.data, requires_grad=True)
    ctx.x = ctx.x - 1
return ctx.x.detach()
```
This is a pretty funky forward function. (It's funky because we don't normally do autograd operations inside of a forward function.) It seems like it's doing two things: you want to return x (forward is just identity), but you also want to create a leaf variable on context with some non-trivial autograd history. Is there a reason x has to be used in both cases? I'll keep reading and see if I can figure out why you create leaf variables in forward ;)
Oh I see, you're also using ctx.x to keep track of how many times you recurse.
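To make that observation concrete, here is a pure-Python analog (no autograd; all names are hypothetical) of how ctx.x doubles as a recursion counter: each forward decrements it, and each backward re-enters until it reaches zero:

```python
class Ctx:
    """Stand-in for an autograd Function context."""
    pass

def forward(ctx, x):
    ctx.x = x - 1        # like ctx.x = ctx.x - 1 in the diff above
    return ctx.x

def backward(ctx):
    calls = 1
    if ctx.x > 0:        # re-enter (a "reentrant backward") until exhausted
        sub = Ctx()
        forward(sub, ctx.x)
        calls += backward(sub)
    return calls

c = Ctx()
forward(c, 5)            # c.x == 4
print(backward(c))       # 5: one backward call per counter value 4, 3, 2, 1, 0
```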
```cpp
}

Engine::Engine() = default;

// This limit is based on the default python recursion limit which is 1000
```
The comment does not match the code: 1000 vs. 100.
We need to set it lower than the actual Python limit to account for the function calls within Python.
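The headroom argument can be demonstrated directly (the depths below are illustrative, not taken from the PR): recursing past the interpreter limit raises RecursionError, while a depth of 0.1 × limit is comfortably safe even though each real reentrant backward consumes several Python frames per level:

```python
import sys

def reentrant(depth):
    # Each level models one reentrant backward; real backwards consume
    # several Python frames per level, so the usable depth is far below
    # the raw recursion limit.
    if depth == 0:
        return 0
    return 1 + reentrant(depth - 1)

sys.setrecursionlimit(1000)
try:
    reentrant(2000)           # well past the limit
except RecursionError:
    print("RecursionError")   # why the engine must offload before this point
print(reentrant(100))         # 100: a 0.1 * limit depth completes fine
```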
This pull request has been merged in 0140a75.
Prioritize reentrant tasks and execute them recursively until close to limit

Summary: Pull Request resolved: pytorch#22397

Test Plan: Added a test for reentrant backwards with checkpoint and a test for a recursive backwards function (which should fail if we run all the reentrant tasks recursively in the same thread) and for testing priority of reentrant tasks. ~~Will add a test for priority of reentrant tasks in a future PR.~~

Imported from OSS

Differential Revision: D16131955

fbshipit-source-id: 18301d45c1ec9fbeb566b1016dbaf7a84a09c7ac
Stack from ghstack:
Summary: The ready queue prioritizes the most nested reentrant tasks. Reentrant tasks run recursively until the recursion depth is close to max_recursion_depth, which depends on the Python recursion limit. Once the limit is reached, further reentrant backwards tasks are run in a different thread.
Test Plan: Added a test for reentrant backwards with checkpoint, a test for a recursive backwards function (which should fail if we ran all the reentrant tasks recursively in the same thread), and a test for the priority of reentrant tasks.

~~Will add a test for priority of reentrant tasks in a future PR.~~

Differential Revision: D16131955