Conversation

@zou3519
Contributor

@zou3519 zou3519 commented Mar 29, 2018

Fixes #3164
Picks up from #4502

Moves `gesv` to ATen.
Adds bindings for MAGMA's `gesv_batched` function for CUDA.
For CPU, runs `THLapack(gesv)` in a for loop.

The new function supports arbitrary batch dimensions (and broadcasting
of those dimensions). For example, a 4-d tensor of shape `A x B x M x M`
is treated as a batch of `A * B` matrices, each of shape `M x M`.
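
The batch and broadcasting semantics described above can be sketched with NumPy's batched `numpy.linalg.solve`, which behaves the same way; this is an illustration of the expected shapes, with `numpy` standing in for the `torch.gesv` API:

```python
import numpy as np

# A batch of 2 x 3 linear systems, each M x M with M = 4.
rng = np.random.default_rng(0)
A = rng.standard_normal((2, 3, 4, 4)) + 4 * np.eye(4)  # shift keeps each matrix well-conditioned
b = rng.standard_normal((4, 2))  # no batch dims: broadcast against A's (2, 3) batch

x = np.linalg.solve(A, b)  # one solve per matrix in the flattened batch
print(x.shape)             # (2, 3, 4, 2)
```

The right-hand side `b` is broadcast across the `(2, 3)` batch dimensions, so every system in the batch is solved against the same `b`.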

Creating the `magma_queue_t` costs ~350,000 microseconds (0.35 s) the
first time it's called and ~6 microseconds on every call after that.

cc @colesbury @apaszke

@zou3519
Contributor Author

zou3519 commented Apr 27, 2018

@ssnl could I get another review on this, please? I've addressed all the previous comments. IIRC, your last pending comment was that we should delete the TH gesv and move it to ATen as well (since the batched gesv just runs the TH gesv in a loop). Could we leave that for a future PR? (I'll open an issue for it.) It's much more difficult to implement, for the following reason:

  • the single version of gesv takes an _out keyword. Implementing that efficiently (or even just as efficiently as in TH) would take a lot of refactoring, especially because many ATen functions were designed without an out keyword in mind.

I don't think the marginal benefit of deduplicating our gesv code is worth it right now, especially since users have been requesting batched gesv for a while.
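
For reference, the CPU strategy mentioned above (run the single-matrix solver in a loop over the flattened batch) can be sketched in plain NumPy. Here `solve_batched` is an illustrative stand-in, and `numpy.linalg.solve` plays the role of the single `THLapack(gesv)` solve; this is not the PR's actual code:

```python
import numpy as np

def solve_batched(A, b):
    """Solve A[i] @ x[i] = b[i] for every matrix in the batch.

    A has shape (*batch, M, M) and b has shape (*batch, M, K).
    The batch dims are flattened, each system is solved one at a
    time (mirroring the TH for-loop), then the shape is restored.
    """
    *batch, m, _ = A.shape
    k = b.shape[-1]
    A_flat = A.reshape(-1, m, m)
    b_flat = b.reshape(-1, m, k)
    x_flat = np.empty_like(b_flat)
    for i in range(A_flat.shape[0]):  # one single-matrix solve per system
        x_flat[i] = np.linalg.solve(A_flat[i], b_flat[i])
    return x_flat.reshape(*batch, m, k)

rng = np.random.default_rng(1)
A = rng.standard_normal((2, 3, 4, 4)) + 4 * np.eye(4)
b = rng.standard_normal((2, 3, 4, 1))
x = solve_batched(A, b)
print(x.shape)  # (2, 3, 4, 1)
```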

Collaborator

@ssnl ssnl left a comment

Two minor comments

zou3519 added 8 commits May 7, 2018 07:20
@zou3519
Contributor Author

zou3519 commented May 7, 2018

@ssnl done :) sorry for the wait!

@ssnl
Collaborator

ssnl commented May 7, 2018

@pytorchbot retest this please

1 similar comment
@ssnl
Collaborator

ssnl commented May 8, 2018

@pytorchbot retest this please

Collaborator

@ssnl ssnl left a comment


LGTM

@soumith soumith merged commit 7162649 into pytorch:master May 8, 2018
onnxbot added a commit to onnxbot/onnx-fb-universe that referenced this pull request May 8, 2018
weiyangfb pushed a commit to weiyangfb/pytorch that referenced this pull request Jun 11, 2018
* Add batched linear solver to torch.gesv()


* Tests and docs

* Address comments

* Address comments

* Rebase

* Address comments

* Fix rebase

* Addressed comments

* Address comments

* Address comments

* Addressed comments
