Port CPU torch.geqrf to ATen #56249

Conversation
This PR ports `torch.geqrf` from TH to ATen. The CUDA path will be implemented in a follow-up PR. With the ATen port, support for complex and batched inputs is added. There were no correctness tests before; they are added in this PR, along with an OpInfo for this operation. We can implement the QR decomposition as a composition of geqrf and orgqr (`torch.linalg.householder_product`), and we can implement the least-squares solver with geqrf + ormqr + trtrs, so it's useful to have this function renewed, at least for the internal code.

Resolves #24705
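As a minimal sketch of the composition mentioned above (not part of the PR itself): reconstructing a QR decomposition from `torch.geqrf` plus `torch.linalg.householder_product`, assuming a tall matrix with m >= n.

```python
import torch

A = torch.randn(5, 3, dtype=torch.float64)

# geqrf packs R on and above the diagonal and the Householder
# reflectors that implicitly define Q below the diagonal
a, tau = torch.geqrf(A)

# Q (m x n, orthonormal columns) is recovered from the implicit reflectors
Q = torch.linalg.householder_product(a, tau)

# R is the upper-triangular part; for m >= n only the first n rows are needed
R = a.triu()[:3, :]

assert torch.allclose(Q @ R, A)
```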
💊 CI failures summary and remediations — as of commit dd13e03 (more details on the Dr. CI page): 💚 Looks good so far! There are no failures yet. 💚 (This comment was automatically generated by Dr. CI.)
```diff
 static void apply_geqrf(const Tensor& self, const Tensor& tau, int64_t m, int64_t n) {
 #ifndef USE_LAPACK
-  AT_ERROR("qr: LAPACK library not found in compilation");
+  AT_ERROR("geqrf: LAPACK library not found in compilation");
```
`TORCH_CHECK(false, ...)`
We have a different error string for LAPACK not being available in the other linalg ops, right?
It's not consistent, some use the message as here and others use a nicer message. I'll change it here to a nicer message.
test/test_linalg.py
Outdated
```python
def test_geqrf(self, device, dtype):

    def run_test(shape):
        # NumPy outputs the result of geqrf operation
```
This comment is a little confusing as written. It's that NumPy doesn't have a function named geqrf, but np.linalg.qr with mode='raw' computes the same operation, so this test compares against that function.
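A sketch of the comparison this comment describes (hedged — the exact test code in the PR differs); NumPy's `'raw'` mode returns the LAPACK result in Fortran (transposed) layout, hence the `.T`:

```python
import numpy as np
import torch

A = torch.randn(5, 3, dtype=torch.float64)
a, tau = torch.geqrf(A)

# np.linalg.qr with mode='raw' runs the same LAPACK routine (geqrf),
# but returns the packed matrix transposed relative to LAPACK
h, tau_np = np.linalg.qr(A.numpy(), mode='raw')

assert np.allclose(a.numpy(), h.T)
assert np.allclose(tau.numpy(), tau_np)
```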
```python
batches = [(), (0, ), (2, ), (2, 1)]
ns = [5, 2, 0]
for batch, (m, n) in product(batches, product(ns, ns)):
    # TODO: CUDA path doesn't work with batched or empty inputs
```
Assert this fails instead of just continuing, so that when the behavior changes the test demands an update, too.
The test is updated in #56251. So can we leave this one as is? In other cases, I agree that it's better to assert failures and not skip the test.
I'll add the assert.
haha, really negotiating against yourself there
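For illustration, a hedged sketch of what the suggested assertion could look like in place of the skip (`t` stands for the sample input tensor here, and the exact exception type raised by the old CUDA path is an assumption):

```python
if torch.device(device).type == 'cuda' and (batch != () or m == 0 or n == 0):
    # assumed to raise RuntimeError; asserting (rather than skipping) forces
    # a test update once the CUDA path gains batched/empty support
    with self.assertRaises(RuntimeError):
        torch.geqrf(t)
    continue
```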
torch/_torch_docs.py
Outdated
```
Rather, this directly calls the underlying LAPACK function `?geqrf`
Computes a QR decomposition of :attr:`input`.
Both `Q` and `R` matrices are stored in the same tensor `a`.
```
"in the same output tensor `a`."
torch/_torch_docs.py
Outdated
```
The elements of `R` are stored on and above the diagonal.
Elementary reflectors (or Householder vectors) implicitly defining matrix `Q`
are stored below the diagonal.
Result of this function can be used together with :func:`torch.linalg.householder_product`
```
"Result" -> "The results"
torch/_torch_docs.py
Outdated
```
are stored below the diagonal.
Result of this function can be used together with :func:`torch.linalg.householder_product`
to obtain the `Q` matrix or
with :func:`torch.ormqr` that uses implicit representation of matrix `Q` for efficient matrix-matrix multiplication.
```
"with :func:`torch.ormqr` that uses implicit representation of matrix `Q` for efficient matrix-matrix multiplication." -> "with :func:`torch.ormqr`, which uses an implicit representation of the `Q` matrix, for an efficient matrix-matrix multiplication."
torch/_torch_docs.py
Outdated
```
to obtain the `Q` matrix or
with :func:`torch.ormqr` that uses implicit representation of matrix `Q` for efficient matrix-matrix multiplication.
This function directly calls the underlying LAPACK function `geqrf`
```
This sentence seems redundant - maybe it can be removed?
Yes, we can remove it.
torch/_torch_docs.py
Outdated
| r""" | ||
| geqrf(input, *, out=None) -> (Tensor, Tensor) | ||
| This is a low-level function for calling LAPACK directly. This function |
"for calling LAPACK's geqrf directly." ?
torch/_torch_docs.py
Outdated
```
.. note::
    To obtain explicit Q and R matrices it is recommended to use :func:`torch.linalg.qr`.

.. note::
```
I would combine this note with the previous to make one "see also" note.
"See also :func:`torch.linalg.qr`, which computes explicit Q and R matrices, and :func:`torch.linalg.lstsq` with the ``driver="gels"`` option for a function that can solve matrix equations using a QR decomposition." ?
```python
def sample_inputs_geqrf(op_info, device, dtype, requires_grad=False):
    """
    This function generates input for torch.geqrf
```
This comment doesn't hurt anything but I don't think it adds anything to the code, either
I agree, we can remove it.
```python
out = []
for batch, (m, n) in product(batches, product(ns, ns)):
    # TODO: CUDA path doesn't work with batched or empty inputs
    if 'cuda' in device and (batch != () or m == 0 or n == 0):
```
`if torch.device(device).type == 'cuda'`
| """ | ||
| batches = [(), (0, ), (2, ), (1, 1)] | ||
| ns = [5, 2, 0] | ||
| out = [] |
"out" -> "samples"
mruberry left a comment:
Overall this looks great, as usual; I've suggested a few tweaks and also pinged @ngimel since she's been reviewing most TH->ATen ports.
@mruberry thank you for your suggestions! I've updated this pull request.
```cpp
}

static void geqrf_out_helper(const Tensor& input, const Tensor& QR, const Tensor& tau) {
  TORCH_INTERNAL_ASSERT(input.dim() >= 2);
```
Do you want to make these `TORCH_INTERNAL_ASSERT_DEBUG_ONLY`? You've already checked all these conditions, and this function is not user facing.
@IvanYashchuk just ping me when you're happy with this PR and I'll start merging this stack.
Summary: Pull Request resolved: pytorch#56249 (description as above). Test Plan: Imported from OSS. Reviewed By: ngimel. Differential Revision: D27907357. Pulled By: mruberry. fbshipit-source-id: 94e1806078977417e7903db76eab9d578305f585