-
-
Notifications
You must be signed in to change notification settings - Fork 1.7k
Document PCA
#3677
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Document PCA
#3677
Conversation
shrit
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Everything looks good, I left a couple of comments but even the docs webpage looks great.
| LMNNFunction<MetricType>::LMNNFunction(const arma::mat& datasetIn, | ||
| const arma::Row<size_t>& labelsIn, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any reason why not move this to MatType ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll come back later for it, the goal here was just to modify MakeAlias().
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
okay, good for me
| const arma::mat& datasetIn, | ||
| const arma::Row<size_t>& labelsIn, | ||
| MetricType metric) : |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The same comment here ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same, it will happen, just not in this PR (likely when I document NCA or LMNN).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it is fine then
| const MatType& predictors, | ||
| const arma::Row<size_t>& responses, | ||
| const MatType& predictorsIn, | ||
| const arma::Row<size_t>& responsesIn, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
here as well?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This one is already refactored, LogisticRegression already takes whatever MatType as input. The labels have to be arma::Row<size_t> though.
| /** | ||
| * @file make_alias.hpp | ||
| * @author Ryan Curtin | ||
| * | ||
| * Implementation of `MakeAlias()`, a utility function. This is meant to be | ||
| * used in `SetWeights()` calls in various layers, to wrap internal weight | ||
| * objects as aliases around the given memory pointers. | ||
| * | ||
| * mlpack is free software; you may redistribute it and/or modify it under the | ||
| * terms of the 3-clause BSD license. You should have received a copy of the | ||
| * 3-clause BSD license along with mlpack. If not, see | ||
| * http://www.opensource.org/licenses/BSD-3-Clause for more information. | ||
| */ | ||
| #ifndef MLPACK_METHODS_ANN_MAKE_ALIAS_HPP | ||
| #define MLPACK_METHODS_ANN_MAKE_ALIAS_HPP | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
where this class has been moved? are you using directly the one that is in the core/
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am using it in the coot refactoring 😅
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, the signature is exactly the same but it is moved to core/ now. I cleaned up a few other overloads. I think everything is the same for your usage though: https://www.ratml.org/misc/mlpack-markdown-doc/user/core.html#aliases
| BaseMatType eigvec; | ||
| BaseColType eigVal; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what is the difference between BaseColType and ColType ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was to catch expressions that are passed in. So if I call, e.g., pca.Apply(2 * x, ...), MatType will be inferred as an Armadillo expression (some complex type). So BaseMatType and BaseColType are the "base" types that we can use to store results in. The ensmallen code does similar things at the beginning of every Optimize() function: https://github.com/mlpack/ensmallen/blob/master/include/ensmallen_bits/de/de_impl.hpp#L40
| * Allow PCA to take different matrix types (#3677). | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I should remember do the same for all the PR we merged
| * Reconstruct `m` as an alias around the memory `newMem`, with size `numRows` x | ||
| * `numCols`. | ||
| */ | ||
| template<typename ElemType> | ||
| arma::Cube<ElemType> MakeAlias(arma::Cube<ElemType>& input, | ||
| const bool strict = true) | ||
| template<typename MatType> | ||
| void MakeAlias(MatType& m, | ||
| typename MatType::elem_type* newMem, | ||
| const size_t numRows, | ||
| const size_t numCols, | ||
| const bool strict = true, | ||
| const typename std::enable_if_t<!IsCube<MatType>::value>* = 0) | ||
| { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
okay I see it here.
| #### See also: | ||
| <!-- TODO: add link --> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess something for later?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, I can't add the link to RADICAL until the documentation page for that method is done.
| blockSize = std::min((size_t) data.n_rows, | ||
| std::min((size_t) data.n_cols, rank + 10)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like the indentation is off?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fix looks good to me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed in 787e14b. 👍
zoq
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No more comments from my side.
This PR adds documentation for PCA (principal components analysis). This was a pretty simple method to document, but I had to make some minor changes:
Add non-inplace overloads for dimensionality reduction and variance keeping overloads. This is necessary so you can, e.g., pass in a sparse dataset and get back a dense one.
Templatize
PCA::Apply()so it can work on different matrix types (e.g.arma::fmat). That in turn meant templatizing all of the decomposition strategies... as those get individually documented they will be tested further. As part of this, I refactoredMakeAlias(), which existed in two places, so that it now only exists in one place. I also updated the documentation incore.mdaccordingly.Fix a minor bug in the block Krylov randomized SVD; @zoq can you make sure my changes here are correct? I believe according to the paper that
kmust be less than or equal tod(see Algorithm 2'sInput:section).Limit the width of the sidebar so that it doesn't overlap with the content.
You can click around the documentation here: