KFold cross-validation#8
Conversation
There was a problem hiding this comment.
Hi @Mec-iS, this is a great start! I have couple of suggestions for you that will help to improve this PR:
- Run
cargo fmtevery time before committing your code, otherwise automatic build will fail if code is incorrectly formatted. - Test how project's documentation looks by running
cargo doc --no-deps --features "nalgebra-bindings ndarray-bindings". This command will generate html with documentation in thetarget/doc/folder.
I am looking forward to see implementation of the function that splits dataset into k folds!
Co-authored-by: VolodymyrOrlov <[email protected]>
Co-authored-by: VolodymyrOrlov <[email protected]>
|
Thanks. As a next step I will try to implement the interface with only the matrix as input (not considering shuffle and groups). Obviously all the documentation will be cleaned up and fixed after the implementation is satisfactory. |
|
@VolodymyrOrlov can you specify the meaning of this assignment for me please? |
|
Thanks for the explanation. Please check that |
|
Ready for review 👍 |
|
@Mec-iS the code looks good! Couple of things left before I can proceed to merging this PR:
|
Co-authored-by: VolodymyrOrlov <[email protected]>
Co-authored-by: VolodymyrOrlov <[email protected]>
|
I had problem understanding the shape of the matrix, in my understanding:
Please provide guidance. I couldn't find this information clearly stated in the documentation. |
|
ahah, sorry I got the dimensions confused. I just had to switch which index in Lesson learned: the developer may not know (or have forgotten, like in my case) the conventions used by |
|
Question: in my last commit I added this line to explicitly define |
Good catch, thanks for pointing it out! I'll fix documentation of the |
No, it is not necessary since you pass ownership of the I am sure you've read this book but will leave this link here in case you haven't: https://doc.rust-lang.org/book/ch04-00-understanding-ownership.html. |
yes but I dont see the necessity of passing the ownership of a mutable vector. It could be an interface leak to allow mutability on something that is not meant to be modified. Once the indices and the splits are defined, there is no reason for the developer-user to be able to modify it. That is why is conceptually more correct to return an immutable maybe. |
Co-authored-by: VolodymyrOrlov <[email protected]>
Co-authored-by: VolodymyrOrlov <[email protected]>
Co-authored-by: VolodymyrOrlov <[email protected]>
I agree with you conceptually but I don't know an easy way to do it without wrapping your return value into another struct that will prevent a caller from modifying it. https://users.rust-lang.org/t/how-to-return-an-immutable-reference-of-a-mutable-vector/30773/5 I think wrapping return value into protective shell is an overkill |
|
... REDACTED ... EDIT: ok, forget it. I didn't consider that the accessibility is defined by the caller, not the callee 🥇 |
* Add documentation and API * Add public keyword * Implement test_indices (debug version) * Return indices as Vec of Vec * Consume vector using drain() * Use shape() to return num of samples * Implement test_masks * Implement KFold.split() * Make trait public * Add test for split * Fix samples in shape() * Implement shuffle * Simplify return values * Use usize for n_splits Co-authored-by: VolodymyrOrlov <[email protected]>
* Add documentation and API * Add public keyword * Implement test_indices (debug version) * Return indices as Vec of Vec * Consume vector using drain() * Use shape() to return num of samples * Implement test_masks * Implement KFold.split() * Make trait public * Add test for split * Fix samples in shape() * Implement shuffle * Simplify return values * Use usize for n_splits Co-authored-by: VolodymyrOrlov <[email protected]>


#5
...
test_indices,test_matrices,splitto accept a matrix and return aVec