-
-
Notifications
You must be signed in to change notification settings - Fork 1.7k
Include log header in all files with calls to Log::Assert #3800
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
While I fully understand why you are proposing this, but what I do not understand is the following:
Technically mlpack is fully header-only, but for now, we do not offer one single-file library and we are not thinking of doing so, not in the close of far future. Also I am trying to understand what are the disadvantages of including something that you are not using, usually there is not any, it will not affect the performance or the binary size at all. However, from a practice point of view, yes it is useless and not necessary. Could you please post part of the code that is causing this, including the compiler error just to have an idea what is happening ? I am surprised that our testing did not catch this |
|
Hi @shrit I got the compilation error within a translation unit which includes first I have to admit I did not fully follow all inclusion paths, but trusting my compiler it and given the fact that within that header Edit: I looked up the compiler error, I got the same error from multiple files in multiple lines (all in all perhaps two dozens), e.g. I probably don't get your second point fully. I did not include any other files before, as I don't want to rely on inclusion order (which is basically the major motivation of "include what you use"). Including something that is not used will indeed not affect runtime (afaik), but it will affect compile time. In large codebases this is a concern and they therefore tend to strive to include only as much as required. |
On version 4.4.0 an include of the `kde_model.hpp` header leads to compilation errors due to the missing symbol `Log::Assert` within multiple transitively included headers. This commit applies "include what you use" to all headers using `Log::Assert` in order to prevent dependencies on inclusion order within other headers. Note that I did not simply change the include order in `kde_model.hpp`, such that `mlpack/core.hpp` is included first, as I would consider that solution as unstable, it could easily be changed again without noticing. Also would it solve the problem only in one inclusion path, whereas "include what you use" will solve it for each and every inclusion path.
dc896fe to
9d19795
Compare
|
After having seen some other tests of Might be a philosophical question how |
|
The idea for users is that they just I think that most of the tests include Each individual method's header should definitely be include-able on its own, and so the problem you encountered is definitely a bug we should fix (thank you for the PR by the way!). But beyond those "public" headers that we expect users to include (e.g. Glancing at the diff (I haven't had time to review in full yet), I can see a couple places where maybe the new includes aren't needed; one instance is |
|
Hi @rcurtin , For me it would be fine to adapt this PR to your liking, you just should decide on a policy you want to follow. I could also add that policy to the documentation within this PR, if you want me to. A small addition about how I would do it, knowing that I know almost nothing about
|
|
For your information, we are working on restructuring mlpack internals and header to make it look similar to other header-only libraries, so this is on our list, it will look more similar to the ensmallen project if you are interested. This improvement might not directly respond to your question but will make the situation slightly better, I will try to find some time for it next week, but it is on our agenda. |
|
I will put this PR on hold for now until we fix all the headers inside the library and then will see if this PR still make sense or not |
|
@shrit I think you mean just moving the library's includes from @marlamb I don't have a problem with using clang-format, but I would prefer it to be as a CI job, not an auto-run post-commit hook or anything like that. Another important thing we have found over the years with style checkers and linters is that there has to be an easy way to whitelist things---there are always false positives. Ideally any system is very light-touch and so doesn't cause a huge amount of friction writing new code. I agree with the rest of what you wrote, but the key here will be establishing a policy of some sort. I think it can be as simple as: any user-level include should include everything it needs. This would include all the method-level inclusion files like Some other little comments:
It's a nice idea but could be very complicated, as individual test files may test components not specific to a particular header file. Personally I don't see that kind of refactoring as worthwhile, but if you're up for it I suppose I don't have an issue with it.
If this was a road we went down, then controlling the version via specification in a CI job (like a Github Actions job or something we ran on Jenkins) would be the way to do it. However at least for now it would be ideal to do that with includes only and not the rest of the style guide. Anyway, thanks again for pointing these issues out and the suggestions. I think for this PR, @shrit would it interfere with what you're planning to get this one in quickly for the user-level |
|
Hi you two, thanks for all the nice replies! Regarding the structure: yes indeed, it is sometimes difficult to then judge which tests should go where. I usually would try to have one test file per header file, which contain unit tests for functionality provided in that header file. In case the public header does only include transitive includes (convenience header for users), then the test file would only contain the include statement to ensure that the header is self-contained. Finally for all remaining tests, which cover functionality from multiple headers, an additional directory can be created (could call it "integration" or "component" tests or similar). Regarding Regarding CI based formatting: I know multiple ways howto achieve this, e.g. creating an automated pull request pointing to the open pull request sorting the includes. Also directly pushing to the pull request is possible, but is a little more intrusive to the contributor. I assume simply failing the CI is probably not desired, as people without having My proposal how we could go forward:
Does this sound reasonable to you? In case @shrit figures out that the activity creating an |
|
We do not usually apply new features or code base modification to previous releases, the only reason to do so, is if there is a bug that is affecting previous releases. Otherwise, new features go to new releases. You can but, I would wait until we have a better restructuring, it should more look like ensmallen (https://github.com/mlpack/ensmallen) or armadillo. I will try to make it simpler, but you should know that it is very hard to include what you need, and our objective is simplicity comes first as long as it does not affect performance. If you would like to help, we have so many things going on, we are more than happy to guide you, but this is not priority. |
Actually that would be the best way to do it, then contributors can just check what the failure was and go fix it. This is often how I fix CI issues; I tend to prefer manually fixing it as opposed to an automated fix (which could go wrong in its own ways, causing further confusion). I think the rest of the plan you proposed is just fine, but yeah, like @shrit said, no need to consider anything other than the most current code. |
|
Living on head sounds good to me 👍 . Unfortunately it looks like llvm disabled sorting includes while keeping the rest unformatted (see https://github.com/llvm/llvm-project/blob/main/clang/lib/Format/Format.cpp#L3479 and llvm/llvm-project@61dc0f2). That means we would need to rely on an very old Edit: |
|
I had a little time for a first draft how |
|
I don't really have time to review closely right now but auto-formatting code with clang-format is really not something we should do. The reason is that the rules for these formatting programs apply really badly to heavily templated code; some amount of "freedom" for people to format code in a way that is visually reasonable is really important. Especially with length limits of 80 characters it can be a huge problem and this is why we have avoided clang-format in the past. I don't mind a CI job that checks include orders and issues an error if there is a problem, or the same for style guides; but I don't want to set up a system that auto-formats all code to a very rigid style that has no room for subjectiveness. If you want to see what I mean take a look at classes like |
|
(I will get back to this when I have a chance, I am just traveling right now so only have a few seconds for a quick response. Sorry about that) |
|
No reason to apologize. To the contrary: I am quite impressed that you take the time to answer while traveling! Perhaps I create a ticket on llvm side to enable pure sorting of includes again 😆 . However, regarding the main discussion here: then I would not wait for a conceptional improvement of "include what you use", but only fix the now already apparent issue, which was the original intention of this PR. As far as I remember the main point in this more philosophical thread was this one here:
As it wouldn't matter to include it twice, but would IMO improve the maintenance (readers of one file don't have to look into another one to figure out, if the include is already done), I would prefer to keep the include. However, if you insist I will remove it. |
rcurtin
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with the changes and that we should get this particular PR merged. Do you want to add a note to HISTORY.md? (I can do it before merge too.)
I'll respond to the other comments in the discussion next...
Yeah---unfortunately the killing of the manual formatting is what has always kept us from using automated tools for this sort of thing. If it is possible to use a C++ style checker in a way where it leaves room for subjectivity, then this is okay; but clang-format's "my way or the highway" approach isn't suitable. As it stands, we use https://github.com/cpplint/cpplint to merely report some issues in the CI jobs. A better or more specific tool (or another one alongside it) could help enforce some of the more automatable points of the style guide.
If possible I would actually prefer to remove these since they are unnecessary. The general format throughout has been that if something is needed in the definition file (e.g. |
Note that we consider the wider include to `core` as being sufficient.
Although more files are touched in order to improve "include what you use", the message focuses on the behavioral change that can be detected by users for sure.
I added a message, but keeping the scope very small to the change that can for sure be detected by users.
Should be done in all files where the pattern applied (there were a few more of them than |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Second approval provided automatically after 24 hours. 👍
|
Thanks again @marlamb! Very much appreciated. |
On version 4.4.0 an include of the
kde_model.hppheader leads to compilation errors due to the missing symbolLog::Assertwithin multiple transitively included headers. This commit applies "include what you use" to all headers usingLog::Assertin order to prevent dependencies on inclusion order within other headers.Note that I did not simply change the include order in
kde_model.hpp, such thatmlpack/core.hppis included first, as I would consider that solution as unstable, it could easily be changed again without noticing. Also would it solve the problem only in one inclusion path, whereas "include what you use" will solve it for each and every inclusion path.