Skip to content

Support CgroupsV2 in info.cpp's calculations#2519

Closed
elukey wants to merge 2 commits into
catboost:masterfrom
elukey:master
Closed

Support CgroupsV2 in info.cpp's calculations#2519
elukey wants to merge 2 commits into
catboost:masterfrom
elukey:master

Conversation

@elukey
Copy link
Copy Markdown
Contributor

@elukey elukey commented Oct 30, 2023

There is currently support for CPUs count approximation for Cgroups V1, but not for Cgroups V2.
This change adds a similar heuristic as
dmlc/xgboost#9622, so
that the count doesn't fall back to /proc/cpuinfo
on more recent systems.

issue: #2518

Before submitting a pull request, please do the following steps:

  1. Read instructions for contributors.
  2. Run ya make in catboost folder to make sure the code builds.
  3. Add tests that test your change.
  4. Run tests using ya make -t -A command.
  5. If you haven't already, complete the CLA.

@elukey
Copy link
Copy Markdown
Contributor Author

elukey commented Oct 30, 2023

Still working on building the code locally, I'd also like to get some info about how to better test it :)

@elukey
Copy link
Copy Markdown
Contributor Author

elukey commented Nov 1, 2023

I am able to build the Python wheel via python3 setup.py bdist_wheel. Need to find a minimal test to verify the new code.

@andrey-khropov
Copy link
Copy Markdown
Member

Please, agree to the CLA.

@andrey-khropov
Copy link
Copy Markdown
Member

I am able to build the Python wheel via python3 setup.py bdist_wheel. Need to find a minimal test to verify the new code.

See the documentation about the tests

@elukey
Copy link
Copy Markdown
Contributor Author

elukey commented Nov 1, 2023

@andrey-khropov Hi! Thanks a lot for the feedback! I have a couple of questions:

  1. I am using the python wrapper to build with cmake, but from the test docs it seems that I'd need to run cmake directly to build a specific test. What would you use to build utils/system/info_ut.cpp ? I get what the target should be (util-system-info-ut) but I can't make cmake work :(
  2. To properly test the code I'd need to mock some sysfs files, or to somehow create tmp files with the content that I want and inject them in the code (in this case, in static/internal methods). I am not 100% familiar with how to do this in cpp/catboost, do you have any example/recommendation?

@elukey
Copy link
Copy Markdown
Contributor Author

elukey commented Nov 3, 2023

I am still trying to figure out how to run tests (reading https://catboost.ai/en/docs/concepts/development-and-contributions#run-tests), any suggestion?

I tried:

cmake .
cmake --build . --target help

But I can't find anything that is related to the info.cpp ut test file..

@andrey-khropov
Copy link
Copy Markdown
Member

@andrey-khropov Hi! Thanks a lot for the feedback! I have a couple of questions:

1. I am using the python wrapper to build with cmake, but from the test docs it seems that I'd need to run cmake directly to build a specific test. What would you use to build `utils/system/info_ut.cpp` ? I get what the target should be (`util-system-info-ut`) but I can't make cmake work :(

2. To properly test the code I'd need to mock some sysfs files, or to somehow create tmp files with the content that I want and inject them in the code (in this case, in static/internal methods). I am not 100% familiar with how to do this in cpp/catboost, do you have any example/recommendation?

Somehow CMake build for unittests in catboost/util in particular seems to be non-working. I'll look into it: #2525 . I'm sorry.

@elukey
Copy link
Copy Markdown
Contributor Author

elukey commented Nov 4, 2023

Thanks! I'll concentrate on finding a good unit test for the new code :)

@elukey
Copy link
Copy Markdown
Contributor Author

elukey commented Nov 12, 2023

I don't have a better idea to test this than to run a special unit test that prints the number of cpus with different container specs. It will be a one-off, but it is inline with the current tests for util/info. If there is a way to mock some sysfs file contents in the unit tests we may be able to have something better, but I don't have any expertise in c++ testing so I'd need some guidance.

@elukey
Copy link
Copy Markdown
Contributor Author

elukey commented Nov 12, 2023

Built the static library of catboost and used the following to test cgroups v2:

#include "util/system/info.h"
#include "stdio.h"

int main() {
    long int cpus = NSystemInfo::NumberOfCpus();
    printf("CPUs: %ld\n", cpus);
}

On Debian Bookworm (default is using cgroups v2) I tested:

  • docker run --cpus=2 -it -v $(pwd):/test debian:bookworm (CPUS: 2)
  • docker run --cpus=0.5 -it -v $(pwd):/test debian:bookworm (CPUS: 1)
  • docker run -it -v $(pwd):/test debian:bookworm (CPUS: 8, max present on my laptop)

@elukey
Copy link
Copy Markdown
Contributor Author

elukey commented Nov 13, 2023

Copy link
Copy Markdown
Member

@andrey-khropov andrey-khropov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please, change the names of the local variables to camelCase style according to the C++ style guide

Comment thread util/system/info.cpp Outdated
@elukey elukey force-pushed the master branch 2 times, most recently from 97602e0 to 356eb3b Compare November 21, 2023 07:39
@elukey
Copy link
Copy Markdown
Contributor Author

elukey commented Nov 21, 2023

@andrey-khropov thanks! Sent a new patch with variables in camel case and the ) on the previous line. Re-tested as explained above as well.

@elukey
Copy link
Copy Markdown
Contributor Author

elukey commented Nov 22, 2023

@andrey-khropov thanks a lot for the approval, what is the next step? Should I help/work on #2525 ?

There is currently support for CPUs count approximation
for Cgroups V1, but not for Cgroups V2.
This change adds a similar heuristic as
dmlc/xgboost#9622, so
that the count doesn't fall back to /proc/cpuinfo
on more recent systems.

issue: catboost#2518
@elukey
Copy link
Copy Markdown
Contributor Author

elukey commented Nov 25, 2023

Gentle ping, not sure what to do folks :)

@andrey-khropov
Copy link
Copy Markdown
Member

Gentle ping, not sure what to do folks :)

I've discovered some strange buffer underflow issue when I tried to merge this PR but it is not related to your changes directly, I'm investigating. You don't have to do anything for now.

@elukey
Copy link
Copy Markdown
Contributor Author

elukey commented Nov 26, 2023

Gentle ping, not sure what to do folks :)

I've discovered some strange buffer underflow issue when I tried to merge this PR but it is not related to your changes directly, I'm investigating. You don't have to do anything for now.

Thanks a lot! I asked since I thought the code may have needed more reviewers, and where I work (Wikimedia foundation) we are trying to solve a k8s performance bug when running catboost with cgroups v2. Nothing urgent but ideally we'll love to work with you to have a new minor release in Pypi with the patch to fix the issue :)

@andrey-khropov
Copy link
Copy Markdown
Member

Gentle ping, not sure what to do folks :)

I've discovered some strange buffer underflow issue when I tried to merge this PR but it is not related to your changes directly, I'm investigating. You don't have to do anything for now.

Thanks a lot! I asked since I thought the code may have needed more reviewers, and where I work (Wikimedia foundation) we are trying to solve a k8s performance bug when running catboost with cgroups v2. Nothing urgent but ideally we'll love to work with you to have a new minor release in Pypi with the patch to fix the issue :)

Yes, CatBoost 1.2.3 that is expected to contain this change is planned to be released this week.

@elukey
Copy link
Copy Markdown
Contributor Author

elukey commented Dec 11, 2023

@andrey-khropov Hi! Is there anything still pending or shall we merge this? Thanks in advance :)

robot-piglet pushed a commit to yandex/yatool that referenced this pull request Dec 28, 2023
robot-piglet pushed a commit to ytsaurus/ytsaurus that referenced this pull request Dec 28, 2023
robot-piglet pushed a commit to alexv-smirnov/ydb that referenced this pull request Dec 28, 2023
robot-piglet pushed a commit that referenced this pull request Dec 28, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants