Less number of threads in builder#14362

Merged
qoega merged 2 commits into master from less-number-of-threads-in-builder
Sep 2, 2020
@alexey-milovidov
Member

@alexey-milovidov alexey-milovidov commented Sep 2, 2020

@robot-clickhouse robot-clickhouse added the pr-not-for-changelog This PR should not be mentioned in the changelog label Sep 2, 2020
@alexey-milovidov alexey-milovidov added the testing Special issue with list of bugs found by CI label Sep 2, 2020
@alexey-milovidov
Member Author

@qoega Sorry, but it's totally wrong.

2020-09-02 01:50:38 /bin/sh: 1: 16: not found
2020-09-02 01:50:38 /bin/sh: 1: 16: not found
2020-09-02 01:50:38 /bin/sh: 1: 16: not found
2020-09-02 01:50:38 /bin/sh: 1: 16: not found
2020-09-02 01:50:38 # Fix for ninja. Do not add -O.
2020-09-02 01:50:38 /bin/sh: 1: 16: not found
2020-09-02 01:50:38 /usr/bin/ninja  -C obj-x86_64-linux-gnu clickhouse-bundle
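
The repeated `/bin/sh: 1: 16: not found` lines have the shape a POSIX shell such as dash prints when it is asked to run a command named `16`. That suggests the computed thread count ended up in command position instead of being passed as an argument. The makefile line responsible is not shown in this thread, so the following is only a minimal Python sketch of that failure mode, not a reproduction of the actual build script. The helper name `run_in_sh` is hypothetical, and the sketch assumes no executable called `16` exists on `PATH`.

```python
import subprocess


def run_in_sh(script: str) -> subprocess.CompletedProcess:
    """Run a script through /bin/sh and capture its output as text."""
    return subprocess.run(
        ["/bin/sh", "-c", script],
        capture_output=True,
        text=True,
    )


# The number sits in command position, so the shell tries to execute it.
# Assumption: there is no executable named "16" on PATH.
broken = run_in_sh("16")

# The number is passed as an argument to a real command instead.
fixed = run_in_sh("echo 16")

if __name__ == "__main__":
    # POSIX specifies exit status 127 when a command is not found.
    print(broken.returncode, broken.stderr.strip())
    print(fixed.returncode, fixed.stdout.strip())
```

The exact wording of the error message depends on which shell `/bin/sh` points to, but the exit status 127 is the portable signal that a value was executed rather than used.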

@alexey-milovidov
Member Author

And you forgot to assign yourself and mark the pull request as "Approved" (to indicate that you have reviewed it).

@alexey-milovidov
Member Author

If you can write makefiles, you can help...

@bobrik
Contributor

bobrik commented Sep 14, 2020

This increased our internal build times significantly, to the point where they started breaching the 3h timeout. The previous build of v20.7 took just 1h14m.

On a system with 2 x Xeon E5-2630 v3 (32 logical CPUs in total) I see:

  • 92m with job-per-logical-cpu (before)
  • 100m with job-per-physical-core (after)

Memory usage of the two builds, sampled at 1-minute intervals:

[image]

Seems like there's also some other regression in terms of compilation times:

Either way, I don't think the ClickHouse build should make assumptions about CPU topology:

  • Not all CPUs have hyperthreads (many aarch64, some x86_64)
  • VMs may be scheduled across physical CPUs even when there are hyperthreads
  • The Linux scheduler won't necessarily spread jobs ideally across non-hyperthreaded cores
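
The points above can be illustrated with a small Python sketch. It derives a physical-core count from `/proc/cpuinfo`-style text and falls back to the logical CPU count when the topology fields are missing, which is the case on many aarch64 systems and some VMs. This is not the logic used by the ClickHouse build; the function names `count_physical_cores` and `jobs_for_build` and the sample text are hypothetical.

```python
import os
from typing import Optional


def count_physical_cores(cpuinfo_text: str) -> Optional[int]:
    """Count distinct (physical id, core id) pairs in /proc/cpuinfo-style text.

    Returns None when any processor entry lacks the topology fields,
    because the physical-core count cannot be trusted in that case.
    """
    cores = set()
    for block in cpuinfo_text.strip().split("\n\n"):
        fields = {}
        for line in block.splitlines():
            key, sep, value = line.partition(":")
            if sep:
                fields[key.strip()] = value.strip()
        if "physical id" not in fields or "core id" not in fields:
            return None
        cores.add((fields["physical id"], fields["core id"]))
    return len(cores) or None


def jobs_for_build(cpuinfo_text: str, logical_cpus: int, per_physical_core: bool) -> int:
    """Pick a job count, falling back to logical CPUs if topology is unknown."""
    if per_physical_core:
        physical = count_physical_cores(cpuinfo_text)
        if physical is not None:
            return physical
    return logical_cpus


# Hypothetical sample: one socket, two cores, two hyperthreads per core.
X86_SAMPLE = "\n\n".join(
    f"processor\t: {cpu}\nphysical id\t: 0\ncore id\t\t: {cpu % 2}"
    for cpu in range(4)
)

# Hypothetical sample: entries without topology fields.
NO_TOPOLOGY_SAMPLE = "\n\n".join(f"processor\t: {cpu}" for cpu in range(4))

if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as f:
            text = f.read()
    except OSError:
        text = ""
    print(jobs_for_build(text, os.cpu_count() or 1, per_physical_core=True))
```

With the samples above, the job-per-physical-core policy halves the job count on the hyperthreaded layout and silently degrades to job-per-logical-CPU where no topology is reported, so the same policy behaves differently from machine to machine.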

From the log you linked, it seems that your build system doesn't have enough memory, so the compiler/linker gets killed when many copies run at the same time (the has and concat functions are particularly memory hungry to compile, and the 256-bit integer change doesn't help). I suggest bumping memory, or adding swap if more memory is not an option, instead of crippling CPUs.
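
The memory argument can be made concrete with a rough sketch: if each compiler or linker process needs some peak amount of memory, the number of jobs that fit at once is bounded by total memory divided by that peak, regardless of how many CPUs exist. The Python below is only an illustration of that reasoning. The function names are hypothetical, and the per-job figure in the usage example is an invented placeholder, not a measurement from this thread.

```python
def memory_bound_jobs(total_mem_gib: float, per_job_gib: float) -> int:
    """Largest number of jobs whose combined peak memory fits in RAM."""
    if per_job_gib <= 0:
        raise ValueError("per_job_gib must be positive")
    return max(1, int(total_mem_gib // per_job_gib))


def pick_job_count(logical_cpus: int, total_mem_gib: float, per_job_gib: float) -> int:
    """Use every logical CPU unless memory would run out first."""
    return min(logical_cpus, memory_bound_jobs(total_mem_gib, per_job_gib))


if __name__ == "__main__":
    # Hypothetical figure: assume the hungriest translation unit peaks at 2 GiB.
    print(pick_job_count(logical_cpus=32, total_mem_gib=32.0, per_job_gib=2.0))
    print(pick_job_count(logical_cpus=32, total_mem_gib=128.0, per_job_gib=2.0))
```

Under these assumed numbers, adding memory lifts the limit without touching the CPU count, which is the trade-off the comment argues for.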

@KochetovNicolai KochetovNicolai mentioned this pull request Sep 17, 2020
@alexey-milovidov
Member Author

alexey-milovidov commented Sep 17, 2020

@bobrik This is just a temporary change to allow CI to run before we optimize the build.
Now the build is optimized and we are changing it back.

PS. We don't control the amount of memory on CI servers. I think we should assume that they have 128 GiB.

Labels

  • no-docs-needed
  • pr-not-for-changelog: This PR should not be mentioned in the changelog
  • testing: Special issue with list of bugs found by CI

4 participants