Skip to content

[RFC] Disable percpu arena in case of non deterministic CPU count#2181

Merged
Lapenkov merged 1 commit intojemalloc:devfrom
azat-archive:per-cpu-arena-check
Dec 21, 2021
Merged

[RFC] Disable percpu arena in case of non deterministic CPU count#2181
Lapenkov merged 1 commit intojemalloc:devfrom
azat-archive:per-cpu-arena-check

Conversation

@azat
Copy link
Copy Markdown
Contributor

@azat azat commented Dec 17, 2021

Determinitic number of CPUs is important for percpu arena to work
correctly, since it uses cpu index - sched_getcpu(), and if it will
greater then number of CPUs bad thing will happen, or assertion will be
failed in debug build:

<jemalloc>: ../contrib/jemalloc/src/jemalloc.c:321: Failed assertion: "ind <= narenas_total_get()"
Aborted (core dumped)

Number of CPUs can be obtained from the following places:

  • sched_getaffinity()
  • sysconf(_SC_NPROCESSORS_ONLN)
  • sysconf(_SC_NPROCESSORS_CONF)

For the sched_getaffinity() you may simply use taskset(1) to run program
on a different cpu, and in case it will be not first, percpu will work
incorrectly, i.e.:

$ taskset --cpu-list $(( $(getconf _NPROCESSORS_ONLN)-1 )) <your_program>

_SC_NPROCESSORS_ONLN uses /sys/devices/system/cpu/online, LXD/LXC
virtualize /sys/devices/system/cpu/online file 1, and so when you run
container with limited limits.cpus it will bind randomly selected CPU to
it

_SC_NPROCESSORS_CONF uses /sys/devices/system/cpu/cpu*, and AFAIK nobody
playing with dentries there.

So if all three of these are equal, percpu arenas should work correctly.

And a small note regardless _SC_NPROCESSORS_ONLN/_SC_NPROCESSORS_CONF,
musl uses sched_getaffinity() for both. So this will also increase the
entropy.

Also note, that you can check is percpu arena really applied using
abort_conf:true.

Refs: #1939
Refs: ClickHouse/ClickHouse#32806

v2: move malloc_cpu_count_is_deterministic() into
malloc_init_hard_recursible() since _SC_NPROCESSORS_CONF does
allocations for readdir()

@Lapenkov
Copy link
Copy Markdown
Contributor

Thanks for the PR, it's a step in the right direction to make the allocator more robust! I added some feedback, but if you don't have the capacity to address it, let me know, and I'll finish it myself.

@azat azat force-pushed the per-cpu-arena-check branch from f94a24c to f969ea3 Compare December 18, 2021 06:17
@Lapenkov
Copy link
Copy Markdown
Contributor

Can you please rebase on dev for CI?

Determinitic number of CPUs is important for percpu arena to work
correctly, since it uses cpu index - sched_getcpu(), and if it will
greater then number of CPUs bad thing will happen, or assertion will be
failed in debug build:

    <jemalloc>: ../contrib/jemalloc/src/jemalloc.c:321: Failed assertion: "ind <= narenas_total_get()"
    Aborted (core dumped)

Number of CPUs can be obtained from the following places:
- sched_getaffinity()
- sysconf(_SC_NPROCESSORS_ONLN)
- sysconf(_SC_NPROCESSORS_CONF)

For the sched_getaffinity() you may simply use taskset(1) to run program
on a different cpu, and in case it will be not first, percpu will work
incorrectly, i.e.:

    $ taskset --cpu-list $(( $(getconf _NPROCESSORS_ONLN)-1 )) <your_program>

_SC_NPROCESSORS_ONLN uses /sys/devices/system/cpu/online, LXD/LXC
virtualize /sys/devices/system/cpu/online file [1], and so when you run
container with limited limits.cpus it will bind randomly selected CPU to
it

  [1]: lxc/lxcfs#301

_SC_NPROCESSORS_CONF uses /sys/devices/system/cpu/cpu*, and AFAIK nobody
playing with dentries there.

So if all three of these are equal, percpu arenas should work correctly.

And a small note regardless _SC_NPROCESSORS_ONLN/_SC_NPROCESSORS_CONF,
musl uses sched_getaffinity() for both. So this will also increase the
entropy.

Also note, that you can check is percpu arena really applied using
abort_conf:true.

Refs: jemalloc#1939
Refs: ClickHouse/ClickHouse#32806

v2: move malloc_cpu_count_is_deterministic() into
    malloc_init_hard_recursible() since _SC_NPROCESSORS_CONF does
    allocations for readdir()
v3:
- mark cpu_count_is_deterministic static
- check only if percpu arena is enabled
- check narenas
@azat azat force-pushed the per-cpu-arena-check branch from f969ea3 to 1003171 Compare December 21, 2021 04:42
@azat
Copy link
Copy Markdown
Contributor Author

azat commented Dec 21, 2021

Can you please rebase on dev for CI?

Sure, done.

@Lapenkov Lapenkov merged commit cafe9a3 into jemalloc:dev Dec 21, 2021
@azat azat deleted the per-cpu-arena-check branch December 22, 2021 04:09
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants