UCS/TOPO: Replace numa_distance with sysfs based implementation #8698
yosefe merged 1 commit into openucx:master
Force-pushed from 80a3b9a to 7bb6d56
Force-pushed from f5b41df to 479b0d4
Force-pushed from 9ff60ba to cd5254a
test/apps/uct_info/Makefile.in
Outdated
UCX_STATIC_LDFLAGS = -static $(shell pkg-config --libs --static $(EXTRA_MODULES) ucx) -lnuma
UCT_STATIC_LDFLAGS = -static $(shell pkg-config --libs --static $(EXTRA_MODULES) ucx-uct) -lnuma
UCX_STATIC_LDFLAGS = -static $(shell pkg-config --libs --static $(EXTRA_MODULES) ucx) $(NUMA_LIBS)
Need to revert in the following PR and remove linking with libnuma from all makefiles
PR description:
src/ucs/sys/topo/base/topo.c
Outdated
        return ucs_topo_get_avg_distance_default(device, distance);
    }

    ret = ucs_sys_getaffinity(&process_affinity);
Need to use thread affinity instead (pthread_getaffinity_np).
Please add a function in ucs_sys and reuse it in uct_ib_md_handle_mr_list_multithreaded.
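To illustrate the request above, here is a minimal sketch of a helper that queries the affinity of the calling thread rather than of the whole process. The name example_get_thread_affinity is hypothetical and is not the function that was added to ucs_sys; only pthread_getaffinity_np, pthread_self and the CPU_* macros are real glibc interfaces.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* required for pthread_getaffinity_np() and CPU_* macros */
#endif
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not the actual UCX function): fills 'cpuset' with the
 * affinity mask of the calling thread. Returns 0 on success or an errno value
 * on failure. pthread functions return the error code directly and do not
 * set errno, so the message is built with strerror(ret). */
static int example_get_thread_affinity(cpu_set_t *cpuset)
{
    int ret;

    assert(cpuset != NULL);

    CPU_ZERO(cpuset);
    ret = pthread_getaffinity_np(pthread_self(), sizeof(*cpuset), cpuset);
    if (ret != 0) {
        fprintf(stderr, "pthread_getaffinity_np() failed: %s\n",
                strerror(ret));
    }

    return ret;
}
```

A caller would declare a cpu_set_t, pass its address to the helper, and then inspect the mask with CPU_ISSET or CPU_COUNT. Keeping the call in one place lets several modules share the same error handling.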
src/ucs/memory/numa_defs.h
Outdated
#include <stdint.h>

#define UCS_NUMA_MIN_DISTANCE 10
Why does it need to be in a header file?
We have references to this macro in other files (topo.c).
Need to remove it from numa.h (leftovers...)
I would expect topo.c to use ucs_topo_default_distance
static ucs_status_t
ucs_sys_enum_threads_cb(const struct dirent *entry, void *_ctx)
ucs_sys_enum_threads_cb(const struct dirent *entry, void *arg)
gleon99 left a comment
If you are already committing some minor fixes, please go over the docstrings. I think there are several places with "excessive spacing" (args, return, etc.).
src/uct/ib/base/ib_md.c
Outdated
    if (ret != 0) {
        ucs_error("pthread_getaffinity_np() failed: %m");
        return UCS_ERR_INVALID_PARAM;
        return ret;
"int ret" -> ucs_status_t status
!= 0 -> != UCS_OK
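The convention requested here can be sketched as follows: when a helper returns a status type, the caller stores the result in a variable of that type and compares it against the named OK value instead of a literal 0. Everything in this sketch is a stand-in: example_status_t mimics ucs_status_t, and example_query and example_caller are hypothetical names that do not exist in UCX.

```c
#include <assert.h>
#include <stdio.h>

/* Stand-in for ucs_status_t (assumption: the real enum is defined in the UCX
 * headers and has many more values). */
typedef enum {
    EXAMPLE_OK                = 0,
    EXAMPLE_ERR_INVALID_PARAM = -1
} example_status_t;

/* Hypothetical helper that reports failure through the status type. */
static example_status_t example_query(const int *input, int *output)
{
    if (input == NULL) {
        return EXAMPLE_ERR_INVALID_PARAM;
    }

    *output = *input * 2;
    return EXAMPLE_OK;
}

/* The caller keeps the result in a 'status' variable of the status type and
 * compares it with the named constant, then propagates it unchanged. */
static example_status_t example_caller(const int *input, int *output)
{
    example_status_t status;

    assert(output != NULL); /* a NULL output is a programming error */

    status = example_query(input, output);
    if (status != EXAMPLE_OK) {
        fprintf(stderr, "example_query() failed: %d\n", (int)status);
        return status;
    }

    return EXAMPLE_OK;
}
```

Using the enum type for the variable keeps the return path type-consistent and avoids relying on the numeric value of the success code.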
@ofirfarjun7 please squash.
Force-pushed from 66098b2 to ef439b0
Force-pushed from ef439b0 to c21b5bb
What
Implement numa_distance function to calculate the distance between IB devices and CPU's NUMA node.
Why ?
libnuma dependency
How ?
libnuma dependent code.
libnuma from UCX
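For readers unfamiliar with the sysfs interface, the sketch below shows one way a NUMA distance can be read without libnuma. It is not the code from this PR: the function names are hypothetical, and the only facts assumed are that Linux exposes a line of space-separated distances in /sys/devices/system/node/node<N>/distance and that the local distance is conventionally 10 (matching UCS_NUMA_MIN_DISTANCE above).

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: returns the 'index'-th integer of a sysfs distance
 * line such as "10 21\n", or -1 if the line has fewer entries. */
static long example_parse_distance(const char *line, unsigned index)
{
    const char *p = line;
    char *end;
    long value;
    unsigned i;

    assert(line != NULL);

    for (i = 0;; ++i) {
        value = strtol(p, &end, 10);
        if (end == p) {
            return -1; /* no further number on the line */
        }
        if (i == index) {
            return value;
        }
        p = end;
    }
}

/* Hypothetical helper: reads the distance from 'from_node' to 'to_node' out
 * of sysfs. Returns -1 if the file is missing or cannot be parsed. */
static long example_numa_distance(unsigned from_node, unsigned to_node)
{
    char path[64];
    char line[1024];
    long distance = -1;
    FILE *fp;

    snprintf(path, sizeof(path), "/sys/devices/system/node/node%u/distance",
             from_node);
    fp = fopen(path, "r");
    if (fp == NULL) {
        return -1;
    }

    if (fgets(line, sizeof(line), fp) != NULL) {
        distance = example_parse_distance(line, to_node);
    }

    fclose(fp);
    return distance;
}
```

On a machine with two NUMA nodes, example_numa_distance(0, 1) would return the second entry of node0's distance line. The NUMA node of a PCI device is exposed in a similar way through the numa_node attribute of the device's sysfs directory, where a value of -1 means the node is unknown.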