Topic/topo numa distance part2 #8853
b77fd85 to 42ad98c
42ad98c to e118212
Please don't review until the first part is merged.
Please look for
8e84160 to b60fb6d
b60fb6d to 5a6feea
    print_row_separator(distance_width, name_width, num_devices, ' ', '|');
    print_row_separator(distance_width, name_width, num_devices, '-', '+');
Can you give an example of the output?

Can you please fix the units/device name?
See #8853 (comment).
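The example output requested above is not preserved in this thread. As a rough illustration of what the two print_row_separator calls might draw, here is a minimal sketch. The helper name format_row_separator, its buffer-based signature, and the cell layout (one name column followed by one cell per device) are assumptions made for illustration, not the code from this PR.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not the PR's print_row_separator): writes one table
 * row into buf. The row is a name column of name_width fill characters,
 * then one cell of distance_width fill characters per device. Every cell is
 * preceded by delim and the row ends with delim.
 * Returns the number of characters written (without the terminating NUL),
 * or -1 if buf is too small.
 */
static int format_row_separator(char *buf, size_t size, int distance_width,
                                int name_width, unsigned num_devices,
                                char fill, char delim)
{
    size_t needed;
    size_t pos = 0;
    unsigned i;

    assert((buf != NULL) && (distance_width >= 0) && (name_width >= 0));

    /* leading delim + name column + (delim + cell) per device + closing delim */
    needed = 1 + (size_t)name_width +
             (size_t)num_devices * (1 + (size_t)distance_width) + 1;
    if (size < needed + 1) {
        return -1;
    }

    buf[pos++] = delim;
    memset(buf + pos, fill, (size_t)name_width);
    pos += (size_t)name_width;

    for (i = 0; i < num_devices; ++i) {
        buf[pos++] = delim;
        memset(buf + pos, fill, (size_t)distance_width);
        pos += (size_t)distance_width;
    }

    buf[pos++] = delim;
    buf[pos]   = '\0';
    return (int)pos;
}
```

Under these assumptions, calling the helper with distance_width 4, name_width 3, two devices, fill '-' and delimiter '+' produces the row `+---+----+----+`, and the same call with ' ' and '|' produces an empty row bounded by vertical bars.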
src/tools/info/sys_info.c (outdated)
    unsigned num_devices = ucs_topo_num_devices();
    static const int distance_width = 10;
    const char *distance_unit = "MB/s";
    unsigned num_devices = ucs_topo_num_devices();
    print_row_separator(distance_width, name_width, num_devices, ' ', '|');
    print_row_separator(distance_width, name_width, num_devices, '-', '+');
Can you please fix the units/device name?
See #8853 (comment).
src/ucp/core/ucp_worker.c (outdated)
ucp_worker_get_sys_device_memory_distance(ucp_worker_iface_t *wiface)
{
    ucs_sys_dev_distance_t *distance = &wiface->memory_distance;
    ucs_sys_device_t sys_dev = ucp_worker_get_sys_device(wiface);
Maybe ucs_topo_get_memory_distance should return void?

We can take it even further: maybe we should add a constraint to the topo providers API definition requiring fallback behavior?
That way both get_distance and get_memory_distance would return void.
But maybe in another PR?
For now I can do it for memory_distance and add it to the API description.

Let's do it for memory distance for now.
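To make the proposal above concrete, here is a minimal sketch of a distance getter that returns void because a fallback is always applied. Every name in it (sketch_distance_t, sketch_provider_fn_t, sketch_get_memory_distance, the sample providers) is hypothetical, and the fallback value of zero latency with unlimited bandwidth is an assumption, not the UCX definition.

```c
#include <assert.h>
#include <float.h>
#include <stddef.h>

/* Hypothetical stand-in for ucs_sys_dev_distance_t; field names are assumed. */
typedef struct {
    double latency;   /* seconds */
    double bandwidth; /* bytes per second */
} sketch_distance_t;

/* Hypothetical provider callback: returns 0 and fills *distance on success. */
typedef int (*sketch_provider_fn_t)(unsigned sys_dev,
                                    sketch_distance_t *distance);

/* Assumed fallback: no extra latency and no bandwidth limit. */
static const sketch_distance_t sketch_default_distance = {0.0, DBL_MAX};

/*
 * Because the fallback is part of the contract, the caller always gets a
 * usable value and there is no status left to return.
 */
static void sketch_get_memory_distance(sketch_provider_fn_t provider,
                                       unsigned sys_dev,
                                       sketch_distance_t *distance)
{
    assert(distance != NULL);

    if ((provider == NULL) || (provider(sys_dev, distance) != 0)) {
        *distance = sketch_default_distance;
    }
}

/* Sample provider that cannot answer any query. */
static int sketch_failing_provider(unsigned sys_dev,
                                   sketch_distance_t *distance)
{
    (void)sys_dev;
    (void)distance;
    return -1;
}

/* Sample provider that reports made-up fixed values. */
static int sketch_fixed_provider(unsigned sys_dev, sketch_distance_t *distance)
{
    (void)sys_dev;
    distance->latency   = 1e-7;
    distance->bandwidth = 1000.0;
    return 0;
}
```

With this shape, call sites need no error branch: a failing or missing provider simply yields the default distance.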
    ucp_worker_iface_add_distance(&wiface->attr, &distance);
}
    ucp_worker_iface_add_distance(&wiface->attr, &wiface->distance);
}
Why do we need to save memory_distance on the wiface?
It seems to be used only during initialization.

It will reduce the number of calls to ucs_topo_get_memory_distance in ucp_worker_iface_estimate_perf.

But ucp_worker_iface_estimate_perf still calls the UCT estimate perf,
and ucs_topo_get_memory_distance should be quite fast since the NUMA distances are saved in a hash in ucs/topo.

This is true. Will revert.
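The argument above is that a lookup layer which already remembers its results makes a second per-interface copy unnecessary. A minimal sketch of that idea follows; a fixed-size array stands in for the hash mentioned in the thread, and all names and values are hypothetical rather than taken from ucs/topo.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_MAX_DEVICES 16

/* Hypothetical cache; a plain array replaces the hash for brevity. */
typedef struct {
    int      valid[SKETCH_MAX_DEVICES];     /* nonzero once the entry is filled */
    double   bandwidth[SKETCH_MAX_DEVICES]; /* remembered result per device */
    unsigned slow_calls;                    /* how often the slow path ran */
} sketch_distance_cache_t;

/* Stand-in for the expensive query; the returned values are made up. */
static double sketch_slow_query(unsigned sys_dev)
{
    return 1000.0 * (sys_dev + 1);
}

/*
 * Returns the bandwidth for sys_dev. The slow query runs only on the first
 * request for a given device; later requests are served from the cache.
 */
static double sketch_cached_bandwidth(sketch_distance_cache_t *cache,
                                      unsigned sys_dev)
{
    assert((cache != NULL) && (sys_dev < SKETCH_MAX_DEVICES));

    if (!cache->valid[sys_dev]) {
        cache->bandwidth[sys_dev] = sketch_slow_query(sys_dev);
        cache->valid[sys_dev]     = 1;
        cache->slow_calls++;
    }

    return cache->bandwidth[sys_dev];
}
```

Repeated calls for the same device cost one array read after the first call, so storing the result again on each interface would save little.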
gleon99 left a comment:
@ofirfarjun7 please squash.
08ba262 to 91c1b25
What
Why ?
libnuma dependency
How ?
libnuma dependent code.
libnuma from UCX