Skip to content

Topic/topo numa distance part2#8853

Merged
gleon99 merged 1 commit intoopenucx:masterfrom
ofirfarjun7:topic/topo-numa-distance-part2
Apr 18, 2023
Merged

Topic/topo numa distance part2#8853
gleon99 merged 1 commit intoopenucx:masterfrom
ofirfarjun7:topic/topo-numa-distance-part2

Conversation

@ofirfarjun7
Copy link
Copy Markdown
Contributor

@ofirfarjun7 ofirfarjun7 commented Feb 7, 2023

What

  • Move numa latency calculations to UCP layer
  • Remove libnuma from UCX

Why ?

  • Remove UCX libnuma dependency

How ?

  • Move device memory latency calculation to UCP
  • Remove/Replace libnuma dependent code.
  • Remove libnuma from UCX

@ofirfarjun7 ofirfarjun7 added the WIP-DNM Work in progress / Do not review label Feb 7, 2023
@ofirfarjun7 ofirfarjun7 force-pushed the topic/topo-numa-distance-part2 branch from b77fd85 to 42ad98c Compare February 21, 2023 18:33
@ofirfarjun7 ofirfarjun7 removed the WIP-DNM Work in progress / Do not review label Feb 21, 2023
@ofirfarjun7 ofirfarjun7 requested a review from gleon99 February 21, 2023 18:34
@ofirfarjun7 ofirfarjun7 force-pushed the topic/topo-numa-distance-part2 branch from 42ad98c to e118212 Compare February 21, 2023 18:37
@ofirfarjun7 ofirfarjun7 added the WIP-DNM Work in progress / Do not review label Mar 19, 2023
@ofirfarjun7
Copy link
Copy Markdown
Contributor Author

ofirfarjun7 commented Mar 19, 2023

pls don't review until first part is merged.

@gleon99
Copy link
Copy Markdown
Contributor

gleon99 commented Mar 22, 2023

Please look for numa / NUMA across the repo and cleanup where possible, seems there are several places.

@ofirfarjun7 ofirfarjun7 force-pushed the topic/topo-numa-distance-part2 branch 2 times, most recently from 8e84160 to b60fb6d Compare April 8, 2023 10:32
@ofirfarjun7
Copy link
Copy Markdown
Contributor Author

ofirfarjun7 commented Apr 8, 2023

  • In the FAQ we have "Important note" saying that (if I understand correctly) numactl is a dependency of rdma-core, and as a result also dependency of ucx-ib. Is it still true?
    I've tried to run a docker without numactl installed to see if I can compile UCX but it complained that MLNX_OFED requires libnuma:
    "Error: One or more required packages for installing MLNX_OFED_LINUX are missing.
    Please install the missing packages using your Linux distribution Package Management tool.
    Run:
    yum install numactl-libs"
  • In numa.c and numa.h we have definitions of numa_policy, Do we still need it?
  • In ProtoV2 (proto_common.c) why we consider lane distance only for zcopy send?

@ofirfarjun7 ofirfarjun7 removed the WIP-DNM Work in progress / Do not review label Apr 8, 2023
@ofirfarjun7 ofirfarjun7 force-pushed the topic/topo-numa-distance-part2 branch from b60fb6d to 5a6feea Compare April 9, 2023 13:41
@ofirfarjun7 ofirfarjun7 requested a review from gleon99 April 9, 2023 13:43
@ofirfarjun7 ofirfarjun7 added WIP-DNM Work in progress / Do not review and removed WIP-DNM Work in progress / Do not review labels Apr 9, 2023
Comment on lines +214 to +210
print_row_separator(distance_width, name_width, num_devices, ' ', '|');
print_row_separator(distance_width, name_width, num_devices, '-', '+');
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you give example of the output?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you pls fix units/dev name?
see #8853 (comment)

unsigned num_devices = ucs_topo_num_devices();
static const int distance_width = 10;
const char *distance_unit = "MB/s";
unsigned num_devices = ucs_topo_num_devices();
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

const?

Comment on lines +214 to +210
print_row_separator(distance_width, name_width, num_devices, ' ', '|');
print_row_separator(distance_width, name_width, num_devices, '-', '+');
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you pls fix units/dev name?
see #8853 (comment)

ucp_worker_get_sys_device_memory_distance(ucp_worker_iface_t *wiface)
{
ucs_sys_dev_distance_t *distance = &wiface->memory_distance;
ucs_sys_device_t sys_dev = ucp_worker_get_sys_device(wiface);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe ucs_topo_get_memory_distance should return void?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can take it even further, maybe we need to add a constrain to the topo providers API definition to have fallback behavior?
This way both get_distance and get_memory_distance will return void...
But maybe in another PR?

I can do it for the memory_distance for now and add it to the API description.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let's do it for memory distance for now

ucp_worker_iface_add_distance(&wiface->attr, &distance);
}
ucp_worker_iface_add_distance(&wiface->attr, &wiface->distance);
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why need to save memory_distance on the wiface?
seems it's used only during initialization

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will reduce the calls to ucs_topo_get_memory_distance in ucp_worker_iface_estimate_perf

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

but ucp_worker_iface_estimate_perf still calls UCT estimate perf
and ucs_topo_get_memory_distance should be quite fast since the NUMA distances are saved in a hash in ucs/topo

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is true. Will revert.

Copy link
Copy Markdown
Contributor

@yosefe yosefe left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@gleon99 can you pls take another look?

yosefe
yosefe previously approved these changes Apr 18, 2023
gleon99
gleon99 previously approved these changes Apr 18, 2023
Copy link
Copy Markdown
Contributor

@gleon99 gleon99 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ofirfarjun7 please squash.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants