Skip to content

Conversation

@phip1611
Copy link
Member

@phip1611 phip1611 commented Jun 19, 2025

This is a split-out from #7150 for better review-ability. It contains pre-requisites that I consider as crucial and/or helpful for the planned changes but also universally.

Please review this commit by commit.

Steps to Undraft

@phip1611 phip1611 requested a review from a team as a code owner June 19, 2025 16:32
@phip1611 phip1611 force-pushed the network-fd-livemig-prerequisites branch 5 times, most recently from 1003f9c to 96df2a4 Compare June 19, 2025 17:56
@phip1611 phip1611 force-pushed the network-fd-livemig-prerequisites branch 2 times, most recently from 748d6b1 to a3567fb Compare June 19, 2025 18:21
@phip1611 phip1611 force-pushed the network-fd-livemig-prerequisites branch 2 times, most recently from ad8fe5e to 0fca3b0 Compare June 20, 2025 07:41
/// Once these valid file descriptors are received here, we integrate the actual
/// FDs into the virtual machine configuration, allowing the virtio-net device to
/// function correctly with its backing network resources.
fn apply_new_fds_to_cfg<T>(
Copy link
Member Author

@phip1611 phip1611 Jun 20, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a very cool and helpful abstraction for #7150.
Prevents a lot of code duplication and unifies logging and error handling across handlers!

@phip1611 phip1611 force-pushed the network-fd-livemig-prerequisites branch 2 times, most recently from ebb9a40 to 968e378 Compare June 20, 2025 13:23
Copy link
Member

@alyssais alyssais left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Support what you're trying to do here, but I have some comments :)

serialize_with = "serialize_restorednetconfig_fds",
deserialize_with = "deserialize_restorednetconfig_fds"
)]
#[serde(default, deserialize_with = "deserialize_restorednetconfig_fds")]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doesn't this mean we end up serializing the file descriptor numbers? I don't think that makes sense — it would be better to not emit them at all, with a proper deprecation process if necessary.

Copy link
Member Author

@phip1611 phip1611 Jun 23, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The commit message explained the why part. The reason here is that the "info" call (e.g. ch-remote --api-socket ... info returns something valid`.

I personally think this is fine. I don't have a strong opinion here.

@phip1611 phip1611 force-pushed the network-fd-livemig-prerequisites branch 5 times, most recently from 8705c22 to 015196f Compare June 23, 2025 13:43
@phip1611 phip1611 force-pushed the network-fd-livemig-prerequisites branch from 015196f to a502c4a Compare June 23, 2025 13:45
@phip1611 phip1611 force-pushed the network-fd-livemig-prerequisites branch from a502c4a to 45c0d6c Compare June 24, 2025 07:58
@likebreath likebreath self-requested a review June 24, 2025 17:45
phip1611 added 2 commits June 27, 2025 09:36
One can call `to_vec()` anyway if one needs an owned copy. This change
further helps to prevent needless copies in upcoming changes.

Signed-off-by: Philipp Schuster <[email protected]>
On-behalf-of: SAP [email protected]
@phip1611 phip1611 force-pushed the network-fd-livemig-prerequisites branch from 45c0d6c to 6050b08 Compare June 27, 2025 07:39
@phip1611 phip1611 force-pushed the network-fd-livemig-prerequisites branch 2 times, most recently from ba81699 to b191c44 Compare June 27, 2025 08:08
phip1611 added 6 commits June 27, 2025 11:00
To ease debugging of networking in the field, especially in context of
libvirt, state save/resume, and live-migration, more logging helps to
identify what happens behind the scenes in certain corner-cases as well
as (apparently) normal operation.

For example: There are multiple ways to create virtio-net devices:
- with vhost-user backend
- from provided network FDs
- from provided interface name of Tap device is given
- Tap device is created by CHV (fallback)

To confirm that the expected behavior occurs—especially in the more
complex case of network file descriptors—these logging statements
provide valuable insight into the system's internal operations.

Signed-off-by: Philipp Schuster <[email protected]>
On-behalf-of: SAP [email protected]
Live migration, state save/resume, and hotplug are not trivial when it
comes to virtio-net devices backed by network FDs. As the mechanism
behind it can be considered as quite "multi-step magic" even for
experienced programmers, it makes sense to thoroughly document this to
ease debugging and to improve the mental model of developers working on
this in the future.

Signed-off-by: Philipp Schuster <[email protected]>
On-behalf-of: SAP [email protected]
(1) The old messages are missing the "why" part. With this change, users
    of Cloud Hypervisor have somehow more context and people looking at
    the code perfectly know what's going on.

(2) Using warn! implies that the user should take action, but in this
    case, there’s nothing the user can do. If the API is used correctly
    and file descriptors are passed via an SCM_RIGHTS message over a
    UNIX domain socket, everything works as intended. In that case,
    there's no need to issue a warning — debug! is sufficient.

Signed-off-by: Philipp Schuster <[email protected]>
On-behalf-of: SAP [email protected]
Deserializing values as `-1` makes sense to prevent errors, so let's
keep it. However, serializing them differently adds confusion. For
example, a `ch-remote info` call should not report `-1` but the actual
FDs.

Signed-off-by: Philipp Schuster <[email protected]>
On-behalf-of: SAP [email protected]
This unifies error handling, the implementation, and logging. Otherwise,
we have code repetition, especially with the upcoming support for the
VmReceiveMigration patch.

Signed-off-by: Philipp Schuster <[email protected]>
On-behalf-of: SAP [email protected]
@phip1611 phip1611 force-pushed the network-fd-livemig-prerequisites branch from b191c44 to c88c5cc Compare June 27, 2025 09:00
@phip1611 phip1611 marked this pull request as draft June 27, 2025 11:50
@phip1611 phip1611 changed the title misc: pre-requisites for live-migration of virtio-net devices & network FDs [WIP] misc: pre-requisites for live-migration of virtio-net devices & network FDs Jun 27, 2025
@phip1611
Copy link
Member Author

phip1611 commented Jul 8, 2025

What started as a small PR has grown relatively large, so I'm closing it in favor of splitting it into multiple smaller, more manageable PRs for easier review.

Please note that I'll be on vacation from July 10th to July 30th.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants