@tpressure
Member

@tpressure tpressure commented Nov 13, 2025

Summary

This pull request introduces live disk resizing support for RAW images, so a disk's size can be adjusted while the guest is running. This is part of our Cloud Hypervisor OpenStack enablement.

Testing

@tpressure tpressure requested a review from a team as a code owner November 13, 2025 14:43
@tpressure tpressure force-pushed the disk_resize_upstream branch 4 times, most recently from 0086f70 to 0513eb9 Compare November 13, 2025 15:22
@tpressure tpressure force-pushed the disk_resize_upstream branch from 0513eb9 to 95c9352 Compare November 13, 2025 16:38
@tpressure tpressure requested a review from rbradford November 13, 2025 17:40
@tpressure tpressure force-pushed the disk_resize_upstream branch 8 times, most recently from 349270d to 3fe60fb Compare November 13, 2025 19:01
Member

@rbradford rbradford left a comment

I'm kinda impressed that Linux works with this! Will need an integration test.

@tpressure tpressure force-pushed the disk_resize_upstream branch from 3fe60fb to c835bec Compare November 13, 2025 19:16
@tpressure
Member Author

I'm kinda impressed that Linux works with this! Will need an integration test.

I will have a look at how these integration tests work soon.

@russell-islam
Contributor

Are we supporting both shrink and expand? Expanding is fine, but for shrinking, don't you see a possibility of corruption?

@tpressure
Member Author

Are we supporting both shrink and expand? Expanding is fine, but for shrinking, don't you see a possibility of corruption?

I tried to keep this compatible with QEMU. If you shrink the disk, the guest has to play nice, i.e. the part that is being removed must not be in use at the time. There's also this comment in the libvirt QEMU driver, which basically states the same: https://gitlab.com/libvirt/libvirt/-/blob/master/src/qemu/qemu_driver.c?ref_type=heads#L9402

Member

@rbradford rbradford left a comment

Thanks! I have no problems in principle with the feature/PR. Please try to add an integration test.

@tpressure tpressure force-pushed the disk_resize_upstream branch from c835bec to e2eb29f Compare December 15, 2025 07:43
This change is a prerequisite for live disk resizing. Before this
commit, the epoll-handler threads just got a copy of the sector
size which we cannot update during runtime.

On-behalf-of: SAP [email protected]
Signed-off-by: Thomas Prescher <[email protected]>
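
The commit message above describes replacing a per-thread copy of the disk geometry with something that can change at runtime. A minimal sketch of that idea, assuming the value is shared through an `Arc<AtomicU64>`; the names `EpollHandler`, `disk_nsectors`, `in_bounds`, and `check_on_worker` are hypothetical and not taken from the Cloud Hypervisor source:

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::Arc;
use std::thread;

/// Hypothetical handler state (an assumption for illustration only).
struct EpollHandler {
    // Shared with the device. A plain `u64` copy would go stale after a resize.
    disk_nsectors: Arc<AtomicU64>,
}

impl EpollHandler {
    /// Returns true if the request [sector, sector + count) fits within the
    /// disk size that is current at the time of the call.
    fn in_bounds(&self, sector: u64, count: u64) -> bool {
        let nsectors = self.disk_nsectors.load(Ordering::Acquire);
        // checked_add guards against overflow on hostile sector numbers.
        sector
            .checked_add(count)
            .map_or(false, |end| end <= nsectors)
    }
}

/// Runs the bounds check on a separate thread, as a worker would.
fn check_on_worker(shared: Arc<AtomicU64>, sector: u64, count: u64) -> bool {
    let handler = EpollHandler { disk_nsectors: shared };
    thread::spawn(move || handler.in_bounds(sector, count))
        .join()
        .unwrap()
}
```

With this shape, a resize only has to store the new value once, and every handler thread sees it on its next request.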
@tpressure tpressure force-pushed the disk_resize_upstream branch 2 times, most recently from a7bfb15 to 9440b87 Compare December 15, 2025 07:56
@tpressure
Member Author

I'm kinda impressed that Linux works with this! Will need an integration test.

@rbradford test has been added.

@tpressure tpressure force-pushed the disk_resize_upstream branch from 9440b87 to d18e840 Compare December 15, 2025 08:22
@phip1611
Member

Please be aware that since the latest QoL improvements (#7489), you can run cargo clippy locally fairly easily :)

@tpressure tpressure force-pushed the disk_resize_upstream branch from d18e840 to 9456396 Compare December 15, 2025 08:36
@phip1611
Member

phip1611 commented Dec 15, 2025

TL;DR: Open question: disk resizing and byte-range OFD locks?!

One last question: Since we have #7494: How does this work together with disk locking using byte-ranges? The old range will remain locked but that will not correspond to the new size.

From a Cloud Hypervisor perspective, as the locks are just advisory and overlapping, we can keep the old lock as it will in any case prevent further Cloud Hypervisor instances from locking again. But technically, other software could see regions of the file being unlocked.
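
To make the concern concrete, here is a small sketch of which bytes fall outside the existing lock after a resize. It assumes the lock was taken over `[0, old_size)` when the disk was opened; `ByteRange` and `unlocked_after_resize` are hypothetical names, and no actual locking call is made:

```rust
/// Half-open byte range [start, end).
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
struct ByteRange {
    start: u64,
    end: u64,
}

/// Given a lock covering [0, locked_len) and the image's new size, returns
/// the part of the image that the existing lock no longer covers, if any.
fn unlocked_after_resize(locked_len: u64, new_size: u64) -> Option<ByteRange> {
    if new_size > locked_len {
        // Growing: the tail beyond the old lock is visible as unlocked.
        Some(ByteRange { start: locked_len, end: new_size })
    } else {
        // Shrinking or unchanged: the old lock still covers the whole image.
        None
    }
}
```

Under that assumption, only growing the image exposes an unlocked region; after a shrink, the old lock simply extends past the end of the file.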

@tpressure
Member Author

tpressure commented Dec 16, 2025

TL;DR: Open question: disk resizing and byte-range OFD locks?!

One last question: Since we have #7494: How does this work together with disk locking using byte-ranges? The old range will remain locked but that will not correspond to the new size.

From a Cloud Hypervisor perspective, as the locks are just advisory and overlapping, we can keep the old lock as it will in any case prevent further Cloud Hypervisor instances from locking again. But technically, other software could see regions of the file being unlocked.

@phip1611 you are absolutely right. Would it be ok to do this in a follow-up PR?

@phip1611
Member

phip1611 commented Dec 16, 2025

@phip1611 you are absolutely right. Would it be ok to do this in a follow-up PR?

I think it would be sufficient to create a follow-up ticket. In fact, this also has some caveats. For example, next to this live resize functionality, in the static qcow2 case, the physical image can also grow transparently to the guest, if I'm not mistaken. No need to fix all of this here in this PR.

@tpressure
Member Author

@phip1611 you are absolutely right. Would it be ok to do this in a follow-up PR?

I think it would be sufficient to create a follow-up ticket. In fact, this also has some caveats. For example, next to this live resize functionality, in the static qcow2 case, the physical image can also grow transparently to the guest, if I'm not mistaken. No need to fix all of this here in this PR.

I've created #7569

Member

@rbradford rbradford left a comment

Just some nits.

Add basic infrastructure so resize events are
propagated to the underlying disk implementation.

On-behalf-of: SAP [email protected]
Signed-off-by: Thomas Prescher <[email protected]>
Support for resize events for raw_async disks.

On-behalf-of: SAP [email protected]
Signed-off-by: Thomas Prescher <[email protected]>
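
For a RAW image, the guest-visible size maps directly to the length of the backing file. The following is a minimal sketch of a resize at the file level, assuming it is done with `std::fs::File::set_len`; `resize_raw_image` is a hypothetical helper and this is not necessarily how the raw_async backend in this PR does it:

```rust
use std::fs::OpenOptions;
use std::io;
use std::path::Path;

/// Hypothetical helper: sets the raw image at `path` to `new_size` bytes and
/// returns the length reported by the file system afterwards.
fn resize_raw_image(path: &Path, new_size: u64) -> io::Result<u64> {
    let file = OpenOptions::new().write(true).open(path)?;
    // Growing extends the file with zeros; shrinking discards everything
    // beyond `new_size`, which is why the guest must not use that region.
    file.set_len(new_size)?;
    Ok(file.metadata()?.len())
}
```

Shrinking is the destructive direction here: whatever the guest had stored past the new end is gone once `set_len` returns.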
Support disk resizing via ch-remote and REST api.

On-behalf-of: SAP [email protected]
Signed-off-by: Thomas Prescher <[email protected]>
On-behalf-of: SAP [email protected]
Signed-off-by: Thomas Prescher <[email protected]>
This test verifies that we can grow and shrink
disk images during runtime.

On-behalf-of: SAP [email protected]
Signed-off-by: Thomas Prescher <[email protected]>
@tpressure tpressure force-pushed the disk_resize_upstream branch from 9456396 to 95d0235 Compare December 16, 2025 19:59
@tpressure tpressure requested a review from rbradford December 16, 2025 19:59
@rbradford rbradford added this pull request to the merge queue Dec 17, 2025
Merged via the queue into cloud-hypervisor:main with commit 736813f Dec 17, 2025
43 checks passed