Conversation

@russell-islam
Contributor

@russell-islam russell-islam commented Jun 18, 2025

Compared to QEMU, CLH block performance is worse at smaller block sizes. At those sizes multiple requests sit in the queue, and batching them improves I/O performance while not regressing the performance of larger block sizes. Attaching the FIO benchmarking here (SEQ read and SEQ write), comparing reg (the current implementation) with batch (this PR).
This pull request introduces batch request submission functionality to the asynchronous I/O subsystem, improves error handling, and refactors existing code to support the new feature. The changes primarily affect the block and virtio-devices modules, with updates to enums, traits, and implementations to enable caching and submission of batched I/O requests.

[seq-write FIO benchmark chart]
[Seq-Read FIO benchmark chart]
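
To make the mechanism concrete, here is a minimal, hedged sketch of what batching changes in the hot path: instead of submitting each request as it is parsed, requests are cached and then submitted in one go. All names below (BatchIo, IoRequest, ToyBackend, cache_request, submit_batch) are illustrative stand-ins, not the PR's actual API:

```rust
/// Illustrative request descriptor; a real one would carry buffers, etc.
#[derive(Debug, Clone, PartialEq)]
struct IoRequest {
    offset: u64,
    len: usize,
}

trait BatchIo {
    /// Cache a parsed request without submitting it yet.
    fn cache_request(&mut self, req: IoRequest);
    /// Submit all cached requests at once; returns how many were submitted.
    fn submit_batch(&mut self) -> usize;
}

/// Toy backend that just records submissions, standing in for an
/// io_uring-backed raw-file implementation.
struct ToyBackend {
    pending: Vec<IoRequest>,
    submitted: Vec<IoRequest>,
}

impl ToyBackend {
    fn new() -> Self {
        Self { pending: Vec::new(), submitted: Vec::new() }
    }
}

impl BatchIo for ToyBackend {
    fn cache_request(&mut self, req: IoRequest) {
        self.pending.push(req);
    }
    fn submit_batch(&mut self) -> usize {
        let n = self.pending.len();
        // A real backend would push all SQEs and call submit() once here.
        self.submitted.append(&mut self.pending);
        n
    }
}

fn main() {
    let mut dev = ToyBackend::new();
    // Parse the whole available queue first, caching each request...
    for i in 0..4u64 {
        dev.cache_request(IoRequest { offset: i * 4096, len: 4096 });
    }
    // ...then submit everything with a single call.
    let n = dev.submit_batch();
    println!("submitted {n} requests in one batch");
}
```

The win for small block sizes comes from paying the submission overhead once per queue drain rather than once per request.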

@russell-islam russell-islam requested a review from a team as a code owner June 18, 2025 03:59
@russell-islam russell-islam force-pushed the muislam/block-batch-req branch 2 times, most recently from 5f5cc77 to 6cd2808 Compare June 18, 2025 04:21
Member

@rbradford rbradford left a comment


Commits need some restructuring to make more sense.

@russell-islam russell-islam force-pushed the muislam/block-batch-req branch from 6cd2808 to 8a58caa Compare June 18, 2025 16:16
Member

@liuw liuw left a comment


Can you confirm each individual commit builds fine? We cannot break bisection.

I asked GitHub Copilot to write me a script to check. It is fine.

@russell-islam russell-islam force-pushed the muislam/block-batch-req branch 2 times, most recently from c228e3b to feca574 Compare July 10, 2025 21:34
@russell-islam russell-islam marked this pull request as draft July 10, 2025 22:29
@russell-islam russell-islam force-pushed the muislam/block-batch-req branch 3 times, most recently from cea5167 to 57ed9bf Compare July 11, 2025 23:16
@russell-islam russell-islam marked this pull request as ready for review July 14, 2025 19:23
@russell-islam
Contributor Author

@liuw @rbradford PTAL

liuw previously approved these changes Jul 21, 2025
Member

@liuw liuw left a comment


Judging from the numbers, this is an improvement over the status quo.

@russell-islam russell-islam force-pushed the muislam/block-batch-req branch from 57ed9bf to 873ca81 Compare July 21, 2025 20:33
@liuw liuw self-requested a review July 22, 2025 04:11
@liuw
Member

liuw commented Jul 22, 2025

There is one problem. Although each commit builds, they are not functional. This still breaks bisection.

For example, the first commit only collects the requests into a vector, but they are never submitted. If we run purely that commit, guest I/O will stall. This is still bad.

Given the complexity (or lack thereof) of this change, you should just squash everything into one commit.

@liuw liuw dismissed their stale review July 22, 2025 04:20

Commits break bisection

@russell-islam russell-islam force-pushed the muislam/block-batch-req branch 2 times, most recently from bc75b5c to 78ef10a Compare July 23, 2025 19:35
Member

@likebreath likebreath left a comment


@russell-islam Thank you for the good work and the detailed results. I like the idea and believe this is a good improvement to have.

Some comments below about the implementation. Also, can you share the comparison with QEMU before and after the change?

I am thinking such a feature should be useful for other async backends too, say the aio backend. If that's the case, it might be better added at a higher layer, perhaps by refactoring execute_async(). This would potentially avoid duplicating the logic across backends and also solve the thread-safety issue with the new struct. I still need to put more thought into it. Let me know your thoughts.

likebreath added a commit to likebreath/cloud-hypervisor that referenced this pull request Aug 1, 2025
The idea of request submission in batch was proposed and measured by
@russell-islam from cloud-hypervisor#7146. This presents a different design to support
such a feature, aiming to address the following issues from cloud-hypervisor#7146:

* Make the implementation thread safe
* Support multiple block backend
* Reduce heap allocation
* Properly handle errors from request submission in batch (todo)

Signed-off-by: Bo Chen <[email protected]>
@likebreath
Member

likebreath commented Aug 1, 2025

I am thinking such a feature should be useful for other async backends too, say the aio backend. If that's the case, it might be better added at a higher layer, perhaps by refactoring execute_async(). This would potentially avoid duplicating the logic across backends and also solve the thread-safety issue with the new struct. I still need to put more thought into it. Let me know your thoughts.

Okay. I did a quick PoC based on this idea and here is the implementation: https://github.com/likebreath/cloud-hypervisor/commits/0801/redesign_block_batch_submit/, particularly likebreath@97afe58 and likebreath@241483d. I tested with a raw file using the io_uring backend.

It tries to address the following issues (including ones raised in previous comments):

  • Make the implementation thread safe
  • Support multiple block backend
  • Reduce heap allocation
  • Properly handle errors from request submission in batch (see todo in comments)

@russell-islam If we agree on this design, I'd love to hand it over to you for further improvements and measurements. Please take a look and let me know your thoughts.
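
A minimal sketch of that higher-layer idea: a generic batcher owned by the queue-handling thread that sits above any AsyncIo backend, so io_uring, aio, and friends all benefit without duplicating the logic (all names here are illustrative, not the actual cloud-hypervisor types):

```rust
trait AsyncIoBackend {
    /// Submit a slice of (offset, len) requests; returns the number accepted.
    fn submit(&mut self, reqs: &[(u64, usize)]) -> usize;
}

/// Generic batcher owned by the queue-handling thread. Because it is not
/// shared between threads, no extra locking is needed (the thread-safety
/// concern raised above).
struct Batcher<B: AsyncIoBackend> {
    backend: B,
    pending: Vec<(u64, usize)>,
}

impl<B: AsyncIoBackend> Batcher<B> {
    fn new(backend: B) -> Self {
        // Pre-allocate once to reduce per-iteration heap allocation.
        Self { backend, pending: Vec::with_capacity(256) }
    }

    /// Called once per parsed request, in place of an immediate submit.
    fn queue(&mut self, offset: u64, len: usize) {
        self.pending.push((offset, len));
    }

    /// Called after draining the available ring: one submit for the lot.
    fn flush(&mut self) -> usize {
        if self.pending.is_empty() {
            return 0;
        }
        let n = self.backend.submit(&self.pending);
        self.pending.clear();
        n
    }
}

/// Stand-in backend that counts how many submit calls it receives.
struct CountingBackend {
    calls: usize,
}

impl AsyncIoBackend for CountingBackend {
    fn submit(&mut self, reqs: &[(u64, usize)]) -> usize {
        self.calls += 1; // one backend call per flush, however many requests
        reqs.len()
    }
}

fn main() {
    let mut b = Batcher::new(CountingBackend { calls: 0 });
    for i in 0..8u64 {
        b.queue(i * 4096, 4096);
    }
    let n = b.flush();
    println!("{} requests, {} backend call(s)", n, b.backend.calls);
}
```

Since the batcher only depends on the `AsyncIoBackend` trait, adding batch support to another backend means implementing one method rather than repeating the caching logic.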

likebreath added a commit to likebreath/cloud-hypervisor that referenced this pull request Aug 1, 2025
The idea of request submission in batch was proposed and measured by
@russell-islam from cloud-hypervisor#7146. This presents a different design to support
such feature, aiming to address the following issues from cloud-hypervisor#7146:

* Make the implementation thread safe
* Support multiple block backend
* Reduce heap allocation
* Properly handle errors from request submission in batch

Signed-off-by: Bo Chen <[email protected]>
likebreath added a commit to likebreath/cloud-hypervisor that referenced this pull request Aug 1, 2025
The idea of request submission in batch was proposed and measured by
@russell-islam from cloud-hypervisor#7146. This presents a different design to support
such feature, aiming to address the following issues from cloud-hypervisor#7146:

* Make the implementation thread safe
* Support multiple block backend
* Reduce heap allocation
* Properly handle errors from request submission in batch

Signed-off-by: Bo Chen <[email protected]>
@russell-islam
Contributor Author

@likebreath This week I am on call. I will resume this work next week.

@likebreath
Member

@likebreath This week I am on call. I will resume this work next week.

No problem. Thanks for the heads-up.

@likebreath
Member

CLH Command:

sudo ./cloud-hypervisor_reg --kernel ./bzImage --disk path=focal-server-cloudimg-amd64-ch.raw,direct=on path=big-disk-ch.img,direct=on --cmdline "console=hvc0 root=/dev/vda1 rw" --cpus boot=$1 --memory size=32G --net tap=ich0,mac=00:11:22:33:44:55,ip=192.168.4.2,mask=255.255.255.0 --api-socket /tmp/cloud-hypervisor.sock --console pty

The num_queues for the virtio-blk device should match --numjobs and the vCPU count. Otherwise, multiple I/O jobs compete for a single virtqueue, which is why you don't see performance improvements as --numjobs increases.

@russell-islam
Contributor Author

CLH Command:
sudo ./cloud-hypervisor_reg --kernel ./bzImage --disk path=focal-server-cloudimg-amd64-ch.raw,direct=on path=big-disk-ch.img,direct=on --cmdline "console=hvc0 root=/dev/vda1 rw" --cpus boot=$1 --memory size=32G --net tap=ich0,mac=00:11:22:33:44:55,ip=192.168.4.2,mask=255.255.255.0 --api-socket /tmp/cloud-hypervisor.sock --console pty

The num_queues for the virtio-blk device should match --numjobs and the vCPU count. Otherwise, multiple I/O jobs compete for a single virtqueue, which is why you don't see performance improvements as --numjobs increases.

If num_queues is not provided on the CLH command line, it defaults to the number of vCPUs, right? In the guest I used num-jobs equal to the number of CPUs. Did I miss anything?

@likebreath
Member

CLH Command:
sudo ./cloud-hypervisor_reg --kernel ./bzImage --disk path=focal-server-cloudimg-amd64-ch.raw,direct=on path=big-disk-ch.img,direct=on --cmdline "console=hvc0 root=/dev/vda1 rw" --cpus boot=$1 --memory size=32G --net tap=ich0,mac=00:11:22:33:44:55,ip=192.168.4.2,mask=255.255.255.0 --api-socket /tmp/cloud-hypervisor.sock --console pty

The num_queues for the virtio-blk device should match --numjobs and the vCPU count. Otherwise, multiple I/O jobs compete for a single virtqueue, which is why you don't see performance improvements as --numjobs increases.

If num_queues is not provided on the CLH command line, it defaults to the number of vCPUs, right? In the guest I used num-jobs equal to the number of CPUs. Did I miss anything?

I think the default is 1, with both the CLI and the HTTP API:

    pub const DEFAULT_DISK_NUM_QUEUES: usize = 1;

    pub fn default_diskconfig_num_queues() -> usize {
        DEFAULT_DISK_NUM_QUEUES
    }

@russell-islam
Contributor Author

[Seq_write benchmark chart]

@russell-islam
Contributor Author

[Seq_Read benchmark chart]

@russell-islam
Contributor Author

russell-islam commented Aug 27, 2025

@likebreath Numbers are way better now. We can improve further.

@russell-islam russell-islam force-pushed the muislam/block-batch-req branch from e6f93bc to c343843 Compare August 27, 2025 19:55
@russell-islam
Contributor Author

@rbradford Could you please take another look? It looks like your requested changes might block the merge.

@likebreath likebreath self-requested a review August 27, 2025 20:54
@likebreath likebreath dismissed their stale review August 27, 2025 20:54

Code changed

@likebreath
Member

@likebreath Numbers are way better now. We can still improve further.

Yes, I see it. It looks good and is an improvement for smaller block sizes.

One more finding: I believe we have now saturated the limit of the actual device on the host, around 2735 MB/s for write and 6600 MB/s for read. This happens for basically all tests with block sizes larger than 128 KB. With that in mind, the comparison won't provide much actual insight between batch vs. regular or CH vs. QEMU.

@russell-islam
Contributor Author

Code rebased.

@likebreath
Member

Code rebased.

@russell-islam I think you pushed the wrong commit: the two comments (#7146 (review)) you addressed were discarded by the current push.

russell-islam force-pushed the muislam/block-batch-req branch from e6f93bc to c343843

Instead of returning a boolean, return a struct of completion status
so that it can be cached for batch submission.

Signed-off-by: Bo Chen <[email protected]>
Signed-off-by: Muminul Islam <[email protected]>
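
As a hedged illustration of what that commit describes (type and field names here are guesses, not the actual cloud-hypervisor types), a completion result that can be cached might look like:

```rust
// Illustrative only: a cacheable submission status instead of a bare
// boolean, so the caller can defer the request into a batch.
#[derive(Debug, Clone, Copy, PartialEq)]
enum SubmitStatus {
    Submitted, // already handed to the backend
    Cached,    // held back for a later batch submit
}

#[derive(Debug, Clone, Copy, PartialEq)]
struct CompletionStatus {
    user_data: u64, // tag identifying the request
    status: SubmitStatus,
}

fn main() {
    // A boolean could only say "done or not"; a struct carries enough
    // context to submit or complete the request later.
    let c = CompletionStatus { user_data: 7, status: SubmitStatus::Cached };
    println!("request {} -> {:?}", c.user_data, c.status);
}
```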
Cache and batch IO requests after parsing all
items in the queue, improving performance, especially
for small block sizes, by reducing per-request overhead.

Introduce two methods in the AsyncIo trait for batch
submission, with an implementation in the raw disk backend.
These methods should be called during/after parsing all block IO
requests in the available queue. If batch submission is not
enabled, requests are submitted the old way by default.

Signed-off-by: Bo Chen <[email protected]>
Signed-off-by: Muminul Islam <[email protected]>
Implement the batch submission function for raw disk; it is
enabled by default. After parsing the requests, this method is
called for better IO latency and bandwidth.

Signed-off-by: Bo Chen <[email protected]>
Signed-off-by: Muminul Islam <[email protected]>
Updated VHD async implementation to call the batch submit
method via the raw async IO layer.

Signed-off-by: Muminul Islam <[email protected]>
@russell-islam russell-islam force-pushed the muislam/block-batch-req branch from c343843 to f1281fe Compare August 27, 2025 21:53
@russell-islam
Contributor Author

russell-islam commented Aug 27, 2025

Code rebased.

@russell-islam I think you pushed the wrong commit: the two comments (#7146 (review)) you addressed were discarded by the current push.

russell-islam force-pushed the muislam/block-batch-req branch from e6f93bc to c343843

Thank you. Sorry, too many environments to develop and test on my side. Fixed now.

@rbradford rbradford dismissed their stale review August 28, 2025 12:30

Code updated.

@likebreath likebreath added this pull request to the merge queue Aug 29, 2025
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Aug 29, 2025
@likebreath likebreath added this pull request to the merge queue Sep 2, 2025
Merged via the queue into cloud-hypervisor:main with commit a9d6807 Sep 2, 2025
40 checks passed
@likebreath likebreath moved this from 🆕 New to ✅ Done in Cloud Hypervisor Roadmap Sep 10, 2025
