@liuw
Member

@liuw liuw commented Dec 21, 2024

Significant improvements in block device performance across the board.

With this new feature:

Test 'block_read_MiBps' running .. (control: test_timeout = 10s, test_iterations = 5, num_queues = 1, queue_size = 128, fio_ops = read, bandwidth = true, overrides: )
Test 'block_read_MiBps' .. ok: mean = 1041.1836760485135, std_dev = 55.7625072234085
Test 'block_write_MiBps' running .. (control: test_timeout = 10s, test_iterations = 5, num_queues = 1, queue_size = 128, fio_ops = write, bandwidth = true, overrides: )
Test 'block_write_MiBps' .. ok: mean = 630.6574363756009, std_dev = 58.07112046007685
Test 'block_random_read_MiBps' running .. (control: test_timeout = 10s, test_iterations = 5, num_queues = 1, queue_size = 128, fio_ops = randread, bandwidth = true, overrides: )
Test 'block_random_read_MiBps' .. ok: mean = 1049.6611271606039, std_dev = 12.955092699638199
Test 'block_random_write_MiBps' running .. (control: test_timeout = 10s, test_iterations = 5, num_queues = 1, queue_size = 128, fio_ops = randwrite, bandwidth = true, overrides: )
Test 'block_random_write_MiBps' .. ok: mean = 649.828933515884, std_dev = 15.113753535445566
Test 'block_multi_queue_read_MiBps' running .. (control: test_timeout = 10s, test_iterations = 5, num_queues = 2, queue_size = 128, fio_ops = read, bandwidth = true, overrides: )
Test 'block_multi_queue_read_MiBps' .. ok: mean = 1045.9945313128374, std_dev = 5.687116997857165
Test 'block_multi_queue_write_MiBps' running .. (control: test_timeout = 10s, test_iterations = 5, num_queues = 2, queue_size = 128, fio_ops = write, bandwidth = true, overrides: )
Test 'block_multi_queue_write_MiBps' .. ok: mean = 1001.8954735340525, std_dev = 17.82514962812694
Test 'block_multi_queue_random_read_MiBps' running .. (control: test_timeout = 10s, test_iterations = 5, num_queues = 2, queue_size = 128, fio_ops = randread, bandwidth = true, overrides: )
Test 'block_multi_queue_random_read_MiBps' .. ok: mean = 1029.3417347532672, std_dev = 8.32247648480339
Test 'block_multi_queue_random_write_MiBps' running .. (control: test_timeout = 10s, test_iterations = 5, num_queues = 2, queue_size = 128, fio_ops = randwrite, bandwidth = true, overrides: )
Test 'block_multi_queue_random_write_MiBps' .. ok: mean = 771.0545246611134, std_dev = 5.895151767945272
Test 'block_read_IOPS' running .. (control: test_timeout = 10s, test_iterations = 5, num_queues = 1, queue_size = 128, fio_ops = read, bandwidth = false, overrides: )
Test 'block_read_IOPS' .. ok: mean = 272408.07766566763, std_dev = 6170.61954484717
Test 'block_write_IOPS' running .. (control: test_timeout = 10s, test_iterations = 5, num_queues = 1, queue_size = 128, fio_ops = write, bandwidth = false, overrides: )
Test 'block_write_IOPS' .. ok: mean = 164097.1681134429, std_dev = 7045.576954582243
Test 'block_random_read_IOPS' running .. (control: test_timeout = 10s, test_iterations = 5, num_queues = 1, queue_size = 128, fio_ops = randread, bandwidth = false, overrides: )
Test 'block_random_read_IOPS' .. ok: mean = 270026.4975599876, std_dev = 3818.1133536678954
Test 'block_random_write_IOPS' running .. (control: test_timeout = 10s, test_iterations = 5, num_queues = 1, queue_size = 128, fio_ops = randwrite, bandwidth = false, overrides: )
Test 'block_random_write_IOPS' .. ok: mean = 167925.33109767124, std_dev = 4540.00892989962
Test 'block_multi_queue_read_IOPS' running .. (control: test_timeout = 10s, test_iterations = 5, num_queues = 2, queue_size = 128, fio_ops = read, bandwidth = false, overrides: )
Test 'block_multi_queue_read_IOPS' .. ok: mean = 267630.6297761863, std_dev = 1144.5182710512386
Test 'block_multi_queue_write_IOPS' running .. (control: test_timeout = 10s, test_iterations = 5, num_queues = 2, queue_size = 128, fio_ops = write, bandwidth = false, overrides: )
Test 'block_multi_queue_write_IOPS' .. ok: mean = 249247.9256193595, std_dev = 2903.4336638199043
Test 'block_multi_queue_random_read_IOPS' running .. (control: test_timeout = 10s, test_iterations = 5, num_queues = 2, queue_size = 128, fio_ops = randread, bandwidth = false, overrides: )
Test 'block_multi_queue_random_read_IOPS' .. ok: mean = 261706.90908901737, std_dev = 1360.5256601287383
Test 'block_multi_queue_random_write_IOPS' running .. (control: test_timeout = 10s, test_iterations = 5, num_queues = 2, queue_size = 128, fio_ops = randwrite, bandwidth = false, overrides: )
Test 'block_multi_queue_random_write_IOPS' .. ok: mean = 196864.48090243965, std_dev = 1069.8654182837279

Without:

Test 'block_read_MiBps' running .. (control: test_timeout = 10s, test_iterations = 5, num_queues = 1, queue_size = 128, fio_ops = read, bandwidth = true, overrides: )
Test 'block_read_MiBps' .. ok: mean = 485.82088019858895, std_dev = 1.7447280725526355
Test 'block_write_MiBps' running .. (control: test_timeout = 10s, test_iterations = 5, num_queues = 1, queue_size = 128, fio_ops = write, bandwidth = true, overrides: )
Test 'block_write_MiBps' .. ok: mean = 487.8825615445379, std_dev = 31.092624291944347
Test 'block_random_read_MiBps' running .. (control: test_timeout = 10s, test_iterations = 5, num_queues = 1, queue_size = 128, fio_ops = randread, bandwidth = true, overrides: )
Test 'block_random_read_MiBps' .. ok: mean = 155.45657135073682, std_dev = 0.5834015665376312
Test 'block_random_write_MiBps' running .. (control: test_timeout = 10s, test_iterations = 5, num_queues = 1, queue_size = 128, fio_ops = randwrite, bandwidth = true, overrides: )
Test 'block_random_write_MiBps' .. ok: mean = 190.13974678009396, std_dev = 2.979427312202239
Test 'block_multi_queue_read_MiBps' running .. (control: test_timeout = 10s, test_iterations = 5, num_queues = 2, queue_size = 128, fio_ops = read, bandwidth = true, overrides: )
Test 'block_multi_queue_read_MiBps' .. ok: mean = 882.0916764400447, std_dev = 26.671466624352554
Test 'block_multi_queue_write_MiBps' running .. (control: test_timeout = 10s, test_iterations = 5, num_queues = 2, queue_size = 128, fio_ops = write, bandwidth = true, overrides: )
Test 'block_multi_queue_write_MiBps' .. ok: mean = 865.7831747164018, std_dev = 108.55131155781514
Test 'block_multi_queue_random_read_MiBps' running .. (control: test_timeout = 10s, test_iterations = 5, num_queues = 2, queue_size = 128, fio_ops = randread, bandwidth = true, overrides: )
Test 'block_multi_queue_random_read_MiBps' .. ok: mean = 166.38560575436063, std_dev = 1.18464999551714
Test 'block_multi_queue_random_write_MiBps' running .. (control: test_timeout = 10s, test_iterations = 5, num_queues = 2, queue_size = 128, fio_ops = randwrite, bandwidth = true, overrides: )
Test 'block_multi_queue_random_write_MiBps' .. ok: mean = 185.04609699916395, std_dev = 4.761066952685287
Test 'block_read_IOPS' running .. (control: test_timeout = 10s, test_iterations = 5, num_queues = 1, queue_size = 128, fio_ops = read, bandwidth = false, overrides: )
Test 'block_read_IOPS' .. ok: mean = 124633.69637797368, std_dev = 1161.9433928482986
Test 'block_write_IOPS' running .. (control: test_timeout = 10s, test_iterations = 5, num_queues = 1, queue_size = 128, fio_ops = write, bandwidth = false, overrides: )
Test 'block_write_IOPS' .. ok: mean = 128261.38608432449, std_dev = 436.4849919404479
Test 'block_random_read_IOPS' running .. (control: test_timeout = 10s, test_iterations = 5, num_queues = 1, queue_size = 128, fio_ops = randread, bandwidth = false, overrides: )
Test 'block_random_read_IOPS' .. ok: mean = 39719.41820628812, std_dev = 198.91780571227557
Test 'block_random_write_IOPS' running .. (control: test_timeout = 10s, test_iterations = 5, num_queues = 1, queue_size = 128, fio_ops = randwrite, bandwidth = false, overrides: )
Test 'block_random_write_IOPS' .. ok: mean = 48465.39677010305, std_dev = 1094.3443985519684
Test 'block_multi_queue_read_IOPS' running .. (control: test_timeout = 10s, test_iterations = 5, num_queues = 2, queue_size = 128, fio_ops = read, bandwidth = false, overrides: )
Test 'block_multi_queue_read_IOPS' .. ok: mean = 234775.41672233236, std_dev = 6278.984603821063
Test 'block_multi_queue_write_IOPS' running .. (control: test_timeout = 10s, test_iterations = 5, num_queues = 2, queue_size = 128, fio_ops = write, bandwidth = false, overrides: )
Test 'block_multi_queue_write_IOPS' .. ok: mean = 231800.0319106508, std_dev = 27757.126228905294
Test 'block_multi_queue_random_read_IOPS' running .. (control: test_timeout = 10s, test_iterations = 5, num_queues = 2, queue_size = 128, fio_ops = randread, bandwidth = false, overrides: )
Test 'block_multi_queue_random_read_IOPS' .. ok: mean = 42492.338109083124, std_dev = 520.181025050384
Test 'block_multi_queue_random_write_IOPS' running .. (control: test_timeout = 10s, test_iterations = 5, num_queues = 2, queue_size = 128, fio_ops = randwrite, bandwidth = false, overrides: )
Test 'block_multi_queue_random_write_IOPS' .. ok: mean = 47620.581976033485, std_dev = 1079.9372604029465
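
To make the two runs easier to compare, the speedup for each test can be computed as the ratio of the reported means. The following is a minimal sketch using the single-queue bandwidth means from the logs above, rounded to two decimals; `speedup` and `summarize` are hypothetical helper names, not part of the performance test harness.

```rust
/// Ratio of the mean with the feature to the mean without it.
/// Hypothetical helper, not part of the test harness.
fn speedup(with_feature: f64, without_feature: f64) -> f64 {
    with_feature / without_feature
}

/// (test name, mean with the feature, mean without), in MiB/s,
/// rounded from the logs above.
const SINGLE_QUEUE_MIBPS: [(&str, f64, f64); 4] = [
    ("block_read_MiBps", 1041.18, 485.82),
    ("block_write_MiBps", 630.66, 487.88),
    ("block_random_read_MiBps", 1049.66, 155.46),
    ("block_random_write_MiBps", 649.83, 190.14),
];

/// Builds one summary line per test, formatted as "<name>: <ratio>x".
fn summarize() -> Vec<String> {
    SINGLE_QUEUE_MIBPS
        .iter()
        .map(|(name, with, without)| {
            format!("{}: {:.2}x", name, speedup(*with, *without))
        })
        .collect()
}
```

With these numbers, single-queue sequential read roughly doubles, while single-queue random read improves by a factor of about 6.75.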

Cc @russell-islam @xietou

@liuw liuw requested a review from a team as a code owner December 21, 2024 01:40
@liuw
Member Author

liuw commented Dec 21, 2024

The VHDX test on ARM64 is broken by this. Need to investigate.

@rbradford
Member

> The VHDX test on ARM64 is broken by this. Need to investigate.

And on x86-64 too - so I don't think it's architecture specific!

@liuw
Member Author

liuw commented Dec 22, 2024

> > The VHDX test on ARM64 is broken by this. Need to investigate.
>
> And on x86-64 too - so I don't think it's architecture specific!

Just noticed that.

The QCOW tests are passing, so this is probably a latent bug in the VHDX implementation.

@liuw
Member Author

liuw commented Dec 24, 2024

I will resend this PR after #6890 is merged.

@liuw
Member Author

liuw commented Dec 25, 2024

There is another check we should add to all the virtio device config validation functions: the queue size should be a power of 2.

This PR introduces a new InvalidQueueSize error. It can be used later for the new check.
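
A minimal sketch of what such a power-of-two check could look like is shown below. The `InvalidQueueSize` name comes from the comment above, but the enum wrapping it, its payload, and the function name `validate_queue_size_pow2` are assumptions made for illustration and may not match the actual code.

```rust
/// Hypothetical error type. The PR mentions an `InvalidQueueSize` error,
/// but its exact shape is assumed here.
#[derive(Debug, PartialEq, Eq)]
enum ConfigError {
    InvalidQueueSize(u16),
}

/// Rejects queue sizes that are not a power of two.
/// `u16::is_power_of_two` returns false for 0, so 0 is rejected as well.
fn validate_queue_size_pow2(queue_size: u16) -> Result<(), ConfigError> {
    if queue_size.is_power_of_two() {
        Ok(())
    } else {
        Err(ConfigError::InvalidQueueSize(queue_size))
    }
}
```

For example, the `queue_size = 128` used in the benchmarks would pass, while a value such as 100 would be rejected.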

@liuw liuw force-pushed the blk-seg-max branch 2 times, most recently from c69522a to ba31f83 on December 29, 2024 07:49
@liuw
Member Author

liuw commented Dec 31, 2024

@cloud-hypervisor/cloud-hypervisor-reviewers any more comments on this PR?

@russell-islam
Contributor

LGTM

Member

@likebreath likebreath left a comment

Good work. I can see the single-queue tests benefit more than the multi-queue tests, which is kind of expected.

For the random read/write tests, they are seeing the most significant improvements, and basically matching up with the sequential tests performance. Do you think this is expected?

physical_block_exp,
min_io_size: (topology.minimum_io_size / logical_block_size) as u16,
opt_io_size: (topology.optimal_io_size / logical_block_size) as u32,
seg_max: (queue_size - 2) as u32,
Member

Can you please explain why the seg_max is set to this value?

Member Author

A request consists of at least one out header and one in header, IIRC. Whatever is left in the queue can be used for data segments.
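
The reasoning above, combined with the queue-size check described in the commit message, can be sketched as follows. This is an illustration under assumptions, not the code from the PR: the struct form of `InvalidQueueSize`, the constant `RESERVED_DESCRIPTORS`, and the function name `seg_max_for_queue` are all hypothetical.

```rust
/// Hypothetical error type; the real `InvalidQueueSize` may look different.
#[derive(Debug, PartialEq, Eq)]
struct InvalidQueueSize(u16);

/// Descriptors assumed to be reserved per request: one device-readable
/// request header and one device-writable status byte.
const RESERVED_DESCRIPTORS: u16 = 2;

/// Computes `seg_max` as the number of descriptors left for data segments.
/// Fails if the queue cannot hold at least one data segment.
fn seg_max_for_queue(queue_size: u16) -> Result<u32, InvalidQueueSize> {
    if queue_size <= RESERVED_DESCRIPTORS {
        return Err(InvalidQueueSize(queue_size));
    }
    Ok(u32::from(queue_size - RESERVED_DESCRIPTORS))
}
```

With the `queue_size = 128` used in the benchmarks, this gives a `seg_max` of 126.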

Member

Thank you for the explanation. Can you please share some pointers for more detailed context? I always find the virtio spec way too concise to understand by itself.

Member Author

I looked at the QEMU code. It has been like that since the beginning, with no explanation.

The closest I can find is virtblk_add_req in Linux.
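
As a rough model of the layout that a guest driver builds for one request, the sketch below constructs a descriptor chain: a device-readable header first, then the data segments, then a device-writable status byte. It assumes direct (non-indirect) descriptors, so each entry occupies one queue slot. The types `Op` and `Desc` and the function `build_chain` are hypothetical and only illustrate the ordering.

```rust
/// Direction of a block request, as seen by the guest driver.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Op {
    Read,
    Write,
}

/// One descriptor in a request chain (illustrative model only).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Desc {
    OutHeader, // device-readable request header
    DataOut,   // device-readable data (write requests)
    DataIn,    // device-writable data (read requests)
    InStatus,  // device-writable status byte
}

/// Builds the chain for one request with `data_segments` data descriptors.
/// Device-readable descriptors always come before device-writable ones.
fn build_chain(op: Op, data_segments: usize) -> Vec<Desc> {
    let data = match op {
        Op::Write => Desc::DataOut,
        Op::Read => Desc::DataIn,
    };
    let mut chain = Vec::with_capacity(data_segments + 2);
    chain.push(Desc::OutHeader);
    chain.extend(std::iter::repeat(data).take(data_segments));
    chain.push(Desc::InStatus);
    chain
}
```

Under this model, a request with 126 data segments uses exactly 128 descriptors, which matches a queue size of 128 and a `seg_max` of `queue_size - 2`.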

@liuw
Member Author

liuw commented Jan 1, 2025

> For the random read/write tests, they are seeing the most significant improvements, and basically matching up with the sequential tests performance. Do you think this is expected?

My only expectation is the performance will improve a lot. I cannot say one way or another whether random rws can be better or worse than seq rws.

liuw added 2 commits January 1, 2025 02:06
This allows the guest to put in more than one segment per request. It
can improve the throughput of the system.

Introduce a new check to make sure the queue size configured by the user
is large enough to hold at least one segment.

Signed-off-by: Wei Liu <[email protected]>

The size was set to one because without VIRTIO_BLK_F_SEG_MAX, the guest
only used one data descriptor per request.

The value 32 is empirically derived from booting a guest. This value
eliminates all SmallVec allocations observable by DHAT.

Signed-off-by: Wei Liu <[email protected]>
@likebreath
Member

> > For the random read/write tests, they are seeing the most significant improvements, and basically matching up with the sequential tests performance. Do you think this is expected?
>
> My only expectation is the performance will improve a lot. I cannot say one way or another whether random rws can be better or worse than seq rws.

Yes, the performance improvements are substantial and awesome. No doubt on that.

It was the random read/write results matching the sequential read/write results across the board that puzzled me. I wanted to see if you have any insights. (I always thought random reads/writes were supposed to be much slower.)

@likebreath
Member

@TimePrinciple @rbradford The risc-v runner is offline. Would you please take a look? Thanks.

@likebreath likebreath added this pull request to the merge queue Jan 1, 2025
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Jan 1, 2025
@rbradford
Member

test_virtio_block_vhdx is failing on musl on AMD - this might just be a timing flake

@rbradford rbradford added this pull request to the merge queue Jan 1, 2025
Merged via the queue into cloud-hypervisor:main with commit 1f7b809 Jan 1, 2025
34 of 38 checks passed
@liuw liuw deleted the blk-seg-max branch January 1, 2025 20:26
@TimePrinciple
Member

TimePrinciple commented Jan 2, 2025

> @TimePrinciple @rbradford The risc-v runner is offline. Would you please take a look? Thanks.

I've synced the message in our Slack channel, and unfortunately it looks like the process is going to take much longer than expected (our lab is building a new server room). I have moved that machine to my office and got it online for now (it will be moved into the server room when that is completed) 🙂

6 participants