Opportunistically use dense iteration for archetypal iteration #14049

cart merged 13 commits into bevyengine:main from
Conversation
This PR also unblocks some compiler auto-SIMD; consider the following code:
    accum =
    // SAFETY: Matched table IDs are guaranteed to still exist.
    let table = unsafe { self.tables.get(archetype.table_id()).debug_checked_unwrap() };
    if table.entity_count() == archetype.len() {
I think it would be worthwhile to add a comment explaining that this is an optimization for hybrid queries where the archetype is actually dense.
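To illustrate the check being discussed, here is a hedged toy sketch (plain structs, not Bevy's actual `Table`/`Archetype` types): when every row in the table belongs to the matched archetype, the archetype is dense over that table and iteration can walk the contiguous column directly instead of indirecting through the archetype's entity list.

```rust
// Toy model illustrating the dense fast path. `Table` and `Archetype`
// here are hypothetical simplifications, not Bevy's real types.

struct Table {
    rows: Vec<f32>, // one component column, stored contiguously
}

struct Archetype {
    // Row indices into the table for the entities this archetype owns.
    table_rows: Vec<usize>,
}

fn sum_query(table: &Table, archetype: &Archetype) -> f32 {
    if table.rows.len() == archetype.table_rows.len() {
        // Dense fast path: the archetype covers the whole table, so the
        // matched rows are exactly 0..len and we can walk the slice in order.
        table.rows.iter().sum()
    } else {
        // Sparse path: gather through the archetype's row indices.
        archetype.table_rows.iter().map(|&r| table.rows[r]).sum()
    }
}

fn main() {
    let table = Table { rows: vec![1.0, 2.0, 3.0] };
    let dense = Archetype { table_rows: vec![0, 1, 2] };
    let sparse = Archetype { table_rows: vec![0, 2] };
    assert_eq!(sum_query(&table, &dense), 6.0);
    assert_eq!(sum_query(&table, &sparse), 4.0);
    println!("ok");
}
```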
Does this fully fix #2144? I suppose not, since it only optimizes `for_each`.
Yeah, optimizing …
Sounds good. Let's leave that to follow-up then.
    v.push(world.spawn(TableData(0.)).id());
    }

    v.shuffle(&mut deterministic_rand());
    v.shuffle(&mut deterministic_rand());
I think this should have a comment to explain why you're doing it.
If I understand correctly you have archetypes
A: [TableData, SparseData]
B: [TableData]
Entities are added in this order:
e0: A
e1: B
e2: A
e3: B
...
So that the order of entities is
Table: [e0, e1, e2, ...]
A: [e0, e2, ..]
B: [e1, e3, ..]
Then if you just despawned all the entities of B in order, you would remove e1, then e3, etc.
Each despawn call would do a swap_remove in both the archetype and the table.
So the order of entities in the table would be somewhat jumbled compared with the order in A, which is still [e0, e2, ...]
By adding the shuffling, you want to guarantee that their orders are very different, which maximizes the benefit of your PR (which keeps cache locality by iterating in table order instead of archetype order).
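The jumbling described above can be demonstrated with plain `Vec`s (a toy stand-in for Bevy's table storage, not its real implementation): removing B's entities via `swap_remove` moves the last element into each vacated slot, so the surviving entities end up in a different order than archetype A's list.

```rust
// Toy demonstration of why despawning via swap_remove leaves the table
// order different from the archetype order.

fn main() {
    // Table holds entities of both archetypes, interleaved:
    let mut table = vec!["e0", "e1", "e2", "e3", "e4", "e5"];
    // Archetype A owns the even entities, in insertion order:
    let archetype_a = vec!["e0", "e2", "e4"];

    // Despawn B's entities (e1, e3, e5) from the table with swap_remove:
    for victim in ["e1", "e3", "e5"] {
        let pos = table.iter().position(|&e| e == victim).unwrap();
        table.swap_remove(pos); // moves the last element into `pos`
    }

    // The table now holds only A's entities, but jumbled relative to A:
    println!("{:?}", table); // prints ["e0", "e4", "e2"]
    assert_ne!(table, archetype_a); // table order != archetype order
}
```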
Could you please explain why this code is faster? In this case the entities should be iterated in the same order before/after your change, no?
It would theoretically have the same cache locality. However, dense iteration tells the compiler that we are iterating over contiguous memory, which can enable automatic SIMD optimizations. This could potentially make the operation nearly 4 times faster.
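The two iteration shapes being contrasted can be sketched as follows (a standalone illustration, not Bevy's code): over a contiguous slice the compiler can prove the accesses are sequential and is free to auto-vectorize the loop, whereas gathering through an index list produces the same result but is much harder to vectorize.

```rust
// Dense, contiguous iteration: a candidate for automatic SIMD.
fn sum_dense(data: &[f32]) -> f32 {
    data.iter().sum()
}

// Indirect gather through an index list: same result, but the
// compiler generally cannot prove the accesses are sequential.
fn sum_sparse(data: &[f32], indices: &[usize]) -> f32 {
    indices.iter().map(|&i| data[i]).sum()
}

fn main() {
    let data: Vec<f32> = (0..8).map(|i| i as f32).collect();
    let indices: Vec<usize> = (0..8).collect();
    assert_eq!(sum_dense(&data), sum_sparse(&data, &indices));
    println!("sum = {}", sum_dense(&data)); // sum = 28
}
```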
ItsDoot
left a comment
Not an ECS expert, but the code quality overall looks fine.
Co-authored-by: Alice Cecile <[email protected]>
Co-authored-by: Christian Hughes <[email protected]>
…#14673)

# Objective

- Follow-up to #14049: we can use the same dense fast path in our Parallel Iterator. This PR also unifies the function used by both regular and parallel iteration.

## Performance

No performance regression for regular iteration; 3.5x faster in hybrid parallel iteration. This number is far greater than the benefit obtained in regular iteration (~1.81x) because mutable iteration over contiguous memory effectively reduces the cost of maintaining core cache coherence.
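The cache-coherence point above can be sketched with standard-library threads (a hypothetical stand-in for Bevy's parallel iterator, which works differently): splitting a contiguous table into disjoint chunks lets each worker scan its own cache lines sequentially, so cores do not contend over shared cache lines.

```rust
// Sketch of chunked parallel iteration over contiguous data using
// std::thread::scope. This is an illustration of the idea only, not
// Bevy's QueryParIter implementation.
use std::thread;

fn parallel_sum(data: &[u64], workers: usize) -> u64 {
    let chunk = data.len().div_ceil(workers);
    thread::scope(|s| {
        data.chunks(chunk)
            // Each worker sums its own contiguous chunk.
            .map(|c| s.spawn(move || c.iter().sum::<u64>()))
            .collect::<Vec<_>>()
            .into_iter()
            .map(|h| h.join().unwrap())
            .sum()
    })
}

fn main() {
    let data: Vec<u64> = (1..=100).collect();
    assert_eq!(parallel_sum(&data, 4), 5050);
    println!("ok");
}
```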
|
Thank you to everyone involved with the authoring or reviewing of this PR! This work is relatively important and needs release notes! Head over to bevyengine/bevy-website#1667 if you'd like to help out.

Objective

Solution

- … `for_each`-style iteration.

Performance

Nearly 2x win in specific scenarios, and no performance degradation in other test cases.