Parallel Transform Propagation by aevyrie · Pull Request #17840 · bevyengine/bevy

aevyrie · 2025-02-13T06:35:49Z

Objective

Make transform propagation faster.

Solution

Work sharing worker threads
Parallel tree traversal excluding leaves
Second cache friendly wide pass over all leaves
3-10x faster than main
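
To illustrate the two-pass shape described above, here is a minimal, std-only sketch. It is not the code from this PR: `Tree`, `propagate_inner`, and `propagate` are hypothetical names, a "transform" is reduced to a single `f32` offset, and one scoped thread per root stands in for the work-sharing thread pool. The first pass walks only nodes that have children; the second pass is a flat loop over leaves that reads each parent's already-computed global value.

```rust
use std::thread;

/// Hypothetical stand-in for a transform hierarchy. A "transform" here is
/// just a 1D translation so that propagation is a simple addition.
struct Tree {
    parent: Vec<Option<usize>>,
    children: Vec<Vec<usize>>,
    local: Vec<f32>,
}

/// Pass 1 helper: walk one root's subtree depth-first and compute globals for
/// the root and every descendant that has children. Leaves are skipped.
fn propagate_inner(tree: &Tree, root: usize) -> Vec<(usize, f32)> {
    let mut out = vec![(root, tree.local[root])];
    let mut stack = vec![(root, tree.local[root])];
    while let Some((node, global)) = stack.pop() {
        for &child in &tree.children[node] {
            if tree.children[child].is_empty() {
                continue; // leaf: left for the second pass
            }
            let child_global = global + tree.local[child];
            out.push((child, child_global));
            stack.push((child, child_global));
        }
    }
    out
}

/// Computes a global value for every node in two passes.
fn propagate(tree: &Tree) -> Vec<f32> {
    let n = tree.local.len();
    let mut global = vec![0.0_f32; n];
    let roots: Vec<usize> = (0..n).filter(|&i| tree.parent[i].is_none()).collect();

    // Pass 1: traverse non-leaf nodes, one scoped thread per root.
    // (Assumption: a real implementation would share work across a pool.)
    let results: Vec<Vec<(usize, f32)>> = thread::scope(|s| {
        let handles: Vec<_> = roots
            .iter()
            .map(|&root| s.spawn(move || propagate_inner(tree, root)))
            .collect();
        handles.into_iter().map(|h| h.join().unwrap()).collect()
    });
    for (index, value) in results.into_iter().flatten() {
        global[index] = value;
    }

    // Pass 2: flat pass over all leaves that have a parent. Each leaf only
    // needs its parent's global value, which pass 1 has already written.
    for i in 0..n {
        if let (true, Some(p)) = (tree.children[i].is_empty(), tree.parent[i]) {
            global[i] = global[p] + tree.local[i];
        }
    }
    global
}
```

Calling `propagate(&tree)` returns one global value per node. Splitting leaves into their own pass keeps the tree walk small and lets the bulk of the entities be handled by a simple linear loop, which is the cache-friendly property the list above refers to.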

Testing

Tracy
Caldera hotel is showing 3-7x faster on my M4 Max. Timing for bevy's existing transform system shifts wildly run to run, so I don't know that I would advertise a particular number. But this implementation is faster in a... statistically significant way.

aevyrie · 2025-02-13T06:50:32Z

Getting this up now that it seems to function, marking as draft because it still needs cleanup.

crates/bevy_transform/src/systems.rs

alice-i-cecile · 2025-02-13T20:02:40Z

FYI, I'm merging #17815. The methods there are probably worth benching here.

alice-i-cecile · 2025-02-14T18:24:21Z

I'd love a before / after tracy histogram BTW. I've seen enough tests on Discord to be convinced that this is markedly (3x or so) faster, but it would be lovely to show in the release notes.

NthTensor · 2025-02-21T14:25:02Z

Alright, I am finally going to be able to give this a look. Right out of the gate I find the increase in run variance concerning, but I'm in favor of merging and leaving that as follow up if we can't easily identify the cause.

aevyrie · 2025-02-22T17:06:57Z

> Alright, I am finally going to be able to give this a look. Right out of the gate I find the increase in run variance concerning, but I'm in favor of merging and leaving that as follow up if we can't easily identify the cause.

If you are referring to my comment, that was about bevy's existing system. It swings wildly, by ±1 ms. Variance (timing distribution) within a run is consistent for both. Variance between runs in this PR seems better than main. My guess is this is caused by E-cores vs P-cores. The old implementation usually ended up spending most of its time on a single core, whereas the new one is spread across all available cores in the thread pool. I think this explains why inter-run variance has improved: the work is spread across a mix of P and E cores.

aevyrie · 2025-02-23T20:01:48Z

The CI failure was not caused by this PR; it seems to be some sort of Mesa failure, according to Discord.

alice-i-cecile · 2025-03-25T20:08:45Z

Thank you to everyone involved with the authoring or reviewing of this PR! This work is relatively important and needs release notes! Head over to bevyengine/bevy-website#1995 if you'd like to help out.

# Objective

- Follow up from previous transform optimization (#18589), make the `mark_dirty_trees` system more intelligent - don't run this expensive static scene optimization for dynamic scenes.
- Using a threshold was mentioned as a follow up in that PR, and we also want this threshold to be user-configurable.
- This was not implemented previously because the optimizations were still large improvements even in dynamic scenes thanks to the improved parallelism #17840

## Solution

- Don't run static scene optimization (dirty tree tracking) for very dynamic scenes - defined here as scenes where more than 30% of objects have their `Transform` updated.
- This is configurable with a percentage threshold, or it can be unconditionally enabled or disabled when setting to `0.0` or `1.0` to avoid the cost of computing the threshold.
- For dynamic scenes, this makes transform prop much faster, twice as fast in the stress tests shown here.

## Testing

transform_hierarchy stress tests, all of these cases spawn about a quarter million entities:

- humanoids_active - dynamic scene that should be faster than `main`: <img width="609" height="395" alt="image" src="https://github.com/user-attachments/assets/bf3d6b93-aa09-4440-b8ac-18af7e46a00f" />
- humanoids_inactive - static scene that should be unchanged from `main`: <img width="631" height="377" alt="image" src="https://github.com/user-attachments/assets/a0306109-600b-4cdd-a217-5cc15e269bca" />
- humanoids_mixed - half dynamic scene that should be faster than `main` <img width="604" height="372" alt="image" src="https://github.com/user-attachments/assets/2751ece2-d4b9-4daa-af24-fe379eaf75b2" />
- large_tree - dynamic scene (50% of entities are moved) we expect to see improvements <img width="665" height="371" alt="image" src="https://github.com/user-attachments/assets/c6b08abe-eb1d-44fb-be36-457f9d5ba78e" />
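
The threshold decision described under Solution could be sketched roughly as follows. This is an illustration only, not the code from that PR: `StaticSceneThreshold` and `use_dirty_tree_tracking` are hypothetical names, and which extreme (`0.0` or `1.0`) means "always off" versus "always on" is an assumption here, since the description does not spell it out. The changed-entity count is passed as a closure so the sketch can show how the two extremes avoid computing it at all.

```rust
/// Hypothetical configuration value: the fraction of entities (0.0..=1.0)
/// whose `Transform` may change in a frame before dirty tree tracking is
/// skipped. The 30% default comes from the description above.
#[derive(Clone, Copy, Debug, PartialEq)]
struct StaticSceneThreshold(f32);

impl Default for StaticSceneThreshold {
    fn default() -> Self {
        StaticSceneThreshold(0.3)
    }
}

/// Decides whether the static scene optimization (dirty tree tracking)
/// should run this frame.
///
/// `count_changed` is only called when the threshold lies strictly between
/// 0.0 and 1.0, so the extremes skip the cost of counting changed entities.
fn use_dirty_tree_tracking(
    threshold: StaticSceneThreshold,
    total_entities: usize,
    count_changed: impl FnOnce() -> usize,
) -> bool {
    let t = threshold.0;
    if t <= 0.0 {
        return false; // assumption: 0.0 disables the optimization outright
    }
    if t >= 1.0 {
        return true; // assumption: 1.0 enables it unconditionally
    }
    if total_entities == 0 {
        return true; // nothing to propagate, so tracking costs nothing
    }
    let changed_fraction = count_changed() as f32 / total_entities as f32;
    // Mostly static scene: tracking dirty trees pays off.
    changed_fraction <= t
}
```

With the default threshold, a scene where 10% of entities moved would keep dirty tree tracking, while one where 50% moved (as in the `large_tree` case) would skip it.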

aevyrie added 8 commits February 11, 2025 19:54

parallel hierarchy propagation

55ae4f6

Update comments

c2c20dc

Remove extra tracing

032b869

Tweak stack and chunking

e57ab7a

reduce transform derefs

640b375

mpsc queue transform prop

b9c74d1

Merge remote-tracking branch 'origin/main' into parallel-transform

c3af3dd

Fix lints

b33088e

aevyrie requested a review from Victoronz February 13, 2025 06:36

alice-i-cecile added this to the 0.16 milestone Feb 13, 2025

aevyrie marked this pull request as draft February 13, 2025 06:49

aevyrie added 3 commits February 12, 2025 23:02

Clean up most egregious issues.

5b5f256

Factor out unsafe code

e89c2a4

More factoring and comments

b108d0f

Victoronz reviewed Feb 13, 2025

crates/bevy_transform/src/systems.rs (outdated)

crates/bevy_transform/src/systems.rs (outdated)

Victoronz reviewed Feb 13, 2025

crates/bevy_transform/src/systems.rs (outdated)

aevyrie added 2 commits February 13, 2025 01:12

Pass hierarchy tests

e65f972

Fix docs

7be8c3c

cBournhonesque reviewed Feb 13, 2025

crates/bevy_transform/src/systems.rs (outdated)

cBournhonesque reviewed Feb 13, 2025

crates/bevy_transform/src/systems.rs (outdated)

aevyrie added 2 commits February 14, 2025 00:06

Code cleanup

f54f2aa

Improve safety justification

ce29222

factor batch sending

ea7d217

mockersf approved these changes Feb 22, 2025

alice-i-cecile added the S-Ready-For-Final-Review label and removed the S-Needs-Review label Feb 22, 2025

alice-i-cecile added this pull request to the merge queue Feb 22, 2025

github-merge-queue bot removed this pull request from the merge queue due to failed status checks Feb 22, 2025

Merge branch 'main' into parallel-transform

8869402

alice-i-cecile enabled auto-merge February 23, 2025 00:44

alice-i-cecile added this pull request to the merge queue Feb 23, 2025

github-merge-queue bot removed this pull request from the merge queue due to failed status checks Feb 23, 2025

Merge branch 'main' into parallel-transform

1158d40

mockersf enabled auto-merge February 23, 2025 20:28

mockersf added this pull request to the merge queue Feb 23, 2025

Merged via the queue into bevyengine:main with commit dba1f7a Feb 23, 2025
33 checks passed

mockersf mentioned this pull request Feb 23, 2025

Upgrade to Rust Edition 2024 #17967

Merged

Victoronz mentioned this pull request Mar 14, 2025

implement get_many_unique #18315

Merged

alice-i-cecile mentioned this pull request Mar 14, 2025

Transform Propagation Broken for Deep Hierarchies #18314

Closed

alice-i-cecile mentioned this pull request Mar 25, 2025

Write release notes for PR #17840: Parallel Transform Propagation bevyengine/bevy-website#1995

Closed

aevyrie mentioned this pull request Mar 31, 2025

Faster transform propagation release notes bevyengine/bevy-website#2041

Merged

ChristopherBiscardi mentioned this pull request Apr 18, 2025

First-party tile maps #13782

Open

aevyrie mentioned this pull request Dec 26, 2025

Optimize transform propagation for dynamic scenes #22281

Merged


mockersf merged 39 commits into bevyengine:main from aevyrie:parallel-transform


12 participants
