column/table level change detection #11120
Conversation
Okay, really neat stuff. Thanks for trying this out and presenting those benchmark results! Initial thoughts:
Definitely excited to have this, as it opens up an opportunity for my reactive UI crate to skip regenerating the UI when a query has no changed data at the archetype/column level.
The current impl of this PR covers both tables (built from their owned table-storage columns) and every column (both table-storage and sparse-set-storage). The schedule cannot guarantee that there is only one mutable access to a table and a sparse-set column at a time. If we avoid using an atomic type, the only viable option would be to implement column-level change detection specifically for table-storage columns. (Of course, I think avoiding atomics is the right way, since it would greatly reduce overhead.)
Yeah, I think I can add more benchmarks related to change detection in another PR.
```rust
pub fn read_last_mutable_access_tick(&self) -> Tick {
    Tick::new(
        self.last_mutable_access_tick
            .load(std::sync::atomic::Ordering::Relaxed),
    )
}

pub fn update_last_mutable_access_tick(&self, tick: Tick) {
    // self.last_mutable_access_tick
    //     .store(tick.get(), std::sync::atomic::Ordering::Relaxed);
    self.last_mutable_access_tick
        .fetch_max(tick.get(), std::sync::atomic::Ordering::AcqRel);
}
```
Could you explain the choice of memory ordering? Naively, I would have expected the `fetch_max` to be `Release` and the `load` to be `Acquire`, so that we cannot miss changes. However, I'm not confident in my understanding of atomic orderings.
I haven't deeply considered this aspect yet (so please correct me if I am wrong). My thought was that there might not be any other data flow related to the access tick, so a relaxed load operation would be acceptable. Bevy's scheduler may run systems with the same table access in parallel, but when two systems write to the same tick concurrently, we cannot guarantee at what point each system loads or stores the corresponding tick. Store operations must be observed; but for the load operation, even if changes are missed it is not a big deal, since our focus is on comparing the loaded value against the system's `last_change_tick`.
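To make the ordering question concrete, here is a minimal, self-contained sketch of a per-column tick updated with `fetch_max` (the `ColumnTick` type and method names are illustrative stand-ins, not Bevy's actual API). The key property is that `fetch_max` is a single atomic read-modify-write, so the maximum can never be lost regardless of interleaving, even under `Relaxed` ordering; `Acquire`/`Release` would only matter if readers had to also observe the component data written before the tick was bumped.

```rust
use std::sync::atomic::{AtomicU32, Ordering};
use std::sync::Arc;
use std::thread;

// Hypothetical stand-in for the PR's per-column tick storage.
struct ColumnTick {
    last_mutable_access_tick: AtomicU32,
}

impl ColumnTick {
    fn new() -> Self {
        Self {
            last_mutable_access_tick: AtomicU32::new(0),
        }
    }

    // `fetch_max` keeps the counter monotonic under concurrent writers;
    // Relaxed is sufficient for the counter value itself.
    fn update(&self, tick: u32) {
        self.last_mutable_access_tick.fetch_max(tick, Ordering::Relaxed);
    }

    fn read(&self) -> u32 {
        self.last_mutable_access_tick.load(Ordering::Relaxed)
    }
}

fn main() {
    let col = Arc::new(ColumnTick::new());
    let handles: Vec<_> = (1..=8u32)
        .map(|t| {
            let col = Arc::clone(&col);
            thread::spawn(move || col.update(t))
        })
        .collect();
    for h in handles {
        h.join().unwrap();
    }
    // The maximum tick survives regardless of thread interleaving.
    assert_eq!(col.read(), 8);
    println!("max tick = {}", col.read());
}
```

Note that this sketch ignores tick wraparound, which Bevy's real `Tick` comparison has to handle.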
The only time we need to update change ticks in parallel is when we're using [...]. Also, adding branching inside the iterator is very bad for vectorization. You probably want to somehow update the matched archetypes instead. Probably needs benchmarking to be sure.
Yeah, this impl is just an attempt to identify a method that balances ergonomics and performance. It still needs more benchmarks; my current thoughts are:
This is a critical issue, as the current change-tick design prevents the use of SIMD instructions and significantly reduces effective memory bandwidth. This throws the ECS back to the age of the 8086, let alone competing with seriously optimized OOP designs. In this context, atomic operations could use relaxed memory ordering, since strict ordering is unnecessary as long as system-level access to the table is already properly sequenced. I hope this fix can be implemented soon, since having to write and read massive amounts of unnecessary data would be a rather unfortunate and inefficient outcome for everyone.
In most cases I don't even care about the change ticks. If someone wants to do it the old way, why not put a `Changed` marker component on those entities and do the ticking in their own tables? Why should our fast and clean table pay for this crazy O(n) operation?
Objective
Solution
- An `AtomicU32` to track the last mutable access for tables and columns, updated in the `set_table` and `set_archetype` methods.
- An `archetype_filter_fetch` method on `QueryFilter` to filter out tables or columns that have not been subject to mutable access since the system last ran.

Notes
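The table-level early-out this describes can be sketched minimally as follows. All names here are hypothetical stand-ins for the PR's actual types, and real tick comparison in Bevy must handle `u32` wraparound, which this sketch omits for clarity.

```rust
// Hypothetical per-table metadata: the newest tick at which any column in
// this table was mutably accessed.
struct TableMeta {
    last_mutable_access_tick: u32,
}

// Sketch of the filter's early-out: keep the table only if it may contain
// changes newer than the system's last run; otherwise skip iterating its
// rows entirely, which is the whole point of table-level change detection.
fn archetype_filter_fetch(table: &TableMeta, system_last_run: u32) -> bool {
    table.last_mutable_access_tick > system_last_run
}

fn main() {
    // Mutated at tick 5, system last ran at tick 3: must be visited.
    assert!(archetype_filter_fetch(
        &TableMeta { last_mutable_access_tick: 5 },
        3
    ));
    // Mutated at tick 2, system last ran at tick 4: safe to skip.
    assert!(!archetype_filter_fetch(
        &TableMeta { last_mutable_access_tick: 2 },
        4
    ));
    println!("ok");
}
```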
I must express reservations about the current implementation due to its involvement in multiple data flows, which potentially makes the code more fragile and expensive. I would highly appreciate it if someone could propose alternative implementation details with less complexity and better safety considerations.
Performance
Intel 13400kf NV4070ti
To comprehensively evaluate the change detection mechanism, additional benchmarking across various scenarios is necessary.

Regression:

The regression in spawn/insert is attributed to the introduction of an atomic operation for each table per component.
- The `iter_fragmented_*` methods appear to iterate over only a few entities, making the maintenance of the tick costly.
- `query.get_mut_component/unchecked(entity)` is expensive because it sets the archetype for each entity retrieved.

It is worth mentioning that many_foxes frame_time regressed from 3.06 ms to 4.21 ms.

Yellow is main, red is this PR.
The hotspot is:


In `animation_player` and `propagate_transform`, the parallel use of `get_unchecked` for each entity seems to cause numerous cache invalidations across CPU cores.

Migration Guide