Micro-optimize queue_material_meshes, primarily to remove bit manipulation.#12791
Merged
alice-i-cecile merged 5 commits intobevyengine:mainfrom Apr 1, 2024
Merged
Conversation
manipulation. This commit makes the following optimizations: `MeshPipelineKey` has been split into `BaseMeshPipelineKey`, which lives in `bevy_render` and `MeshPipelineKey`, which lives in `bevy_pbr`. Conceptually, `BaseMeshPipelineKey` is a superclass of `MeshPipelineKey`. For `BaseMeshPipelineKey`, the bits start at the highest (most significant) bit and grow downward toward the lowest bit; for `MeshPipelineKey`, the bits start at the lowest bit and grow upward toward the highest bit. This prevents them from colliding. The goal of this is to avoid having to reassemble bits of the pipeline key for every mesh every frame. Instead, we can just use a bitwise or operation to combine the pieces that make up a `MeshPipelineKey`. Previously, all of `specialize()` was marked as `#[inline]`. This bloated `queue_material_meshes` unnecessarily, as most of it is a slow path that's rarely hit. This commit refactors the function to move the slow path to `specialize_slow()`. Together, these two changes shave about 5% off `queue_material_meshes`.
james7132
reviewed
Mar 30, 2024
| /// | ||
| /// These are precalculated so that we can just "or" them together in | ||
| /// [`queue_material_meshes`]. | ||
| pub mesh_pipeline_key_bits: MeshPipelineKey, |
Member
There was a problem hiding this comment.
Does this hard-tie the material system to meshes? Or should we generalize MeshPipelineKey?
Contributor
Author
There was a problem hiding this comment.
There would be a lot of untangling that would need to happen if we wanted to have meshes that didn't have materials, or materials that didn't have meshes. This doesn't really make things worse.
james7132
approved these changes
Mar 30, 2024
Member
james7132
left a comment
There was a problem hiding this comment.
LGTM other than the aforementioned comments.
kristoff3r
approved these changes
Apr 1, 2024
chompaa
pushed a commit
to chompaa/bevy
that referenced
this pull request
Apr 5, 2024
…ulation. (bevyengine#12791) This commit makes the following optimizations: ## `MeshPipelineKey`/`BaseMeshPipelineKey` split `MeshPipelineKey` has been split into `BaseMeshPipelineKey`, which lives in `bevy_render` and `MeshPipelineKey`, which lives in `bevy_pbr`. Conceptually, `BaseMeshPipelineKey` is a superclass of `MeshPipelineKey`. For `BaseMeshPipelineKey`, the bits start at the highest (most significant) bit and grow downward toward the lowest bit; for `MeshPipelineKey`, the bits start at the lowest bit and grow upward toward the highest bit. This prevents them from colliding. The goal of this is to avoid having to reassemble bits of the pipeline key for every mesh every frame. Instead, we can just use a bitwise or operation to combine the pieces that make up a `MeshPipelineKey`. ## `specialize_slow` Previously, all of `specialize()` was marked as `#[inline]`. This bloated `queue_material_meshes` unnecessarily, as a large chunk of it ended up being a slow path that was rarely hit. This commit refactors the function to move the slow path to `specialize_slow()`. Together, these two changes shave about 5% off `queue_material_meshes`:  ## Migration Guide - The `primitive_topology` field on `GpuMesh` is now an accessor method: `GpuMesh::primitive_topology()`. - For performance reasons, `MeshPipelineKey` has been split into `BaseMeshPipelineKey`, which lives in `bevy_render`, and `MeshPipelineKey`, which lives in `bevy_pbr`. These two should be combined with bitwise-or to produce the final `MeshPipelineKey`.
github-merge-queue bot
pushed a commit
that referenced
this pull request
Apr 5, 2024
# Objective - #12791 broke example `irradiance_volumes` - Fixes #12876 ``` wgpu error: Validation Error Caused by: In Device::create_render_pipeline note: label = `pbr_opaque_mesh_pipeline` Color state [0] is invalid Sample count 8 is not supported by format Rgba8UnormSrgb on this device. The WebGPU spec guarentees [1, 4] samples are supported by this format. With the TEXTURE_ADAPTER_SPECIFIC_FORMAT_FEATURES feature your device supports [1, 2, 4]. ``` ## Solution - Shift bits a bit more
ChristopherBiscardi
added a commit
to ChristopherBiscardi/bevy_ecs_tilemap
that referenced
this pull request
Jun 7, 2024
[12791](bevyengine/bevy#12791) the `primitive_topology` field on `GpuMesh` was removed in favor of `key_bits` which can be constructed using `BaseMeshPipeline::from_primitive_topology`
rparrett
added a commit
to StarArawn/bevy_ecs_tilemap
that referenced
this pull request
Jul 5, 2024
* Update to 0.14.0-rc.2 * [12997](bevyengine/bevy#12997): rename `multi-threaded` to `multi_threaded` * RenderAssets<Image> is now RenderAssets<GpuImage> Implemented in [12827](bevyengine/bevy#12827) * FloatOrd is now in bevy_math implemented in [12732](bevyengine/bevy#12732) * convert Transparent2d::dynamic_offset to extra_index [12889](bevyengine/bevy#12889) Gpu Frustum Culling removed the dynamic_offset of Transparent2d and it became `extra_index` with the special value `PhaseItemExtraIndex::NONE`, which indicates the `None` that was here previously * RenderPhase<Transparent2d> -> ViewSortedRenderPhases<Transparent2d> [12453](https://github.com/StarArawn/bevy_ecs_tilemap/pull/bevyengine/bevy#12453): Render phases are now binned or sorted. Following the changes in the `mesh2d_manual` [example](https://github.com/bevyengine/bevy/blob/ecdd1624f302c5f71aaed95b0984cbbecf8880b7/examples/2d/mesh2d_manual.rs#L357-L358): use the `ViewSortedRenderPhases` resource. * get_sub_app_mut is now an Option in [9202](https://github.com/StarArawn/bevy_ecs_tilemap/pull/bevyengine/bevy/pull/9202) SubApp access has changed * GpuImage::size f32 -> u32 via UVec2 [11698](bevyengine/bevy#11698) changed `GpuImage::size` to `UVec2`. Right above this, `Extent3d` does the same thing, so I'm taking a small leap and assuming can `as`. * GpuMesh::primitive_topology -> key_bits/BaseMeshPipeline [12791](bevyengine/bevy#12791) the `primitive_topology` field on `GpuMesh` was removed in favor of `key_bits` which can be constructed using `BaseMeshPipeline::from_primitive_topology` * RenderChunk2d::prepare requires &mut MeshVertexBufferLayouts now [12216](bevyengine/bevy#12216) introduced an argument `&mut MeshVertexBufferLayouts` to `get_mesh_vertex_buffer_layout`, which bevy_ecs_tilemap calls in `RenderChunk2d::prepare` * into_linear_f32 -> color.0.linear().to_f32_array(), [12163](bevyengine/bevy#12163) bevy_color was created and Color handling has changed. Specifically Color::as_linear_rgba_f32 has been removed. LinearRgba is now its own type that can be accessed via [`linear()`](https://docs.rs/bevy/0.14.0-rc.2/bevy/color/enum.Color.html#method.linear) and then converted. * Must specify type of VisibleEntities when accessing [12582](bevyengine/bevy#12582) divided `VisibleEntities` into separate lists. So now we have to specify which kind of entity we want. I think we want the Mesh here, and I think we can get rid of the `.index` calls on Entity since Entity [already compares bits](https://docs.rs/bevy_ecs/0.14.0-rc.2/src/bevy_ecs/entity/mod.rs.html#173) for optimized codegen purposes. Waiting to do that until the other changes are in though so as to not change functionality until post-upgrade. * app.world access is functions now - [9202](bevyengine/bevy#9202) changed world access to functions. [relevent line](https://github.com/bevyengine/bevy/pull/9202/files#diff-b2fba3a0c86e496085ce7f0e3f1de5960cb754c7d215ed0f087aa556e529f97fR640) - This also surfaced [12655](bevyengine/bevy#12655) which removed `Into<AssetId<T>>` for `Handle<T>`. using a reference or .id() is the solution here. * We don't need `World::cell`, and it doesn't exist anymore In [12551](bevyengine/bevy#12551) `WorldCell` was removed. ...but it turns out we don't need it or its replacement anyway. * examples error out unless this bevy bug is addressed with these features being added bevyengine/bevy#13728 * check_visibility is required for the entity that is renderable As a result of [12582](bevyengine/bevy#12582) `check_visibility` must be implemented for the "renderable" tilemap entities. Doing this is trivial by taking advantage of the existing `check_visibility` type arguments, which accept a [`QF: QueryFilter + 'static`](https://docs.rs/bevy/0.14.0-rc.2/bevy/render/view/fn.check_visibility.html). The same `QueryFilter`` is used when checking `VisibleEntities`. I've chosen `With<TilemapRenderSettings` because presumably if the entity doesn't have a `TilemapRenderSettings` then it will not be rendering, but this could be as sophisticated or simple as we want. For example `WithLight` is currently implemented as ```rust pub type WithLight = Or<(With<PointLight>, With<SpotLight>, With<DirectionalLight>)>; ``` * view.view_proj -> view.clip_from_world [13289](bevyengine/bevy#13489) introduced matrix naming changes, including `view_proj` which becomes `clip_from_world` * color changes to make tests runnable * clippy fix * Update Cargo.toml Co-authored-by: Rob Parrett <[email protected]> * Update Cargo.toml Co-authored-by: Rob Parrett <[email protected]> * final clippy fixes * Update Cargo.toml Co-authored-by: Rob Parrett <[email protected]> * Simplify async loading in ldtk/tiled helpers See Bevy #12550 * remove second allow lint * rc.3 bump * bump version for major release * remove unused features --------- Co-authored-by: Rob Parrett <[email protected]>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
This commit makes the following optimizations:
MeshPipelineKey/BaseMeshPipelineKeysplitMeshPipelineKeyhas been split intoBaseMeshPipelineKey, which lives inbevy_renderandMeshPipelineKey, which lives inbevy_pbr. Conceptually,BaseMeshPipelineKeyis a superclass ofMeshPipelineKey. ForBaseMeshPipelineKey, the bits start at the highest (most significant) bit and grow downward toward the lowest bit; forMeshPipelineKey, the bits start at the lowest bit and grow upward toward the highest bit. This prevents them from colliding.The goal of this is to avoid having to reassemble bits of the pipeline key for every mesh every frame. Instead, we can just use a bitwise or operation to combine the pieces that make up a
MeshPipelineKey.specialize_slowPreviously, all of
specialize()was marked as#[inline]. This bloatedqueue_material_meshesunnecessarily, as a large chunk of it ended up being a slow path that was rarely hit. This commit refactors the function to move the slow path tospecialize_slow().Together, these two changes shave about 5% off
queue_material_meshes:Migration Guide
primitive_topologyfield onGpuMeshis now an accessor method:GpuMesh::primitive_topology().MeshPipelineKeyhas been split intoBaseMeshPipelineKey, which lives inbevy_render, andMeshPipelineKey, which lives inbevy_pbr. These two should be combined with bitwise-or to produce the finalMeshPipelineKey.