Order independent transparency, part 1

Correctly sorting transparent meshes is one of the hard problems in realtime rendering. The typical solution is to sort the meshes by distance from the camera and render them back to front. This can't address all transparency sorting artifacts though: it fails when meshes intersect, and it does nothing for self-sorting artifacts within a single transparent mesh. Correctly sorting particles against transparent meshes can also be a challenge.
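The conventional back-to-front sort can be sketched as follows (a minimal illustration with a made-up mesh struct, sorting by squared distance to the camera; a real engine would sort per-drawcall records):

```cpp
#include <algorithm>
#include <vector>

// Hypothetical mesh record: only the data needed for depth sorting.
struct TransparentMesh {
    float x, y, z; // world-space center, used as a distance proxy
    int   id;
};

// Sort back-to-front relative to the camera so the over operator
// composites far surfaces first. As discussed above, this per-mesh
// sort cannot fix intersecting meshes or self-sorting artifacts.
void sortBackToFront(std::vector<TransparentMesh>& meshes,
                     float camX, float camY, float camZ) {
    auto dist2 = [&](const TransparentMesh& m) {
        float dx = m.x - camX, dy = m.y - camY, dz = m.z - camZ;
        return dx * dx + dy * dy + dz * dz;
    };
    std::sort(meshes.begin(), meshes.end(),
              [&](const TransparentMesh& a, const TransparentMesh& b) {
                  return dist2(a) > dist2(b); // farthest first
              });
}
```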

For an extreme but illustrative example, here is a screenshot of Sponza rendered with transparent materials using hardware alpha blending with the over operator c1*a1 + c2*(1-a1). To mix things up, I have added a particle system using additive blending (the yellow one).

Continue reading “Order independent transparency, part 1”

Accelerating raytracing using software VRS

I discussed in the previous post how divergence in a wave can slow down the execution of a shader. This is particularly evident when raytracing global illumination (GI), as ray directions between neighbouring wave threads can differ a lot, forcing different paths through the BVH tree with different numbers of steps. I described how ray binning can be used to improve this, but it is not the only technique we can use. This time we will take a different approach: instead of "binning" based on the similarity of input rays, we will "bin" threads based on the raytraced GI's output. This makes sense because the output is usually quite uniform, with large and sudden transitions happening mainly at geometric edges.
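As a rough sketch of the classification step (an illustration, not the post's actual implementation), a tile classifier might pick a per-tile ray budget from how uniform the previous frame's GI output is inside the tile; the 2x2 tile size and luminance threshold below are arbitrary choices:

```cpp
#include <algorithm>

// Hypothetical software-VRS classifier: uniform tiles trace one ray and
// broadcast the result to all four pixels; divergent tiles (usually
// geometric edges) keep the full rate. Input is the previous frame's
// GI luminance for one 2x2 tile.
int raysForTile(const float* lum2x2, float threshold = 0.05f) {
    float lo = *std::min_element(lum2x2, lum2x2 + 4);
    float hi = *std::max_element(lum2x2, lum2x2 + 4);
    return (hi - lo) <= threshold ? 1 : 4; // coarse vs full shading rate
}
```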

Continue reading “Accelerating raytracing using software VRS”

Increasing wave coherence with ray binning

Raytracing involves traversing acceleration structures (BVHs), which encode a scene's geometry, in an attempt to identify ray/triangle collisions. Depending on the rendering technique (eg raytraced shadows, AO, GI), rays can diverge a lot in direction. This introduces additional cache and memory pressure, as rays in a wave can follow very different paths in the BVH, ultimately colliding with different triangles.

Yet, ray generation is typically based on a limited set of random samples (eg a tiled blue noise texture) which we reuse across the frame, meaning that we raytrace using a limited number of ray directions. It sounds reasonable that we should be able to group rays by direction so that all rays in a group follow a similar path within the BVH tree and potentially hit the same triangle. Of course, grouping by ray direction alone is not enough; the origin of the ray matters as well, so ideally we would group rays by both attributes.
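One possible way to build such a grouping key (an illustrative sketch, not necessarily the post's exact scheme) is to quantize the ray direction with an octahedral mapping and combine it with a coarse origin cell; rays can then be sorted or binned by the resulting key so that nearby threads trace similar rays:

```cpp
#include <cmath>
#include <cstdint>

// Bin key for a ray: 8 bits of quantized octahedral direction plus a
// coarse 4x4x4-unit origin cell. Bit counts and cell size are arbitrary
// illustrative choices; dx/dy/dz is assumed to be a unit direction.
uint32_t rayBinKey(float dx, float dy, float dz,
                   float ox, float oy, float oz) {
    // Octahedral projection of the unit direction onto [-1,1]^2.
    float l1 = std::fabs(dx) + std::fabs(dy) + std::fabs(dz);
    float u = dx / l1, v = dy / l1;
    if (dz < 0.0f) { // fold the lower hemisphere
        float ou = (1.0f - std::fabs(v)) * (u >= 0.0f ? 1.0f : -1.0f);
        float ov = (1.0f - std::fabs(u)) * (v >= 0.0f ? 1.0f : -1.0f);
        u = ou; v = ov;
    }
    uint32_t qu = (uint32_t)((u * 0.5f + 0.5f) * 15.0f + 0.5f); // 4 bits
    uint32_t qv = (uint32_t)((v * 0.5f + 0.5f) * 15.0f + 0.5f); // 4 bits
    // Coarse origin cell, 8 wrapping bits per axis.
    uint32_t cx = (uint32_t)(int)std::floor(ox / 4.0f) & 0xFFu;
    uint32_t cy = (uint32_t)(int)std::floor(oy / 4.0f) & 0xFFu;
    uint32_t cz = (uint32_t)(int)std::floor(oz / 4.0f) & 0xFFu;
    return (cx << 24) | (cy << 16) | (cz << 8) | (qu << 4) | qv;
}
```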

Continue reading “Increasing wave coherence with ray binning”

Raytracing, a 4 year retrospective

Recently I got access to a GPU that supports accelerated raytracing, and the temptation to tinker with DXR was too strong to resist. This means that I will steer away from compute shader raytracing for the foreseeable future. It is a good opportunity, though, to do a quick retrospective of the past few years of experimenting with "software" raytracing.

Continue reading “Raytracing, a 4 year retrospective”

Raytraced global illumination denoising

Recently, I’ve been playing Metro: Exodus on Series X a second time, after the enhanced edition was released, to study the new raytraced GI the developers added to the game (which, by the way, is great and worth playing anyway). What makes this a bigger achievement is that the game runs at 60fps as well. The developers smartly use a layered approach to calculating GI: they start by raymarching the g-buffer in screen space for collisions and, when none is found, fall back to tracing rays at 0.25 rays per pixel (aka raytracing at half the rendering resolution). They also use DDGI to calculate a second bounce, lighting the hit points with indirect lighting as well, and all of these working together give a great overall lighting result. While all of this is very interesting, it is their approach to denoising that piqued my interest, and I set about exploring it a bit more in my toy renderer. The technique is described in this presentation and expanded in this one, from which I will also be borrowing some images.

Continue reading “Raytraced global illumination denoising”

Abstracting the Graphics API for a toy renderer

I’ve been asked a few times in DMs about the best way to abstract the graphics API in one’s own graphics engine to make developing graphics techniques easier. Since I’ve recently finished a first-pass abstraction of DirectX12 in my own toy engine, I’ve decided to put together a post briefly discussing how I went about it.

Modern, low level APIs like DX12 and Vulkan are quite verbose, offering a lot of control to the developer but also requiring a lot of boilerplate code to set up the rendering pipeline. This can seem like a daunting prospect to people who want to use such an API, and they often ask what the best way is to abstract it in their own graphics engines.
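To give a flavour of what such an abstraction can look like (an illustrative sketch with made-up names, not the post's actual API), one common shape is opaque resource handles plus a thin device interface that the renderer codes against, with the DX12 or Vulkan boilerplate hidden inside one concrete implementation:

```cpp
#include <cstddef>
#include <cstdint>

// Opaque handles: the renderer never sees ID3D12Resource or VkBuffer.
struct BufferHandle  { uint32_t index = 0; };

struct BufferDesc {
    size_t sizeBytes   = 0;
    bool   cpuWritable = false;
};

// The interface rendering code is written against. A DX12Device (or
// VulkanDevice) implementation would own all the API boilerplate.
class Device {
public:
    virtual ~Device() = default;
    virtual BufferHandle createBuffer(const BufferDesc& desc) = 0;
    virtual void destroyBuffer(BufferHandle handle) = 0;
    // ... textures, pipelines, command submission, etc.
};

// Trivial concrete implementation, useful for unit-testing renderer code
// without a GPU.
class NullDevice : public Device {
public:
    BufferHandle createBuffer(const BufferDesc&) override {
        return BufferHandle{ nextBuffer_++ };
    }
    void destroyBuffer(BufferHandle) override {}
private:
    uint32_t nextBuffer_ = 1;
};
```

The design choice being illustrated is that handles, not pointers to API objects, cross the abstraction boundary, which keeps resource lifetime management in one place.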

Continue reading “Abstracting the Graphics API for a toy renderer”

Occlusion and directionality in image based lighting: implementation details

I got a few follow-up questions on the blog post I published a few days ago on occlusion and directionality in image based lighting, so I put together this quick follow-up to elaborate on a few points and add some more resources.

To implement the main technique the exploration was based on, Ground Truth ambient occlusion (GTAO), it is worth starting with the original paper by Jimenez et al. This is best read in conjunction with the Siggraph 2016 presentation, which helps in understanding the paper. The paper also includes fairly detailed pseudocode for the GTAO and bent-normals implementation; Intel’s implementation of the technique is a useful reference as well, as it clarifies some parts. At the moment that sample does not seem to implement directional GTAO, in which the visibility cone is combined with the cosine lobe and projected to SH.

Continue reading “Occlusion and directionality in image based lighting: implementation details”

Notes on occlusion and directionality in image based lighting.

Update: I wrote a follow-up post with some implementation details and some more resources here.

Image based lighting (IBL), in which we use a cubemap to represent indirect radiance from an environment, is an important component of scene lighting. Environment lighting is not always uniform and often has a strong directional component (think of a sunset or sunrise, or a room lit by a window), and that indirect light should interact with the scene correctly, with directional occlusion. I spent some time exploring the directionality and occlusion aspects of IBL for diffuse lighting, with a sprinkle of raytracing, and made some notes (and pretty pictures).
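As a toy single-channel illustration of the directionality idea (with a made-up analytic environment standing in for a cubemap irradiance lookup): feeding the bent normal, rather than the geometric normal, into the irradiance lookup turns the occlusion term from a flat multiplier into something directional:

```cpp
#include <cmath>

struct Vec3 { float x, y, z; };

// Toy environment: bright toward +Y (say, a window overhead), dim elsewhere.
// Stands in for a prefiltered diffuse cubemap lookup.
float irradiance(const Vec3& d) {
    return 0.1f + 0.9f * std::fmax(d.y, 0.0f);
}

// Diffuse IBL term: pass the geometric normal for the conventional
// "AO as a flat multiplier" result, or the bent normal (the average
// unoccluded direction) to make the occlusion directional.
float diffuseIBL(const Vec3& dir, float ao) {
    return irradiance(dir) * ao;
}
```

For a surface whose normal points up but whose unoccluded directions lean sideways, the bent-normal lookup correctly returns much less light than normal-times-AO would.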

Continue reading “Notes on occlusion and directionality in image based lighting.”

Shaded vertex reuse on modern GPUs

A well known feature of GPUs is the post-transform vertex cache: when a drawcall uses an index buffer to select the vertices to be processed, the output of the vertex shader for each vertex is cached. If the same vertex is subsequently indexed, as part of another triangle, the results are already in the cache and the GPU need not process that vertex again. Since all caches have limited capacity, rendering engines typically rearrange the vertex indices of meshes to encourage more locality in vertex reuse and a better cache hit ratio.
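The effect of index order can be measured offline with a simple cache model (a sketch only; real hardware reuse behaviour is more complicated than a plain FIFO). The function below replays an index buffer through a fixed-size FIFO cache and counts vertex shader invocations; misses divided by triangle count gives the commonly used ACMR metric:

```cpp
#include <cstddef>
#include <deque>
#include <vector>

// Replay an index buffer through a FIFO post-transform cache model and
// return the number of cache misses, i.e. how many vertex shader
// invocations this index order would cost under the model.
size_t countVertexShaderRuns(const std::vector<unsigned>& indices,
                             size_t cacheSize = 16) {
    std::deque<unsigned> cache;
    size_t misses = 0;
    for (unsigned idx : indices) {
        bool hit = false;
        for (unsigned c : cache) {
            if (c == idx) { hit = true; break; }
        }
        if (!hit) {
            ++misses; // vertex must be shaded
            cache.push_back(idx);
            if (cache.size() > cacheSize) cache.pop_front(); // evict oldest
        }
    }
    return misses;
}
```

Two triangles sharing an edge (`{0,1,2, 2,1,3}`) cost four shaded vertices instead of six under this model; with a degenerate one-entry cache, the shared vertices stop helping.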

Continue reading “Shaded vertex reuse on modern GPUs”