Sampling is performed on images, typically in conjunction with a
sampler, to return a value based on a neighborhood of texels in an image.
Sampling instructions include the functionality of the following SPIR-V
Image Instructions:
OpImageSample* and OpImageSparseSample* read one or more
neighboring texels of the image, and filter
the texel values based on the state of the sampler.
Instructions with ImplicitLod in the name
determine the LOD used in the
sampling operation based on the coordinates used in neighboring
fragments.
Instructions with ExplicitLod in the name
determine the LOD used in the
sampling operation based on additional coordinates.
Instructions with Proj in the name apply homogeneous
projection to the coordinates.
OpImageFetch and OpImageSparseFetch return a single texel of
the image.
No sampler is used.
OpImage*Gather and OpImageSparse*Gather read neighboring
texels and return a single component of each.
OpImageSampleFootprintNV identifies and returns information about
the set of texels in the image that would be accessed by an equivalent
OpImageSample* instruction.
OpImage*Dref* instructions apply
depth comparison on the texel
values.
OpImageSparse* instructions additionally return a
sparse residency code.
OpImageQueryLod returns the LOD parameters that would be used in a
sample operation.
The actual operation is not performed.
OpImageSampleWeightedQCOM reads a 2D neighborhood of texels and
computes a weighted average using weight values from a separate weight
texture.
OpImageBlockMatchSADQCOM and OpImageBlockMatchSSDQCOM compare 2D
neighborhoods of texels from two textures.
OpImageBoxFilterQCOM reads a 2D neighborhood of texels and computes
a weighted average of the texels.
OpImageBlockMatchWindowSADQCOM and
OpImageBlockMatchWindowSSDQCOM compare 2D neighborhoods of texels
from two textures with the comparison repeated across a window region in
the target texture.
OpImageBlockMatchGatherSADQCOM and
OpImageBlockMatchGatherSSDQCOM compare four 2D neighborhoods of
texels from a target texture with a single 2D neighborhood in the
reference texture.
The R component of each comparison is gathered and returned in the
output.
Sampling Coordinate Systems
There are three sampling coordinate systems used in this chapter:
normalized, unnormalized, and integer.
SPIR-V OpImageBlockMatchSADQCOM, OpImageBlockMatchSSDQCOM,
OpImageBlockMatchWindowSADQCOM, OpImageBlockMatchWindowSSDQCOM,
OpImageFetch, and OpImageSparseFetch instructions use integer
sampling coordinates.
Other image instructions can use either normalized or unnormalized sampling
coordinates (selected by the unnormalizedCoordinates state of the
sampler used in the instruction), but there are
limitations on which operations, image
state, and sampler state are supported.
Normalized coordinates are logically
converted to unnormalized as part of
image operations, and certain steps are
only performed on normalized coordinates.
The array layer coordinate is always treated as unnormalized even when other
coordinates are normalized.
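The logical conversion from normalized to unnormalized coordinates can be sketched as a scale by the image extent. This is a minimal illustration only: the function name `to_unnormalized` is hypothetical, and the sketch ignores mip level selection, offsets, and wrapping, none of which are covered by the paragraph above.

```python
def to_unnormalized(s, t, r, a, width, height, depth):
    """Scale normalized (s, t, r) by the image extent.

    Assumption: a plain multiply by the extent of the accessed level.
    The array layer coordinate `a` is passed through unchanged because
    it is always treated as unnormalized.
    """
    u = s * width
    v = t * height
    w = r * depth
    return (u, v, w, a)


# Centre of an 8x4 two-dimensional image, array layer 2.
print(to_unnormalized(0.5, 0.5, 0.0, 2.0, 8, 4, 1))  # (4.0, 2.0, 0.0, 2.0)
```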
Normalized Sampling Coordinates
Normalized sampling coordinates are referred to as (s,t,r,q,a), with
the coordinates having the following meanings:
s: Coordinate in the first dimension of an image.
t: Coordinate in the second dimension of an image.
r: Coordinate in the third dimension of an image.
(s,t,r) are interpreted as a direction vector for Cube images.
q: Fourth coordinate, for homogeneous (projective) coordinates.
a: Coordinate for array layer.
The values s,t,r,q,a are all floating-point values.
Values of s,t,r in the normalized range [0.0,1.0] and a in
the range [0.0, layers - 1] correspond to locations in the image being
sampled, where layers is the dimension of the image with the same
name.
The value q is used as a divisor for each of s,t,r, rather than
directly corresponding to a location in an image.
The coordinates are extracted from the SPIR-V operand based on the
dimensionality of the image variable and type of instruction.
For Proj instructions, the components are in order (s, [t,] [r,]
q), with t and r being conditionally present based on the
Dim of the image.
For non-Proj instructions, the coordinates are (s [,t] [,r]
[,a]), with t and r being conditionally present based on the
Dim of the image and a being conditionally present based on the
Arrayed property of the image.
Projective image instructions are not supported on Arrayed images.
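As an illustration of how q acts as a divisor for Proj instructions, the following sketch divides each leading component of the coordinate operand by its last component. The helper name `project` is hypothetical, and the sketch assumes q is nonzero.

```python
def project(coords):
    """Apply homogeneous projection to (s, [t,] [r,] q).

    The last component is q; every preceding component is divided by it.
    Assumption: q is nonzero.
    """
    *leading, q = coords
    return tuple(c / q for c in leading)


# A 2D image: operand is (s, t, q).
print(project((2.0, 4.0, 2.0)))  # (1.0, 2.0)
```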
Unnormalized Sampling Coordinates
Unnormalized sampling coordinates are referred to as (u,v,w,a), with
the coordinates having the following meanings:
u: Coordinate in the first dimension of an image.
v: Coordinate in the second dimension of an image.
w: Coordinate in the third dimension of an image.
a: Coordinate for array layer.
The values u,v,w,a are all floating-point values, with values in the
ranges [0.0,width), [0.0, height), [0.0, depth), and
[0,layers-1] corresponding to locations in the image being sampled.
The values width, height, depth, and layers are the
dimensions of the accessed image.
Only the u and v coordinates are directly extracted from the
SPIR-V operand, because only 1D and 2D (non-Arrayed) dimensionalities
support unnormalized coordinates.
The components are in order (u [,v]), with v being conditionally
present when the dimensionality is 2D.
When normalized coordinates are converted to unnormalized coordinates, all
four coordinates are used.
Integer Sampling Coordinates
Integer sampling coordinates are referred to as (i,j,k,l,n), with the
coordinates having the following meanings:
i: Coordinate in the first dimension of an image.
j: Coordinate in the second dimension of an image.
k: Coordinate in the third dimension of an image.
l: Coordinate for array layer.
n: Index of the sample within the texel.
The values i,j,k,l,n are all integer values, with values in the ranges
[0,width), [0, height), [0, depth), [0,layers), and
[0,samples) corresponding to locations in the image being sampled.
The values width, height, depth, layers, and
samples are the dimensions of the accessed image.
They are extracted from the SPIR-V operand in order (i [,j] [,k] [,l]
[,n]), with j and k conditionally present based on the Dim
of the image, and l conditionally present based on the Arrayed
property of the image.
n is conditionally present and is taken from the Sample image
operand.
Integer coordinates are used as image coordinates to perform an
image read after sampling calculations, directly
translating each coordinate as follows:
i → x
j → y
k → z
l → layer
n → sample
level is calculated separately via the Lod image operand if
present, or is set to 0 otherwise.
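The direct translation above, together with the level rule, can be sketched as follows. The function name `image_coordinates` and the use of a dictionary are illustrative choices. Treating absent coordinates as 0 is an assumption of this sketch, not something stated above.

```python
def image_coordinates(i, j=0, k=0, l=0, n=0, lod=None):
    """Map integer sampling coordinates (i, j, k, l, n) to image coordinates.

    Assumption: coordinates that are not present for the image's Dim or
    Arrayed property are passed as 0. `lod` models the optional Lod image
    operand; level is 0 when it is absent.
    """
    return {
        "x": i,
        "y": j,
        "z": k,
        "layer": l,
        "sample": n,
        "level": 0 if lod is None else lod,
    }


print(image_coordinates(3, 1))          # level defaults to 0
print(image_coordinates(3, 1, lod=2))   # level taken from the Lod operand
```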
The following diagrams illustrate the different sampling coordinate systems
for 2D images.
Figure 1. Texel Coordinate Systems, Linear Filtering
The texel coordinate systems for the example shown, an 8×4 texel
two-dimensional image:
Normalized texel coordinates:
The s coordinate goes from 0.0 to 1.0.
The t coordinate goes from 0.0 to 1.0.
Unnormalized texel coordinates:
The u coordinate within the range 0.0 to 8.0 is within the image,
otherwise it is outside the image.
The v coordinate within the range 0.0 to 4.0 is within the image,
otherwise it is outside the image.
Integer texel coordinates:
The i coordinate within the range 0 to 7 addresses texels within
the image, otherwise it is outside the image.
The j coordinate within the range 0 to 3 addresses texels within
the image, otherwise it is outside the image.
Also shown for linear filtering:
Given the unnormalized coordinates (u,v), the four texels
selected are i0j0, i1j0, i0j1, and
i1j1.
The fractions α and β.
Given the offset Δi and Δj, the
four texels selected by the offset are i0j'0,
i1j'0, i0j'1, and i1j'1.
For formats with reduced-resolution components, Δi and
Δj are relative to the resolution of the
highest-resolution component, and therefore may be divided by two relative
to the unnormalized coordinate space of the lower-resolution components.
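The selection of the four texels and the fractions α and β for linear filtering can be sketched as below. The sketch assumes the usual half-texel shift of 0.5, which this passage does not state. It also ignores offsets, wrapping, and reduced-resolution components, and the name `linear_footprint` is hypothetical.

```python
import math


def linear_footprint(u, v, shift=0.5):
    """Return (i0, i1, j0, j1, alpha, beta) for unnormalized (u, v).

    Assumption: i0 = floor(u - shift), i1 = i0 + 1, alpha = frac(u - shift),
    and likewise for v, with shift = 0.5 for linear filtering.
    """
    i0 = math.floor(u - shift)
    j0 = math.floor(v - shift)
    alpha = (u - shift) - i0
    beta = (v - shift) - j0
    return (i0, i0 + 1, j0, j0 + 1, alpha, beta)


# In the 8x4 example image, (u, v) = (3.0, 1.75) lies between
# texel columns 2 and 3 and texel rows 1 and 2.
print(linear_footprint(3.0, 1.75))  # (2, 3, 1, 2, 0.5, 0.25)
```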
Sampling instructions are SPIR-V image instructions that read from an
image with a sampler.
Sampling operations are a set of steps that are performed on state,
coordinates, and texel values while processing a sampling instruction, and
which are common to some or all sampling instructions.
They include the following steps, which are performed in the listed order:
For sampling instructions involving multiple texels (for sampling or
gathering), these steps are applied for each texel that is used in the
instruction.
Depending on the type of sampling instruction, other steps are conditionally
performed between these steps or involving multiple coordinate or texel
values.
Texel input validation operations inspect instruction/image/sampler state
or coordinates, and in certain circumstances cause the texel value to be
replaced or become undefined.
There is a series of validations that the texel undergoes.
Instruction/Sampler/Image View Validation
There are a number of cases where a SPIR-V instruction can mismatch with
the sampler, the image view, or both, and a number of further cases where
the sampler can mismatch with the image view.
In such cases the value of the texel returned is poison.
These cases include:
The sampler borderColor is an integer type and the image view
format is not one of the VkFormat integer types or a stencil
component of a depth/stencil format.
The sampler borderColor is a float type and the image view
format is not one of the VkFormat float types or a depth
component of a depth/stencil format.
VkSamplerBorderColorComponentMappingCreateInfoEXT::components,
if specified, has a component swizzle that does not match the component
swizzle of the image view, and either component swizzle is not a form of
identity swizzle.
The sampler was created with flags containing
VK_SAMPLER_CREATE_SUBSAMPLED_BIT_EXT and is used with a function
that is not OpImageSampleImplicitLod or
OpImageSampleExplicitLod, or is used with operands Offset or
ConstOffsets.
The SPIR-V instruction is one of the OpImage*Dref* instructions and
the sampler compareEnable is VK_FALSE.
The SPIR-V instruction is not one of the OpImage*Dref* instructions
and the sampler compareEnable is VK_TRUE.
The SPIR-V instruction is one of the OpImage*Dref* instructions,
the image view format is one of the depth/stencil formats, and the
image view aspect is not VK_IMAGE_ASPECT_DEPTH_BIT.
The SPIR-V instruction’s image variable’s properties are not compatible
with the image view:
The SPIR-V instruction is OpImageSampleFootprintNV with Dim =
2D and addressModeU or addressModeV in the sampler is not
VK_SAMPLER_ADDRESS_MODE_CLAMP_TO_EDGE.
The SPIR-V instruction is OpImageSampleFootprintNV with Dim =
3D and addressModeU, addressModeV, or addressModeW in
the sampler is not VK_SAMPLER_ADDRESS_MODE_CLAMP_TO_EDGE.
If the underlying VkImage format has an X component in its format
description, undefined values are read from those bits.
If the VkImage format and VkImageView format are the same, these
bits will be unused by format conversion and this will have no effect.
However, if the VkImageView format is different, then some bits of the
result may be undefined.
For example, when a VK_FORMAT_R10X6_UNORM_PACK16 VkImage is
sampled via a VK_FORMAT_R16_UNORM VkImageView, the low 6 bits of
the value before format conversion are undefined and format conversion may
return a range of different values.
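To make the size of that range concrete, the following sketch computes the smallest and largest values that a 16-bit unsigned normalized conversion could return for a given 10-bit value. It assumes the 10 defined bits occupy the most significant bits of the 16-bit word and that conversion divides by 65535. The name `r16_unorm_range` is hypothetical.

```python
def r16_unorm_range(r10):
    """Possible R16_UNORM results for a 10-bit value with 6 undefined low bits.

    Assumption: the defined bits are bits 15..6 of the 16-bit word, and
    the undefined bits 5..0 may hold any value.
    """
    base = r10 << 6
    lowest = base / 65535.0            # undefined bits all zero
    highest = (base | 0x3F) / 65535.0  # undefined bits all one
    return lowest, highest


lo, hi = r16_unorm_range(1023)
print(lo, hi)  # the largest 10-bit value may read back as anything up to 1.0
```

The spread is at most 63/65535, a little under 0.1% of the full range.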
Some implementations will return undefined values in the case where a
sampler uses a VkSamplerAddressMode of
VK_SAMPLER_ADDRESS_MODE_MIRRORED_REPEAT, the sampler is used with
operands Offset, ConstOffset, or ConstOffsets, and the value
of the offset is larger than or equal to the corresponding width, height, or
depth of any accessed image level.
This behavior was not tested prior to Vulkan conformance test suite version
1.3.8.0.
Affected implementations will have a conformance test waiver for this issue.
Layout Validation
If all planes of a disjoint multi-planar image are not in the same
image layout, the image must not be sampled
with sampler Y′CBCR conversion enabled.
If the texel lies beyond the selected cube map face in either only
x or only y, then the coordinates (x,y,layer) are
transformed to select the adjacent texel from the appropriate
neighboring face.
Cube Map Corner Texel
If the texel lies beyond the selected cube map face in both x and
y, then there is no unique neighboring face from which to read
that texel.
The texel should be replaced by the average of the three values of the
adjacent texels in each incident face.
However, implementations may replace the cube map corner texel by
other methods.
The methods are subject to the constraint that for linear filtering if the
three available texels have the same value, the resulting filtered texel
must have that value, and for cubic filtering if the twelve available
samples have the same value, the resulting filtered texel must have that
value.
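The recommended corner replacement, a component-wise average of the three adjacent texels, can be sketched as follows. The name `corner_texel` and the tuple representation of a texel are illustrative. Implementations may use other methods, as noted above.

```python
def corner_texel(a, b, c):
    """Average three texels component-wise.

    `a`, `b`, and `c` are the adjacent texels from the three faces that
    meet at the cube map corner, each given as a tuple of components.
    """
    return tuple((x + y + z) / 3.0 for x, y, z in zip(a, b, c))


red = (1.0, 0.0, 0.0, 1.0)
green = (0.0, 1.0, 0.0, 1.0)
blue = (0.0, 0.0, 1.0, 1.0)
print(corner_texel(red, green, blue))

# When all three texels hold the same value, the average is that value
# (exactly so for the value used here).
same = (0.5, 0.5, 0.5, 0.5)
print(corner_texel(same, same, same))  # (0.5, 0.5, 0.5, 0.5)
```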
Border Replacement
If the sampler includes a border, out of bounds texels are replaced with a
value based on the image format and the borderColor of the sampler.
The border color is:
The custom border color (U) may be rounded by implementations prior
to texel replacement, but the error introduced by such a rounding must not
exceed one ULP of the image’s format.
The names VK_BORDER_COLOR_*_TRANSPARENT_BLACK,
VK_BORDER_COLOR_*_OPAQUE_BLACK, and
VK_BORDER_COLOR_*_OPAQUE_WHITE are meant to describe which components
are zeros and ones in the vocabulary of compositing, and are not meant to
imply that the numerical value of VK_BORDER_COLOR_INT_OPAQUE_WHITE is
a saturating value for integers.
This is substituted for the texel value by replacing the number of
components in the image format
Table 2. Border Texel Components After Replacement
As VK_FORMAT_B4G4R4A4_UNORM_PACK16 is required by Vulkan, support must
be advertised for this format.
Some Vulkan implementations on Apple hardware implement this format
through a hardware format with a different channel order, swizzled to match
Vulkan’s expectations.
Unfortunately, the swizzle cannot be readily applied to the fixed border
colors, resulting in an apparent channel swap.
For most standard border colors this does not result in a modification to
the sampled output.
However, VK_BORDER_COLOR_*_OPAQUE_BLACK will instead be sampled as
transparent red or blue.
If the customBorderColorWithoutFormat feature is supported and enabled,
this functionality is expected to work without issue, but this feature may
come with a performance cost.
When border color replacement occurs, texel reads
are skipped, and the replaced color is used for ongoing operations instead.
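A minimal sketch of border replacement for a two-dimensional image follows. An out-of-bounds texel yields the border color and no read takes place. The constants list the three standard floating-point border colors as (R, G, B, A). The name `read_or_border` and the nested-list image are hypothetical, and the sketch leaves out the per-format component replacement of Table 2 as well as custom border colors.

```python
# Standard floating-point border colors as (R, G, B, A).
# The integer variants use the integers 0 and 1 in the same positions.
TRANSPARENT_BLACK = (0.0, 0.0, 0.0, 0.0)
OPAQUE_BLACK = (0.0, 0.0, 0.0, 1.0)
OPAQUE_WHITE = (1.0, 1.0, 1.0, 1.0)


def read_or_border(image, i, j, border_color):
    """Return the texel at (i, j), or the border color if out of bounds.

    `image` is a list of rows, each row a list of RGBA tuples. When the
    coordinates are out of bounds the image is never indexed, mirroring
    the rule that texel reads are skipped.
    """
    height, width = len(image), len(image[0])
    if 0 <= i < width and 0 <= j < height:
        return image[j][i]
    return border_color


image = [[(1.0, 0.0, 0.0, 1.0), (0.0, 1.0, 0.0, 1.0)]]  # 2x1 image
print(read_or_border(image, 1, 0, OPAQUE_BLACK))   # in bounds: the green texel
print(read_or_border(image, -1, 0, OPAQUE_BLACK))  # out of bounds: border color
```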
Texel Reads
The texel is read from the image as outlined in Image Reads,
using the converted image coordinates.
The returned components of each texel are then processed by further input
operations.
Depth Compare Operation
If the image view has a depth/stencil format, the depth component is
selected by the aspectMask, and the operation is an OpImage*Dref*
instruction, a depth comparison is performed.
The result is 1.0 if the comparison evaluates to true, and
0.0 otherwise.
This value replaces the depth component D.
The compare operation is selected by the VkCompareOp value set by
VkSamplerCreateInfo::compareOp.
The reference value from the SPIR-V operand Dref and the texel depth
value Dtex are used as the reference and test values,
respectively, in that operation.
If the image being sampled has an unsigned normalized fixed-point format,
then Dref is clamped to [0,1] before the compare operation.
If the value of magFilter is VK_FILTER_LINEAR, or the value of
minFilter is VK_FILTER_LINEAR, then D may be computed in
an implementation-dependent manner which differs from the normal rules of
linear filtering.
The resulting value must be in the range [0,1] and should be
proportional to, or a weighted average of, the number of comparison passes
or failures.
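The comparison for a single texel can be sketched as below, with Dref as the reference (left-hand) value and Dtex as the test (right-hand) value. The dictionary keys reuse the VkCompareOp enumerant names as plain strings. The function name `depth_compare` and the `unorm_format` flag are hypothetical, and the implementation-dependent behavior under linear filtering is not modeled.

```python
import operator

# Each entry evaluates "reference OP test".
COMPARE_OPS = {
    "VK_COMPARE_OP_NEVER": lambda ref, tex: False,
    "VK_COMPARE_OP_LESS": operator.lt,
    "VK_COMPARE_OP_EQUAL": operator.eq,
    "VK_COMPARE_OP_LESS_OR_EQUAL": operator.le,
    "VK_COMPARE_OP_GREATER": operator.gt,
    "VK_COMPARE_OP_NOT_EQUAL": operator.ne,
    "VK_COMPARE_OP_GREATER_OR_EQUAL": operator.ge,
    "VK_COMPARE_OP_ALWAYS": lambda ref, tex: True,
}


def depth_compare(compare_op, d_ref, d_tex, unorm_format=False):
    """Return 1.0 if `d_ref OP d_tex` holds, else 0.0.

    `unorm_format` marks an unsigned normalized fixed-point image format,
    in which case d_ref is clamped to [0, 1] before the comparison.
    """
    if unorm_format:
        d_ref = min(max(d_ref, 0.0), 1.0)
    return 1.0 if COMPARE_OPS[compare_op](d_ref, d_tex) else 0.0


print(depth_compare("VK_COMPARE_OP_LESS", 0.3, 0.5))  # 1.0
# Clamping changes the outcome when Dref lies outside [0, 1].
print(depth_compare("VK_COMPARE_OP_LESS_OR_EQUAL", 1.5, 1.0))        # 0.0
print(depth_compare("VK_COMPARE_OP_LESS_OR_EQUAL", 1.5, 1.0, True))  # 1.0
```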
Component Swizzle
All texel input instructions apply a swizzle based on:
The swizzle can rearrange the components of the texel, or substitute zero
or one for any components.
It is defined as follows for each color component:
where:
If the border color is one of the VK_BORDER_COLOR_*_OPAQUE_BLACK enums
and the VkComponentSwizzle is not the
identity swizzle for all
components, the value of the texel after swizzle is undefined.
If the image view has a depth/stencil format and the
VkComponentSwizzle is VK_COMPONENT_SWIZZLE_ONE, and
VkPhysicalDeviceMaintenance5Properties::depthStencilSwizzleOneSupport
is not VK_TRUE, the value of the texel after swizzle is undefined.
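The effect of a swizzle on a single texel can be sketched as follows, using the VkComponentSwizzle enumerant names as plain strings. The function name `apply_swizzle` and the `one` parameter are hypothetical. The parameter stands for the value substituted by VK_COMPONENT_SWIZZLE_ONE, assumed to be 1.0 for floating-point and normalized formats and 1 for integer formats. The undefined cases described above are not modeled.

```python
def apply_swizzle(texel, swizzles, one=1.0):
    """Rearrange or substitute the (R, G, B, A) components of a texel.

    `swizzles` holds one VkComponentSwizzle name per output component,
    in r, g, b, a order. IDENTITY keeps the component in place.
    """
    r, g, b, a = texel
    sources = {
        "VK_COMPONENT_SWIZZLE_ZERO": type(one)(0),
        "VK_COMPONENT_SWIZZLE_ONE": one,
        "VK_COMPONENT_SWIZZLE_R": r,
        "VK_COMPONENT_SWIZZLE_G": g,
        "VK_COMPONENT_SWIZZLE_B": b,
        "VK_COMPONENT_SWIZZLE_A": a,
    }
    return tuple(
        texel[n] if s == "VK_COMPONENT_SWIZZLE_IDENTITY" else sources[s]
        for n, s in enumerate(swizzles)
    )


# Swap red and blue, keep green, and force alpha to one.
print(apply_swizzle(
    (0.1, 0.2, 0.3, 0.4),
    ("VK_COMPONENT_SWIZZLE_B", "VK_COMPONENT_SWIZZLE_IDENTITY",
     "VK_COMPONENT_SWIZZLE_R", "VK_COMPONENT_SWIZZLE_ONE"),
))  # (0.3, 0.2, 0.1, 1.0)
```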
Sparse Residency
OpImageSparse* instructions return a structure which includes a
residency code indicating whether any texels accessed by the instruction
are sparse unbound texels.
This code can be interpreted by the OpImageSparseTexelsResident
instruction which converts the residency code to a boolean value.
In some color models, the color representation is defined in terms of
monochromatic light intensity (often called “luma”) and color differences
relative to this intensity, often called “chroma”.
It is common for color models other than RGB to represent the chroma
components at lower spatial resolution than the luma component.
This approach is used to take advantage of the eye’s lower spatial
sensitivity to color compared with its sensitivity to brightness.
Less commonly, the same approach is used with additive color, since the
green component dominates the eye’s sensitivity to light intensity and the
spatial sensitivity to color introduced by red and blue is lower.
Lower-resolution components are “downsampled” by resizing them to a lower
spatial resolution than the component representing luminance.
This process is also commonly known as “chroma subsampling”.
There is one luminance sample in each texture texel, but each chrominance
sample may be shared among several texels in one or both texture dimensions.
“_444” formats do not spatially downsample chroma values
compared with luma: there are unique chroma samples for each texel.
“_422” formats have downsampling in the x dimension
(corresponding to u or s coordinates): they are sampled at half the
resolution of luma in that dimension.
“_420” formats have downsampling in the x dimension
(corresponding to u or s coordinates) and the y dimension
(corresponding to v or t coordinates): they are sampled at half the
resolution of luma in both dimensions.
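The relationship between the luma extent and the chroma extent for the three format families can be sketched as below. The function name `chroma_extent` is hypothetical, and the sketch assumes the luma dimensions are even wherever a dimension is downsampled.

```python
# Downsample divisors (x, y) for each format suffix.
CHROMA_DIVISORS = {
    "_444": (1, 1),  # no downsampling
    "_422": (2, 1),  # half resolution in x only
    "_420": (2, 2),  # half resolution in x and y
}


def chroma_extent(width, height, suffix):
    """Return the (width, height) of the chroma samples for a luma extent.

    Assumption: `width` and `height` are multiples of the divisors.
    """
    div_x, div_y = CHROMA_DIVISORS[suffix]
    return (width // div_x, height // div_y)


for suffix in ("_444", "_422", "_420"):
    print(suffix, chroma_extent(8, 4, suffix))
# _444 (8, 4)
# _422 (4, 4)
# _420 (4, 2)
```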
The process of reconstructing a full color value for texture access involves
accessing both chroma and luma values at the same location.
To generate the color accurately, the values of the lower-resolution
components at the location of the luma samples are reconstructed from the
lower-resolution sample locations, an operation known here as “chroma
reconstruction” irrespective of the actual color model.
The location of the chroma samples relative to the luma coordinates is
determined by the xChromaOffset and yChromaOffset members of the
VkSamplerYcbcrConversionCreateInfo structure used to create the
sampler Y′CBCR conversion.
The following diagrams show the relationship between unnormalized (u,v)
coordinates and (i,j) integer texel positions in the luma component
(shown in black, with circles showing integer sample positions) and the
texel coordinates of reduced-resolution chroma components, shown as crosses
in red.
If the chroma values are reconstructed at the locations of the luma samples
by means of interpolation, chroma samples from outside the image bounds are
needed; these are determined according to Wrapping Operation.
These diagrams represent this by showing the bounds of the “chroma texel”
extending beyond the image bounds, and including additional chroma sample
positions where required for interpolation.
The limits of a sample for NEAREST sampling are shown as a grid.
If the format’s R and B components are reduced in resolution in just
width by a factor of two relative to the G component (i.e. this is a
“_422” format), the values
accessed by texel filtering are
reconstructed as follows:
If the format’s R and B components are reduced in resolution in width
and height by a factor of two relative to the G component (i.e. this is
a “_420” format), the values
accessed by texel filtering are
reconstructed as follows:
xChromaOffset and yChromaOffset have no effect if
chromaFilter is VK_FILTER_NEAREST for explicit reconstruction.
If the format’s R and B components are reduced in resolution in just
width by a factor of two relative to the G component (i.e. this is a
“_422” format):
If the format’s R and B components are reduced in resolution in width
and height by a factor of two relative to the G component (i.e. this is
a “_420” format), a similar relationship applies.
Due to the number of options, these formulae are expressed more
concisely as follows:
In the case where the texture itself is bilinearly interpolated as described
in Texel Filtering, thus requiring four
full-color samples for the filtering operation, and where the reconstruction
of these samples uses bilinear interpolation in the chroma components due to
chromaFilter=VK_FILTER_LINEAR, up to nine chroma samples may be
required, depending on the sample location.
Implicit Reconstruction
In implicit reconstruction, the samples are interpolated as
required by the filter settings of the sampler, except that
chromaFilter takes precedence for the chroma samples.
This will not have any visible effect if the locations of the luma samples
coincide with the location of the samples used for rasterization.
The sample coordinates are adjusted by the downsample factor of the
component (such that, for example, the sample coordinates are divided by two
if the component has a downsample factor of two relative to the luma
component):
Sampler Y′CBCR Conversion
Sampler Y′CBCR conversion performs the following operations on sampled
data, in order: