Name Description Size
conv.rs 8561
help.rs ! Helpers for the HLSL backend. Important note about `Expression::ImageQuery`/`Expression::ArrayLength` and the HLSL backend: because HLSL's `GetDimensions` function (<https://docs.microsoft.com/en-us/windows/win32/direct3dhlsl/dx-graphics-hlsl-to-getdimensions>) returns its results through output parameters, the backend can't use it as an expression. Instead, it generates a unique wrapped function per `Expression::ImageQuery`, based on the texture info and the query function. See the `WrappedImageQuery` struct, which represents one such unique function; these functions are generated before any statements and expressions are written. This lets the backend treat `Expression::ImageQuery` as an expression by emitting a call to the wrapped function. For example: ```wgsl let dim_1d = textureDimensions(image_1d); ``` ```hlsl int NagaDimensions1D(Texture1D<float4> tex) { uint4 ret; tex.GetDimensions(ret.x); return ret.x; } int dim_1d = NagaDimensions1D(image_1d); ``` 53285
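The "one wrapped function per unique query" idea in the `help.rs` notes can be sketched as a set of keys that is consulted before emitting a definition. This is a minimal sketch, not Naga's implementation: `QueryKey`, its fields, and `wrapper_for` are hypothetical names, and the emitted text is only a placeholder declaration.

```rust
use std::collections::HashSet;

/// Hypothetical, simplified key identifying one wrapper function.
/// (Naga's real `WrappedImageQuery` is not reproduced here.)
#[derive(Clone, Debug, PartialEq, Eq, Hash)]
struct QueryKey {
    texture_type: &'static str, // e.g. "Texture1D<float4>"
    query: &'static str,        // e.g. "Dimensions"
    suffix: &'static str,       // e.g. "1D"
}

/// Returns the wrapper's name, plus its declaration the first time the key is seen.
fn wrapper_for(seen: &mut HashSet<QueryKey>, key: QueryKey) -> (String, Option<String>) {
    let name = format!("Naga{}{}", key.query, key.suffix);
    // `insert` returns true only when the key was not present yet.
    let definition = if seen.insert(key.clone()) {
        Some(format!(
            "uint {}({} tex); // body would call tex.GetDimensions",
            name, key.texture_type
        ))
    } else {
        None
    };
    (name, definition)
}

fn main() {
    let mut seen = HashSet::new();
    let key = QueryKey {
        texture_type: "Texture1D<float4>",
        query: "Dimensions",
        suffix: "1D",
    };
    // Two uses of the same query: the wrapper is declared once and called twice.
    for _ in 0..2 {
        let (name, definition) = wrapper_for(&mut seen, key.clone());
        if let Some(def) = definition {
            println!("{def}");
        }
        println!("... = {name}(image_1d);");
    }
}
```

Keying on the texture type and query kind means two queries on different 1D textures of the same type share a single wrapper.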
keywords.rs 19578
mod.rs ! Backend for [HLSL][hlsl] (High-Level Shading Language). # Supported shader model versions: - 5.0 - 5.1 - 6.0 # Layout of values in `uniform` buffers WGSL's ["Internal Layout of Values"][ilov] rules specify how each WGSL type should be stored in `uniform` and `storage` buffers. The HLSL we generate must access values in that form, even when it is not what HLSL would use normally. The rules described here only apply to WGSL `uniform` variables. WGSL `storage` buffers are translated as HLSL `ByteAddressBuffers`, for which we generate `Load` and `Store` method calls with explicit byte offsets. WGSL pipeline inputs must be scalars or vectors; they cannot be matrices, which is where the interesting problems arise. ## Row- and column-major ordering for matrices WGSL specifies that matrices in uniform buffers are stored in column-major order. This matches HLSL's default, so one might expect things to be straightforward. Unfortunately, WGSL and HLSL disagree on what indexing a matrix means: in WGSL, `m[i]` retrieves the `i`'th *column* of `m`, whereas in HLSL it retrieves the `i`'th *row*. We want to avoid translating `m[i]` into some complicated reassembly of a vector from individually fetched components, so this is a problem. However, with a bit of trickery, it is possible to use HLSL's `m[i]` as the translation of WGSL's `m[i]`: - We declare all matrices in uniform buffers in HLSL with the `row_major` qualifier, and transpose the row and column counts: a WGSL `mat3x4<f32>`, say, becomes an HLSL `row_major float3x4`. (Note that WGSL and HLSL type names put the row and column in reverse order.) Since the HLSL type is the transpose of how WebGPU directs the user to store the data, HLSL will load all matrices transposed. - Since matrices are transposed, an HLSL indexing expression retrieves the "columns" of the intended WGSL value, as desired. 
- For vector-matrix multiplication, since `mul(transpose(m), v)` is equivalent to `mul(v, m)` (note the reversal of the arguments), and `mul(v, transpose(m))` is equivalent to `mul(m, v)`, we can translate WGSL `m * v` and `v * m` to HLSL by simply reversing the arguments to `mul`. ## Padding in two-row matrices An HLSL `row_major floatKx2` matrix has padding between its rows that the WGSL `matKx2<f32>` matrix it represents does not. HLSL stores all matrix rows [aligned on 16-byte boundaries][16bb], whereas WGSL says that the columns of a `matKx2<f32>` need only be [aligned as required for `vec2<f32>`][ilov], which is [eight-byte alignment][8bb]. To compensate for this, any time a `matKx2<f32>` appears in a WGSL `uniform` variable, whether directly as the variable's type or as part of a struct/array, we actually emit `K` separate `float2` members, and assemble/disassemble the matrix from its columns (in WGSL; rows in HLSL) upon load and store. For example, the following WGSL struct type: ```ignore struct Baz { m: mat3x2<f32>, } ``` is rendered as the HLSL struct type: ```ignore struct Baz { float2 m_0; float2 m_1; float2 m_2; }; ``` The `wrapped_struct_matrix` functions in `help.rs` generate HLSL helper functions to access such members, converting between the stored form and the HLSL matrix types appropriately. For example, for reading the member `m` of the `Baz` struct above, we emit: ```ignore float3x2 GetMatmOnBaz(Baz obj) { return float3x2(obj.m_0, obj.m_1, obj.m_2); } ``` We also emit an analogous `Set` function, as well as functions for accessing individual columns by dynamic index. [hlsl]: https://docs.microsoft.com/en-us/windows/win32/direct3dhlsl/dx-graphics-hlsl [ilov]: https://gpuweb.github.io/gpuweb/wgsl/#internal-value-layout [16bb]: https://github.com/microsoft/DirectXShaderCompiler/wiki/Buffer-Packing#constant-buffer-packing [8bb]: https://gpuweb.github.io/gpuweb/wgsl/#alignment-and-size 12272
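The argument-reversal rule for `mul` in the `mod.rs` notes can be checked numerically. The sketch below is an illustration only: `Rows`, `transpose`, `mul_mat_vec`, and `mul_vec_mat` are hypothetical helpers that mimic HLSL's `mul(m, v)` and `mul(v, m)` on plain `f32` vectors; none of them is part of Naga.

```rust
/// A small dense matrix stored as a list of rows (hypothetical helper type).
type Rows = Vec<Vec<f32>>;

/// Transpose: rows become columns.
fn transpose(m: &Rows) -> Rows {
    let (r, c) = (m.len(), m[0].len());
    (0..c).map(|j| (0..r).map(|i| m[i][j]).collect()).collect()
}

/// Mimics HLSL `mul(m, v)`: matrix times column vector.
fn mul_mat_vec(m: &Rows, v: &[f32]) -> Vec<f32> {
    m.iter()
        .map(|row| row.iter().zip(v).map(|(a, b)| a * b).sum::<f32>())
        .collect()
}

/// Mimics HLSL `mul(v, m)`: row vector times matrix.
fn mul_vec_mat(v: &[f32], m: &Rows) -> Vec<f32> {
    (0..m[0].len())
        .map(|j| (0..m.len()).map(|i| v[i] * m[i][j]).sum::<f32>())
        .collect()
}

fn main() {
    // 2 rows, 3 columns.
    let m: Rows = vec![vec![1.0, 2.0, 3.0], vec![4.0, 5.0, 6.0]];
    let v = [1.0, 0.5, 2.0];
    let w = [3.0, -1.0];

    // mul(v, transpose(m)) is equivalent to mul(m, v).
    assert_eq!(mul_vec_mat(&v, &transpose(&m)), mul_mat_vec(&m, &v));
    // mul(transpose(m), w) is equivalent to mul(w, m).
    assert_eq!(mul_mat_vec(&transpose(&m), &w), mul_vec_mat(&w, &m));
    println!("identities hold");
}
```

Because HLSL sees every uniform matrix transposed, swapping the operands of `mul` is enough to recover the product WGSL asked for, with no explicit `transpose` call in the generated code.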
storage.rs ! Generating accesses to [`ByteAddressBuffer`] contents. Naga IR globals in the [`Storage`] address space are rendered as [`ByteAddressBuffer`]s or [`RWByteAddressBuffer`]s in HLSL. These buffers don't have HLSL types (structs, arrays, etc.); instead, they are just raw blocks of bytes, with methods to load and store values of specific types at particular byte offsets. This means that Naga must translate chains of [`Access`] and [`AccessIndex`] expressions into HLSL expressions that compute byte offsets into the buffer. To generate code for a [`Storage`] access: - Call [`Writer::fill_access_chain`] on the expression referring to the value. This populates [`Writer::temp_access_chain`] with the appropriate byte offset calculations, as a vector of [`SubAccess`] values. - Call [`Writer::write_storage_address`] to emit an HLSL expression for a given slice of [`SubAccess`] values. Naga IR expressions can operate on composite values of any type, but [`ByteAddressBuffer`] and [`RWByteAddressBuffer`] have only a fixed set of `Load` and `Store` methods, to access one through four consecutive 32-bit values. To synthesize a Naga access, you can initialize [`temp_access_chain`] to refer to the composite, and then temporarily push and pop additional steps on [`Writer::temp_access_chain`] to generate accesses to the individual elements/members. The [`temp_access_chain`] field is a member of [`Writer`] solely to allow re-use of the `Vec`'s dynamic allocation. Its value is no longer needed once HLSL for the access has been generated. Note about DXC and Load/Store functions: DXC's HLSL has generic [`Load` and `Store`] functions for [`ByteAddressBuffer`] and [`RWByteAddressBuffer`]. These are not available in FXC's HLSL, so we use them only for types that are only available in DXC, notably 64-bit and 16-bit types. FXC's HLSL has the functions `Load`, `Load2`, `Load3`, and `Load4`, and `Store`, `Store2`, `Store3`, and `Store4`, which load or store a vector of length 1, 2, 3, or 4. 
We use those for 32-bit types, bitcasting to the correct type if necessary. [`Storage`]: crate::AddressSpace::Storage [`ByteAddressBuffer`]: https://learn.microsoft.com/en-us/windows/win32/direct3dhlsl/sm5-object-byteaddressbuffer [`RWByteAddressBuffer`]: https://learn.microsoft.com/en-us/windows/win32/direct3dhlsl/sm5-object-rwbyteaddressbuffer [`Access`]: crate::Expression::Access [`AccessIndex`]: crate::Expression::AccessIndex [`Writer::fill_access_chain`]: super::Writer::fill_access_chain [`Writer::write_storage_address`]: super::Writer::write_storage_address [`Writer::temp_access_chain`]: super::Writer::temp_access_chain [`temp_access_chain`]: super::Writer::temp_access_chain [`Writer`]: super::Writer [`Load` and `Store`]: https://github.com/microsoft/DirectXShaderCompiler/wiki/ByteAddressBuffer-Load-Store-Additions 22780
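The access-chain approach in the `storage.rs` notes amounts to turning a list of steps into one byte-offset expression. The following is a rough sketch under stated assumptions: `Step` and `address_expr` are hypothetical stand-ins, not Naga's `SubAccess` or `Writer::write_storage_address`, and the buffer name `buf` and the offsets are made up for illustration.

```rust
/// One step in a hypothetical access chain (a simplified stand-in, not Naga's `SubAccess`).
enum Step {
    /// A fixed byte offset, e.g. a struct member.
    Offset(u32),
    /// A dynamically indexed element, contributing `index_expr * stride` bytes.
    Index { index_expr: String, stride: u32 },
}

/// Render the chain as an HLSL byte-offset expression such as `16+i*32+8`.
fn address_expr(chain: &[Step]) -> String {
    if chain.is_empty() {
        return "0".to_string();
    }
    chain
        .iter()
        .map(|step| match step {
            Step::Offset(bytes) => bytes.to_string(),
            Step::Index { index_expr, stride } => format!("{index_expr}*{stride}"),
        })
        .collect::<Vec<_>>()
        .join("+")
}

fn main() {
    // Member at byte 16, then element `i` of an array with a 32-byte stride,
    // then a field at byte 8 inside that element.
    let chain = [
        Step::Offset(16),
        Step::Index { index_expr: "i".to_string(), stride: 32 },
        Step::Offset(8),
    ];
    let addr = address_expr(&chain);
    // A 32-bit float is read with `Load` and bitcast with `asfloat`.
    println!("asfloat(buf.Load({addr}))");
}
```

Loading a composite value follows the same pattern: push one extra step per element or member, emit the load for that address, then pop the step again.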
writer.rs 163026