The one that takes T is more useful than the one that takes references
to T. When specialization lands, we will be able to have both under the
`extend` name.
This patch includes a number of changes to encourage reading and writing
buffers through mapping rather than copying data as before.
This includes a new `BufWrite` abstraction which is designed for filling
buffers. It behaves much like a Vec<u8>, but with fixed capacity.
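As a rough illustration of the idea (not the actual `BufWrite` API), a fixed-capacity byte writer over a borrowed slice could look like the sketch below. The name `FixedBuf` and all of its methods are hypothetical, and the borrowed slice stands in for a mapped buffer.

```rust
/// Hypothetical fixed-capacity writer; not the real `BufWrite` type.
/// The borrowed slice stands in for a mapped GPU buffer.
pub struct FixedBuf<'a> {
    data: &'a mut [u8],
    len: usize,
}

impl<'a> FixedBuf<'a> {
    pub fn new(data: &'a mut [u8]) -> Self {
        FixedBuf { data, len: 0 }
    }

    /// Number of bytes written so far.
    pub fn len(&self) -> usize {
        self.len
    }

    /// Total capacity; unlike a Vec<u8>, this never grows.
    pub fn capacity(&self) -> usize {
        self.data.len()
    }

    /// Append bytes. Returns false and writes nothing if they do not fit.
    pub fn extend_from_slice(&mut self, bytes: &[u8]) -> bool {
        let end = match self.len.checked_add(bytes.len()) {
            Some(end) if end <= self.data.len() => end,
            _ => return false,
        };
        self.data[self.len..end].copy_from_slice(bytes);
        self.len = end;
        true
    }

    /// The bytes written so far.
    pub fn as_slice(&self) -> &[u8] {
        &self.data[..self.len]
    }
}
```

Rejecting an oversized write rather than reallocating is the design choice that distinguishes this from a Vec<u8>: the backing storage belongs to the mapping and cannot move.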
The previous threshold saw occasional failures, and also fairly wide
variance in the error. The new one seems to be reliable, but hasn't been
rigorously validated.
There's a bit of reorganizing as well. Shader stages are made available
from piet-gpu to the test rig, and config is now a proper structure
(marshaled with bytemuck).
This commit just has the transform stage, which is a simple monoid scan
of affine transforms.
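For reference, a sequential CPU sketch of what a monoid scan over affine transforms computes. The `Affine` type, its six-float layout, and the function names are assumptions made for this illustration, not the shader's actual data structures.

```rust
/// 2D affine transform: x' = a*x + c*y + e, y' = b*x + d*y + f,
/// stored as [a, b, c, d, e, f]. This layout is an assumption of the sketch.
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct Affine(pub [f32; 6]);

impl Affine {
    /// Identity element of the monoid.
    pub const IDENTITY: Affine = Affine([1.0, 0.0, 0.0, 1.0, 0.0, 0.0]);

    pub fn translate(x: f32, y: f32) -> Affine {
        Affine([1.0, 0.0, 0.0, 1.0, x, y])
    }

    pub fn scale(s: f32) -> Affine {
        Affine([s, 0.0, 0.0, s, 0.0, 0.0])
    }

    /// Associative monoid operation: the result applies `rhs` first, then `self`.
    pub fn combine(self, rhs: Affine) -> Affine {
        let (a, b) = (self.0, rhs.0);
        Affine([
            a[0] * b[0] + a[2] * b[1],
            a[1] * b[0] + a[3] * b[1],
            a[0] * b[2] + a[2] * b[3],
            a[1] * b[2] + a[3] * b[3],
            a[0] * b[4] + a[2] * b[5] + a[4],
            a[1] * b[4] + a[3] * b[5] + a[5],
        ])
    }
}

/// Inclusive scan: out[i] = input[0] * input[1] * ... * input[i].
pub fn inclusive_scan(input: &[Affine]) -> Vec<Affine> {
    let mut acc = Affine::IDENTITY;
    input
        .iter()
        .map(|t| {
            acc = acc.combine(*t);
            acc
        })
        .collect()
}
```

Because `combine` is associative with an identity, the same result can be computed in parallel by scanning partitions independently and then combining the partition totals.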
Progress toward #119
Reduce allocation of descriptor heaps. This change also enables clearing
of buffers, as the handles are needed at command dispatch time.
Also updates the tests to use clear_buffers on DX12. Looking forward to
being able to get rid of the compute shader workaround on Metal.
This is a followup on #125, and progress toward #95
Thanks to Jeff Bolz for spotting the write-after-read hazard on the
sh_flag accesses. This fixes observed failures on Nvidia Turing and
Ampere on DX12.
This is our version of the standard message passing litmus test for
atomics. It does a bunch in parallel and permutes the reads and writes
extensively, so it's been more sensitive than existing tests.
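For readers unfamiliar with the pattern, here is a minimal CPU analogue of the message passing litmus test using Rust atomics. It is only a sketch of the pattern, not the GPU test itself; the function name and iteration structure are assumptions, and it does none of the permutation the shader does.

```rust
use std::sync::atomic::{AtomicU32, Ordering};
use std::thread;

/// Runs the message passing pattern `iterations` times and counts how often
/// the reader saw the flag set but the data still unset. With release/acquire
/// ordering on the flag, that outcome is forbidden.
pub fn message_passing_failures(iterations: usize) -> usize {
    let mut failures = 0;
    for _ in 0..iterations {
        let data = AtomicU32::new(0);
        let flag = AtomicU32::new(0);
        let observed = thread::scope(|s| {
            // Writer: publish the data, then raise the flag.
            s.spawn(|| {
                data.store(1, Ordering::Relaxed);
                flag.store(1, Ordering::Release);
            });
            // Reader: check the flag, then read the data.
            let reader = s.spawn(|| {
                let f = flag.load(Ordering::Acquire);
                let d = data.load(Ordering::Relaxed);
                (f, d)
            });
            reader.join().unwrap()
        });
        if observed == (1, 0) {
            failures += 1;
        }
    }
    failures
}
```

Weakening both flag accesses to `Ordering::Relaxed` would make the forbidden outcome permissible under the memory model, which is the kind of bug such a test is meant to expose.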
This adds both regular and Vulkan memory model atomic versions of the
prefix sum test, selected at compile time by #ifdef. The build chain is
getting messy, but I think it's important to test this stuff.
Integrate DXC for translating HLSL for use in DX12. This will work
around FXC limitations and unlock the use of more advanced HLSL features
such as subgroups.
This hardcodes the use of DXIL, but it could be adapted (with a bit of
effort) to choose between DXIL and HLSL at runtime.
The MSL translation of the prefix example had its bindings permuted; a
flag prevents this (but, as is typical for shader translation,
potentially creates other problems).
Also use an explicit unsigned literal to avoid DXC warnings.
Add a clear stage and associated tests, and also use it on non-Vulkan
backends to clear the state buffer.
While that's a workaround and will go away when we implement the actual
clear command, it's also a nice demo of how the new "stage" structure
composes.
This gets it working on macOS. Also deletes the old implementation.
There's also an update to winit 0.25 in here, because it was easier to
roll forward than fix inconsistent Cargo.lock. At some point, we should
systematically update all deps.
Use an array of bindtypes rather than the previous situation, which was
a choice of buffer counts, or a heavier builder pattern.
The main thing this unlocks is distinguishing between readonly and
read/write buffers, which is important for DX12.
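A small sketch of why an array of bind types helps: each entry can be mapped independently to the kind of view a backend needs. The enum variants and function names below are hypothetical simplifications, not the HAL's actual definitions. The DX12 mapping reflects the usual convention that read-only buffers are bound as SRVs and writable ones as UAVs.

```rust
/// Hypothetical, simplified bind types (buffers only).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum BindType {
    /// Read/write storage buffer.
    Buffer,
    /// Read-only storage buffer.
    BufReadOnly,
}

/// The kind of DX12 descriptor a binding would need.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum ViewKind {
    /// Shader resource view, for read-only access.
    Srv,
    /// Unordered access view, for read/write access.
    Uav,
}

/// Map each binding, in order, to the view kind it requires.
pub fn view_kinds(bind_types: &[BindType]) -> Vec<ViewKind> {
    bind_types
        .iter()
        .map(|b| match b {
            BindType::BufReadOnly => ViewKind::Srv,
            BindType::Buffer => ViewKind::Uav,
        })
        .collect()
}
```

A plain buffer count cannot express this per-binding distinction, which is what the array form adds.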
This is WIP: the Metal part hasn't been done, and the old code hasn't
been deleted.
Part of #125
Have a structured way of gathering test results, rather than the
existing ad hoc approach of just printing stuff.
The details are still pretty primitive, but there's room to grow.
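One possible shape for such a structure, as a sketch only. The `TestResult` fields, method names, and report format here are assumptions for illustration and may not match the test runner's real types.

```rust
#[derive(Debug, PartialEq)]
pub enum Status {
    Pass,
    Fail(String),
}

/// Hypothetical per-test record gathered by the runner.
#[derive(Debug)]
pub struct TestResult {
    name: String,
    status: Status,
    elapsed_secs: Option<f64>,
}

impl TestResult {
    pub fn new(name: &str) -> Self {
        TestResult {
            name: name.to_string(),
            status: Status::Pass,
            elapsed_secs: None,
        }
    }

    /// Record a failure; only the first failure message is kept.
    pub fn fail(&mut self, msg: &str) {
        if self.status == Status::Pass {
            self.status = Status::Fail(msg.to_string());
        }
    }

    /// Record the elapsed time in seconds.
    pub fn timing(&mut self, secs: f64) {
        self.elapsed_secs = Some(secs);
    }

    pub fn is_pass(&self) -> bool {
        self.status == Status::Pass
    }

    /// One-line summary; formatting lives here instead of in each test.
    pub fn report(&self) -> String {
        let time = match self.elapsed_secs {
            Some(t) => format!(" ({:.3} ms)", t * 1e3),
            None => String::new(),
        };
        match &self.status {
            Status::Pass => format!("{}: pass{}", self.name, time),
            Status::Fail(msg) => format!("{}: FAIL ({}){}", self.name, msg, time),
        }
    }
}
```

Keeping results as data rather than printed text leaves room to aggregate statistics or emit other formats later.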
This adds a prefix sum test. This patch is also trying to get a little
more serious about structuring both the test runner (toward the goal of
collecting proper statistics) and pipeline stages for the tests.
Still WIP but giving good results.
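A prefix sum test needs a CPU reference to compare against. The sketch below assumes an inclusive sum over `u32` with wrapping arithmetic; whether the actual test is inclusive or exclusive is not stated here, and both function names are hypothetical.

```rust
/// CPU reference: out[i] = input[0] + ... + input[i], wrapping on overflow.
pub fn inclusive_prefix_sum(input: &[u32]) -> Vec<u32> {
    let mut sum = 0u32;
    input
        .iter()
        .map(|x| {
            sum = sum.wrapping_add(*x);
            sum
        })
        .collect()
}

/// Index of the first element where `gpu_output` disagrees with the CPU
/// reference, or None if they match. A length mismatch is reported at the
/// end of the shorter sequence.
pub fn first_mismatch(input: &[u32], gpu_output: &[u32]) -> Option<usize> {
    let expected = inclusive_prefix_sum(input);
    if expected.len() != gpu_output.len() {
        return Some(expected.len().min(gpu_output.len()));
    }
    expected.iter().zip(gpu_output).position(|(e, g)| e != g)
}
```

Reporting the first mismatching index, rather than a bare pass/fail, tends to make partition boundary bugs easier to spot.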
This was motivated by experiments with the Vulkan memory model. To use
that, we need to explicitly enable the relevant feature at device
creation time. That's a lot easier to do now that push_next works on the
structs in that chain. This PR doesn't do that, though; it only upgrades
the dependency and cleans up deprecations.
The flag read needs acquire semantics. There are a number of ways that
could be expressed, but a generally portable way is to have a barrier
after. However, in the translation to Metal, that barrier needs to be in
uniform control flow. This patch does some workarounds to ensure that.
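The "relaxed read followed by a barrier" approach has a direct CPU analogue in Rust, sketched below. This is an analogy under stated assumptions, not the shader code: the function names are hypothetical, and placing the fence outside the branch only mirrors the uniform control flow constraint, which has no CPU equivalent.

```rust
use std::sync::atomic::{fence, AtomicU32, Ordering};

/// Writer side: store the payload, then raise the flag with release ordering.
pub fn publish(flag: &AtomicU32, data: &AtomicU32, value: u32) {
    data.store(value, Ordering::Relaxed);
    flag.store(1, Ordering::Release);
}

/// Reader side: a relaxed flag load followed by an acquire fence.
pub fn read_when_flagged(flag: &AtomicU32, data: &AtomicU32) -> Option<u32> {
    let f = flag.load(Ordering::Relaxed);
    // The fence runs unconditionally, before branching on the flag value.
    // On the GPU, the corresponding barrier must be in uniform control flow.
    fence(Ordering::Acquire);
    if f != 0 {
        // Ordered after the flag read by the fence above.
        Some(data.load(Ordering::Relaxed))
    } else {
        None
    }
}
```

The tempting alternative of putting the fence inside the `if f != 0` branch is fine on a CPU, but the equivalent shader would place the barrier in non-uniform control flow.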