Thanks to Jeff Bolz for spotting the write-after-read hazard on the
sh_flag accesses. This fixes observed failures on Nvidia Turing and
Ampere on DX12.
This is our version of the standard message passing litmus test for
atomics. It does a bunch in parallel and permutes the reads and writes
extensively, so it's been more sensitive than existing tests.
This adds both regular and Vulkan memory model atomic versions of the
prefix sum test, compiled by #ifdef. The build chain is getting messy,
but I think it's important to test this stuff.
The MSL translation of the prefix example had its bindings permuted; a
flag prevents this (but, as is typical for shader translation,
potentially creates other problems).
Also use explicit unsigned literal to avoid DXC warnings.
Add a clear stage and associated tests, and also use it on non-Vulkan
backends to clear the state buffer.
While that's a workaround and will go away when we implement the actual
clear command, it's also a nice demo of how the new "stage" structure
composes.
Use an array of bindtypes rather than the previous situation, which was
a choice of buffer counts, or a heavier builder pattern.
The main thing this unlocks is distinguishing between readonly and
read/write buffers, which is important for DX12.
This is WIP, the Metal part hasn't been done, and the old stuff not
deleted.
Part of #125
This adds a prefix sum test. This patch is also trying to get a little
more serious about structuring both the test runner (toward the goal of
collecting proper statistics) and pipeline stages for the tests.
Still WIP but giving good results.