This just runs ninja on the piet-gpu/shaders on a Windows machine, so
translated shaders match the existing pipeline.
At some point, we'll rework this to reduce friction.
This PR reworks the clip implementation. The highlight is that clip bounding box accounting is now done on GPU rather than CPU. The clip mask is also rasterized on EndClip rather than BeginClip, which decreases memory traffic needed for the clip stack.
This is a pretty good working state, but not all cleanup has been applied. An important next step is to remove the CPU clip accounting (it is computed and encoded, but that result is not used). Another step is to remove the Annotated structure entirely.
Fixes#88. Also relevant to #119
Make max workgroup size 256 and respect LG_WG_FACTOR.
Because the monoid scans only support a height of 2, this will reduce
the maximum scene complexity we can render. But it also increases
compatibility. Supporting larger scans is a TODO.
This is one of the stages in the new element pipeline. It's a simple
one, just a prefix sum of a couple counts, and some of it will probably
get merged with a downstream stage, but we'll do it separately for now
for convenience.
This patch also contains an update to Vulkan tools 1.2.198, which
accounts for the large diff of translated shaders.