This is the core logic for robust dynamic memory. There are changes to both shaders and the driver logic.
On the shader side, failure information is more useful and fine grained. In particular, it now reports which stage failed and how much memory would have been required to make that stage succeed.
On the driver side, there is a new RenderDriver abstraction which owns command buffers (and associated query pools) and runs the logic to retry and reallocate buffers when necessary. There's also a fairly significant rework of the logic to produce the config block, as that overlaps the robust memory.
The RenderDriver abstraction may not stay. It was done this way to minimize code disruption, but arguably it should just be combined with Renderer.
Another change: the GLSL length() method on a buffer requires additional infrastructure (at least on Metal, where it needs a binding of its own), so we now pass that in as a field in the config.
This also moves blend memory to its own buffer. This worked out well because coarse rasterization can simply report the size of the blend buffer and it can be reallocated without needing to rerun the pipeline. In the previous state, blend allocations and ptcl writes were interleaved in coarse rasterization, so a failure of the former would require rerunning coarse. This should fix#83 (finally!)
There are a few loose ends. The binaries haven't (yet) been updated (I've been testing using a hand-written test program). Gradients weren't touched so still have a fixed size allocation. And the logic to calculate the new buffer size on allocation failure could be smarter.
Closes#175
This patch sets up very basic CI (right now just cargo fmt) but more importantly compiles shaders in a GitHub Action.
Any PR to branches other than main will run shader compilation. Any push to the dev branch will run shader compilation and then merge to main.
Closes#177
Adds kurbo as an optional dependency and implements conversions to/from the common types.
This also removes the direct pinot dependency and changes moscato (temporarily) to a git based dep to allow iteration on the underlying glyph loading code without PR churn here.
Adds a test to visualize the blend modes. Fixes a dumb bug in blend.h and also a more subtle issue where default blending is not the same as clipping, as the former needs to always push a blend group (to cause isolation) and the latter does not. This might be something we need to get back to.
This should fix the rendering, so it fairly closely resembles the Mozilla reference image. There's also a compile-time switch to disable sRGB conversion, which is (sadly) needed for compatible rendering.
Split the blend stack into register and memory segments. Do blending in registers up to that size, then spill to memory if needed.
This version may regress performance on Pixel 4, as it uses common memory for the blend stack, rather than keeping that memory read-only in fine rasterization, and using a separate buffer for blend stack. This needs investigation. It's possible we'll want to have single common memory as a config option, as it pools allocations and decreases the probability of failure.
Also a flaw in this version: there is no checking of memory overflow.
For understanding code history: this commit largely reverts #77, but there were some intervening changes to blending, and this commit also implements the split so some of the stack is in registers.
Closes#156
The blending math had two errors: first, colors were not separated for the purpose of blending (blending was wrongly applied to premultiplied values), and second, alpha was applied over-aggressively to the alpha channel.
This PR does *not* address the issue of gamma correctness. That is a complex issue and should probably be handled in the short term by disabling sRGB conversions and doing the internal math in sRGB color space rather than linear. This will degrade the quality of antialiasing but on the other hand give spec-compliant results for compositing.
We remove the plus-darker mode as its specification does not appear to be valid. The plus-lighter mode remains as it is quite useful for cross-fading effects.
Also the generated shaders were compiled on mac so the DXIL is unsigned. Those should be compiled on Windows before this PR is merged. (and we should figure out a better strategy for all that)
* Change pgpu-render header file generator to add a comment noting that the file is auto-generated
* Remove bin target and associated dependencies from piet-scene crate
* Remove FP constructors from the Color type
* Better codegen and rounding in Color::to_premul_u32()
* Add kurbo attribution for piet_scene::geometry module
* remove CmdBuff::dispatch() which was moved to ComputePass
* remove CmdBuff::write_timestamp() which is replaced by timestamp index pair in ComputePassDescriptor