Adds kurbo as an optional dependency and implements conversions to/from the common types.
This also removes the direct pinot dependency and changes moscato (temporarily) to a git based dep to allow iteration on the underlying glyph loading code without PR churn here.
Adds a test to visualize the blend modes. Fixes a dumb bug in blend.h and also a more subtle issue where default blending is not the same as clipping, as the former needs to always push a blend group (to cause isolation) and the latter does not. This might be something we need to get back to.
This should fix the rendering, so it fairly closely resembles the Mozilla reference image. There's also a compile-time switch to disable sRGB conversion, which is (sadly) needed for compatible rendering.
Split the blend stack into register and memory segments. Do blending in registers up to that size, then spill to memory if needed.
This version may regress performance on Pixel 4, as it uses common memory for the blend stack, rather than keeping that memory read-only in fine rasterization, and using a separate buffer for blend stack. This needs investigation. It's possible we'll want to have single common memory as a config option, as it pools allocations and decreases the probability of failure.
Also a flaw in this version: there is no checking of memory overflow.
For understanding code history: this commit largely reverts #77, but there were some intervening changes to blending, and this commit also implements the split so some of the stack is in registers.
Closes#156
The blending math had two errors: first, colors were not separated for the purpose of blending (blending was wrongly applied to premultiplied values), and second, alpha was applied over-aggressively to the alpha channel.
This PR does *not* address the issue of gamma correctness. That is a complex issue and should probably be handled in the short term by disabling sRGB conversions and doing the internal math in sRGB color space rather than linear. This will degrade the quality of antialiasing but on the other hand give spec-compliant results for compositing.
We remove the plus-darker mode as its specification does not appear to be valid. The plus-lighter mode remains as it is quite useful for cross-fading effects.
Also the generated shaders were compiled on mac so the DXIL is unsigned. Those should be compiled on Windows before this PR is merged. (and we should figure out a better strategy for all that)
* Change pgpu-render header file generator to add a comment noting that the file is auto-generated
* Remove bin target and associated dependencies from piet-scene crate
* Remove FP constructors from the Color type
* Better codegen and rounding in Color::to_premul_u32()
* Add kurbo attribution for piet_scene::geometry module
* remove CmdBuff::dispatch() which was moved to ComputePass
* remove CmdBuff::write_timestamp() which is replaced by timestamp index pair in ComputePassDescriptor
This commit adds timestamps to compute pass boundaries for command style timer queries on metal.
It also updates the code in piet-gpu/stages, piet-gpu/lib.rs and tests/ to use the new ComputePass type.
We need to be able to call memory_barrier() on ComputePass, to avoid the borrow checker complaining if we tried to call it on the underlying command buffer.
Use compute pass for tests in tests subdir. This is also shaking out some issues that weren't apparent from just collatz.
In particular, we need more autorelease pools to prevent things from leaking. As of this commit, the "clear" test runs correctly but the others haven't yet been converted to the compute_pass format.
This commit implements actual rendering for the new piet-scene crate.
In particular, it adds a new EncodedSceneRef type in piet_gpu::encoder to store references to the raw stream data. This is used as a proxy for both PietGpuRenderContext and piet_scene::scene::Scene to feed data into the renderer.
Much of this is still hacky for expedience and because the intention is to be a transitional state that maintains both interfaces until we can move to the new scene API and add a wrapper for the traditional piet API.
The general structure and relationships between these crates need further discussion.
Current status: the piet-gpu-hal module (including the collatz example)
have the new API (with queries set on compute pass) implemented. The
other uses have not yet been updated.
On Metal, only M1 is tested. The "command" counter style is partly
implemented, but not fully wired up.
The current plan is to more or less follow the wgpu/wgpu-hal approach. In the mux/backend layer (which corresponds fairly strongly to wgpu-hal), there isn't explicit construction of a compute encoder, but there are new methods for beginning and ending a compute pass. At the hub layer (which corresponds to wgpu) there will be a ComputeEncoder object.
That said, there will be some differences. The WebGPU "end" method on a compute encoder is implemented in wgpu as Drop, and that is not ideal. Also, the wgpu-hal approach to timer queries (still based on write_timestamp) is not up to the task of Metal timer queries, where the query offsets have to be specified at compute encoder creation. That's why there are different projects :)
WIP: current state is that stage-style queries work on Apple Silicon, but non-Metal backends are broken, and piet-gpu is not yet updated to use new API.