Commit graph

45 commits

Author SHA1 Message Date
Chad Brokaw 8943fa7ea6 encode absolute transforms
This removes the GPU transform stage, changes shaders to reference transforms directly from the scene, and modifies the render context to maintain a transform stack.
2022-07-15 14:36:20 -04:00
Chad Brokaw 8de34f8728 remove shader gen directories 2022-07-14 14:57:17 -04:00
Chad Brokaw 9626eaa19b separate instance and surface creation
This separates creation of Instance and Surfaces, allowing for rendering to multiple windows.
2022-07-14 14:46:46 -04:00
Commit by GitHub Action f6ea9308ba commit compiled shaders 2022-07-13 19:27:07 +00:00
Raph Levien 64e6268059 Remove generated shaders from dev branch 2022-07-13 12:22:11 -07:00
Chad Brokaw 60bca997e5 Remove deprecated functions
* remove CmdBuff::dispatch() which was moved to ComputePass

* remove CmdBuff::write_timestamp() which is replaced by timestamp index pair in ComputePassDescriptor
2022-05-04 01:59:49 -04:00
chad 02cc867950 command style metal timer queries + compute pass
This commit adds timestamps to compute pass boundaries for command style timer queries on metal.

It also updates the code in piet-gpu/stages, piet-gpu/lib.rs and tests/ to use the new ComputePass type.
2022-04-21 04:20:54 -04:00
Raph Levien 5a9b8d9243 Start applying compute pass to tests
Use compute pass for tests in tests subdir. This is also shaking out some issues that weren't apparent from just collatz.

In particular, we need more autorelease pools to prevent things from leaking. As of this commit, the "clear" test runs correctly but the others haven't yet been converted to the compute_pass format.
2022-04-20 13:45:42 -07:00
Raph Levien 7134be2329 Fix missing blend/clip logic
We always do BeginClip/EndClip if it's a solid tile and the blend mode
is not default.

Also fix missing entry in pipeline layout (affects Vulkan but not Metal).
2022-03-16 14:40:58 -07:00
Raph Levien acb3933d94 Variable size encoding of draw objects
This patch switches to a variable size encoding of draw objects.

In addition to the CPU-side scene encoding, it changes the representation of intermediate per draw object state from the `Annotated` struct to a variable "info" encoding. In addition, the bounding boxes are moved to a separate array (for a more "structure of arrays" approach). Data that's unchanged from the scene encoding is not copied. Rather, downstream stages can access the data from the scene buffer (reducing allocation and copying).

Prefix sums, computed in `DrawMonoid`, track the offset of both scene and intermediate data. The tags for the CPU-side encoding have been split into their own stream (again a change from AoS to SoA style).

This is not necessarily the final form. There's some stuff (including at least one piet-gpu-derive type) that can be deleted. In addition, the linewidth field should probably move from the info encoding to path-specific data. Also, the 1:1 correspondence between draw object and path has not yet been broken.

Closes #152
2022-03-14 16:32:08 -07:00
Raph Levien 3b67a4e7c1 New clip implementation
This PR reworks the clip implementation. The highlight is that clip bounding box accounting is now done on GPU rather than CPU. The clip mask is also rasterized on EndClip rather than BeginClip, which decreases memory traffic needed for the clip stack.

This is a pretty good working state, but not all cleanup has been applied. An important next step is to remove the CPU clip accounting (it is computed and encoded, but that result is not used). Another step is to remove the Annotated structure entirely.

Fixes #88. Also relevant to #119
2022-02-17 17:13:28 -08:00
Raph Levien d948126c16 Adjust workgroup sizes
Make max workgroup size 256 and respect LG_WG_FACTOR.

Because the monoid scans only support a height of 2, this will reduce
the maximum scene complexity we can render. But it also increases
compatibility. Supporting larger scans is a TODO.
2021-12-08 11:48:38 -08:00
Raph Levien 75496f5e67 Fix draw test
We'll revert this later, but for now trying to keep tests green.
2021-12-08 10:38:46 -08:00
Raph Levien 44327fe49f Beginnings of new element pipeline
This successfully renders the tiger; fills and strokes are supported.
Other parts of the imaging model, not yet.

Progress toward #119
2021-12-03 15:33:01 -08:00
Raph Levien 875c8badf4 Add draw object stage
This is one of the stages in the new element pipeline. It's a simple
one, just a prefix sum of a couple counts, and some of it will probably
get merged with a downstream stage, but we'll do it separately for now
for convenience.

This patch also contains an update to Vulkan tools 1.2.198, which
accounts for the large diff of translated shaders.
2021-12-02 13:37:16 -08:00
Raph Levien 178761dcb3 Path stream processing
This patch contains the core of the path stream processing, though some
integration bits are missing. The core logic is tested, though
combinations of path types, transforms, and line widths are not (yet).

Progress towards #119
2021-12-01 07:33:24 -08:00
Raph Levien a7a5b84c86 Clean up stray files 2021-11-30 10:34:42 -08:00
Raph Levien 3039a2ac39 Merge branch 'master' into bufwrite 2021-11-30 10:31:16 -08:00
Raph Levien f1d7560b3c Tweak extend implementation
The one that takes T is more useful than the one that takes references
to T. When specialization lands, we will be able to have both under the
`extend` name.
2021-11-25 22:02:04 -08:00
Raph Levien 9fb2ae91eb Access buffer data through mapping
This patch includes a number of changes to encourage reading and writing
buffers through mapping rather than copying data as before.

This includes a new `BufWrite` abstraction which is designed for filling
buffers. It behaves much like a Vec<u8>, but with fixed capacity.
2021-11-25 21:27:08 -08:00
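The "like a Vec<u8>, but with fixed capacity" behavior described in 9fb2ae91eb can be illustrated with a small sketch. This is not the actual `BufWrite` API: the `FixedBuf` type and its methods are hypothetical names invented for illustration, and the backing store here is a borrowed byte slice rather than a mapped GPU buffer.

```rust
/// Hypothetical stand-in for a fixed-capacity byte writer.
/// The real `BufWrite` writes into mapped buffer memory; this sketch
/// borrows an ordinary byte slice instead (an assumption for illustration).
pub struct FixedBuf<'a> {
    storage: &'a mut [u8],
    len: usize,
}

impl<'a> FixedBuf<'a> {
    /// Wrap a backing slice; its length is the fixed capacity.
    pub fn new(storage: &'a mut [u8]) -> Self {
        FixedBuf { storage, len: 0 }
    }

    /// Number of bytes written so far.
    pub fn len(&self) -> usize {
        self.len
    }

    /// Total number of bytes that can ever be written.
    pub fn capacity(&self) -> usize {
        self.storage.len()
    }

    /// Append bytes. Returns false and writes nothing if they do not fit,
    /// since the buffer can never grow.
    pub fn extend_from_slice(&mut self, bytes: &[u8]) -> bool {
        let end = match self.len.checked_add(bytes.len()) {
            Some(end) if end <= self.storage.len() => end,
            _ => return false,
        };
        self.storage[self.len..end].copy_from_slice(bytes);
        self.len = end;
        true
    }

    /// View of the bytes written so far.
    pub fn as_slice(&self) -> &[u8] {
        &self.storage[..self.len]
    }
}
```

Rejecting an oversized write rather than reallocating is the key difference from `Vec<u8>`: the destination has a size fixed at creation, so the writer can only fill it.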
Raph Levien 8f7ed161a6 Tune transform test parameters
Previous threshold was seeing occasional failures, and also fairly wide
variance in the error. This seems to be reliable, but hasn't been
validated extremely rigorously.
2021-11-24 09:40:52 -08:00
Raph Levien 47f8812e2f Start work on new element pipeline
There's a bit of reorganizing as well. Shader stages are made available
from piet-gpu to the test rig, config is now a proper structure
(marshaled with bytemuck).

This commit just has the transform stage, which is a simple monoid scan
of affine transforms.

Progress toward #119
2021-11-24 08:01:43 -08:00
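The "monoid scan of affine transforms" mentioned in 47f8812e2f can be sketched on the CPU as follows. The `Affine` layout, the composition order, and the function names are assumptions for illustration and may not match the shader code; the scan here is sequential, whereas the GPU stage relies on the same associativity to run in parallel.

```rust
/// A 2D affine transform stored as [a, b, c, d, e, f], mapping
/// (x, y) to (a*x + c*y + e, b*x + d*y + f). Layout is an assumption.
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct Affine(pub [f32; 6]);

impl Affine {
    /// Identity element of the monoid.
    pub const IDENTITY: Affine = Affine([1.0, 0.0, 0.0, 1.0, 0.0, 0.0]);

    /// Pure translation by (tx, ty).
    pub fn translate(tx: f32, ty: f32) -> Affine {
        Affine([1.0, 0.0, 0.0, 1.0, tx, ty])
    }

    /// Associative monoid operation: the matrix product `self * other`,
    /// i.e. `other` is applied to a point first, then `self`.
    pub fn combine(self, other: Affine) -> Affine {
        let p = self.0;
        let q = other.0;
        Affine([
            p[0] * q[0] + p[2] * q[1],
            p[1] * q[0] + p[3] * q[1],
            p[0] * q[2] + p[2] * q[3],
            p[1] * q[2] + p[3] * q[3],
            p[0] * q[4] + p[2] * q[5] + p[4],
            p[1] * q[4] + p[3] * q[5] + p[5],
        ])
    }
}

/// Inclusive scan: out[i] = t[0] * t[1] * ... * t[i].
pub fn scan_transforms(input: &[Affine]) -> Vec<Affine> {
    let mut acc = Affine::IDENTITY;
    input
        .iter()
        .map(|&t| {
            acc = acc.combine(t);
            acc
        })
        .collect()
}
```

Because `combine` is associative and has an identity, partial products can be computed over independent chunks and merged afterward, which is what makes the operation suitable for a parallel scan.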
Raph Levien a8103a4c20 cargo fmt
I really need to make this automated. There are some small challenges though.
2021-11-23 08:49:06 -08:00
Raph Levien abe2a6ceef Fix tests to use bytemuck 2021-11-23 08:48:14 -08:00
Raph Levien 0762cc763c Implement clear_buffers on Metal
Since clearing functionality is not built-in, use a compute shader.

Simplify tests greatly; they don't need the workaround.
2021-11-20 22:24:52 -08:00
Raph Levien 657f219ce8 Better DX12 descriptor management
Reduce allocation of descriptor heaps. This change also enables clearing
of buffers, as the handles are needed at command dispatch time.

Also updates the tests to use clear_buffers on DX12. Looking forward to
being able to get rid of the compute shader workaround on Metal.

This is a followup on #125, and progress toward #95
2021-11-20 16:36:43 -08:00
Raph Levien 69b6632085 Fix write-after-read in prefix test
Thanks to Jeff Bolz for spotting the write-after-read hazard on the
sh_flag accesses. This fixes observed failures on Nvidia Turing and
Ampere on DX12.
2021-11-14 07:13:15 -08:00
Raph Levien f32f2d7f95 Add linked list DXIL
Not sure why it wasn't in the previous commit.
2021-11-12 16:01:27 -08:00
Raph Levien d66f67fa09 Actually add README 2021-11-12 15:27:47 -08:00
Raph Levien 27bedd9ef1 Add README
Also verify linked list results.
2021-11-12 15:23:31 -08:00
Raph Levien c6965de557 Add linked list test
Measure bandwidth of building linked lists with atomics.
2021-11-12 10:23:31 -08:00
Raph Levien 10a624ee75 Add message passing litmus test
This is our version of the standard message passing litmus test for
atomics. It does a bunch in parallel and permutes the reads and writes
extensively, so it's been more sensitive than existing tests.
2021-11-11 16:17:04 -08:00
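For readers unfamiliar with the message passing litmus test named in 10a624ee75, here is a minimal CPU analogue using Rust's standard atomics. It is only a sketch of the pattern, not the GPU shader from the commit: the function name and the single writer/reader pair per iteration are assumptions, and the real test runs many instances in parallel with permuted accesses.

```rust
use std::sync::atomic::{AtomicU32, Ordering};
use std::thread;

/// Run the message passing pattern `iterations` times and count how often
/// the reader observed the flag set but the data still stale.
/// With release/acquire ordering this count must be zero.
pub fn message_passing_violations(iterations: usize) -> usize {
    let mut violations = 0;
    for _ in 0..iterations {
        let data = AtomicU32::new(0);
        let flag = AtomicU32::new(0);
        let saw_stale = thread::scope(|s| {
            // Writer: publish the data, then raise the flag with release.
            s.spawn(|| {
                data.store(42, Ordering::Relaxed);
                flag.store(1, Ordering::Release);
            });
            // Reader: check the flag with acquire, then read the data.
            let reader = s.spawn(|| {
                let f = flag.load(Ordering::Acquire);
                let d = data.load(Ordering::Relaxed);
                f == 1 && d != 42
            });
            reader.join().unwrap()
        });
        if saw_stale {
            violations += 1;
        }
    }
    violations
}
```

The reader may legitimately see the flag as 0 (it simply ran first); the only forbidden outcome is a set flag paired with stale data, which is what the counter tracks.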
Raph Levien 825a1eb04c Add atomic versions of prefix
This adds both regular and Vulkan memory model atomic versions of the
prefix sum test, compiled by #ifdef. The build chain is getting messy,
but I think it's important to test this stuff.
2021-11-11 15:26:47 -08:00
Raph Levien 3f1bbe4af1 Commit DXIL to repo
We're following the policy of committing all translated shaders to the
git repo rather than rebuilding at runtime. Here are the new DXIL ones.
2021-11-11 13:05:22 -08:00
Raph Levien f9d0aa078b Use DXIL shader compilation
Integrate DXC for translating HLSL for use in DX12. This will work
around FXC limitations and unlock the use of more advanced HLSL features
such as subgroups.

This hardcodes the use of DXIL, but it could be adapted (with a bit of
effort) to choose between DXIL and HLSL at runtime.
2021-11-11 12:55:10 -08:00
Raph Levien 7a021793ee Configure number of iterations 2021-11-11 07:26:32 -08:00
Raph Levien a0648a2153 Portability fixes
The MSL translation of the prefix example had its bindings permuted; a
flag prevents this (but, as is typical for shader translation,
potentially creates other problems).

Also use explicit unsigned literal to avoid DXC warnings.
2021-11-11 07:08:39 -08:00
Raph Levien fbfd4ee81b Add workaround for buffer clearing
Add a clear stage and associated tests, and also use it on non-Vulkan
backends to clear the state buffer.

While that's a workaround and will go away when we implement the actual
clear command, it's also a nice demo of how the new "stage" structure
composes.
2021-11-10 17:36:54 -08:00
Raph Levien 94949a6906 Mac port of bind layout rework
This gets it working on mac. Also delete old implementation.

There's also an update to winit 0.25 in here, because it was easier to
roll forward than fix inconsistent Cargo.lock. At some point, we should
systematically update all deps.
2021-11-10 13:40:16 -08:00
Raph Levien 74f2b4fd1c Rework bind layout
Use an array of bindtypes rather than the previous situation, which was
a choice of buffer counts, or a heavier builder pattern.

The main thing this unlocks is distinguishing between readonly and
read/write buffers, which is important for DX12.

This is WIP: the Metal part hasn't been done, and the old implementation hasn't been
deleted.

Part of #125
2021-11-10 11:25:16 -08:00
Raph Levien bd39d26bce Improve collection and reporting of test results
Have a structured way of gathering test results, rather than the
existing ad hoc approach of just printing stuff.

The details are still pretty primitive, but there's room to grow.
2021-11-09 14:40:53 -08:00
Raph Levien 3820e4b2f4 Add missing file
Also add finish_timestamps call, which is needed for DX12 (there are
other issues but this is an easy fix for that one).
2021-11-06 21:46:01 -07:00
Raph Levien b36ca7fc2e Add generated shaders 2021-11-06 16:25:56 -07:00
Raph Levien 4ed339d434 Add tree reduction prefix sum test
Do a tree reduction in addition to the existing decoupled look-back, to
explore the tradeoff between performance and compatibility.
2021-11-06 16:19:26 -07:00
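The tree reduction approach mentioned in 4ed339d434 can be sketched on the CPU as three passes: reduce each partition, scan the partition sums, then scan within each partition starting from its base. This is an illustrative sketch under assumptions, not the shader from the commit: the function name, the inclusive output, and the `part_size` parameter are hypothetical, and a GPU version would apply the middle pass recursively when there are too many partitions.

```rust
/// Inclusive prefix sum computed in three passes over fixed-size partitions.
/// Wrapping arithmetic is used so overflow does not panic.
pub fn tree_prefix_sum(input: &[u32], part_size: usize) -> Vec<u32> {
    assert!(part_size > 0, "partition size must be nonzero");

    // Pass 1: reduce each partition to a single sum.
    let sums: Vec<u32> = input
        .chunks(part_size)
        .map(|c| c.iter().fold(0u32, |a, &x| a.wrapping_add(x)))
        .collect();

    // Pass 2: exclusive scan of the partition sums gives each partition's base.
    let mut bases = Vec::with_capacity(sums.len());
    let mut acc = 0u32;
    for &s in &sums {
        bases.push(acc);
        acc = acc.wrapping_add(s);
    }

    // Pass 3: inclusive scan within each partition, offset by its base.
    let mut out = Vec::with_capacity(input.len());
    for (chunk, &base) in input.chunks(part_size).zip(&bases) {
        let mut running = base;
        for &x in chunk {
            running = running.wrapping_add(x);
            out.push(running);
        }
    }
    out
}
```

Unlike decoupled look-back, no partition waits on another within a pass, so the scheme needs no forward-progress guarantees between workgroups; the cost is the extra passes over the data.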
Raph Levien 33d7b25a92 Start testing framework
This adds a prefix sum test. This patch is also trying to get a little
more serious about structuring both the test runner (toward the goal of
collecting proper statistics) and pipeline stages for the tests.

Still WIP but giving good results.
2021-11-06 11:24:34 -07:00