Commit graph

57 commits

Author SHA1 Message Date
Chad Brokaw ca6c43adab update dependencies 2022-10-19 15:21:29 -04:00
Chad Brokaw 6c5a2cb4f4 macOS fixes
* Fix call to removed encoded_scene method in pgpu-render
* Add new ImageFormat::Surface variant to select a pixel format that matches the platform-specific surface format. This makes gradients consistent across platforms.
2022-10-19 15:20:04 -04:00
Raph Levien 5a9b8d9243 Start applying compute pass to tests
Use compute pass for tests in tests subdir. This is also shaking out some issues that weren't apparent from just collatz.

In particular, we need more autorelease pools to prevent things from leaking. As of this commit, the "clear" test runs correctly but the others haven't yet been converted to the compute_pass format.
2022-04-20 13:45:42 -07:00
Raph Levien 9980c858b6 Fix timer queries in Vulkan and DX12 backends
Current status: the piet-gpu-hal module (including the collatz example)
has the new API (with queries set on compute pass) implemented. The
other uses have not yet been updated.

On Metal, only M1 is tested. The "command" counter style is partly
implemented, but not fully wired up.
2022-04-14 17:17:33 -07:00
Raph Levien ba2b27cc3c Rework of compute encoder abstraction
The current plan is to more or less follow the wgpu/wgpu-hal approach. In the mux/backend layer (which corresponds fairly strongly to wgpu-hal), there isn't explicit construction of a compute encoder, but there are new methods for beginning and ending a compute pass. At the hub layer (which corresponds to wgpu) there will be a ComputeEncoder object.

That said, there will be some differences. The WebGPU "end" method on a compute encoder is implemented in wgpu as Drop, and that is not ideal. Also, the wgpu-hal approach to timer queries (still based on write_timestamp) is not up to the task of Metal timer queries, where the query offsets have to be specified at compute encoder creation. That's why there are different projects :)

WIP: current state is that stage-style queries work on Apple Silicon, but non-Metal backends are broken, and piet-gpu is not yet updated to use new API.
2022-04-14 10:19:28 -07:00
Raph Levien 0cf370f9c7 Mostly working rendering
This exposes interfaces to render glyphs into a texture atlas. The main changes are:

* Methods to plumb raw Metal GPU resources (device, texture, etc) into piet-gpu-hal objects.

* A new glyph_render API specialized to rendering glyphs. This is basically the same as just painting to a canvas, but will allow better caching (and has more direct access to fonts, bypassing the Piet font type which is underdeveloped).

* Ability to render to A8 target in addition to RGBA.

WIP, there are some rough edges, not least of which is that the image format changes are only on mac and cause compile errors elsewhere.
2022-01-19 12:10:51 -08:00
Raph Levien 833d993a4e More progress exposing interface
Much of the surface area exists for rendering now.

WIP of course still
2022-01-18 18:41:28 -08:00
Raph Levien 5e221d2e91 Add capability to function as a guest in Metal
WIP
2022-01-13 17:36:08 -08:00
Raph Levien 9fb2ae91eb Access buffer data through mapping
This patch includes a number of changes to encourage reading and writing
buffers through mapping rather than copying data as before.

This includes a new `BufWrite` abstraction which is designed for filling
buffers. It behaves much like a Vec<u8>, but with fixed capacity.
2021-11-25 21:27:08 -08:00
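The "like a Vec<u8>, but with fixed capacity" idea from the commit above can be illustrated with a minimal sketch. This is not the actual `BufWrite` type from piet-gpu-hal: the name `FixedBufWriter` and all of its methods are hypothetical, and a plain byte slice stands in for the mapped buffer memory.

```rust
/// Hypothetical fixed-capacity byte writer (illustration only, not the
/// real `BufWrite`). The backing slice stands in for mapped buffer memory.
pub struct FixedBufWriter<'a> {
    buf: &'a mut [u8],
    len: usize,
}

impl<'a> FixedBufWriter<'a> {
    /// Wrap a slice; the slice length is the fixed capacity.
    pub fn new(buf: &'a mut [u8]) -> Self {
        FixedBufWriter { buf, len: 0 }
    }

    /// Number of bytes written so far.
    pub fn len(&self) -> usize {
        self.len
    }

    /// Total capacity, which never grows.
    pub fn capacity(&self) -> usize {
        self.buf.len()
    }

    /// Append bytes. Returns false and writes nothing if they do not fit,
    /// where a Vec would reallocate instead.
    pub fn extend_from_slice(&mut self, bytes: &[u8]) -> bool {
        let end = match self.len.checked_add(bytes.len()) {
            Some(end) if end <= self.buf.len() => end,
            _ => return false,
        };
        self.buf[self.len..end].copy_from_slice(bytes);
        self.len = end;
        true
    }

    /// Append a u32 in little-endian byte order.
    pub fn push_u32(&mut self, value: u32) -> bool {
        self.extend_from_slice(&value.to_le_bytes())
    }
}
```

Returning `false` on overflow is one possible design choice for the sketch; panicking would be another reasonable option.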
Raph Levien 2ebdd942cf Use bytemuck
Get rid of `PlainData` trait and use `Pod` from bytemuck instead.
2021-11-23 08:24:16 -08:00
Raph Levien 657f219ce8 Better DX12 descriptor management
Reduce allocation of descriptor heaps. This change also enables clearing
of buffers, as the handles are needed at command dispatch time.

Also updates the tests to use clear_buffers on DX12. Looking forward to
being able to get rid of the compute shader workaround on Metal.

This is a followup on #125, and progress toward #95
2021-11-20 16:36:43 -08:00
Raph Levien f9d0aa078b Use DXIL shader compilation
Integrate DXC for translating HLSL for use in DX12. This will work
around FXC limitations and unlock the use of more advanced HLSL features
such as subgroups.

This hardcodes the use of DXIL, but it could be adapted (with a bit of
effort) to choose between DXIL and HLSL at runtime.
2021-11-11 12:55:10 -08:00
Raph Levien 94949a6906 Mac port of bind layout rework
This gets it working on mac. Also delete old implementation.

There's also an update to winit 0.25 in here, because it was easier to
roll forward than fix inconsistent Cargo.lock. At some point, we should
systematically update all deps.
2021-11-10 13:40:16 -08:00
Raph Levien 74f2b4fd1c Rework bind layout
Use an array of bindtypes rather than the previous situation, which was
a choice of buffer counts, or a heavier builder pattern.

The main thing this unlocks is distinguishing between readonly and
read/write buffers, which is important for DX12.

This is WIP, the Metal part hasn't been done, and the old stuff not
deleted.

Part of #125
2021-11-10 11:25:16 -08:00
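As a rough sketch of what "an array of bindtypes" might look like, the snippet below describes a layout as a slice of enum values and counts read-only versus read/write buffers. The enum variants and the `buffer_counts` helper are assumptions for illustration, not the API introduced by the commit above. The distinction matters on DX12 because read-only and read/write buffers are bound through different kinds of views.

```rust
/// Hypothetical binding kinds (illustration only).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum BindType {
    /// Buffer the shader only reads.
    BufReadOnly,
    /// Buffer the shader may read and write.
    Buffer,
    /// Image binding, ignored by the count below.
    Image,
}

/// Count (read-only, read/write) buffers in a bind layout.
pub fn buffer_counts(layout: &[BindType]) -> (usize, usize) {
    let mut read_only = 0;
    let mut read_write = 0;
    for bind in layout {
        match bind {
            BindType::BufReadOnly => read_only += 1,
            BindType::Buffer => read_write += 1,
            BindType::Image => {}
        }
    }
    (read_only, read_write)
}
```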
Ishi Tatsuyuki d77dfb8c00 Runtime querying of threadgroup size 2021-06-08 16:29:40 +09:00
Raph Levien bae185efbd API reorg
Move types into the toplevel and hide implementation details. Remove
deref of hub CmdBuf to mux. Restrict public visibility of internals.

Most items have some docs, though improvements are still possible. In
particular, there should be detailed safety info.
2021-05-29 21:11:02 -07:00
Raph Levien 7d7c86c44b API changes and cleanup
Add workgroup size to dispatch call (needed by metal). Change all fence
references to mutable for consistency.

Move backend traits to a separate file (move them out of the toplevel
namespace in preparation for the hub types going there, to make the
public API nicer).

Add a method and macro for automatically choosing shader code, and
change collatz example to generate all 3 kinds on build.
2021-05-28 16:14:39 -07:00
Raph Levien c2965254db Merge branch 'dx12' into metal 2021-05-27 16:12:21 -07:00
Raph Levien b4ba6886d8 Tweak wait_and_reset mutable fence signature
A reference to a slice of mutable references is not a thing.
2021-05-27 16:10:14 -07:00
Raph Levien 84dabcf049 Merge branch 'dx12' into metal 2021-05-27 16:02:12 -07:00
Raph Levien b6292c644f Make fences mutable
Change the interface for fences to accept mutable references. This will
actually help the Metal backend more than dx12 (avoiding interior
mutability) but more accurately captures intent and matches gfx-hal.
2021-05-27 15:53:12 -07:00
Raph Levien 0d5ff515ec Merge branch 'dx12' into metal 2021-05-26 18:16:45 -07:00
Raph Levien 37de07f670 More work on DX12 backend
This gets swapchain presentation wired up, and some more changes.
2021-05-26 16:31:24 -07:00
Raph Levien 2ecfc7a414 Wire hub to mux
Make the hub abstraction connect to the mux, rather than directly to the
Vulkan back-end.

As of this commit, both command line and winit examples work (on
Vulkan). In theory it should be possible to get them working on Dx12 as
well by translating the shader code, but there's a lot that can go
wrong.

This commit also contains a bunch of changes to mux to make conditional
compilation of match arms work, and new methods to support swapchain.
2021-05-26 09:30:07 -07:00
Raph Levien d15994fe44 Fix cfg'ed backend imports 2021-05-25 17:09:24 -07:00
Raph Levien f04da3af9d Add multiplexer abstraction
Adds a new "mux" module which can have multiple backends. As of this
commit, it's not wired up at all, but the functionality should be
reasonably complete.

Minor tweaks to the backend trait to accommodate this, mostly changing
Fence and Semaphore to references so they don't need to be Copy.

Part of the work toward #95
2021-05-25 15:12:37 -07:00
Raph Levien dfac2148a9 Merge branch 'staging' into dx12 2021-05-24 15:44:53 -07:00
Raph Levien 174c81ec09 Cleanup
Fix bound on blanket RetainResource impl. Clean up run_cmd_buf.
2021-05-24 15:42:25 -07:00
Raph Levien 47d2e0a756 Add create_buffer_init method
Add a method to create a buffer with initial content, which requires
staging buffers under the hood.

This patch also changes the lower-level (Vulkan) interface to be closer
to the raw Vulkan call.
2021-05-24 13:18:11 -07:00
Raph Levien 60d54b6e69 Add image support
Adds image data types and operations. At this point, lightly tested.
2021-05-22 15:15:33 -07:00
Raph Levien 050df66801 Redo memory options for usage
Rework the entire mechanism for specifying memory for creating
resources, inferring the correct options from the new usage flags.
2021-05-21 22:17:17 -07:00
Raph Levien 3dfae7aed6 Merge branch 'usage' into dx12_work 2021-05-21 22:00:49 -07:00
Raph Levien 4dcf385b18 Remove MemFlags trait 2021-05-21 21:51:33 -07:00
Raph Levien e9a8b4643b Migrate to BufferUsage
Adopt the BufferUsage concept from WebGPU, and replace MemFlags, which
is inadequate.
2021-05-21 19:43:55 -07:00
Raph Levien cd5e799d1a Beginning of Metal back-end
Work in progress, some types in place but mostly a skeleton.
2021-05-21 17:44:49 -07:00
Raph Levien e4b16e706a Timestamp queries
These function, but can use some work.

First, the buffer situation is worse than it should be. It should be
possible to create a single readback buffer rather than copy from
gpu-local to host-coherent.

Second, the command buffer `finish_timestamps` call doesn't correlate to
anything in Vulkan, so needs plumbing up through the hub in one form or
other when that happens. I'm inclined to make it ergonomic by doing a
bit of resource tracking that will trigger the appropriate call (and
subsequent host barrier) in the `finish` method on the command buffer.
2021-05-21 13:19:10 -07:00
Raph Levien f482921806 Create compute pipelines
Create compute pipelines from shader source and descriptor sets. This
gets it to the point where it can run the collatz example.

Still WIP and with rough edges, of course.
2021-05-18 10:08:23 -07:00
Raph Levien 619fc8d4eb Merge branch 'master' into dx12 2021-05-16 10:19:06 -07:00
Raph Levien a5991ecf97 Expand runtime query of GPU capabilities
Test whether the GPU supports subgroups (including size control) and
memory model.

This patch does all the ceremony needed for runtime query, including
testing the Vulkan version and only probing the extensions when
available. Thus, it should work fine on older devices (not yet tested).

The reporting of capabilities follows Vulkan concepts, but is not
particularly Vulkan-specific.
2021-05-08 11:41:47 -07:00
Elias Naur 903ab1fb59 implement FillImage command and sRGB support
FillImage is like Fill, except that it takes its color from one or
more image atlases.

kernel4 uses a single image for non-Vulkan hosts, and the dynamic sized array
of image descriptors on Vulkan.

A previous version of this commit used textures. I think images are a better
choice for piet-gpu, for several reasons:

- Texture sampling, in particular textureGrad, is slow on lower spec devices
  such as Google Pixel. Texture sampling is particularly slow and difficult to
  implement for CPU fallbacks.
- Texture sampling needs more parameters, in particular the full u,v
  transformation matrix, leading to a large increase in the command size. Since
  all commands use the same size, that memory penalty is paid by all scenes, not
  just scenes with textures.
- It is unlikely that piet-gpu will support every kind of fill for every
  client, because each kind must be added to kernel4.

With FillImage, a client will prepare the image(s) in separate shader stages,
sampling and applying transformations and special effects as needed. Textures
that align with the output pixel grid can be used directly, without
pre-processing.

Note that the pre-processing step can run concurrently with the piet-gpu pipeline;
only the last stage, kernel4, needs the images.

Pre-processing most likely uses fixed function vertex/fragment programs,
which on some GPUs may run in parallel with piet-gpu's compute programs.

While here, fix a few validation errors:
- Explicitly enable EXT_descriptor_indexing, KHR_maintenance3,
  KHR_get_physical_device_properties2.
- Specify a vkDescriptorSetVariableDescriptorCountAllocateInfo for
  vkAllocateDescriptorSets. Otherwise, variable image2D arrays won't work (but
  sampler2D arrays do, at least on my setup).

Updates #38

Signed-off-by: Elias Naur <mail@eliasnaur.com>
2021-03-19 12:50:12 +01:00
Raph Levien facc9e0982 Use sampler for texture images
Provide images to fine rasterization kernel as readonly textures with a
sampler, rather than storage images. That lets us use the GPU's hardware
for sampling, which should be considerably more efficient.

There are a bunch of parameters that are hardcoded, but it does seem to
work.
2020-11-25 18:05:10 -08:00
Raph Levien 047a0830d1 Towards wiring up images to k4
This patch passes a dynamically sized array of textures to the fine
rasterizer.

A bunch of the low level Vulkan stuff is done, but only enough of the
shaders and encoders to do minimal testing. We'll want to switch from
storage images to sampled images, track the actual array of textures
during encoding, use that to build the descriptor set (which will need
to be more dynamic), and of course run image elements through the
pipeline.

Progress towards #38
2020-11-24 22:11:38 -08:00
Raph Levien 6b06d249ab Builder pattern for pipelines
Use a builder pattern for pipelines and descriptor sets, so we can go
richer without hugely complicating existing code.

WIP
2020-11-24 22:11:38 -08:00
Raph Levien d63583083c Start DX12 backend
Very early so far, but cool to have a branch for it.
2020-11-24 10:32:49 -08:00
Raph Levien a60c2dd3c8 Scratch buffer for clip stack
We keep a small window of the clip stack in registers in the fine
rasterization kernel, and when that window is exceeded, spill to global
memory, so the clip stack can be unbounded.
2020-11-22 18:14:09 -08:00
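The spill strategy described in the commit above can be modeled on the CPU. In this sketch a small fixed array stands in for the registers and a `Vec` stands in for the global-memory scratch buffer. The window size, the type name `SpillStack`, and the exact spill policy (the first `WINDOW` entries stay in the array, deeper entries spill) are assumptions; the actual kernel may organize its window differently.

```rust
/// Assumed window size for the sketch.
const WINDOW: usize = 4;

/// CPU model of a stack with a small in-register window and an
/// unbounded spill area (hypothetical type, illustration only).
pub struct SpillStack {
    window: [u32; WINDOW], // stands in for registers
    spill: Vec<u32>,       // stands in for global scratch memory
    depth: usize,
}

impl SpillStack {
    pub fn new() -> Self {
        SpillStack { window: [0; WINDOW], spill: Vec::new(), depth: 0 }
    }

    /// Push a value; entries beyond the window go to the spill area.
    pub fn push(&mut self, value: u32) {
        if self.depth < WINDOW {
            self.window[self.depth] = value;
        } else {
            self.spill.push(value);
        }
        self.depth += 1;
    }

    /// Pop the most recent value, reading from the spill area first.
    pub fn pop(&mut self) -> Option<u32> {
        if self.depth == 0 {
            return None;
        }
        self.depth -= 1;
        if self.depth < WINDOW {
            Some(self.window[self.depth])
        } else {
            self.spill.pop()
        }
    }

    /// Number of entries currently held in the spill area.
    pub fn spilled(&self) -> usize {
        self.spill.len()
    }
}
```

Shallow stacks never touch the spill area, so the common case pays no extra memory traffic in this model.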
Raph Levien 75c4b62730 Add hub abstraction
The hub does a little better lifetime tracking of resources (so
Rust-side references can be dropped), and in the future will be used for
dynamic selection of backend.

The migration is still a bit half-baked, as there are a bunch of
Vulkan-specific types in the signatures, but it shouldn't be too much
work to sort that out. Perhaps it can wait until there is a second
backend though.

The main motivation for this is to create image objects with lifetime
tracking, one of the things required for #38.
2020-11-18 16:06:08 -08:00
Raph Levien 301abf4db7 Minor cleanups
Mostly cleaning up some comments. Also adds host barrier and a command
to copy a buffer to an image (in preparation for images, see #38).
2020-11-17 14:18:30 -08:00
msiglreith b38e43f0c2 Initial work for surface support
surface: handle extensions

Implement swapchain creation and blit image to screen
2020-05-04 16:24:42 +02:00
Raph Levien aa8b71e922 Reset query pool before use
Quiets validation errors now that we can see them :)
2020-04-29 18:18:04 -07:00
Raph Levien 55e35dd879 Dynamic allocation of intermediate buffers
When the initial allocation is exceeded, do an atomic bump allocation.
This is done for both tilegroup instances and per tile command lists.
2020-04-25 10:45:47 -07:00
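An atomic bump allocation, as mentioned in the commit above, can be sketched on the CPU with `AtomicUsize::fetch_add`. This is only a model of the technique: the real allocation happens in shader code, and the `BumpAllocator` type, its fields, and the choice to start bumping at the end of an initial region are assumptions for illustration.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

/// Hypothetical bump allocator over a buffer of `capacity` bytes.
/// Offsets below `initial_offset` model the fixed initial allocation.
pub struct BumpAllocator {
    next: AtomicUsize,
    capacity: usize,
}

impl BumpAllocator {
    pub fn new(initial_offset: usize, capacity: usize) -> Self {
        BumpAllocator { next: AtomicUsize::new(initial_offset), capacity }
    }

    /// Reserve `size` bytes and return the start offset, or None if the
    /// buffer is exhausted. A single fetch_add makes this safe to call
    /// from many threads at once. Note that a failed request still
    /// advances the counter, so later requests also fail.
    pub fn alloc(&self, size: usize) -> Option<usize> {
        let offset = self.next.fetch_add(size, Ordering::Relaxed);
        match offset.checked_add(size) {
            Some(end) if end <= self.capacity => Some(offset),
            _ => None,
        }
    }
}
```

Nothing is ever freed individually in this scheme; the counter is simply reset when the buffer is reused.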