Make the hub abstraction connect to the mux, rather than directly to the
Vulkan back-end.
As of this commit, both command line and winit examples work (on
Vulkan). In theory it should be possible to get them working on Dx12 as
well by translating the shader code, but there's a lot that can go
wrong.
This commit also contains a bunch of changes to mux to make conditional
compilation of match arms work, and new methods to support swapchain.
Adds a new "mux" module which can have multiple backends. As of this
commit, it's not wired up at all, but the functionality should be
reasonably complete.
Minor tweaks to the backend trait to accommodate this, mostly changing
Fence and Semaphore to references so they don't need to be Copy.
Part of the work toward #95
Use an enum instead of Box<dyn Any> for resources to be retained until
command buffer completion, and allow both references (which will be
cloned) and owned resources (useful for staging buffers).
Add a method to create a buffer with initial content, which requires
staging buffers under the hood.
This patch also changes the lower-level (Vulkan) interface to be closer
to the raw Vulkan call.
These function, but can use some work.
First, the buffer situation is worse than it should be. It should be
possible to create a single readback buffer rather then copy from
gpu-local to host-coherent.
Second, the command buffer `finish_timestamps` call doesn't correlate to
anything in Vulkan, so needs plumbing up through the hub in one form or
other when that happens. I'm inclined to make it ergonomic by doing a
bit of resource tracking that will trigger the appropriate call (and
subsequent host barrier) in the `finish` method on the command buffer.
Create compute pipelines from shader source and descriptor sets. This
gets it to the point where it can run the collatz example.
Still WIP and with rough edges, of course.
Chipping away at the dx12 backend. This should more or less do the
signalling to the CPU that the command buffer is done (ie wire up the
fence). It also creates buffer objects.
Test whether the GPU supports subgroups (including size control) and
memory model.
This patch does all the ceremony needed for runtime query, including
testing the Vulkan version and only probing the extensions when
available. Thus, it should work fine on older devices (not yet tested).
The reporting of capabilities follows Vulkan concepts, but is not
particularly Vulkan-specific.
The compute shaders have a check for the succesful completion of their
preceding stage. However, consider a shader execution path like the
following:
void main()
if (mem_error != NO_ERROR) {
return;
}
...
malloc(...);
...
barrier();
...
}
and shader execution that fails to allocate memory, thereby setting
mem_error to ERR_MALLOC_FAILED in malloc before reaching the barrier. If
another shader execution then begins execution, its mem_eror check will
make it return early and not reach the barrier.
All GPU APIs require (dynamically) uniform control flow for barriers,
and the above case may lead to GPU hangs in practice.
Fix this issue by replacing the early exits with careful checks that
don't interrupt barrier control flow.
Unfortunately, it's harder to prove the soundness of the new checks, so
this change also clears dynamic memory ranges in MEM_DEBUG mode when
memory is exhausted. The result is that accessing memory after
exhaustion triggers an error.
Signed-off-by: Elias Naur <mail@eliasnaur.com>
Don't run extensions unless they're available. This includes querying
for descriptor indexing, and running one of two versions of kernel4
depending on whether it's enabled.
Part of the support needed for #78
The BeginClip and EndClip bounding boxes are absolute and must pairwise
match. I mistakenly modified the BeginClip bounding box for stroked
clips.
Signed-off-by: Elias Naur <mail@eliasnaur.com>
Adds an example binary that can be run with `cargo apk`.
One thing that will still need manual tuning (for now) is the size of
the canvas. A good followup is to sense that from the window size.
Don't run extensions unless they're available. This includes querying
for descriptor indexing, and running one of two versions of kernel4
depending on whether it's enabled.
Part of the support needed for #78