Commit graph

89 commits

Author SHA1 Message Date
Raph Levien 59728868de Merge branch 'master' into gradient 2021-08-16 10:53:19 -07:00
Raph Levien 05e81acebc Basically get gradients working
Separate out render context upload from renderer creation. Upload ramps
to GPU buffer. Encode gradients to scene description. Fix a number of
bugs in uploading and processing.

This renders gradients in a test image, but has some shortcomings. For
one, staging buffers need to be applied for a couple things (they're
just host mapped for now). Also, the interaction between sRGB and
premultiplied alpha isn't quite right. The size of the gradient ramp
buffer is fixed and should be dynamic.

And of course there's always more optimization to be done, including
making the upload of gradient ramps more incremental, and probably
hashing of the stops instead of the processed ramps.
2021-08-09 16:16:46 -07:00
Raph Levien 6f707c4c62 Start work on gradients
WIP. Most of the GPU-side work should be done (though it's not tested
end-to-end and it's certainly possible I missed something), but still
needs work on encoding side.
2021-07-12 06:56:52 -07:00
Ishi Tatsuyuki 7a2dc37d36 Remove manual blend stack spilling and rely on scratch memory instead
v2: Add a panic when the nested blend depth exceeds the limit.
v3: Rebase and partially remove code introduced in 22507de.
2021-06-25 17:13:01 +09:00
Raph Levien 379fb1caaa
Merge pull request #89 from linebender/text
Start text rendering
2021-06-23 07:56:24 -07:00
Ishi Tatsuyuki d77dfb8c00 Runtime querying of threadgroup size 2021-06-08 16:29:40 +09:00
Raph Levien bae185efbd API reorg
Move types into the toplevel and hide implementation details. Remove
deref of hub CmdBuf to mux. Restrict public visibility of internals.

Most items have some docs, though improvements are still possible. In
particular, there should be detailed safety info.
2021-05-29 21:11:02 -07:00
Raph Levien 7d7c86c44b API changes and cleanup
Add workgroup size to dispatch call (needed by metal). Change all fence
references to mutable for consistency.

Move backend traits to a separate file (move them out of the toplevel
namespace in preparation for the hub types going there, to make the
public API nicer).

Add a method and macro for automatically choosing shader code, and
change collatz example to generate all 3 kinds on build.
2021-05-28 16:14:39 -07:00
Raph Levien 2ecfc7a414 Wire hub to mux
Make the hub abstraction connect to the mux, rather than directly to the
Vulkan back-end.

As of this commit, both command line and winit examples work (on
Vulkan). In theory it should be possible to get them working on Dx12 as
well by translating the shader code, but there's a lot that can go
wrong.

This commit also contains a bunch of changes to mux to make conditional
compilation of match arms work, and new methods to support swapchain.
2021-05-26 09:30:07 -07:00
Raph Levien 47d2e0a756 Add create_buffer_init method
Add a method to create a buffer with initial content, which requires
staging buffers under the hood.

This patch also changes the lower-level (Vulkan) interface to be closer
to the raw Vulkan call.
2021-05-24 13:18:11 -07:00
Raph Levien e9a8b4643b Migrate to BufferUsage
Adopt the BufferUsage concept from WebGPU, and replace MemFlags, which
is inadequate.
2021-05-21 19:43:55 -07:00
Raph Levien a5991ecf97 Expand runtime query of GPU capabilities
Test whether the GPU supports subgroups (including size control) and
memory model.

This patch does all the ceremony needed for runtime query, including
testing the Vulkan version and only probing the extensions when
available. Thus, it should work fine on older devices (not yet tested).

The reporting of capabilities follows Vulkan concepts, but is not
particularly Vulkan-specific.
2021-05-08 11:41:47 -07:00
Raph Levien 951f3aa508 Start text rendering
This commit puts in basic integration with ttf-parser and starts
populating the various piet text objects. The font is currently
hard-coded.
2021-05-04 08:21:22 -07:00
Raph Levien 01e4024599 Merge branch 'master' into ext_query 2021-04-11 09:08:46 -07:00
Tatsuyuki Ishi 0637e2d6e5 Encode premultiplied alpha in render_ctx.rs 2021-04-11 13:20:40 +09:00
Raph Levien 115cb855d9 Query extensions at runtime
Don't run extensions unless they're available. This includes querying
for descriptor indexing, and running one of two versions of kernel4
depending on whether it's enabled.

Part of the support needed for #78
2021-04-08 15:11:15 -07:00
Elias Naur 678bfedfca kernel4: assume colors in alpha-premultiplied sRGB format
See http://ssp.impulsetrain.com/gamma-premult.html for a description of
the format.

Pre-multiplied alpha only matters for translucent objects; draw a few
such shapes in the test render.

Signed-off-by: Elias Naur <mail@eliasnaur.com>
2021-03-29 21:17:01 +02:00
Elias Naur eb37db1b05 replace per-element fill mode flags with a SetFillMode element
Fixes #70

Signed-off-by: Elias Naur <mail@eliasnaur.com>
2021-03-29 21:10:25 +02:00
Elias Naur 8db77e180e support stroked fills for clips, images
This change completes general support for stroked fills for clips and
images.

Annotated_size increases from 28 to 32, because of the linewidth field
added to AnnoImage. Stroked image fills are presumably rare, and if
memory pressure turns out to be a bottleneck, we could replace the
linewidth field with a separate AnnoLinewidth elements.

Updates #70

Signed-off-by: Elias Naur <mail@eliasnaur.com>
2021-03-19 16:43:33 +01:00
Elias Naur e9ff509ab9 use tag flags for fill vs stroke modes in scene elements
Encode stroke vs fill as tag flags, thereby reducing the number of scene
elements. Encoding change only, no functional changes.

The previous Stroke and Fill commands are merged to one command,
FillColor. The encoding to annotated element is divergent, which is
fixed when annotated elements move to tag flags.

Updates #70

Signed-off-by: Elias Naur <mail@eliasnaur.com>
2021-03-19 12:50:12 +01:00
Elias Naur 903ab1fb59 implement FillImage command and sRGB support
FillImage is like Fill, except that it takes its color from one or
more image atlases.

kernel4 uses a single image for non-Vulkan hosts, and the dynamic sized array
of image descriptors on Vulkan.

A previous version of this commit used textures. I think images are a better
choice for piet-gpu, for several reasons:

- Texture sampling, in particular textureGrad, is slow on lower spec devices
  such as Google Pixel. Texture sampling is particularly slow and difficult to
implement for CPU fallbacks.
- Texture sampling need more parameters, in particular the full u,v
  transformation matrix, leading to a large increase in the command size. Since
all commands use the same size, that memory penalty is paid by all scenes, not
just scenes with textures.
- It is unlikely that piet-gpu will support every kind of fill for every
  client, because each kind must be added to kernel4.

With FillImage, a client will prepare the image(s) in separate shader stages,
sampling and applying transformations and special effects as needed. Textures
that align with the output pixel grid can be used directly, without
pre-processing.

Note that the pre-processing step can run concurrently with the piet-gpu pipeline;
Only the last stage, kernel4, needs the images.

Pre-processing most likely uses fixed function vertex/fragment programs,
which on some GPUs may run in parallel with piet-gpu's compute programs.

While here, fix a few validation errors:
- Explicitly enable EXT_descriptor_indexing, KHR_maintenance3,
  KHR_get_physical_device_properties2.
- Specify a vkDescriptorSetVariableDescriptorCountAllocateInfo for
  vkAllocateDescriptorSets. Otherwise, variable image2D arrays won't work (but
sampler2D arrays do, at least on my setup).

Updates #38

Signed-off-by: Elias Naur <mail@eliasnaur.com>
2021-03-19 12:50:12 +01:00
Elias Naur 07e07c7544 ensure consistent path segment transformation
As described in #62, the non-deterministic scene monoid may result in
slightly different transformations for path segments in an otherwise
closed path.

This change ensures consistent transformation across paths in three steps.

First, absolute transformations computed by the scene monoid is stored
along with path segments and annotated elements.

Second, elements.comp no longer transforms path segments. Instead, each
segment is stored untransformed along with a reference to its absolute
transformation.

Finally, path_coarse performs the transformation of path segments.
Because all segments in a path share a single transformation reference,
the inconsistency in #62 is avoided.

Fixes #62

Signed-off-by: Elias Naur <mail@eliasnaur.com>
2021-03-19 12:45:23 +01:00
Elias Naur fd746ea7a6 name and comment magic constant
Follow-up to review of PR #61.

Signed-off-by: Elias Naur <mail@eliasnaur.com>
2021-03-19 12:45:23 +01:00
Ishi Tatsuyuki 8a499bc50e Always close fill paths, fix #68 2021-03-17 01:16:00 +09:00
Elias Naur c4f5a69a0d implement variable output sizing
Signed-off-by: Elias Naur <mail@eliasnaur.com>
2020-12-27 20:24:29 +01:00
Elias Naur c67696714b coarse.comp: don't write Cmd_End to tiles out of bounds
If WIDTH_IN_TILES or HEIGHT_IN_TILES are not divisible by N_TILE_X or N_TILE_Y
respectively, the previously unconditional Cmd_End_write would write out of
bounds.

Signed-off-by: Elias Naur <mail@eliasnaur.com>
2020-12-27 20:24:29 +01:00
Elias Naur 4de67d9081 unify GPU memory management
Merge all static and dynamic buffers to just one, "memory". Add a malloc
function for dynamic allocations.

Unify static allocation offsets into a "config" buffer containing scene setup
(number of paths, number of path segments), as well as the memory offsets of
the static allocations.

Finally, set an overflow flag when an allocation fail, and make sure to exit
shader execution as soon as that triggers. Add checks before beginning
execution in case the client wants to run two or more shaders before checking
the flag.

The "state" buffer is left alone because it needs zero'ing and because it is
accessed with the "volatile" keyword.

Fixes #40

Signed-off-by: Elias Naur <mail@eliasnaur.com>
2020-12-27 20:24:29 +01:00
Raph Levien 634530fb91 Merge branch 'master' into image_work 2020-12-02 11:58:45 -08:00
Raph Levien 4138f8a516 Optimize clips
Optimize tiles with clip masks that are all-zero or all-one.

Part of #36
2020-11-27 09:30:35 -08:00
Raph Levien facc9e0982 Use sampler for texture images
Provide images to fine rasterization kernel as readonly textures with a
sampler, rather than storage images. That lets us use the GPU's hardware
for sampling, which should be considerably more efficient.

There are a bunch of parameters that are hardcoded, but it does seem to
work.
2020-11-25 18:05:10 -08:00
Raph Levien 047a0830d1 Towards wiring up images to k4
This patch passes a dynamically sized array of textures to the fine
rasterizer.

A bunch of the low level Vulkan stuff is done, but only enough of the
shaders and encoders to do minimal testing. We'll want to switch from
storage images to sampled images, track the actual array of textures
during encoding, use that to build the descriptor set (which will need
to be more dynamic), and of course run image elements through the
pipeline.

Progress towards #38
2020-11-24 22:11:38 -08:00
Raph Levien 6b06d249ab Builder pattern for pipelines
Use a builder pattern for pipelines and descriptor sets, so we can go
richer without hugely complicating existing code.

WIP
2020-11-24 22:11:38 -08:00
Raph Levien a60c2dd3c8 Scratch buffer for clip stack
We keep a small window of the clip stack in registers in the fine
rasterization kernel, and when that window is exceeded, spill to global
memory, so the clip stack can be unbounded.
2020-11-22 18:14:09 -08:00
Raph Levien d14895b107 Continuing work on clips
I realized there's a problem with encoding clip bboxes relative to the
current transform (see #36 for a more detailed explanation), so this is
changing it to absolute bboxes.

This more or less gets clips working. There are optimization
opportunities (all-clear and all-opaque mask tiles), and it doesn't deal
with overflow of the blend stack, but it seems to basically work.
2020-11-20 18:25:27 -08:00
Raph Levien f53d00e6bc Add transforms and state stack
Actually handle transforms in RenderCtx (was implemented in renderer but
not actually plumbed through). This also requires maintaining a state
stack, which will also be required for clipping.

This PR also starts work on encoding clipping, including tracking
bounding boxes.

WIP, none of this is tested yet.
2020-11-20 18:25:27 -08:00
Raph Levien 47e24ec9d5 Start adding support for creating images
This is still WIP, focused on creating image resources and making them
available GPU-side.

Progress toward #38
2020-11-19 16:32:29 -08:00
Raph Levien 75c4b62730 Add hub abstraction
The hub does a little better lifetime tracking of resources (so
Rust-side references can be dropped), and in the future will be used for
dynamic selection of backend.

The migration is still a bit half-baked, as there are a bunch of
Vulkan-specific types in the signatures, but it shouldn't be too much
work to sort that out. Perhaps it can wait until there is a second
backend though.

The main motivation for this is to create image objects with lifetime
tracking, one of the things required for #38.
2020-11-18 16:06:08 -08:00
Raph Levien 8e2f2aeeba Update dependencies
Update to latest versions of all dependencies. Among other things, this
gets us on piet 0.2, though almost all of the changes were around text,
which is not yet implemented.
2020-11-14 08:25:43 -08:00
Elias Naur 326f7f0d03 shader: delete more unused code and variables
Signed-off-by: Elias Naur <mail@eliasnaur.com>
2020-09-13 13:03:56 +02:00
bhmerchant@gmail.com d836d21d12 Clean up bits of right edge tracking logic left over from sort-middle. 2020-08-12 19:57:14 -07:00
msiglreith 1cc5c7ac0d Shader documentation and a slight cleanup 2020-06-28 15:37:27 +02:00
msiglreith eed71721eb Update winit example 2020-06-14 23:32:59 +02:00
Raph Levien 65f802894c Merge branch 'master' into sorta 2020-06-13 07:30:40 -07:00
Raph Levien b23113461b Minor cleanups
Get rid of warnings. Do cargo update to bump deps.
2020-06-10 14:10:28 -07:00
Raph Levien b571e0d10c Continue wiring up gpu-side flattening
All segments given to path coarse raster are cubics. Flatten to
quadratics.

This works but the quality is not (yet) good.
2020-06-09 17:56:11 -07:00
Raph Levien 0f44bc8b78 Start GPU-side flattening
This starts the work on GPU-side flattening by plumbing curves through.
2020-06-09 16:01:47 -07:00
Raph Levien af0a1af8e1 Make fills work
The backdrop propagation is slow but it does work.
2020-06-05 22:40:44 -07:00
Raph Levien 63ba45c774 Fix performance issues
Use larger workgroup for tile initialization (utilization was poor).
Provide correct element count to coarse rasterizer.
2020-06-03 15:32:58 -07:00
Raph Levien 70a9c17e23 Continue building out pipeline
Plumbs the new tiling scheme to k4. This works (stroke only) but still
has some performance issues.
2020-06-03 12:21:09 -07:00
Raph Levien 294f6fd1db Experiment with new sorting scheme
Path segments are unsorted, but other elements are using the same
sort-middle approach as before.

This is a checkpoint. At this point, there are unoptimized versions
of tile init and coarse path raster, but it isn't wired up into a
working pipeline. Also observing about a 3x performance regression in
element processing, which needs to be investigated.
2020-06-03 09:29:25 -07:00