Commit graph

30 commits

Author SHA1 Message Date
Elias Naur a5b6bda941 add support for element flags to shaders
Commit 9afa9b86b6 added Rust support for
encoding flags into elements. This change adds support to shaders by
introducing variant tag structs:

struct VariantTag {
    uint tag;
    uint flags;
}

and returning them from Variant_tag functions.

It also adds a flags argument to write functions for enum variants that
include TagFlags.

No functionality changes.

Updates #70

Signed-off-by: Elias Naur <mail@eliasnaur.com>
2021-03-19 12:50:12 +01:00
Elias Naur 903ab1fb59 implement FillImage command and sRGB support
FillImage is like Fill, except that it takes its color from one or
more image atlases.

kernel4 uses a single image for non-Vulkan hosts, and the dynamic sized array
of image descriptors on Vulkan.

A previous version of this commit used textures. I think images are a better
choice for piet-gpu, for several reasons:

- Texture sampling, in particular textureGrad, is slow on lower spec devices
  such as Google Pixel. Texture sampling is particularly slow and difficult to
implement for CPU fallbacks.
- Texture sampling need more parameters, in particular the full u,v
  transformation matrix, leading to a large increase in the command size. Since
all commands use the same size, that memory penalty is paid by all scenes, not
just scenes with textures.
- It is unlikely that piet-gpu will support every kind of fill for every
  client, because each kind must be added to kernel4.

With FillImage, a client will prepare the image(s) in separate shader stages,
sampling and applying transformations and special effects as needed. Textures
that align with the output pixel grid can be used directly, without
pre-processing.

Note that the pre-processing step can run concurrently with the piet-gpu pipeline;
Only the last stage, kernel4, needs the images.

Pre-processing most likely uses fixed function vertex/fragment programs,
which on some GPUs may run in parallel with piet-gpu's compute programs.

While here, fix a few validation errors:
- Explicitly enable EXT_descriptor_indexing, KHR_maintenance3,
  KHR_get_physical_device_properties2.
- Specify a vkDescriptorSetVariableDescriptorCountAllocateInfo for
  vkAllocateDescriptorSets. Otherwise, variable image2D arrays won't work (but
sampler2D arrays do, at least on my setup).

Updates #38

Signed-off-by: Elias Naur <mail@eliasnaur.com>
2021-03-19 12:50:12 +01:00
Elias Naur 07e07c7544 ensure consistent path segment transformation
As described in #62, the non-deterministic scene monoid may result in
slightly different transformations for path segments in an otherwise
closed path.

This change ensures consistent transformation across paths in three steps.

First, absolute transformations computed by the scene monoid is stored
along with path segments and annotated elements.

Second, elements.comp no longer transforms path segments. Instead, each
segment is stored untransformed along with a reference to its absolute
transformation.

Finally, path_coarse performs the transformation of path segments.
Because all segments in a path share a single transformation reference,
the inconsistency in #62 is avoided.

Fixes #62

Signed-off-by: Elias Naur <mail@eliasnaur.com>
2021-03-19 12:45:23 +01:00
Elias Naur ad444f615c elements.comp: use shared array of structs directly
The NVIDIA shader compiler bug that forced splitting of the state struct
into primitive types is now fixed.

Signed-off-by: Elias Naur <mail@eliasnaur.com>
2021-03-19 12:45:23 +01:00
Elias Naur 6a4e26ef2a all: add optional memory checks
Defining MEM_DEBUG in mem.h will add a size field to Alloc and enable
bounds and alignment checks for every memory read and write.

Notes:
- Deriving an Alloc from Path.tiles is unsound, but it's more trouble to
  convert Path.tiles from TileRef to a variable sized Alloc.
- elements.comp note that "We should be able to use an array of structs but the
  NV shader compiler doesn't seem to like it". If that's still relevant, does
  the shared arrays of Allocs work?

Signed-off-by: Elias Naur <mail@eliasnaur.com>
2021-02-15 16:07:45 +01:00
Elias Naur 4de67d9081 unify GPU memory management
Merge all static and dynamic buffers to just one, "memory". Add a malloc
function for dynamic allocations.

Unify static allocation offsets into a "config" buffer containing scene setup
(number of paths, number of path segments), as well as the memory offsets of
the static allocations.

Finally, set an overflow flag when an allocation fail, and make sure to exit
shader execution as soon as that triggers. Add checks before beginning
execution in case the client wants to run two or more shaders before checking
the flag.

The "state" buffer is left alone because it needs zero'ing and because it is
accessed with the "volatile" keyword.

Fixes #40

Signed-off-by: Elias Naur <mail@eliasnaur.com>
2020-12-27 20:24:29 +01:00
Elias Naur d21f2b68de all: add SPDX license headers
Fixes #53

Signed-off-by: Elias Naur <mail@eliasnaur.com>
2020-12-11 18:24:35 +01:00
Elias Naur 580b63e558 elements.comp: tighten state size calculations
The state header is only one word (flags), not two.

Move the partition atomic counter to a separate field instead of state[0],
simplifying state offset calculations.

Signed-off-by: Elias Naur <mail@eliasnaur.com>
2020-12-10 18:48:16 +01:00
Elias Naur feeb459fa1 remove FillMask and FillMaskInv
Obsoleted by BeginClip/EndClip.

Updates #36

Signed-off-by: Elias Naur <mail@eliasnaur.com>
2020-11-29 16:59:58 +01:00
Raph Levien d14895b107 Continuing work on clips
I realized there's a problem with encoding clip bboxes relative to the
current transform (see #36 for a more detailed explanation), so this is
changing it to absolute bboxes.

This more or less gets clips working. There are optimization
opportunities (all-clear and all-opaque mask tiles), and it doesn't deal
with overflow of the blend stack, but it seems to basically work.
2020-11-20 18:25:27 -08:00
Elias Naur b942e4035b piet-gpu/shader: ensure forward progress in decoupled lookback
The Vulkan and OpenGL specifications offer only weak forward progress guarantees, and
in practice several mobile devices fail to complete the decoupled lookback
spinloop without mitigation.

This patch implements Raph's suggestion from the "Forward Progress"
section from

https://raphlinus.github.io/gpu/2020/04/30/prefix-sum.html

Signed-off-by: Elias Naur <mail@eliasnaur.com>
2020-10-25 21:02:58 +01:00
Elias Naur bc01180519 piet-gpu/shader: delete unused is_fill from elements.comp
Delete debug code as well.

Signed-off-by: Elias Naur <mail@eliasnaur.com>
2020-10-25 20:59:54 +01:00
Elias Naur 8fab45544e shader: implement clip paths
Expand the the final kernel4 stage to maintain a per-pixel mask.

Introduce two new path elements, FillMask and FillMaskInv, to fill
the mask. FillMask acts like Fill, while FillMaskInv fills the area
outside the path.

SVG clipPaths is then representable by a FillMaskInv(0.0) for every nested
path, preceded by a FillMask(1.0) to clear the mask.

The bounding box for FillMaskInv elements is the entire screen; tightening of
the bounding box is left for future work. Note that a fullscreen bounding
box is not hopelessly inefficient because completely filling a tile with
a mask is just a single CmdSolidMask per tile.

Fixes #30

Signed-off-by: Elias Naur <mail@eliasnaur.com>
2020-10-09 13:20:26 +02:00
Elias Naur 55cfd472a5 shader: delete unused code
Signed-off-by: Elias Naur <mail@eliasnaur.com>
2020-10-09 13:20:26 +02:00
Elias Naur cfd57361c4 Fix linewidth transformations
The transformation determinant is signed, but we're only interested in
the absolute scale for transforming linewidths.

Signed-off-by: Elias Naur <mail@eliasnaur.com>
2020-08-24 16:12:18 +02:00
bhmerchant@gmail.com d836d21d12 Clean up bits of right edge tracking logic left over from sort-middle. 2020-08-12 19:57:14 -07:00
Raph Levien eaa1d261c3 Sederberg error metric
Use proper math to compute number of subdivisions. This works but is not
very satisfying, as it over-subdivides.
2020-06-09 18:43:49 -07:00
Raph Levien b571e0d10c Continue wiring up gpu-side flattening
All segments given to path coarse raster are cubics. Flatten to
quadratics.

This works but the quality is not (yet) good.
2020-06-09 17:56:11 -07:00
Raph Levien 0f44bc8b78 Start GPU-side flattening
This starts the work on GPU-side flattening by plumbing curves through.
2020-06-09 16:01:47 -07:00
Raph Levien 294f6fd1db Experiment with new sorting scheme
Path segments are unsorted, but other elements are using the same
sort-middle approach as before.

This is a checkpoint. At this point, there are unoptimized versions
of tile init and coarse path raster, but it isn't wired up into a
working pipeline. Also observing about a 3x performance regression in
element processing, which needs to be investigated.
2020-06-03 09:29:25 -07:00
Raph Levien 55df3e6cc8 Fix linewidth math
Coarse rasterization wasn't entirely taking line width into account.

Also fix swizzle in matrix (not yet used). And fix missing End command
in ptcl output (hasn't been a problem because buffer was cleared).
2020-05-24 09:43:41 -07:00
Raph Levien a616b4d010 Rework right_edge computation in elements
Trying to fit it into the fancy monad doesn't really work, so use a
more straightforward approach to compute it from the aggregate.

Also add yEdge logic (basically copying piet-metal). With a fix to
ELEMENT_BINNING_RATIO (which I had simply gotten wrong), the example
renders almost correctly, with small bounding box artifacts.
2020-05-21 10:00:56 -07:00
Raph Levien 076e6d600d Progress on wiring up fills
Write the right_edge to the binning output.

More work on encoding the fill/stroke distinction and plumbing that
through the pipeline. This is a bit unsatisfying because of the code
duplication; having an extra fill/stroke bool might be better, but I
want to avoid making the structs bigger (this could be solved by
better packing in the struct encoding).

Fills are plumbed through to the last stage. Backdrop is WIP.
2020-05-20 11:14:19 -07:00
Raph Levien 03da52cff8 Start implementing fills
This should get the "right_edge" value for each segment plumbed through
to the binning phase. It also needs to be plumbed to coarse raster and
wired up there.

Also considering WIP because none of this logic has been tested yet.
2020-05-19 20:40:04 -07:00
Raph Levien fe1790e724 Fix bbox bug
Bounding boxes were being calculated as way too large in the element
processing.

Also wire up counters so winit binary is happy.
2020-05-16 21:20:25 -07:00
Raph Levien 93044b469b Fix prefix sum
First, add decoupled lookback.

Second, fix problem with monoid that was overly aggressive in resetting
the bbox.
2020-05-15 20:09:39 -07:00
Raph Levien 868b0320a4 Render strokes
As of this point, it mostly renders stroke outlines for tiger. Some
dropouts are because the scan in the elements pass doesn't do lookback
yet, others are probably a bug.
2020-05-15 17:38:17 -07:00
Raph Levien 343e4c3075 Binning stage
Adds a binning stage. This is a first draft, and a number of loose ends
exist.
2020-05-12 17:34:15 -07:00
Raph Levien 736f883f66 Store annotated elements
Apply transform to paths and annotate with computed linewidth and
bounding box information, storing the result.
2020-05-12 12:13:39 -07:00
Raph Levien 9a8854ffab Experimenting with sort-middle
Starting a prototype that explores the sort-middle approach. This
commit has a prefix sum pass computing state per element.
2020-05-12 08:54:09 -07:00