Commit graph

48 commits

Author SHA1 Message Date
Raph Levien ef5ef2745c Render color emoji layers
A bit hacky still, but does render color in Segoe color emoji.
2021-08-27 08:25:05 -07:00
Raph Levien 05e81acebc Basically get gradients working
Separate out render context upload from renderer creation. Upload ramps
to GPU buffer. Encode gradients to scene description. Fix a number of
bugs in uploading and processing.

This renders gradients in a test image, but has some shortcomings. For
one, staging buffers need to be applied for a couple things (they're
just host mapped for now). Also, the interaction between sRGB and
premultiplied alpha isn't quite right. The size of the gradient ramp
buffer is fixed and should be dynamic.

And of course there's always more optimization to be done, including
making the upload of gradient ramps more incremental, and probably
hashing of the stops instead of the processed ramps.
2021-08-09 16:16:46 -07:00
Raph Levien 6f707c4c62 Start work on gradients
WIP. Most of the GPU-side work should be done (though it's not tested
end-to-end and it's certainly possible I missed something), but still
needs work on encoding side.
2021-07-12 06:56:52 -07:00
Elias Naur eb37db1b05 replace per-element fill mode flags with a SetFillMode element
Fixes #70

Signed-off-by: Elias Naur <mail@eliasnaur.com>
2021-03-29 21:10:25 +02:00
Elias Naur 8db77e180e support stroked fills for clips, images
This change completes general support for stroked fills for clips and
images.

Annotated_size increases from 28 to 32, because of the linewidth field
added to AnnoImage. Stroked image fills are presumably rare, and if
memory pressure turns out to be a bottleneck, we could replace the
linewidth field with a separate AnnoLinewidth element.

Updates #70

Signed-off-by: Elias Naur <mail@eliasnaur.com>
2021-03-19 16:43:33 +01:00
Elias Naur db59b5d570 coarse,kernel4: make stroke, (non-zero) fill, solid separate commands
Before this change, every command (FillColor, FillImage, BeginClip)
had (or would need) stroke, (non-zero) fill and solid variants.

This change adds a command for each fill mode and its parameters,
reducing code duplication and adding support for stroked FillImage and
BeginClip as a side effect.

The rest of the pipeline doesn't yet support Stroked FillImage and
BeginClip. That's a follow-up change.

Since each command includes a tag, this change adds an extra word for
each fill and stroke. That waste is also addressed in a follow-up.

Updates #70

Signed-off-by: Elias Naur <mail@eliasnaur.com>
2021-03-19 16:43:33 +01:00
Elias Naur 44bff2726c collapse FillCubic and StrokeCubic into Cubic with flags for fill mode
Updates #70

Signed-off-by: Elias Naur <mail@eliasnaur.com>
2021-03-19 12:50:12 +01:00
Elias Naur df055563bd collapse annotated Fill and Stroke to Color with fill mode flag
No functionality changes, just different encoding.

Updates #70

Signed-off-by: Elias Naur <mail@eliasnaur.com>
2021-03-19 12:50:12 +01:00
Elias Naur e9ff509ab9 use tag flags for fill vs stroke modes in scene elements
Encode stroke vs fill as tag flags, thereby reducing the number of scene
elements. Encoding change only, no functional changes.

The previous Stroke and Fill commands are merged to one command,
FillColor. The encoding to annotated element is divergent, which is
fixed when annotated elements move to tag flags.
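As a rough illustration of the tag-flag idea, the sketch below packs a fill/stroke bit into the same word as the element kind. The bit layout and the names `KIND_FILL_COLOR`, `FLAG_STROKE`, `encode_tag`, `tag_kind` and `is_stroke` are hypothetical and chosen only for this example; the actual encoding in the commit may differ.

```rust
// Hypothetical layout: low 16 bits hold the element kind,
// bit 16 marks a stroke (clear means fill).
pub const KIND_MASK: u32 = 0xffff;
pub const KIND_FILL_COLOR: u32 = 1; // hypothetical kind value
pub const FLAG_STROKE: u32 = 1 << 16; // hypothetical flag bit

/// Build a tag word from an element kind and its fill mode.
pub fn encode_tag(kind: u32, stroke: bool) -> u32 {
    let flags = if stroke { FLAG_STROKE } else { 0 };
    (kind & KIND_MASK) | flags
}

/// Recover the element kind, ignoring the flag bits.
pub fn tag_kind(tag: u32) -> u32 {
    tag & KIND_MASK
}

/// True when the tag carries the stroke flag.
pub fn is_stroke(tag: u32) -> bool {
    tag & FLAG_STROKE != 0
}
```

With a layout like this, one FillColor kind covers both modes: `encode_tag(KIND_FILL_COLOR, true)` and `encode_tag(KIND_FILL_COLOR, false)` decode to the same kind and differ only in the flag.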

Updates #70

Signed-off-by: Elias Naur <mail@eliasnaur.com>
2021-03-19 12:50:12 +01:00
Elias Naur 903ab1fb59 implement FillImage command and sRGB support
FillImage is like Fill, except that it takes its color from one or
more image atlases.

kernel4 uses a single image for non-Vulkan hosts, and a dynamically sized array
of image descriptors on Vulkan.

A previous version of this commit used textures. I think images are a better
choice for piet-gpu, for several reasons:

- Texture sampling, in particular textureGrad, is slow on lower spec devices
  such as Google Pixel. Texture sampling is particularly slow and difficult to
  implement for CPU fallbacks.
- Texture sampling needs more parameters, in particular the full u,v
  transformation matrix, leading to a large increase in the command size. Since
  all commands use the same size, that memory penalty is paid by all scenes, not
  just scenes with textures.
- It is unlikely that piet-gpu will support every kind of fill for every
  client, because each kind must be added to kernel4.

With FillImage, a client will prepare the image(s) in separate shader stages,
sampling and applying transformations and special effects as needed. Textures
that align with the output pixel grid can be used directly, without
pre-processing.

Note that the pre-processing step can run concurrently with the piet-gpu pipeline;
only the last stage, kernel4, needs the images.

Pre-processing most likely uses fixed function vertex/fragment programs,
which on some GPUs may run in parallel with piet-gpu's compute programs.

While here, fix a few validation errors:
- Explicitly enable EXT_descriptor_indexing, KHR_maintenance3,
  KHR_get_physical_device_properties2.
- Specify a vkDescriptorSetVariableDescriptorCountAllocateInfo for
  vkAllocateDescriptorSets. Otherwise, variable image2D arrays won't work (but
  sampler2D arrays do, at least on my setup).

Updates #38

Signed-off-by: Elias Naur <mail@eliasnaur.com>
2021-03-19 12:50:12 +01:00
Elias Naur 07e07c7544 ensure consistent path segment transformation
As described in #62, the non-deterministic scene monoid may result in
slightly different transformations for path segments in an otherwise
closed path.

This change ensures consistent transformation across paths in three steps.

First, absolute transformations computed by the scene monoid are stored
along with path segments and annotated elements.

Second, elements.comp no longer transforms path segments. Instead, each
segment is stored untransformed along with a reference to its absolute
transformation.

Finally, path_coarse performs the transformation of path segments.
Because all segments in a path share a single transformation reference,
the inconsistency in #62 is avoided.
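The three steps can be sketched on the CPU as follows. This is only a minimal model under assumptions: `Affine`, `PathSeg`, `trans_ix` and `transform_path` are hypothetical names for this example, not the types used in the shaders. The point it shows is that segments keep untransformed points plus an index into a shared transform table, so every segment of a path is transformed with exactly the same matrix.

```rust
/// Affine transform stored as [a, b, c, d, e, f]:
/// x' = a*x + c*y + e, y' = b*x + d*y + f.
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct Affine(pub [f32; 6]);

impl Affine {
    pub fn apply(&self, p: [f32; 2]) -> [f32; 2] {
        let m = self.0;
        [
            m[0] * p[0] + m[2] * p[1] + m[4],
            m[1] * p[0] + m[3] * p[1] + m[5],
        ]
    }
}

/// Untransformed segment plus a reference to its absolute transform.
#[derive(Clone, Copy, Debug)]
pub struct PathSeg {
    pub p0: [f32; 2],
    pub p1: [f32; 2],
    pub trans_ix: usize, // hypothetical index into the transform table
}

/// Late transformation step (the role path_coarse plays in the commit):
/// every segment looks up its transform by index and applies it.
pub fn transform_path(segs: &[PathSeg], transforms: &[Affine]) -> Vec<([f32; 2], [f32; 2])> {
    segs.iter()
        .map(|s| {
            let t = &transforms[s.trans_ix];
            (t.apply(s.p0), t.apply(s.p1))
        })
        .collect()
}
```

Because two adjacent segments share both the untransformed endpoint and the transform index, their transformed endpoints come out bit-identical, which is the property a closed path needs.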

Fixes #62

Signed-off-by: Elias Naur <mail@eliasnaur.com>
2021-03-19 12:45:23 +01:00
Elias Naur 79d722df48 remove unused commands from pathseg
Signed-off-by: Elias Naur <mail@eliasnaur.com>
2021-03-19 12:45:23 +01:00
Elias Naur b73eabf4eb kernel4.comp: remove unused commands
Signed-off-by: Elias Naur <mail@eliasnaur.com>
2021-02-24 15:32:24 +01:00
Elias Naur 1c6ca7e5fb remove unused BinChunk type
Signed-off-by: Elias Naur <mail@eliasnaur.com>
2020-12-08 00:45:08 +01:00
Elias Naur 19f4d9fa95 change tile segment representation to (origin, vector)
Eliminates the precision loss of the subtraction in the sign(end.x - start.x)
expression in kernel4. That's important for the next change that avoids
inconsistent line intersections in path_coarse.
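A small sketch of why the (origin, vector) form helps, assuming a hypothetical `TileSeg` struct and a `sign` helper that mimics GLSL's `sign` (returning 0 for 0). The example uses an exaggerated coordinate to make the rounding visible; it is not taken from the shader code.

```rust
/// Hypothetical tile segment stored as an origin and a direction vector.
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct TileSeg {
    pub origin: [f32; 2],
    pub vector: [f32; 2],
}

/// GLSL-style sign: -1.0, 0.0 or 1.0.
pub fn sign(x: f32) -> f32 {
    if x > 0.0 {
        1.0
    } else if x < 0.0 {
        -1.0
    } else {
        0.0
    }
}

impl TileSeg {
    /// End point reconstructed from origin + vector (may round).
    pub fn end(&self) -> [f32; 2] {
        [
            self.origin[0] + self.vector[0],
            self.origin[1] + self.vector[1],
        ]
    }

    /// Horizontal direction read directly from the stored vector,
    /// with no subtraction of two nearly equal coordinates.
    pub fn x_sign(&self) -> f32 {
        sign(self.vector[0])
    }
}
```

For instance, with `origin.x = 1.0e8` and `vector.x = 1.0`, the reconstructed end point rounds back to `1.0e8` in f32, so `sign(end.x - origin.x)` is 0 while `x_sign()` still reports 1.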

Updates #23

Signed-off-by: Elias Naur <mail@eliasnaur.com>
2020-12-01 18:34:40 +01:00
Elias Naur feeb459fa1 remove FillMask and FillMaskInv
Obsoleted by BeginClip/EndClip.

Updates #36

Signed-off-by: Elias Naur <mail@eliasnaur.com>
2020-11-29 16:59:58 +01:00
Elias Naur bd450ef461 piet-gpu-types: remove unused Segment and SegChunk types
Signed-off-by: Elias Naur <mail@eliasnaur.com>
2020-11-29 16:51:35 +01:00
Raph Levien d14895b107 Continuing work on clips
I realized there's a problem with encoding clip bboxes relative to the
current transform (see #36 for a more detailed explanation), so this is
changing it to absolute bboxes.

This more or less gets clips working. There are optimization
opportunities (all-clear and all-opaque mask tiles), and it doesn't deal
with overflow of the blend stack, but it seems to basically work.
2020-11-20 18:25:27 -08:00
Raph Levien f53d00e6bc Add transforms and state stack
Actually handle transforms in RenderCtx (was implemented in renderer but
not actually plumbed through). This also requires maintaining a state
stack, which will also be required for clipping.

This PR also starts work on encoding clipping, including tracking
bounding boxes.

WIP, none of this is tested yet.
2020-11-20 18:25:27 -08:00
Elias Naur 8fab45544e shader: implement clip paths
Expand the final kernel4 stage to maintain a per-pixel mask.

Introduce two new path elements, FillMask and FillMaskInv, to fill
the mask. FillMask acts like Fill, while FillMaskInv fills the area
outside the path.

SVG clipPaths are then representable by a FillMaskInv(0.0) for every nested
path, preceded by a FillMask(1.0) to clear the mask.

The bounding box for FillMaskInv elements is the entire screen; tightening of
the bounding box is left for future work. Note that a fullscreen bounding
box is not hopelessly inefficient because completely filling a tile with
a mask is just a single CmdSolidMask per tile.

Fixes #30

Signed-off-by: Elias Naur <mail@eliasnaur.com>
2020-10-09 13:20:26 +02:00
Elias Naur 9be0faba6f piet-gpu-types: remove unused scene elements
Delete image compute shader as well; it is unused.

Signed-off-by: Elias Naur <mail@eliasnaur.com>
2020-09-27 18:57:53 +02:00
Elias Naur fa9bf0dc2b piet-gpu-types: remove unused ptcl types
Signed-off-by: Elias Naur <mail@eliasnaur.com>
2020-09-27 18:30:33 +02:00
Elias Naur dceb0f9412 piet-gpu-types: remove unused annotated types
Signed-off-by: Elias Naur <mail@eliasnaur.com>
2020-09-21 10:55:58 +02:00
Raph Levien 0f44bc8b78 Start GPU-side flattening
This starts the work on GPU-side flattening by plumbing curves through.
2020-06-09 16:01:47 -07:00
Raph Levien af0a1af8e1 Make fills work
The backdrop propagation is slow but it does work.
2020-06-05 22:40:44 -07:00
Raph Levien 70a9c17e23 Continue building out pipeline
Plumbs the new tiling scheme to k4. This works (stroke only) but still
has some performance issues.
2020-06-03 12:21:09 -07:00
Raph Levien 294f6fd1db Experiment with new sorting scheme
Path segments are unsorted, but other elements are using the same
sort-middle approach as before.

This is a checkpoint. At this point, there are unoptimized versions
of tile init and coarse path raster, but it isn't wired up into a
working pipeline. Also observing about a 3x performance regression in
element processing, which needs to be investigated.
2020-06-03 09:29:25 -07:00
Raph Levien dbcffb10db Reinstate fills
Add fills back in.
2020-05-25 15:27:03 -07:00
Raph Levien 8eaf49a04d Checkpoint parallel output
Parallel segment output seems to be working for strokes.
2020-05-25 12:14:18 -07:00
Raph Levien a616b4d010 Rework right_edge computation in elements
Trying to fit it into the fancy monad doesn't really work, so use a
more straightforward approach to compute it from the aggregate.

Also add yEdge logic (basically copying piet-metal). With a fix to
ELEMENT_BINNING_RATIO (which I had simply gotten wrong), the example
renders almost correctly, with small bounding box artifacts.
2020-05-21 10:00:56 -07:00
Raph Levien 076e6d600d Progress on wiring up fills
Write the right_edge to the binning output.

More work on encoding the fill/stroke distinction and plumbing that
through the pipeline. This is a bit unsatisfying because of the code
duplication; having an extra fill/stroke bool might be better, but I
want to avoid making the structs bigger (this could be solved by
better packing in the struct encoding).

Fills are plumbed through to the last stage. Backdrop is WIP.
2020-05-20 11:14:19 -07:00
Raph Levien 03da52cff8 Start implementing fills
This should get the "right_edge" value for each segment plumbed through
to the binning phase. It also needs to be plumbed to coarse raster and
wired up there.

Also consider this WIP because none of this logic has been tested yet.
2020-05-19 20:40:04 -07:00
Raph Levien 1240da3870 Delete old-style kernels and buffers
Pave the way for the coarse raster pass to write to the ptcl buffer.
2020-05-15 15:24:37 -07:00
Raph Levien 343e4c3075 Binning stage
Adds a binning stage. This is a first draft, and a number of loose ends
exist.
2020-05-12 17:34:15 -07:00
Raph Levien 736f883f66 Store annotated elements
Apply transform to paths and annotate with computed linewidth and
bounding box information, storing the result.
2020-05-12 12:13:39 -07:00
Raph Levien 9a8854ffab Experimenting with sort-middle
Starting a prototype that explores the sort-middle approach. This
commit has a prefix sum pass computing state per element.
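To illustrate what "state per element" via a prefix sum can look like, here is a sequential sketch over a toy state. The `State` fields (an accumulated translation and the most recent linewidth) and the `combine`/`scan` names are assumptions for this example; the state in the actual pass is richer and the GPU version runs in parallel.

```rust
/// Toy per-element state: accumulated translation and latest linewidth.
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct State {
    pub translate: [f32; 2],
    pub linewidth: Option<f32>,
}

impl State {
    pub const IDENTITY: State = State {
        translate: [0.0, 0.0],
        linewidth: None,
    };

    /// Associative combine: translations add, a later linewidth wins.
    pub fn combine(self, other: State) -> State {
        State {
            translate: [
                self.translate[0] + other.translate[0],
                self.translate[1] + other.translate[1],
            ],
            linewidth: other.linewidth.or(self.linewidth),
        }
    }
}

/// Inclusive prefix sum: output i is the combination of elements 0..=i.
pub fn scan(elements: &[State]) -> Vec<State> {
    let mut acc = State::IDENTITY;
    elements
        .iter()
        .map(|e| {
            acc = acc.combine(*e);
            acc
        })
        .collect()
}
```

Since `combine` is associative, the same result can be computed in parallel chunks and merged, which is what makes a prefix sum a natural fit for this kind of pass.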
2020-05-12 08:54:09 -07:00
Raph Levien aa83d782ed Fills
Adds fills, and has more or less working tiger render (with artifacts).
2020-05-01 19:42:20 -07:00
Raph Levien b23fe25177 Use linked list strategy for segments
Trying to allocate them contiguously wasn't good.
2020-04-28 22:25:57 -07:00
Raph Levien cb06b1bc3d Implement stroked polylines
This version seems to work but the allocation of segments has low
utilization. Probably best to allocate in chunks rather than try to
make them contiguous.
2020-04-28 18:45:59 -07:00
Raph Levien 55e35dd879 Dynamic allocation of intermediate buffers
When the initial allocation is exceeded, do an atomic bump allocation.
This is done for both tilegroup instances and per tile command lists.
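A minimal CPU-side sketch of atomic bump allocation, assuming a hypothetical `BumpAlloc` type that hands out offsets (in words) past a statically assigned initial region. The shader version would use a GPU atomic add on a buffer counter instead of `AtomicU32`.

```rust
use std::sync::atomic::{AtomicU32, Ordering};

/// Hypothetical bump allocator over a fixed-size buffer, counted in words.
pub struct BumpAlloc {
    next: AtomicU32,
    capacity: u32,
}

impl BumpAlloc {
    /// `initial_offset` is where the statically assigned region ends.
    pub fn new(initial_offset: u32, capacity: u32) -> Self {
        BumpAlloc {
            next: AtomicU32::new(initial_offset),
            capacity,
        }
    }

    /// Reserve `size` words with a single atomic add.
    /// Returns the start offset, or None when the buffer is exhausted.
    /// Note: a failed request still advances the counter in this sketch.
    pub fn alloc(&self, size: u32) -> Option<u32> {
        let start = self.next.fetch_add(size, Ordering::Relaxed);
        match start.checked_add(size) {
            Some(end) if end <= self.capacity => Some(start),
            _ => None,
        }
    }
}
```

Each caller gets a disjoint range because `fetch_add` returns the counter value from before its own increment, so no lock is needed even when many threads allocate at once.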
2020-04-25 10:45:47 -07:00
Brian Merchant 4aaa6f1f29 Add f16 support.
2020-04-21 23:45:24 -07:00
Raph Levien 6976f877e0 Add first draft of kernel 3
A fairly simple approach, but it adds the translation (not tested yet
in scene encoding) and does bounding box culling.
2020-04-21 18:49:50 -07:00
Brian Merchant 818d5b2047 Merge branch 'master' into master
2020-04-21 15:18:51 -07:00
Brian Merchant 3270ee64c2 Add f16 support.
Handling f16 requires special work compared to other scalars, as the minimum conversion operation for u32->f16 in GLSL (unpackHalf2x16) loads two f16s from one u32. This means that in order to minimize unnecessary calls to unpackHalf2x16, we should look ahead to see if the current f16 has already been extracted in the process of dealing with the last f16. Similar considerations exist for write operations, where we want to pack, when possible, two f16s in one go (using packHalf2x16).
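The pairing logic can be modeled on raw 16-bit patterns, without any float conversion. In the sketch below, `unpack_pair` and `pack_pair` stand in for GLSL's unpackHalf2x16 and packHalf2x16 (low 16 bits first, as in GLSL), and `HalfReader` is a hypothetical helper that remembers the second half so a word is unpacked only once. None of these names come from the commit.

```rust
/// Split a u32 into its two 16-bit halves, low half first.
pub fn unpack_pair(word: u32) -> (u16, u16) {
    ((word & 0xffff) as u16, (word >> 16) as u16)
}

/// Pack two 16-bit halves into one u32, low half first.
pub fn pack_pair(lo: u16, hi: u16) -> u32 {
    (lo as u32) | ((hi as u32) << 16)
}

/// Reads consecutive f16 bit patterns, unpacking each word only once.
pub struct HalfReader<'a> {
    words: &'a [u32],
    index: usize,         // next word to unpack
    pending: Option<u16>, // high half left over from the last unpack
    pub unpack_calls: usize,
}

impl<'a> HalfReader<'a> {
    pub fn new(words: &'a [u32]) -> Self {
        HalfReader {
            words,
            index: 0,
            pending: None,
            unpack_calls: 0,
        }
    }

    /// Return the next half, reusing the cached one when available.
    pub fn next_half(&mut self) -> Option<u16> {
        if let Some(hi) = self.pending.take() {
            return Some(hi);
        }
        let word = *self.words.get(self.index)?;
        self.index += 1;
        self.unpack_calls += 1;
        let (lo, hi) = unpack_pair(word);
        self.pending = Some(hi);
        Some(lo)
    }
}
```

Reading three consecutive halves from two words costs two unpack calls here rather than three, which is the saving the commit message describes for reads; the write side is the mirror image, buffering one half until its partner is ready for `pack_pair`.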
2020-04-21 15:03:06 -07:00
Raph Levien 2ed89dd65e First draft of kernel 1
Output of kernel 1 is validated by simple inspection, next step is to
wire it up properly.
2020-04-20 18:07:18 -07:00
Raph Levien 5adb703936 Staging buffers
Add hal methods to clear and copy buffers, so work happens in device
local buffers.
2020-04-18 07:46:59 -07:00
Raph Levien 3c35899a2f Render circles
WIP
2020-04-17 16:01:37 -07:00
Raph Levien 228bfc88cd Add scene types
This patch adds a module that contains both scene and ptcl types (very
lightly adapted from piet-metal), as well as infrastructure for encoding
Rust-side.

WIP, it's not wired up in either the shader or on the Rust side.
2020-04-16 18:19:58 -07:00