Eliminates the precision loss of the subtraction in the sign(end.x - start.x)
expression in kernel4. That's important for the next change that avoids
inconsistent line intersections in path_coarse.
Updates #23
Signed-off-by: Elias Naur <mail@eliasnaur.com>
Provide images to fine rasterization kernel as readonly textures with a
sampler, rather than storage images. That lets us use the GPU's hardware
for sampling, which should be considerably more efficient.
There are a bunch of parameters that are hardcoded, but it does seem to
work.
This patch passes a dynamically sized array of textures to the fine
rasterizer.
A bunch of the low level Vulkan stuff is done, but only enough of the
shaders and encoders to do minimal testing. We'll want to switch from
storage images to sampled images, track the actual array of textures
during encoding, use that to build the descriptor set (which will need
to be more dynamic), and of course run image elements through the
pipeline.
Progress towards #38
We keep a small window of the clip stack in registers in the fine
rasterization kernel, and when that window is exceeded, spill to global
memory, so the clip stack can be unbounded.
I realized there's a problem with encoding clip bboxes relative to the
current transform (see #36 for a more detailed explanation), so this is
changing it to absolute bboxes.
This more or less gets clips working. There are optimization
opportunities (all-clear and all-opaque mask tiles), and it doesn't deal
with overflow of the blend stack, but it seems to basically work.
Actually handle transforms in RenderCtx (was implemented in renderer but
not actually plumbed through). This also requires maintaining a state
stack, which will also be required for clipping.
This PR also starts work on encoding clipping, including tracking
bounding boxes.
WIP, none of this is tested yet.
Expand the the final kernel4 stage to maintain a per-pixel mask.
Introduce two new path elements, FillMask and FillMaskInv, to fill
the mask. FillMask acts like Fill, while FillMaskInv fills the area
outside the path.
SVG clipPaths is then representable by a FillMaskInv(0.0) for every nested
path, preceded by a FillMask(1.0) to clear the mask.
The bounding box for FillMaskInv elements is the entire screen; tightening of
the bounding box is left for future work. Note that a fullscreen bounding
box is not hopelessly inefficient because completely filling a tile with
a mask is just a single CmdSolidMask per tile.
Fixes#30
Signed-off-by: Elias Naur <mail@eliasnaur.com>
In kernel 4, compute a chunk of pixels rather than just one per thread.
This is a dramatic speedup.
(This commit cherry-picked from another working branch)
This version seems to work but the allocation of segments has low
utilization. Probably best to allocate in chunks rather than try to
make them contiguous.
This just adds the first step of polyline stroking, which is adding it
to the scene. Also just a bit of cleaning up of dimensions into one
header file.