Commit graph

131 commits

Elias Naur fd746ea7a6 name and comment magic constant
Follow-up to review of PR #61.

Signed-off-by: Elias Naur <mail@eliasnaur.com>
2021-03-19 12:45:23 +01:00
Elias Naur 79d722df48 remove unused commands from pathseg
Signed-off-by: Elias Naur <mail@eliasnaur.com>
2021-03-19 12:45:23 +01:00
Ishi Tatsuyuki 8a499bc50e Always close fill paths, fix #68 2021-03-17 01:16:00 +09:00
Elias Naur b73eabf4eb kernel4.comp: remove unused commands
Signed-off-by: Elias Naur <mail@eliasnaur.com>
2021-02-24 15:32:24 +01:00
Elias Naur 6a4e26ef2a all: add optional memory checks
Defining MEM_DEBUG in mem.h will add a size field to Alloc and enable
bounds and alignment checks for every memory read and write.
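
A minimal sketch of the mechanism, assuming the shared memory[] word buffer;
touch_mem, mem_error, and ERR_OUT_OF_BOUNDS are illustrative names, not the
literal patch:

    #ifdef MEM_DEBUG
    struct Alloc { uint offset; uint size; };
    #else
    struct Alloc { uint offset; };
    #endif

    // Reject word accesses outside the allocation.
    bool touch_mem(Alloc alloc, uint offset) {
    #ifdef MEM_DEBUG
        if (offset < alloc.offset / 4 || offset >= (alloc.offset + alloc.size) / 4) {
            atomicMax(mem_error, ERR_OUT_OF_BOUNDS);
            return false;
        }
    #endif
        return true;
    }

    uint read_mem(Alloc alloc, uint offset) {
        if (!touch_mem(alloc, offset)) {
            return 0;
        }
        return memory[offset];
    }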

Notes:
- Deriving an Alloc from Path.tiles is unsound, but it's more trouble to
  convert Path.tiles from TileRef to a variable-sized Alloc.
- elements.comp notes that "We should be able to use an array of structs but the
  NV shader compiler doesn't seem to like it". If that's still relevant, do
  shared arrays of Allocs work?

Signed-off-by: Elias Naur <mail@eliasnaur.com>
2021-02-15 16:07:45 +01:00
Elias Naur ee67a0a515 kernel4: simplify a tiny bit
Signed-off-by: Elias Naur <mail@eliasnaur.com>
2020-12-27 20:24:29 +01:00
Elias Naur 716517cc04 coarse,binning: organize bins into width_in_bins x height_in_bins
The binning shader supports up to N_TILE bins. To efficiently cover wide or
tall viewports, convert the rigid N_TILE_X x N_TILE_Y bin layout to a variable
width_in_bins x height_in_bins layout.
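
Illustratively, a bin index then derives from the variable layout rather than
from fixed constants (a sketch; everything except width_in_bins and
height_in_bins is assumed):

    // A tile's bin under the variable layout; the layout is chosen so
    // that width_in_bins * height_in_bins <= N_TILE.
    uint bin_x = tile_x / N_TILE_X;
    uint bin_y = tile_y / N_TILE_Y;
    uint bin_ix = bin_y * width_in_bins + bin_x;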

Signed-off-by: Elias Naur <mail@eliasnaur.com>
2020-12-27 20:24:29 +01:00
Elias Naur ef4ec772ad backdrop: repair unsound optimization
Signed-off-by: Elias Naur <mail@eliasnaur.com>
2020-12-27 20:24:29 +01:00
Elias Naur 8b62022749 backdrop: avoid a (benign) zero-sized read
Found with the MEM_DEBUG checks added in a later change.

Signed-off-by: Elias Naur <mail@eliasnaur.com>
2020-12-27 20:24:29 +01:00
Elias Naur c4f5a69a0d implement variable output sizing
Signed-off-by: Elias Naur <mail@eliasnaur.com>
2020-12-27 20:24:29 +01:00
Elias Naur c67696714b coarse.comp: don't write Cmd_End to tiles out of bounds
If WIDTH_IN_TILES or HEIGHT_IN_TILES is not divisible by N_TILE_X or N_TILE_Y
respectively, the previously unconditional Cmd_End write would write out of
bounds.
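
The fix amounts to guarding the terminating write on the viewport bounds,
along these lines (a sketch, not the literal patch):

    // Only terminate tiles that exist in the output surface.
    if (tile_x < WIDTH_IN_TILES && tile_y < HEIGHT_IN_TILES) {
        Cmd_End_write(cmd_ref);
    }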

Signed-off-by: Elias Naur <mail@eliasnaur.com>
2020-12-27 20:24:29 +01:00
Elias Naur 4de67d9081 unify GPU memory management
Merge all static and dynamic buffers into just one, "memory". Add a malloc
function for dynamic allocations.

Unify static allocation offsets into a "config" buffer containing scene setup
(number of paths, number of path segments), as well as the memory offsets of
the static allocations.

Finally, set an overflow flag when an allocation fails, and make sure to exit
shader execution as soon as that triggers. Add checks before beginning
execution in case the client wants to run two or more shaders before checking
the flag.
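
A sketch of the allocation path; treating mem_offset and mem_overflow as
header fields of the memory buffer is an assumption here:

    // Bump-allocate size bytes from the shared "memory" buffer.
    uint malloc(uint size) {
        uint offset = atomicAdd(mem_offset, size);
        if (offset + size > uint(memory.length()) * 4) {
            atomicOr(mem_overflow, 1);  // client checks this after dispatch
        }
        return offset;
    }

    // At the top of each shader's main(): bail out early if a previous
    // dispatch (or an earlier stage of this one) already overflowed.
    if (mem_overflow != 0) {
        return;
    }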

The "state" buffer is left alone because it needs zero'ing and because it is
accessed with the "volatile" keyword.

Fixes #40

Signed-off-by: Elias Naur <mail@eliasnaur.com>
2020-12-27 20:24:29 +01:00
Elias Naur a2a2d12c5d path_coarse.comp: fix intersection inconsistencies, take 2
The previous attempt to fix inconsistent intersections because of floating
point inaccuracy[0] missed two cases.

The first case is that top intersections with the very first row would fail
the test

tag == PathSeg_FillCubic && y > y0 && xbackdrop < bbox.z

In particular, y is not larger than y0 when y0 has been clipped to 0.

Fix that by re-introducing the min(p0.y, p1.y) < tile_y0 check, which does work
and is just as consistent. Add a similar check, min(p0.x, p1.x) < tile_x0, for
deciding when to clip the segment to the left edge (but keep the consistent
xray check for deciding left edge *intersections*).
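
In sketch form, the two clipping decisions become (illustrative, not the
literal patch):

    // min() of the endpoints is exact, unlike a computed intersection,
    // so neighboring tiles always agree on it.
    if (min(p0.y, p1.y) < tile_y0) {
        // clip the segment to the tile's top edge
    }
    if (min(p0.x, p1.x) < tile_x0) {
        // clip the segment to the tile's left edge; left edge
        // *intersections* still use the xray test for consistency
    }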

The second case is that tracking left intersections in the [xray, next_xray]
range of tiles may fail when next_xray is forced to last_xray, the final xray value.

Fix that case by computing next_xray explicitly, before looping over the
x tiles. The code is now much simpler.

Finally, ensure that xx0 and xx1 don't overflow the allocated number of tiles
by clamping them *after* setting them. Adjust xx0 to include xray, just as xx1
is adjusted; I haven't seen corruption without it, but it's not obvious that
xx0 always includes xray.

While here, replace a "+=" on a guaranteed zero value with just "=".

Updates #23

[0] 29cfb8b63e

Signed-off-by: Elias Naur <mail@eliasnaur.com>
2020-12-27 20:24:29 +01:00
Elias Naur d21f2b68de all: add SPDX license headers
Fixes #53

Signed-off-by: Elias Naur <mail@eliasnaur.com>
2020-12-11 18:24:35 +01:00
Elias Naur 5c04e4882b remove unused tilegroup.h and extra spaces from kernel4.comp
Signed-off-by: Elias Naur <mail@eliasnaur.com>
2020-12-11 15:00:58 +01:00
Elias Naur 580b63e558 elements.comp: tighten state size calculations
The state header is only one word (flags), not two.

Move the partition atomic counter to a separate field instead of state[0],
simplifying state offset calculations.

Signed-off-by: Elias Naur <mail@eliasnaur.com>
2020-12-10 18:48:16 +01:00
Raph Levien 56aaf7c19a
Merge pull request #52 from linebender/fix_query_pool
Fetch correct query pool
2020-12-09 22:42:23 -08:00
Raph Levien 769d71915e Fetch correct query pool
It should fetch the last query pool, but was off by one. That worked on
my machine (Windows), but did cause panics.
2020-12-09 08:34:01 -08:00
Elias Naur 1c6ca7e5fb remove unused BinChunk type
Signed-off-by: Elias Naur <mail@eliasnaur.com>
2020-12-08 00:45:08 +01:00
Raph Levien 634530fb91 Merge branch 'master' into image_work 2020-12-02 11:58:45 -08:00
Raph Levien 3906f348fd
Merge pull request #47 from linebender/clip_opt
Optimize clips
2020-12-02 11:57:14 -08:00
Elias Naur 29cfb8b63e eliminate inconsistent line intersections from path_coarse.comp
The finite precision of floating point computations can lead the coarse
renderer into inconsistent tile intersections, which implies impossible line
segments such as lines with gaps or double intersections. The winding number
algorithm is sensitive to these errors, which show up as incorrectly filled
paths.

This change forces all intersections to be consistent.
First, the floating point top edge intersection test is removed; top edge intersections are
completely determined by left edge intersections.
Then, left edge intersections are inserted from the tile with the last top edge
intersection, and the next top edge intersection is in turn fixed to the last
tile with a left edge intersection.

More details in the patch comments.

Fixes #23

Signed-off-by: Elias Naur <mail@eliasnaur.com>
2020-12-01 18:35:29 +01:00
Elias Naur 19f4d9fa95 change tile segment representation to (origin, vector)
Eliminates the precision loss of the subtraction in the sign(end.x - start.x)
expression in kernel4. That's important for the next change that avoids
inconsistent line intersections in path_coarse.
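
The representation change, roughly (field names are illustrative):

    struct TileSeg {
        vec2 origin;
        vec2 vector;   // replaces an end point: vector = end - origin
        float y_edge;
    };
    // kernel4 reads sign(seg.vector.x) directly instead of computing
    // sign(end.x - start.x) from two nearly equal stored endpoints.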

Updates #23

Signed-off-by: Elias Naur <mail@eliasnaur.com>
2020-12-01 18:34:40 +01:00
Elias Naur 2068171f96 path_coarse.comp: tighten variable scopes, delete unused variables
No functional changes.

Signed-off-by: Elias Naur <mail@eliasnaur.com>
2020-12-01 18:01:04 +01:00
Raph Levien 97dcb5122e Merge branch 'master' into image_work 2020-11-29 17:09:48 -08:00
Raph Levien b8ea1e35cf Merge branch 'master' into clip_opt 2020-11-29 17:07:46 -08:00
Elias Naur feeb459fa1 remove FillMask and FillMaskInv
Obsoleted by BeginClip/EndClip.

Updates #36

Signed-off-by: Elias Naur <mail@eliasnaur.com>
2020-11-29 16:59:58 +01:00
Elias Naur bd450ef461 piet-gpu-types: remove unused Segment and SegChunk types
Signed-off-by: Elias Naur <mail@eliasnaur.com>
2020-11-29 16:51:35 +01:00
Raph Levien 4138f8a516 Optimize clips
Optimize tiles with clip masks that are all-zero or all-one.
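
Conceptually, the coarse stage classifies each clip tile before emitting
commands (a sketch; the flag names are assumptions):

    if (mask_all_zero) {
        // contents fully clipped out: emit nothing for this tile
    } else if (mask_all_one) {
        // clip has no effect here: elide Cmd_BeginClip/Cmd_EndClip
    } else {
        // partial coverage: emit Cmd_BeginClip, contents, Cmd_EndClip
    }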

Part of #36
2020-11-27 09:30:35 -08:00
Raph Levien facc9e0982 Use sampler for texture images
Provide images to fine rasterization kernel as readonly textures with a
sampler, rather than storage images. That lets us use the GPU's hardware
for sampling, which should be considerably more efficient.

There are a bunch of parameters that are hardcoded, but it does seem to
work.
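
The switch in sketch form (binding numbers and the array size are
illustrative):

    // Before: a storage image, manual addressing, no filtering:
    //   layout(rgba8, binding = 3) uniform readonly image2D images[N];
    //   vec4 rgba = imageLoad(images[tex_ix], ivec2(st));
    // After: sampled textures; the hardware does the filtering.
    layout(binding = 3) uniform sampler2D textures[MAX_TEXTURES];
    vec4 rgba = texture(textures[tex_ix], st);  // st in normalized coords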
2020-11-25 18:05:10 -08:00
Raph Levien 047a0830d1 Towards wiring up images to k4
This patch passes a dynamically sized array of textures to the fine
rasterizer.

A bunch of the low level Vulkan stuff is done, but only enough of the
shaders and encoders to do minimal testing. We'll want to switch from
storage images to sampled images, track the actual array of textures
during encoding, use that to build the descriptor set (which will need
to be more dynamic), and of course run image elements through the
pipeline.

Progress towards #38
2020-11-24 22:11:38 -08:00
Raph Levien 6b06d249ab Builder pattern for pipelines
Use a builder pattern for pipelines and descriptor sets, so we can add richer
functionality without hugely complicating the existing code.

WIP
2020-11-24 22:11:38 -08:00
Raph Levien a60c2dd3c8 Scratch buffer for clip stack
We keep a small window of the clip stack in registers in the fine
rasterization kernel, and when that window is exceeded, spill to global
memory, so the clip stack can be unbounded.
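
In outline (the window size and all names here are illustrative):

    #define CLIP_WINDOW 4            // top-of-stack entries kept in registers
    float clip_window[CLIP_WINDOW];
    uint clip_sp = 0;                // logical stack depth

    void clip_push(float mask) {
        if (clip_sp >= CLIP_WINDOW) {
            // Window full: spill the entry falling out of the window to a
            // global scratch buffer; popping reloads it when the window drains.
            scratch[scratch_base + clip_sp - CLIP_WINDOW] =
                clip_window[clip_sp % CLIP_WINDOW];
        }
        clip_window[clip_sp % CLIP_WINDOW] = mask;
        clip_sp++;
    }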
2020-11-22 18:14:09 -08:00
Raph Levien b928c7a3ed Restore FillMaskInv logic 2020-11-21 10:47:28 -08:00
Raph Levien 13134e7cb3 Restore FillMask logic
Per discussion, don't remove FillMask until we get unbounded clip stacks.
2020-11-21 07:00:03 -08:00
Raph Levien d14895b107 Continuing work on clips
I realized there's a problem with encoding clip bboxes relative to the
current transform (see #36 for a more detailed explanation), so this is
changing it to absolute bboxes.

This more or less gets clips working. There are optimization
opportunities (all-clear and all-opaque mask tiles), and it doesn't deal
with overflow of the blend stack, but it seems to basically work.
2020-11-20 18:25:27 -08:00
Raph Levien f53d00e6bc Add transforms and state stack
Actually handle transforms in RenderCtx (was implemented in renderer but
not actually plumbed through). This also requires maintaining a state
stack, which will also be required for clipping.

This PR also starts work on encoding clipping, including tracking
bounding boxes.

WIP, none of this is tested yet.
2020-11-20 18:25:27 -08:00
Raph Levien 47e24ec9d5 Start adding support for creating images
This is still WIP, focused on creating image resources and making them
available GPU-side.

Progress toward #38
2020-11-19 16:32:29 -08:00
Raph Levien 75c4b62730 Add hub abstraction
The hub does a little better lifetime tracking of resources (so
Rust-side references can be dropped), and in the future will be used for
dynamic selection of backend.

The migration is still a bit half-baked, as there are a bunch of
Vulkan-specific types in the signatures, but it shouldn't be too much
work to sort that out. Perhaps it can wait until there is a second
backend though.

The main motivation for this is to create image objects with lifetime
tracking, one of the things required for #38.
2020-11-18 16:06:08 -08:00
Raph Levien 301abf4db7 Minor cleanups
Mostly cleaning up some comments. Also adds a host barrier and a command
to copy a buffer to an image (in preparation for images; see #38).
2020-11-17 14:18:30 -08:00
Raph Levien 8e2f2aeeba Update dependencies
Update to latest versions of all dependencies. Among other things, this
gets us on piet 0.2, though almost all of the changes were around text,
which is not yet implemented.
2020-11-14 08:25:43 -08:00
Elias Naur b942e4035b piet-gpu/shader: ensure forward progress in decoupled lookback
The Vulkan and OpenGL specifications offer only weak forward progress guarantees, and
in practice several mobile devices fail to complete the decoupled lookback
spinloop without mitigation.

This patch implements Raph's suggestion from the "Forward Progress"
section from

https://raphlinus.github.io/gpu/2020/04/30/prefix-sum.html
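
For context, the wait in question is the classic decoupled look-back spin,
sketched below; flag_ix and the flag constant are illustrative, and the actual
mitigation follows the linked post rather than this sketch:

    // Partition ix spins until its predecessor publishes a prefix. With
    // only weak forward-progress guarantees the producer may never be
    // scheduled while this loop runs, so the wait must not be unbounded.
    while (true) {
        uint flag = state[flag_ix(ix - 1)];  // volatile, coherent read
        if (flag == FLAG_PREFIX_READY) {
            break;
        }
    }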

Signed-off-by: Elias Naur <mail@eliasnaur.com>
2020-10-25 21:02:58 +01:00
Elias Naur bc01180519 piet-gpu/shader: delete unused is_fill from elements.comp
Delete debug code as well.

Signed-off-by: Elias Naur <mail@eliasnaur.com>
2020-10-25 20:59:54 +01:00
Elias Naur 8fab45544e shader: implement clip paths
Expand the final kernel4 stage to maintain a per-pixel mask.

Introduce two new path elements, FillMask and FillMaskInv, to fill
the mask. FillMask acts like Fill, while FillMaskInv fills the area
outside the path.

An SVG clipPath is then representable by a FillMaskInv(0.0) for every nested
path, preceded by a FillMask(1.0) to clear the mask.

The bounding box for FillMaskInv elements is the entire screen; tightening of
the bounding box is left for future work. Note that a fullscreen bounding
box is not hopelessly inefficient because completely filling a tile with
a mask is just a single CmdSolidMask per tile.
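
Sketched in kernel4 terms (the mix() formulation and names are illustrative;
area is the pixel's coverage of the path):

    // Each mask element carries a value; coverage selects where it applies.
    if (tag == Cmd_FillMask) {
        mask = mix(mask, value, area);        // apply inside the path
    } else if (tag == Cmd_FillMaskInv) {
        mask = mix(mask, value, 1.0 - area);  // apply outside the path
    }
    // FillMask(1.0) over the whole screen resets the mask; FillMaskInv(0.0)
    // per nested path then zeroes everything outside that path.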

Fixes #30

Signed-off-by: Elias Naur <mail@eliasnaur.com>
2020-10-09 13:20:26 +02:00
Elias Naur 55cfd472a5 shader: delete unused code
Signed-off-by: Elias Naur <mail@eliasnaur.com>
2020-10-09 13:20:26 +02:00
Elias Naur 9be0faba6f piet-gpu-types: remove unused scene elements
Delete the image compute shader as well; it is unused.

Signed-off-by: Elias Naur <mail@eliasnaur.com>
2020-09-27 18:57:53 +02:00
Elias Naur fa9bf0dc2b piet-gpu-types: remove unused ptcl types
Signed-off-by: Elias Naur <mail@eliasnaur.com>
2020-09-27 18:30:33 +02:00
Elias Naur dceb0f9412 piet-gpu-types: remove unused annotated types
Signed-off-by: Elias Naur <mail@eliasnaur.com>
2020-09-21 10:55:58 +02:00
Elias Naur ac3ac3ddff shader: introduce a crude setting for adjusting the maximum workgroup size
Both the Vulkan and OpenGL ES specs allow implementations to limit workgroups to
128 threads. Add an LG_WG_FACTOR setting for easy switching between 128 and 256
threads, with 256 kept as the default.
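
In sketch form (only LG_WG_FACTOR comes from the patch; the other macros are
illustrative):

    #define LG_WG_FACTOR 1                   // 0 -> 128 threads, 1 -> 256
    #define WG_SIZE (128 << LG_WG_FACTOR)
    layout(local_size_x = WG_SIZE) in;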

Manually tested that LG_WG_FACTOR = 0 (128 threads) works as expected.

Signed-off-by: Elias Naur <mail@eliasnaur.com>
2020-09-13 13:04:13 +02:00
Elias Naur 326f7f0d03 shader: delete more unused code and variables
Signed-off-by: Elias Naur <mail@eliasnaur.com>
2020-09-13 13:03:56 +02:00