weston

Commit Graph

Author	SHA1	Message	Date
Pekka Paalanen	57d32722a2	gl-renderer: simplify main() in frag By moving the application of view_alpha after pre-multiplication we can simplify main() considerably. The cost is that for straight-alpha input or color_pipeline() we might be doing three multiplications more than before. However, a) the cost of running color_pipeline() probably dominates anyway, and b) to get straight-alpha input you have to use a future Wayland extension that probably won't be advertised without color management. So we keep the optimization for the simple case (no color management) while potentially incurring a small cost on the heavy case (with color management). Thanks to Pierre-Yves Mordred for the inspiration in https://gitlab.freedesktop.org/wayland/weston/-/merge_requests/889#note_1411774 Signed-off-by: Pekka Paalanen <pekka.paalanen@collabora.com>	3 years ago
Pekka Paalanen	932c374779	gl-renderer: move undo-premult to color_pipeline() Now that we have the if-else ladder to call color_pipeline() only when necessary, and since only color_pipeline() needs undo-premult, move undo-premult into color_pipeline(). This is a small step towards improving code readability. Signed-off-by: Pekka Paalanen <pekka.paalanen@collabora.com>	3 years ago
Pekka Paalanen	924b94bc94	gl-renderer: call it view_alpha in frag We always talk about "view alpha", so name the variable in the fragment shader the same. Now it's clear without the comments, making the code easier to read overall. Signed-off-by: Pekka Paalanen <pekka.paalanen@collabora.com>	3 years ago
Pekka Paalanen	f31de214d9	gl-renderer: fix performance regression in frag When color management is disabled, the fragment shader was still first ensuring straight alpha and then immediately just going back to pre-multiplied. This is near-impossible for a shader compiler to optimize out, I guess because of the if-statement to handle division by zero. Having view alpha applied in between certainly didn't make it easier. That causes extra fragment computations that are unnecessary. In the issue report this was found to cause a notable performance regression. Fix the performance regression by introducing special-case paths for when straight alpha is not needed. This skips the unnecessary computations. Fixes: https://gitlab.freedesktop.org/wayland/weston/-/issues/623 Fixes: `9a6a4e7032` Signed-off-by: Pekka Paalanen <pekka.paalanen@collabora.com> (cherry picked from commit `6234cb98d1`) Dropped SHADER_COLOR_MAPPING_IDENTITY as that is not available in weston 10.0.	3 years ago
Pekka Paalanen	6234cb98d1	gl-renderer: fix performance regression in frag When color management is disabled, the fragment shader was still first ensuring straight alpha and then immediately just going back to pre-multiplied. This is near-impossible for a shader compiler to optimize out, I guess because of the if-statement to handle division by zero. Having view alpha applied in between certainly didn't make it easier. That causes extra fragment computations that are unnecessary. In the issue report this was found to cause a notable performance regression. Fix the performance regression by introducing special-case paths for when straight alpha is not needed. This skips the unnecessary computations. Fixes: https://gitlab.freedesktop.org/wayland/weston/-/issues/623 Fixes: `9a6a4e7032` Signed-off-by: Pekka Paalanen <pekka.paalanen@collabora.com>	3 years ago
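The round trip described in the commit above can be illustrated with a small sketch. This is not the shader code: the function names (`unpremultiply`, `premultiply`, `shade_round_trip`, `shade_direct`) are hypothetical, and the sketch only assumes that the input is pre-multiplied and that no color pipeline step runs in between. Under those assumptions, going to straight alpha and back gives the same result as simply scaling all four components by the view alpha.

```python
def unpremultiply(rgba):
    """Convert pre-multiplied RGBA to straight alpha, guarding division by zero."""
    r, g, b, a = rgba
    if a == 0.0:
        return (0.0, 0.0, 0.0, 0.0)
    return (r / a, g / a, b / a, a)


def premultiply(rgba):
    """Convert straight-alpha RGBA to pre-multiplied alpha."""
    r, g, b, a = rgba
    return (r * a, g * a, b * a, a)


def shade_round_trip(rgba_premult, view_alpha):
    """Hypothetical slow path: straight alpha, view alpha, then pre-multiply again."""
    r, g, b, a = unpremultiply(rgba_premult)
    return premultiply((r, g, b, a * view_alpha))


def shade_direct(rgba_premult, view_alpha):
    """Hypothetical special-case path: stay pre-multiplied, scale every component."""
    return tuple(c * view_alpha for c in rgba_premult)


sample = (0.25, 0.5, 0.125, 0.5)
slow = shade_round_trip(sample, 0.5)
fast = shade_direct(sample, 0.5)
```

The direct path avoids the branch and the three divisions, which is the kind of work the commit message says a shader compiler is unlikely to remove on its own.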
Vitaly Prosyak	93c6180c71	gl-renderer: shaders implementation of color mapping function The following GL extensions provide support for shaders CM: -GL_OES_texture_float_linear makes GL_RGB32F linear filterable. -GL ES 3.0 provides Texture3D support in GL API. -GL_OES_texture_3D provides sampler3D support in ESSL 1.00. If all of the above are supported, the renderer sets the flag WESTON_CAP_COLOR_OPS, which means that all fields in struct weston_color_transform are supported, for example, 1DLUT and 3DLUT. Use GL_OES_texture_3D to implement the 3DLUT function, which uses trilinear interpolation for pixel processing or bypasses it as is. Quote from https://nick-shaw.github.io/cinematiccolor/luts-and-transforms.html "3D LUTs have long been embraced by color scientists and are one of the tools commonly used for gamut mapping. In fact, 3D LUTs are used within ICC profiles to model the complex device behaviors necessary for accurate color image reproduction". Quote from https://developer.nvidia.com/gpugems/gpugems2/part-iii-high-quality-rendering/chapter-24-using-lookup-tables-accelerate-color is about interpolation: "By generating intermediate results based on a weighted average of the eight corners of the bounding cube, this algorithm is typically sufficient for color processing, and it is implemented in graphics hardware". Signed-off-by: Vitaly Prosyak <vitaly.prosyak@amd.com>	3 years ago
Pekka Paalanen	9a6a4e7032	gl-renderer: implement SHADER_COLOR_CURVE_LUT_3x1D This adds shader support for using a three-channel one-dimensional look-up table for de/encoding input colors. This operation will be useful for applying EOTF or its inverse, in other words, gamma curves. It will also be useful in optimizing a following 3D LUT tap distribution once support for 3D LUT is added. Even though called three-channel and one-dimensional, it is actually implemented as a one-channel two-dimensional texture with four rows. Each row corresponds to a source color channel except the fourth one is unused. The reason for having the fourth row is to get texture coordinates in 1/8 steps instead of 1/6 steps. 1/6 may not be exact in floating- or fixed-point arithmetic and might perhaps risk unintended results from bilinear texture filtering when we want linear filtering only in x but not in y texture coordinates. I may be paranoid. The LUT is applied on source colors after they have been converted to straight RGB. It cannot be applied with pre-multiplied alpha. A LUT can be used for both applying EOTF to go from source color space to blending color space, and EOTF^-1 to go from blending space to output (electrical) space. However, this type of LUT cannot do color space conversions. For now, this feature is hardcoded to off everywhere, to be enabled in following patches. Signed-off-by: Pekka Paalanen <pekka.paalanen@collabora.com>	4 years ago
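The 1/8 versus 1/6 argument in the commit above can be checked numerically. The sketch below assumes the usual normalized texture coordinate convention, where the center of row i in a texture with N rows sits at (2i + 1) / (2N). The helper names `row_centers` and `is_exact_binary_float` are made up for this illustration and do not come from the weston sources.

```python
from fractions import Fraction


def row_centers(rows):
    """Normalized y coordinates of the texel-row centers for a texture with `rows` rows."""
    return [Fraction(2 * i + 1, 2 * rows) for i in range(rows)]


def is_exact_binary_float(q):
    """True if the rational q survives conversion to a binary float unchanged."""
    return Fraction(float(q)) == q


four_rows = row_centers(4)   # steps of 1/8
three_rows = row_centers(3)  # steps of 1/6

four_rows_exact = all(is_exact_binary_float(y) for y in four_rows)
three_rows_exact = all(is_exact_binary_float(y) for y in three_rows)
```

With four rows every center is a multiple of 1/8 and therefore exactly representable in binary floating point. With three rows only the middle center (1/2) is exact, so sampling could land slightly off a row center and pick up a contribution from a neighboring row under bilinear filtering.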
Pekka Paalanen	391f513c36	gl-renderer: fragment shader precision to high Always when supported, make the fragment shader default floating point precision high. The medium precision is roughly like half-floats, which can be surprisingly bad. High precision does not reach even normal 32-bit float precision (by specification), but it's better. GL ES implementations are allowed to exceed the minimum precision requirements given in the specification. This is an advance attempt to avoid nasty surprises from poor shader precision. Signed-off-by: Pekka Paalanen <pekka.paalanen@collabora.com>	4 years ago
Pekka Paalanen	4d5b2f3410	gl-renderer: add shader bit input_is_premult Add a new shader requirements bit input_is_premult which says whether the texture sampling results in premultiplied alpha or not. Currently this can be deduced fully from the shader texture variant, but in the future there might be a protocol extension to explicitly control it. Hence the need for a new bit. yuva2rgba() is changed to produce straight alpha always. This makes sample_input_texture() sometimes produce straight or premultiplied alpha. The input_is_premult bit needs to match sample_input_texture() behavior. Doing this should save three multiplications in the shader for straight alpha formats. Signed-off-by: Pekka Paalanen <pekka.paalanen@collabora.com>	4 years ago
Pekka Paalanen	37fe6fde49	gl-renderer: define fragment shader compile_const Compile time constants play an important role in keeping the shader programs fast. Introduce an informal annotation to mark compile time constants to make the shader code easier to reason with. This will make much more sense once functions with compile time constant parameters are added. Signed-off-by: Pekka Paalanen <pekka.paalanen@collabora.com>	4 years ago
Pekka Paalanen	9a59303a4f	gl-renderer: doc YCbCr-RGB conversion I have verified that the conversion here follows ITU-R BT.601 except for the offsets 16/256 and 128/256 which should be 16/255 and 128/255 respectively. I used the following Octave script to verify this: rf = 0.299; gf = 0.587; bf = 0.114; crdiv = 1.402; cbdiv = 1.772; M = [ rf, gf, bf ; -rf / cbdiv, -gf / cbdiv, (1 - bf) / cbdiv; (1 - rf) / crdiv, -gf / crdiv, -bf / crdiv ]; YCbCr = [ 'Y'; 'Cb'; 'Cr' ]; RGB = [ 'R'; 'G'; 'B' ]; eq = [ ' '; '='; ' ' ]; l = [ ' [ '; ' [ '; ' [ ' ]; r = [ ' ] '; ' ] '; ' ] ' ]; mat = [ sprintf('%9f %9f %9f', M(1,:)); sprintf('%9f %9f %9f', M(2,:)); sprintf('%9f %9f %9f', M(3,:)); ]; [ l YCbCr r eq l mat r l RGB r ] R = inv(M); mat = [ sprintf('%9f %9f %9f', R(1,:)); sprintf('%9f %9f %9f', R(2,:)); sprintf('%9f %9f %9f', R(3,:)); ]; [ l RGB r eq l mat r l YCbCr r ] [ R(:,1), R(:,2:3) .* (255/224) ] The final matrix printed is what the shader uses down to +/- one digit, so at least 7 correct decimals. Signed-off-by: Pekka Paalanen <pekka.paalanen@collabora.com>	4 years ago
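For readers without Octave, the numeric part of the script in the commit above can be reproduced roughly as follows. This is a plain-Python sketch that uses the same constants and the same matrix M. The helper `inverse_3x3` is written here for illustration only, and the pretty-printing from the original script is left out.

```python
def inverse_3x3(m):
    """Invert a 3x3 matrix via its adjugate and determinant."""
    (a, b, c), (d, e, f), (g, h, i) = m
    det = a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)
    return [
        [(e * i - f * h) / det, (c * h - b * i) / det, (b * f - c * e) / det],
        [(f * g - d * i) / det, (a * i - c * g) / det, (c * d - a * f) / det],
        [(d * h - e * g) / det, (b * g - a * h) / det, (a * e - b * d) / det],
    ]


# Luma weights and chroma divisors, copied from the Octave script.
rf, gf, bf = 0.299, 0.587, 0.114
crdiv, cbdiv = 1.402, 1.772

# RGB -> YCbCr matrix (rows: Y, Cb, Cr).
M = [
    [rf, gf, bf],
    [-rf / cbdiv, -gf / cbdiv, (1 - bf) / cbdiv],
    [(1 - rf) / crdiv, -gf / crdiv, -bf / crdiv],
]

# YCbCr -> RGB matrix (rows: R, G, B; columns: Y, Cb, Cr).
R = inverse_3x3(M)

# Same final step as the script: scale the two chroma columns by 255/224.
scale = 255 / 224
R_scaled = [[row[0], row[1] * scale, row[2] * scale] for row in R]

for row in R_scaled:
    print(" ".join(f"{value:10.8f}" for value in row))
```

The unscaled inverse has the familiar BT.601 form, with 1.402 as the Cr coefficient for red and 1.772 as the Cb coefficient for blue. The printed matrix is the counterpart of the last expression in the Octave script.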
Pekka Paalanen	2b5a863974	gl-renderer: move view alpha out of sample_input_texture() Sampling input texture has nothing to do with view alpha. This clarifies the code structure. Signed-off-by: Pekka Paalanen <pekka.paalanen@collabora.com>	4 years ago
Pekka Paalanen	3f6be39f94	gl-renderer: factor out sample_input_texture() Reading the input texture is just one part of the future color pipeline, so separate it into a function of its own. This makes it easier to add more steps to the pipeline, and shows the green tint is separate as well. Making use of early returns, reducing the if-else ladder should help with readability. Sharing the call to yuva2rgba() likewise. Setting yuva.w = alpha is not shared though, in case support for AYUV format might be added in the future. Signed-off-by: Pekka Paalanen <pekka.paalanen@collabora.com>	4 years ago
Pekka Paalanen	d278015d00	gl-renderer: drop redundant texture lookups Do not call texture2D() in the shader when we already have the result. Simpler code, maybe even a little bit faster? Suggested-by: Harish Krupo <harishkrupo@gmail.com> Signed-off-by: Pekka Paalanen <pekka.paalanen@collabora.com>	4 years ago
Pekka Paalanen	a88144f9e1	gl-renderer: move magic constants into yuva2rgba() These same magic constants were used in all cases, so move them into a common place. While we are touching all these lines, also change from the four floats into a vec4. This allows further clean-up in the next patch. This makes the code easier to read. Behavior and results are unchanged. Signed-off-by: Pekka Paalanen <pekka.paalanen@collabora.com>	4 years ago
Pekka Paalanen	054ba37084	gl-renderer: move alpha pre-mult from YUV to RGB Mathematically the result is the same, while multiplying RGB with alpha is easier to understand as correct than the earlier form. Signed-off-by: Pekka Paalanen <pekka.paalanen@collabora.com>	4 years ago
Pekka Paalanen	a8d5ef4a04	gl-renderer: rename color uniform to unicolor A more unique name is easier to grep for. Using 'color' as a local variable might be useful in the future. Signed-off-by: Pekka Paalanen <pekka.paalanen@collabora.com>	4 years ago
Harish Krupo	7903c5e667	gl-renderer: Requirement based shader generation This patch modifies the shader generation code so that the shaders are stitched together based on the requirement instead of creating them during initialization. This is necessary for HDR use cases where each surface would have different properties based on which different de-gamma or tone mapping or gamma shaders are stitched together. v2: Use /* */ instead of // (Pekka) Move shader strings to gl-shaders.c file (Pekka) Remove Makefile.am changes (Pekka) Use a struct instead of uint32_t for storing requirements (Pekka) Clean up shader list on destroy (Pekka) Rename shader_release -> shader_destroy (Pekka) Move shader creation/deletion into gl-shaders.c (Pekka) Use create_shaders's multi string capability instead of concatenating (Pekka) v3: Add length check when adding shader string (Pekka) Signed-off-by: Harish Krupo <harishkrupo@gmail.com> v4: Rebased, PROTECTION_MODE_ENFORCED converted. Dropped unnecessary { }. Ported setup_censor_overrides(). Split out moving code into gl-shaders.c. Changed to follow "gl-renderer: rewrite fragment shaders", no more shader source stitching. Added SHADER_VARIANT_XYUV. Const'fy function arguments. Added gl_shader_requirements_cmp() and moved the early return in use_gl_program(). Moved use_gl_program() before first use in file. Split solid shader requirements by use case: requirements_censor and requirements_triangle_fan. Simplified fragment_debug_binding() since no need to force anything. Ensure struct gl_shader_requirements has no padding. This allows us to use normal C syntax instead of memset() and memcpy() when initializing or assigning. See also: https://gitlab.freedesktop.org/mesa/mesa/-/issues/2071 Make it also a bitfield to squeeze the size. v5: Move wl_list_insert() into gl_shader_create() (Daniel) Compare variant to explicit value. (Daniel) Change functions to gl_renderer_get_program, gl_renderer_use_program, and gl_renderer_use_program_with_view_uniforms. Use local variable instead of gr->current_shader. (Daniel) Simplified gl_renderer_get_program. Signed-off-by: Pekka Paalanen <pekka.paalanen@collabora.com>	4 years ago
Pekka Paalanen	5d64e66e06	gl-renderer: rename shader debug flag to green_tint The new name reflects better what it does. Signed-off-by: Pekka Paalanen <pekka.paalanen@collabora.com>	4 years ago
Pekka Paalanen	477bdc85c9	gl-renderer: rewrite fragment shaders The main goal of this patch is to improve the readability of how and what fragment shaders are generated. Instead of having C code that assembles each shader variant from literal string snippets, create one big fragment shader source that has everything in it. This relies on a GLSL compiler to optimize statically false conditions and unused uniforms away. Having all the fragment shader code in one file, uncluttered by C string literal syntax, improves readability significantly. A disadvantage is that the code is more verbose, but it allows comments much better. The actual shader code is kept unchanged except: - FRAGMENT_CONVERT_YUV macro is now a proper function - GLSL version is explicitly set to 1.00 ES - RGBA and EXTERNAL use the same path, the difference is how the sampler is declared Further shader code consolidation is possible, but is left for another time. Signed-off-by: Pekka Paalanen <pekka.paalanen@collabora.com>	4 years ago

21 Commits (be9f5bd99331c61ff18644035bf5d51ce2e43276)