When color management is disabled, the fragment shader was still first
ensuring straight alpha and then immediately just going back to
pre-multiplied. This is near-impossible for a shader compiler to
optimize out, I guess because of the if-statement to handle division by
zero. Having view alpha applied in between certainly didn't make it
easier.
That causes extra fragment computations that are unnecessary. In the
issue report this was found to cause a notable performance regression.
Fix the performance regression by introducing special-case paths for
when straight alpha is not needed. This skips the unnecessary
computations.
Fixes: https://gitlab.freedesktop.org/wayland/weston/-/issues/623
Fixes: 9a6a4e7032
Signed-off-by: Pekka Paalanen <pekka.paalanen@collabora.com>
(cherry picked from commit 6234cb98d1)
Dropped SHADER_COLOR_MAPPING_IDENTITY as that is not available in weston
10.0.
This adds the initial dma-buf feedback implementation, following the
changes in the dma-buf protocol extension.
The initial dma-buf feedback implementation gives support to send
default feedback and per-surface feedback. For now the per-surface
feedback support is very basic and is still not implemented in the
DRM-backend, what basically means that KMS plane's formats/modifiers are
not being exposed to clients. In the next commits of this series we add
the DRM-backend implementation.
This patch is based on previous work of Scott Anderson (@ascent).
Signed-off-by: Leandro Ribeiro <leandro.ribeiro@collabora.com>
Signed-off-by: Scott Anderson <scott.anderson@collabora.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
Add function to query the DRM device given an EGLDisplay. It is the
device being used by the compositor to perform composition.
This will be useful in the next commits of this series, where we add
support for dma-buf feedback.
Signed-off-by: Scott Anderson <scott.anderson@collabora.com>
Signed-off-by: Leandro Ribeiro <leandro.ribeiro@collabora.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
These formats will be eventually be useful for color managed clients
using wl_shm that wish to submit buffers encoding high dynamic range
images.
While the minimum requirement for linearly filterable half float
textures is GL ES 2.0 + GL_OES_texture_half_float_linear, to keep
the code simple, this commit only enables the new formats when
the requirements for color management (notably including GL ES 3.0
and GL_EXT_color_buffer_half_float) are available.
Signed-off-by: Manuel Stoeckl <code@mstoeckl.com>
Adding these formats makes it possible for clients using wl_shm to
submit buffers with 10 bits per pixel, and thus (if Weston is
configured with an xrgb2101010 frame buffer) display more precise
colors on some computer monitors.
Signed-off-by: Manuel Stoeckl <code@mstoeckl.com>
My reading of the GL spec is that a dmabuf becomes a sibling to the
EGLImage created from it, and that all updates to the dmabuf will be
propagated to the EGLImage.
A rebind is still required every time the dmabuf content changes,
but this should be satisfied by gl_renderer_attach(), which does
a rebind when the buffer is commit.
Signed-off-by: Derek Foreman <derek.foreman@collabora.com>
So, turns out the GL implementation is allowed to destroy EGLImage
sources if this isn't set. Apparently none we've ever been tested on do
this, but it looks like we should be setting this anyway.
Signed-off-by: Derek Foreman <derek.foreman@collabora.com>
EGL_KHR_partial_update can be implemented independently of
EGL_EXT_buffer_age so we handle each case seperately.
Signed-off-by: Ben Davis <ben.davis@arm.com>
Signed-off-by: Dennis Tsiang <dennis.tsiang@arm.com>
Potential failures when creating the EGL image could cause an incorrect
number of num images (num_planes > num_images). With this change
egl_image_unref() requires an additional check to avoid any potential NULL
derefs when cleaning up. We do it straight in egl_image_unref() instead
of adding guards all over the necessary parts.
Signed-off-by: Marius Vlad <marius.vlad@collabora.com>
As observed on some platforms, importing known DMA buffers can cause
failures, leading to an attempt of destroyng an EGL image not set. This patch
resets the num_images such that loop becomes inert when destroying the
DMA buffer, and avoids passing an egl image to it.
The initial import doesn't have this issue as it sets the num_images in
case it succeeds. This also corrects the assumption that the num_images
were 0 at that point which, if the initial import succeded, was actually set
to 1.
Signed-off-by: Marius Vlad <marius.vlad@collabora.com>
Use the blending to output color space transformation when blitting from
the shadow to a framebuffer.
This allows the blending and output color spaces to differ as long as
shadow is used, in case a backend does not off-load the color
transformation.
Signed-off-by: Pekka Paalanen <pekka.paalanen@collabora.com>
Use the sRGB to output color space transformation when blitting the
borders (decorations) into an output window (nested compositor).
Nested output does not need to be sRGB anymore, as far as the
decorations are concerned.
Signed-off-by: Pekka Paalanen <pekka.paalanen@collabora.com>
Use the sRGB to blending color space transformation for the censoring
color fill and triangle fan debug drawings.
This removes the assert that the output's blending space is (non-linear)
sRGB.
Signed-off-by: Pekka Paalanen <pekka.paalanen@collabora.com>
This makes weston_color_transform object be able to express
three-channel one-dimensional look-up table transformations. They are
useful for applying EOTF and EOTF^-1 mapping, or, gamma curves. They
will also be useful in optimizing a following 3D LUT tap distribution
once support for 3D LUT is added.
The code added here translates from the lut_3x1d fill_in() interface to
a GL texture to be used with SHADER_COLOR_CURVE_LUT_3x1D for
weston_surfaces.
It demonstrates how renderer data is attached to weston_color_transform
and cached.
GL_OES_texture_float_linear is required to be able to use bilinear
texture filtering with 32-bit floating-point textures, used for the LUT.
As the size of the LUT depends on what implements it, lut_3x1d fill_in()
interface is a callback to the color management component to ask for an
arbitrary size. For GL-renderer this is not important as it can easily
realize any LUT size, but when DRM-backend wants to offload the EOTF^-1
mapping to KMS (GAMMA_LUT), the LUT size comes from KMS.
Nothing actually implements lut_3x1d fill_in() yet, that will come in a
later patch.
Signed-off-by: Pekka Paalanen <pekka.paalanen@collabora.com>
This adds shader support for using a three-channel one-dimensional
look-up table for de/encoding input colors. This operation will be useful
for applying EOTF or its inverse, in other words, gamma curves. It will
also be useful in optimizing a following 3D LUT tap distribution once
support for 3D LUT is added.
Even though called three-channel and one-dimensional, it is actually
implemented as a one-channel two-dimensional texture with four rows.
Each row corresponds to a source color channel except the fourth one is
unused. The reason for having the fourth row is to get texture
coordinates in 1/8 steps instead of 1/6 steps. 1/6 may would not be
exact in floating- or fixed-point arithmetic and might perhaps risk
unintended results from bilinear texture filtering when we want linear
filtering only in x but not in y texture coordinates. I may be paranoid.
The LUT is applied on source colors after they have been converted to
straight RGB. It cannot be applied with pre-multiplied alpha. A LUT can
be used for both applying EOTF to go from source color space to blending
color space, and EOTF^-1 to go from blending space to output
(electrical) space. However, this type of LUT cannot do color space
conversions.
For now, this feature is hardcoded to off everywhere, to be enabled in
following patches.
Signed-off-by: Pekka Paalanen <pekka.paalanen@collabora.com>
Always when supported, make the fragment shader default floating point
precision high. The medium precision is roughly like half-floats, which
can be surprisingly bad. High precision does not reach even normal
32-bit float precision (by specification), but it's better. GL ES
implementations are allowed to exceed the minimum precision requirements
given in the specification.
This is an advance attempt to avoid nasty surprises from poor shader
precision.
Signed-off-by: Pekka Paalanen <pekka.paalanen@collabora.com>
Add a new shader requirements bit input_is_premult which says whether
the texture sampling results in premultiplied alpha or not. Currently
this can be deduced fully from the shader texture variant, but in the
future there might a protocol extension to explicitly control it. Hence
the need for a new bit.
yuva2rgba() is changed to produce straight alpha always. This makes
sample_input_texture() sometimes produce straight or premultiplied
alpha. The input_is_premult bit needs to match sample_input_texture()
behavior. Doing this should save three multiplications in the shader for
straight alpha formats.
Signed-off-by: Pekka Paalanen <pekka.paalanen@collabora.com>
This creates the FP16 shadow framebuffer automatically if the color
transformation from blending space to output space is not identity and
the backend does not claim to implement it on the renderer's behalf.
That makes the weston_output_set_renderer_shadow_buffer() API and
use-renderer-shadow weston.ini option obsolete.
To still cater for the one test that needs to enable the shadow
framebuffer in spite of not needing it for color correct blending, the
quirk it uses now also forces the shadow.
Signed-off-by: Pekka Paalanen <pekka.paalanen@collabora.com>
Compile time constants play an important role in keeping the shader
programs fast. Introduce an informal annotation to mark compile time
constants to make the shader code easier to reason with.
This will make much more sense once functions with compile time constant
parameters are added.
Signed-off-by: Pekka Paalanen <pekka.paalanen@collabora.com>
Trying to support GL ES 2.0 + extensions along with GL ES 3.0 for better
control is becoming too complicated fast. In this patch you see the
GL_RGBA vs. GL_RBA16F and GL_HALF_FLOAT vs. GL_HALF_FLOAT_OES paths.
More such cases will come, e.g. GL_RED_EXT vs. GL_R32F.
Make things simpler and require GL ES 3.0 +
GL_EXT_color_buffer_half_float for all color management related
functionality. If one doesn't have GL ES 3.0, all you lose is color
management.
Also, all extensions needed by color transformation operations are
gathered under one boolean flag instead of having a flag per extension,
again for simplicity.
This makes the GL ES extension handling much easier.
Signed-off-by: Pekka Paalanen <pekka.paalanen@collabora.com>
This reverts commit 36d699a164.
A different way to fix this same issue is the previous commit
"gl-renderer: do not unbind the context on output destroy"
which is needed for other reasons.
Signed-off-by: Pekka Paalanen <pekka.paalanen@collabora.com>
If we make EGL_NO_CONTEXT current, all following GL calls are
no-ops. This will be a problem when gl-renderer introduces
gl_renderer_color_transform containing GL textures and needs to destroy
them when weston_color_transform is destroyed. Mesa would print the the
warning that glDeleteTextures is no-op.
To fix this, keep our GL context current when destroying a GL output.
In case EGL_KHR_surfaceless_context is not available, we must use
dummy_surface.
Signed-off-by: Pekka Paalanen <pekka.paalanen@collabora.com>
This is needed when the compositor produces any content internally:
- the lines in triangle fan debug
- the censoring color fill (unmet HDCP requirements)
Solid color surfaces do not need this special-casing because
weston_surface is supposed to carry color space information, which will
get used in gl_shader_config_init_for_view().
This makes sure the internally produced graphics fit in, e.g on a
monitor in HDR mode.
For now, just ensure there is an identity transformation. Actual
implementations in GL-renderer will follow later.
Signed-off-by: Pekka Paalanen <pekka.paalanen@collabora.com>
This is needed when drawing anything internal directly to an output,
like the borders/decorations in a nested compositor setup. This makes
the assumption that the internal stuff starts in sRGB, which should be
safe. As borders are never blended with other content, this should also
be sufficient.
This patch is a reminder that that path exists, rather than a real
implementation. To be implemented when someone needs it.
Signed-off-by: Pekka Paalanen <pekka.paalanen@collabora.com>
This is the blending space to monitor space color transform. It needs to
be implemented in the renderers, unless a backend sets
from_blend_to_output_by_backend = true, in which case the backend does
it and the renderer does not.
The intention is that from_blend_to_output_by_backend can be toggled
frame by frame to allow backends to react to dynamic change of output
color profile.
For now, renderers just assert that they don't need to do anything for
output color transform.
Signed-off-by: Pekka Paalanen <pekka.paalanen@collabora.com>
See: https://gitlab.freedesktop.org/wayland/weston/-/issues/467#note_814985
This starts building the framework required for implementing color
management.
The main new interface is struct weston_color_manager. This commit also
adds a no-op color manager implementation, which is used if no other
color manager is loaded. This no-op color manager simply provides
identity color transforms for everything, so that Weston keeps running
exactly like before.
weston_color_manager interface is incomplete and will be extended later.
Colorspace objects are not introduced in this commit. However, when
client content colorspace and output colorspace definitions are
combined, they will produce color transformations from client content to
output blending space and from output blending space to output space.
This commit introduces a placeholder struct for color transforms,
weston_color_transform. Objects of this type are expected to be heavy to
create and store, which is why they are designed to be shared as much as
possible, ideally making their instances unique. As color transform
description is intended to be generic in libweston core, renderers and
backends are expected to derive their own state for each transform
object as necessary. Creating and storing the derived state maybe be
expensive as well, more the reason to re-use these objects as much as
possible. E.g. GL-renderer might upload a 3D LUT into a texture and keep
the texture around. DRM-backend might create a KMS blob for a LUT and
keep that around.
As a color transform depends on both the surface and the output, a
transform object may need to be created for each unique pair of them.
Therefore color transforms are referenced from weston_paint_node. As
paint nodes exist for not just surface+output but surface+view+output
triplets, the code ensures that all paint nodes (having different view)
for the same surface+output have the same color transform state.
As a special case, if weston_color_transform is NULL, it means identity
transform. This short-circuits some checks and memory allocations, but
it does mean we use a separate member on weston_paint_node to know if
the color transform has been initialized or not.
Color transformations are pre-created at the weston_output
paint_node_z_order_list creation step. Currently the z order lists
contain all views globally, which means we populate color transforms we
may never need, e.g. a view is never shown on a particular output.
This problem should get fixed naturally when z order lists are
constructed "pruned" in the future: to contain only those paint nodes
that actually contribute to the output's image.
As nothing actually supports color transforms yet, both renderers and
the DRM-backend assert that they only get identity transforms. This
check has the side-effect that all surface-output pairs actually get a
weston_surface_color_transform_ref even though it points to NULL
weston_color_transform.
This design is inspired by Sebastian Wick's Weston color management
work.
Co-authored-by: Sebastian Wick <sebastian@sebastianwick.net>
Signed-off-by: Pekka Paalanen <pekka.paalanen@collabora.com>
A following patch will need the paint node in
gl_shader_config_init_for_view() for color transformations.
While passing the paint node through, rename the functions to be more
appropriate and get surface/view/output from the paint node.
This is a pure refactoring with no functional change.
Signed-off-by: Pekka Paalanen <pekka.paalanen@collabora.com>
Iterate paint nodes instead of the global view list. Right now this does
not change behavior.
This is a step towards using per-output view lists that can then be
optimized for the output in libweston core.
Signed-off-by: Pekka Paalanen <pekka.paalanen@collabora.com>
In "backend-drm: simplify compile time array copy", ARRAY_COPY was
introduced to be used by the DRM-backend.
In this patch we expand its usage to other code where hardcoded arrays
are being copied.
Signed-off-by: Leandro Ribeiro <leandro.ribeiro@collabora.com>
EGL implementations have no way to tell that implicit modifiers are not
supported. So Weston must consider that implicit modifiers are
supported. Always add DRM_FORMAT_MOD_INVALID to formats that we query
from EGL.
The implication is that clients using dmabuf extension may pick
DRM_FORMAT_MOD_INVALID to allocate their buffers, and so these buffers
will not be directly imported to KMS and placed in planes. See commit
"backend-drm: do not import dmabuf buffers with no modifiers to KMS" for
more details.
But we should not avoid advertising DRM_FORMAT_MOD_INVALID in the dmabuf
protocol just because we hope that the client don't choose it, it's not
our choice.
Signed-off-by: Leandro Ribeiro <leandro.ribeiro@collabora.com>
In commit "libweston: add struct weston_drm_format" struct
weston_drm_format and its helper functions were added to libweston.
The functions query_dmabuf_formats and query_dmabuf_modifiers are very
specific to GL-renderer and its internals. So instead of exposing them
in libweston, query and store DRM formats and modifiers internally in
GL-renderer. Also, add a vfunction to struct weston_renderer in order
to retrieve the formats.
Signed-off-by: Leandro Ribeiro <leandro.ribeiro@collabora.com>
Now that pieces of color management implementation start to land, the
fallback shader becomes even more special than before. It is the only
case where the compositor ignores color management.
Signed-off-by: Pekka Paalanen <pekka.paalanen@collabora.com>
The texture target can be uniquely inferred from the shader variant, so
do not store texture target separately.
Shortens the code a bit.
Signed-off-by: Pekka Paalanen <pekka.paalanen@collabora.com>
Replace the shader_requirements with just shader_variant. The variant is
the only thing gl_surface_state will actually carry. All the other
requirements fields are always unused.
Co-authored-by: Sebastian Wick <sebastian@sebastianwick.net>
Signed-off-by: Pekka Paalanen <pekka.paalanen@collabora.com>
This patch gathers all values to be loaded to shader uniforms into a new
struct gl_shader_config along with texture target and filter
information. Struct gl_shader becomes opaque outside of gl-shaders.c.
Everything that used or open-coded these are converted.
The aim is to make gl-renderer.c easier to read. Previously, uniform
values were loaded up in various places, texture units were set up in
one place, textures were bound into units in different places. Stuff was
all over the place.
Now, shader requirements and associated uniform data is stored in a
single struct. The data is loaded into a shader program in one function
only.
That makes it easy for things like maybe_censor_override() to replace
the whole config rather than poke only the shader requirements. This may
not look like much right now, but when color management adds more
uniforms and even hardcoded color need to go through the proper color
pipeline, doing things the old way would become intractable.
Similar simplification can be seen in draw_view(), where the RGBA->RGBX
override becomes more contained. There is no longer a need to "pre-load"
the shader used by triangle fan debug. Triangle fan debug no longer
needs to play tricks with saving and restoring the current shader.
The real benefit of this change will probably come when almost all
shader operations need to take color spaces into account. That means
filling in gl_shader_config parts based on a color transformation.
This is based on an idea Sebastian already used in his Weston color
management work.
Co-authored-by: Sebastian Wick <sebastian@sebastianwick.net>
Signed-off-by: Pekka Paalanen <pekka.paalanen@collabora.com>
Avoid looking up 'gr' from view->compositor by passing it explicitly
into the functions needing it.
Also fixes the whitespace in repaint_region() signature.
Clarifies code by removing local variables, but also future changes will
need 'gr' more.
Signed-off-by: Pekka Paalanen <pekka.paalanen@collabora.com>
A future change will call this function from draw_view(), so move it
upwards to avoid adding a function declaration.
No functional or even cosmetic change.
Signed-off-by: Pekka Paalanen <pekka.paalanen@collabora.com>
These functions are related to shaders, so they are more at home in
gl-shaders.c. gl-renderer.c is too long already.
This allows making a couple functions static while the moved functions
become non-static. Future changes turn some of these functions into
static again, with the ultimate goal of making struct gl_shader opaque.
Signed-off-by: Pekka Paalanen <pekka.paalanen@collabora.com>
This adds a heuristic for freeing shader programs that have not been
needed for a while. The intention is to stop Weston accumulating shader
programs indefinitely, especially in the future when color management
will explode the number of possible different shader programs.
Shader programs that have not been used in the past minute are freed,
except always keep the ten most recently used shader programs anyway.
The former rule is to ensure we keep shader programs that are actively
used regardless of how many. The latter rule is to prevent freeing too
many shader programs after Weston has been idle for a long time and then
repaints just a small area. Many of the shader programs could still be
relevant even though not needed in the first repaint after idle.
The numbers ten and one minute in the above are arbitrary and not based
on anything.
These heuristics are simpler to implement than e.g. views taking
references on shader programs. Expiry by time allows shader programs to
survive a while even after their last user is gone, with the hope of
being re-used soon. Tracking actual use instead of references also
adapts to what is actually visible rather than what merely exists.
Keeping the shader list in most recently used order might also make
gl_renderer_get_program() more efficient on average.
last_repaint_start time is used for shader timestamp to avoid calling
clock_gettime() more often. Adding that variable is an ABI break, but
libweston major has already been bumped to 10 since last release.
Signed-off-by: Pekka Paalanen <pekka.paalanen@collabora.com>
This is useful for seeing that the shader program garbage collection
works in a future patch.
Signed-off-by: Pekka Paalanen <pekka.paalanen@collabora.com>
One more thing is coming to need this, so add the compositor pointer and
migrate existing places to use it where it simplifies things.
Signed-off-by: Pekka Paalanen <pekka.paalanen@collabora.com>
I have verified that the conversion here follows ITU-R BT.601 except for
the offsets 16/256 and 128/256 which should be 16/255 and 128/255
respectively.
I used to following octave script to verify this:
rf = 0.299;
gf = 0.587;
bf = 0.114;
crdiv = 1.402;
cbdiv = 1.772;
M = [ rf, gf, bf ;
-rf / cbdiv, -gf / cbdiv, (1 - bf) / cbdiv;
(1 - rf) / crdiv, -gf / crdiv, -bf / crdiv ];
YCbCr = [ 'Y'; 'Cb'; 'Cr' ];
RGB = [ 'R'; 'G'; 'B' ];
eq = [ ' '; '='; ' ' ];
l = [ ' [ '; ' [ '; ' [ ' ];
r = [ ' ] '; ' ] '; ' ] ' ];
mat = [
sprintf('%9f %9f %9f', M(1,:));
sprintf('%9f %9f %9f', M(2,:));
sprintf('%9f %9f %9f', M(3,:));
];
[ l YCbCr r eq l mat r l RGB r ]
R = inv(M);
mat = [
sprintf('%9f %9f %9f', R(1,:));
sprintf('%9f %9f %9f', R(2,:));
sprintf('%9f %9f %9f', R(3,:));
];
[ l RGB r eq l mat r l YCbCr r ]
[ R(:,1), R(:,2:3) .* (255/224) ]
The final matrix printed is what the shader uses down to +/- one digit,
so at least 7 correct decimals.
Signed-off-by: Pekka Paalanen <pekka.paalanen@collabora.com>