This fixes an issue where a shader would be emitted using
EXT_clip_cull_distance even if the host didn't support the extension.
Fixes 072f30955b
Signed-off-by: Italo Nicola <italonicola@collabora.com>
Reviewed-by: John Bates <jbates@chromium.org>
Reviewed-by: Yiwei Zhang <zzyiwei@chromium.org>
Recording the uid from the fragment shader seems unnecessary when we
already track most of the information through vars_info.
Signed-off-by: Italo Nicola <italonicola@collabora.com>
Reviewed-by: Gert Wollny <gert.wollny@collabora.com>
This prevents shader recompilation when the guest only wants to
enable/disable clip plane in a compatibility profile. Can improve
performance if many shaders have to be recompiled because of this,
which happens in some games like Portal 2.
Signed-off-by: Italo Nicola <italonicola@collabora.com>
Reviewed-by: Gert Wollny <gert.wollny@collabora.com>
This prevents shader recompilation when the guest wants to enable/disable
alpha test. Can make a performance difference when lots of fragment
shaders suddenly have to be recompiled only because the application
called `glEnable(GL_ALPHA_TEST)`.
Signed-off-by: Italo Nicola <italonicola@collabora.com>
Reviewed-by: Gert Wollny <gert.wollny@collabora.com>
Sometimes we receive FS shaders where some output was optimized away.
In this case we have to set the output locations explicitly, but it is
no problem if we always do this. On GL this can always be done by using
glBindFragDataLocationIndexed, it is supported since at least GL 3.3,
and the alternative, setting the layouts in the shader, requires the same
OpenGL. On GLES we have to emit explict locations if EXT_blend_func_extended
is not supported. This fises rendering of "The Long Dark".
v2: - Emit EXT_blend_func_extended in FS if extension is available
since this is required when not emitting the explicit locations
(Robert Wenzel)
- call glBindFragDataLocationIndexedEXT on GLES because that's what the
extension requires
v3: use sizoef(buf) instead of hard-coding size (John)
Related: #223
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: maksym.wezdecki@collabora.com (v2)
Reviewed-by: John Bates <jbates@chromium.org>
Legacy FS shaders may use glColor and glSecondaryColor without the VS
actually emitting it. However, on GLES the interfaces between stages must
be matched, so replace FS input colors with a constant when they are not
emitted by the VS.
v2: - revert how the color_in_mask is populated
- handle back color too
- correctly appply the interface matching only on GLES (Bug reported
by David Riley).
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: John Bates <jbates@chromium.org>
For glsl >= 140 glClipVertex is only available in compatibility context
and needs to be lowered in the last vertex stage.
In addition, reworks the handling of clip and cull distances, and add
handling for gl_PointSize to the gl_PerVertex emission.
v2: - reorder patches and remove debug messages (Rohan)
v3: - don't force require glsl 1.50 for non-last vertex stage VS (Italo)
v4: - Move expression into common if branch (John)
- fix ws (John)
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: John Bates <jbates@chromium.org>
If the guest is creating texture and the memory
comes from buffer object by creating GL_TEXTURE_BUFFER,
then host creates GL_TEXTURE_BUFFER too.
No texture parameters can be set for GL_TEXTURE_BUFFER.
If there is mismatch between the guest texture format and
the host texture format, for example GL_ALPHA8(guest) and
GL_R8(host), then we can't apply swizzling for such textures.
In such case, add manually swizzling in GLSL shader generation
step.
The logic of this patch:
1. Add additional fields in shader key struct
2. During draw_vbo call check if "manual swizzling" is needed
3. If yes, the add fields in key struct and generate shader again
4. During generation of for example texelFetch instruction
in GLSL put additional instruction for swizzling
Reviewed-by: Gert Wollny <gert.wollny@collabora.com>
Probably for some time now the generics are moved from index 0-31
to 9-40, and this doesn't fit the masks we were using, so use
64 bit masks.
Apparently this only triggered an assert and didn't lead to further
failures when the assertion was not enabled so that the CTS didin't
catch this.
We could also use 32 bit masks and shift the index by 9, but if
this would get us into trouble we ever decide to switch the
PIPE_CAP_TGSI_TEXCOORD in mesa/virgl.
Fixes a number of piglits from the glsl-1.10 set that otherwise trigger
an assertion.
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Rohan Garg <rohan.garg@collabora.com>
Let the value found in the properties takes preference.
This fixes all compilation and link errors with
KHR-GL43.cull_distance.functional
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Rohan Garg <rohan.garg@collabora.com>
Up until now sme of the info stored in sinfo was changed
based on the shader key, and since this info is used for
all shader variants when evaluating the key for shaders
to be linked, this may result in the wrong shader
variant being picked up.
Instead track the shader info that can be changed like this
in the shader variant itself.
Closes: #239
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Tested-by: Dave Airlie <airlied@redhat.com>
Fixes e5fabecbe0
vrend: pass texture levels per shader on GLES as uniform
v2: - only emit texlod uniform when textureQueryLevels is called
- initialize uniform location to -1 if the mask shows that
the texture lod levels arenot needed and ony test this when
uploading the data
(all suggested in a way by Chia-I Wu)
Closes: #237
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Chia-I Wu <olvaffe@gmail.com>
Resources imported from dma-bufs as EGL images are given internal format
GL_RGB8 (24bpp) by host mesa. When placing srgb texture views on such
resources, the appropriate view format would be GL_SRGB8, but this is
not color-renderable for GL or GLES and results in an incomplete
framebuffer. Instead, GL_RGB8 must be used as the attachment format and
linear->srgb colorspace conversions must be applied manually in the frag
shader and for calls to glClearColor() involving these surfaces.
This resolves a
dEQP-EGL.functional.wide_color.window_888_colorspace_srgb test failure
for GBM-enabled hosts with newer host mesa, which imports
B8G8R8X8_UNORM and R8G8B8X8_UNORM as 24bpp instead of the more
compatible 32bpp.
Signed-off-by: Ryan Neph <ryanneph@google.com>
Reviewed-by: Lepton Wu <lepton@chromium.org>
Reviewed-by: John Bates <jbates@chromium.org>
For fragment ARB programs, the fog co-ordinate input
can be requested independently of any other part of the
graphics pipeline. Through the use of an OPTION declaration.
This can lead to a case where a fragment shader expects
a fog co-ordinate input, but it is not being supplied
by the preceding shader stage.
This patch will fix it by forcing it to 0 when it detects
the case, which should neutralize the fog color.
Reviewed-by: Gert Wollny <gert.wollny@collabora.com>
Seperate more elements into the shader stages that they are used in
and evaluate them only for the stage they are relevant for.
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Rohan Garg <rohan.garg@collabora.com>
Mesa clamps the number of cbufs to eight, so we can use uint8_t bitmask for
cbufs states. Be save for the future by only reporting support for
at most eight cbufs.
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Rohan Garg <rohan.garg@collabora.com>
Compress the structure so that it fits into a 64 bit value.
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Rohan Garg <rohan.garg@collabora.com>
In addition combine values into a bitfield so that the compiler can use a
64 bit move instead of individual moves.
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Rohan Garg <rohan.garg@collabora.com>
Keeping all the information in one place might come in handy when we
want to further refactor this. On the way also compress the structure.
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Rohan Garg <rohan.garg@collabora.com>
Use a bitfield to declare vrend_interp_info and make it a fixed size array.
One one hand this avoids all the hassles with allocating and freeing memory,
and it will make it possible to shrink the size of the data that is passed
from the sinfo to the shader key.
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Rohan Garg <rohan.garg@collabora.com>
The property doesn't change. so it doesn't make sense to set it in the
shader key.
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Rohan Garg <rohan.garg@collabora.com>
If GL_POINT_SPRITE mode is enabled and a texture has
GL_COORD_REPLACE enabled, the texture will apply coord_replace
to all primitive types, instead of only GL_POINTS as expected.
This change checks the prim_mode for every draw call and selects
the appropriate shader variant to enable coord_replace only when
rendering GL_POINTS primitives.
Closes: https://gitlab.freedesktop.org/virgl/virglrenderer/-/issues/188
Signed-off-by: Ryan Neph <ryanneph@google.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
If host support dual_src_blend extension, it's possible we need
to call glBindFragDataLocationIndexed to set fragment output location.
We can't explicitly set it in GLSL in such cases.
Fixes: b17ba74 ("shader: Add layout qualifier to fragment shader outputs")
Signed-off-by: Lepton Wu <lepton@chromium.org>
Reviewed-by: David Riley <davidriley@chromium.org>
Currently, even the input/output buffers are integer formats, we still
use float type for them. This is actually undefined behavior and MALI
GPU will do type conversion and cause unexpected results. To fix this,
we check the input buffer format for vertex shader and also color buffer
format for fragment shader, if they are integer types, just use integer
types in generated shaders. Since the old way works fine on other GPU,
only enable this fix for ARM MALI for now. The plan is to enable this
for other GPU also with auto detection.
Signed-off-by: Lepton Wu <lepton@chromium.org>
Reviewed-by: Gert Wollny <gert.wollny@collabora.com>
The plan is to remove the patch step and generate the correct glsl
from the beginning. As the first step, just move the patch step to
where we create glsl.
Signed-off-by: Lepton Wu <lepton@chromium.org>
Reviewed-by: Gert Wollny <gert.wollny@collabora.com>
If the next shader stage doesn't declare an input block,
we should not emit an output block in the current stage.
Fix the remaining compilation issue when using the GLES backend.
error: redeclaration of gl_PerVertex must be a subset of the built-in members of gl_PerVertex
Signed-off-by: Elie Tournier <elie.tournier@collabora.com>
Reviewed-by: Gert Wollny <gert.wollny@collabora.com>
Cleanup to make it more clear what the functions are modifying.
Signed-off-by: John Bates <jbates@chromium.org>
Reviewed-by: Elie Tournier <elie.tournier@collabora.com>
Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>
Some applications change glAlphaFunc nearly every frame.
Sam 3 apitrace replay went from ~4 fps to ~19 fps on a chromebook.
Signed-off-by: John Bates <jbates@chromium.com>
Reviewed-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Chia-I Wu <olvaffe@gmail.com>
The value should never haven been part of the shader key.
Fixes#132
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>
On OpenGL with GLSL < 4.30 the invariant input specifiers must match the
invariant output specifiers of the previous stage.
Fixes#75
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>
With vtest this swizzling is necessary, but with qemu it is not, so make
it a tweak.
v2: Use original blit format to check the format type in blitter
v3: Correct typo (Gurchetan)
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>
Analogous to previous commit there's a single user that cares about it.
Let that one use amplertype_is_shadow() and drop the function argument.
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>
GL 4.2 supports depth layouts that make it possible to optimize by
enabling early depth tests and still being able to write a new
z value in the fragment shader under specific circumstances.
Fixes#106
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>
Just like all the other non-static vrend_shader functions.
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Chia-I Wu <olvaffe@gmail.com>
This patch uses GL_EXT_framebuffer_fetch_non_coherent (preferred)
or GL_EXT_framebuffer_fetch to emulate the logiops in the fragement
shader. If neither of these extension are available then only
GL_COPY, GL_COPY_INVERTED, GL_CLEAR, and GL_SET are emulated.
Fixes piglit gl-1.0-logicop on GLES hosts.
v2: Use non_coherent access when possible
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>
Pass a mask of the non-array generic inputs that the next shader expects
is passed to the shader that is currently converted. If, after emitting
all generic outputs, some are missing, then these are also generated.
Limitations: This doesn't take care of input arrays that may not be emitted
as outputs, but since this problem seems to be only related to IO variables
that are implicitely declared this is not a problem.
Fixes piglit: glsl-routing
v2: rebase and update names to new naming
v3: Declare TCS outputs as arrays as required
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Tested-by: Elie Tournier <elie.tournier@collabora.com> (v1)
Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>
On D-GL a program may contain a TE shader but no TC shader. On GLES either
both or none of TES and TCS need to be available. So if the guest sends a
shader program without a TCS, inject a passthrough shader using the
patch parameters given in the GL code.
Closes#84
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>
Softpipe doesn't support ARB_GLES3_1_compatibility but can support
NV_shader_atomic_float and this is needed for some dEQP-GLES31 tests to be
run on a softpipe GL host.
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>
(u|i)mulExtended are provided by ARB_gpu_shaders5 but also by
MESA_shader_integer_functions and only the latter is supported by softpipe
so fall back to this extension if needed.
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>
mesa optimizes away the information about input layouts for generics and
patches for TCS, TES, and GEOM shaders (all that pass these inputs as
arrays), but when input arrays are allowed, then these varyings may
actually have overlapping layouts (at least some piglits do, even though
they don't specifically require ARB_enhanced_layouts).
To be able to link shaders like these with enables arrays pass this info
to the next stage.
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-By: Gurchetan Singh <gurchetansingh@chromium.org>
Arrays of arrays are way easier to handle and they are available on GLES
3.1, D-GL 4.3, and all hardware that is supported by mesa. The old code
path is still able to use blocks for passing parameters to and from
tesselation and geometry stages, but for the new code path they are
required.
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-By: Gurchetan Singh <gurchetansingh@chromium.org>
This saves adding the padding and just appends the ext strings
v2: use emit_hdr for the precision strings (Gert)
Reviewed-By: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
This uses the string array helpers to pass around the glsl strings
for the program.
This could be expanded on to provide more than 2 strings easily
Reviewed-By: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
We already track samplers, images and ssbos using bitmasks, so this is a
bit more familiar to the rest of the code.
Also, this is going to enable some other nifty optimizations later on.
Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>
Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Signed-off-by: Jakob Bornecrantz <jakob@collabora.com>