This is needed to properly support EXT_sRGB_write_control
v2: Make return value of virgl_has_egl_khr_gl_colorspace
a bool (Gurchetan)
Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Signed-off-by: Jakob Bornecrantz <jakob@collabora.com>
This adds a new api to set/get a private pointer.
v2: add private ptr test
Reviewed-By: Gert Wollny <gert.wollny@collabora.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
This is always passed as zero, so let's make the code a bit easier to
grok by removing this unused argument. We still need to pass zero to the
externally passed callback, though.
Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Add flags tgsi, glsl, stream, and shader and change the corresponding
logging code.
Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>
Tested-by: Jakob Bornecrantz <jakob@collabora.com>
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Signed-off-by: Jakob Bornecrantz <jakob@collabora.com>
No real flags are defined, only the macros and functions.
VREND_DEBUG(flag, ctx, ...) translates to fprintf(stderr, ...)
that is enabled based on whether logging is enabled for flag
in context ctx.
VREND_DEBUG_EXT(flag, ctx, X) can be used to add code sequences
as X, e.g. specific logging calls like for streams.
v2: Make use of variadic macros to make the VREND_DEBUG macro more
like a call to *printf (Following a suggestion by Gurchetan)
v3: Already include debug header in vrend_renderer.c
Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>
Tested-by: Jakob Bornecrantz <jakob@collabora.com>
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Signed-off-by: Jakob Bornecrantz <jakob@collabora.com>
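The two macros described above could look roughly like the following
minimal sketch. The flag names, the context struct and the helper
sketch_debug_enabled() are hypothetical and only serve the illustration;
the macro bodies are an assumption, not the code added by the patch.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical flag bits, for illustration only. */
enum sketch_debug_flags {
   SKETCH_DBG_SHADER = 1 << 0,
   SKETCH_DBG_STREAM = 1 << 1,
};

/* Hypothetical context that carries the enabled debug flags. */
struct sketch_context {
   unsigned debug_flags;
};

/* True if logging for 'flag' is enabled in 'ctx'. */
static bool sketch_debug_enabled(const struct sketch_context *ctx, unsigned flag)
{
   return ctx && (ctx->debug_flags & flag) != 0;
}

/* printf-like logging to stderr, active only when the flag is set. */
#define VREND_DEBUG(flag, ctx, ...) \
   do { \
      if (sketch_debug_enabled((ctx), (flag))) \
         fprintf(stderr, __VA_ARGS__); \
   } while (0)

/* Runs the code sequence X only when the flag is set. */
#define VREND_DEBUG_EXT(flag, ctx, X) \
   do { \
      if (sketch_debug_enabled((ctx), (flag))) { \
         X; \
      } \
   } while (0)
```

With such a definition, VREND_DEBUG(SKETCH_DBG_STREAM, ctx, "cmd %d\n", cmd)
reads like a plain *printf call and is a no-op when the flag is not set.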
Add support for TGSI's HW atomic counters, implemented here with
atomic_uint.
v2: - Fix calculation of atomic count in cmd
v3: - Add feature-checks (Dave Airlie)
v4: - Pass max-values for all stages and combined (Erik)
Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Gallium copy_image redirects calls to resource_copy_region for equal src
and dst formats or if at least one of the two is compressed. Handle the
condition for selecting blit vs. glCopyImageSubRegion accordingly.
Fixes piglit: bptc-float-modes
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
This adds support for texture barriers;
a separate patch would be needed to implement the framebuffer fetch
barriers.
Reviewed-by: Jakob Bornecrantz <jakob@collabora.com>
Signed-off-by: Jakob Bornecrantz <jakob@collabora.com>
glTexStorage*D is more restrictive in supporting texture formats, especially
on GLES. Specifically, it doesn't support BGRA textures that are needed to get
any useful display, but it is needed to get immutable textures that are required
for glTextureView.
Check which formats are supported and use glTexStorage*D for these, otherwise
fall back to use glTexImage*D.
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
On a GL host set the sRGB blit framebuffer state explicitly to make virgl
behave like on a GLES host.
This does not correct the handling of sRGB completely, because the state
GL_FRAMEBUFFER_SRGB is not properly transmitted to the host. As a result
the piglits "blit texture linear_to_srgb * * *" flip.
Tests that set "enable" failed before and pass now, and tests that set "disable"
now fail.
v2: - Move setting the fbo state out of the loop (Robert Tarasov)
- Use the blitter context to set and store the state. Since it is only
relevant when the dst texture is SRGB, and currently we don't pass the
state from the guest, there is no need to disable it
- Just enforce that a source SRGB texture is always decoded when this could
be disabled
Fixes on GL host:
dEQP-GLES3.functional.fbo.blit.conversion.rgb8_to_srgb8_alpha8
dEQP-GLES3.functional.fbo.blit.default_framebuffer.srgb8_alpha8
Reviewed-by: Robert Tarasov <tutankhamen@chromium.org>
Tested-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Tested-by: Jakob Bornecrantz <jakob@collabora.com>
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Signed-off-by: Jakob Bornecrantz <jakob@collabora.com>
This requires adding new protocol to pass the width/height/layers/sample
default values from the host.
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>
For compute shaders we put the req local memory into the streamout
places in the protocol, and we create compute shaders in a separate
lookup function
Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>
GL4.2/GLES3.1 adds glMemoryBarrier so make sure we can handle it.
v2: add a cap bit for this for guest
Reviewed-by: Gert Wollny <gert.wollny@collabora.com>
This adds support for tracking images, and binding them
to the GL context.
includes:
vrend_renderer: create texture when TBO is used as image
vrend_renderer: specify correct access to glBindImageTexture
v2:
vrend_renderer: invert glBindImageTexture layered logic
v3: fix decode macros (Gert pointed out for ssbo)
v4: add max image samples to the caps.
add image arrays.
use mask var outside loop (Gert)
change img_locs type to GLint and printf on fail (Gert)
Co-authors: Gurchetan Singh <gurchetansingh@chromium.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Gert Wollny <gert.wollny@collabora.com>
These VREND_BIND_*-flags here are basically a subset of the
VIRGL_BIND_*-flags, with one custom flag added. So let's just use
those, and use an unused bit from the others for the swizzle-flag.
Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
These are identical to the corresponding VIRGL_BIND-flags,
so let's get rid of this duplicate definition.
Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
This pulls the code out from the gles31 development,
and modifies the caps to support two different limits
(so far I've only found fs/cs vs everyone else limits differ)
v2: fix buffer creation paths, limit maximums, handle indirect
(don't pass -1 into gl funcs when we don't need to).
v3: free ssbo locs
v4: use two caps fields
Co-authors: Gurchetan Singh <gurchetansingh@chromium.org>
Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
When the host is gles >= 3.2, gl >= 4.3, or when the extension
GL_(ARB|EXT|OES)_copy_image is available, memcpy-like blitting and region
copying can be done for many color format combinations by using
glCopyImageSubData.
This fixes a number of tests from the subset
dEQP-GLES31.functional.copy_image.non_compressed.viewclass_*
v2: - Clean list of canonical formats (Gurchetan Singh)
- Use size of canonical formats to decide whether they can be copied via
glCopyImageSubData
- Also honour the render state when deciding whether glCopyImageSubData
will do, or whether we need to do a blit.
v3: - replace format size check by compatibility check (Gurchetan Singh) but
keep the check separate because we need to add logic for compressed
textures later
Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Signed-off-by: Jakob Bornecrantz <jakob@collabora.com>
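The availability condition from the message above can be sketched as a
small predicate. The struct, the version encoding (major * 10 + minor)
and the extension-string lookup are assumptions made for this
illustration; the renderer's real feature detection may differ.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical description of the host GL implementation. */
struct sketch_host_info {
   bool is_gles;
   int version;             /* major * 10 + minor, e.g. 43 for GL 4.3 */
   const char *extensions;  /* space-separated extension names */
};

/* Whole-word lookup in a space-separated extension string. */
static bool sketch_has_extension(const char *extensions, const char *name)
{
   size_t len = strlen(name);
   const char *p = extensions;

   if (!extensions || len == 0)
      return false;

   while ((p = strstr(p, name)) != NULL) {
      bool starts = (p == extensions) || (p[-1] == ' ');
      bool ends = (p[len] == '\0') || (p[len] == ' ');
      if (starts && ends)
         return true;
      p += len;
   }
   return false;
}

/* gles >= 3.2, gl >= 4.3, or any GL_(ARB|EXT|OES)_copy_image extension. */
static bool sketch_can_use_copy_image(const struct sketch_host_info *host)
{
   if (host->is_gles ? host->version >= 32 : host->version >= 43)
      return true;

   return sketch_has_extension(host->extensions, "GL_ARB_copy_image") ||
          sketch_has_extension(host->extensions, "GL_EXT_copy_image") ||
          sketch_has_extension(host->extensions, "GL_OES_copy_image");
}
```

This only covers whether glCopyImageSubData is available at all; format
compatibility and the render state still decide whether a blit is needed.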
The protocol will never send negative numbers, so use uints
to avoid having to compare to 0 and other warnings.
Reviewed-by: Po-Hsien Wang <pwang@chromium.org>
This is required to implement glMinSampleShading().
Sadly, we've been setting has_sample_shading for a while, even
though this is needed. So we need to set a capability so mesa will
know that it's safe to emit this command.
Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Query the number of supported samples and the sample position and
store these to the caps.v2 structure. We support only up to 16 samples.
This implementation requires a GL host backend.
v2: - glTexImage2DMultisample is not available on a gles 3.1 host
and trying to call it crashed qemu (Jakob Bornecrantz)
Use glTexStorage2DMultisample instead and delete texture each
round because the texture becomes immutable.
- move call to get sample positions only when caps v2 needs to be
filled.
v3: - rebase against master
- take care of nits (Dave)
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
In the copy fallback, when a texture can not be rendered, the data that resides
in the backing iovec needs to be used. For the non-zero levels of mip-map textures
the data is located at an offset. This patch adds storing this offset and using it
when data is read from the backing iovec and updating the dst iov. We limit the
mip-map levels for which this is done to 1-17, which is enough to cover
32kx32k textures. The patch also fixes the stride when accessing mip-map levels.
Fixes:
dEQP-GLES3.functional.texture.specification.teximage3d_depth.depth_component24_2d_array
dEQP-GLES3.functional.texture.specification.texsubimage3d_depth.depth_component32f_2d_array
dEQP-GLES3.functional.texture.specification.texsubimage3d_depth.depth_component24_2d_array
dEQP-GLES3.functional.texture.specification.texsubimage3d_depth.depth_component16_2d_array
dEQP-GLES3.functional.texture.specification.texsubimage3d_depth.depth32f_stencil8_2d_array
dEQP-GLES3.functional.texture.specification.texsubimage3d_depth.depth24_stencil8_2d_array
v2: * rebase and remove unused variables
* also correct offset when writing to the destination backing iovec
v3: * follow mesa/virgl notation and range for storing the mip-map offsets
Suggested-by: Gurchetan Singh <gurchetansingh@chromium.org>
Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Signed-off-by: Jakob Bornecrantz <jakob@collabora.com>
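To illustrate why non-zero mip-map levels live at an offset in the
backing store, here is a minimal sketch that computes per-level offsets
and strides. It assumes a tightly packed 2D texture with a fixed number
of bytes per pixel and no alignment padding; the names and the packing
are assumptions for the example, not the layout the guest actually uses.

```c
#include <stdint.h>

#define SKETCH_MAX_LEVELS 18

/* Size of one dimension at a given mip level, never smaller than 1. */
static uint32_t sketch_minify(uint32_t size, unsigned level)
{
   uint32_t s = size >> level;
   return s ? s : 1;
}

/* Row stride in bytes of a mip level, assuming tightly packed rows. */
static uint32_t sketch_level_stride(uint32_t width, uint32_t bpp, unsigned level)
{
   return sketch_minify(width, level) * bpp;
}

/*
 * Fills offsets[0..last_level] with the byte offset of each level in the
 * backing store and returns the total size in bytes.
 */
static uint64_t sketch_fill_mip_offsets(uint32_t width, uint32_t height,
                                        uint32_t bpp, unsigned last_level,
                                        uint64_t offsets[SKETCH_MAX_LEVELS])
{
   uint64_t offset = 0;

   for (unsigned level = 0;
        level <= last_level && level < SKETCH_MAX_LEVELS; level++) {
      offsets[level] = offset;
      offset += (uint64_t)sketch_level_stride(width, bpp, level) *
                sketch_minify(height, level);
   }
   return offset;
}
```

Reading level N in the copy fallback then means starting at offsets[N]
in the iovec data and stepping by the stride of that level rather than
the stride of level 0.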
v2: With epoxy GL/gl.h is not directly included (Dave Airlie).
Instead move the include of epoxy/gl.h from vrend_renderer.c to
vrend_renderer.h
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Allow resources created externally (e.g. gbm created buffers as
dma bufs) to be used. As an example, crosvm
(https://chromium.googlesource.com/chromiumos/platform/crosvm/)
will intercept resource creation to use minigbm to allocate
buffers that its compositor is able to properly handle since it
only supports compositing with buffers allocated via minigbm.
This patch allows direct rendering to those buffers without
requiring an extra copy.
v2: Handle missing extension better.
v3: Update commit message with more details on usage.
Signed-off-by: David Riley <davidriley@chromium.org>
Reviewed-by: Dave Airlie <airlied@redhat.com>
This passes the default tessellation factors from the guest to
the host.
v2: fix warnings
Tested-by: Elie Tournier <elie.tournier@collabora.com>
Reviewed-by: Elie Tournier <elie.tournier@collabora.com>
Tested-by: Jakob Bornecrantz <jakob@collabora.com>
and make sure that on GLES it is chosen for GL_RGBA4 over
VIRGL_FORMAT_B4G4R4A4_UNORM, by removing support for the latter.
This is needed because on GLES3 GL_BGRA isn't a supported format to pass
to glTexImage3D.
Fixes the test dEQP-GLES3.functional.texture.format.sized.3d.rgba4_pot
on GLES hosts.
v2: * Make more explicit the GL/GLES split (Gert Wollny)
Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Suggested-by: Jakob Bornecrantz <jakob@collabora.com>
Reviewed-By: Gert Wollny <gert.wollny@collabora.com>
Tested-by: Jakob Bornecrantz <jakob@collabora.com>
This allows the tbo code to properly detect if we are using a buffer
as a texture or not, instead of relying on GL_TEXTURE_BUFFER being used.
We also don't need to special case generate the tbo texture id until
sampler bind time.
Reviewed-by: Elie Tournier <elie.tournier@collabora.com>
Use the source format swizzle information to set the
GL_TEXTURE_SWIZZLE_* parameters for the GL blit operation. This also
removes the need for the emulated alpha special case, since when using
emulated alpha the source format already has proper swizzle information.
Signed-off-by: Alexandros Frantzis <alexandros.frantzis@collabora.com>
Signed-off-by: Jakob Bornecrantz <jakob@collabora.com>
Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>
Tested-by: Gurchetan Singh <gurchetansingh@chromium.org>
Explicitly describe the swizzle of all supported formats in the format
table. In this commit all format swizzles are set to NO_SWIZZLE, but
future commits will update some format/swizzle combinations to improve
support for the corresponding virgl formats.
Signed-off-by: Alexandros Frantzis <alexandros.frantzis@collabora.com>
Signed-off-by: Jakob Bornecrantz <jakob@collabora.com>
Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>
Tested-by: Gurchetan Singh <gurchetansingh@chromium.org>
Currently, we always try to create an OpenGL 3.1 context. Some
dEQP tests require an OpenGL 3.2 context (specifically, ones
that use glGetInteger64v). Let's try to create the highest
version context we can, and iterate to lower versions, e.g.:
https://developer.android.com/guide/topics/graphics/opengl.html#version-check
The return code for (*create_gl_context) is a little unclear.
This patch assumes NULL is returned on failure. This should work
for GLX and EGL.
GLX:
"On failure glXCreateContextAttribsARB returns NULL and generates
an X error with extended error information"
https://www.khronos.org/registry/OpenGL/extensions/ARB/GLX_ARB_create_context.txt
EGL:
"#define EGL_NO_CONTEXT ((EGLContext)0)"
https://www.khronos.org/registry/EGL/api/1.1/EGL/egl.h
The semantics of rcbs->create_gl_context may be different, though.
Fixes:
dEQP-GLES3.functional.state_query.integers.max_vertex_output_components_getinteger64
dEQP-GLES3.functional.state_query.integers.max_vertex_output_components_getfloat
Signed-off-by: Dave Airlie <airlied@redhat.com>
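The "try the highest version first, then iterate down" approach can be
sketched as follows. The callback type, the candidate list and the fake
backend are hypothetical and exist only for this example; like the
patch, the sketch assumes the creation callback returns NULL on failure.

```c
#include <stddef.h>

struct sketch_gl_version {
   int major;
   int minor;
};

/* Hypothetical creation callback; returns NULL on failure. */
typedef void *(*sketch_create_ctx_fn)(int major, int minor, void *user_data);

/* Tries versions from highest to lowest and returns the first context. */
static void *sketch_create_best_context(sketch_create_ctx_fn create,
                                        void *user_data,
                                        struct sketch_gl_version *chosen)
{
   static const struct sketch_gl_version candidates[] = {
      { 4, 5 }, { 4, 4 }, { 4, 3 }, { 4, 2 }, { 4, 1 },
      { 4, 0 }, { 3, 3 }, { 3, 2 }, { 3, 1 },
   };

   for (size_t i = 0; i < sizeof(candidates) / sizeof(candidates[0]); i++) {
      void *ctx = create(candidates[i].major, candidates[i].minor, user_data);
      if (ctx != NULL) {
         if (chosen)
            *chosen = candidates[i];
         return ctx;
      }
   }
   return NULL;
}

/*
 * Stand-in backend for trying the sketch: user_data points to an int that
 * holds the highest supported version as major * 10 + minor.
 */
static void *sketch_fake_create(int major, int minor, void *user_data)
{
   int max_version = *(int *)user_data;
   return (major * 10 + minor <= max_version) ? user_data : NULL;
}
```

With a fake backend limited to 3.2, the loop rejects 4.5 down to 3.3 and
settles on 3.2; if nothing down to 3.1 succeeds, the caller gets NULL.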
These two formats are required by DRI in the guest and as such
Wayland, X11, GBM or any API built on top of DRI. The format
GL_BGRA_EXT is not supported on Desktop OpenGL.
v2: Better documentation.
Signed-off-by: Jakob Bornecrantz <jakob.bornecrantz@collabora.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
These are needed for ARB_draw_indirect and GL4.0
This enables support and turns on the cap when
support is present.
This also enhances the draw packets to cover
future features, it doesn't enable or show these
yet, since other work is required in the shaders.
Signed-off-by: Dave Airlie <airlied@redhat.com>
In the vrend clear dispatch function, 'buffers' is read from the
guest. A malicious guest can specify a bad 'buffers' to make
the function call util_format_is_pure_uint() even when
'ctx->sub->surf[i]' is NULL. This can cause a NULL pointer deref.
Add a sanity check to avoid this.
[airlied: use a define]
Signed-off-by: Li Qiang <liq3ea@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
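The kind of check described above can be sketched like this. All types,
constants and the function name are hypothetical stand-ins; the point is
only that a guest-supplied bitmask must not lead to dereferencing a
surface slot that is not bound.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_MAX_CBUFS 8
#define SKETCH_CLEAR_COLOR0 (1u << 2)  /* hypothetical bit of color buffer 0 */

/* Hypothetical stand-ins for a surface and the sub-context. */
struct sketch_surface {
   bool pure_uint;
};

struct sketch_sub_ctx {
   struct sketch_surface *surf[SKETCH_MAX_CBUFS];
};

/*
 * Returns true if any color buffer selected by the guest-supplied mask is
 * bound and has a pure unsigned integer format. Unbound slots are skipped
 * instead of being dereferenced.
 */
static bool sketch_clear_targets_pure_uint(const struct sketch_sub_ctx *sub,
                                           uint32_t buffers)
{
   for (unsigned i = 0; i < SKETCH_MAX_CBUFS; i++) {
      if (!(buffers & (SKETCH_CLEAR_COLOR0 << i)))
         continue;
      if (sub->surf[i] == NULL)  /* guest selected an unbound buffer */
         continue;
      if (sub->surf[i]->pure_uint)
         return true;
   }
   return false;
}
```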
That way an if (type > PIPE_SHADER_GEOMETRY) guard will actually
work for all values.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Instead of polling the fences regularly, have a thread
that blocks for a single fence using a separate shared
context, then uses eventfd to wake up the main thread
when something happens.
Inside the guest, glmark2 typically runs twice as fast with the thread
sync, although in general the performance gain seems to be about 30%. The
benefit is mostly for CPU-bound tasks (when the main thread hits 100%).
A naive perf stat of the vtest renderer with glmark2 "build" test with a
fixed number of frames (500) results in the following stats data:
(disregard the timing-related information, since the renderer is run and
stopped manually)
without thread:
3032.282265 task-clock (msec) # 0.420 CPUs utilized
4,277 context-switches # 0.001 M/sec
102 cpu-migrations # 0.034 K/sec
9,020 page-faults # 0.003 M/sec
7,884,098,254 cycles # 2.600 GHz
4,440,126,451 stalled-cycles-frontend # 56.32% frontend cycles idle
<not supported> stalled-cycles-backend
11,024,091,578 instructions # 1.40 insns per cycle
# 0.40 stalled
# cycles per insn
1,091,831,588 branches # 360.069 M/sec
5,426,846 branch-misses # 0.50% of all branches
with thread:
3403.592921 task-clock (msec) # 0.452 CPUs utilized
7,145 context-switches # 0.002 M/sec
410 cpu-migrations # 0.120 K/sec
6,191 page-faults # 0.002 M/sec
7,475,038,064 cycles # 2.196 GHz
4,487,043,071 stalled-cycles-frontend # 60.03% frontend cycles idle
<not supported> stalled-cycles-backend
9,925,205,494 instructions # 1.33 insns per cycle
# 0.45 stalled
# cycles per insn
834,375,503 branches # 245.146 M/sec
4,919,995 branch-misses # 0.59% of all branches
Signed-off-by: Marc-André Lureau <marcandre.lureau@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
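The wake-up mechanism described above can be sketched with pthreads and
eventfd (Linux only). The struct, the function names and the fake wait
function are hypothetical; in a real renderer the wait callback would be
a blocking GL fence wait on the separate shared context.

```c
#include <pthread.h>
#include <stdint.h>
#include <sys/eventfd.h>
#include <unistd.h>

/* Hypothetical state shared between the main and the sync thread. */
struct sketch_sync {
   int efd;                          /* eventfd used for the wake-up */
   void (*wait_fence)(void *fence);  /* blocks until the fence signals */
   void *fence;
};

/* Sync thread: block on one fence, then wake the main thread. */
static void *sketch_sync_thread(void *arg)
{
   struct sketch_sync *sync = arg;
   uint64_t one = 1;

   sync->wait_fence(sync->fence);
   if (write(sync->efd, &one, sizeof(one)) != (ssize_t)sizeof(one))
      return (void *)1;
   return NULL;
}

/*
 * Main thread side: a blocking read here; an event loop would poll the
 * eventfd instead. Returns 0 on success and stores the number of wake-ups.
 */
static int sketch_wait_for_wakeup(int efd, uint64_t *count)
{
   return read(efd, count, sizeof(*count)) == (ssize_t)sizeof(*count) ? 0 : -1;
}

/* Stand-in for a blocking fence wait: marks the fence as signalled. */
static void sketch_fake_wait(void *fence)
{
   *(int *)fence = 1;
}
```

The main thread no longer polls fences on a timer; it only reacts when
the eventfd becomes readable, which fits into an existing poll loop.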