There is never a need to wait for the GPU work (if any) involved in
resource creation. A client commonly does
  VCMD_SUBMIT_CMD
  VCMD_RESOURCE_CREATE2(dummy)
  ...
  VCMD_RESOURCE_BUSY_WAIT(dummy)
only because VCMD_RESOURCE_BUSY_WAIT needs a resource to wait on. We
don't need a fence for resource creation.
Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Reviewed-by: Ryan Neph <ryanneph@google.com>
Reviewed-by: Gert Wollny <gert.wollny@collabora.com>
Let VTEST_FUZZER_FENCES control whether VCMD_RESOURCE_BUSY_WAIT waits or
not. Fences are always created.
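A rough sketch of the resulting shape, assuming VTEST_FUZZER_FENCES is
a compile-time macro (the helper name is hypothetical):

  static int vtest_resource_busy_wait(struct vtest_context *ctx,
                                      uint32_t res_id)
  {
     /* the fence itself is created unconditionally elsewhere; only
      * the blocking wait is gated by the build-time switch */
  #if VTEST_FUZZER_FENCES
     vtest_wait_for_fence(ctx, res_id); /* hypothetical helper */
  #endif
     return 0;
  }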
Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Reviewed-by: Ryan Neph <ryanneph@google.com>
Reviewed-by: Gert Wollny <gert.wollny@collabora.com>
It gives clients access to virgl_renderer_resource_create_blob and
virgl_renderer_resource_export_blob.
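For reference, a minimal sketch of the two entry points as called on
the host side (all argument values are illustrative):

  #include <virglrenderer.h>

  static int create_and_export_blob(uint32_t ctx_id, uint32_t res_id)
  {
     struct virgl_renderer_resource_create_blob_args args = {
        .res_handle = res_id,
        .ctx_id = ctx_id,
        .blob_mem = VIRGL_RENDERER_BLOB_MEM_HOST3D,
        .blob_flags = VIRGL_RENDERER_BLOB_FLAG_USE_MAPPABLE,
        .blob_id = 1,     /* illustrative */
        .size = 65536,    /* illustrative */
     };
     int ret = virgl_renderer_resource_create_blob(&args);
     if (ret)
        return ret;

     uint32_t fd_type;
     int fd;
     return virgl_renderer_resource_export_blob(res_id, &fd_type, &fd);
  }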
Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Reviewed-by: Ryan Neph <ryanneph@google.com>
Reviewed-by: Gert Wollny <gert.wollny@collabora.com>
Protocol version 3 uses server-generated, rather than client-generated,
resource ids. Without that, clients could pick conflicting resource
ids.
Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Reviewed-by: Ryan Neph <ryanneph@google.com>
Reviewed-by: Gert Wollny <gert.wollny@collabora.com>
VCMD_RESOURCE_CREATE and VCMD_RESOURCE_CREATE2 use and return
server-generated ids since version 3. The client-supplied id must be 0.
This makes the commands work more like Linux's
DRM_IOCTL_VIRTGPU_RESOURCE_CREATE.
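From the client's perspective the flow becomes roughly the following
(the wire-level helpers here are hypothetical):

  struct vcmd_resource_create2_args args = { 0 };
  args.res_handle = 0;  /* must be 0 since protocol version 3 */
  /* ...fill in format, width, height, and so on... */
  vtest_send(sock, VCMD_RESOURCE_CREATE2, &args, sizeof(args));

  uint32_t res_id;
  vtest_recv(sock, &res_id, sizeof(res_id)); /* server-generated id */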
Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Reviewed-by: Ryan Neph <ryanneph@google.com>
Reviewed-by: Gert Wollny <gert.wollny@collabora.com>
This paves the way for a protocol change where the server will
generate resource ids. This also improves our error handling by
switching from FREE to vtest_unref_resource.
Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Reviewed-by: Ryan Neph <ryanneph@google.com>
Reviewed-by: Gert Wollny <gert.wollny@collabora.com>
Add VIRGL_RENDERER_VENUS and the related bits.
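A minimal sketch of opting in at initialization time (the flag
combination is illustrative):

  #include <virglrenderer.h>

  static int init_with_venus(void *cookie,
                             struct virgl_renderer_callbacks *cbs)
  {
     /* request the Venus (Vulkan) renderer next to the usual GL path */
     return virgl_renderer_init(cookie,
                                VIRGL_RENDERER_VENUS |
                                VIRGL_RENDERER_USE_EGL,
                                cbs);
  }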
Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Reviewed-by: Ryan Neph <ryanneph@google.com>
Reviewed-by: Gert Wollny <gert.wollny@collabora.com>
In addition to virgl_renderer_submit_cmd, vkr_ring can also be used to
submit commands.
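Conceptually, a ring is a command stream in memory shared with the
client, polled by the server; a simplified sketch of the idea (not the
actual vkr_ring layout):

  struct ring {
     volatile uint32_t *head; /* consumer position, server-written */
     volatile uint32_t *tail; /* producer position, client-written */
     uint8_t *buf;
     uint32_t size;
  };

  static void ring_poll(struct ring *r)
  {
     uint32_t head = *r->head;
     while (head != *r->tail) {
        const uint32_t cmd_size = r->buf[head % r->size];
        /* ...decode and execute one command here... */
        head += cmd_size;
     }
     *r->head = head; /* publish how far we have consumed */
  }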
Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Reviewed-by: Ryan Neph <ryanneph@google.com>
Reviewed-by: Gert Wollny <gert.wollny@collabora.com>
On 64-bit builds, we can store uint64_t keys in pointers directly and
there is no need to malloc/free.
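The trick is simply the following (a sketch):

  #include <stdint.h>
  #include <stdlib.h>

  static void *key_to_hash_pointer(uint64_t key)
  {
  #if UINTPTR_MAX == UINT64_MAX
     /* 64-bit: the key fits in the pointer itself */
     return (void *)(uintptr_t)key;
  #else
     /* 32-bit fallback: heap-allocate a copy of the key */
     uint64_t *dup = malloc(sizeof(*dup));
     if (dup)
        *dup = key;
     return dup;
  #endif
  }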
Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Reviewed-by: Ryan Neph <ryanneph@google.com>
Reviewed-by: Gert Wollny <gert.wollny@collabora.com>
util_hash_table_u64 is a wrapper around util_hash_table with uint64_t
keys. It is similar to hash_table_u64 in mesa.
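The wrapper's surface is small; the names below are assumed from the
mesa counterpart:

  struct util_hash_table_u64 *
  util_hash_table_create_u64(void (*destroy)(void *value));

  void  util_hash_table_set_u64(struct util_hash_table_u64 *ht,
                                uint64_t key, void *value);
  void *util_hash_table_get_u64(struct util_hash_table_u64 *ht,
                                uint64_t key);
  void  util_hash_table_remove_u64(struct util_hash_table_u64 *ht,
                                   uint64_t key);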
Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Reviewed-by: Ryan Neph <ryanneph@google.com>
Reviewed-by: Gert Wollny <gert.wollny@collabora.com>
v2: - Add root build dir and src to include path (Gert)
- fix gallium includes (Gert)
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Acked-by: Rohan Garg <rohan.garg@collabora.com>
This fixes a regression in various ASTC-format-related dEQP tests.
Fixes: 51d4d37 ("formats:add astc 2d compressed format support")
Signed-off-by: Lepton Wu <lepton@chromium.org>
Reviewed-by: Chia-I Wu <olvaffe@gmail.com>
Add an init flag for preferring the discrete GPU. Modify GBM selection
to prefer an integrated GPU for display. Report "different GPU" back to
the client to enable drawable shadowing. Allocate linear GBM buffers to
make them shareable between different devices.
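For the last point, the allocation amounts to something like this
sketch (format and usage bits are illustrative):

  #include <gbm.h>

  static struct gbm_bo *
  alloc_shareable_bo(struct gbm_device *dev, uint32_t w, uint32_t h)
  {
     /* a linear layout avoids vendor-specific tiling, so the buffer
      * can be shared between integrated and discrete GPUs */
     return gbm_bo_create(dev, w, h, GBM_FORMAT_XRGB8888,
                          GBM_BO_USE_RENDERING | GBM_BO_USE_LINEAR);
  }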
Reviewed-by: Gert Wollny <gert.wollny@collabora.com>
When virglrenderer creates a VIRGL_FORMAT_B8G8R8X8_UNORM texture, it
uses the internal format GL_BGRA_EXT.
When Mesa imports a VIRGL_FORMAT_B8G8R8X8_UNORM (DRM/GBM_FORMAT_XRGB8888)
dma_buf, it uses the internal format GL_RGB8.
These formats are not copy compatible.
Reviewed-by: Gert Wollny <gert.wollny@collabora.com>
This was regressed by commit 3a2a537c (vrend: hook up per-context
fencing internally), bisected by Gert.
When there is no sync thread, vrend_renderer_check_fences checks the
fences on vrend_state.fence_list for signaled ones and calls
ctx->fence_retire on them. Because fences belonging to the same context
are ordered, as an optimization we ideally want to call
ctx->fence_retire only on the last signaled fence of each context.
(Note we are close to that, but not quite, because we want to avoid
complex changes until virgl starts using per-context fencing, if ever.)
The issue is in need_fence_retire_signal_locked. It has this check
  if (fence->fences.next == &vrend_state.fence_list)
     return true;
which says that, if the fence is the last one in vrend_state.fence_list,
call ctx->fence_retire (because it must be the last fence of some
context). The check is incorrect because not all fences on the list
have signaled when the sync thread is not used. It will fail when there
are also unsignaled fences on the list.
To fix the issue, we construct a list of signaled fences first, before
calling need_fence_retire_signal_locked. We could unify the paths with
and without the sync thread further, because the non-sync-thread path
just needs this extra step of constructing a list of signaled fences.
But we will save that for the future and focus on fixing the
performance regression here.
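The fix is shaped roughly like this (simplified; do_wait stands in for
the real signal check):

  struct list_head signaled;
  struct vrend_fence *fence, *stor;

  list_inithead(&signaled);
  LIST_FOR_EACH_ENTRY_SAFE(fence, stor, &vrend_state.fence_list, fences) {
     if (!do_wait(fence, false /* can_block */))
        break;
     list_del(&fence->fences);
     list_addtail(&fence->fences, &signaled);
  }

  /* every fence on "signaled" has signaled, so the last-fence check in
   * need_fence_retire_signal_locked is valid against this list */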
Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Reviewed-by: Gert Wollny <gert.wollny@collabora.com>
Make sure that the passed buffer size is not negative and that
evaluating the buffer size in bytes doesn't overflow. This ensures
that buf_offset in the decoding loop can't wrap around when it is
updated.
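The added validation amounts to something like this sketch:

  int virgl_renderer_submit_cmd(void *buffer, int ctx_id, int ndw)
  {
     /* reject negative sizes and byte counts that would overflow */
     if (ndw < 0 || (unsigned)ndw > UINT32_MAX / sizeof(uint32_t))
        return EINVAL;

     uint32_t size = (uint32_t)ndw * sizeof(uint32_t); /* in bytes */
     /* ...the decode loop can now advance buf_offset safely... */
  }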
v2: - move check to virgl_renderer_submit_cmd (Chia-I)
- remove the size conversion on both ends
v3: - keep conversion to size in bytes (Chia-I)
- explicitly convert to uint32_t to silence a warning
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Chia-I Wu <olvaffe@gmail.com>
Avoids compiling shader variants in the host driver if they will never
be used for rendering.
Wait to register shaders with the host GL driver until after shader
dependencies have been resolved and the selected set of variants is
known to be used for rendering.
The cost of this workaround is that TGSI to GLSL conversion is still
performed for every variant (observed to happen twice for every
vertex/fragment shader in some cases), but this is expected to be much
cheaper than calling out to the host driver to compile unused variants.
Workaround for #180.
Signed-off-by: Ryan Neph <ryanneph@google.com>
Reviewed-by: Gert Wollny <gert.wollny@collabora.com>
Shader programs are submitted for linking in the host driver twice as
frequently as for native applications due to circular shader
dependencies, immediate TGSI conversion, and explicit shader variant
selection ordering at draw-time.
Circular dependencies are resolved by re-running shader variant
selection twice before calling glUseProgram() in the host.
The cost of this workaround is that GLSL is still emitted to the host
driver for compilation into intermediate/native instructions even for
unused shader variants, but the extraneous linking has been eliminated.
Workaround for #180.
v2: changed shader select order on second pass; only select frag shader
once.
Signed-off-by: Ryan Neph <ryanneph@google.com>
Reviewed-by: Gert Wollny <gert.wollny@collabora.com>
This is useful when one wants to trace events like frame begin and
end.
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Rohan Garg <rohan.garg@collabora.com>
This implements accepting a marker message string from the
guest and passing it to the host implementation.
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Rohan Garg <rohan.garg@collabora.com>
On hosts that support the NVX_GPU_MEMORY_INFO extension, we can query
the total GPU memory through GL_GPU_MEMORY_INFO_DEDICATED_VIDMEM_NVX.
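The query itself is a one-liner; the value is reported in kilobytes
(sketch):

  #include <epoxy/gl.h>

  #ifndef GL_GPU_MEMORY_INFO_DEDICATED_VIDMEM_NVX
  #define GL_GPU_MEMORY_INFO_DEDICATED_VIDMEM_NVX 0x9047
  #endif

  static GLint total_dedicated_vidmem_kb(void)
  {
     GLint kb = 0;
     glGetIntegerv(GL_GPU_MEMORY_INFO_DEDICATED_VIDMEM_NVX, &kb);
     return kb;
  }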
Signed-off-by: Rohan Garg <rohan.garg@collabora.com>
Reviewed-by: Gert Wollny <gert.wollny@collabora.com>
Per-context fences signal in creation order only within a context. Two
per-context fences in two contexts might signal in any order.
When a per-context fence is created, a fence cookie can be specified.
The cookie will be passed to the write_context_fence callback. This
replaces the fence_id used in ctx0 fencing.
write_context_fence is called on each fence unless the fence has
VIRGL_RENDERER_FENCE_FLAG_MERGEABLE set. When the bit is set,
write_context_fence might be skipped.
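A sketch of the shape from an API user's point of view (see
virglrenderer.h for the authoritative signatures):

  /* host callback, invoked with the cookie given at creation time */
  static void write_context_fence(void *cookie, uint32_t ctx_id,
                                  uint64_t queue_id, void *fence_cookie)
  {
     /* signal whatever fence_cookie identifies */
  }

  /* creating a per-context fence whose retirement may be merged */
  virgl_renderer_context_create_fence(ctx_id,
                                      VIRGL_RENDERER_FENCE_FLAG_MERGEABLE,
                                      queue_id, fence_cookie);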
Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Acked-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Louis-Francis Ratté-Boulianne <lfrb@collabora.com>
Other than hooking things up, vrend_renderer_check_fences and
vrend_renderer_export_ctx0_fence have some non-trivial changes. This is
because fence->ctx is no longer always vrend_state.ctx0. It can also
point to a user context or be NULL now.
Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Acked-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Louis-Francis Ratté-Boulianne <lfrb@collabora.com>
A fence cookie (void *) is more flexible for users than a fence id.
When per-context fencing is introduced, its API will use fence cookies.
Conversions between fence ids and fence cookies are free casts.
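Concretely, the conversions are just casts (a sketch):

  void *cookie = (void *)(uintptr_t)fence_id;        /* id -> cookie */
  uint32_t id  = (uint32_t)(uintptr_t)fence_cookie;  /* cookie -> id */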
Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Acked-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Louis-Francis Ratté-Boulianne <lfrb@collabora.com>
Replace the unused ctx_id with a vrend_context pointer. When a
vrend_context is destroyed, remove fences associated with the context to
avoid dangling pointers.
For now, the context is always ctx0 which is never destroyed before the
fences. That will change later when per-context fencing is introduced.
Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Acked-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Louis-Francis Ratté-Boulianne <lfrb@collabora.com>
Being an interface of virgl_context, this provides per-context fencing.
This is motivated by Vulkan, where each context (i.e., VkInstance) is
independent from one another. Fences submitted to different contexts
do not signal in the order they are submitted. This is also true, in
theory and in practice, for OpenGL when SW/HW scheduling or context
priority is considered.
Besides being per-context, fences in Vulkan are also per-VkDevice and
per-VkQueue within a context. This interface uses an opaque uint64_t to
identify the queue a fence is submitted to, similar to how an opaque
uint64_t is used to identify a blob.
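The addition to the interface looks roughly like this (simplified):

  struct virgl_context {
     /* ... */

     /* queue_id is opaque to virglrenderer; fences on the same queue
      * signal in submission order */
     int (*submit_fence)(struct virgl_context *ctx,
                         uint32_t flags,
                         uint64_t queue_id,
                         void *fence_cookie);
  };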
Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Acked-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Louis-Francis Ratté-Boulianne <lfrb@collabora.com>
There are two callers of do_readpixels, and both of them create a new
framebuffer object and destroy it immediately afterwards. This change
moves those operations into do_readpixels.
One thing to note is that glReadBuffer(GL_COLOR_ATTACHMENT0) used to be
called after binding the newly created framebuffer object and before
calling do_readpixels in vrend_transfer_send_readpixels. The call is
simply deleted because a newly created framebuffer object already has
GL_COLOR_ATTACHMENT0 as its read buffer.
Signed-off-by: Akihiko Odaki <akihiko.odaki@gmail.com>
Reviewed-by: Gert Wollny <gert.wollny@collabora.com>
A vrend resource could cache a framebuffer object to read a texture
back. However, a vrend resource is independent of any OpenGL context,
while a framebuffer object is not, because it is a container object.
On QEMU, a framebuffer object is typically cached on each renderer
context while its destruction is performed on "context 0", resulting
in the destruction of a different framebuffer object whose name is
identical but which was created in "context 0".
This change simply removes the caching mechanism, which eliminates the
need for a context switch when destroying a cached framebuffer object
as well as some handling of cache misses.
Signed-off-by: Akihiko Odaki <akihiko.odaki@gmail.com>
Reviewed-by: Gert Wollny <gert.wollny@collabora.com>
Allow querying detailed GPU memory information through the
GL_ATI_meminfo and GL_NVX_gpu_memory_info extensions.
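For GL_ATI_meminfo, for example, each query returns four values, the
first being the total free memory in kilobytes (sketch):

  #include <epoxy/gl.h>

  #ifndef GL_TEXTURE_FREE_MEMORY_ATI
  #define GL_TEXTURE_FREE_MEMORY_ATI 0x87FC
  #endif

  static GLint free_texture_memory_kb(void)
  {
     GLint info[4] = { 0 };
     glGetIntegerv(GL_TEXTURE_FREE_MEMORY_ATI, info);
     return info[0];
  }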
Signed-off-by: Rohan Garg <rohan.garg@collabora.com>
Reviewed-by: Gert Wollny <gert.wollny@collabora.com>