Allow the caller to provide a pre-opened DRM file descriptor, to be
used by the EGL context.
Bump callback interface version for compatibility.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Allow resources created externally (eg gbm created buffers as
dma bufs) to be used. As an example, crosvm
(https://chromium.googlesource.com/chromiumos/platform/crosvm/)
will intercept resource creation to use minigbm to allocate
buffers that its compositor is able to properly handle since it
only supports compositing with buffers allocated via minigbm.
This patch allows direct rendering to those buffers without
requiring an extra copy.
v2: Handle missing extension better.
v3: Update commit message with more details on usage.
Signed-off-by: David Riley <davidriley@chromium.org>
Reviewed-by: Dave Airlie <airlied@redhat.com>
virgl_renderer_resource_get_info() doesn't return the underlying image
stride. eglExportDMABUFImageMESA() does. This fixes imports failing
due to a stride mismatch under certain resolutions. (note that qemu
uses its own export methods, perhaps because Gerd noticed the problem
with virgl egl export functions)
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
apitrace & epoxy with EGL don't work nicely together, using glx works
around this issue for now (disabled by default).
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Instead of polling the fences regularly, have a thread
that blocks for a single fence using a separate shared
context, then uses eventfd to wake up the main thread
when something happens.
Inside the guest, glmark2 typicially runs twice as fast with the thread
sync. Although in general, the performances seems to be about +30%. The
benefits is mostly for CPU-bounds tasks (when main the thread hits 100%)
A naive perf stat of the vtest renderer with glmark2 "build" test with a
fixed number of frames (500) results in the following stats data:
(do not value timing related informations, since the renderer is ran and
stopped manually)
without thread:
3032.282265 task-clock (msec) # 0.420 CPUs utilized
4,277 context-switches # 0.001 M/sec
102 cpu-migrations # 0.034 K/sec
9,020 page-faults # 0.003 M/sec
7,884,098,254 cycles # 2.600 GHz
4,440,126,451 stalled-cycles-frontend # 56.32% frontend cycles idle
<not supported> stalled-cycles-backend
11,024,091,578 instructions # 1.40 insns per cycle
# 0.40 stalled
# cycles per insn
1,091,831,588 branches # 360.069 M/sec
5,426,846 branch-misses # 0.50% of all branches
with thread:
3403.592921 task-clock (msec) # 0.452 CPUs utilized
7,145 context-switches # 0.002 M/sec
410 cpu-migrations # 0.120 K/sec
6,191 page-faults # 0.002 M/sec
7,475,038,064 cycles # 2.196 GHz
4,487,043,071 stalled-cycles-frontend # 60.03% frontend cycles idle
<not supported> stalled-cycles-backend
9,925,205,494 instructions # 1.33 insns per cycle
# 0.45 stalled
# cycles per insn
834,375,503 branches # 245.146 M/sec
4,919,995 branch-misses # 0.59% of all branches
Signed-off-by: Marc-André Lureau <marcandre.lureau@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
This merges the error/bounds checking on the transfer
code, but keeps the same API, it also uses a struct
to pass through the transfer info.
this also passes a return value out to make testing easier.