
\documentclass{article}
\usepackage{palatino}

\author{Kristian Høgsberg\\
\texttt{krh@bitplanet.net}
}

\title{The Wayland Display Server}

\begin{document}

\maketitle

\section{Wayland Overview}

\begin{itemize}
\item Wayland is a protocol for a new display server.
\item Wayland is also an implementation of that protocol.
\end{itemize}

\subsection{Replacing X11}

Over the last 10 years, a lot of functionality has slowly moved out
of the X server and into libraries or kernel drivers. It started with
freetype and fontconfig providing an alternative to the core X fonts,
and with direct rendering OpenGL, where the graphics driver lives in a
client-side library. Then cairo came along and provided a modern 2D
rendering library independent of X, and compositing managers took over
control of the rendering of the desktop. Recently, with GEM and KMS in
the Linux kernel, we can do modesetting outside X and schedule several
direct rendering clients. The end result is a highly modular graphics
stack.

Wayland is a new display server building on top of all those
components. We're trying to distill out the functionality in the X
server that is still used by the modern Linux desktop. This turns out
to be not a whole lot. Applications can allocate their own off-screen
buffers and render their window contents by themselves. In the end,
what's needed is a way to present the resulting window surface to a
compositor and a way to receive input. This is what Wayland provides,
by piecing together the components already in the ecosystem in a
slightly different way.

X will always be relevant, in the same way Fortran compilers and VRML
browsers are, but it's time that we think about moving it out of the
critical path and provide it as an optional component for legacy
applications.


\section{Wayland protocol}

\subsection{Basic Principles}

The Wayland protocol is an asynchronous, object-oriented protocol.  All
requests are method invocations on some object.  Each request includes
an object id that uniquely identifies an object on the server.  Each
object implements an interface, and each request includes an opcode
that identifies which method in the interface to invoke.

The wire protocol is determined from the C prototypes of the requests
and events.  There is a straightforward mapping from the C types to
the byte layout of the request written to the socket.  It is
possible to map the events and requests to function calls in other
languages, but that hasn't been done at this point.
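
As a rough illustration of such a mapping, the following C sketch packs
a request into a byte buffer.  The header layout (a 32-bit object id, a
16-bit opcode and a 16-bit total size), the names
\texttt{request\_header} and \texttt{marshal\_request}, and the
restriction to 32-bit arguments in host byte order are all assumptions
made for this example; they are not taken from the protocol itself.

\begin{verbatim}
```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical header layout, assumed for illustration only. */
struct request_header {
    uint32_t object_id;   /* identifies the target object on the server */
    uint16_t opcode;      /* selects the method in the object's interface */
    uint16_t size;        /* total request size in bytes, header included */
};

/*
 * Pack a request carrying n_args 32-bit arguments into buf, in host
 * byte order.  Returns the number of bytes written, or 0 if the
 * request does not fit in the buffer or in the 16-bit size field.
 */
static size_t
marshal_request(void *buf, size_t buf_len, uint32_t object_id,
                uint16_t opcode, const uint32_t *args, size_t n_args)
{
    struct request_header header;
    size_t size = sizeof header + n_args * sizeof args[0];

    assert(buf != NULL);

    if (size > buf_len || size > UINT16_MAX)
        return 0;

    header.object_id = object_id;
    header.opcode = opcode;
    header.size = (uint16_t) size;

    memcpy(buf, &header, sizeof header);
    if (n_args > 0)
        memcpy((char *) buf + sizeof header, args,
               n_args * sizeof args[0]);

    return size;
}
```
\end{verbatim}

A caller would write the returned number of bytes from the buffer to
the server socket; a return value of 0 signals that the request did
not fit.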

The server sends events back to the client, and each event is emitted
from an object.  Events can be error conditions.  The event includes
the object id and the event opcode, from which the client can
determine the type of event.  Events are generated either in response
to a request (in which case the request and the event constitute a
round trip) or spontaneously when the server state changes.

\begin{itemize}
\item state is broadcast on connect, events are sent out when state
  changes.  client must listen for these changes and cache the state.
  no need (or mechanism) to query server state.
\item server will broadcast presence of a number of global objects,
  which in turn will broadcast their current state
\end{itemize}
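
The event handling described above could be sketched on the client
side roughly as follows: an incoming event is routed by object id and
event opcode to a handler, and the handler caches the broadcast state
locally so that it never has to be queried.  All names here
(\texttt{proxy}, \texttt{client\_dispatch}, the \texttt{output}
interface with a single geometry event) are hypothetical and chosen
only for this example.

\begin{verbatim}
```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_OBJECTS 16   /* arbitrary limit for this sketch */

typedef void (*event_handler_t)(void *data, const uint32_t *args);

struct interface {
    const char *name;
    size_t n_events;
    const event_handler_t *handlers;   /* indexed by event opcode */
};

/* Client-side stand-in for an object that lives on the server. */
struct proxy {
    uint32_t id;
    const struct interface *interface;
    void *data;                        /* where the cached state lives */
};

struct client {
    struct proxy objects[MAX_OBJECTS];
    size_t n_objects;
};

/* State cached from a hypothetical output geometry event. */
struct output_state {
    uint32_t width, height;
    int n_updates;
};

static void
handle_output_geometry(void *data, const uint32_t *args)
{
    struct output_state *state = data;

    state->width = args[0];
    state->height = args[1];
    state->n_updates++;
}

static const event_handler_t output_handlers[] = {
    handle_output_geometry             /* event opcode 0 */
};

static const struct interface output_interface = {
    "output", 1, output_handlers
};

/* Route one event; returns 0, or -1 for an unknown object or opcode. */
static int
client_dispatch(struct client *client, uint32_t id, uint16_t opcode,
                const uint32_t *args)
{
    size_t i;

    for (i = 0; i < client->n_objects; i++) {
        struct proxy *proxy = &client->objects[i];

        if (proxy->id != id)
            continue;
        if (opcode >= proxy->interface->n_events)
            return -1;
        assert(proxy->interface->handlers[opcode] != NULL);
        proxy->interface->handlers[opcode](proxy->data, args);
        return 0;
    }

    return -1;
}
```
\end{verbatim}

A toolkit would typically point \texttt{data} at its own native object,
so the cached state ends up where the rest of the toolkit expects it.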

\subsection{Connect Time}

\begin{itemize}
\item no fixed format connect block, the server emits a bunch of events
  at connect time
\item presence events for global objects: output, compositor, input devices
\end{itemize}

\subsection{Security and Authentication}

\begin{itemize}
\item mostly about access to underlying buffers, need new drm auth
  mechanism (the grant-to ioctl idea), need to check the cmd stream?
\item getting the server socket depends on the compositor type: it
  could be a system-wide name, it could come through fd passing on the
  session D-Bus, or the client is forked by the compositor and the fd
  is already open.
\end{itemize}

\subsection{Creating Objects}

\begin{itemize}
\item client allocates object ID, uses range protocol
\item server tracks how many IDs are left in current range, sends new
  range when client is about to run out.
\end{itemize}
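
The range scheme above might look roughly like the following C sketch.
The range size, the low-water mark, the convention that id 0 is never
valid, and all type and function names are assumptions made for
illustration; the notes above do not fix any of them.

\begin{verbatim}
```c
#include <assert.h>
#include <stdint.h>

#define RANGE_SIZE     256u   /* assumed number of ids per range */
#define LOW_WATER_MARK 16u    /* assumed refill threshold */

/* Half-open id range [next, end). */
struct id_range {
    uint32_t next, end;
};

/* Client side: the range in use plus one spare sent ahead of time. */
struct id_pool {
    struct id_range current;
    struct id_range spare;
};

static uint32_t
range_left(const struct id_range *range)
{
    return range->end - range->next;
}

/* Client side: handle a (hypothetical) new-range event. */
static void
id_pool_add_range(struct id_pool *pool, uint32_t base)
{
    struct id_range range = { base, base + RANGE_SIZE };

    if (range_left(&pool->current) == 0)
        pool->current = range;
    else
        pool->spare = range;
}

/* Client side: allocate an id, or return 0 if no ids are left. */
static uint32_t
id_pool_alloc(struct id_pool *pool)
{
    if (range_left(&pool->current) == 0) {
        pool->current = pool->spare;
        pool->spare.next = pool->spare.end = 0;
    }
    if (range_left(&pool->current) == 0)
        return 0;

    return pool->current.next++;
}

/* Server side: how many ids the client has left, and the next base. */
struct id_tracker {
    uint32_t next_base;
    uint32_t left;
};

/*
 * Server side: record that the client used one id.  Returns the base
 * of a new range to send when the client is about to run out, or 0
 * when no new range is needed yet.
 */
static uint32_t
id_tracker_note_used(struct id_tracker *tracker)
{
    uint32_t base;

    assert(tracker->left > 0);
    tracker->left--;
    if (tracker->left != LOW_WATER_MARK)
        return 0;

    base = tracker->next_base;
    tracker->next_base += RANGE_SIZE;
    tracker->left += RANGE_SIZE;

    return base;
}
```
\end{verbatim}

Because the server sends the next range before the current one is
exhausted, the client never has to wait for a round trip in order to
create an object.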

\subsection{Compositor}

\begin{itemize}
\item a global object
\item broadcasts drm file name, or at least a string like drm:/dev/card0
\item commit/ack/frame protocol
\end{itemize}

\subsection{Surface}

created by the client
\begin{itemize}
\item attach
\item copy
\item damage
\item destroy
\item input region, opaque region
\item set cursor
\end{itemize}

\subsection{Input Group}

global object

\begin{itemize}
\item input group, keyboard, mouse
\item keyboard map, change events
\item pointer motion
\item enter, leave, focus
\item xkb on wayland
\item multi pointer wayland
\end{itemize}


\subsection{Output}

\begin{itemize}
\item global objects
\item a connected screen
\item laid out in a big coordinate system
\item basically xrandr over wayland
\end{itemize}
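
To make the idea of outputs laid out in one big coordinate system
concrete, here is a small C sketch that finds the output containing a
given point.  The \texttt{output} structure and the
\texttt{output\_at} helper are hypothetical; they only assume that each
output occupies a rectangle in the shared coordinate space.

\begin{verbatim}
```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One connected screen, placed in the shared coordinate system. */
struct output {
    int32_t x, y;            /* top-left corner */
    int32_t width, height;   /* size in pixels */
};

/* Return the first output containing (x, y), or NULL if none does. */
static const struct output *
output_at(const struct output *outputs, size_t n_outputs,
          int32_t x, int32_t y)
{
    size_t i;

    assert(outputs != NULL || n_outputs == 0);

    for (i = 0; i < n_outputs; i++) {
        const struct output *o = &outputs[i];

        if (x >= o->x && x < o->x + o->width &&
            y >= o->y && y < o->y + o->height)
            return o;
    }

    return NULL;
}
```
\end{verbatim}

With two outputs placed side by side, a point on the shared edge
belongs to the output on the right, since each rectangle excludes its
own right and bottom edges.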

\section{Types of compositors}

\subsection{System Compositor}

\begin{itemize}
\item ties in with graphical boot
\item hosts different types of session compositors
\item lets us switch between multiple sessions (fast user switching,
  secure/personal desktop switching)
\item multiseat
\item linux implementation using libudev, egl, kms, evdev, cairo
\item for fullscreen clients, the system compositor can reprogram the
  video scanout address to source from the client-provided buffer.
\end{itemize}

\subsection{Session Compositor}

\begin{itemize}
\item nested under the system compositor.  nesting is feasible because
  the protocol is async, a roundtrip would break nesting
\item gnome-shell
\item moblin
\item compiz?
\item kde compositor?
\item text mode using vte
\item rdp session
\item fullscreen X session under wayland
\item can run without system compositor, on the hw where it makes
  sense
\item root window less X server, bridging X windows into a wayland
  session compositor
\end{itemize}

\subsection{Embedding Compositor}

X11 lets clients embed windows from other clients, or lets clients copy
pixmap contents rendered by another client into their window.  This is
often used for applets in a panel, browser plugins and similar.
Wayland doesn't directly allow this, but clients can communicate GEM
buffer names out-of-band, for example, using D-Bus or as command line
arguments when the panel launches the applet.  Another option is to
use a nested wayland instance.  For this, the wayland server will have
to be a library that the host application links to.  The host
application will then pass the wayland server socket name to the
embedded application, and will need to implement the wayland
compositor interface.  The host application composites the client
surfaces as part of its window, that is, in the web page or in the
panel.  The benefit of nesting the wayland server is that it provides
the requests the embedded client needs to inform the host about buffer
updates and a mechanism for forwarding input events from the host
application.

\begin{itemize}
\item firefox embedding flash by being a special purpose compositor to
  the plugin
\end{itemize}

\section{Implementation}

This section describes what is currently implemented.

\subsection{Wayland Server Library}

\texttt{libwayland-server.so}

\begin{itemize}
\item implements protocol side of a compositor
\item minimal, doesn't include any rendering or input device handling
\item helpers for running on egl and evdev, and for nested wayland
\end{itemize}

\subsection{Wayland Client Library}

\texttt{libwayland.so}

\begin{itemize}
\item minimal, designed to support integration with real toolkits such as
  Qt, GTK+ or Clutter.
\item doesn't cache state, but lets the toolkits cache server state in
  native objects (GObject or QObject or whatever).
\end{itemize}

\subsection{Wayland System Compositor}

\begin{itemize}
\item implementation of the system compositor
\item uses libudev, eagle (egl), evdev and drm
\item integrates with ConsoleKit, can create new sessions
\item allows multi seat setups
\item configurable through udev rules and maybe /etc/wayland.d type thing
\end{itemize}

\subsection{X Server Session}

\begin{itemize}
\item xserver module and driver support
\item uses wayland client library
\item same X.org server as we normally run, the front buffer is a wayland
  surface but all accel code, 3d and extensions are there
\item when full screen the session compositor will scan out from the X
  server wayland surface, at which point X is running pretty much as it
  does natively.
\end{itemize}

\end{document}