Learning to Use X11
When I started programming many years ago, on a system very, very different from what we use now, producing graphical output from programs was easy; all the necessary commands were usually built right into the language. Later, when I moved to C and UNIX, things were no longer simple. Not only does C not include any graphics manipulation functions, per se, but all graphical output in UNIX has to go through the standard UNIX windowing system: the X Window System, Version 11, Release 6.6 (its current incarnation), or X11 for short.
Unfortunately, X11 is rather hard to approach. First of all, it is huge: the O'Reilly series of printed manuals and references includes six volumes with close to 6,000 pages among them, not counting the additional four volumes and 2,000 pages when we include Motif. I know of hardly any other language, library, framework or application that big. Second, it is based on concepts quite different from the ones currently in use, which makes it difficult to understand. To top it all off, the X11 documentation uses a specialized and not-so-obvious terminology. Therefore, we need to establish a minimal glossary.
The X Window System was specifically designed to allow the graphical output of a program running on one machine to appear on a different machine, possibly one that is physically remote and/or a different make and architecture. In other words, X11 was designed to be a platform-independent, networked graphics framework.
In X11 parlance, the "display" denotes the box on which the graphical output will appear. Interestingly, an individual display is defined by the X11 documentation as having exactly one keyboard and one pointer (i.e., mouse), but potentially multiple CPUs, monitors, etc.
The "screen" corresponds to the actual physical display device; in most cases this will be a monitor. X11 allows for an arbitrary number of screens to be connected to each display. Think of a workstation with two monitors or a departmental server, connected to a larger number of (relatively dumb) X terminals.
Finally, a "window" is a rectangular area of the screen that can be used for input and output. If the rectangular area is not directly associated with a screen, but instead resides in memory, it is referred to as a "pixmap". Pixmaps and windows share the property of being "drawable" and can be used interchangeably in some function calls. It is important to remember that to X11 a window is merely a rectangular area on the screen. As such, it does not include things like titlebars, scrollbars and other GUI elements that we have come to associate with the word window. If these elements are present, they are controlled by a different program called a window manager.
Every GUI-oriented computer ships with a mouse or equivalent. When X11 came about, however, the development of graphical input devices was still in its infancy. Consequently, the X11 documentation always speaks (somewhat bashfully) of a "pointer", a generic term for mice, trackballs, digitizing tablets or other yet-to-be-invented graphical input devices.
A final cause of confusion is the specific usage of the words client and server in X11. A "client" is any application that creates data for graphical output. The "server" is the program that manages the shared resource accessed by all clients, namely the (finite) amount of screen real estate. The unfortunate consequence of this naming convention is that the (X11) client typically executes on the server (machine), while the (X11) server runs on the client (computer).
We are now ready to write a simple example program that demonstrates the most important concepts and functions for programming with X11. The program will pop up a window on the screen, draw an X in it and disappear after a mouse-click into the window area. The following is a line-by-line explanation of the program given in Listing 1.
Listing 1. Example Program for Concepts and Functions in X11
Line 1: Xlib.h is the most important header file for X11 programming. It defines several structs and macros used throughout an X11 program and provides function prototypes for all the basic functions in the library. Other headers are part of X11 as well. If those are needed, Xlib.h usually has to be included before any of the other headers because they depend on it. Strangely, the dependent headers do not themselves include Xlib.h.
Lines 5 and 6: Display is a struct defined in Xlib.h that represents the display, i.e., the box on which the graphical output is meant to appear. The library function XOpenDisplay attempts to establish a connection to the X11 server that controls this display. As its argument, it takes a zero-terminated string with the display_name, in the following format: hostname:servernumber.screennumber. If the argument is NULL, the display_name defaults to the value of the environment variable DISPLAY. If no connection can be established, XOpenDisplay returns NULL.
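To make the display_name format concrete, here is a small sketch that splits such a string into its three parts. The names DisplayName and parse_display_name are made up for this illustration and are not part of Xlib; XOpenDisplay does its own parsing and may accept forms that this sketch rejects. The sketch assumes that an omitted hostname means the local machine and that an omitted screen number means screen 0.

```cpp
#include <cassert>
#include <cctype>
#include <cstdlib>
#include <string>

// Hypothetical type for this illustration; not an Xlib struct.
struct DisplayName {
    std::string host;  // empty means "the local machine"
    int server;        // which X server on that host
    int screen;        // which screen of that server (0 if omitted)
    bool valid;        // false if the string did not match the simple format
};

// True for a non-empty string of at most four decimal digits,
// which keeps the later conversion to int trivially safe.
static bool small_number(const std::string &s) {
    if (s.empty() || s.size() > 4) return false;
    for (char c : s)
        if (!std::isdigit(static_cast<unsigned char>(c))) return false;
    return true;
}

// Splits "hostname:servernumber.screennumber" into its parts.
DisplayName parse_display_name(const std::string &name) {
    DisplayName d{"", 0, 0, false};

    // The last colon separates the host from the numbers.
    std::string::size_type colon = name.rfind(':');
    if (colon == std::string::npos) return d;
    d.host = name.substr(0, colon);

    // After the colon: servernumber, optionally followed by .screennumber
    std::string rest = name.substr(colon + 1);
    std::string::size_type dot = rest.find('.');
    std::string server = rest.substr(0, dot);
    std::string screen =
        (dot == std::string::npos) ? "0" : rest.substr(dot + 1);

    if (!small_number(server) || !small_number(screen)) return d;
    d.server = std::atoi(server.c_str());
    d.screen = std::atoi(screen.c_str());
    d.valid = true;
    return d;
}
```

With this helper, ":0" is read as server 0, screen 0 on the local machine, while "remote.example.org:0.1" (a placeholder hostname) names the second screen of the first server on another host.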
Line 9: we obtain an identifier to be used for the screen. DefaultScreen is not a function; it is a macro defined in Xlib.h. X11 provides macros like this as accessors for the elements of the Display struct that are accessible to the programmer. Members of this and other library structs should never be accessed directly, only through the provided macros. Variables for which no macros are provided should be considered "private".
Lines 10 and 11: colors in X11 are identified by integer numbers. Since X11 is meant to be platform independent, it does not make any assumptions about the capabilities of any device (i.e., the screen, not the display) to render colors. Only two colors are guaranteed to be available, black and white; Xlib provides the macros BlackPixel and WhitePixel to look up their values for a given screen. Note that the actual appearance of these values does not have to correspond to black and white pixels; think of those old monitors with green (or amber) letters on a black background.
Line 14: finally, we are ready to create a window. The data type Window is not a struct as one might assume. Rather it is typedefed to some integer data type and merely provides an identifier for a window.
Line 15: each window is the child of some other window and is geometrically contained by it. Here we set the root window (the entire screen) of the default screen as the parent window.
Line 16: the coordinates of the upper-left-hand corner of the new window with respect to the parent window, designated in pixels. X11 uses graphical coordinates with the origin in the upper left and with the positive X-axis running to the right and the positive Y-axis running down. However, since the placement of new windows is under control of the window manager, the new window can pop up pretty much anywhere on the screen, regardless of the values of these two arguments--expect surprises.
Line 17: the width and height (in that order) of the new window, in pixels.
Line 18: the width and color of the border of the window. This is not the border that may be added by the window manager.
Line 19: the color of the background of the window.
Line 21: so far we have created a data structure for the window, but only in memory. In X11 a window is not automatically displayed when it is created. Making the window visible is a separate process, called mapping. Accordingly, the function XMapWindow maps the window on the specified display. Note that it is not necessary to specify the screen again; when the window was created, it was explicitly created as child of a parent window, which itself is associated with a specific screen.
To understand the next few lines, we have to reconsider one of the basic tenets of X11: it is a networked graphics framework in which the client (such as the program in Listing 1) and the graphics server might reside on physically remote machines. In such a situation, the failure of the network is a possibility that has to be anticipated. The XMapWindow command is issued on the client but executed on the server (note how X11's way of using those terms makes much more sense in the present context), which might not be available when the command is issued. The solution to this situation is to make the client/server communication asynchronous or event-driven. That is, XMapWindow sends its request to the server and returns immediately, without waiting for the completion of the server process. It is then up to the client to wait for the server to report that the request has been carried out. The following lines do exactly that.
Lines 24 and 25: first, we need to select what kinds of events (such as mouse clicks, mouse movements, keystrokes, etc.) we are interested in. Xlib.h defines a number of bit masks for the different kinds of events; they can be combined with a bitwise OR if we are interested in several kinds of events. Here we are interested in the success of the previous mapping operation, so we select the appropriate bit mask and inform the server of our choice, using XSelectInput. Each call to this function overwrites any previous setting.
Lines 27-30: XNextEvent blocks until an event occurs, sending the program to sleep so it does not consume CPU time while waiting. As a side effect, all non-empty output buffers are flushed with each call to XNextEvent, obviating explicit calls to XFlush. Since the StructureNotifyMask combines several events that are structural changes to the window and not just mapping events, we have to loop until the proper type of event has been reported. (If this presentation leaves questions, don't despair: hopefully a fuller treatment of X11 events will be the subject of future articles.)
Lines 33-36: now that we can be certain the window is actually visible on the screen, we are finally ready to draw something to it. First, we define a graphics context (GC), a struct that comprises information about the appearance (such as color, linestyle, etc.) of the graphical elements. Attributes of the graphics context can be set at creation by providing a bit mask that names the attributes as the third argument and a pointer to an XGCValues struct holding their values as the fourth. Alternatively, attributes of the GC can be set individually by calling the appropriate functions.
Lines 39 and 40: draw two lines, forming an X on the screen. There are similar functions to draw points, arcs and rectangles, all taking coordinates relative to the current window. Remember to use these functions only with windows that you know are actually visible, not hidden or minimized. Drawing to an invisible window has no effect.
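Because the Y-axis points down, it is easy to mix up the endpoints of the two diagonals. The following sketch computes them for a window of a given size. Segment, Cross, cross_for_window and the margin parameter are invented for this illustration and are not Xlib names; each Segment simply holds the four coordinates that a line-drawing call such as XDrawLine expects.

```cpp
#include <cassert>

// One line segment in window coordinates: pixels, origin in the
// upper-left corner, x growing to the right, y growing downward.
struct Segment {
    int x1, y1, x2, y2;
};

// The two strokes of the X.
struct Cross {
    Segment down;  // from upper left to lower right
    Segment up;    // from lower left to upper right
};

// Diagonals of a width-by-height window, inset by `margin` pixels.
Cross cross_for_window(int width, int height, int margin) {
    int left = margin;
    int top = margin;
    int right = width - 1 - margin;    // the last pixel column is width - 1
    int bottom = height - 1 - margin;  // the last pixel row is height - 1

    Cross c;
    c.down = Segment{left, top, right, bottom};
    c.up = Segment{left, bottom, right, top};
    return c;
}
```

For a 200 by 100 window and a margin of 10, the first stroke runs from (10, 10) to (189, 89) and the second from (10, 89) to (189, 10).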
Lines 43-48: now that the graph (such as it is) is complete, the program should wait for the user to click on the window to terminate it. Accordingly, we set the appropriate event mask and wait for the desired event. Note again that XNextEvent flushes all output buffers, so we are certain that the two lines indeed appeared before the mouse-click event occurred.
Lines 51 and 52: part of well-behaved C programming is responsible resource management. We therefore explicitly free the allocated resources, rather than let the program fall off the end and have the operating system clean up after us. As a side effect, XDestroyWindow unmaps the window in question automatically, so it disappears from the screen.
This concludes our analysis of the example program. If the source file is called x1.cc, the compile command would take the form:
g++ -I /usr/X11/include -L /usr/X11/lib -o x1 x1.cc -lX11
I have been using C++ throughout, so that I could use end-of-line comments and define variables wherever I needed them. The compiler option -I instructs the compiler where to look for the include files, while the -L option tells the linker where to find the libraries. The argument -lX11 (which has to go last on the command line) finally instructs the linker to link the library named libX11.so. Depending on your installation, the paths may be different.
The program should compile and run successfully. You may want to experiment with some of the parameter values or add further functionality. Don't be afraid to browse the Xlib documentation for additional functions or obtain more information on those functions used in the example program.
Two crucial issues that I had to omit in this article concern the handling of errors generated by X11, as well as the behavior of the window's contents when the window is resized, or minimized and maximized again (try it out). We will cover them and much more in future articles, when we take a closer look at X11 events.
There aren't many resources available for X11 programming, in particular not for beginners. Some useful ones are:
Christophe Tronche's X11 Pages: Includes a short tutorial covering essentially the same material that has been covered here. This site is interesting because the entire Xlib reference is available. I strongly recommend browsing it.
Brian Hammond's X11 Pages: Another personal X11 page with a tutorial, containing a little bit more information than Tronche's tutorial but less organized.
XFree86 Version 4.1.0: The official page for the current release of the open-source version of the X Window System. At the bottom of the page one finds the complete set of man pages.
Xlib Programming Manual by Adrian Nye. Volume 1 of the X Window System Series at O'Reilly. The book is wordy (it needs three chapters and almost 80 pages to cover not much more material than the present article), and the presentation is not always noted for its clarity. Nevertheless, it is probably still the standard introduction to X11 programming.
X Window Applications Programming by Eric F. Johnson and Kevin Reichard. One of the few truly introductory books on X11 programming but unfortunately out of print.
Philipp K. Janert has been programming for 15 years, both inside and outside of academia. He prefers C/C++ and UNIX but tries not to be religious about it. He holds a PhD in Theoretical Physics from the University of Washington, Seattle.
email: janert@ieee.org