Cross-Platform Software Development Using CMake
When looking through a large list of projects, one thing becomes apparent: a description of the build process always is stored in a group of files. These files can be simple shell scripts, Makefiles, Jam files, complex scripts based on projects like Autoconf and Automake or tool-specific files.
Recently another player came into the software building game, CMake. CMake is not directly a build process, because it uses native build tools, such as Make or even Microsoft Visual Studio. With support for numerous platforms, in-source and out-of-source builds, cross-library dependency checking, parallel building and simple configuration of header files, it significantly reduces the complexity of cross-platform software development and maintenance processes.
Looking at most software development projects, you are undoubtedly faced with a common problem. You have a bunch of source files, some depend on each other, and you want to make some final binary. Sometimes you want to do something more complicated, but in most cases, that is it.
So, you have this little project and you want to build it using your Linux desktop. You sit down and quickly write the following Makefile:
MyProgram: main.o
	cc -o MyProgram main.o -lm -lz

main.o: main.c main.h
	cc -c main.c
Once this file is ready, all you have to do is type make and the project is built. If any file is modified, all the necessary files also are rebuilt. Great, you can now congratulate yourself and go have a drink.
Except your boss comes by and says, "We just got this great new XYZ computer and you need to build the software on it." So, you copy files there, type make and receive the following error message:
cc: Command not found
You know there is a compiler on that XYZ computer and it is called cc-XYZ, so you modify Makefile and try again. But that system does not have zlib. So, you remove -lz and play with source code and on it goes.
As you see, the problem with the Makefile approach is that once the file is moved to a new platform, where the compiler name is not cc, where the compile flags are different or even where the syntax of the compile line is different, make breaks.
As a more elaborate example of this problem, let us check our favorite compression library, zlib. Zlib is a fairly simple library, consisting of 17 C source files and 11 header files. Compiling zlib is straightforward. All you need to do is compile each C file and then link them all together. You can write a Makefile for it, but then you have to modify it on every single platform.
Tools such as Autoconf and Automake do a good job of solving some of these problems on UNIX and UNIX-like platforms. They are, however, usually too complex. To make things even worse, in most projects developers end up writing shell scripts inside Autoconf input files. The results then quickly become dependent on assumptions the developer made. Because the result of Autoconf depends on the shell, these configuration files do not run on platforms where the Bourne Shell or another standard /bin/sh is not available. Autoconf and Automake also depend on several tools installed on the system.
CMake is a solution to these problems. As opposed to other similar tools, CMake makes few assumptions about the underlying system. It is written in fairly standard C++, so it should run on almost any modern platform. It does not use any other tool except the native build tools of the system.
For several platforms, such as Debian GNU/Linux, CMake is available as a standard package. For most other platforms, including UNIX, Mac OS X and Microsoft Windows, CMake binaries can be downloaded from the CMake Web site. To check whether CMake is installed, run the command cmake --help. This displays the version of CMake and the usage information. If the directory containing the CMake executable is not in the system path, you can run it by specifying the full path to the executable.
Now that CMake is installed, we can use it for our projects. For this, we have to prepare the CMake input file, which is called CMakeLists.txt. For example, this is a simple CMakeLists.txt for a possible project:
PROJECT(MyProject C)
ADD_LIBRARY(MyLibrary STATIC libSource.c)
ADD_EXECUTABLE(MyProgram main.c)
TARGET_LINK_LIBRARIES(MyProgram MyLibrary z m)
Using CMake to build the project is extremely easy. In the directory containing CMakeLists.txt, supply the following two commands, where path is the path to the source code.
cmake path
make
The cmake step reads the CMakeLists.txt file from the source directory and generates appropriate Makefiles for the system in the current directory. CMake also maintains a list of all header files that objects depend on, so dependency checking can be assured. If you need to add more source files, simply add them to the list. Once the Makefiles are generated, you do not have to run CMake by hand any more, because the generated Makefiles also record the dependency on CMakeLists.txt. If you want to make sure that dependencies are regenerated, you can always run make depend.
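As a rough sketch of what "add them to the list" means, assume the project gains two hypothetical source files, util.c and parser.c (neither name comes from the project above). The CMAKE_MINIMUM_REQUIRED line is also an assumption, added because current CMake releases expect it:

```cmake
# Sketch only: util.c and parser.c are invented file names for illustration.
CMAKE_MINIMUM_REQUIRED(VERSION 3.10)  # assumption: expected by current CMake releases
PROJECT(MyProject C)

# Adding a source file is a one-word change to the relevant target.
ADD_LIBRARY(MyLibrary STATIC libSource.c util.c)
ADD_EXECUTABLE(MyProgram main.c parser.c)
TARGET_LINK_LIBRARIES(MyProgram MyLibrary z m)
```

Because the generated Makefiles depend on CMakeLists.txt, typing make after such an edit should be enough to pick up the new files.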
CMake is in essence a simple interpreter. CMake input files have an extremely simple but powerful syntax. It consists of commands, primitive flow control constructs, macros and variables. All commands have exactly the same syntax:
COMMAND_NAME(ARGUMENT1 ARGUMENT2 ...)
For example, the ADD_LIBRARY command specifies that a library should be created. The first argument is the name of the library, the optional second argument says whether the library is static or shared, and the remaining arguments are the list of sources. Do you want a shared library? Simply use SHARED instead of STATIC. The list of all commands can be found in the CMake documentation.
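A minimal sketch of the two variants side by side follows. The target names MyLibraryStatic and MyLibraryShared are hypothetical, chosen only so that both targets can coexist in one file:

```cmake
# Sketch only: the two target names are invented for illustration.
CMAKE_MINIMUM_REQUIRED(VERSION 3.10)  # assumption: expected by current CMake releases
PROJECT(MyProject C)

# Static archive built from libSource.c.
ADD_LIBRARY(MyLibraryStatic STATIC libSource.c)

# Shared library built from the same source file.
ADD_LIBRARY(MyLibraryShared SHARED libSource.c)
```

When the STATIC or SHARED keyword is left out, CMake consults the BUILD_SHARED_LIBS variable to decide which kind to build.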
A few flow control constructs, such as IF and FOREACH, are available. The IF construct accepts several kinds of expressions, such as boolean operators (NOT, AND, OR), a check whether a command exists (COMMAND) and a check whether a file exists (EXISTS). Expressions, however, cannot contain other commands. An example of a commonly used IF statement would be:
IF(UNIX)
  IF(APPLE)
    SET(GUI "Cocoa")
  ELSE(APPLE)
    SET(GUI "X11")
  ENDIF(APPLE)
ELSE(UNIX)
  IF(WIN32)
    SET(GUI "Win32")
  ELSE(WIN32)
    SET(GUI "Unknown")
  ENDIF(WIN32)
ENDIF(UNIX)
MESSAGE("GUI system is ${GUI}")
This example shows a simple use of IF statements and variables.
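The EXISTS and COMMAND expressions mentioned above can be sketched in the same way. The file name local_settings.cmake and the command name NO_SUCH_COMMAND are hypothetical. The snippet is meant to be saved to a file and run in script mode with cmake -P:

```cmake
# Sketch only: local_settings.cmake and NO_SUCH_COMMAND are invented names.
CMAKE_MINIMUM_REQUIRED(VERSION 3.10)  # assumption: expected by current CMake releases

# Build a full path next to this script; EXISTS is most reliable with full paths.
SET(LOCAL_SETTINGS "${CMAKE_CURRENT_LIST_DIR}/local_settings.cmake")

IF(EXISTS "${LOCAL_SETTINGS}")
  MESSAGE("Local settings found: ${LOCAL_SETTINGS}")
ELSE(EXISTS "${LOCAL_SETTINGS}")
  MESSAGE("No local settings file, using defaults")
ENDIF(EXISTS "${LOCAL_SETTINGS}")

# COMMAND is true when a command or macro with that name is defined.
IF(COMMAND MESSAGE AND NOT COMMAND NO_SUCH_COMMAND)
  MESSAGE("MESSAGE is defined, NO_SUCH_COMMAND is not")
ENDIF(COMMAND MESSAGE AND NOT COMMAND NO_SUCH_COMMAND)
```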
FOREACH is used in the same fashion. The FOREACH command's arguments include the variable that traverses, and the list of items to traverse. For example, if a list of executables needs to be created, where every executable is created from a source file with the same name, the following FOREACH would be used:
SET(SOURCES source1 source2 source3)
FOREACH(source ${SOURCES})
  ADD_EXECUTABLE(${source} ${source}.c)
ENDFOREACH(source)
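The loop body can hold more than one command. As a sketch, assuming every executable should also link against the MyLibrary target from the earlier example, the loop might grow like this:

```cmake
# Sketch only: reuses the MyLibrary target and source names from the examples above.
CMAKE_MINIMUM_REQUIRED(VERSION 3.10)  # assumption: expected by current CMake releases
PROJECT(MyProject C)

ADD_LIBRARY(MyLibrary STATIC libSource.c)

SET(SOURCES source1 source2 source3)
FOREACH(source ${SOURCES})
  # One executable per source file, each linked against the shared helper library.
  ADD_EXECUTABLE(${source} ${source}.c)
  TARGET_LINK_LIBRARIES(${source} MyLibrary)
ENDFOREACH(source)
```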
Macros combine the syntax of commands and flow control constructs. To define a macro, use the MACRO construct. Let's say we often create executables that link to some libraries. The following example macro then makes our life a bit easier. In the example, CREATE_EXECUTABLE is the name of the macro and the rest are its arguments. Within the macro, all arguments are presented as variables. Once the macro is created, it can be used as a regular command. The definition and use of a CREATE_EXECUTABLE macro would be:
MACRO(CREATE_EXECUTABLE NAME SOURCES LIBRARIES)
  ADD_EXECUTABLE(${NAME} ${SOURCES})
  TARGET_LINK_LIBRARIES(${NAME} ${LIBRARIES})
ENDMACRO(CREATE_EXECUTABLE)
ADD_LIBRARY(MyLibrary libSource.c)
CREATE_EXECUTABLE(MyProgram main.c MyLibrary)
Macros, however, are not equivalent to procedures or functions from programming languages and do not allow recursion.
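One practical consequence of the fixed argument list is worth a sketch. To pass several sources or libraries to CREATE_EXECUTABLE, each group has to arrive as a single argument, which can be done with a quoted, semicolon-separated list. The file util.c is hypothetical:

```cmake
# Sketch only: util.c is an invented file name for illustration.
CMAKE_MINIMUM_REQUIRED(VERSION 3.10)  # assumption: expected by current CMake releases
PROJECT(MyProject C)

MACRO(CREATE_EXECUTABLE NAME SOURCES LIBRARIES)
  ADD_EXECUTABLE(${NAME} ${SOURCES})
  TARGET_LINK_LIBRARIES(${NAME} ${LIBRARIES})
ENDMACRO(CREATE_EXECUTABLE)

ADD_LIBRARY(MyLibrary libSource.c)

# Each quoted list is one macro argument; inside the macro body the
# unquoted ${SOURCES} and ${LIBRARIES} expand back into separate arguments.
CREATE_EXECUTABLE(MyProgram "main.c;util.c" "MyLibrary;m")
```

Without the quotes, util.c would be taken as the LIBRARIES argument and the remaining values would be ignored by this macro.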
An important feature of a good build process is the notion of turning on and off parts of the build. The build process also should be able to find and set locations to the system resources your project needs. All these functions are achieved in CMake using conditional compiling. Let me demonstrate with an example. Let's say your project has two modes, regular and debug. Debug mode adds debug code to all the regular code. So, your code is full of sections, such as:
#ifdef DEBUG
  fprintf(stderr, "The value of i is: %d\n", i);
#endif /* DEBUG */
To tell CMake to add -DDEBUG to the compile lines, you can use the SET_SOURCE_FILES_PROPERTIES command with the COMPILE_FLAGS property. But you probably do not want to edit the CMakeLists.txt file every time you switch between debug and regular builds. The OPTION command creates a boolean variable that can be set before building the project. The syntax for the previous example would look like this:
OPTION(MYPROJECT_DEBUG
  "Build the project using debugging code"
  ON)
IF(MYPROJECT_DEBUG)
  SET_SOURCE_FILES_PROPERTIES(
    libSource.c main.c
    PROPERTIES COMPILE_FLAGS -DDEBUG)
ENDIF(MYPROJECT_DEBUG)
Now, you ask, "how do I set this variable?" CMake comes with three flavors of GUI. On UNIX-like systems, there is a Curses GUI called ccmake. It is text-based and can be run through a remote connection. CMake also has GUIs for Microsoft Windows and Mac OS X.
When using CMake-generated Makefiles, if you have already run CMake for the first time, all you have to type is make edit_cache. This command runs an appropriate GUI. In all the GUIs, you have the option of setting variables. As you will see right away, CMake has several default options, for example, EXECUTABLE_OUTPUT_PATH and LIBRARY_OUTPUT_PATH (where executables and libraries go) and, in our case, MYPROJECT_DEBUG. After changing a value of some variable, you press the configure button or [c] (on ccmake).
In the GUIs you can set several different types of entries. MYPROJECT_DEBUG is a boolean variable. Another common variable type is a path, which specifies the location of some file on the system. Say our program relies on the location of the file Python.h. We would put the following command, which tries to find a file, in the CMakeLists.txt file:
FIND_PATH(PYTHON_INCLUDE_PATH Python.h /usr/include /usr/local/include)
But this command looks only in /usr/include and /usr/local/include, so you may have to specify some other locations. Because it can be wasteful to specify all these locations in every single project, you can include other CMake files, called modules. CMake comes with several useful modules, from the type that searches for different packages to the type that actually performs some task or defines macros. For the list of all modules, check the Modules subdirectory of CMake. One example is a module called FindPythonLibs.cmake, which finds Python libraries and header files on almost every system. If, however, CMake does not find the file you need, you can always specify it in the GUI.
If you are used to the Autoconf approach, where you specify configuration options on the command line, you can use command-line access to CMake variables. The following line sets the MYPROJECT_DEBUG variable to OFF:
cmake -DMYPROJECT_DEBUG:BOOL=OFF
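Whichever way PYTHON_INCLUDE_PATH gets its value (the search, the GUI or a -D option), the CMakeLists.txt file can check the result before using it. A minimal sketch, assuming the FIND_PATH call shown earlier:

```cmake
# Sketch only: builds on the FIND_PATH example; the error text is illustrative.
CMAKE_MINIMUM_REQUIRED(VERSION 3.10)  # assumption: expected by current CMake releases
PROJECT(MyProject C)

FIND_PATH(PYTHON_INCLUDE_PATH Python.h
  /usr/include
  /usr/local/include)

# If the search fails, the variable holds PYTHON_INCLUDE_PATH-NOTFOUND,
# which IF evaluates as false.
IF(PYTHON_INCLUDE_PATH)
  INCLUDE_DIRECTORIES(${PYTHON_INCLUDE_PATH})
ELSE(PYTHON_INCLUDE_PATH)
  MESSAGE(SEND_ERROR "Python.h not found; set PYTHON_INCLUDE_PATH by hand")
ENDIF(PYTHON_INCLUDE_PATH)
```

The value is stored in the cache, so a path supplied once through the GUI or with -DPYTHON_INCLUDE_PATH:PATH=... is reused on later runs.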
As a software developer, you probably organize source code in subdirectories. Different subdirectories can represent libraries, executables, testing or even documentation. We now can enable or disable subdirectories to build parts of our project and skip other parts. To tell CMake to process a subdirectory, use the SUBDIRS command. This command tells CMake to go to the specified subdirectory and find the CMakeLists.txt file there. Using this command, we can now make our project a bit more organized. We move all the library files to the subdirectory SomeLibrary, and the top-level CMakeLists.txt now looks like this:
PROJECT(MyProject C)
SUBDIRS(SomeLibrary)
INCLUDE_DIRECTORIES(SomeLibrary)
ADD_EXECUTABLE(MyProgram main.c)
TARGET_LINK_LIBRARIES(MyProgram MyLibrary)
The INCLUDE_DIRECTORIES command tells the compiler where to find header files for main.c. So, even if your project has five hundred subdirectories and you move all your sources in, you will not have any problems getting dependencies to work. CMake does all this for you.
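The subdirectory needs its own CMakeLists.txt. The article does not show that file, so the following is only a guess at its simplest form, assuming the library still consists of the single source file libSource.c:

```cmake
# SomeLibrary/CMakeLists.txt
# Sketch only: assumed contents, with libSource.c moved into this directory.
# This file is read from the top-level CMakeLists.txt, not run on its own.
ADD_LIBRARY(MyLibrary STATIC libSource.c)
```

The MyLibrary target defined here is the one the top-level TARGET_LINK_LIBRARIES command refers to.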
So, now we want to "cmakify" zlib. Start with a simple CMakeLists.txt file:
PROJECT(ZLIB)
# source files for zlib
SET(ZLIB_SRCS
  adler32.c gzio.c inftrees.c uncompr.c
  compress.c infblock.c infutil.c zutil.c
  crc32.c infcodes.c deflate.c inffast.c
  inflate.c trees.c
)
ADD_LIBRARY(zlib ${ZLIB_SRCS})
ADD_EXECUTABLE(example example.c)
TARGET_LINK_LIBRARIES(example zlib)
Now you can build it. However, there are a couple of little things to remember. First, zlib needs unistd.h on some platforms. So, we add this test:
INCLUDE(${CMAKE_ROOT}/Modules/CheckIncludeFile.cmake)
CHECK_INCLUDE_FILE("unistd.h" HAVE_UNISTD_H)
IF(HAVE_UNISTD_H)
  ADD_DEFINITIONS(-DHAVE_UNISTD_H)
ENDIF(HAVE_UNISTD_H)
Also, we have to do something about shared libraries on Windows. Zlib needs to be compiled with -DZLIB_DLL, for proper export macros. So, we add the following option:
OPTION(ZLIB_BUILD_SHARED
  "Build ZLIB shared" ON)
IF(WIN32)
  IF(ZLIB_BUILD_SHARED)
    SET(ZLIB_DLL 1)
  ENDIF(ZLIB_BUILD_SHARED)
ENDIF(WIN32)
IF(ZLIB_DLL)
  ADD_DEFINITIONS(-DZLIB_DLL)
ENDIF(ZLIB_DLL)
This works, but there is a better way. Instead of passing ZLIB_DLL and HAVE_UNISTD_H as compiler flags, we can configure an include file. We do that by preparing an input file with tags for CMake. An example of the include file for zlib would be zlibConfig.h.in:
#ifndef _zlibConfig_h
#define _zlibConfig_h

#cmakedefine ZLIB_DLL
#cmakedefine HAVE_UNISTD_H

#endif
Here, each #cmakedefine VAR line is replaced with #define VAR or with /* #undef VAR */, depending on whether VAR is defined. We tell CMake to create the file zlibConfig.h using the following CMake command:
CONFIGURE_FILE(
  ${ZLIB_SOURCE_DIR}/zlibConfig.h.in
  ${ZLIB_BINARY_DIR}/zlibConfig.h)
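The generated header lands in the build directory, so the compiler also has to be told to look there. A minimal sketch of how the pieces might fit together, assuming the template is named zlibConfig.h.in as described above and sits in the top-level source directory:

```cmake
# Sketch only: assumes zlibConfig.h.in exists next to this CMakeLists.txt.
CMAKE_MINIMUM_REQUIRED(VERSION 3.10)  # assumption: expected by current CMake releases
PROJECT(ZLIB C)

# Set HAVE_UNISTD_H before the header is generated.
INCLUDE(CheckIncludeFile)
CHECK_INCLUDE_FILE("unistd.h" HAVE_UNISTD_H)

# Turn the template into a real header in the build tree.
CONFIGURE_FILE(
  ${ZLIB_SOURCE_DIR}/zlibConfig.h.in
  ${ZLIB_BINARY_DIR}/zlibConfig.h)

# Let the sources find the generated header with #include "zlibConfig.h".
INCLUDE_DIRECTORIES(${ZLIB_BINARY_DIR})
```

The variables tested by #cmakedefine must be set before CONFIGURE_FILE runs, which is why the header check comes first.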
With the information in this article you will be able to start using CMake for most of your everyday tasks. What is more, you can now connect to your friend's AIX system and build your project there, provided your code is portable enough. Also, CMake files are much easier to read than Makefiles, so your friend can check what you missed.
This example, however, only scratches the surface; CMake is capable of doing several other tasks. With its face-lift in version 1.6, it can now do platform independent TRY_RUN and TRY_COMPILE builds, which come in handy when you want to test the capabilities of the system. It natively supports only C and C++, but there is limited support for building Java files. With a little effort, you can build anything from Python- or Emacs-compiled scripts to LaTeX documents. By using CMake with the testing framework Dart, you can do platform-independent regression testing. If you want to go even further, you can use CMake's C API to write a plug-in for CMake, which adds your own command to the list of existing commands.
CMake is being actively used in several projects, such as VTK and ITK. Its benefits are enormous in traditional software development, but they become even more apparent when portability is necessary. By using CMake for software development, your code will be significantly more "open", because it will build on a variety of platforms.
Andrej Cedilnik (andy.cedilnik@kitware.com) is a senior software engineer at Kitware Inc., a small business devoted to imaging, visualization and computer graphics. He is one of the developers behind CMake, and in his spare time he is a Linux evangelist.