Examining the Compilation Process, Part 3
The last two articles that I wrote for Linuxjournal.com were about the steps that GCC goes through during the compilation process and were based on a software development class I taught a few years ago. I hadn't intended for this to be a three-part series, but it's been pointed out that I didn't cover the make utility, and I think it's almost negligent to discuss software development without discussing make. Since I don't like to think of myself as negligent, I decided to extend the series by one more article.
If you only have a simple project with fewer than, say, five source files, you probably don't need to use the make utility. You can simply compile your project with a shell script like the one I'm using for a small 3D graphics program I'm writing as part of another article for Linux Journal. Take a look.
#!/bin/bash
g++ ./game.cpp -lIrrlicht -lGL -lXxf86vm -lXext -lX11 -ljpeg -lpng -o game
This is pretty easy. We have a single source file, a few libraries and a final executable. Scripting the compilation of this project saves us from having to type that mammoth command line in every time we make a change to the program.
But what happens when your project has dozens, or perhaps hundreds, of source files, each containing hundreds or thousands of lines of code? It can take several minutes, or even hours, to compile a very large project. What if you find an important bug in your program and have to make this change?
Change a = a+b;
into a = a-b;
With a build script like the one above, this one-character change means you have to recompile the whole project! Now, while waiting for a project to recompile is a wonderful excuse to go drink a beer, it's a very inefficient use of your time as a software developer.
The ability to only recompile those parts of our project that actually need to be recompiled is what we get from using the make utility. Let's take a look at how this works.
The make utility uses a file called a Makefile to determine which parts of our project need to be recompiled. Essentially, the Makefile specifies which files depend on each other, and how to regenerate a given file if we need to. Let's take a look at an example. Here is a simple Makefile.
main: main.o f.o
	gcc main.o f.o -o main
main.o: main.c
	gcc -c main.c -o main.o
f.o: f.c
	gcc -c f.c -o f.o
The general format of a Makefile rule is a target, a list of dependencies for that target, and a command line used to regenerate the target if any of the dependencies have been modified. In a real Makefile, each command line must begin with a tab character, not spaces.
From this Makefile, we can see a few things. For example, the executable file, main, is dependent upon both main.o and f.o. When we need to regenerate main, perhaps because either main.o or f.o has changed, we use the command, “gcc main.o f.o -o main”. In turn, main.o is dependent upon main.c. When a change is made to main.c, main.o needs to be regenerated, and the Makefile tells us how this is done. It turns out that if main.c changes and we have to regenerate main.o, we also have to relink the results in order to regenerate the final executable. Make takes care of the recursive nature of the problem for us.
The f.o target is similar.
After creating this Makefile and typing make for the first time, we see make compile the entire project from scratch:
# make
gcc -c main.c -o main.o
gcc -c f.c -o f.o
gcc main.o f.o -o main
This results in an executable called “main.” However, let's say we make a change to f.c. When we rerun make, we see this:
# make
gcc -c f.c -o f.o
gcc main.o f.o -o main
We see that main.c isn't recompiled this time since it didn't change. The file f.c is recompiled into f.o, which is then linked with the existing main.o. The result is another, updated, executable called main.
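To make that decision rule concrete, here is a minimal shell sketch of the timestamp comparison described above. It is an illustration, not how make is actually implemented: the function name needs_rebuild is made up for this example, and real make also brings each dependency up to date before comparing timestamps.

```shell
#!/bin/sh
# needs_rebuild TARGET DEP...
# Succeeds (exit status 0) when TARGET is missing or any DEP is newer than it.
# Hypothetical helper that mimics the check make performs for a single rule.
needs_rebuild() {
    target=$1
    shift
    [ -e "$target" ] || return 0            # no target yet: must build it
    for dep in "$@"; do
        [ "$dep" -nt "$target" ] && return 0  # a dependency changed after the target
    done
    return 1                                 # target is at least as new as every dependency
}

# Small demonstration with explicit timestamps in a scratch directory.
demo_dir=$(mktemp -d)
touch -t 202001010000 "$demo_dir/f.c"   # source, older
touch -t 202001010001 "$demo_dir/f.o"   # object, newer
needs_rebuild "$demo_dir/f.o" "$demo_dir/f.c" || echo "f.o is up to date"
touch -t 202001010002 "$demo_dir/f.c"   # simulate editing f.c
needs_rebuild "$demo_dir/f.o" "$demo_dir/f.c" && echo "f.o must be rebuilt"
rm -r "$demo_dir"
```

The same comparison, applied to main and its two object files, is why the link step runs again after f.o has been regenerated.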
If, for some reason, we wanted to bring only main.o up to date without relinking the executable, we could ask make to build just that target by typing:
make main.o
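In this case, make would consult the Makefile in the current directory and figure out what needs to be done in order to regenerate the main.o target. If main.o is already newer than main.c, make simply reports that main.o is up to date; running “touch main.c” first updates the source file's timestamp and forces the recompile.
So by using the make utility, we can spare ourselves from sitting through a potentially long series of unnecessary recompilations.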
If that were all we could expect from the make utility, it would already be a tremendous time saver. But there is more. Make allows us to define variables and use them within our Makefile. For example, take a look at an excerpt from one of my other projects:
OBJ = network.o config.o protocol.o parsers.o events.o users.o
CPPFLAGS = -DTEXT_CLIENT
CXXFLAGS = -O3 -ffast-math -Wall
LDFLAGS = -lenet
game: $(OBJ) main.cpp
	g++ $(CPPFLAGS) $(CXXFLAGS) \
	main.cpp $(OBJ) $(LDFLAGS) -o game
Here we see a few things of interest. First, we define a few variables: OBJ, CPPFLAGS, CXXFLAGS and LDFLAGS. These variables are then used later in the Makefile where we describe how to remake the “game” target.
For the sake of clarity, let's see what happens to the command line specified in this Makefile snippet. We start out with:
g++ $(CPPFLAGS) $(CXXFLAGS) main.cpp $(OBJ) $(LDFLAGS) -o game
We can see the references to the variables that we defined earlier, so let's go ahead and substitute their values in. When we do, we end up with:
g++ -DTEXT_CLIENT -O3 -ffast-math -Wall main.cpp network.o config.o protocol.o parsers.o events.o users.o -lenet -o game
I think you get the idea about how variables work inside a Makefile. In real life, variables might be used in several places in the Makefile and thus save us a lot of time and spare us from the potential of making a trivial, but devastating typing mistake.
We also see that we can make the command line easier to read by ending a line with the '\' character and continuing the command on the next line. It's a simple fact: Any code that is easy to read is less prone to errors, and our Makefile is no different.
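Rather than substituting the variables by hand, we can ask make to show the expanded command. The following is a small sketch, assuming GNU make is installed. It writes a shortened version of the excerpt above into a scratch directory (the file names and the shortened variable values are stand-ins for illustration) and runs make with the -n option, which prints the commands that would run without executing them.

```shell
#!/bin/sh
# Sketch: let make print the fully expanded command line without running it.
set -e
command -v make >/dev/null 2>&1 || { echo "make is not installed"; exit 0; }

work=$(mktemp -d)
cd "$work"

# Variable definitions and the rule header (shortened from the article's excerpt).
printf '%s\n' \
    'OBJ = network.o config.o' \
    'CPPFLAGS = -DTEXT_CLIENT' \
    'CXXFLAGS = -O3 -Wall' \
    'LDFLAGS = -lenet' \
    'game: $(OBJ) main.cpp' > Makefile
# Recipe lines must start with a tab; printf's \t supplies it.
printf '\tg++ $(CPPFLAGS) $(CXXFLAGS) main.cpp $(OBJ) $(LDFLAGS) -o game\n' >> Makefile

# Empty placeholder files so that make finds every dependency.
touch network.o config.o main.cpp

# -n (dry run): print the g++ command with every variable substituted.
make -n game
```

Because nothing is compiled, this is a cheap way to check that a flag or file name landed where you expected before starting a long build.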
Ok, so far, the make utility sounds like a really cool thing. But what kinds of problems can we have with it?
The most common mistake people make... with make, is leaving out a dependency. For example, let's say you have a file, foo, that depends upon another file, bar.o, but you forget to list bar.o in the dependencies for foo.
Now, if bar.o doesn't exist, you will simply get some linking error messages and you'll know right away that you've left something out.
However, what if you've been compiling by hand until the project grew large enough to warrant using make? Now bar.o already exists, but isn't mentioned as a dependency for foo. In this case, everything will seem to work just fine, until you find a bug in bar.o. So, you go into the files that are used to generate bar.o, fix your bug and recompile. You find that you have the same symptoms. You think that maybe you forgot to save your changes, so you do it again. Same bug. This time you put a few debugging print statements in your code and recompile. Same bug and NO DEBUGGING OUTPUT! If you happen to be prone to swearing, this is where it begins. Fortunately, once you've made this mistake, you tend to remember it and its symptoms, and you don't get bitten again.
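The situation above can be reproduced without compiling anything. The sketch below assumes GNU make and uses its -q ("question") option, which runs no commands and exits with status 0 when the target is up to date and 1 when it needs rebuilding. The files are empty placeholders, foo.o is an extra dependency added for realism, and the gcc command in the rule is never actually executed.

```shell
#!/bin/sh
# Sketch: a dependency missing from the Makefile hides a needed relink.
set -e
command -v make >/dev/null 2>&1 || { echo "make is not installed"; exit 0; }

work=$(mktemp -d)
cd "$work"

# foo is linked from foo.o and bar.o, but bar.o was left out of the dependency list.
printf 'foo: foo.o\n\tgcc foo.o bar.o -o foo\n' > Makefile

touch -t 202001010000 foo.o
touch -t 202001010001 foo      # foo was linked after foo.o was built
touch -t 202001010002 bar.o    # bar.o was rebuilt after foo was linked

if make -q foo; then
    echo "incomplete Makefile: make thinks foo is up to date"
fi

# Listing bar.o as a dependency lets make notice the newer object file.
printf 'foo: foo.o bar.o\n\tgcc foo.o bar.o -o foo\n' > Makefile

if ! make -q foo; then
    echo "fixed Makefile: make knows foo must be relinked"
fi
```

With the incomplete rule, the fix in bar.o never reaches the executable, which is exactly why the debugging output never shows up.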
Header files, or .h files if you write in C, can also lead to problems with make. It's common to have header files that contain the prototypes of all of your external functions and data types. Many times, programmers get lazy and put ALL of their prototypes in a single file, which forces them to include this file in all of the rest of their source files. The header file becomes a dependency for every file in the project. In this case, the programmer has created a situation where any change to that file requires the entire project to be recompiled. Sometimes this is just the nature of the problem at hand. Other times, that single header file could be split up into separate files for use in separate modules.
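Header dependencies are also easy to forget when writing rules by hand. One way to check them, sketched below on the assumption that gcc is installed, is the compiler's -MM option, which prints a make rule listing a source file together with the non-system headers it includes. The files main.c and defs.h here are hypothetical stand-ins created just for the example.

```shell
#!/bin/sh
# Sketch: ask GCC which header files a source file depends on.
set -e
command -v gcc >/dev/null 2>&1 || { echo "gcc is not installed"; exit 0; }

work=$(mktemp -d)
cd "$work"

# Hypothetical project: main.c includes one project header, defs.h.
printf 'int add(int a, int b);\n' > defs.h
printf '#include "defs.h"\nint main(void) { return add(1, 2) - 3; }\n' > main.c

# -MM only runs the preprocessor and prints a rule in Makefile syntax.
gcc -MM main.c
```

The printed rule names defs.h as a dependency of main.o, so pasting it into the Makefile (or comparing it with the rule already there) ensures that a change to the header triggers a recompile of main.c.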
It's quite common for a programmer to use make to provide himself with the luxury of removing all executables and object files from his project, thus necessitating a complete recompile. Typically, a programmer would add a target like this to his Makefile:
clean:
	rm -f *.o main
Then the programmer can simply type “make clean” and get a completely clean slate. Similarly, it's common to have an “install” target so that an end user can type “make install” and have the software automatically installed for them.
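One caveat with targets such as clean is that they don't name a real file. If a file called clean ever appears in the directory, make considers the target up to date and skips the command. The sketch below, assuming GNU make, shows the problem and the usual remedy of declaring the target in a .PHONY line. It also uses rm -f so that the command doesn't fail when there is nothing to remove. All files are empty placeholders in a scratch directory.

```shell
#!/bin/sh
# Sketch: why a 'clean' target is usually declared phony.
set -e
command -v make >/dev/null 2>&1 || { echo "make is not installed"; exit 0; }

work=$(mktemp -d)
cd "$work"

# 'clean' is a stray file that happens to share the target's name.
touch main.o f.o main clean

# Without .PHONY, make sees an existing file named clean and does nothing.
printf 'clean:\n\trm -f *.o main\n' > Makefile
make clean
ls

# With .PHONY, the command runs every time, whatever files exist.
printf '.PHONY: clean\nclean:\n\trm -f *.o main\n' > Makefile
make clean
ls
```

After the second run, only the Makefile and the stray clean file remain in the directory.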
As you can see, the make utility is a wonderful time saver. It can help a programmer ensure that the files that need to be recompiled are recompiled, and that only those files are recompiled. This article wasn't as detailed as the previous two in this series, but I hope it rounds out our discussion of the compilation process.