Building Tiny Linux Systems with Busybox--Part I
Because Linux is small and easy to customize, it's a fine kernel for embedded systems. But what about all of the other programs that are needed for a minimum functional GNU/Linux system? The minimum system installed by the Debian or Red Hat set-up disks, exclusive of the kernel, is about 40 megabytes in size. Busybox replaces the GNU/Linux distribution with a large set of command-line tools--all that are needed to boot and run a practical Linux system with networking--in a very small package.
The typical compiled size of Busybox on the i386 architecture is 256 to 500K total for all tools, depending on the C library used and how it is linked. This makes it easily possible to create a single-floppy Linux systems with a full-featured kernel, a command-line environment, plus your application.
I originally wrote Busybox in 1996 for the Debian GNU/Linux setup disk. The goal was to put a complete bootable system on a single floppy that would be both a rescue disk and an installer for the Debian system. A rescue disk is used to repair a Linux system when that system has become unbootable. Thus, a rescue disk needs to be able to boot the system and mount the hard disk file systems, and it must provide the command-line tools needed to bring the hard-disk root file system back to a bootable state. The Debian installer at the time I wrote Busybox was a Bourne shell script using dialogto provide a simple graphical interface on an ANSI terminal or the Linux console.
Since its creation, many people have added to and maintained Busybox, including members of the Debian Boot-Floppies team, the Linux Router Project and Lineo Corporation, where Eric Anderson maintains Busybox today. In the tradition of Free Software projects, contributions by other authors now make up the majority of the project. Busybox is a part of almost every commercial embedded Linux offerings, and is found in such diverse projects as the Kerbango Internet Radio and the IBM Wristwatch that runs Linux.
The name Busybox comes from a child's toy box with a telephone dial and anumber of knobs, buttons and other devices, all of which make noises when operated. This was called a busy boxin the past and is today commonly referred to as an activity center.
The Busybox source code can be found at www.busybox.lineo.com. By default, the Makefile provided builds a dynamic-link executable using the default libc library on your system. However, it is easily adapted to cross-compilation, and one can select static linking and other, specialized libc libraries by editing Makefile variables. Before you exercise the Makefile options or embed Busybox, you should build and run it on your host system just so that you can get familiar with it. If you are on a Linux system with the development tools installed, simply typing make should build it.
Once you build Busybox, it's time to learn about multi-call executable files. This is a trick we use to make Busybox small. There is one executable called busybox that is linked to 107 different names and provides the functions of 107 different programs. To illustrate this, run the following shell commands:
ln busybox ls ln busybox uptime ln busybox whoami
Now, run these commands:
./ls ./uptime ./whoami
Be sure to type the leading "./" as illustrated above, or you will get the system version of these commands rather than the Busybox version.
The lncommand is used to apply another name to a file. It does not copy the file. The only space it uses in the file system is the small amount that is necessary to store a name in a directory. You can illustrate this using the following ls command:
ls -il busybox ls uptime whoami
Be sure to use the -il argument to ls as above. This causes ls to print the inode number, a unique number identifying each file in a file system, along with the usual long listing provided by the -l argument. Linux allows the same file to have more than one name, but a file only has one inode number. You should see something similar to Listing 1.
1849054 -rwxrwxr-x 4 bruce bruce 252956 Sep 7 08:22 busybox 1849054 -rwxrwxr-x 4 bruce bruce 252956 Sep 7 08:22 ls 1849054 -rwxrwxr-x 4 bruce bruce 252956 Sep 7 08:22 uptime 1849054 -rwxrwxr-x 4 bruce bruce 252956 Sep 7 08:22 whoami
Note that this is not a listing of four files. It is a listing of one file with four names, as indicated by the inode number in the first column and the link count in the third column. The link count reports how many names a file has. You'll notice that directories in the common Linux file systems always have a link count of two, because they have two names: "." and "..". Most files, however, only have one name and their link count will be 1.
Because there is a fixed overhead of several kilobytes for every executable program, compressing 107 commands into one file saves a significant amount of space. So, just as we have linked Busybox to four different names, we can link it to 107. This provides us with a complete, bootable, runnable Linux system in a very small space. Even static-linking with GNU LIBC 6, which has become the standard for Linux systems, Busybox occupies only half a megabgyte.
If you don't need the internationalization of LIBC 6, the old LIBC 5 is significantly smaller. A new library intended for embedded use, uC-Libc (www.opensource.lineo.com), is even smaller, but use caution if your application is proprietary. Like Busybox, uC-Libc is covered by the GNU GPL (General Public License), and can't be linked to proprietary software. GNU LIBC 5 and 6, in contrast, are under the LGPL (the Lesser General Public License) and can be linked to proprietary applications. So, don't use uC-Libc for the libc library of a non-free program. At this writing, uC-Libc doesn't quite provide all of the functions required by Busybox, but it's only short a few and these may be provided by the time you read this article.
On the Debian install floppy, I linked Busybox dynamicaly, and then stripped down the shared libc library so that it only provided the functions necessary to support Busybox and the other executables on the floppy. This was the best way to provide a library shared by several different executables, since the floppy contained other programs besides Busybox. Stripping libc down to only the functions that actually were used cut its size by half. Rather than strip it by hand, I wrote a script that finds all of the library functions referenced by a set of dynamic executables, and then creates a library subset providing those functions (and the functions they depend on). This script has since been completely replaced by a version written by Marcus Brinkmann, which can be found in the Debian boot-floppies package under scripts/rootdisk/mklibs.sh. The script and how it works are properly the subject of another article the size of this one however, until that article is written, one can puzzle out how mklibs.sh works by installing the boot-floppies package on a Debian system, building the floppies and then reading the script carefully. Warning: mklibs.sh is probably the most complex shell script you will ever examine.
So, now that you know how to build and run Busybox, how do you make a small Linux system containing it? You'll need a few pieces: a static-linked Busybox executable, a skeleton root file system and /dev directory populated with the proper special files, and a Linux kernel with the features you need plus two features that will be used to boot and run a small Linux system: RAM disks and the compressed ROM file system. [Look for the details on how to build a small Linux system containing Busybox in a future issue--Ed.]
email: bruce@perens.com