Linux as a Backup E-mail Server
One Friday morning several months ago, the Microsoft Exchange e-mail server I'm in charge of crashed. At the time, I was a fairly new Windows NT administrator. The server, which we call tuccster, was not running properly again until late the following Sunday evening.
Unfortunately, at this point my problems had just begun. Our departmental e-mail server had been down for nearly three days and megabytes of important e-mail were spooled on remote servers all over the Internet. There was no way to predict when the mail would arrive and how much e-mail might never show up. Since different e-mail servers try to re-send e-mail at different intervals, what did show up would arrive out of chronological order. While I had learned a lot about Windows NT disaster recovery, the whole event was a major inconvenience for the users in my department and a horrible embarrassment for me.
While I'm not convinced I could have prevented the failure of Microsoft Exchange, I could have set up a fall-back e-mail server to spool all incoming mail while tuccster was down. Shortly after the mishap, I was able to find an old Gateway 486/66 with an Ethernet card that was being replaced with a faster Pentium system. Using Debian Linux and sendmail, I set up a fall-back e-mail server that receives and spools any incoming e-mail whenever tuccster, the primary Windows-based server, is down. The addition of a web server and a simple CGI script written in Perl provided a simple user interface into the system. I configured the web server so that a web browser could be used from certain trusted hosts to check if anything is waiting in the sendmail queue. Once our primary server is ready to begin receiving e-mail, the sendmail queue can be flushed by clicking a link on the same web page.
A “fall-back e-mail server” is an old idea on the Internet—the functionality to set one up is actually built into the Domain Name Server (DNS) protocol. The intent was that every important e-mail server would have a backup in place. A Domain Name Server contains many types of records. The most common of these types are SOA records which indicate authority for a domain's data; NS records which list name servers for a domain; A records which map a name to an address; PTR records which perform the reverse, mapping an address to a name; and MX records which describe “mail exchangers”. MX records allow one to define the actual host responsible for receiving mail directed at any particular host. The host actually responsible for receiving e-mail need not be the host to which the mail appears to be addressed.
To illustrate why this would be useful, imagine a set of workstations called larry, curly and moe. To reduce the load on curly and moe, we would like all incoming e-mail to be directed to larry, regardless of the host to which the mail was actually addressed. MX records provide a way to achieve this goal. Suppose we program our DNS server with the following:
larry.tucc.uab.edu. IN MX 1 larry.tucc.uab.edu. curly.tucc.uab.edu. IN MX 1 larry.tucc.uab.edu. moe.tucc.uab.edu. IN MX 1 larry.tucc.uab.edu.
If somebody tries to send e-mail to foo@moe.tucc.uab.edu, the mail transport agent (MTA) should look up the DNS record and see that larry is responsible for all e-mail directed to moe. Not all MTAs properly implement MX redirection. The mail will then be delivered to larry as if it were addressed to foo@larry.tucc.uab.edu.
While this is useful, it is not all that can be accomplished with MX records. The number appearing in the example between “MX” and “larry.tucc.uab.edu” is a preference value. Suppose I was worried that student projects running on larry might cause the system to crash periodically, or that larry was running a less-than-robust e-mail server. I could set up curly as a fall-back server by using the following DNS entries:
larry.tucc.uab.edu. IN MX 1 larry.tucc.uab.edu. larry.tucc.uab.edu. IN MX 2 curly.tucc.uab.edu. curly.tucc.uab.edu. IN MX 1 larry.tucc.uab.edu. curly.tucc.uab.edu. IN MX 2 curly.tucc.uab.edu. moe.tucc.uab.edu. IN MX 1 larry.tucc.uab.edu. moe.tucc.uab.edu. IN MX 2 curly.tucc.uab.edu.
Now suppose that larry is down for some reason. A remote host attempting to send e-mail to larry would discover that larry is unavailable. It would then learn from DNS that curly is the next preferred e-mail server for larry. The remote host will send the message to curly. The mail transport agent (such as sendmail) on curly will then realize that larry is preferred over curly as a mail exchange. It then spools the message locally, periodically attempting to pass the message on until it succeeds.
You've probably guessed by now that I used this very same technique to set up my fall-back e-mail server. Serendipitously, one of the older 486 computers in the department was slated for replacement with a shiny new Pentium-based PC. This computer, with 16MB of RAM, a 600MB hard drive and an Ethernet card was the perfect computer for the job. Setting up the computer was relatively easy. I chose to install the Debian distribution, since I'm most familiar with it. I could just as easily have used any of the other distributions. The components essential to my setup are the latest version of sendmail, the Apache web server (although NCSA, CERN or another web server would have worked) and Perl. I also installed other tools, such as Emacs, gcc, make, rcs and vi to make myself more comfortable while working on the system.
These utilities also allowed me to rebuild the kernel without using another system, and they should come in handy if I ever have to repair the system after some sort of mishap. I left all X11 libraries off the system to save disk space. After compiling a kernel containing just the modules I needed to run the system, I created a rescue boot floppy with this same kernel. There's no reason not to be prepared for the worst. Finally, I dubbed the system bartleby, after the downtrodden scrivener in the Dead Letter Office of Herman Melville's short story Bartleby the Scrivener.
The next step was to choose an appropriate location for bartleby to live. Simply placing bartleby in the same room as tuccster would provide adequate backup if tuccster crashed of its own accord again. However, by now I had disaster recovery on the brain and wanted to protect against other sorts of failures. I finally settled on a location in our mainframe room, which is located in a different building. This placed bartleby on a different subnet than tuccster, so no e-mail would be lost if our own subnet failed. Being in the mainframe room also meant that bartleby was located on a protected power system.
Long after tuccster has been automatically shut down by its UPS, bartleby will be merrily spooling any incoming e-mail. The wisdom of placing the system on a separate subnet was proven several months later when our own subnet was accidently disconnected by maintenance workers. With bartleby properly in place, all that was left was adding the appropriate entry to our DNS and configuring sendmail and the web server.
Just as in my example with the Three Stooges, I wanted to add MX records to our domain name server so that both bartleby and tuccster were listed as MX hosts for tuccster, with tuccster as the preferred host. The two MX records as I entered them are shown in Listing 1.
While configuring sendmail is often something to fear, it was easy in this case. sendmail will correctly process deferred e-mail by default. All that was necessary was configuring sendmail to send and receive e-mail directed to and from bartleby, something handled automatically by the Debian package install script. Once I finished the sendmail install, bartleby was working correctly as a fall-back e-mail server. The next time I brought tuccster down for maintenance, I examined the sendmail queue on bartleby using the sendmail -bp command. Sure enough, several e-mail messages were in the queue, waiting for tuccster to wake up again. After I brought tuccster back on-line, I flushed the queue using the sendmail -q command.
It is possible to adjust the frequency with which sendmail attempts to process any spooled messages by appending a time argument to the -q command when sendmail is first started. The Debian package for sendmail includes a macro at the beginning of the sendmail startup script that makes setting this value easy. I set mine to 30 minutes (the actual command-line argument is -q30m). The sendmail man page describes the syntax for the -q command. I automatically shut down the Exchange server late each Friday night to save its message store to tape. I wanted bartleby to flush its queue automatically so any messages that were deferred during the backup would arrive properly without human intervention. You may wish to lengthen the time between queue flushes.
The system as I had it set up was pretty good. However, it did require at least a minimal understanding of the UNIX shell to log in and check or flush the sendmail queue. I wanted to make the system as foolproof as possible in the event something happened while I was away from the office. I settled on a simple CGI script written in Perl. This script (see Listing 2) displays the current contents of the sendmail queue, as well as the output from the df command. This allows someone to see at a glance what e-mail messages are spooled in the system and how much disk space remains. If the sendmail queue is not empty, a link is created that can be clicked to flush the queue.
Since this is the only service this web server provides, I called it “index.cgi” and placed it in the root of the server directory. Adding “index.cgi” to the DirectoryIndex list in the srm.conf configuration file causes the script to be executed automatically when the server is accessed. I also adjusted the MinSpareServers and MaxSpareServers values in the httpd.conf file, so that no spare servers are left running. In this case, I don't mind slowing the response time to free up a little extra memory. There were a few final security issues to contemplate. I consider the listing of the sendmail queue to be “sensitive but unclassified” information. By this, I mean I don't feel an intruder would find the information contained therein particularly useful, but then again I don't see the need to make the information easily available to the entire Internet. The action of flushing the queue is nearly harmless on its own, though I suppose it would be possible to mount a denial-of-service attack by repeatedly flushing the queue. In the end, I decided to allow non-password based access from a handful of machines in the department.
Each of these machines is located in the office of an administrator who might have a reason to check the queue. I implemented these restrictions with the Apache directives shown in Listing 3 in the access.conf file.
I could have added the standard plaintext password authentication available through nearly every web browser, but I don't believe it would have added enough real security to offset the inconvenience. A screen shot of this web interface is shown below.
I decided to restrict shell access to myself. Furthermore, I installed the s/key system from Bell Labs to avoid plaintext passwords and used TCP Wrappers to restrict the hosts that can log into the system. While there is little to be gained (other than, perhaps, traffic analysis) from examining the sendmail queue, much could be gained from unrestricted access to any incoming e-mail message.
Setting up a fall-back e-mail server using a Linux system running on older hardware is an excellent tool to preserve incoming e-mail in the event of a disaster on your primary server. While I am backing up a Microsoft Exchange server, the same technique can be used to back up an SMTP server from any vendor. Setting up the fall-back server costs nearly nothing other than the time required for configuration. Having a very stable system completely independent of the rest of our network has also proven useful. Since I first configured bartleby, I have set up a collection of relatively simple scripts to watch other services on our network and page me in the event of an irregularity.
A fall-back e-mail system is a good way to sneak Linux into a low profile but “mission-critical” application in your organization. Once you've proven Linux is a “real” operating system to any skeptical decision maker, you can begin to utilize it in higher-profile roles.
John Blair currently works as a software engineer at Cobalt Microserver. When he's not hacking Cobalt's cute blue Qube, he's hanging out with his wife Rachel and newborn son Ethan. John is also the author of Samba: Integrating UNIX and Windows, published by SSC. Feel free to contact him at jdblair@cobaltmicro.com.