The Trouble With Live Data
live data, n. 1. Data that is written to be interpreted and takes over program flow when triggered by some un-obvious operation, such as viewing it. One use of such hacks is to break security. For example, some smart terminals have commands that allow one to download strings to program keys; this can be used to write live data that, when listed to the terminal, infects it with a security-breaking virus that is triggered the next time a hapless user strikes that key. For another, there are some well-known bugs in vi that allow certain texts to send arbitrary commands back to the machine when they are simply viewed. 2. In C code, data that includes pointers to function hooks (executable code). 3. An object, such as a trampoline, that is constructed on the fly by a program and intended to be executed as code. --From the Jargon file, version 3.2.0
The link seemed innocent enough. All it said was:
Crusty the Clown fans should click here!!!
Perhaps I was a bit naive—or just stupid. But I clicked on the damned link. The actual transfer didn't take very long, and pageview (Sun's PostScript viewer) quickly brought up a picture of Crusty the Clown (from The Simpsons). The first inkling of any trouble was that my disk drive light came on. And stayed on. That almost never happens, especially without me doing something. I pulled up another xterm. Something was really funny; my tcsh configuration was all wrong...
My drive light went off. My home directory was gone.
Some fancy footwork with my sysadmin got my home directory back (fortunately it had been backed up the night before). Since the catastrophe happened early in the morning, nothing except the previous evening's e-mail was lost.
I was damned lucky.
All that had happened was that someone had embedded a command line in the PostScript image. The PostScript viewer I was using happily executed the command and 200 megabytes of my files were simply erased.
As the Internet becomes more and more sophisticated, rich and complex schemes of data interpretation become more common. Each of these schemes is a potential security hole. Conventional security mechanisms are less than effective at dealing with these problems.
The breadth of use of live data makes finding an effective solution quite hard. Users are demanding the newer, richer Internet services and site administrators cannot delay adding such services for very long. The widespread deployment of web browsers and MIME e-mail involves fairly substantial security risks which have many unintended consequences.
The (true) nightmare story introducing this article is a good example. Most PostScript viewers have a secure mode which disables the file writing and program-launching operations that are part of the PostScript language. However, my local configuration did not use the secure mode and I got burned by someone's idea of humor. Problems with embedded PostScript are well-known, however. Other tools which (at first) seem innocuous may also have potential problems.
The infamous Internet Worm of 1987 used a simple buffer-overrun bug in the finger daemon to compromise security on a great many computers. Historically, a great many C programs have had similar flaws—maybe not as dangerous to security as the finger bug, but often quite catastrophic in their own right.
Consider the current state-of-the-art in web browsers. The vast majority of the web browsers currently on the market share a common ancestry with NCSA's Mosaic. Given the intense market pressures to bring web clients to market in today's Internet feeding frenzy and greased-pig-catching contest, it seems unlikely that extreme care has been taken in ensuring that web browsers securely interpret HTML. Buffer overflow problems (which seem particularly likely in anchors) could give unscrupulous individuals a way to execute any arbitrary code sequence on target machines.
This problem is only getting worse. The Java language is currently exploding through the Internet world, and it seems likely that there will be a great many surprises associated with Java. Even though Java is intended to provide security mechanisms to prevent abuse, all it takes is one flaw to cause a major problem.
Another example of live data (which few people have considered) is the local variables list in Emacs. This is a highly convenient feature, as it allows per-file customization for the Emacs editor. For example, programmers can code their bracing and indentation preferences into the file, so other emacs users who edited the file are automatically able to follow those same bracing conventions. However, since many of the local “variables” are typically Emacs Lisp functions, there could be a substantial risk associated with merely viewing a file in Emacs—the same risks associated with viewing PostScript files with an unsafe viewer.
The Emacs problem is fairly manageable. If you set the variable enable-local-variables to nil, this feature of Emacs is disabled.
It seems that the biggest risks are associated with the most complex services. MIME e-mail provides a mechanism for launching viewer programs based on the contents of the e-mail. So, if you send a JPEG picture, a JPEG viewer will be launched. Thus, a very large (and rapidly expanding) set of programs can be involved—too many to ensure reasonable security. Some MIME configurations even allow the transmission and execution of shell scripts or Perl programs.
The cheapest, most reliable, and most secure components of any system are the ones that aren't there. Is it possible to solve the live data problem by avoiding the use of the advanced tools that pose the threat?
It probably isn't practical for a site administrator to outlaw such tools. For one thing, they are all too useful, and one has to weigh a possibly substantial security risk against a threat to an organization's competitiveness. In addition, it is unlikely that any but the most draconian site administrators could prevent users from acquiring their own personal web browsers or MIME e-mail clients. Fascist site administration might make matters worse, since the individuals who were using the outlawed tools would obviously not be informing site security staff of what they were doing.
Internet firewalls are becoming a popular security mechanism. However, it doesn't seem likely that they can protect against hostile live data. They could completely block risky services (like e-mail and the Web), but if you completely block those services there isn't much point to being on the Internet. It also does not seem practical for a firewall to inspect all data coming from the net and looking for “dangerous” activities. For one thing, we don't have a good mechanism for distinguishing “friendly” live data from “unfriendly” live data—and the most dangerous live data doesn't look live at all. Another point to think about is that the effort involved in trying to solve this problem will likely put an unacceptable performance burden on the firewall.
There does not seem to be any “silver bullet” solution to this problem. However, there are some fairly simple steps that can be taken to provide reasonable protection:
If possible, run your web browser and perform other possibly risky activities (e.g. viewing PostScript files, running programs you have downloaded) as another user ID that you use expressly for that purpose. This makes it less likely that major damage to your personal files or your system could occur.
Save your “dot files” frequently. It may also help to allow only read-only access to such files. Dot files (such as your .profile, .emacs, .exrc, or .rhosts files) are frequent targets of live-data attacks.
Probably the best advice is to be a little bit paranoid. If you get a large MIME attachment that appears to be a shell script, treat it the same way you would treat an armed bomb.
The best advice is not to avoid tools which use live data, but rather to use them very carefully. Being aware of the risks is probably the best defense. So have fun, and be careful out there.
David Bonn When he isn't busy skiing, he is usually fiddling around with Linux. When he isn't doing those two things, he is busy being president of Mazama Software Labs. Since David graduated from the University of Washington in 1986, most of his computer time has been spent working on networked systems.