Tweaking Tux, Part 4
With this installment, I want to get back to some of the basics. How do you read your system's mind? How can you find out if old Tux is maybe a little stressed out? We've already seen uptime and top. I hinted (actually, I might have said it) that one of top's problems is that it is top-heavy (ahem). When you run it, the program itself is often the most active and taxing process. This calls for something lighter and faster. I showed you something called free to whet your appetite. Now, I am going to show you something else.
This little command is virtually ubiquitous in some form or another, in that you can find it on most UNIX systems (including, of course, Linux). It represents another means of taking a peek into current CPU usage--it is called vmstat. (Ah, the command line lives!) The format of the command is vmstat interval_in_seconds number_of_intervals. In the following examples, I am taking a sample every 2 seconds for 5 iterations.
# vmstat 2 5
 procs                  memory    swap        io    system         cpu
 r b w  swpd  free  buff cache  si  so   bi   bo   in   cs  us  sy  id
 0 0 0 14004  2288  2960 22992   0   0    1    1   21   23   4   3  16
 0 0 0 14004  2168  2968 23104  28   0   15    0  225   18   3   2  94
 0 0 0 14004  2168  2968 23104   0   0    0   23  285    9   2   1  97
 0 0 0 14004  2168  2968 23104   0   0    0    0  105    9   1   2  97
 0 0 0 14004  2168  2968 23104   0   0    0    3  125   10   1   2  97
Let's look at vmstat in more detail, including what you can decipher from some of the other columns. Notice the columns "us", "sy" and "id". They represent the percentage of CPU time spent dealing with user programs or requests (us); the percentage of CPU time dealing with system tasks (sy), such as waiting on I/O, updating the system stats, maintaining priorities and all the other systemy stuff; and the percentage of CPU time spent doing nothing at all (id). If your computer were human, it would be flipping through the TV Guide at this point. This should look familiar--it's another representation of what top displayed in its "CPU states" header. (See last week's column for a discussion of top.)
For the numbers to actually mean something, you should take regular samplings or let vmstat run for several iterations, such as vmstat 1 20. One pass using vmstat 1 1 won't tell you anything worthwhile. Let's say that you consistently see the "us" column at a high percentage with "sy" taking up the rest, and next to no idle time. We might guess (remember, lots of factors) that the system is overloaded with tasks. Essentially, the system is doing nothing but servicing tasks, whether they be system or user.
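If you would rather not eyeball the columns, here is a minimal sketch in shell and awk that averages them for you. The function name vmstat_cpu_avg is my own invention, not something vmstat provides. It assumes the header row names the columns "us", "sy" and "id" (as in the output shown in this column), finds them by name rather than by position, and skips the first data row, since that one reports averages since boot rather than a live sample.

```shell
#!/bin/sh
# Sketch: average the us/sy/id columns of vmstat output read from stdin.
# vmstat_cpu_avg is a hypothetical helper name, not part of vmstat itself.
vmstat_cpu_avg() {
    awk '
        # Look for the header row and remember where us, sy and id live.
        !found {
            u = s = d = 0
            for (i = 1; i <= NF; i++) {
                if ($i == "us") u = i
                if ($i == "sy") s = i
                if ($i == "id") d = i
            }
            if (u && s && d) found = 1
            next
        }
        # The first data row is the average since boot, so skip it.
        !skipped { skipped = 1; next }
        # Only count rows whose "us" field is a number (ignores repeated headers).
        $u ~ /^[0-9]+$/ { us += $u; sy += $s; id += $d; n++ }
        END {
            if (n) printf "samples=%d us=%.2f sy=%.2f id=%.2f\n", n, us / n, sy / n, id / n
        }
    '
}
```

Run it as vmstat 1 20 | vmstat_cpu_avg. A long run of low "id" averages is the pattern described above; a single low sample is not.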
Disk I/O is the great bottleneck of any system. For the sake of simplicity, I won't get into removable media like CD-ROMs at this time. Because of this bottleneck, anything that impacts disk access time can make a substantial difference to your system. For instance, demands for memory can cause your system to swap heavily, which can cause your disks to thrash as pages of information start a manic dance from memory to disk and back to memory. Let's have a look at another vmstat output.
# vmstat 2 5
 procs                  memory    swap        io    system         cpu
 r b w  swpd  free  buff cache  si  so   bi   bo   in   cs  us  sy  id
 1 0 0  3684  1120  1308 25692   0   1   19    2  267 1270   8   3  89
 1 0 0  3684  1376  1308 25052   0   0 2687    0  172 2483  19  15  67
 1 0 0  4172  1412  1308 25408   0 488 3230  122  205  346   2   6  92
 1 0 0  4168   724  1308 24768   0   0 1417    0  180  699  24   5  71
 1 0 0  4300   548  1308 25376   0 132 2528   33  204 2790  13   3  84
This is a system where several interesting things are happening. The "r" column shows how many processes are currently in the run queue. The "w" column tells me that no processes are swapped out. This is a good thing. In fact, that number should (ideally) always be zero. Have a look at the "swpd" column and you'll see that we are starting to use more and more swap space. The "free" column is shrinking as whatever is happening continues to happen. The system, faced with a shortage of real memory, starts to swap a great deal of memory out to disk. You can see that activity in the "swap" set of columns ("si" and "so"), which can be a great indicator of how your system is doing as far as memory is concerned. The "si" column refers to the amount of memory swapped in from disk, while "so" is memory swapped out to disk. That's memory, not processes. You shouldn't immediately panic when you see this. This kind of swapping in and out is a pretty common thing to see, particularly when you first start up a huge application (monster word processing programs anyone?). When everything finishes loading, you'll see those numbers settle down again. Some real memory will be freed up and the amount of used virtual memory (swpd) will likely be somewhat higher. The real problem becomes evident when you start hitting the ceiling with virtual memory and "si/so" activity never seems to settle down.
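To judge whether "si/so" activity is settling down or not, it helps to count how many samples show any swapping at all. What follows is only a sketch under the same assumptions as the output above: the header row must name the columns "si" and "so", and the helper name vmstat_swap_activity is hypothetical. As before, the first data row is skipped because it is an average since boot.

```shell
#!/bin/sh
# Sketch: count the vmstat samples (read from stdin) that show swap traffic.
# vmstat_swap_activity is a hypothetical helper name.
vmstat_swap_activity() {
    awk '
        # Find the header row and note the positions of si and so.
        !found {
            si = so = 0
            for (i = 1; i <= NF; i++) {
                if ($i == "si") si = i
                if ($i == "so") so = i
            }
            if (si && so) found = 1
            next
        }
        # Skip the first data row (average since boot).
        !skipped { skipped = 1; next }
        # A sample is "busy" if anything was swapped in or out.
        $si ~ /^[0-9]+$/ {
            n++
            if ($si + $so > 0) busy++
        }
        END { printf "swap activity in %d of %d samples\n", busy, n }
    '
}
```

Fed the sample output above, this reports swap activity in 2 of 4 samples. On a healthy system running vmstat 5 60 | vmstat_swap_activity, the first number should drift back toward zero once a big application has finished loading.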
Keeping an eye on those numbers is important, particularly if your system is being used by others. Sure, they'll let you know if it gets bad enough, but that's not the best way. Using tools like vmstat, you can spot trends over time as your system usage increases. (Does it ever decrease?) Consider a crontab entry that runs vmstat on a regular basis and outputs the information to a file. Then, have another job (sometime during the night) e-mail the report file, remove the old log and get ready to start over for the next day.
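One possible way to set that up is sketched below. Everything named here is an assumption for illustration: the script location (/usr/local/bin/vmstat-log.sh), the log path (/var/log/vmstat.log), the schedule, and the use of the mail command to send the report to root. Adjust all of them to suit your own system.

```shell
#!/bin/sh
# vmstat-log.sh: hypothetical helper meant to be run from cron.
#
# Example crontab entries (schedule and paths are assumptions):
#   */10 * * * * /usr/local/bin/vmstat-log.sh sample
#   55 23 * * *  /usr/local/bin/vmstat-log.sh report

# Where the samples accumulate; override with the VMSTAT_LOG variable.
vmstat_log() {
    echo "${VMSTAT_LOG:-/var/log/vmstat.log}"
}

# Append a timestamp and one short vmstat run to the log.
vmstat_sample() {
    { date; vmstat 2 5; } >> "$(vmstat_log)" 2>&1
}

# Mail the accumulated log, then remove it only if the mail command succeeded.
vmstat_report() {
    log=$(vmstat_log)
    [ -s "$log" ] || return 0          # nothing collected, nothing to send
    mail -s "vmstat report" root < "$log" && rm -f "$log"
}

# Dispatch on the first argument; do nothing if none is given.
case "${1:-}" in
    sample) vmstat_sample ;;
    report) vmstat_report ;;
esac
```

Removing the log only after the mail command succeeds means a failed delivery costs you nothing: the next night's report simply includes two days of samples.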
Let's revisit hdparm before we move on to something else. The response from most people was a pretty universal "Wow!" (except for the Mandrake 7.1 user who pointed out that some of the hdparm tweaks had already been incorporated into that version). Several readers asked me what to do with the hdparm commands once they had something that worked, since those settings are lost at reboot time. Certainly, you don't want to enter them each time you reboot (it can be so long between reboots that a sysadmin might well forget all changes that were ever made). Let's take a moment and look at some likely commands for tweaking with hdparm.
# /sbin/hdparm -c3 /dev/hda
# /sbin/hdparm -d1 /dev/hda
You can also turn this into one command with the following:
# /sbin/hdparm -c3 -d1 /dev/hda
If it turns out that these two commands (or one) had the desired effect on your system performance (namely, a speedier set of disks), then you need to get these into your startup scripts. Using your favorite editor, add the command to your rc.local file. On my system, that's the /etc/rc.d/rc.local file to be exact. Now, this example is from a Red Hat system, and the location of the rc.local (or its equivalent) varies from system to system. If you need a refresher on this one, check out Tweaking Tux, Part II for details on locating your startup file, regardless of system.
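If you prefer to script the change rather than open an editor, here is a small sketch. The helper name add_startup_line is hypothetical, and the rc.local path is the Red Hat one mentioned above, so substitute your own. It appends the line only when that exact line is not already in the file, which keeps the command from piling up if you run the helper more than once.

```shell
#!/bin/sh
# Sketch: append a command to a startup file unless that exact line is
# already present. add_startup_line is a hypothetical helper name.
add_startup_line() {
    file=$1
    line=$2
    # -x matches whole lines only, -F treats the text as a fixed string,
    # -q suppresses output; a missing file simply counts as "not found".
    grep -qxF -e "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
}
```

As root, you would call it like this: add_startup_line /etc/rc.d/rc.local "/sbin/hdparm -c3 -d1 /dev/hda". Only do so after confirming by hand that those hdparm settings behave well on your drive.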
Time to have a little fun. Yes, I know, it's all fun; but this is just plain fun while still being useful (confused yet?). Anyhow, one of the things I love so much about Linux is there is always some open-source developer out there trying to make the programs we use more useful, more powerful or just plain more friendly, all the while keeping the price down. So it is with performance monitoring.
It's kind of weird, but after a while, you find yourself wanting that information at all times--as though knowing just how your system is doing becomes a kind of addiction. Small, graphical tools can be quite nice to keep you on top of that information with a glance. I tend to run either KDE (2.0 looks great!) or WindowMaker as my desktop. What I particularly like about WindowMaker are those little dockable applets, small programs that take up next to no space on your desktop. In fact, I've been running those applets even when using KDE. For system performance analysis at a glance, I use something called wmmon.
With a single click, the applet switches from CPU utilization to I/O stats then to a combined display showing memory, swap and uptime. It's simple and it's beautiful. There are binaries (and source) for wmmon at the WindowMaker web site. (I confess to having developed quite the fondness for these little, desktop real estate friendly apps.) To start the program, simply type wmmon &.
Quick tip time. If you are running this from KDE, as I often do, and you want to place the program somewhere else on your desktop, you'll quickly notice the complete absence of a title bar to grab on to. You'll need to hold down the "Alt" key while clicking the wmmon display to drag it into place.
Another variation of wmmon, a program called WMgMon (WindowMaker Generic Monitor), can be found at http://www.caesium.fr/freeware/wmgmon/ . What's cool about this one is that the program flashes from one display to another, alternating every few seconds between the various modes. WMgMon even shows you the status of your mailbox. It is also configurable and allows you to modify the interval of the display and what the app actually displays. You have to compile WMgMon from source, but that is extremely easy.
tar -xzvf wmgmon-0.3.0.tar.gz
cd wmgmon.app
make
cp bin/wmgmon /usr/local/bin
Then, just type wmgmon & to run the program. There is some documentation (but only a little) in the wmgmon.app/doc directory. Check it out if you want to modify the program defaults.
Finally, there's that old favourite, xosview, an X-Windows application that is probably already installed on your system. It displays CPU activity, I/O, load average, memory usage and swap, both numerically and with little hyperactive bar graphs. It even flashes device access through interrupts with little blinking lights at the bottom.
If you don't happen to have it with your distribution, you can always download it from http://www.ibiblio.org/pub/Linux/system/status/xstatus/xosview-1.7.1.tar.gz.
Compiling the program is standard fare.
tar -xzvf xosview-1.7.1.tar.gz
cd xosview-1.7.1
./configure
make
make install
I've rambled on more than enough for one week, so I will say "Thanks for dropping by the SysAdmin's Corner". Until next time, give Tux a tweak. You both might enjoy it.
email: ljeditors@ssc.com