Tweaking Tux, Part 1

by Marcel Gagné

Hello, true believers. Your response to the Tell-Tale Heart series (in particular, the /proc wrap-up) was touching and buoyed my heart. *sniff* Hot on the heels of that last column, I want to start something new, something I call "Tweaking Tux". While to some, this may conjure up images of a certain puffy white character laughing when his belly is poked, I want the image to be this: tuning your Linux system for performance, fun and profit. Well, maybe it's just fun.

Okay, here's my plan. Over the next few installments in this series, we're going to visit another one of those topics that nobody seems to want to talk about (like printing - well, can you blame them?). Back a million years ago when I first started working with UNIX systems, we always had to compile our own kernels (as in the early days of Linux). We did this because every system was different in terms of hardware, drivers, number of users, etc., and all these parameters had to be taken into consideration before you ever booted your system. Otherwise, the 20th person would log in and the system would go down in flames. Resource and capacity planning as well as performance tuning were low-level issues that had to be considered. Not so much with today's systems. For the most part, when someone asks me when they should rebuild their kernels, I answer this way.

"Never."

Unless you have to, that is. No, this series is not about rebuilding your kernel. It's about not rebuilding your kernel. It's also about not rebooting.

Last week, we talked about the /proc filesystem and all its wonders. I left you with a little question of modifying a running kernel, and that was that. We're going to start by exploring /proc in quite a bit more detail in this series, but in bits and pieces. I also want to talk about performance, limits, and knowing when your system needs more. But first, a little more about /proc.

Okay. Change to the /etc/sysconfig directory, and do a cat on your network file. It should look something like this:

   NETWORKING=yes
   FORWARD_IPV4=no
   HOSTNAME=netgate.mycompany.com
   GATEWAY=192.168.22.10
   GATEWAYDEV=

In particular, look at the second line (FORWARD_IPV4). For those who aren't already familiar with this concept, IP forwarding means routing. Routing means a networked computer will forward or direct packets between networks, specifically packets from other computers on your network. In this manner, a computer with a single dial-up Internet connection can act as a gateway for a whole network of computers. By default, your system does not do IP forwarding. If you wanted to change it so that at bootup (or network restart) you did have forwarding turned on, you would change the value of FORWARD_IPV4 to "yes" instead of "no". Now, here's a little problem for all you upgraders.

If you are upgrading to Red Hat 6.2 and are currently running with IP forwarding, you may find that things don't seem to be working with your old configuration scripts. If you start with a squeaky-clean 6.2 system, your /etc/sysconfig/network file will have an entry that says something like this (going by memory here):

   # FORWARD_IPV4 removed; see /etc/sysctl.conf

The /etc/sysctl.conf file looks like this:

   # Disables packet forwarding
   net.ipv4.ip_forward = 0
   # Enables source route verification
   net.ipv4.conf.all.rp_filter = 1
   # Disables automatic defragmentation (needed for masquerading, LVS)
   net.ipv4.ip_always_defrag = 0
   # Disables the magic-sysrq key
   kernel.sysrq = 0

In both cases, what is happening is the same. As a result of these parameters being set, changes are made to an entry in the /proc filesystem, in this case, /proc/sys/net/ipv4/ip_forward. If you cat this file, you'll see a simple "0" sitting there, or a "1" if you are running forwarding. To make a short story long, changing your network to go from not-forwarding to forwarding is as simple as changing the /proc entry.
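If you line up the names in sysctl.conf with the path under /proc, a pattern jumps out: swap the dots for slashes and stick /proc/sys in front. Here is a minimal sketch of that mapping. The function name sysctl_key_to_path is my own invention for illustration, and it assumes no piece of the key contains a dot of its own.

```shell
#!/bin/sh
# sysctl_key_to_path KEY
# Translate a sysctl-style key into the matching /proc/sys path.
# Simplification: assumes no component of the key itself contains a dot.
sysctl_key_to_path() {
    printf '/proc/sys/%s\n' "$(printf '%s' "$1" | tr '.' '/')"
}

sysctl_key_to_path net.ipv4.ip_forward
sysctl_key_to_path kernel.sysrq
```

Run as is, this prints /proc/sys/net/ipv4/ip_forward followed by /proc/sys/kernel/sysrq.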

The way to write your own value into a /proc entry (specifically ip_forward, in this case) is this:

   echo "1" > /proc/sys/net/ipv4/ip_forward

That would have exactly the same effect as mucking about with the other network startup scripts. To make sure this happens at boot time, just add that line (with appropriate comments for yourself) to your /etc/rc.d/rc.local script.
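If you would rather look before you leap, here is a rough sketch of a wrapper around that echo. It only writes when the value actually differs, then reads the entry back so you can see the change took. The name set_proc_value is hypothetical (it is not part of any distribution), and you need to be root for a real /proc entry.

```shell
#!/bin/sh
# set_proc_value FILE VALUE
# Write VALUE to FILE only if it differs from the current contents,
# then read the file back to confirm the change took effect.
set_proc_value() {
    file=$1
    value=$2
    if [ ! -w "$file" ]; then
        echo "cannot write to $file (not root, or entry missing?)" >&2
        return 1
    fi
    current=$(cat "$file")
    if [ "$current" = "$value" ]; then
        echo "$file already set to $value"
        return 0
    fi
    echo "$value" > "$file"
    echo "$file: $current -> $(cat "$file")"
}

# Example (as root):
#   set_proc_value /proc/sys/net/ipv4/ip_forward 1
```

Remember that anything written this way lasts only until the next reboot, which is why the boot-time script still matters.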

Cool? The point (was I making one?) is that you can change things on your running system, essentially modifying your running kernel. In fact, a number of the entries in /proc can be modified while your system is running. The reasons for doing this vary, from changing the behaviour of things, to extending limits that are otherwise built into your kernel at compile time, to improving performance. The downside is that you can wind up with things worse than they were before you started, which is why I offer the following warning. <weasel words> Be careful. Be very careful whenever you change things in /proc. </weasel words>

By now, we're all familiar with DoS (denial of service) attacks. A fairly simple one is the TCP SYN flood. The quick-and-dirty on this one is that in establishing communication with a remote network, you send a packet that is then acknowledged by that network, whose acknowledgement you then acknowledge. Sort of like this silly conversation.

  "Hey there, remote."

  "Hey yourself."

  "Good, you're home and you're talking. Let's chat."

Usually when you call someone on the phone, you expect them to say "hello", at which point you say "hello" back. If I were to call your house a thousand times in a matter of seconds and hang up immediately, I might be well on my way to either making you nuts or dangerous. In the case of your system, a TCP SYN flood is that phone ringing thousands of times in a very short period of time. Meanwhile, your system is still keeping an ear open, looking to hear a reply to its "hello", a reply which isn't coming. Force a system to keep track of too many such one-sided messages (there's a system/kernel table that does this), and your network may become unavailable (or worse). This is what syncookies are for: a mechanism that lets your system answer those "hellos" without keeping an entry for every unacknowledged acknowledgement (what a mouthful) once that table fills up. If your network happens to be under attack, syncookies keep a flood of SYN packets from pushing your system over the edge.

Now, any recent Linux kernel has support for TCP syncookies compiled in, but not turned on. If your system is connected to the Internet as a gateway, you will definitely want to turn this feature on. To do that, you go back to /proc. A cat of /proc/sys/net/ipv4/tcp_syncookies will show it to be set to 0 (for "off"). To turn syncookie protection on, change that to a 1 just as you did with the IP forwarding.

     echo "1" > /proc/sys/net/ipv4/tcp_syncookies
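"Compiled in but not turned on" and "not compiled in at all" look different in /proc: in the second case the entry simply isn't there. The following sketch tells the two apart. The helper name flag_state is made up for this example, and it treats any non-zero value as "on".

```shell
#!/bin/sh
# flag_state FILE
# Print "unsupported" if the entry is missing (or unreadable),
# "off" if it contains 0, and "on" for anything else.
flag_state() {
    if [ ! -r "$1" ]; then
        echo "unsupported"
    elif [ "$(cat "$1")" = "0" ]; then
        echo "off"
    else
        echo "on"
    fi
}

flag_state /proc/sys/net/ipv4/tcp_syncookies
```

If this reports "unsupported", no amount of echoing will help; that is one of the cases where a kernel with the feature built in is needed.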

Here's another interesting one before we go. Do a netstat -a and have a look at the display. You'll see something like this:

Active Internet connections (servers and established)
Proto Recv-Q Send-Q Local Address           Foreign Address       State
tcp        0      0 gate1.mycorp.com:2354   www.mycorp.com:pop3   TIME_WAIT
tcp        0      6 gate1.mycorp.com:2344   news.whodat.ca:nntp   ESTABLISHED
tcp        1      0 gate1.mycorp.com:2277   visit.rsite.org:www   CLOSE_WAIT
tcp       57      0 gate1.mycorp.com:2195   host.somsite.org:ftp  CLOSE_WAIT

Notice the numbers attached to the local address (e.g., gate1.mycorp.com:2195). These are TCP port numbers, assigned as needed by the system for its end of each connection. As you might guess, there is a limit to how many of these ports are available. The range has a lower and an upper limit: the lower limit is 1024 and the upper limit is 4999. Check it out for yourself.

     # cat /proc/sys/net/ipv4/ip_local_port_range
     1024 4999

If you are running a busy web site, you may find yourself hitting that wall. On such a (busy) site, the recommended procedure is to change it from 1024 4999 to 32768 61000, like this:

     echo "32768   61000" > /proc/sys/net/ipv4/ip_local_port_range
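To get a feel for how much room each setting gives you, a little arithmetic helps. This is just a sketch; port_range_size is a name I made up, and it counts the ports in an inclusive range.

```shell
#!/bin/sh
# port_range_size LOW HIGH
# Number of ports in the inclusive range LOW..HIGH.
port_range_size() {
    echo $(( $2 - $1 + 1 ))
}

port_range_size 1024 4999      # the default range
port_range_size 32768 61000    # the range suggested for a busy site

# To check the live system:
#   port_range_size $(cat /proc/sys/net/ipv4/ip_local_port_range)
```

That prints 3976 for the default and 28233 for the larger range, roughly seven times as many local ports to go around.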

Okay, one last little question. How does the system do this? Here is method number 1. On one of my systems (Red Hat 6.0), there's this little bit of code in the /etc/rc.d/init.d/network script:

         if [ "$FORWARD_IPV4" = "no" -o "$FORWARD_IPV4" = "false" ]; then
             value=0
             message="Disabling IPv4 packet forwarding"
         else
             value=1
             message="Enabling IPv4 packet forwarding"
         fi
         if [ $value != `cat /proc/sys/net/ipv4/ip_forward` ]; then
             action "$message" /bin/true
             echo "$value" > /proc/sys/net/ipv4/ip_forward
         fi

Notice the echo "$value" > line near the bottom. Looks pretty much like the same thing we've been doing at the command line. That's one; here is method number 2. Earlier, I mentioned that Red Hat 6.2 does it a bit differently; that's when we looked at the sysctl.conf file and its IP forwarding setting. How does that figure into what the system does for setting these parameters? The answer is that the file /etc/sysctl.conf is used by the command sysctl. Have a peek at your /etc/rc.d/init.d/network script (with vi or cat) and look for the following line:

   action "Setting network parameters" sysctl -p /etc/sysctl.conf

The -p for sysctl tells the command to read its parameters from the named file, in this case /etc/sysctl.conf, and one of those parameters is ip_forward. Here's my final question to you. Without looking at the source code, what do you think "sysctl" does?

This is where we stop and prepare for next time when we start to dig deeper into performance issues, look at more tools, and find other fun things to do with /proc. Until next time, give Tux a tweak. You both might enjoy it.
