Hack and / - Bond, Ethernet Bond

by Kyle Rankin

As a sysadmin, one of the most important virtues you should foster is tolerance. Now, although it's important for vi and Emacs users and GNOME and KDE users to live in harmony, what I want to focus on here is fault tolerance. Everybody has faults, but when your server's uptime is on the line, you should do everything you can to make sure it can survive power faults, network faults, hardware faults and even your faults. In this column, I describe a basic fault-tolerance procedure that's so simple to implement, all sysadmins should add it to their servers: Ethernet bonding.

Live and Let Interfaces Die

The basic idea behind Ethernet bonding is to combine two or more Ethernet ports on your machine, such that if one Ethernet port loses its connection, another bonded port can take over the traffic with zero or minimal downtime. The fact is that these days, the bulk of the services on a server require a network connection to be useful, so if you set up multiple Ethernet ports that are connected to redundant switches, you can conceivably survive a NIC failure, a cable failure, a bad switch port or even a switch failure, and your server will stay up.

On top of basic fault tolerance, certain Ethernet bonding modes also provide load balancing, so you can get more aggregate bandwidth than a single interface could offer. Ethernet bonding is a feature of the Linux kernel, and it can provide a number of different behaviors based on which bonding mode you choose. All of the Ethernet bonding documentation can be found in the Documentation/networking/bonding.txt file included with the Linux kernel source. Below, I provide an excerpt from that documentation that lists the 007 bonding modes:

  • balance-rr or 0 — round-robin policy: transmit packets in sequential order from the first available slave through the last. This mode provides load balancing and fault tolerance.

  • active-backup or 1 — active-backup policy: only one slave in the bond is active. A different slave becomes active if, and only if, the active slave fails. The bond's MAC address is externally visible on only one port (network adapter) to avoid confusing the switch.

  • balance-xor or 2 — XOR policy: transmit based on the selected transmit hash policy. The default policy is a simple [(source MAC address XOR'd with destination MAC address) modulo slave count]. Alternate transmit policies may be selected via the xmit_hash_policy option, described below. This mode provides load balancing and fault tolerance.

  • broadcast or 3 — broadcast policy: transmits everything on all slave interfaces. This mode provides fault tolerance.

  • 802.3ad or 4 — IEEE 802.3ad dynamic link aggregation: creates aggregation groups that share the same speed and duplex settings. Utilizes all slaves in the active aggregator according to the 802.3ad specification.

  • balance-tlb or 5 — adaptive transmit load balancing: channel bonding that does not require any special switch support. The outgoing traffic is distributed according to the current load (computed relative to the speed) on each slave. Incoming traffic is received by the current slave. If the receiving slave fails, another slave takes over the MAC address of the failed receiving slave.

  • balance-alb or 6 — adaptive load balancing: includes balance-tlb plus receive load balancing (rlb) for IPv4 traffic and does not require any special switch support. The receive load balancing is achieved by ARP negotiation. The bonding driver intercepts the ARP Replies sent by the local system on their way out and overwrites the source hardware address with the unique hardware address of one of the slaves in the bond such that different peers use different hardware addresses for the server.

Now that you've seen all the choices, the real question is which bonding mode should you choose? To be honest, that's a difficult question to answer, because it depends so much on your network and what you want to accomplish. What I recommend is to set up a test network and simulate a failure by unplugging a cable while you ping the server from another host. What I've found is that different modes handle failure differently, especially in the case of a switch that takes some time to re-enable a port that has been unplugged or a switch that has been rebooted. Depending on the bonding mode you choose, those situations might result in no downtime or a 30-second outage. For my examples in this column, I chose bonding mode 1 because although it provides only fault tolerance, it also has only one port enabled at a time, so it should be relatively safe to experiment with on most switches.

Note: the bonding mode is set at the time the bonding module is loaded, so if you want to experiment with different bonding modes, you will at least have to unload and reload the module or at most reboot the server.
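For instance, once you have a bond configured (as described below), you can cycle through modes without rebooting. A minimal sketch, using the sample addresses from this column — ifenslave, covered next, re-enslaves the physical ports after the reload:

$ sudo ifconfig bond0 down
$ sudo rmmod bonding
$ sudo modprobe bonding mode=0 miimon=100
$ sudo ifconfig bond0 192.168.19.64 netmask 255.255.255.0 up
$ sudo ifenslave bond0 eth0 eth1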

Ethernet bonding is accomplished through a kernel module and a utility called ifenslave, but the way you configure kernel module settings and network interfaces differs between Linux distributions. For this column, I talk about how to set this up for both Red Hat- and Debian-based distributions, as that should cover the broadest range of servers. The first step is to make sure the ifenslave program is installed. Red Hat servers should have it already, but on Debian-based systems, you might need to install the ifenslave package.
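
On a Debian-based system, that's a quick one-liner (on some releases, the package is named ifenslave-2.6 instead):

$ sudo apt-get install ifenslave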

For Your Files Only

The next step is to configure the bonding module with the bonding mode you want to use, along with any other options you might want to set for that module. On a Red Hat system, you will edit either /etc/modprobe.conf (for a 2.6 kernel) or /etc/modules.conf (for an older 2.4 kernel). On a Debian-based system, edit or create the /etc/modprobe.d/aliases file. In either case, add the following lines:

alias bond0 bonding
options bonding mode=1 miimon=100

The alias line associates the bond0 network interface with the bonding module. If you intend to have multiple bonded interfaces (such as on a system with four or more NICs), you will need to add an extra alias line for bond1 or any other interfaces. The options line sets my bonding mode to 1 and sets miimon, which controls how often, in milliseconds, the kernel checks the link state of each slave interface.
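
As a sketch, a four-NIC server with two bonded pairs might use the following, with the max_bonds module option telling the driver how many bond interfaces to create (the script in Listing 1 below sets this option as well):

alias bond0 bonding
alias bond1 bonding
options bonding mode=1 miimon=100 max_bonds=2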

On Your Distribution's Network Service

As with module configuration, different distributions handle network configuration quite differently, and that's true for bonded interfaces as well. So, at this point, it's best if I describe each system separately.

From Red Hat with Love

Red Hat network configuration is managed via files under /etc/sysconfig/network-scripts. Each interface has its own configuration file prefixed with ifcfg-, so the configuration for eth0 can be found in /etc/sysconfig/network-scripts/ifcfg-eth0. To configure bonding, you can simply use the same basic network settings you would have for your regular interface, only now they will live in ifcfg-bond0:

DEVICE=bond0
NETMASK=255.255.255.0
GATEWAY=192.168.19.1
BOOTPROTO=static
IPADDR=192.168.19.64
HOSTNAME=goldfinger.example.net
ONBOOT=yes

Next, each interface you want to use for bond0 needs to be configured. In my case, if I wanted to bond eth0 and eth1, I would put the following into ifcfg-eth0:

DEVICE=eth0
USERCTL=no
ONBOOT=yes
MASTER=bond0
SLAVE=yes
BOOTPROTO=none

and the following into ifcfg-eth1:

DEVICE=eth1
USERCTL=no
ONBOOT=yes
MASTER=bond0
SLAVE=yes
BOOTPROTO=none

Finally, type service network restart as root to restart your network service with the new bonded interface. From this point on, you can treat ifcfg-bond0 as your main configuration file for any network changes (and use files like route-bond0 to configure static routes, for instance). To make this even easier for you, I've included a script for Red Hat users that automates this entire process (Listing 1). Run the script with the name of the bonded interface you want to use (such as bond0), followed by the list of interfaces you want to bond, and it will set up the modules and network interfaces for you automatically based on the configuration it finds in the first interface (such as eth0) that you list. So, for instance, to set up the above configuration, I would make sure that ifcfg-eth0 had the network settings I wanted to use, and then I would run:

# bond bond0 eth0 eth1

Listing 1. Bond Script for Red Hat Users

#!/usr/bin/perl

# bond -- create a bonded interface out of two or
# more physical interfaces
# Created by Kyle Rankin
#

my $bond_interface = shift;
my @interfaces = @ARGV;
my $network_scripts_path = '/etc/sysconfig/network-scripts/';
my $bond_mode=1;
my $bond_miimon=100;
my $bond_max=2;

usage() unless (@ARGV);
if($#interfaces < 1){
   usage("ERROR: You must have at least 2 interfaces to bond!");
}

system("/etc/init.d/network stop");

config_bond_master($bond_interface, $interfaces[0]);
foreach(@interfaces){
   config_bond_slave($bond_interface, $_);
}
config_modules($bond_interface, $bond_miimon, $bond_mode);

system("/etc/init.d/network start") or die 
 ↪"Couldn't start networking: $!\n";

sub usage
{
   $error = shift;
   print "$error\n" if($error);
   print "Usage: $0 bond_interface interface1 interface2 [...]\n";
   print "\nbond_interface will use the network 
   ↪settings of interface1\n";
   exit
}

sub config_bond_master
{
   my $bond_interface = shift;
   my $main_interface = shift;
   my $netconfig_ref = get_network_config($main_interface);

   open CONFIG, "> $network_scripts_path/ifcfg-$bond_interface"
      or die "Can't open $network_scripts_path/ifcfg-$bond_interface: $!\n";

   print CONFIG "DEVICE=$bond_interface\n";
   foreach(keys %$netconfig_ref){
      unless($_ eq "HWADDR" || $_ eq "DEVICE"){
         print CONFIG "$_=$$netconfig_ref{$_}\n";
      }
   }
   close CONFIG;
}

sub config_bond_slave
{
   my $bond_interface = shift;
   my $slave_interface = shift;
   my $netconfig_ref = get_network_config($slave_interface);

   open CONFIG, "> $network_scripts_path/ifcfg-$slave_interface"
      or die "Can't open $network_scripts_path/ifcfg-$slave_interface: $!\n";
   print CONFIG <<"EOC";
DEVICE=$slave_interface
USERCTL=no
ONBOOT=yes
MASTER=$bond_interface
SLAVE=yes
BOOTPROTO=none
EOC
   if($$netconfig_ref{'HWADDR'}){
      print CONFIG "HWADDR=$$netconfig_ref{'HWADDR'}\n";
   }
   close CONFIG;
}

# This subroutine returns a hash with key-value pairs matching 
# the network configuration for the interface passed as an 
# argument according to the configuration file in 
# /etc/sysconfig/network-scripts/ifcfg-interface
sub get_network_config
{
   my $interface = shift;
   my %netconfig;
   open(CONFIG, "$network_scripts_path/ifcfg-$interface")
      or die "Can't open $network_scripts_path/ifcfg-$interface: $!\n";
   while(<CONFIG>)
   {
      chomp;
      ($key, $value) = split '=';
      $netconfig{uc($key)} = $value;
   }
   close CONFIG;
   return \%netconfig;
}

sub config_modules
{
   my $bond_interface = shift;
   my $bond_miimon = shift;
   my $bond_mode = shift;
   my $bond_options_configured = 0;
   my $bond_alias_configured = 0;

   if(-f "/etc/modprobe.conf"){ # for 2.6 kernels
      $module_config = "/etc/modprobe.conf";
   }
   else {
      $module_config = "/etc/modules.conf";
   }
   open CONFIG, "$module_config"
      or die "Can't open $module_config: $!\n";
   while(<CONFIG>){
      if(/options bonding/){ $bond_options_configured = 1; }
      if(/alias $bond_interface bonding/){ $bond_alias_configured = 1; }
   }
   close CONFIG;

   open CONFIG, ">> $module_config"
      or die "Can't open $module_config: $!\n";
   unless($bond_alias_configured)
   {
      print CONFIG "alias $bond_interface bonding\n";
   }
   unless($bond_options_configured)
   {
      print CONFIG "options bonding 
      ↪miimon=$bond_miimon mode=$bond_mode max_bonds=$bond_max\n";
   }
   close CONFIG;
}
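
Once the script has run and networking is back up, one quick sanity check is to ask sysfs which mode the driver actually loaded with (the exact output varies a bit between kernel versions):

$ cat /sys/class/net/bond0/bonding/mode
active-backup 1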

Debian Is Forever

As you might imagine, Debian's network configuration is quite different from Red Hat's. Unfortunately, I don't have a Perl script to automate the process for Debian users, but as you will see, it's so simple, a script isn't necessary. All the network configuration for Debian-based servers can be found in /etc/network/interfaces. Here's a sample interfaces file:

# The loopback network interface
auto lo
iface lo inet loopback

# The primary network interface
auto eth0
iface eth0 inet static
  address 192.168.19.64
  netmask 255.255.255.0
  gateway 192.168.19.1

To configure a bonded interface, I simply comment out all of the configuration settings for eth0 and create a new configuration for bond0 that copies all of eth0's settings. The only change I make is the addition of an extra parameter called slaves that lists which interfaces should be used for this bonded interface:

# The loopback network interface
auto lo
iface lo inet loopback

# The primary network interface
#auto eth0
#iface eth0 inet static
#  address 192.168.19.64
#  netmask 255.255.255.0
#  gateway 192.168.19.1

auto bond0
iface bond0 inet static
  address 192.168.19.64
  netmask 255.255.255.0
  gateway 192.168.19.1
  slaves eth0 eth1

Once you have made the changes, type sudo service networking restart or sudo /etc/init.d/networking restart to restart your network interface.
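
Depending on the version of the ifenslave package on your system, you also may be able to skip the module options file entirely and declare the bonding settings right in /etc/network/interfaces with bond- parameters. A sketch, assuming a newer ifenslave:

auto bond0
iface bond0 inet static
  address 192.168.19.64
  netmask 255.255.255.0
  gateway 192.168.19.1
  bond-slaves eth0 eth1
  bond-mode active-backup
  bond-miimon 100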

No matter whether you use Red Hat or Debian, once you have configured the bonded interface, you can use the ifconfig command to see that it has been configured:

$ sudo ifconfig
bond0	Link encap:Ethernet HWaddr 00:0c:29:28:13:3b
	inet addr:192.168.19.64 Bcast:192.168.19.255 Mask:255.255.255.0
	inet6 addr: fe80::20c:29ff:fe28:133b/64 Scope:Link
	UP BROADCAST RUNNING MASTER MULTICAST MTU:1500 Metric:1
	RX packets:38 errors:0 dropped:0 overruns:0 frame:0
	TX packets:43 errors:0 dropped:0 overruns:0 carrier:0
	collisions:0 txqueuelen:0
	RX bytes:16644 (16.2 KB) TX bytes:3282 (3.2 KB)

eth0	Link encap:Ethernet HWaddr 00:0c:29:28:13:3b
	UP BROADCAST RUNNING SLAVE MULTICAST MTU:1500 Metric:1
	RX packets:37 errors:0 dropped:0 overruns:0 frame:0
	TX packets:43 errors:0 dropped:0 overruns:0 carrier:0
	collisions:0 txqueuelen:1000
	RX bytes:16584 (16.1 KB) TX bytes:3282 (3.2 KB)
	Interrupt:17 Base address:0x1400

eth1	Link encap:Ethernet HWaddr 00:0c:29:28:13:3b
	UP BROADCAST RUNNING SLAVE MULTICAST MTU:1500 Metric:1
	RX packets:1 errors:0 dropped:0 overruns:0 frame:0
	TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
	collisions:0 txqueuelen:1000
	RX bytes:60 (60.0 B) TX bytes:0 (0.0 B)
	Interrupt:18 Base address:0x1480

lo	Link encap:Local Loopback
	inet addr:127.0.0.1 Mask:255.0.0.0
	inet6 addr: ::1/128 Scope:Host
	UP LOOPBACK RUNNING MTU:16436 Metric:1
	RX packets:0 errors:0 dropped:0 overruns:0 frame:0
	TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
	collisions:0 txqueuelen:0
	RX bytes:0 (0.0 B) TX bytes:0 (0.0 B)
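
In addition to ifconfig, the bonding driver reports its own view of the bond under /proc. The excerpt below is abbreviated, and the exact fields depend on your kernel and bonding mode, but it's the quickest way to confirm the mode and see which slave is currently active:

$ cat /proc/net/bonding/bond0
Bonding Mode: fault-tolerance (active-backup)
Currently Active Slave: eth0
MII Status: up
...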

Once the bonded interface is enabled, you can ping the server from a remote host and test that it fails over when you unplug one of the Ethernet cables. Any failures should be logged both in dmesg (/var/log/dmesg) and in the syslog (/var/log/messages or /var/log/syslog) and would look something like this:

Oct 04 16:43:28 goldfinger kernel: [ 2901.700054] eth0: link down
Oct 04 16:43:29 goldfinger kernel: [ 2901.731190] bonding: bond0: link status definitely down for interface eth0, disabling it
Oct 04 16:43:29 goldfinger kernel: [ 2901.731300] bonding: bond0: making interface eth1 the new active one.
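
If it isn't practical to keep unplugging cables, you can exercise the failover path from software as well, although downing a slave administratively isn't a perfect stand-in for a yanked cable or a dead switch port:

$ sudo ifconfig eth0 down
$ sudo ifconfig eth0 up

In active-backup mode, you also can force a particular slave to take over with ifenslave's -c option, as in sudo ifenslave -c bond0 eth1.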

As I said earlier, I highly recommend you experiment with each bonding mode and with different types of failures, so you can see how each handles both failures and recoveries on your network. When your system is more tolerant of failures, you'll find you are more tolerant of your system.

Kyle Rankin is a Systems Architect in the San Francisco Bay Area and the author of a number of books, including The Official Ubuntu Server Book, Knoppix Hacks and Ubuntu Hacks. He is currently the president of the North Bay Linux Users' Group.
