Dynamic Load-Balancing DNS: dlbDNS
The rapid growth of computer literacy has led to a dramatic rise in the number of people using computers today. This rise has resulted in the development of intense computation-oriented and resource-sharing applications. These factors together play a prominent role in increasing the load across the Internet, causing severe network traffic congestion. This phenomenon, though dynamic in nature, causes a lot of user frustration in the form of slow response times and repeated crashing of applications.
Developing servers with more capacity and capability of handling this traffic is one way to solve the problem; another is to distribute client requests across multiple servers. This second method is an elegant way of handling this problem, since it uses existing resources and avoids scenarios in which some servers are overloaded while the rest of them are idle. The need for distributing requests across servers is further strengthened, considering:
Each TCP session eats up 32 bytes of memory (a general rule of thumb), causing a server that has 32MB of RAM to theoretically support one million simultaneous connections (see Resources 2).
Given a number of servers, users always log in to their favorite server while overlooking the load on that server.
Distributing a request across servers can be implemented by monitoring the servers regularly and directing the request dynamically to the best server. This way of dynamically directing a request across multiple servers based on the server load is called dynamic load balancing. This feature can be added to the pre-existing Domain Name Service (DNS), as it already plays a prominent role in resolving client requests and can be configured to direct client requests across multiple servers in an effort to avoid network traffic congestion. Here, best server refers to the server with the best rating based on a rating algorithm to be explained later.
We will explain the design, implementation and benefits of a dynamic load-balancing DNS, dlbDNS, which extends DNS.
Four load-balancing models are available. First, RFC 1794 (see Resources 1) describes a load-balancing method using a special zone transfer agent that obtains its information from external sources. The new zone then gets loaded by the name server. One problem with this method is that between zone transfers, the weighted information is essentially static or possibly handed out in a round-robin fashion. This method also doesn't allow a virtual/dynamic domain where a response is created dynamically based on the name being queried (see Resources 4).
The second model is a dedicated load-balancing server which intercepts incoming requests and directs them to the best server. This design employs virtual IP addresses for internal use by the load-balancing server. One problem with this is it adds another server to the existing cluster of servers to be monitored, instead of utilizing the available resources.
A third model is a remote monitoring system that monitors the performance of different servers and provides feedback to the DNS. This design helps detect problems not visible internally, and provides truer access time measurements and easy detection of configuration errors that affect external users. The major problem here is the dependency on the remote network to monitor and deliver data (see Resources 5).
Last is an internal monitoring system that monitors the performance of the servers and provides feedback to the DNS. Its major advantages are easy maintainability and administration, closeness to the source of addressable problems and no security hazards (see Resources 5). This design is implemented in dlbDNS.
Initially, load-balancing was intended to permit DNS agents to support the concept of machine clusters (derived from the VMS usage) where all machines were functionally similar or the same. It didn't particularly matter which machine was picked, as long as the processing load was reasonably well-distributed across a series of actual different hosts. With servers of different configurations and capacities, there is a need for more sophisticated algorithms (see Resources 1).
“Round-robin algorithm A” can distribute requests in a round-robin fashion evenly across servers. Although the requests are handled dynamically, the problem is the total ignorance of various performance characteristics.
“Load-average algorithm A” can distribute requests across servers based on the server load. This design is very simple and fairly inexpensive, but fails miserably if servers vary in configuration and potential.
“Rating algorithm A” is based on the number of users and load-average shown below. This algorithm is reasonable, as its rating favors hosts with the smallest number of unique logins and lower load averages (see Resources 4). This rating algorithm is implemented in dlbDNS to determine the best server.
WT_PER_USER = 100 USER_PER_LOAD_UNIT = 3 FUDGE = (TOT_USER - UNIQ_USER) * (WT_PER_USER/5) WEIGHT = (UNIQ_USER * WT_PER_USER) + (USER_PER_LOAD_UNIT * LOAD) + FUDGE
where the variables are
TOT_USER: total number of users logged in
UNIQ_USERS: unique number of users logged in
LOAD: load average over the last minute, multiplied by 100
WT_PER_USER: pseudo-weight per user
FUDGE: fudge factor for users logged in more than once
WEIGHT: rating of the server
To get started, we downloaded BIND 8.1.2 from the Internet Software Consortium (www.isc.org/bind.html). Initially, time was spent installing and understanding DNS. DNS was installed on odie.cs.twsu.edu, a stand-alone Linux workstation.
During configuration, a new attribute called DNAME was added to distinguish the hosts taking part in dynamic load-balancing. Listing 1 is a snapshot from named.hosts.wsu, containing information on all hosts in a particular zone. In this listing, the set of hosts kira.cs.twsu.edu, sisko.cs.twsu.edu and q.cs.twsu.edu take part in dynamic load-balancing for http://www1.cs.twsu.edu/. The set of hosts kira.cs.twsu.edu, mccoy.cs.twsu.edu and emcity.cs.twsu.edu take part in dynamic load-balancing for http://www2.cs.twsu.edu/. The set of hosts kira.cs.twsu.edu, sisko.cs.twsu.edu and deanna.cs.twsu.edu take part in dynamic load-balancing for http://www3.cs.twsu.edu/. Hosts kira.cs.twsu.edu and sisko.cs.twsu.edu belong to multiple groups.
Here is the algorithm we added to the pre-existing DNS feature. If the service requested is of type DNAME, do the following:
Determine the set of participating servers for this service.
Request ratings from all participating servers by establishing a concurrent connectionless (UDP) connection with each server.
Using the ratings returned, determine the best server.
Handle error conditions such as “server is too busy to return the rating within the time frame”, “the rating returned by the server gets lost on its way back to the dlbDNS”, “all servers have same rating” and “a server is down”.
A rating daemon runs on each server taking part in dynamic load balancing. Here is the algorithm:
Receive request for rating from dlbDNS and respond by returning the host rating.
Calculate the host rating once every minute rather than calculating it at the time of request, as quick response time is a most important feature.
Ensure the host rating is updated every minute, independent of the dlbDNS request.
Handle error conditions such as dlbDNS closing the UDP sockets without waiting for host response.
Figure 1 shows the functionality of dlbDNS. The path traced by C indicates the process of updating the server rating by the rating daemons. The path traced by B indicates the communication between dlbDNS and the rating daemons to determine the best server. The path traced by A indicates the path traced by the user request. HOST 1 has a better rating than the other two hosts, so the user request gets directed to HOST 1.
Implementing dlbDNS provides efficient utilization of system resources and ensures that facilities newly added to the existing network will be utilized. Since DNS is used, applications such as FTP and TELNET will also utilize dlbDNS.
Uneven distribution of load across servers has been a major problem in the Computer Science department of Wichita State University. bugs.cs.twsu.edu, kira.cs.twsu.edu, roger.cs.twsu.edu and sisko.cs.twsu.edu are four Linux servers available for students in the department. These servers vary in potential and configuration.
dlbDNS was installed in December 1998 to effectively utilize the servers. lion.cs.twsu.edu, the actual DNS server, was made to direct DNAME requests toward odie.cs.twsu.edu where dlbDNS was installed. The lines added to the configuration file were:
; bestlinux IN DNAME bugs.cs.twsu.edu. bestlinux IN DNAME kira.cs.twsu.edu. bestlinux IN DNAME roger.cs.twsu.edu. bestlinux IN DNAME sisko.cs.twsu.edu. ;
Here, the bestlinux attribute was added to handle non-web requests from applications such as TELNET and FTP.
Currently, the gethostbyname system call fails within the BIND code. This problem is avoided by using a configuration file with a list of host and IP addresses. We'd like to find a better solution.
The rating algorithm is still not complete. An algorithm that takes into account the number of processors, CPU and memory utilization would make the rating algorithm more efficient.
At this time, only Linux servers can take part in the dynamic load-balancing scheme, as the rating algorithm uses files in the /proc file structure. A more extensible design is needed.
Harish V.C. (harish@acm.org) is a graduate student in the Computer Science department of Wichita State University. His research interests include computer and Internet security, networking and operating systems. He is currently working as an intern at IBM.
Brad J. Owens (bjowens@cs.twsu.edu) is a faculty member in the Computer Science department at Wichita State University. His research interests include computer and Internet security, high-speed networking, parallel and distributed programming.