Real-Time Applications with RTLinux
Real-Time Linux (RTLinux) is a small hard real-time kernel that can run Linux as its lowest priority thread. Begun as a free software project at New Mexico Tech in 1994, RTLinux is now being used in everything from machine tools, flight simulators and telephone systems to artificial hearts, autonomous underwater vehicles and induction motor control. RTLinux controls instrumentation for NASA, high-speed Active Magnetic Bearings (AMBs) for the University of Virginia Rotating Machinery and Controls Laboratory, animatronic puppets at the Jim Henson Creature Shop and satellite base stations for Japan Post and Telegraph. It is still a free software project, but the core development team now maintains RTLinux from a company: Finite State Machine Labs (FSMLabs). The current edition of RTLinux is V3, and is available for x86, PowerPC and Alpha processors, with a MIPS port in beta release. A smaller sibling of RTLinux, called miniRTL, is small enough to run from a floppy, and can be run on industry standard PC-104 boards.
What makes RTLinux so useful is that it extends the standard UNIX programming environment to real-time problems. From the programmer's point of view, RTLinux adds a special real-time process to Linux and allows programmers to insert real-time threads and signal handlers into that process. The real-time software can talk to ordinary Linux processes using RT-FIFOs (which are a lot like pipes), shared memory and signals. But the real-time code can also access the full power of the raw machine, run at precisely timed intervals and respond rapidly to interrupts. The worst-case timing of RTLinux--the timing delay when the system is busiest--is at the edge of what the hardware can offer. RTLinux makes it possible to create real-time applications in a familiar and standard programming environment, with access to all of the tools and services that make Linux itself such a good development platform.
For engineers designing control systems, RTLinux offers an alternative to the costly special-purpose real-time operating systems and digital signal processors that were formerly the only choices. Developers using RTLinux can use the standard gcc compilers, debug RTLinux threads using gdb, and create real-time software that makes use of programs running under Linux such as scripting languages, networking tools, databases and other utilities. There are also a number of RTLinux-specific development support packages, both open-source and proprietary. Examples are a proprietary program called Simulinux, from Quanser, that allows development of RTLinux code using Matlab, a proprietary remote debugger from CAD-UL and an open-source front end for RTLinux called RTiC Lab (written by Edgar F. Hilton).
The rest of this article is in two sections. First, we'll explain what real time is all about and give a quick tour of the RTLinux technology and programming model. The remaining section is taken up by a couple of examples. One of the authors (Edgar) comes from a mechanical engineering background and (like a lot of people) is much more interested in applications than in operating system design. So we've focused on how to use RTLinux in practice.
When people talk about real time, they generally mean ``right away'' or ``fast'', as in ``real-time stock quotes''. A text editor needs to be fast and responsive, but if it's delayed now and then, it's no big deal. If a file system averages 100MB/second in data transfer, we don't care if it stops now and then to shuffle buffers. We want things to happen fast (usually described as low latency). For general-purpose computer systems, ``fast'' translates to average-case performance.
But for computer systems, fast doesn't imply real time. A real-time system is one that has deadlines that can't be missed. For example, consider the control of a robot arm that lifts partially assembled automobiles from one assembly station to another. In order to position the arm correctly, the computer must monitor its movement and stop it precisely 5.2 seconds after it starts. These timing constraints make this a hard real-time system, where average-case performance won't do--stopping the arm 7 seconds after it starts one time and 3.4 seconds after it starts the next time gets us two expensive and dangerous failures.
Even software that should usually meet timing deadlines, such as video drivers or the X Window System, can afford a hitch now and then. A missed video frame will not cause the damage of a missed robot arm control message.
So a text editor would ideally (but not necessarily) be fast and responsive (we can tolerate waiting for a while), a video display that can drop a frame now and then needs to usually meet timing deadlines, but a robot controller needs to be able to guarantee meeting deadlines. In the Real-Time Systems literature, the text editor is considered to be ``non-real-time'' and the video display would be called ``soft real-time''. Only the robot controller would be called a ``hard real-time'' system.
Twenty years ago, hard real-time applications were simple enough to be controlled by dedicated, custom, isolated hardware. Modern real-time applications, however, must control highly complex systems such as robotic manipulators on factory floors, synchronized to the transfer lines and remotely monitored by supply databases; remotely monitored and controlled astronomical observatories and telescopes connected via the Internet; or cell phones that generate graphical displays, synch to passing cell-phone node antennae and parse incoming data through on-board Internet browsers. These applications run subsystems that have both real-time and non-real-time responsibilities, need to work on commodity hardware and need to be networked. Furthermore, as demands for speed and quality of service increase, applications that have never required it before have begun to require hard real-time support. A CD player that makes a popping sound once in a while is okay for casual listening but isn't acceptable for a professional music editing system. A network connection that drops a message now and then might work for a web server but might cause a transaction to get lost in an electronic commerce application. So, not only are real-time applications now being asked to do more than simply control an isolated device, but traditionally non-real-time applications are also gaining new real-time requirements.
The problem here is that to deliver the tight worst-case timing needed for hard real-time, operating systems need to be simple, small and predictable. But delivering the sophisticated services that modern applications need is beyond the capabilities of simple, small, predictable operating systems. When you try to put real time inside a general-purpose operating system, or try to put complex services in a small real-time operating system, you end up with something that does neither task well, and where non-real-time services can interfere with the execution of real-time services.
The solution to this dilemma is to decouple the real-time and non-real-time parts of the operating system. That is, instead of trying to make one operating system support both, we make a system in which a real-time operating system and a general-purpose (time-sharing) system work together. This is the path taken by RTLinux. Instead of trying to create software that combines simplicity with complexity, and combines good worst-case timing with good average-case timing, RTLinux is a small and predictable hard real-time operating system that runs Linux in its idle moments. That is, we put the hard real-time components in a real-time kernel and use Linux to run everything else. The principles that drive the RTLinux design are:
The core real-time OS needs to be predictable, simple and fast, and have minimal overhead.
Any process that can go in Linux should go in Linux. A real-time application should be ruthlessly separated into the part that must be real-time and the part that does not have hard real-time constraints. The part that doesn't have hard real-time constraints should run in Linux.
Linux operation cannot be permitted to delay the operation of any of the real-time components.
The technique that allows RTLinux to preempt Linux whenever necessary is a variation on one of the oldest programming tricks in the book: the virtual machine. RTLinux provides Linux with software emulation of the underlying hardware that Linux uses to disable and enable hardware interrupts. For example, if Linux asks the emulator to prevent the clock from generating an interrupt, the emulator agrees--but lies. When the clock interrupts, RTLinux takes control and tells the emulator that the next time Linux ``re-enables'' clock interrupts, the emulator should emulate a clock interrupt. But Linux is never allowed to really disable interrupts. The effect of the emulation is that real-time tasks and handlers execute when they need to execute, no matter what Linux is doing. When an interrupt arrives, the RTLinux kernel intercepts the interrupt and decides what to do. If there is a real-time handler for the interrupt, the handler is invoked. If there is no real-time handler, or if the handler indicates that it wants to share the interrupt with Linux, the interrupt is marked pending. If Linux has asked that interrupts be enabled, any pending interrupts are emulated and the Linux handlers are invoked with hardware interrupts re-enabled.
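To make the trick concrete, here is a much-simplified sketch of the idea in C. All of the names here (linux_irq_enabled, rtl_intercept and the stub helpers) are ours, invented for illustration--the real RTLinux patch works at a far lower level inside the kernel:

#include <stdio.h>

/* Conceptual sketch of the RTLinux interrupt emulation; names are
   hypothetical and the stubs below stand in for real dispatching. */

static int linux_irq_enabled = 1;    /* the "soft" interrupt flag */
static unsigned long pending_irqs;   /* interrupts Linux hasn't seen yet */

static int rt_handler_installed(unsigned int irq) { return irq == 0; }
static void run_rt_handler(unsigned int irq) { printf("RT handler, IRQ %u\n", irq); }
static void dispatch_irq_to_linux(unsigned int irq) { printf("Linux handler, IRQ %u\n", irq); }

/* What Linux's cli() becomes: hardware interrupts stay on! */
void emulated_cli(void) { linux_irq_enabled = 0; }

/* What Linux's sti() becomes: deliver whatever arrived while
   Linux believed interrupts were off. */
void emulated_sti(void)
{
        unsigned int irq;
        linux_irq_enabled = 1;
        for (irq = 0; irq < 32; irq++)
                if (pending_irqs & (1UL << irq)) {
                        pending_irqs &= ~(1UL << irq);
                        dispatch_irq_to_linux(irq);
                }
}

/* All hardware interrupts really arrive here first. */
void rtl_intercept(unsigned int irq)
{
        if (rt_handler_installed(irq))
                run_rt_handler(irq);            /* hard real-time path */
        else if (linux_irq_enabled)
                dispatch_irq_to_linux(irq);     /* pass straight through */
        else
                pending_irqs |= 1UL << irq;     /* Linux gets it at sti() */
}

int main(void)
{
        emulated_cli();       /* Linux "disables" interrupts */
        rtl_intercept(0);     /* the RT handler still runs immediately */
        rtl_intercept(3);     /* Linux's IRQ 3 is merely marked pending */
        emulated_sti();       /* ...and is delivered here */
        return 0;
}

The crucial point is visible in rtl_intercept(): the real-time path never consults the software flag, so Linux's ``disabling'' of interrupts can never delay a real-time handler.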
The most powerful result obtained by this scheme is that no matter what Linux does, whether Linux is running in kernel mode or running a user process, whether Linux is disabling interrupts or enabling interrupts, whether Linux is in the middle of a spinlock or not, the real-time system is always able to respond to the interrupt with minimal latency.
The results speak for themselves: RTLinux running on a generic x86 generates a worst-case interval of less than 15 microseconds from the moment a hardware interrupt is detected by the processor to the moment its interrupt handler starts to execute. An RTLinux periodic task runs within 25 microseconds of its scheduled time on the same hardware. These times are hardware-limited, and so will improve as hardware improves.
The first widely used release of RTLinux, V1, was a simple system, really intended only for low-end x86-based computers. The V1 API was homegrown, designed for the convenience of the implementors without much forethought. By 1998 the RTLinux developers realized that if they stayed with the V1 API it would have to be extended and patched up to work around old assumptions. Now that other people were using RTLinux in serious applications, the design team wanted a more durable and, well, standard API. The challenge was to move toward a standard API without sacrificing the speed, efficiency and lightweight structure that made RTLinux useful and interesting in the first place.
The team started by assuming that POSIX was out of the question--too big, slow, and incompatible with a lightweight real-time operating system. But the POSIX 1003.13 PSE51 specification defines a standard API that fit surprisingly well. POSIX PSE51 is for a ``minimal real-time system profile''. Systems following this standard look like a single POSIX process with multiple threads (see Figure 1). Essentially, PSE51 puts threads and signal handlers on a bare machine. So RTLinux ended up as a PSE51 POSIX real-time process where one of the threads is Linux, itself a POSIX operating system (see Figure 2).
POSIX 1003.13 permits sensible limitations on the POSIX services provided in the minimal real-time environment. For example, while any POSIX-compliant operating system must support the open() system call, PSE51 allows strict limitation of the file space, so that support for a general-purpose file system is unnecessary. The operation of opening a file in a general-purpose POSIX operating system is anything but real-time: it might require multiple disk reads following symbolic and hard links, and even network operations. The standard RTLinux open() will open /dev/x for a fixed set of real-time devices and will not support any other path names.
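For example, a real-time thread can still use the familiar POSIX calls against that fixed namespace. The fragment below is a sketch only, assuming the RTLinux POSIX I/O and RT-FIFO modules are loaded and that FIFO 0 exists; error handling is omitted:

#include <rtl.h>
#include <unistd.h>
#include <fcntl.h>

/* Sketch: inside an RTLinux thread, open() accepts only the fixed
   set of real-time device names, such as the RT-FIFOs. A path like
   "/home/user/log" would simply fail. */
void log_sample(double sample)
{
        int fd = open("/dev/rtf0", O_NONBLOCK);
        if (fd >= 0) {
                write(fd, &sample, sizeof(sample));
                close(fd);
        }
}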
RTLinux is structured as a small core component and a set of optional components.
The core component permits installation of very low-latency interrupt handlers that cannot be delayed or preempted by Linux itself, and provides some low-level synchronization and interrupt control routines. This core component has been extended to support SMP, and at the same time it has been simplified by removing functionality that can be provided outside the core.
The majority of RTLinux functionality is in a collection of loadable kernel modules that provide optional services and levels of abstraction. These modules include a scheduler, a component to control the system timers, a POSIX I/O layer, real-time FIFOs and a shared memory module (a package contributed by Tomasz Motylewski).
The key RTLinux design objectives are that the system should be transparent, modular and extensible. Transparency means that there are no unopenable black boxes, and that the cost of any operation should be determinable. Modularity means that it is possible to omit functionality (and that functionality's expense) if it isn't needed. To support this, the simplest RTLinux configuration supports high-speed interrupt handling and no more. And extensibility means that programmers should be able to add modules and tailor the system to their requirements. As an obvious example, the RTLinux simple priority scheduler can easily be replaced by schedulers more suited to the needs of some specific application.
While developing RTLinux, we have tried to maximize the advantage we get from having Linux and its powerful capabilities available. In fact, RTLinux Development Rule Number One is: if a service or operation is inherently non-real-time, it should be provided in Linux and not in the RT environment.
To facilitate this, RTLinux provides three standard interfaces between real-time tasks and Linux. The first is the RT-FIFO device interface mentioned above, the second is shared memory, and the third is the pthread signal mechanism, which allows real-time threads to generate soft interrupts for Linux. For example, a primitive data acquisition system might consist of a single real-time interrupt handler that collects data from an A/D device and dumps that data into an RT-FIFO, while on the Linux side, logging is taken care of by the shell command line:
cat /dev/rtf_a2d > log
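The real-time half of that system might look something like the following sketch. We use RT-FIFO 0, which appears to Linux as /dev/rtf0 (the /dev/rtf_a2d name above presumes a correspondingly named device); the IRQ and I/O port of the A/D board are invented for illustration:

#include <rtl.h>
#include <rtl_core.h>
#include <rtl_fifo.h>
#include <asm/io.h>

#define A2D_IRQ   5        /* hypothetical IRQ of the A/D board */
#define A2D_PORT  0x300    /* hypothetical data port of the A/D board */
#define FIFO_NO   0        /* appears in Linux as /dev/rtf0 */

unsigned int a2d_handler(unsigned int irq, struct pt_regs *regs)
{
        short sample = inw(A2D_PORT);   /* read the converter */
        rtf_put(FIFO_NO, (char *) &sample, sizeof(sample));
        rtl_hard_enable_irq(A2D_IRQ);
        return 0;
}

int init_module(void)
{
        rtf_create(FIFO_NO, 4000);      /* 4000-byte FIFO */
        rtl_request_irq(A2D_IRQ, a2d_handler);
        rtl_hard_enable_irq(A2D_IRQ);
        return 0;
}

void cleanup_module(void)
{
        rtl_hard_disable_irq(A2D_IRQ);
        rtl_free_irq(A2D_IRQ);
        rtf_destroy(FIFO_NO);
}

Note that the handler does no logging itself: it only moves the sample into the FIFO, leaving everything non-real-time to the Linux side, in keeping with Development Rule Number One.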
A more elaborate system might use a C program to receive and process data from an RT-FIFO, a Tcl/Tk front end to control the data flow and send control commands to the real-time handler via a second RT-FIFO, and a Perl script sending the data over a network to a second machine for processing and graphical display (see Figure 3).
A more advanced version of this interface between RTLinux and Linux can be seen in the Real-Time Controls Laboratory (RTiC-Lab), an open-source, hard real-time controller system that frees controls engineers from the design of the hard real-time tasks and allows them to focus on the controller algorithms themselves. RTiC-Lab uses RTLinux for the implementation of hard real-time controllers and I/O, and it uses Linux for networking, a GTK+ graphical user interface (through which users can start and stop their controller, update controller parameters and get real-time data from the controller task) and IPC with other tasks that have no real-time constraints (such as plotting packages, FFT algorithms and other user applications). RTiC-Lab uses a mixture of RT-FIFOs and shared memory to communicate between RTLinux and Linux.
As an example of what can be accomplished with such a system, consider the Henson Company Creature Shop's Performance Control System (see Resources). Originally designed to operate electromechanical creatures used in filmmaking, the ``animatronic'' version of the system is capable of precise control of real-world puppets that are actuated by electromechanical or hydraulic servos. The system runs RTLinux on a laptop PC at the front end for I/O and timing. The PC communicates serially with a back-end embedded PC residing in the puppet, which in turn communicates serially with microcontroller-based motor driver peripherals.
The system incorporates three critical components that run on Linux:
A large suite of high-level graphical tools that allow puppeteers to create complex expressions and performances.
The Henson ``Motion Engine'' that resolves complex character expressions before communicating them digitally to the puppet.
A networked GUI server that gives multiple users local or remote operation of the system.
The artists and engineers at the Creature Shop found that while a soft real-time system was sufficient for computer graphics puppeteering, there was too much frame-rate jitter for use in the animatronic version. In the animatronic implementation, an RTLinux periodic task maintains a 60Hz frame rate. The periodic task uses RTL interprocess communication to commence the ADC driver burst and RT serial communications, then it launches the Linux user-space portion of the motion engine with bounded latency that meets the system frame rate. The user-space portion of the motion engine is currently being tested using several methods. Ultimately, RTLinux functionality will give it hard real-time invocation from a Linux user-space environment.
The RTLinux approach to system control gives an incredible degree of flexibility when compared to existing DSP systems. As an example, let's consider in some detail a high-speed turbodynamic application in which a one-ton rotor is suspended by a five-degree-of-freedom active magnetic bearing (AMB) and spins at 15,000RPM. This application has the following readily identifiable tasks:
Five-degree-of-freedom suspension controller. The controller runs at a fixed periodic rate (in the 50--100 microsecond range) and both reads and writes the appropriate signals necessary to suspend the rotor. This task is critical in that missing one sample can have catastrophic consequences.
Spin rate measuring task. Triggered once per revolution and used to calculate the spin rate of the rotor. For our rotor, this task will be awakened once every (60/15,000)*1E6 = 4,000 microseconds. The accuracy requirements for the spin rate make this task very intolerant to temporal error.
Anti-imbalance controller. Executed 16 times per revolution and used to produce a synchronous force that counteracts the effect of rotor imbalance--much the same way that balance weights are added to automobile tires to stop them from wobbling. The intertask scheduling period is dependent on the spin rate of the rotor. So, for our rotor, the task would be scheduled once every 4,000/16 = 250 microseconds. For this task, very slight temporal incorrectness is allowed.
Data transfer and data plotting tasks. Used to store data to disk, screen or other devices. These tasks allow relatively large temporal error.
Network transfer tasks. Transfer data and commands to and from other computers. For example, the rotor system could be contained in a deep bunker, while the rotordynamics engineers control it from a safe room some distance away. These tasks allow a larger temporal error.
Miscellaneous tasks, such as graphical user interfaces, scripting programs (e.g., Perl and Tcl) and computational engines (Matlab, Scilab, MuPAD, Mathematica and MathCAD). There is no temporal limitation on these tasks, and they can be performed concurrently with the above tasks or after the fact.
This entire system can easily be implemented on a single computer using RTLinux, although the conventional approach has been to implement each of the above tasks in an independent DSP. In the RTLinux paradigm, the first three tasks would be implemented in the hard real-time environment (RTLinux), while the last three would run comfortably on the non-RT side (Linux). All can run on the same computer.
Now, let's look at the code. For brevity's sake, we'll focus exclusively on tasks 1 and 2.
Task 1 can be easily implemented using the POSIX interface. Listing 1 shows the structure as it would be implemented in RTLinux.
#include <rtl.h>
#include <time.h>
#include <pthread.h>

#define TASK_PERIOD   100000   /* 100 microsecond period (in nanoseconds) */
#define TASK_PRIORITY 1        /* priority assigned to task */

pthread_t thread;

void * task_one(void *arg)
{
        struct sched_param p;

        p.sched_priority = TASK_PRIORITY;
        pthread_setschedparam(pthread_self(), SCHED_FIFO, &p);
        pthread_make_periodic_np(pthread_self(), gethrtime(), TASK_PERIOD);

        /* startup controller initialization routine goes here */
        /* ... */

        while (1) {
                pthread_wait_np();
                /* periodic controller routines go here */
                /* ... */
        }
        return 0;
}

int init_module(void)
{
        pthread_create(&thread, NULL, task_one, 0);
        pthread_setfp_np(thread, 1);   /* enable floating point in the thread */
        return 0;
}

void cleanup_module(void)
{
        pthread_cancel(thread);
        pthread_setfp_np(thread, 0);
        pthread_join(thread, NULL);
}
Here, the init_module() and cleanup_module() functions can be seen as the RTLinux equivalent of the main() function in user-level C programs. Upon startup, the init_module() function is called. This function then immediately tells the scheduler to create a new thread--where the function task_one() comprises the body of the thread--and sets up permissions to enable floating point calculations. Likewise, when the program is stopped, the cleanup_module() function is executed, which in turn stops the thread from further execution, removes permissions to use the floating point unit and quits.
The thread itself can be separated into three segments--initialization, periodic and shutdown--which are represented in the code, respectively, by the segments prior to, inside and after the ``while'' loop.
First, during initialization, we establish the attributes for this particular thread. We specify the scheduler type to use (SCHED_FIFO), the priority and the frequency at which the thread will be called by the scheduler, and we perform all tasks necessary to initialize our controller.
Next, during the periodic segment, we first encounter the call to pthread_wait_np(), which causes the thread to block until the scheduler wakes it again. Thus, the thread will execute the entire contents of the while loop once per execution cycle. Note that in this particular example the shutdown part of our code will never execute, since there is no provision in our example code to exit the while loop. Instead, the thread is terminated immediately upon execution of cleanup_module().
There are several ways of implementing this second task. In what follows, we shall focus on two: an Interrupt Service Routine running within RTLinux, and a signal handler running within a user-level Linux task.
To run Task 2 as an Interrupt Service Routine (ISR), we need to add the following to the code for Task 1 above (assuming that we are using IRQ 7):
#define IRQ 7

unsigned int example_isr(unsigned int, struct pt_regs *);

unsigned int example_isr(unsigned int irq_number, struct pt_regs *p)
{
        /* insert non-FP-dependent calculations here */
        /* ... */

        rtl_hard_enable_irq(IRQ);
        return 0;
}
However, in order to tell RTLinux to associate example_isr() with IRQ 7, we must use the rtl_request_irq() and rtl_hard_enable_irq() functions in init_module():
rtl_request_irq(IRQ, example_isr);
/* <-- I/O IRQ initialization routines go here --> */
rtl_hard_enable_irq(IRQ);

And we must, of course, clean up after ourselves, so we add the following to cleanup_module():
rtl_hard_disable_irq(IRQ);
rtl_free_irq(IRQ);

The problem becomes more interesting for the implementation of the third task. The thread for Task 3 is created in much the same way as was done for Task 1; however, we now use our aforementioned ISR to trigger the first execution of Task 3. The subsequent 15 executions of Task 3 have a period, P, that is dependent on the interarrival time, T, of IRQ 7 as P = T/16. That's getting beyond the scope of the present discussion, but the sketch below suggests one possible arrangement.
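This sketch is our own illustration, not code from the RTLinux distribution: the ISR measures the interarrival time T and wakes the Task 3 thread, which then reschedules itself to run 16 times at intervals of T/16. Details such as drift correction and canceling the periodic timing between revolutions are glossed over:

#include <rtl.h>
#include <time.h>
#include <pthread.h>

#define IRQ 7

pthread_t task3_thread;         /* created in init_module() as before */
static hrtime_t last_time;      /* time of the previous IRQ 7 */
static hrtime_t rev_period;     /* measured interarrival time, T */

unsigned int example_isr(unsigned int irq_number, struct pt_regs *p)
{
        hrtime_t now = gethrtime();

        rev_period = now - last_time;      /* T for this revolution */
        last_time = now;
        /* Task 2 spin-rate calculation goes here */

        pthread_wakeup_np(task3_thread);   /* trigger Task 3 */
        rtl_hard_enable_irq(IRQ);
        return 0;
}

void * task_three(void *arg)
{
        int i;
        while (1) {
                pthread_suspend_np(pthread_self());   /* wait for the ISR */
                /* run 16 corrections per revolution, P = T/16 apart */
                pthread_make_periodic_np(pthread_self(), gethrtime(),
                                         rev_period / 16);
                for (i = 0; i < 16; i++) {
                        pthread_wait_np();
                        /* anti-imbalance force calculation goes here */
                }
        }
        return 0;
}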
An even more interesting example of the rich RTLinux programming API can be seen with an alternative approach to the development of Task 2. Suppose that instead of executing Task 2 within RTLinux, we would like to execute it in Linux itself. For example, let's say that we now want to combine the functionality of Tasks 2 and 4: we want to plot a point to the screen each time that the rotor rotates once about its axis. We can write an application in Linux that first intercepts IRQ 7 and then plots the spin speed to the screen.
Our user-level Linux program would use the rtlinux_sigaction() function--first introduced in RTLinux V3.0--to first identify a handler within our program that would be executed each time that IRQ 7 is triggered.
. . .
#include <rtlinux_signal.h>

#define IRQ 7

void my_handler(int);

struct rtlinux_sigaction sig, oldsig;
float old_time = 0.0;
float new_time = 0.0;
float omega = 10.0;   /* spin speed */

int main(void)
{
        old_time = <sampleclock>;

        /* capture IRQ 7 and execute my_handler each
           time that IRQ 7 arrives: */
        sig.sa_handler = my_handler;
        sig.sa_flags = RTLINUX_SA_PERIODIC;
        rtlinux_sigaction(IRQ, &sig, &oldsig);

        /* the main part of our program: we wish to plot
           information as long as the rotor is still spinning */
        while (omega > 1.0) {
                sleep(1);
                printf("Omega = %.1f\n", omega);   /* to stdout */
                plot(old_time, omega);   /* via fancy plotting package */
        }

        /* We are no longer spinning,
           let's clean up after ourselves... */

        /* free the irq: */
        sig.sa_handler = RTLINUX_SIG_IGN;
        rtlinux_sigaction(IRQ, &sig, &oldsig);

        /* exit gracefully */
        return 0;
}

void my_handler(int argument)
{
        /* calculate spin speed here */
        new_time = <sampleclock>;
        omega = 1.0/(new_time - old_time);
        old_time = new_time;
}
The function rtlinux_sigaction() identifies my_handler() as the function that we wish to execute each time that IRQ 7 is triggered. Note that ``RTLINUX_SA_PERIODIC'' tells rtlinux_sigaction() to reset itself and wait for the next signal over and over again--otherwise the signal handler would be executed exactly once. Then the while() loop in our program both prints out and plots the latest spin speed. Finally, when the spin speed drops below 1Hz, the program begins the shutdown process, which involves deregistering my_handler() as our signal handler.
The job of my_handler() is straightforward: calculate the spin speed. The accuracy of this calculation should be quite high because each time that IRQ 7 is triggered, the handler is called as quickly as the underlying hardware allows.
Regardless of which scheme we use to implement Task 2, the most important thing to note is the amazing versatility and elegance that the RTLinux programming environment provides.
The design compromises that make Linux such a powerful general-purpose OS render it less than ideal for hard and even soft real-time applications. By decoupling real-time and non-real-time processes, RTLinux harnesses the best of both worlds: on the one hand, it offers a high-performance, strictly deterministic real-time application environment, while on the other, it offers the rich programming environment, large application and user base, and powerful networking capabilities of Linux. Most importantly, all improvements made to Linux by its huge development community become instantly available to RTLinux users.
RTLinux is open-source software distributed under the GPL. Further information is available at the FSMLabs web site (http://www.fsmlabs.com/downloads.html) and the software is freely available for download from the FSMLabs ftp server (ftp://ftp.fsmlabs.com/pub/rtlinux/v3/). There are working versions for x86 (uni-processor and SMP), Alpha, PowerPC (uni-processor only), and PC-104 (via miniRTL). Both source and RPM packages are available.