Writing Portable Device Drivers

by Greg Kroah-Hartman

Almost all Linux kernel device drivers work on more than just one type of processor. This only happens because device-driver writers adhere to a few important rules. These rules include using the proper variable types, not relying on specific memory page sizes, being aware of endian issues with external data, setting up proper data alignment and accessing device memory locations through the proper interface. This article explains these rules, shows why it is important that they be followed and gives examples of them in use.

Internal Kernel Data Types

One of the most basic rules to remember when writing portable code is to be aware of how big you need to make your variables. Different processors define different sizes for the int and long data types. They also differ in whether some plain types, such as char, are signed or unsigned. Because of this, if you know a variable has to be a specific number of bits, and it has to be signed or unsigned, then you need to use the built-in data types. The following typedefs can be used anywhere in kernel code and are defined in the linux/types.h header file:

u8    unsigned byte (8 bits)
u16   unsigned word (16 bits)
u32   unsigned 32-bit value
u64   unsigned 64-bit value
s8    signed byte (8 bits)
s16   signed word (16 bits)
s32   signed 32-bit value
s64   signed 64-bit value

For example, the i2c driver subsystem has a number of functions that are used to send and receive data on the i2c bus:

s32 i2c_smbus_write_byte(struct i2c_client
    *client, u8 value);
s32 i2c_smbus_read_byte_data(struct i2c_client
    *client, u8 command);
s32 i2c_smbus_write_byte_data(struct i2c_client
    *client, u8 command, u8 value);

All of these functions return a signed 32-bit value and take an unsigned 8-bit value for either a value or command parameter. Because these data types are used, this code is portable to any processor type.
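As a rough user-space sketch of the same idea, the fixed-width typedefs can be approximated with the C99 stdint.h types. Kernel code would include linux/types.h instead, and the demo_ helpers below are hypothetical names used only for illustration.

```c
#include <stdint.h>
#include <limits.h>

/* User-space stand-ins for the kernel typedefs. Assumption: the kernel
 * gets these from linux/types.h; here they come from C99 stdint.h. */
typedef uint8_t  u8;
typedef uint16_t u16;
typedef uint32_t u32;
typedef int32_t  s32;

/* Hypothetical helper: combine a command byte and a value byte into one
 * 16-bit word. The widths are explicit, so the result is the same on
 * every processor, whatever the size of int or long happens to be. */
static u16 demo_pack_command(u8 command, u8 value)
{
    return (u16)(((u16)command << 8) | value);
}

/* Hypothetical helper: width in bits of a type of the given byte size. */
static unsigned int demo_bits(unsigned long size_in_bytes)
{
    return (unsigned int)(size_in_bytes * CHAR_BIT);
}
```

Calling demo_bits(sizeof(u32)) gives the same answer on every platform, which is not something that can be said for demo_bits(sizeof(long)).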

If your variables are going to be used in any code that can be seen by user-space programs, then you need to use the following exportable data types. Examples of this are data structures that get passed through ioctl() calls. Once again they are defined in the linux/types.h header file:

__u8   unsigned byte (8 bits)
__u16   unsigned word (16 bits)
__u32   unsigned 32-bit value
__u64   unsigned 64-bit value
__s8    signed byte (8 bits)
__s16   signed word (16 bits)
__s32   signed 32-bit value
__s64   signed 64-bit value

For example, the usbdevice_fs.h header file defines a number of different structures that are used to talk to USB devices directly from user-space programs. Here is the definition of the ioctl that is used to send a USB control message to the device:

struct usbdevfs_ctrltransfer {
    __u8 requesttype;
    __u8 request;
    __u16 value;
    __u16 index;
    __u16 length;
    __u32 timeout;  /* in milliseconds */
    void *data;
};
#define USBDEVFS_CONTROL _IOWR('U', 0, struct usbdevfs_ctrltransfer)

One thing that has caused a lot of problems, as 64-bit machines are getting more popular, is the fact that the size of a pointer is not necessarily the same as the size of an unsigned int. On Linux, the size of a pointer is equal to the size of an unsigned long. This can be seen in the prototype for get_zeroed_page():

extern unsigned long FASTCALL
    (get_zeroed_page(unsigned int gfp_mask));

get_zeroed_page() returns a free memory page that has already been wiped clean with zeros. It returns an unsigned long that should be cast to the specific data type that you need. The following code snippet from the drivers/char/serial.c file in the rs_open() function shows how this is done:

static unsigned char *tmp_buf;
unsigned long page;
if (!tmp_buf) {
    page = get_zeroed_page(GFP_KERNEL);
    if (!page)
       return -ENOMEM;
    if (tmp_buf)
       free_page(page);
    else
       tmp_buf = (unsigned char *)page;
}
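The pointer-size point can be sketched in user space as follows. This is only an illustration under stated assumptions: the demo_ names are hypothetical, and portable user-space code would normally use uintptr_t where the kernel uses an unsigned long.

```c
#include <stdint.h>

/* Hypothetical helper: store a pointer in an integer wide enough to hold
 * it, the way kernel code keeps addresses in an unsigned long. User-space
 * code can use uintptr_t, which C99 provides for this purpose. */
static uintptr_t demo_ptr_to_addr(const void *p)
{
    return (uintptr_t)p;
}

/* Hypothetical helper: the reverse cast, back to a usable pointer. */
static unsigned char *demo_addr_to_buf(uintptr_t addr)
{
    return (unsigned char *)addr;
}

/* Returns 1 if a pointer is wider than an unsigned int on the machine
 * this is compiled for, meaning a cast to unsigned int would lose bits. */
static int demo_int_truncates_pointer(void)
{
    return sizeof(void *) > sizeof(unsigned int);
}
```

A round trip through demo_ptr_to_addr() and demo_addr_to_buf() returns the original pointer, whereas a round trip through an unsigned int is not guaranteed to do so on a 64-bit machine.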

There are some native kernel data types that you should use instead of trying to use an unsigned long. Some of these are: pid_t, key_t, gid_t, size_t, ssize_t, ptrdiff_t, time_t, clock_t and caddr_t. If you need to use any of these types in your code, please use the given data types; it will prevent a lot of problems.

Memory Issues

As we saw above in the example taken from drivers/char/serial.c, you can ask the kernel for a memory page. The size of a memory page is not always 4KB of data (as it is on i386). If you are going to be referencing memory pages, you need to use the PAGE_SHIFT and PAGE_SIZE defines.

PAGE_SHIFT is the number of bits by which the value 1 must be shifted left to get the PAGE_SIZE value. Different architectures define this to different values. Table 1 shows a short list of some architectures and the values of PAGE_SHIFT and the resulting value for PAGE_SIZE.

Table 1. Some Architectures and the Values of PAGE_SHIFT and the Resulting Value for PAGE_SIZE

Even on the same base architecture type, you can have different page sizes. This depends sometimes on a configuration option (like IA-64) or is due to different variants of the processor type (like on ARM).
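The relationship between the two defines can be sketched in user space as shown here. In kernel code PAGE_SHIFT and PAGE_SIZE come from the architecture headers; in this sketch the shift is passed as a parameter so that different architectures can be simulated, and all demo_ names are hypothetical.

```c
#include <stddef.h>

/* Assumption: the shift is a parameter here only so that several
 * architectures can be simulated; a driver would use PAGE_SHIFT. */
static size_t demo_page_size(unsigned int page_shift)
{
    return (size_t)1 << page_shift;
}

/* Number of whole pages needed to hold len bytes (rounds up). */
static size_t demo_pages_needed(size_t len, unsigned int page_shift)
{
    size_t page_size = demo_page_size(page_shift);

    return (len + page_size - 1) >> page_shift;
}

/* Offset of an address within its page. */
static size_t demo_page_offset(size_t addr, unsigned int page_shift)
{
    return addr & (demo_page_size(page_shift) - 1);
}
```

A buffer of 4097 bytes needs two pages when the shift is 12 but only one when it is 13, which is why hard-coding 4096 gives wrong results on other architectures.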

The code snippet from drivers/usb/audio.c in Listing 1 shows how PAGE_SHIFT and PAGE_SIZE are used when accessing memory directly.

Listing 1. Accessing Memory Directly

Endian Issues

Processors store internal data in one of two ways: little-endian or big-endian. Little-endian processors store data with the right-most bytes (those with a higher address value) being the most significant, while big-endian processors store data with the left-most bytes (those with a lower address value) being the most significant.

For example, Table 2 shows how the decimal value 684686 is stored in a 4-byte integer on the two different processor types (684686 decimal = a728e hex = 00000000 00001010 01110010 10001110 binary).

Table 2. How the Decimal Value 684686 is Stored in a 4-Byte Integer
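The two byte orders can also be reproduced with a small user-space sketch. The demo_ functions below are hypothetical and are not kernel APIs; they write the bytes out explicitly, so they produce the same layout no matter which kind of processor runs them.

```c
#include <stdint.h>

/* Store a 32-bit value least significant byte first (little-endian). */
static void demo_store_le32(uint32_t value, unsigned char out[4])
{
    out[0] = (unsigned char)(value & 0xff);
    out[1] = (unsigned char)((value >> 8) & 0xff);
    out[2] = (unsigned char)((value >> 16) & 0xff);
    out[3] = (unsigned char)((value >> 24) & 0xff);
}

/* Store a 32-bit value most significant byte first (big-endian). */
static void demo_store_be32(uint32_t value, unsigned char out[4])
{
    out[0] = (unsigned char)((value >> 24) & 0xff);
    out[1] = (unsigned char)((value >> 16) & 0xff);
    out[2] = (unsigned char)((value >> 8) & 0xff);
    out[3] = (unsigned char)(value & 0xff);
}

/* Returns 1 when the machine running this code is little-endian,
 * by checking which byte of a known value sits at the lowest address. */
static int demo_host_is_little_endian(void)
{
    const uint32_t probe = 1;

    return *(const unsigned char *)&probe == 1;
}
```

For the value 684686, demo_store_le32() produces the bytes 8e 72 0a 00 at increasing addresses, and demo_store_be32() produces 00 0a 72 8e.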

Intel processors, for example the i386 and IA-64 series, are little-endian machines, whereas the SPARC processors are big-endian. The PowerPC processors can be run in either little- or big-endian mode, but for Linux, they are defined as running in big-endian mode. The ARM processor can be either, depending on the specific ARM chip being used, but usually it runs in little-endian mode.

Because of the different endian types of processors, you need to be aware of data you receive from external sources and the order in which it appears. For example, the USB specification dictates that all multibyte data fields are in little-endian form. So if you have a USB driver that reads a multibyte field from the USB connection, you need to convert that data into the processor's native format. Code that assumes the processor is little-endian can skip the conversion and still appear to work. But this same code fails on big-endian processors such as PowerPC or SPARC, and it is the leading cause of drivers that are broken on different platforms.

Thankfully, there are a number of helpful macros that have been created to make this an easy task. All of the following macros can be found in the asm/byteorder.h header file.

To convert from the processor's native format into little-endian form you can use the following functions:

u64 cpu_to_le64 (u64);
u32 cpu_to_le32 (u32);
u16 cpu_to_le16 (u16);

To convert from little-endian format into the processor's native format you should use these functions:

u64 le64_to_cpu (u64);
u32 le32_to_cpu (u32);
u16 le16_to_cpu (u16);

For big-endian forms, the following functions are available:

u64 cpu_to_be64 (u64);
u32 cpu_to_be32 (u32);
u16 cpu_to_be16 (u16);
u64 be64_to_cpu (u64);
u32 be32_to_cpu (u32);
u16 be16_to_cpu (u16);

If you have a pointer to the value to convert, then you should use the following functions:

u64 cpu_to_le64p (u64 *);
u32 cpu_to_le32p (u32 *);
u16 cpu_to_le16p (u16 *);
u64 le64_to_cpup (u64 *);
u32 le32_to_cpup (u32 *);
u16 le16_to_cpup (u16 *);
u64 cpu_to_be64p (u64 *);
u32 cpu_to_be32p (u32 *);
u16 cpu_to_be16p (u16 *);
u64 be64_to_cpup (u64 *);
u32 be32_to_cpup (u32 *);
u16 be16_to_cpup (u16 *);

If you want to convert the value within a variable and store the modified value in the same variable (in situ), then you should use the following functions:

void cpu_to_le64s (u64 *);
void cpu_to_le32s (u32 *);
void cpu_to_le16s (u16 *);
void le64_to_cpus (u64 *);
void le32_to_cpus (u32 *);
void le16_to_cpus (u16 *);
void cpu_to_be64s (u64 *);
void cpu_to_be32s (u32 *);
void cpu_to_be16s (u16 *);
void be64_to_cpus (u64 *);
void be32_to_cpus (u32 *);
void be16_to_cpus (u16 *);

As stated before, the USB protocol is in little-endian format. The code snippet from drivers/usb/serial/visor.c presented in Listing 2 shows how a structure is read from the USB connection and then converted into the proper CPU format.

Listing 2. How a structure is read from the USB connection and converted into the proper CPU format.
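The general pattern of reading little-endian data from a device and converting it can be sketched in user space as follows. This is not the code from visor.c. The report layout and all demo_ names are hypothetical, and the helpers only approximate what the pointer variants of the kernel macros do by assembling each value byte by byte.

```c
#include <stdint.h>

/* Hypothetical stand-ins for the little-endian conversion helpers: they
 * assemble the value from individual bytes, so the result is correct on
 * both little- and big-endian hosts. */
static uint16_t demo_le16_from_bytes(const unsigned char *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t demo_le32_from_bytes(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Hypothetical device report: a 16-bit count, then a 32-bit length. */
struct demo_report {
    uint16_t count;
    uint32_t length;
};

/* Decode six little-endian bytes, as they would arrive from the device,
 * into a structure held in the host's native byte order. */
static struct demo_report demo_parse_report(const unsigned char wire[6])
{
    struct demo_report r;

    r.count  = demo_le16_from_bytes(wire);
    r.length = demo_le32_from_bytes(wire + 2);
    return r;
}
```

Because the conversion happens at the point where the data enters the driver, the rest of the code can work with native values and never needs to know the byte order of the wire format.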

Data Alignment

The gcc compiler typically aligns individual fields of a structure on whatever byte boundary it likes in order to provide faster execution. For example, consider the code and resulting output shown in Listing 3.

Listing 3. Alignment of Individual Fields of a Structure

The output shows that the compiler aligned fields b and c in the struct foo on even byte boundaries. This is not a good thing when we want to overlay a structure on top of a memory location. Typically driver data structures do not have even byte padding for the individual fields. Because of this, the gcc packed attribute is used to tell the compiler not to place any "memory holes" within a structure.

If we change the struct foo structure to use the packed attribute like this:

struct foo {
    char    a;
    short   b;
    int     c;
} __attribute__ ((packed));

Then the output of the program changes to:

offset A = 0
offset B = 1
offset C = 3

Now there are no more memory holes in the structure.
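The effect can be checked with a short sketch that declares the same fields twice, once with natural alignment and once packed. It assumes a compiler that understands the gcc attribute syntax (gcc or clang), and the demo_ names are hypothetical.

```c
#include <stddef.h>

/* Natural layout: the compiler is free to insert padding. */
struct demo_foo_natural {
    char  a;
    short b;
    int   c;
};

/* Packed layout: no padding between the fields. */
struct demo_foo_packed {
    char  a;
    short b;
    int   c;
} __attribute__ ((packed));

/* Offsets of b and c in the packed layout. */
static size_t demo_packed_offset_b(void)
{
    return offsetof(struct demo_foo_packed, b);
}

static size_t demo_packed_offset_c(void)
{
    return offsetof(struct demo_foo_packed, c);
}

/* Bytes of padding the compiler added to the natural layout. */
static size_t demo_padding_bytes(void)
{
    return sizeof(struct demo_foo_natural) - sizeof(struct demo_foo_packed);
}
```

In the packed structure each field starts exactly where the previous one ends, so its size is the sum of the field sizes. The value returned by demo_padding_bytes() depends on the compiler and target, so it should not be relied on.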

This packed attribute can be used to pack an entire structure, as shown above, or it can be used only to pack a number of specific fields within a structure.

For example, the struct usb_ctrlrequest is defined in include/linux/usb.h as the following:

struct usb_ctrlrequest {
    __u8 bRequestType;
    __u8 bRequest;
    __u16 wValue;
    __u16 wIndex;
    __u16 wLength;
} __attribute__ ((packed));

This ensures that the entire structure is packed, so that it can be used to write data directly to a USB connection.

But the definition of the struct usb_endpoint_descriptor looks like:

struct usb_endpoint_descriptor {
    __u8  bLength           __attribute__ ((packed));
    __u8  bDescriptorType   __attribute__ ((packed));
    __u8  bEndpointAddress  __attribute__ ((packed));
    __u8  bmAttributes      __attribute__ ((packed));
    __u16 wMaxPacketSize    __attribute__ ((packed));
    __u8  bInterval         __attribute__ ((packed));
    __u8  bRefresh          __attribute__ ((packed));
    __u8  bSynchAddress     __attribute__ ((packed));
    unsigned char *extra;   /* Extra descriptors */
    int extralen;
};

This ensures that the first part of the structure is packed and can be used to read directly from a USB connection, but the extra and extralen fields of the structure can be aligned to whatever the compiler thinks will be fastest to access.

I/O Memory Access

Unlike on most typical embedded systems, accessing I/O memory on Linux cannot be done directly. This is due to the wide range of different memory types and maps present on the wide range of processors on which Linux runs. To access I/O memory in a portable manner, you must call ioremap() to gain access to a memory region and iounmap() to release access.

ioremap() is defined as:

void * ioremap (unsigned long offset,
    unsigned long size);

You pass in the starting offset of the region you wish to access and the size of the region in bytes. You cannot use the return value directly as a memory location to read from and write to. Instead, treat it as a token that must be passed to other functions that read and write the data.

The functions to read and write data using memory mapped by ioremap() are:

u8  readb (unsigned long token);    /* read 8 bits */
u16 readw (unsigned long token);    /* read 16 bits */
u32 readl (unsigned long token);    /* read 32 bits */
void writeb (u8 value,
    unsigned long token);   /* write 8 bits */
void writew (u16 value,
    unsigned long token);   /* write 16 bits */
void writel (u32 value,
    unsigned long token);   /* write 32 bits */

After you are finished accessing memory, you must call iounmap() to free up the memory so that others can use it if they want to.
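The map, access and unmap sequence can be illustrated with a user-space mock. This is only a simulation: a small array stands in for the device's I/O memory, none of the demo_ functions are kernel APIs, and the token encoding (offset plus one) is an arbitrary choice made so that the token is clearly not a usable pointer.

```c
#include <stdint.h>

/* User-space mock only: this array plays the role of device memory. */
#define DEMO_REGION_SIZE 16
static unsigned char demo_device_memory[DEMO_REGION_SIZE];

/* Mock of ioremap(): returns an opaque token, or 0 if the requested
 * range does not fit in the fake device. */
static unsigned long demo_ioremap(unsigned long offset, unsigned long size)
{
    if (size == 0 || offset >= DEMO_REGION_SIZE ||
        size > DEMO_REGION_SIZE - offset)
        return 0;
    return offset + 1;
}

/* Mock of readb(): translate the token, then read one byte.
 * No bounds checking is done here, to keep the sketch short. */
static uint8_t demo_readb(unsigned long token)
{
    return demo_device_memory[token - 1];
}

/* Mock of writeb(): translate the token, then write one byte. */
static void demo_writeb(uint8_t value, unsigned long token)
{
    demo_device_memory[token - 1] = value;
}

/* Mock of iounmap(): nothing to release in this simulation. */
static void demo_iounmap(unsigned long token)
{
    (void)token;
}
```

A caller would obtain a base token with demo_ioremap(0, 8), access a register with demo_writeb(0xab, base + 4) or demo_readb(base + 4), and finish with demo_iounmap(base). The caller never dereferences the token itself, which is the property that lets the real accessors hide architecture differences.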

The code example in Listing 4 from the Compaq PCI Hot Plug driver in drivers/hotplug/cpqphp_core.c shows how to access a PCI device's resource memory properly.

Listing 4. Accessing a PCI Device's Resource Memory

Accessing PCI Memory

To access the PCI configuration memory of a device, you again must use some general functions and not try to access the memory directly. This is due to the different ways the PCI bus can be accessed, depending on the type of hardware you have. If you use the general functions, then your PCI driver will be able to work on any type of Linux system that has a PCI bus.

To read data from the PCI bus use the following functions:

int pci_read_config_byte(struct pci_dev *dev,
    int where, u8 *val);
int pci_read_config_word(struct pci_dev *dev,
    int where, u16 *val);
int pci_read_config_dword(struct pci_dev *dev,
    int where, u32 *val);

and to write data, use these functions:

int pci_write_config_byte(struct pci_dev *dev,
    int where, u8 val);
int pci_write_config_word(struct pci_dev *dev,
    int where, u16 val);
int pci_write_config_dword(struct pci_dev *dev,
    int where, u32 val);

These functions, which are declared in the linux/pci.h header file, allow you to read or write 8, 16 or 32 bits at a specific location that is assigned to a specific PCI device. If you wish to access the memory location of a specific PCI device that has not been initialized by the Linux PCI core yet, you can use the following functions that are present in the pci_hotplug core code:

int pci_read_config_byte_nodev(struct pci_ops *ops,
    u8 bus, u8 device, u8 function, int where, u8 *val);
int pci_read_config_word_nodev(struct pci_ops *ops,
    u8 bus, u8 device, u8 function, int where, u16 *val);
int pci_read_config_dword_nodev(struct pci_ops *ops,
    u8 bus, u8 device, u8 function, int where, u32 *val);
int pci_write_config_byte_nodev(struct pci_ops *ops,
    u8 bus, u8 device, u8 function, int where, u8 val);
int pci_write_config_word_nodev(struct pci_ops *ops,
    u8 bus, u8 device, u8 function, int where, u16 val);
int pci_write_config_dword_nodev(struct pci_ops *ops,
    u8 bus, u8 device, u8 function, int where, u32 val);
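The calling convention shared by these accessors, a status code as the return value and the data passed through a pointer, can be mimicked in user space. The sketch below is a mock rather than the kernel implementation: struct demo_pci_dev, the error code and the function names are all hypothetical, and an in-memory array stands in for the 256-byte configuration space.

```c
#include <stdint.h>

#define DEMO_CONFIG_SIZE  256  /* size of the classic PCI config space */
#define DEMO_OK           0
#define DEMO_BAD_REGISTER 1    /* hypothetical error code */

/* Hypothetical stand-in for struct pci_dev: just the raw config bytes. */
struct demo_pci_dev {
    unsigned char config[DEMO_CONFIG_SIZE];
};

/* Returns 1 if a 16-bit access at this position is aligned and in range. */
static int demo_word_ok(int where)
{
    return where >= 0 && where <= DEMO_CONFIG_SIZE - 2 && !(where & 1);
}

/* Read a 16-bit register. PCI configuration space is little-endian, so
 * the value is assembled byte by byte to stay correct on any host. */
static int demo_read_config_word(const struct demo_pci_dev *dev,
                                 int where, uint16_t *val)
{
    if (!demo_word_ok(where))
        return DEMO_BAD_REGISTER;
    *val = (uint16_t)(dev->config[where] | (dev->config[where + 1] << 8));
    return DEMO_OK;
}

/* Write a 16-bit register, least significant byte first. */
static int demo_write_config_word(struct demo_pci_dev *dev,
                                  int where, uint16_t val)
{
    if (!demo_word_ok(where))
        return DEMO_BAD_REGISTER;
    dev->config[where]     = (unsigned char)(val & 0xff);
    dev->config[where + 1] = (unsigned char)(val >> 8);
    return DEMO_OK;
}
```

As with the real functions, the caller checks the return value before trusting the data, and never needs to know how the bytes are laid out or how the bus is reached.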

An example of reading and writing to PCI memory by a driver can be seen in the USB OHCI driver at drivers/usb/usb-ohci.c (see Listing 5).

Listing 5. Reading and Writing to PCI Memory

Conclusion

If you follow these different rules when creating a new Linux kernel device driver, or when modifying an existing one, the resulting code will run successfully on a wide range of processors. These rules are also good to remember when debugging a driver that only works on one platform (remember those endian issues).

The most important resource is the set of existing kernel drivers that are known to work on different platforms. One of Linux's strengths is the open access of its code, which provides a powerful learning tool for aspiring driver authors.

Greg Kroah-Hartman is currently the Linux USB and PCI Hot Plug kernel maintainer. He works for IBM, doing various Linux kernel-related things and can be reached at greg@kroah.com.
