eBPF for Advanced Linux Infrastructure Monitoring
A year has passed since the pandemic left us spending the better part of our days sheltering inside our homes. It has been a challenging time for developers, sysadmins, and entire IT teams, who had to monitor and troubleshoot an influx of data within their systems and infrastructure as the world was forced online. To help them do their jobs properly, free, open-source technologies like Linux have become increasingly attractive, especially among Ops professionals and sysadmins in charge of maintaining growing and complex environments. Engineers, too, are using more open-source technologies, largely because of the flexibility and openness they offer compared with commercial offerings that come with high prices and stringent feature lock-in.
One emerging technology in particular, eBPF, has made its appearance in multiple projects, including commercial and open-source offerings. Before discussing the community surrounding eBPF and its growth during the pandemic, it’s important to understand what it is and how it’s being used. eBPF, or extended Berkeley Packet Filter, was originally introduced as BPF back in 1992 in a paper by Lawrence Berkeley Laboratory researchers as a rule-based mechanism to filter and capture network packets. Filters were implemented to run inside a register-based virtual machine (VM), which itself lived inside the kernel. After several years of little activity, BPF was extended to eBPF, featuring a full-blown VM to run small programs inside the Linux kernel. Because these programs run inside the kernel, they can be attached to a particular code path and executed whenever it is traversed, making them well suited to packet filtering, performance analysis, and monitoring.
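To make the idea of a filter running inside a register-based VM more concrete, here is a toy model in Python. It is only a sketch: the tuple encoding, the opcode names, and the helper `ethernet_ipv4_frame` are assumptions made for illustration and do not match the real BPF bytecode format. The filter accepts TCP-over-IPv4 Ethernet frames and drops everything else, returning the number of bytes to keep (0 means drop).

```python
from typing import List, Tuple

# Hypothetical toy encoding: (opcode, operand k, jump-if-true, jump-if-false).
# Real BPF instructions are packed binary structures; this is illustration only.
Instruction = Tuple[str, int, int, int]

ACCEPT = 65535  # number of bytes to keep; 0 means "drop the packet"


def run_filter(program: List[Instruction], packet: bytes) -> int:
    """Run a toy filter over one packet and return how many bytes to keep."""
    acc = 0  # the accumulator register
    pc = 0   # the program counter
    while pc < len(program):
        op, k, jt, jf = program[pc]
        pc += 1
        if op == "ldh":    # load a 16-bit big-endian value at offset k
            if k + 2 > len(packet):
                return 0   # out-of-bounds reads reject the packet
            acc = int.from_bytes(packet[k:k + 2], "big")
        elif op == "ldb":  # load a single byte at offset k
            if k >= len(packet):
                return 0
            acc = packet[k]
        elif op == "jeq":  # relative jump depending on acc == k
            pc += jt if acc == k else jf
        elif op == "ret":  # finish and return the verdict
            return k
        else:
            raise ValueError(f"unknown opcode: {op}")
    return 0


# Accept only TCP over IPv4 on Ethernet.
TCP_OVER_IPV4: List[Instruction] = [
    ("ldh", 12, 0, 0),      # EtherType field of the Ethernet header
    ("jeq", 0x0800, 0, 3),  # not IPv4? skip ahead to the final "ret 0"
    ("ldb", 23, 0, 0),      # IPv4 protocol field (14-byte header + offset 9)
    ("jeq", 6, 0, 1),       # not TCP? skip to the final "ret 0"
    ("ret", ACCEPT, 0, 0),
    ("ret", 0, 0, 0),
]


def ethernet_ipv4_frame(protocol: int) -> bytes:
    """Build a zero-filled Ethernet + IPv4 frame carrying the given protocol."""
    return bytes(12) + b"\x08\x00" + bytes(9) + bytes([protocol]) + bytes(10)
```

With these definitions, `run_filter(TCP_OVER_IPV4, ethernet_ipv4_frame(6))` returns 65535 (accept), while the same call with protocol 17 (UDP) returns 0 (drop).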
Originally, it was not easy to create eBPF programs, as the programmer needed to know an extremely low-level language. However, the community around the technology has evolved considerably, creating tools and libraries that simplify and speed up the process of developing an eBPF program and loading it into the kernel. This was crucial for creating a large number of tools that can trace system and application activity at a very granular level. The image that follows demonstrates this, showing the sheer number of tools that exist to trace various parts of the Linux stack.
Today, users write programs in a relatively high-level language, such as C, and use tools such as BCC (the BPF Compiler Collection) to compile them into a form that can be loaded into the kernel. Interestingly, while the eBPF program itself is written in a form of C, loaders exist in Python and Go, among other languages, so an eBPF program can be integrated into your project fairly easily, provided, of course, that the Linux kernel on which it runs supports eBPF.
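As a rough illustration of that split between a C program and a Python loader, here is a minimal sketch using BCC's Python bindings. It assumes BCC is installed, the script runs as root, and the kernel supports eBPF; the names `BPF_PROGRAM`, `trace_execve`, and `load_and_trace` are chosen for this example only.

```python
# The eBPF program itself: a small piece of C that BCC compiles at load time.
BPF_PROGRAM = r"""
int trace_execve(void *ctx) {
    bpf_trace_printk("execve called\n");
    return 0;
}
"""


def load_and_trace() -> None:
    """Compile the C program, attach it to the execve syscall, and stream its output.

    Assumes root privileges, an installed BCC, and an eBPF-capable kernel.
    """
    # Imported lazily so this file can still be imported on machines without BCC.
    from bcc import BPF

    bpf = BPF(text=BPF_PROGRAM)                   # compile and load into the kernel
    syscall = bpf.get_syscall_fnname("execve")    # resolve the kernel symbol name
    bpf.attach_kprobe(event=syscall, fn_name="trace_execve")
    bpf.trace_print()                             # blocks, printing one line per event
```

Calling `load_and_trace()` as root should print a line each time a process is executed on the machine, until interrupted with Ctrl-C.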
It takes very little for a new user to get into the eBPF ecosystem and identify the tools that could greatly enhance their efficiency in launching a highly granular, real-time infrastructure monitoring program. ebpf.io offers a great introduction that covers the technology at a high level, along with the tools used by the eBPF stack. It also lists the development toolchains that exist, so that users can try them out on their own. Moreover, this introductory page offers a large number of links to additional resources for the engineer who wishes to dig deeper. During my own research, I discovered a GitHub repository that was quite illuminating: awesome-ebpf.
Beyond its impact on the developer community, eBPF also underpins a number of very innovative commercial offerings on the market, particularly in the areas of networking and security. With the proliferation of Kubernetes as the de facto deployment strategy, eBPF has proven a powerful ally in understanding the often complex and obscure network-related issues that these intricate topologies produce. One pioneer in the space is Cilium; another is Netdata, which offers out-of-the-box eBPF-powered infrastructure monitoring. Regardless of preference for open-source or commercial offerings, the relatively young eBPF community is growing and attracting a number of professionals who are coming together to share and develop industry-wide best practices.
This is great news for eBPF and signifies that more engineers and sysadmins are being introduced to it, understanding its value, and then using it in their open-source projects or in their organization's infrastructure monitoring functions. This ground-up approach to eBPF’s adoption is key for its continued innovation and growth.
This “mini-revolution” may have gotten lost in the noise of the pandemic, but it's safe to say that an open-source project has once again come to the foreground, with Linux following suit as the obvious bedrock for any vendor who has open source in mind. Plus, today’s cloud-native environments add to a growing issue: infrastructure complexity as data explodes. To cope with this complexity, technologies like eBPF will prove valuable allies, and the added efficiency they provide will play a key role in an environment where demand is rising and costs need to be managed.