Who's Behind That Kernel You're Using?
Way back in April 2008, the Linux Foundation published a little report that upended a lot of perceptions about Linux development. Now, they've done it again.
The report in question — Linux Kernel Development: How Fast it is Going, Who is Doing It, What They are Doing, and Who is Sponsoring It? — revealed a number of things about the Linux kernel. At a total line count of 8,859,683, the kernel was growing at roughly 10% per year, with an average of some 3,621 lines of code added every day, while 1,550 lines were removed and 1,425 lines were changed. Possibly the most interesting numbers, however, were those regarding the faces behind those changes.
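As a rough sanity check, the daily averages quoted above can be turned into an annual growth estimate. This is only a sketch: it assumes a 365-day year, treats "changed" lines as having no effect on the total, and measures growth against the current line count. The names are illustrative and do not come from the report.

```python
# Figures as quoted from the April 2008 report.
TOTAL_LINES = 8_859_683
ADDED_PER_DAY = 3_621
REMOVED_PER_DAY = 1_550


def annual_growth_rate(total, added_per_day, removed_per_day, days=365):
    """Estimate yearly growth as a fraction of the current line count."""
    net_per_day = added_per_day - removed_per_day
    return net_per_day * days / total


net_per_day = ADDED_PER_DAY - REMOVED_PER_DAY
rate = annual_growth_rate(TOTAL_LINES, ADDED_PER_DAY, REMOVED_PER_DAY)

print(f"net lines per day: {net_per_day}")
print(f"estimated annual growth: {rate:.1%}")
```

Under those assumptions the kernel gains a little over 2,000 net lines a day, or about 8.5% a year relative to its current size, which is in the same ballpark as the report's "roughly 10%" figure.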
According to the report, the latest release at the time of writing was the work of 3,678 developers, a fifty percent increase in individual developers in just three years' time. It also found that 15% of the work on the kernel was being performed by the top ten developers, each having contributed more than 1% of the code, with some contributing just shy of 2%. Interestingly enough, though not surprisingly given his role as supreme maintainer, developer-in-chief Linus Torvalds ranked as the 27th most prolific, at 0.6%.
The most enlightening part of the report, perhaps, was the level of corporate involvement. Open Source projects seem to have a certain mystique, a prevalent idea that they are the result of hundreds of individual contributors slaving away in their spare time. The Foundation's report shattered this perception, finding that 75% of the work being undertaken was contributed by the top ten groups involved. The top two of these, "none" and "unknown," were responsible for a great deal of work, though the "unknown" group was composed entirely of developers with ten or fewer contributions in the previous three years. According to the report, even if one assumes that every single developer in the "unknown" group was unaffiliated (highly unlikely), the work being done by those paid to contribute still constituted over 70% of all kernel development. Not bad for those corporations everybody is always wanting to be rid of.
Sadly, the report has become outdated: as it demonstrates itself, kernel development accelerates at a pace that renders a year-and-a-half-old report obsolete. For that reason, the Linux Foundation announced today that an updated report is available for download, written by original authors Jonathan Corbet and Greg Kroah-Hartman, along with Foundation Vice President Amanda McPherson.
The new report, as might be expected, finds that kernel growth is still going strong, with the 2.6.30 kernel containing 11,560,971 lines of code, an increase of 2,701,288 lines in the past year and a half. It reports that growth has increased exponentially with the addition of the linux-next tree, where patches are staged prior to being committed to the main kernel tree. An average of 6,422 lines of code have been added, 3,285 removed, and 1,687 changed each day in the past four and a half years. Measured against the daily averages quoted from the original report, that is roughly 77% more lines added and more than twice as many removed each day.
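The comparison between the two reports is simple arithmetic, and a short sketch makes it easy to check. The figures below are the ones quoted in this article; the variable and function names are illustrative. Note that the two reports average over different time windows, so this only compares the quoted averages.

```python
# Total line counts quoted from the 2008 and 2009 reports.
OLD_TOTAL = 8_859_683
NEW_TOTAL = 11_560_971

# Average lines added, removed, and changed per day, as quoted.
OLD_DAILY = {"added": 3_621, "removed": 1_550, "changed": 1_425}
NEW_DAILY = {"added": 6_422, "removed": 3_285, "changed": 1_687}


def percent_increase(old, new):
    """Return the increase from old to new as a percentage of old."""
    return (new - old) / old * 100


growth_lines = NEW_TOTAL - OLD_TOTAL
growth_pct = percent_increase(OLD_TOTAL, NEW_TOTAL)
daily_pct = {
    key: percent_increase(OLD_DAILY[key], NEW_DAILY[key]) for key in OLD_DAILY
}

print(f"lines gained: {growth_lines:,} ({growth_pct:.1f}%)")
for key, pct in daily_pct.items():
    print(f"daily lines {key}: up {pct:.1f}%")
```

On these numbers the kernel grew by 2,701,288 lines, or about 30%, between the two reports. The quoted daily averages rise by about 77% for additions, a little more than double for removals, and about 18% for changes.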
The new report finds that, in the last three years, the individual developer community has grown by some fifty percent, with a total of 1,150 developers having a hand in the 2.6.30 kernel. Still, the top contributors are top for a reason. Some 33% of developers who submitted a patch (almost 400 in all) did exactly that: submitted one patch. The top ten are responsible for just shy of 12% of the code, while the top thirty can claim 25%. The current top individual contributor, David S. Miller, at 1.5%, held the second-highest spot in the original report, with 1.8%. The report notes the "amusing" revelation that Linus has fallen out of the top thirty, which is unsurprising given his more administrative role. It goes on to note that "merge commits" are not counted in the report, and that Linus is responsible for a significant number of these.
And, of course, there is the sponsored section. "None" retains the top spot, with 18.% of total contributions, while "Unknown" slips to the number three spot, down to 7.6% from its previous 12.9%. The second-place spot was seized by Red Hat, third in the previous report, which rose from its previous 11.2% to a current 12.3%. Once again, more than 70% of all development on the kernel is being done by those who are paid to do it. The report also notes that the jump in contributions from the "None" group is in part due to better identification: the drop in "Unknown" reflects developers whose affiliation, or lack thereof, has since been identified, and many of them subsequently moved into the "None" category.
Of course, both reports hold a host of other data, including figures on those reviewing and approving changes. The full seventeen-page report is an interesting read for anyone interested in pulling back the curtain to see what is really going on in kernel development.
The full text of both reports is available from the Linux Foundation — the 2009 report is in PDF.