University of Minnesota Linux Ban Prompts Questions About Open Source
Researchers snuck vulnerabilities past the peer-review process and into the open-source Linux kernel codebase. What does this mean for the ubiquitous Linux kernel, and open source in general?
Michael J. McDonald
Michael J. McDonald is an award-winning journalist who has worked at Bloomberg News and Thomson Financial.
There are generally two types of hackers: those who break into computer systems to find and fix vulnerabilities, and criminals who exploit weaknesses to steal data and hold organizations hostage.
Then there’s Kangjie Lu, an assistant professor at the University of Minnesota who specializes in computer security. He gained notoriety recently for purposefully creating vulnerabilities in the world’s most prominent open-source software system, the Linux kernel. After he published an academic paper about his exploits, the Linux kernel team banned the university and reverted its previous fixes.
Intentional Linux Kernel Vulnerabilities
The revelations prompted questions around open-source security and the OS kernel, a foundational component of countless devices and servers. Lu buried vulnerabilities in some minor fixes he and a graduate student submitted to Linux’s vast repository of software code, subverting a collaborative method that is essential to keeping the program secure.
“They were distressingly successful,” says Alexander Sereda, a project manager based in Toronto who joined the Toptal network in 2020. “Their malicious code passed the community oversight that’s supposed to weed out these kinds of submissions.”
While Windows and macOS dominate desktops, Linux-based OSs are by far the most popular for servers and supercomputers, and Google uses a modified version of the kernel for Android. Tech giants like Red Hat have built their enterprise software businesses around open source, and even Microsoft, which for years was seen as a leading opponent of the open-source ecosystem, has come around: In 2018 it bought GitHub, the biggest host of open-source projects, and today Linux distributions can be found in the Microsoft Store.
It’s hard to overstate the pervasiveness of the Linux kernel and open-source software, which have evolved from the stuff of hobbyists and idealists into a cornerstone of today’s software market. There are millions of open-source projects for developers to sift through, and because they’re free and for the most part reliable, more than 90% of commercial applications contain such components. That figure keeps growing, according to an annual study from the silicon design and software security firm Synopsys.
Being reliable is just as important as being free, and the open-source community prides itself on delivering programs that are equal to, if not better than, proprietary software. The movement emerged in the 1990s with a central tenet, coined by developer and author Eric Raymond, that “given enough eyeballs, all bugs are shallow.” That open source has so far survived relatively unscathed—even as the world contends with an unprecedented epidemic of hacking and ransomware—is only further affirmation that such open collaboration can still be effective.
But, as Lu showed in his research, no security is perfect, even in something as critical and closely monitored as the Linux kernel. While writing secure code can be straightforward, finding vulnerabilities in code after the fact can be incredibly difficult—“like removing the milk from your tea after it’s been stirred in,” says Sereda. While open source has in general performed admirably, apart from a couple of significant hiccups like the Heartbleed bug that emerged in 2014, there is a growing body of evidence that developers, for their part, are taking too many shortcuts when incorporating free software into their products.
Open-source Security Risks
In its 2021 annual report, Synopsys reveals that every company it audited in the marketing tech industry had open source in its codebase, and 95% of those codebases contained vulnerabilities. It found similar results when looking at the healthcare sector, the financial services sector, and the retail and e-commerce industries. There are a number of reasons for the poor results, but chief among them is that software keeps growing more complicated, and as it does, it gets harder to track and monitor components.
Open source has a much better security record than proprietary products but that doesn’t guarantee that it’s always going to be the best quality, says Sam Watkins, a freelance full-stack developer who joined Toptal’s network in 2021. “We have a much bigger problem, which is overly complicated programs. It’s insecure, though not from malicious intent.”
The problem then isn’t necessarily that open source is too open. Instead, it’s the gap that persists because there is no one vendor pushing out patches for the community even as software cycles keep shrinking, says Timothy Mackey, principal security strategist at Synopsys. Tight budgets force programmers to use imperfect shortcuts like simplistic rating systems to pick their components—by popularity, rather than quality. There are multiple services that offer programmers such shortcuts, including Openbase, Stack Builder, and the Open Source Index, which highlights the most popular projects on GitHub.
Managing Open-source Vulnerability
According to programmers and academics, while there is value in these open-source rating systems, there needs to be more validation and consideration when weighing options, rather than just grabbing components that appear to be the best match. Each organization should establish a set of best practices that includes principles for carefully choosing software, taking into account the amount of support required and the risks faced. Companies should also track and frequently update all their open-source components.
Some other best practices that our experts identified for consideration include:
- Using automation, verifying processes, documenting everything, and using Git to track codebase changes.
- Creating a positive community, including helping people new to open source who could become important collaborators.
- Keeping all open-source supply chains auditable.
- Using open-source containers that share the kernel of the host OS.
- Identifying the most critical open-source components, then tracking their security issues, engaging with their developers, and contributing back to upstream projects with both patches and funding.
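One way to picture the auditability item in the list above is a manifest of content digests recorded when components are approved and checked again later. The following is only a minimal sketch under stated assumptions: the function names and the in-memory `components` mapping are hypothetical, and a real setup would read files or package archives rather than byte strings.

```python
import hashlib


def sha256_digest(data: bytes) -> str:
    """Return the hex SHA-256 digest of a component's contents."""
    return hashlib.sha256(data).hexdigest()


def build_manifest(components: dict[str, bytes]) -> dict[str, str]:
    """Record a digest for every component at the time it is approved."""
    return {name: sha256_digest(blob) for name, blob in components.items()}


def find_tampered(manifest: dict[str, str], components: dict[str, bytes]) -> list[str]:
    """List components that are missing, unrecorded, or whose contents changed."""
    changed = []
    for name in sorted(set(manifest) | set(components)):
        if name not in manifest or name not in components:
            # Present on only one side: either removed or never audited.
            changed.append(name)
        elif sha256_digest(components[name]) != manifest[name]:
            # Same name, different bytes than what was approved.
            changed.append(name)
    return changed
```

Committing such a manifest to Git alongside the code would give reviewers a record of when each component changed, which ties the auditability item to the first item in the list.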
Ann Barcomb, an assistant professor at the University of Calgary’s Schulich School of Engineering, adds that, ideally, organizations should be using a set of best practices to build libraries of pre-approved products so that software is never selected arbitrarily. However, she acknowledges that this process is time-consuming, costly, and not widely practiced.
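A library of pre-approved products can be as simple as a reviewed list that a build step checks dependencies against. The sketch below is illustrative only: the package names, the `name==version` pin format, and both function names are assumptions made for the example, not a description of any particular organization's process.

```python
def parse_pin(line: str) -> tuple[str, str]:
    """Split a 'name==version' pin into a lowercased name and a version."""
    name, sep, version = line.strip().partition("==")
    if not sep or not name or not version:
        raise ValueError(f"not an exact pin: {line!r}")
    return name.lower(), version


def unapproved(pins: list[str], approved: dict[str, set[str]]) -> list[str]:
    """Return pins whose package or version is not in the approved library."""
    rejected = []
    for line in pins:
        name, version = parse_pin(line)
        # An unknown package has no approved versions at all.
        if version not in approved.get(name, set()):
            rejected.append(f"{name}=={version}")
    return rejected


# Hypothetical approved library: package name -> reviewed versions.
APPROVED = {
    "example-http": {"1.4.2"},
    "example-json": {"0.9.0", "0.9.1"},
}

if __name__ == "__main__":
    print(unapproved(["example-http==1.5.0", "example-json==0.9.1"], APPROVED))
```

Failing the build when the returned list is non-empty is one way to make sure a new component gets reviewed before it ships, at the cost Barcomb describes.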
“You want more security but that security comes at a huge price,” says Ayush Poddar, a freelance back-end developer who joined Toptal in 2021.
Platforms like Black Duck, Sonatype, Snyk, and WhiteSource provide automation to help find open-source components in a computer program’s stack and identify vulnerabilities. Still, these tools are limited and keeping up with code patches is another problem that’s only getting worse—the US Cybersecurity & Infrastructure Security Agency often reports hundreds of new software vulnerabilities per week.
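At its core, this kind of tooling compares an inventory of installed components with a feed of published advisories. The sketch below shows only that matching step and does not reflect how any of the named products work internally. The `Advisory` fields, the advisory IDs, the package names, and the plain dotted-integer version format are all assumptions made for illustration.

```python
from dataclasses import dataclass


def parse_version(text: str) -> tuple[int, ...]:
    """Turn '1.0.2' into (1, 0, 2); trailing zeros are dropped so 1.2 == 1.2.0."""
    parts = [int(part) for part in text.split(".")]
    while len(parts) > 1 and parts[-1] == 0:
        parts.pop()
    return tuple(parts)


@dataclass(frozen=True)
class Advisory:
    advisory_id: str
    package: str
    introduced: str  # first affected version (inclusive)
    fixed: str       # first patched version (exclusive)


def affected(inventory: dict[str, str], advisories: list[Advisory]) -> list[tuple[str, str, str]]:
    """Return (package, installed version, advisory id) for each match."""
    hits = []
    for adv in advisories:
        installed = inventory.get(adv.package)
        if installed is None:
            continue  # component is not part of this codebase
        low, high = parse_version(adv.introduced), parse_version(adv.fixed)
        if low <= parse_version(installed) < high:
            hits.append((adv.package, installed, adv.advisory_id))
    return hits
```

Even this simple check depends on a complete inventory, and real version schemes (pre-releases, build tags, backported fixes) are messier than dotted integers, which is part of why such tools remain limited.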
“You can’t test every combination of how each bit of code can be executed,” says Aidan McManus, a retired tech executive who oversaw IT architecture and engineering at CA Technologies. “It would take years.”
Mats Heimdahl, the head of the University of Minnesota’s Department of Computer Science and Engineering, notes that Lu and his researchers also found numerous bad patches in the kernel that were separate from the bugs they submitted. “It is, in my view, pretty clear that a manual review process by overworked and under-appreciated volunteers (even by extremely skilled and dedicated maintainers) will inevitably be imperfect,” Heimdahl wrote in an email.
The fact that vulnerabilities are growing raises fundamental questions about how open source is going to be managed. While it accelerates innovation, it’s essentially a shared resource, a vast library of free-to-use software that saves consumers $60 billion a year and also increases profits for companies by cutting their development costs. There may simply be too many free riders, with not enough resources being committed to upkeep and security.
The University of Minnesota Linux Kernel Ban: Limits Learned
While there’s nothing to indicate that the vetting of fixes has changed, the Linux Foundation is establishing a set of best practices for researchers who work with the kernel and has recommended that the University of Minnesota appoint a reviewer for its submissions. There really isn’t an alternative to code review, and the community in place now is doing a reasonable job of keeping malicious code out, says Barcomb. Given the autonomy required for knowledge work, she says, “the best you can do is have processes in place to identify trust violations and respond accordingly, in the best case before changes are incorporated.”
Heimdahl notes that his institution is organizing a committee to advise on patch submissions as it waits for the ban to be lifted.
Linux was once an outsider’s idea, a bold counterpoint to traditional thinking about proprietary software, but it has evolved into something that looks much more like a commercial project. Huawei, Intel, and Red Hat lead the hundreds of companies that regularly contribute code to the Linux kernel. While many of these companies also donate money to the Linux Foundation and affiliates of the Open Source Initiative, it might be time for a more systematic approach to support the software in order to help improve security going forward—one that better values the benefits of such crucial open-source systems.
“People just take it for granted that open source works,” says Christopher Tozzi, a senior lecturer at Rensselaer Polytechnic Institute. “There’s a whole new generation of people who haven’t really thought about these issues.”
Understanding the basics
How secure is the Linux kernel?
The Linux kernel is one of the more secure open-source projects, given its funding and numerous volunteers. If anything, the end result of the University of Minnesota “hypocrite commits” incident shows that the maintainers are keeping the kernel secure after all.
What is code security?
Code security refers to minimizing vulnerabilities and preventing them from being introduced into a codebase.
Why do we need to secure code?
Insecure code exposes end users to a wide variety of risks, from financial loss to physical harm, depending on the type of software involved.
What is meant by open-source code?
Open source means that source code is publicly available and can be analyzed by anyone for vulnerabilities.
How does open source work?
Open-source projects typically welcome public contributions so users have an opportunity to improve the free products they benefit from. Contributions to source code are vetted by codebase maintainers—the current key developers behind a given project.