Over 350,000 open-source repositories may be open to compromise because they include a Python module containing an unpatched 15-year-old vulnerability.
That’s the finding of researchers at Trellix, who said the hole, CVE-2007-4559, is in Python’s tarfile module, which isn’t properly checking for path traversal vulnerabilities. A developer could inadvertently include the vulnerability in their own code, say researchers — and, they suggest, developers have been doing it for years.
“Today, left unchecked, this vulnerability has been unintentionally added to hundreds of thousands of open- and closed-source projects worldwide, creating a substantial software supply chain attack surface,” the researchers said Wednesday in a blog.
The long-forgotten hole was discovered while researchers were investigating an unrelated vulnerability. While the vulnerability was originally only marked as a 6.8 severity, the researchers were able to confirm that in most cases an attacker can gain code execution from an arbitrary file write. For a detailed technical understanding of the CVE and the technical consequences of an attack, see this separate blog.
With GitHub’s co-operation, the researchers were able to determine there were around 2.87 million open-source files containing Python’s tarfile module in about 588,000 unique repositories. Of those, an estimated 350,000 unique open-source repositories in a vast number of industries will be vulnerable to attack.
The blog notes that Python documentation warns developers about the tarfile problem, urging them to never extract archives from untrusted sources without prior inspection.
Briefly, the actual vulnerability arises from two or three lines of code using un-sanitized tarfile.extract() or the built-in defaults of tarfile.extractall(). Failure to write any safety code to sanitize the tarfile member’s files before calling tarfile.extract() or tarfile.extractall() results in a directory traversal vulnerability, enabling a bad actor to access the file system.
Trellix has created automated tools to start issuing fixes for open-source code it sees in GitHub and other code websites. So far it has patches for 11,005 repositories, ready for pull requests. Each patch will be added to a forked repository and a pull request made over time. This will help individuals and organizations alike become aware of the problem and give them a one-click fix, Trellix says.
Over the next few weeks, just over 12 per cent — about 70,000 projects — could be fixed if all the Pull Requests are accepted by the project maintainers.
“The real solution is to tackle the root of the problem,” says Trellix researcher Charles McFarland. “That is, diligent security assessments of open-source code and timely patching. N-days should be measured in days, not years. We need to ensure we are doing our due diligence to audit OSS [open-source software] and not leave vulnerable code in the wild to be exploited. If this tarfile vulnerability is any indicator, we are woefully behind and need to increase our efforts to ensure OSS is secure.”