Verifying packages with Debian's ReproducibleBuilds
Identical Build
Open source software offers a big security benefit: Unlike proprietary software, anyone can view the source code, so in theory you know what you are installing. However, the overwhelming majority of users install prebuilt software packages provided by their Linux distributors. These users rely on system developers and package maintainers to ensure that the binary packages do not contain malicious code that deviates from the official source code.
The Debian ReproducibleBuilds project helps you verify that the package matches the source code and that no flaws have been introduced (Figure 1) [1].
Attack Scenarios
As a popular Linux distribution, Debian distributes its software to a large number of users worldwide. These users include not only private individuals but also organizations, research institutions, and companies. This complex and decentralized software distribution system creates opportunities for attackers to foist malicious code onto unsuspecting users.
One obvious attack scenario is targeted manipulation of a DEB binary package. Past exploits like the OpenSSH bug CVE-2002-0083 [2] show that sometimes changing just one bit is sufficient to install a backdoor [3]. In a more sophisticated attack, an attacker could plant a kernel rootkit on a package maintainer's computer, which would then secretly change the code at build time.
Secure Binary Packages
The idea of making binary packages for Debian reproducible has existed since 2007 [4]. At first, the idea met with little response; the decisive impetus came later from projects with high security requirements. For example, the Tor Project has pushed the development of reproducible package building [5].
Since the Snowden revelations, significantly more users have become interested in the security gains offered by this approach. Bitcoin developers, for example, have a vested interest in safeguarding the financial software that distributes the virtual currency to users.
Matching Builds
If you want to identify manipulations by comparing the results of different package builds, the first step is to ensure that the build process always produces identical packages. In general, this is not the case: two binaries built from the same source code often differ for several reasons [6].
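The comparison itself can be sketched in a few lines of Python. This is only an illustration, not the tooling the Debian project uses: the function names sha256_of and builds_match are hypothetical, and the sketch assumes that two independently built packages are available as local files.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def builds_match(first: Path, second: Path) -> bool:
    """Two builds count as identical only if they match bit for bit."""
    return sha256_of(first) == sha256_of(second)
```

A single differing byte, such as an embedded timestamp, is enough to make builds_match return False, which is why every source of variation has to be eliminated before a checksum comparison says anything about tampering.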
For example, the generated packages change when developers build on different machines. The build process may also write a different timestamp into the header of the gzip archives it generates or into the man pages created using the docbook-to-man tool. Program documentation contains timestamps as well, as do HTML pages generated by Doxygen [7] and PDF documents built with LaTeX [8].
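The gzip case is easy to reproduce with Python's standard gzip module. The following is a minimal sketch; the helper name gzip_bytes is hypothetical, and pinning the timestamp to 0 is just one possible choice of a fixed value.

```python
import gzip
import io
from typing import Optional


def gzip_bytes(payload: bytes, mtime: Optional[int]) -> bytes:
    """Compress payload in memory; mtime=None means "use the current time"."""
    buffer = io.BytesIO()
    with gzip.GzipFile(fileobj=buffer, mode="wb", mtime=mtime) as archive:
        archive.write(payload)
    return buffer.getvalue()


def header_mtime(blob: bytes) -> int:
    """Read the modification time stored in bytes 4 to 7 of the gzip header."""
    return int.from_bytes(blob[4:8], "little")


data = b"identical source material"
early = gzip_bytes(data, mtime=1)
late = gzip_bytes(data, mtime=2)
pinned_a = gzip_bytes(data, mtime=0)
pinned_b = gzip_bytes(data, mtime=0)
```

Here early and late differ although the payload is the same, because only the timestamp field in the header changed. The two pinned archives are byte-for-byte identical.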
Problems are also caused by file lists that differ between builds because the POSIX readdir() function does not return directory entries in sorted order. A different build path, for example, results in changes to the build ID of the binaries. The locale used for the build also makes a difference to binary packages, as does the hostname of the build system, and many other factors.
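The effect of unsorted file lists can be illustrated with an in-memory tar archive. This is a sketch under stated assumptions: the helper name pack and the sample file names are hypothetical, and the metadata of every archive member is left at the fixed defaults of tarfile.TarInfo so that only the order of the entries varies.

```python
import io
import tarfile
from typing import Dict, List


def pack(files: Dict[str, bytes], order: List[str]) -> bytes:
    """Write the named files into an uncompressed tar archive in the given order."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w", format=tarfile.USTAR_FORMAT) as archive:
        for name in order:
            info = tarfile.TarInfo(name)  # mtime, uid, and gid default to 0
            info.size = len(files[name])
            archive.addfile(info, io.BytesIO(files[name]))
    return buffer.getvalue()


files = {"changelog": b"1.0-1\n", "control": b"Package: demo\n"}

# Two machines may list the same directory in different orders.
machine_a = ["changelog", "control"]
machine_b = ["control", "changelog"]

unsorted_a = pack(files, machine_a)
unsorted_b = pack(files, machine_b)
sorted_a = pack(files, sorted(machine_a))
sorted_b = pack(files, sorted(machine_b))
```

The unsorted archives differ even though they hold the same files, whereas sorting the names before packing yields identical bytes on both machines. Python's sorted() compares strings by code point, so unlike a locale-aware sort, the resulting order does not depend on the build locale.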