XZ backdoor lessons: reproducing target-isns release tarballs
Andres Freund detected a backdoor in XZ, a data compression library, and disclosed it as CVE-2024-3094 at the end of March 2024.
Many good articles describe how the attackers implanted the backdoor:
- LWN: Free software's not-so-eXZellent adventure.
- LWN: How the XZ backdoor works.
- Russ Cox: Timeline of the XZ open source attack.
The attackers prepared the backdoor for more than two years and completed their work when they became maintainers. They published a release tarball with a malicious build script not present in the source code. The attackers uploaded the release tarball to GitHub.
I maintain target-isns, a niche free software project packaged in several Linux distributions. There is not much activity in the project because it is mostly done. The last release of target-isns – version v0.6.8 – happened in May 2020. I built the release tarball on my machine and uploaded it to GitHub.
I believe that one of the lessons of the XZ backdoor episode is "Trust, but verify". Applied to the target-isns releases I published, this raises two questions:
- How can users verify that the release tarball of target-isns matches the source code?
- How can we improve the release process of target-isns to ease that verification?
How to verify a release tarball built with git-archive
target-isns is written in C and built with CMake. A custom target, named dist, invokes git-archive to generate a tarball of the source code. Ironically, the tarball is compressed with XZ.
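As a rough illustration of what such a dist target boils down to, the sketch below wraps git archive in a small shell function. The function name archive_tag and the target-isns-0.6.8/ prefix are assumptions made for this example; the actual CMake target may pass different options.

```shell
#!/bin/sh
# archive_tag TAG NAME: write an uncompressed tarball of TAG to stdout,
# with every path placed under the NAME/ directory.
# (archive_tag is a hypothetical helper, not part of target-isns.)
archive_tag() {
    git archive --format=tar --prefix="$2/" "$1"
}
```

From inside a clone, it could be combined with XZ like this: archive_tag v0.6.8 target-isns-0.6.8 | xz > target-isns-0.6.8.tar.xz. When given a tag or a commit, git archive uses the commit time as the modification time of every file, which is one reason why the tar stream is the same on every run.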
To verify that the release tarball matches the source code, we must compare the checksum of the published release tarball with one built locally from the source code.
How to compute the checksum of the published release tarball
Let's download the release tarball for target-isns v0.6.8 from GitHub and compute its SHA256 checksum:
    $ curl -OL --silent https://github.com/open-iscsi/target-isns/releases/download/v0.6.8/target-isns-0.6.8.tar.xz
    $ sha256sum target-isns-0.6.8.tar.xz
    544de09a2073242b21f6859841a5016c79c4006a53435a79b5cfc6602a59db97  target-isns-0.6.8.tar.xz
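To script this check instead of comparing hashes by eye, a minimal sketch could look like the following. The helper name check_sha256 is hypothetical, and the sketch assumes that sha256sum (GNU coreutils or BusyBox) is available.

```shell
#!/bin/sh
# check_sha256 FILE EXPECTED: succeed only if the SHA256 checksum of FILE
# equals EXPECTED (a lowercase hexadecimal string).
# (check_sha256 is a hypothetical helper, not part of target-isns.)
check_sha256() {
    file=$1
    expected=$2
    # sha256sum prints "<hash>  <file>"; keep only the hash.
    actual=$(sha256sum "$file" | cut -d ' ' -f 1)
    [ "$actual" = "$expected" ]
}
```

For example, check_sha256 target-isns-0.6.8.tar.xz 544de09a2073242b21f6859841a5016c79c4006a53435a79b5cfc6602a59db97 && echo "checksum matches" prints the message only when the downloaded file has the checksum shown above.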
How to reproduce the release tarball from the source code
We clone the repository with git clone and check out the tag we want with --branch v0.6.8:
    $ git clone --branch v0.6.8 https://github.com/open-iscsi/target-isns.git
    $ cd target-isns
The command git show v0.6.8 reports that the tagger (it's me) signed the tag with GnuPG.
My about page mentions the fingerprint of my public GnuPG key. We import that key:
    $ gpg --keyserver keyserver.ubuntu.com --search-key 81D01C4E399FFEDEDBD93D7F08390B7DF2FC1876
    gpg: data source: http://185.125.188.27:11371
    (1) Christophe Vu-Brugier <cvubrugier@example.org>
          4096 bit RSA key 08390B7DF2FC1876, created: 2011-06-16
    Keys 1-1 of 1 for "81D01C4E399FFEDEDBD93D7F08390B7DF2FC1876". Enter number(s), N)ext, or Q)uit > 1
    gpg: key 08390B7DF2FC1876: public key "Christophe Vu-Brugier <cvubrugier@example.org>" imported
    gpg: Total number processed: 1
    gpg: imported: 1
After importing my public key, we verify the tag with git tag --verify:
    $ git tag --verify v0.6.8
    object 52e4fd427b1aff902ef4e7bce9a9c2f6b358a5eb
    type commit
    tag v0.6.8
    tagger Christophe Vu-Brugier <cvubrugier@example.org> 1588947171 +0200
    target-isns v0.6.8
    gpg: Signature made Fri May 8 14:13:16 2020 UTC
    gpg: using RSA key 81D01C4E399FFEDEDBD93D7F08390B7DF2FC1876
    gpg: issuer "cvubrugier@example.org"
    gpg: Good signature from "Christophe Vu-Brugier <cvubrugier@example.org>" [unknown]
    gpg: WARNING: This key is not certified with a trusted signature!
    gpg: There is no indication that the signature belongs to the owner.
    Primary key fingerprint: 81D0 1C4E 399F FEDE DBD9 3D7F 0839 0B7D F2FC 1876
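In a script, reading the human-readable output is fragile. A possible alternative, sketched below, is to ask for GnuPG's machine-readable status lines with git verify-tag --raw and to look for a VALIDSIG line that carries the expected fingerprint. The helper name validsig_matches is hypothetical.

```shell
#!/bin/sh
# validsig_matches FINGERPRINT: read GnuPG status lines on stdin and succeed
# only if a VALIDSIG line names FINGERPRINT, either as the signing key
# (first argument) or as the primary key (last argument).
# (validsig_matches is a hypothetical helper, not part of git or GnuPG.)
validsig_matches() {
    awk -v fpr="$1" '
        $1 == "[GNUPG:]" && $2 == "VALIDSIG" && ($3 == fpr || $NF == fpr) {
            found = 1
        }
        END { exit !found }
    '
}
```

A possible invocation is git verify-tag --raw v0.6.8 2>&1 | validsig_matches 81D01C4E399FFEDEDBD93D7F08390B7DF2FC1876, since --raw writes the status lines to standard error. For unattended use, the key can also be fetched without the interactive prompt by passing --recv-keys instead of --search-key to gpg.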
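The tag signature is valid. The warning only means that we have not certified the key in our local keyring; the fingerprint matches the one we imported. We now build a release tarball and compare its checksum with that of the published release tarball.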
    $ mkdir build
    $ cd build/
    $ cmake ..
    -- The C compiler identification is GNU 12.2.0
    -- Detecting C compiler ABI info
    -- Detecting C compiler ABI info - done
    -- Check for working C compiler: /usr/bin/cc - skipped
    -- Detecting C compile features
    -- Detecting C compile features - done
    -- Configuring done
    -- Generating done
    -- Build files have been written to: /target-isns/build
    $ make dist
    Built target dist
    $ sha256sum target-isns-0.6.8.tar.xz
    544de09a2073242b21f6859841a5016c79c4006a53435a79b5cfc6602a59db97  target-isns-0.6.8.tar.xz
The checksum of the generated tarball (beginning with 544de09a) matches the checksum of the published tarball: we have reproduced the release tarball.
I reproduced the release tarball on Debian 12 (bookworm). Does this also work on other Linux distributions?
Reproducing the release tarball from other Linux distributions
I use Podman to build the release tarball in several Linux containers.
Linux distribution | Release tarball reproduced | Git version | XZ version |
---|---|---|---|
Alpine 3.19.1 | Yes | 2.43.0 | 5.4.5 |
Arch Linux | No | 2.44.0 | 5.6.1 |
Debian 11 (bullseye) | Yes | 2.30.2 | 5.2.5 |
Debian 12 (bookworm) | Yes | 2.39.2 | 5.4.1 |
Fedora 39 | Yes | 2.44.0 | 5.4.4 |
Ubuntu 22.04 (jammy) | Yes | 2.34.1 | 5.2.5 |
Ubuntu 24.04 (noble, pre-release) | Yes | 2.43.0 | 5.4.5 |
Rocky Linux 8 | Yes | 2.39.3 | 5.2.4 |
Rocky Linux 9 | Yes | 2.39.3 | 5.2.5 |
The release tarball is reproducible everywhere except on Arch Linux. The output of git archive is the same, but XZ compresses the tarball differently. A binary search with git bisect finds the commit that introduced the change:
XZ (version 5.6.1) on Arch Linux compresses data with several threads by default. With --threads=1, we can force XZ to compress data with just one thread. With that option, we can reproduce the release tarball of target-isns on Arch Linux.
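A minimal sketch of pinning the thread count is shown below. The wrapper name xz_single_thread is hypothetical; it simply compresses standard input with one thread, whatever the default of the installed XZ version is.

```shell
#!/bin/sh
# xz_single_thread: compress stdin to stdout with exactly one thread, so the
# result does not depend on the multi-threaded default of newer XZ releases.
# (xz_single_thread is a hypothetical helper, not part of target-isns.)
xz_single_thread() {
    xz --threads=1 --stdout
}
```

Another option, assuming the dist target invokes the xz command-line tool, is to leave the build files untouched and run XZ_DEFAULTS=--threads=1 make dist, since xz reads default options from the XZ_DEFAULTS environment variable.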
Usually, there are different ways for a compressor to represent the original data. That's why a compressor output may differ when its algorithm or parameters change. To reduce the risk of compressed tarballs varying across platforms, target-isns releases could switch to a compressor less likely to change such as gzip.
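If the project switched to gzip, one detail would still matter: gzip can embed the original file name and a timestamp in its header. The sketch below, with the hypothetical helper name gzip_no_name, omits both with --no-name. It only removes that source of variation; it does not guarantee identical output across different gzip versions or implementations.

```shell
#!/bin/sh
# gzip_no_name: compress stdin to stdout without storing a file name or a
# timestamp in the gzip header.
# (gzip_no_name is a hypothetical helper, not part of target-isns.)
gzip_no_name() {
    gzip --no-name --stdout
}
```

For instance, git archive --format=tar --prefix=target-isns-0.6.8/ v0.6.8 | gzip_no_name > target-isns-0.6.8.tar.gz would produce a gzip-compressed tarball; the prefix is again an assumption made for the example.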
Future work
While existing release tarballs of target-isns are reproducible, we should improve the release process.
- Document the release process.
  - A document should describe how releases are made and how to reproduce them.
- Set up a continuous integration pipeline to generate the release tarball from a tag.
  - A continuous integration job should automatically generate the release tarball from a tag. The job should run in a container (for example a Docker image). With that, users could reproduce the release tarballs by running the same container image on their machine.
- Sign the release tarball with GnuPG.
  - The maintainer should download the tarball generated by the release pipeline, reproduce it, and test it. Then, they should sign the release tarball and the SHA256 checksum file with their private GnuPG key and publish these files along with the signatures.
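The checksum and signing step could be sketched as follows. The function names and the SHA256SUMS file name are assumptions made for this example, not an existing part of the target-isns release process.

```shell
#!/bin/sh
# write_checksums FILE...: record the SHA256 checksums of the given files in
# ./SHA256SUMS (the file name is an assumption made for this sketch).
write_checksums() {
    sha256sum "$@" > SHA256SUMS
}

# sign_checksums: create a detached, ASCII-armored signature SHA256SUMS.asc.
# Requires the maintainer's private GnuPG key.
sign_checksums() {
    gpg --armor --detach-sign SHA256SUMS
}

# verify_checksums: check every file listed in ./SHA256SUMS.
verify_checksums() {
    sha256sum -c SHA256SUMS
}
```

A maintainer would run write_checksums target-isns-0.6.8.tar.xz followed by sign_checksums. A user would first run gpg --verify SHA256SUMS.asc SHA256SUMS and then verify_checksums in the directory that holds the downloaded files.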