XZ backdoor lessons: reproducing target-isns release tarballs

Andres Freund detected a backdoor in XZ, a data compression library, and disclosed it as CVE-2024-3094 at the end of March 2024.

Many good articles describe how the attackers implanted the backdoor. In short:

The attackers prepared the backdoor for more than two years and completed their work once they had become maintainers. They published a release tarball containing a malicious build script that was absent from the source repository, and uploaded that tarball to GitHub.

I maintain target-isns, a niche free software project packaged in several Linux distributions. There is not much activity in the project because it is mostly done. The last release of target-isns – version v0.6.8 – happened in May 2020. I built the release tarball on my machine and uploaded it to GitHub.

I believe that one of the lessons of the XZ backdoor episode is "Trust, but verify". Applied to the target-isns releases I published, this raises two questions:

  1. How can users verify that the release tarball of target-isns matches the source code?
  2. How can we improve the release process of target-isns to ease that verification?

How to verify a release tarball built with git-archive

target-isns is written in C and built with CMake. A custom target, named dist, invokes git-archive to generate a tarball of the source code. Ironically, the tarball is compressed with XZ.
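
Such a dist target boils down to a short pipeline. The sketch below is not the project's actual CMake rule: the function name make_dist, the archive prefix, and the output file name are assumptions chosen to match the published tarball name.

```shell
# Hypothetical helper: archive a tag with git-archive, then compress with XZ.
# The prefix and the output name are assumptions, not taken from the
# target-isns CMake files.
make_dist() {
    tag="$1"                       # for example v0.6.8
    name="target-isns-${tag#v}"    # strip the leading "v": target-isns-0.6.8
    # git-archive uses the commit date as the modification time of every
    # entry, so the tar stream is the same on every machine.
    git archive --format=tar --prefix="${name}/" --output="${name}.tar" "$tag" &&
        xz --force "${name}.tar"   # replaces NAME.tar with NAME.tar.xz
}
```

Run from the top of a clone, this would be invoked as:

$ make_dist v0.6.8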

To verify that the release tarball matches the source code, we must compare the checksum of the published release tarball with one built locally from the source code.

How to compute the checksum of the published release tarball

Let's download the release tarball for target-isns v0.6.8 from GitHub and compute its SHA256 checksum:

$ curl -OL --silent https://github.com/open-iscsi/target-isns/releases/download/v0.6.8/target-isns-0.6.8.tar.xz
$ sha256sum target-isns-0.6.8.tar.xz
544de09a2073242b21f6859841a5016c79c4006a53435a79b5cfc6602a59db97  target-isns-0.6.8.tar.xz

How to reproduce the release tarball from the source code

We clone the repository with git clone and check out the tag we want with --branch v0.6.8:

$ git clone --branch v0.6.8 https://github.com/open-iscsi/target-isns.git
$ cd target-isns

The command git show v0.6.8 reports that the tagger (it's me) signed the tag with GnuPG.

My about page mentions the fingerprint of my public GnuPG key. We import that key (see footnote 1):

$ gpg --keyserver keyserver.ubuntu.com --search-key 81D01C4E399FFEDEDBD93D7F08390B7DF2FC1876
gpg: data source:
(1)     Christophe Vu-Brugier <cvubrugier@example.org>
          4096 bit RSA key 08390B7DF2FC1876, created: 2011-06-16
Keys 1-1 of 1 for "81D01C4E399FFEDEDBD93D7F08390B7DF2FC1876".  Enter number(s), N)ext, or Q)uit > 1
gpg: key 08390B7DF2FC1876: public key "Christophe Vu-Brugier <cvubrugier@example.org>" imported
gpg: Total number processed: 1
gpg:               imported: 1

After importing my public key, we verify the tag with git tag --verify (see footnote 1):

$ git tag --verify v0.6.8
object 52e4fd427b1aff902ef4e7bce9a9c2f6b358a5eb
type commit
tag v0.6.8
tagger Christophe Vu-Brugier <cvubrugier@example.org> 1588947171 +0200

target-isns v0.6.8
gpg: Signature made Fri May  8 14:13:16 2020 UTC
gpg:                using RSA key 81D01C4E399FFEDEDBD93D7F08390B7DF2FC1876
gpg:                issuer "cvubrugier@example.org"
gpg: Good signature from "Christophe Vu-Brugier <cvubrugier@example.org>" [unknown]
gpg: WARNING: This key is not certified with a trusted signature!
gpg:          There is no indication that the signature belongs to the owner.
Primary key fingerprint: 81D0 1C4E 399F FEDE DBD9  3D7F 0839 0B7D F2FC 1876
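
The visual check above can also be scripted. The following is a minimal sketch that relies on the machine-readable status lines that git verify-tag --raw prints on standard error. The function names are hypothetical. The sketch only looks for a VALIDSIG line carrying the expected fingerprint (uppercase, without spaces); it does not check key expiry or revocation.

```shell
# Hypothetical helper: read GnuPG status lines on stdin and succeed if one
# VALIDSIG line carries the expected fingerprint, either as the signing key
# (third field) or as the primary key (last field).
validsig_matches() {
    awk -v fpr="$1" '
        $1 == "[GNUPG:]" && $2 == "VALIDSIG" &&
        (($3 "") == fpr || ($NF "") == fpr) { found = 1 }
        END { exit !found }'
}

# Hypothetical helper: $1 is the tag name, $2 the expected fingerprint.
tag_signed_by() {
    git verify-tag --raw "$1" 2>&1 | validsig_matches "$2"
}
```

With the key imported, the check for this release would be:

$ tag_signed_by v0.6.8 81D01C4E399FFEDEDBD93D7F08390B7DF2FC1876 && echo "tag signed by the expected key"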

The tag signature is correct. We build a release tarball and compare its checksum with the published release tarball.

$ mkdir build
$ cd build/
$ cmake ..
-- The C compiler identification is GNU 12.2.0
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working C compiler: /usr/bin/cc - skipped
-- Detecting C compile features
-- Detecting C compile features - done
-- Configuring done
-- Generating done
-- Build files have been written to: /target-isns/build
$ make dist
Built target dist
$ sha256sum target-isns-0.6.8.tar.xz
544de09a2073242b21f6859841a5016c79c4006a53435a79b5cfc6602a59db97  target-isns-0.6.8.tar.xz

The checksum (544de09a) of the generated tarball matches the checksum of the published tarball: we have reproduced the release tarball.
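
Comparing two long hexadecimal strings by eye is error-prone, so the comparison can be left to the shell. This is a small sketch; the function name same_sha256 is hypothetical.

```shell
# Hypothetical helper: succeed only if both files have the same SHA256 digest.
# Returns 2 when one of the files cannot be read.
same_sha256() {
    [ -r "$1" ] && [ -r "$2" ] || return 2
    a=$(sha256sum < "$1" | cut -d' ' -f1)
    b=$(sha256sum < "$2" | cut -d' ' -f1)
    [ "$a" = "$b" ]
}
```

Assuming the published tarball was saved in a hypothetical ~/Downloads directory and that we are still in the build directory:

$ same_sha256 ~/Downloads/target-isns-0.6.8.tar.xz target-isns-0.6.8.tar.xz && echo reproduced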

I reproduced the release tarball on Debian 12 (bookworm). Does that work with other Linux distributions?

Reproducing the release tarball from other Linux distributions

I use Podman to build the release tarball in several Linux containers.

Linux distribution                 Release tarball reproduced  Git version  XZ version
Alpine 3.19.1                      Yes                         2.43.0       5.4.5
Arch Linux                         No                          2.44.0       5.6.1
Debian 11 (bullseye)               Yes                         2.30.2       5.2.5
Debian 12 (bookworm)               Yes                         2.39.2       5.4.1
Fedora 39                          Yes                         2.44.0       5.4.4
Ubuntu 22.04 (jammy)               Yes                         2.34.1       5.2.5
Ubuntu 24.04 (noble, pre-release)  Yes                         2.43.0       5.4.5
Rocky Linux 8                      Yes                         2.39.3       5.2.4
Rocky Linux 9                      Yes                         2.39.3       5.2.5
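
One row of such a table could be produced along the following lines. This is a sketch, not the exact commands I ran: the function name and the PODMAN override variable are hypothetical, and the image is assumed to already provide git, cmake, make, a C compiler, and xz, which stock distribution images do not.

```shell
# Hypothetical helper: rebuild the v0.6.8 tarball inside a container and
# print its SHA256 checksum. Set PODMAN=docker to use Docker instead.
repro_in_container() {
    image="$1"
    ${PODMAN:-podman} run --rm "$image" sh -c '
        git clone --quiet --branch v0.6.8 https://github.com/open-iscsi/target-isns.git /tmp/target-isns &&
        mkdir /tmp/target-isns/build &&
        cd /tmp/target-isns/build &&
        cmake .. > /dev/null &&
        make dist > /dev/null &&
        sha256sum target-isns-0.6.8.tar.xz'
}
```

The printed checksum can then be compared with the checksum of the published tarball.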

The release tarball is reproducible everywhere except on Arch Linux. The output of git archive is the same, but XZ compresses the tarball differently. A binary search with git bisect finds the commit that introduced the change.

XZ (version 5.6.1) on Arch Linux compresses data with several threads by default. Passing --threads=1 forces XZ to compress with a single thread, which lets us reproduce the release tarball of target-isns on Arch Linux.
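
As a sketch, pinning the thread count in the compression step looks like this. The function name is hypothetical, and the archive prefix in the usage line is an assumption about how the tarball is laid out.

```shell
# Hypothetical helper: compress stdin to stdout with XZ using a single
# thread, regardless of the default of the installed XZ version.
xz_single_thread() {
    xz --threads=1
}
```

$ git archive --format=tar --prefix=target-isns-0.6.8/ v0.6.8 | xz_single_thread > target-isns-0.6.8.tar.xz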

Usually, there are different ways for a compressor to represent the original data. That's why a compressor's output may differ when its algorithm or parameters change. To reduce the risk of compressed tarballs varying across platforms, target-isns releases could switch to a compressor that is less likely to change, such as gzip.
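
If releases did switch to gzip, one detail would matter: gzip can record a file name and a timestamp in its header, and the -n option leaves them out. The sketch below assumes GNU gzip; the function name is hypothetical, and the archive prefix in the usage line is an assumption.

```shell
# Hypothetical helper: compress stdin to stdout with gzip, without storing
# a file name or a timestamp in the gzip header.
gzip_reproducible() {
    gzip -n
}
```

$ git archive --format=tar --prefix=target-isns-0.6.8/ v0.6.8 | gzip_reproducible > target-isns-0.6.8.tar.gz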

Future work

While existing release tarballs of target-isns are reproducible, we should improve the release process.

  1. Document the release process. A document should describe how releases are made and how to reproduce them.
  2. Set up a continuous integration pipeline to generate the release tarball from a tag. The job should run in a container (for example a Docker image), so that users can reproduce the release tarball by running the same container image on their machine.
  3. Sign the release tarball with GnuPG. The maintainer should download the tarball generated by the release pipeline, reproduce it, and test it. Then, they should publish the release tarball together with a SHA256 checksum file, both signed with their private GnuPG key.
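
The last step could be sketched as follows. The function names and the SHA256SUMS file name are hypothetical, and signing assumes that a private GnuPG key is available on the maintainer's machine.

```shell
# Hypothetical helper: record the SHA256 checksum of each given file
# in a file named SHA256SUMS in the current directory.
write_checksums() {
    sha256sum -- "$@" > SHA256SUMS
}

# Hypothetical helper: create an ASCII-armored detached signature (.asc)
# next to each given file and next to SHA256SUMS.
sign_release() {
    for f in "$@" SHA256SUMS; do
        gpg --armor --detach-sign --output "$f.asc" "$f" || return 1
    done
}
```

Users would then check the published files with sha256sum --check SHA256SUMS and gpg --verify target-isns-0.6.8.tar.xz.asc target-isns-0.6.8.tar.xz.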

A crab spider (Misumena vatia) caught a bee.

  1. I removed my email from the command output.