Category: Debian

New software: sicherboot


Today, I wrote sicherboot, a tool that integrates systemd-boot into a Linux distribution in an entirely new way: with Secure Boot support. To be precise, the use case is to only run trusted code, which then unlocks an otherwise fully encrypted disk, as in my setup.


If you want, sicherboot automatically creates db, KEK, and PK keys, and puts the public keys on your EFI System Partition (ESP) together with the KeyTool binary, so you can enroll the keys in your UEFI setup. You can of course also use other keys; you just need to drop a db.crt and a db.key file into /etc/sicherboot/keys. It would be nice if sicherboot could enroll the keys directly from Linux, but there seems to be a bug in efitools preventing that at the moment. For some background: the Platform Key (PK) signs the Key Exchange Key (KEK), which signs the database key (db). The db key is the one that signs binaries.

sicherboot also handles installing new kernels to your ESP. For this, it combines the kernel and its initramfs into one executable UEFI image and then signs that. Combined with a fully encrypted disk setup, this ensures that only UEFI binaries signed by you can run on the system, and attackers cannot boot any other operating system or modify parts of your operating system (except for, well, any block of your encrypted data, as XTS does not authenticate the data; but then they have to know which blocks are which, which is somewhat hard).

sicherboot integrates with various parts of Debian: it works with dracut via an evil hack (diverting dracut’s kernel/postinst.d config file so that sicherboot runs after dracut), it should support initramfs-tools (untested), and it also integrates with systemd upgrades via triggers on the /usr/lib/systemd/boot/efi directory.

Currently, sicherboot only supports Debian-style setups with /boot/vmlinuz-<version> and /boot/initrd.img-<version> files; it cannot yet automatically create combined boot images from, or install boot loader entries for, other naming schemes. Fixing that should be trivial, though, with a configuration setting and some eval magic (or string substitution).

Future planned features include: (1) support for multiple ESPs, so you can have a fallback partition on a different drive (think RAID-type situations: keep one ESP on each drive, so you can remove a failing one); and (2) a tool to create a self-contained rescue disk image from a directory (which will act as the initramfs) and a kernel (falling back to a vmlinuz file).

It might also be interesting to add support for other bootloaders and setups, so you could automatically sign a grub cryptodisk image for example. Not sure how much sense that makes.

I published the source (MIT licensed) and uploaded the package to Debian; it should enter the NEW queue soon (or be in NEW by the time you read this). Give it a try, and let me know what you think.

apt 1.3 RC4 – Tweaking apt update

Has this ever happened to you: you run apt update, it fetches a Release file, then starts fetching DEP-11 metadata, then any pdiff index stuff, and then applies them, all one after another? Or this: you don’t see any update progress until very near the end? Worry no more: I tweaked things a bit in 1.3~rc4 (git commit).

Prior to 1.3~rc4, acquiring the files for an update worked like this: we create an item for the Release file, and once the Release file is done we queue the next items (DEP-11 icons, .diff/Index files, etc.). There is no prioritization, so we usually fetch the 5 MB+ DEP-11 icons and components files first, and only then start working on other indices that might use pdiffs.

In 1.3~rc4 I changed the queues to be priority queues: Release files and .diff/Index files have the highest priority (once we have them all, we know how much there is to fetch). The second priority level goes to the .pdiff files, which are later passed to the rred process to patch an existing Packages, Sources, or Contents file. The third priority level is taken by all other index targets.
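
To make the idea concrete, here is a tiny C++ sketch of prioritized fetch queuing; the FetchItem type and the priority names are invented for this illustration and are not APT’s actual acquire classes:

    // Illustrative sketch of prioritized fetch queuing; FetchItem and the
    // priority names are invented for this example, not APT's acquire classes.
    #include <iostream>
    #include <queue>
    #include <string>
    #include <vector>

    enum class Priority { OtherIndex = 0, Pdiff = 1, ReleaseAndDiffIndex = 2 };

    struct FetchItem {
        std::string uri;
        Priority prio;
    };

    struct ByPriority {
        bool operator()(const FetchItem &a, const FetchItem &b) const {
            return a.prio < b.prio;  // highest priority value comes out first
        }
    };

    int main() {
        std::priority_queue<FetchItem, std::vector<FetchItem>, ByPriority> queue;
        queue.push({"dists/sid/InRelease", Priority::ReleaseAndDiffIndex});
        queue.push({"main/dep11/icons-64x64.tar.gz", Priority::OtherIndex});
        queue.push({"main/binary-amd64/Packages.diff/Index", Priority::ReleaseAndDiffIndex});
        queue.push({"main/binary-amd64/Packages.diff/2016-09-01.gz", Priority::Pdiff});

        // Prints Release/diff Index entries first, then pdiffs, then the rest.
        while (!queue.empty()) {
            std::cout << queue.top().uri << "\n";
            queue.pop();
        }
    }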

I actually implemented the priority queues back in June. There was just one tiny problem: pipelining. We might be inserting elements into our fetch queues in priority order, but with pipelining enabled, items of lower priority might already have their HTTP requests sent before we even get to queue the higher-priority ones.

Today I had an epiphany: we fill the pipeline up to a certain number of items (the depth, currently 10). So, let’s only fill the pipeline with items that have the same (or a higher) priority as the maximum priority of the already-queued ones, and pretend it is full when we only have lower-priority items left.
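
A rough sketch of that decision, with made-up names rather than the real code in APT’s http method:

    // Sketch of the "pretend the pipeline is full" trick with made-up names;
    // the real logic lives in APT's http method.
    #include <algorithm>
    #include <cassert>
    #include <cstddef>
    #include <deque>

    struct Request { int priority; /* plus URI, headers, ... */ };

    constexpr std::size_t PipelineDepth = 10;

    // May 'next' be appended to the requests already sent on this connection?
    bool CanPipeline(const std::deque<Request> &pipeline, const Request &next) {
        if (pipeline.size() >= PipelineDepth)
            return false;                               // genuinely full
        if (pipeline.empty())
            return true;
        int maxQueued = std::max_element(pipeline.begin(), pipeline.end(),
                                         [](const Request &a, const Request &b) {
                                             return a.priority < b.priority;
                                         })->priority;
        // Only allow items of the same or higher priority; otherwise pretend
        // the pipeline is full so the lower-priority item has to wait.
        return next.priority >= maxQueued;
    }

    int main() {
        std::deque<Request> pipeline{{2}, {2}};         // two high-priority requests in flight
        assert(CanPipeline(pipeline, Request{2}));      // same priority: send it
        assert(!CanPipeline(pipeline, Request{1}));     // lower priority: wait
    }

The effect is that a lower-priority request is never pipelined ahead of higher-priority ones; it simply waits until the pipeline drains.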

And that works fine: first the Release and .diff/Index files are fetched, which means we can show accurate progress info from there on. Next, the pdiff files are fetched, meaning that we can apply them while later targets (think DEP-11 icon tarballs) are still downloading.

This has a great effect on performance: for the 01 Sep 2016 03:35:23 UTC -> 02 Sep 2016 09:25:37 update of Debian unstable and testing with Contents and appstream for amd64 and i386, the update time went down from 37 seconds to 24-28 seconds.


In other news

I recently cleaned up the apt packaging which renamed /usr/share/bug/apt/script to /usr/share/bug/apt. That broke on overlayfs, because dpkg could not rename the old apt directory to a backup name during unpack (only directories purely on the upper layer can be renamed). I reverted that now, so all future updates should be fine.

David re-added the Breaks against apt-utils I recently removed by accident during the cleanup, so no more errors about overriding dump solvers. He also added support for fingerprints in gpgv’s GOODSIG output, which apparently might come at some point.

I also fixed a few CMake issues, fixed the test suite for gpgv 2.1.15, allowed building with a system-wide gtest library (we really ought to add back a pre-built one in Debian), and modified debian/rules to pass -O to make. I wish debhelper would do the latter automatically (there’s a bug for that).

Finally, we fixed some uninitialized variables in the base256 code, out-of-bound reads in the Sources file parser, off-by-one errors in the tagfile comment stripping code[1], and some memcpy() calls with length 0. Most of these will be cherry-picked into the 1.2 (xenial) and jessie branches (release 1.2.15 on the former). If you forked off your version of apt at another point, you might want to do the same.

[1] those were actually causing the failures and segfaults in the unit tests on hurd-i386 buildds. I always thought it was a hurd-specific issue…

PS. Building for Fedora on OBS has a weird socket fd #3 that does not get closed during the test suite despite us setting CLOEXEC on it. Join us in #debian-apt on oftc if you have ideas.

Porting APT to CMake

Ever since its creation back in the dark ages, APT has shipped with its own build system consisting of autoconf and a bunch of makefiles. In 2009, I felt like replacing that with something more standard, and because nobody really liked autotools, decided to go with CMake. Well, that bazaar branch was never really merged back in 2009.

Fast forward 7 years to 2016. A few months ago, we noticed that our build system had trouble with correct dependencies in parallel builds. So, in search of a way out, I picked up my CMake branch from 2009 last Thursday and spent the whole weekend working on it, and today I am happy to announce that I merged it into master:

123 files changed, 1674 insertions(+), 3205 deletions(-)

That is more than 1,500 fewer lines of build system code. Quite impressive, eh? This also includes about 200 fewer lines in debian/, as that switched from prehistoric debhelper stuff to modern dh (compat level 9, almost ready for 10).

The annoying Tale of Targets vs Files

Talking about CMake: I don’t really love it. As you might know, CMake differentiates between targets and files. Targets can in some cases depend on files (generated by a command in the same directory), but overall files are not really targets. You also cannot have a target with the same name as a file you are generating in a custom command; you have to rename your target (make is OK with the generated stuff, but ninja complains about cycles because your custom target and your custom command have the same name).

Byproducts for the (time) win

One interesting thing about CMake and Ninja is byproducts. In our tree, we are building C++ files. We also have .pot templates depending on them, and .mo files depending on the templates (we have multiple domains, and merge the per-domain .pot with the all-domain .po file during the build to get a per-domain .mo). Now, if we just let them depend on each other naively, changing a C++ file causes the .pot file to be regenerated, which in turn causes us to build .mo files for every freaking language in the package. Even if nothing changed.

Byproducts solve this problem. Instead of just building the .pot file, we also create a stamp file (AKA the witness), write the .pot file (without a header) to a temporary name, and only copy it to its final name if the content changed. The .pot file is declared as a byproduct of the command.

The command doing the .pot -> .mo step still depends on the .pot file (the byproduct), but as that now only changes if strings change, the .mo files only get rebuilt if I change a translatable string. We still need to ensure that the .pot file is actually built before we try to use it – the solution here is to specify a custom target depending on the witness and then have the target containing the .mo build commands depend on that target.

Now, if you use make, you might know this trick already. In make, the byproducts remain undeclared, though, while in CMake we can now actually express them, and they are used by the Ninja generator and the Ninja build tool if you choose that over make (try it out, it’s fast).

Further Work

Some command names are hardcoded; I should find_program() them. Also, cross-building the package does not work successfully yet, but it only requires a few tiny patches in debhelper and/or CMake.

I also tried building the package on a Fedora docker image (with dpkg installed; it’s available in the Fedora sources). While I could eventually get the programs built and most of the integration test suite to pass, there are some minor issues to fix, mostly in the documentation building and GTest department: Fedora ships its docbook stylesheets in a different location, and ships GTest as a pre-compiled library rather than a source tree.

I have not yet tested building on exotic platforms like macOS, or even a BSD. Please do and report back. On Debian’s non-Linux ports, the CMake build currently fails due to test suite failures; I hope those can be fixed or disabled soon (it appears to be a timing issue, AFAICT).

I hope that we eventually get some non-Debian backends for APT. I’d love that.

Clarifications and updates on APT + SHA1

The APT 1.2.7 release is out now.

Despite what I wrote earlier, we now print warnings for Release files signed with signatures using SHA1 as the digest algorithm. This involved extending the protocol APT uses to communicate with its methods a bit, by adding a new 104 Warning message type.

W: gpgv:/var/lib/apt/lists/apt.example.com_debian_dists_sid_InRelease: The repository is insufficiently signed by key
1234567890ABCDEF0123456789ABCDEF01234567 (weak digest)
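
For illustration, here is a hedged sketch of how an acquire method might emit such a message over APT’s line-based method protocol (a numeric status line, header fields, and a terminating blank line on stdout); the exact fields accompanying 104 Warning are my assumption, not a protocol reference:

    // Hedged sketch of an acquire method emitting the new warning over APT's
    // line-based method protocol (status line, header fields, blank line, all
    // on stdout).  The exact fields sent with 104 Warning are my assumption.
    #include <iostream>
    #include <string>

    void SendWarning(const std::string &uri, const std::string &message) {
        std::cout << "104 Warning\n"
                  << "URI: " << uri << "\n"
                  << "Message: " << message << "\n"
                  << "\n" << std::flush;
    }

    int main() {
        SendWarning("https://apt.example.com/debian/dists/sid/InRelease",
                    "The repository is insufficiently signed by key "
                    "1234567890ABCDEF0123456789ABCDEF01234567 (weak digest)");
    }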

Also note that SHA1 support is not dropped; we merely do not consider it trustworthy. It feels like SHA1 support is dropped, because sources without SHA2 won’t work; but the SHA1 hashes will still be used in addition to the SHA2 ones, so there’s no point removing them (same for MD5Sum fields).

We also fixed some small bugs!

Dropping SHA-1 support in APT

Tomorrow, on the anniversary of Caesar’s assassination, APT will see a new release, turning off support for SHA-1 checksums in Debian unstable and in Ubuntu xenial, the upcoming LTS release. While I have no knowledge of an imminent attack on our use of SHA1, Xenial (Ubuntu 16.04 LTS) will be supported for 5 years, and the landscape may change a lot in that time. As disabling SHA1 support requires a bit of patching in our test suite, it’s best to do that now rather than later, when we’re forced to do it.

This means that starting tomorrow, some third-party repositories may stop working, such as the one for the web browser I am writing this in. Debian derivatives should be mostly safe from that change if they are registered in the derivatives census, as that has checks for this. This is a bit unfortunate, but we have no real choice: technical restrictions prevent us from just showing a warning in a sensible way.

There is one caveat, however: GPG signatures may still use SHA1. While I have prepared the needed code to reject SHA1-based signatures in APT, a lot of third-party repositories still ship Release files signed with signatures using SHA-1 as the digest algorithm. Some repositories even still use 1024-bit DSA keys.

I plan to enforce SHA2 for GPG signatures some time after the release of xenial, and definitely for Ubuntu 16.10, so around June-August (possibly during DebConf). For xenial, I plan to have an SRU (stable release update) in January to do the same (it’s just adding one member to an array). This should give third-party providers a reasonable time frame to migrate to a new digest algorithm for their GPG config and possibly a new repository key.


  • Tomorrow: Disabling SHA1 support for Release, Packages, Sources files
  • June/July/August: Disabling SHA1 support for GPG signatures (InRelease/Release.gpg) in development releases
  • January 2017: Disabling SHA1 support for GPG signatures in Ubuntu 16.04 LTS via a stable-release-update.

APT 1.1.8 to 1.1.10 – going “faster”

Not only do I keep incrementing version numbers faster than ever before, APT also keeps getting faster. And not only that: it also has some bugs fixed, and the cache is now checked with a hash when opening.

Important fix for 1.1.6 regression

Since APT 1.1.6, APT uses the configured xz compression level. Unfortunately, the default was set to 9, which requires 674 MiB of RAM, compared to the 94 MiB required at level 6.

This caused the test suite to fail on the Ubuntu autopkgtest servers, but I thought it was just a temporary hiccup on their part, and so did not look into it for the 1.1.7, 1.1.8, and 1.1.9 releases. When the Ubuntu servers finally failed with 1.1.9 again (they only started building again on Monday, it seems), I noticed something was wrong.

Enter git bisect. I created a script that compiles the APT source code and runs a test with ulimit for virtual and resident memory set to 512 (which worked in 1.1.5), let it run, and thus found the cause mentioned above.

The solution: APT now defaults to level 6.

New Features

APT 1.1.8 introduces /usr/lib/apt/apt-helper cat-file, which can be used to read files compressed by any compressor understood by APT. It is used in the recent apt-file experimental release, and prepares us for a future in which files on disk might be compressed with a different compressor (such as LZ4 for Contents files; this will improve rred speed on them by a factor of 7).

David added a feature that enables servers to advertise that they do not want APT to download and use some Architecture: all contents when they include all in their list of architectures. This is to allow archives to drop Architecture: all packages from the architecture-specific content files, to avoid redundant data and (thus) improve the performance of apt-file.

Buffered writes

APT 1.1.9 introduces buffered writing for rred, reducing its runtime by about 50% on a slowish SSD, and maybe more on HDDs. The 1.1.9 release is a bit buggy and might mess things up when a write syscall is interrupted; this is fixed in 1.1.10.
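
The gist of the fix can be sketched as a stand-alone write buffer that retries interrupted write() calls and handles short writes instead of assuming the whole buffer went out; this is illustrative only, not APT’s actual FileFd code:

    // Stand-alone sketch of a write buffer that survives interrupted and short
    // write() calls; illustrative only, not APT's actual FileFd code.
    #include <cerrno>
    #include <cstddef>
    #include <string>
    #include <unistd.h>

    class BufferedWriter {
        int fd;
        std::string buffer;
        static constexpr std::size_t FlushThreshold = 64 * 1024;

        bool FlushAll() {
            std::size_t done = 0;
            while (done < buffer.size()) {
                ssize_t n = ::write(fd, buffer.data() + done, buffer.size() - done);
                if (n < 0) {
                    if (errno == EINTR)
                        continue;       // interrupted: retry, do not lose data
                    return false;
                }
                done += static_cast<std::size_t>(n);    // short write: keep going
            }
            buffer.clear();
            return true;
        }

    public:
        explicit BufferedWriter(int fd) : fd(fd) {}
        ~BufferedWriter() { FlushAll(); }

        bool Write(const char *data, std::size_t size) {
            buffer.append(data, size);
            return buffer.size() < FlushThreshold ? true : FlushAll();
        }
    };

    int main() {
        BufferedWriter out(STDOUT_FILENO);
        for (int i = 0; i < 1000; ++i)
            out.Write("patched line\n", 13);
    }   // destructor flushes the rest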

Cache generation improvements

APT 1.1.9 and APT 1.1.10 improve the cache generation algorithms in several ways: a lookup table was switched from std::map to std::unordered_map, an inline isspace_ascii() function was added, and tolower_ascii() was inlined; the latter two are tiny functions that are called a lot.
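
These helpers are trivial, which is exactly why inlining them into the hot parsing loops pays off; a sketch along these lines (similar in spirit to, but not a verbatim copy of, the apt-pkg versions):

    // Sketch of ASCII-only helpers cheap enough to inline into hot parsing
    // loops; similar in spirit to, but not a copy of, the apt-pkg versions.
    #include <cassert>

    static inline bool isspace_ascii(int c) {
        return c == ' ' || c == '\t' || c == '\n' || c == '\v' || c == '\f' || c == '\r';
    }

    static inline int tolower_ascii(int c) {
        return (c >= 'A' && c <= 'Z') ? c + ('a' - 'A') : c;
    }

    int main() {
        assert(isspace_ascii('\t') && !isspace_ascii('x'));
        assert(tolower_ascii('P') == 'p' && tolower_ascii('-') == '-');
    }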

APT 1.1.10 also switches the cache’s hash function to the DJB hash function and increases the default hash table sizes to the smallest prime larger than 15000, namely 15013. This reduces the average bucket size from 6.5 to 4.5. We might increase this further in the future.
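
The DJB (djb2) hash is only a few lines; an illustrative sketch of hashing package names into one of the 15013 buckets, not the exact cache code:

    // Illustrative djb2-style string hash mapped onto a prime-sized bucket
    // table; the idea, not the exact code used for APT's cache.
    #include <cstdint>
    #include <iostream>
    #include <string>

    constexpr uint32_t HashTableSize = 15013;   // smallest prime > 15000

    static uint32_t djb_hash(const std::string &s) {
        uint32_t hash = 5381;
        for (unsigned char c : s)
            hash = hash * 33 + c;               // classic djb2 step
        return hash;
    }

    int main() {
        for (const char *name : {"apt", "libc6", "systemd-boot"})
            std::cout << name << " -> bucket " << djb_hash(name) % HashTableSize << "\n";
    }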

Checksum for the cache, but no more syncs

Prior to APT 1.1.10, writing the cache was a multi-step process:

  1. Write the cache to a temporary file with the dirty bit set to true
  2. Call fsync() to sync the cache
  3. Write a new header with the dirty bit set to false
  4. Call fsync() to sync the new header
  5. (Rename the temporary file to the target name)

The last step was obviously not needed: we can easily live with a cache whose dirty field is still set to true, as we can just rebuild it.

But what matters more is step 2. Synchronizing the entire 40 or 50 MB takes some time. On my HDD system, it consumed 56% of the entire cache generation time, and on my SSD system, it consumed 25% of the time.

APT 1.1.10 does not sync the cache at all. Instead, it embeds a hashsum (adler32, for performance reasons) in the cache. This ensures that no matter which parts of the cache actually made it to disk when something failed, we can still detect the failure with reasonable confidence (and we now catch even more errors than before).
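
The verification idea can be sketched with zlib’s adler32(); the header layout and the exact region being checksummed here are made up, not APT’s real cache format:

    // Sketch of checksumming a cache body with zlib's adler32() (link with -lz);
    // APT's real header layout and checksummed region differ.
    #include <cstdint>
    #include <iostream>
    #include <vector>
    #include <zlib.h>

    struct CacheHeader {
        uint8_t  dirty;
        uint32_t checksum;      // adler32 over everything after the header
    };

    static uint32_t Checksum(const std::vector<unsigned char> &body) {
        return adler32(adler32(0L, Z_NULL, 0), body.data(), body.size());
    }

    int main() {
        std::vector<unsigned char> body(40 * 1024 * 1024, 0x42);    // pretend cache body
        CacheHeader header{0, Checksum(body)};

        body[12345] ^= 0xff;                                        // simulate a torn write
        bool ok = header.dirty == 0 && header.checksum == Checksum(body);
        std::cout << (ok ? "cache ok\n" : "cache corrupted, rebuilding\n");
    }

The point is that a partially written cache either still claims to be dirty or fails the checksum, so it is never trusted.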

This means that cache generation is now much faster for a lot of people. On the bad side, commands like apt-cache show that previously took maybe 10 ms to execute can now take about 80 ms.

Please report back on your performance experience with the 1.1.10 release; I’m very interested to see whether it works reasonably well for other people. And if you have any other ideas for how to solve the issue, I’d be interested to hear them (all data needs to be written before the header with dirty=0 is written, but we don’t want to sync the data).

Future work

We seem to have a lot of temporary (?) std::string objects during cache generation, accounting for about 10% of the run time. I’m thinking of introducing a string_view class similar to the one proposed for C++17 and making use of that.
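
A minimal sketch of such a non-owning view, far less complete than the eventual std::string_view and not whatever class APT ends up using internally:

    // Minimal non-owning string view: a sketch of the idea, far less complete
    // than std::string_view (C++17) or whatever class APT adopts internally.
    #include <cstring>
    #include <iostream>
    #include <string>

    class StringView {
        const char *data_;
        std::size_t size_;
    public:
        StringView(const char *s, std::size_t n) : data_(s), size_(n) {}
        StringView(const std::string &s) : data_(s.data()), size_(s.size()) {}

        std::size_t size() const { return size_; }
        const char *data() const { return data_; }

        bool operator==(StringView o) const {
            return size_ == o.size_ && std::memcmp(data_, o.data_, size_) == 0;
        }
        std::string to_string() const { return std::string(data_, size_); }
    };

    int main() {
        const char *line = "Package: apt\n";
        StringView field(line, 7);              // refers to "Package" without copying
        std::cout << (field == StringView("Package", 7) ? "match\n" : "no match\n");
    }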

I also thought about calling posix_fadvise() before starting to parse files, but the cache generation process does not seem to spend a lot of its time in system calls (even with all caches dropped before the run), so I don’t think this will improve things.

If anyone has some other suggestions or patches for performance stuff, let me know.

Much faster incremental apt updates

APT’s performance in applying pdiff files, which are the diff format used for Packages, Sources, and other files in the archive, has been slow.

Improving performance for uncompressed files

The reason for this is that our I/O was unbuffered, and we were reading one byte at a time in order to read lines. This changed on December 24, when I added read buffering for reading lines, vastly improving the performance of rred.

But it was still slow, so today I profiled – using gperftools – the rred method running on a 430 MB uncompressed Contents file with a 75 KB patch. I noticed that our ReadLine() method was calling some method which took a long time (google-pprof told me it was some _nss method, but that was wrong [thank you, addr2line]).

Looking further into the code, I noticed that we set the length of the buffer to the length of the line, and whenever we moved some data out of the buffer, we called memmove() to move the remaining data to the front of the buffer.

So, I tried to use a fixed buffer size of 4096 (commit). Now memmove() would spend less time moving memory around inside the buffer. This helped a lot, bringing the run time on my example file down from 46 seconds to about 2 seconds.

Later on, I rewrote the code to not use memmove() at all – opting for start and end variables instead; and increasing the start variable when reading from the buffer (commit).

This in turn further improved things, bringing it down to about 1.6 seconds. We could now increase the buffer size again, without any negative effect.
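
The shape of the final approach, as a stand-alone sketch rather than the actual rred/FileFd code: keep a fixed buffer with start and end indices, hand out lines from it, and only reset the indices once the buffer is drained, so no per-line memmove() is needed:

    // Stand-alone sketch of buffered line reading with start/end indices
    // instead of memmove(); illustrative only, not the actual rred/FileFd code.
    #include <cstdio>
    #include <cstring>
    #include <string>
    #include <unistd.h>

    class LineReader {
        int fd;
        char buffer[65536];
        std::size_t start = 0, end = 0;

        bool Refill() {
            if (start == end)
                start = end = 0;            // buffer drained: reset indices, no memmove
            ssize_t n = ::read(fd, buffer + end, sizeof(buffer) - end);
            if (n <= 0)
                return false;
            end += static_cast<std::size_t>(n);
            return true;
        }

    public:
        explicit LineReader(int fd) : fd(fd) {}

        // Returns false at end of input; accumulates long lines across refills.
        bool ReadLine(std::string &line) {
            line.clear();
            for (;;) {
                if (start == end && !Refill())
                    return !line.empty();
                char *nl = static_cast<char *>(std::memchr(buffer + start, '\n', end - start));
                if (nl != nullptr) {
                    line.append(buffer + start, nl - (buffer + start) + 1);
                    start = nl - buffer + 1;    // advance the index instead of moving memory
                    return true;
                }
                line.append(buffer + start, end - start);
                start = end;
            }
        }
    };

    int main() {
        LineReader reader(STDIN_FILENO);
        std::size_t lines = 0;
        for (std::string line; reader.ReadLine(line); )
            ++lines;
        std::printf("%zu lines\n", lines);
    }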

Effects on apt-get update

I measured the run time of apt-get update for the update from today’s 07:52 dinstall run to the 13:52 one. The configured sources are unstable and experimental with the amd64 and i386 architectures. The appstream and apt-file indexes are disabled for this test, so only Packages and Sources indexes are fetched.

The results are impressive:

  • For APT 1.1.6, updating with PDiffs enabled took 41 seconds.
  • For APT 1.1.7, updating with PDiffs enabled took 4 seconds.

That’s a tenfold increase in performance. By the way, running without PDiffs took 20 seconds, so there’s no reason not to use them.

Future work

Writes are still unbuffered, and account for about 75% to 80% of our runtime. That’s an obvious area for improvements.


Performance for patching compressed files

Contents files are usually compressed with gzip and kept compressed locally, because they are about 500 MB uncompressed and only 30 MB compressed. I profiled this, and it turns out there is not much we can do about it: the majority of the time is spent inside zlib, mostly combining CRC checksums.


Going forward, I think a reasonable option might be to recompress Contents files using lzo – they will be a bit bigger (50 MB instead of 30 MB), but lzo is about 6 times as fast (compressing a 430 MB Contents file took 1 second instead of 6).