APT 1.6 alpha 1 – seccomp and more

I just uploaded APT 1.6 alpha 1, introducing a very scary thing: Seccomp sandboxing for methods, the programs downloading files from the internet and decompressing or compressing stuff. With seccomp I reduced the number of system calls these methods can use to 149 from 430. Specifically we excluded most ways of IPC, xattrs, and most importantly, the ability for methods to clone(2), fork(2), or execve(2) (or execveat(2)). Yes, that’s right – methods can no longer execute programs.

This was a real problem, because the http method did in fact execute programs – there is this small option called ProxyAutoDetect or Proxy-Auto-Detect where you can specify a script to run for an URL and the script outputs a (list of) proxies. In order to be able to seccomp the http method, I moved the invocation of the script to the parent process. The parent process now executes the script within the sandbox user, but without seccomp (obviously).

I tested the code on amd64, ppc64el, s390x, arm64, mipsel, i386, and armhf. I hope it works on all other architectures libseccomp is currently built for in Debian, but I did not check that, so your apt might be broken now if you use powerpc, powerpcspe, armel, mips, mips64el, hhpa, or x32 (I don’t think you can even really use x32).

Also, apt-transport-https is gone for good now. When installing the new apt release, any installed apt-transport-https package is removed (apt breaks apt-transport-https now, but it also provides it versioned, so any dependencies should still be satisfiable).

David also did a few cool bug fixes again, finally teaching apt-key to ignore unsupported GPG key files instead of causing weird errors ūüôā

Advertisements

APT 1.5 is out

APT 1.5 is out, after almost 3 months the release of 1.5 alpha 1, and almost six months since the release of 1.4 on April 1st. This release cycle was unusually short, as 1.4 was the stretch release series and the zesty release series, and we waited for the latter of these releases before we started 1.5. In related news, 1.4.8 hit stretch-proposed-updates today, and is waiting in the unapproved queue for zesty.

This release series moves https support from apt-transport-https into apt proper, bringing with it support for https:// proxies, and support for autodetectproxy scripts that return http, https, and socks5h proxies for both http and https.

Unattended updates and upgrades now work better: The dependency on network-online was removed and we introduced a meta wait-online helper with support for NetworkManager, systemd-networkd, and connman that allows us to wait for network even if we want to run updates directly after a resume (which might or might not have worked before, depending on whether update ran before or after network was back up again). This also improves a boot performance regression for systems with rc.local files:

The rc.local.service unit specified After=network-online.target, and login stuff was After=rc.local.service, and apt-daily.timer was Wants=network-online.target, causing network-online.target to be pulled into the boot and the rc.local.service ordering dependency to take effect, significantly slowing down the boot.

An earlier less intrusive variant of that fix is in 1.4.8: It just moves the network-online.target Want/After from apt-daily.timer to apt-daily.service so most boots are uncoupled now. I hope we get the full solution into stretch in a later point release, but we should gather some experience first before discussing this with the release time.

Balint Reczey also provided a patch to increase the time out before killing the daily upgrade service to 15 minutes, to actually give unattended-upgrades some time to finish an in-progress update. Honestly, I’d have though the machine hung up and force rebooted it after 5 seconds already. (this patch is also in 1.4.8)

We also made sure that unreadable config files no longer cause an error, but only a warning, as that was sort of a regression from previous releases; and we added documentation for /etc/apt/auth.conf, so people actually know the preferred way to place sensitive data like passwords (and can make their sources.list files world-readable again).

We also fixed apt-cdrom to support discs without MD5 hashes for Sources (the Files field), and re-enabled support for udev-based detection of cdrom devices which was accidentally broken for 4 years, as it was trying to load libudev.so.0 at runtime, but that library had an SONAME change to libudev.so.1 – we now link against it normally.

Furthermore, if certain information in Release files change, like the codename, apt will now request confirmation from the user, avoiding a scenario where a user has stable in their sources.list and accidentally upgrades to the next release when it becomes stable.

Paul Wise contributed patches to allow configuring the apt-daily intervals more easily – apt-daily is invoked twice a day by systemd but has more fine-grained internal timestamp files. You can now specify the intervals in seconds, minutes, hours, and day units, or specify “always” to always run (that is, up to twice a day on systemd, once per day on non-systemd platforms).

Development for the 1.6 series has started, and I intent to upload a first alpha to unstable in about a week, removing the apt-transport-https package and enabling compressed index files by default (save space, a lot of space, at not much performance cost thanks to lz4). There will also be some small clean ups in there, but I don’t expect any life-changing changes for now.

I think our new approach of uploading development releases directly to unstable instead of parking them in experimental is working out well. Some people are confused why alpha releases appear in unstable, but let me just say one thing: These labels basically just indicate feature-completeness, and not stability. An alpha is just very likely to get a lot more features, a beta is less likely (all the big stuff is in), and the release candidates just fix bugs.

Also, we now have 3 active stable series: The 1.2 LTS series, 1.4 medium LTS, and 1.5. 1.2 receives updates as part of Ubuntu 16.04 (xenial), 1.4 as part of Debian 9.0 (stretch) and Ubuntu 17.04 (zesty); whereas 1.5 will only be supported for 9 months (as part of Ubuntu 17.10). I think the stable release series are working well, although 1.4 is a bit tricky being shared by stretch and zesty right now (but zesty is history soon, so …).

Why TUF does not shine (for APT repositories)

In DebConf17 there was a talk about The Update Framework, short TUF. TUF claims to be a plug-in solution to software updates, but while it has the same practical level of security as apt, it also has the same shortcomings, including no way to effectively revoke keys.

TUF divides signing responsibilities into roles: A root role, a targets rule (signing stuff to download), a snapshots rule (signing meta data), and a time stamp rule (signing a time stamp file). There also is a mirror role for signing a list of mirrors, but we can ignore that for now. It strongly recommends that all keys except for timestamp and mirrors are kept offline, which is not applicable for APT repositories – Ubuntu updates the repository every 30 minutes, imagine doing that with offline keys. An insane proposal.

In APT repositories, we effectively only have a snapshots rule – the only thing we sign are Release files, and trust is then chained down by hashes (Release files hashes Packages index files, and they have hashes of individual packages). The keys used to sign repositories are online keys, after all, all the metadata files change every 30 minutes (Ubuntu) or 6 hours (Debian) – it’s impossible to sign them by hand. The timestamp role is replaced by a field in the Release file specifying until when the Release file is considered valid.

Let’s check the attacks TUF protects again:

  • Arbitrary installation attacks. – We protect against that with the outer signature and hashes
  • Endless data attacks. – Yes, we impose a limit on Release files (the sizes of other files are specified in there and this file is signed)
  • Extraneous dependencies attacks – That’s verified by the signed hashes of Packages files
  • Fast-forward attacks – same
  • Indefinite freeze attacks – APT has a Valid-Until field that can be used to specify a maximum life time of a release file
  • Malicious mirrors preventing updates. – Well, the user configures the mirror, so usually not applicable. if the user has multiple mirrors, APT deals with that fine
  • Mix-and-match attacks – Again, signed Release file and hashes of other files
  • Rollback attacks – We do not allow Date fields in Release files to go backwards
  • Slow retrieval attacks – TUF cannot protect against that either. APT has very high timeouts, and there is no reasonable answer to that.
  • Vulnerability to key compromises – For our purposes where we need all repository signing keys to be online, as we need to sign new releases and metadata fairly often, it does not make it less vulnerable to require a threshold of keys (APT allows repositories to specify concrete key ids they may be signed with though, that has the same effect)
  • Wrong software installation. – Does not happen, the .deb files are hashed in the Packages files which are signed by the release file

As we can see, APT addresses all attacks TUF addresses.

But both do not handle key revocation. So, if a key & mirror gets compromised (or just key and the mirror is MITMed), we cannot inform the user that the key has been compromised and block updates from the compromised repository.

I just wrote up a proposal to allow APT to query for revoked keys from a different host with a key revocation list (KRL) file that is signed by different keys than the repository. This would solve the problem of key revocation easily – even if the repository host is MITMed or compromised, we can still revoke the keys signing the repository from a different location.

jak-linux.org moved / backing up

In the past two days, I moved my main web site jak-linux.org (and jak-software.de) from a very old contract at STRATO over to something else: The domains are registered with INWX¬†and the hosting is handled by uberspace.de. Encryption is provided by¬†Let’s Encrypt.

I requested the domain transfer from STRATO on Monday at 16:23, received the auth codes at 20:10 and the .de domain was transferred completely on 20:36 (about 20 minutes if you count my overhead). The .org domain I had to ACK, which I did at 20:46 and at 03:00 I received the notification that the transfer was successful (I think there was some registrar ACKing involved there). So the whole transfer took about 10 1/2 hours, or 7 hours since I retrieved the auth code. I think that’s quite a good time ūüôā

And, for those of you who don’t know: uberspace is a shared hoster that basically just gives you an SSH shell account, directories for you to drop files in for the http server, and various tools to add subdomains, certificates, virtual users to the mailserver. You can also run your own custom build software and open ports in their firewall. That’s quite cool.

I’m considering¬†migrating the blog away from wordpress at some point in the future – having a more integrated experience is a bit nicer than having my web presence split over two sites. I’m unsure if I shouldn’t add something like cloudflare there – I don’t want to overload the servers (but I only serve static pages, so how much load is this really going to get?).

in other news: off-site backups

I also recently started doing offsite backups via borg to a server operated by the wonderful rsync.net. For those of you who do not know rsync.net: You basically get SSH to a server where you can upload your backups via common tools like rsync, scp, or you can go crazy and use git-annex, borg, attic; or you could even just plain zfs send your stuff there.

The normal price is $0.08 per GB per month, but there is a special borg price of $0.03 (that price does not include snapshotting or support, really). You can also get a discounted normal account for $0.04 if you find the correct code on Hacker News, or other discounts for open source developers, students, etc. Рyou just have to send them an email.

Finally, I must say that uberspace and rsync.net feel similar in spirit. Both heavily emphasise the command line, and don’t really have any fancy click stuff. I like that.

Starting the faster, more secure APT 1.4 series

We just released¬†the first beta of APT 1.4 to Debian unstable (beta here means that we don’t know any other big stuff to add to it, but are still open to further extensions). This is the¬†release series that will be released with Debian stretch, Ubuntu zesty, and possibly Ubuntu zesty+1 (if the Debian freeze takes a very long time, even zesty+2 is possible). It should reach the master archive in a few hours, and your mirrors shortly after that.

Security changes

APT 1.4 by default disables support for repositories signed with SHA1 keys. I announced back in January that it was my intention to do this during the summer for development releases, but I only remembered the Jan 1st deadline for stable releases supporting that (APT 1.2 and 1.3), so better late than never.

Around¬†January 1st, the same or a similar change will occur in the APT 1.2 and 1.3 series in Ubuntu 16.04 and 16.10 (subject to approval by Ubuntu’s release team). This should mean that repository provides¬†had about one year to fix their repositories, and more than 8 months since the release of 16.04. I believe that 8 months is a reasonable time frame to upgrade a repository signing key, and hope that providers¬†who have not updated their repositories yet will do so as soon as possible.

Performance work

APT 1.4 provides a 10-20% performance increase in cache generation (and according to callgrind, we went from approx 6.8 billion to 5.3 billion instructions for my laptop’s configuration, a reduction of more than 21%). The major improvements are:

We switched the parsing of Deb822 files (such as Packages files) to my perfect hash function TrieHash. TrieHash Рwhich generates C code from a set of words Рis about equal or twice as fast as the previously used hash function (and two to three times faster than gperf), and we save an additional 50% of that time as we only have to hash once during parsing now, instead of during look up as well. APT 1.4 marks the first time TrieHash is used in any software. I hope that it will spread to dpkg and other software at a later point in time.vendors.

Another important change was to drop normalization of Description-MD5 values, the fields mapping a description in a Packages files to a translated description. We used to parse the hex digits into a native binary stream, and then compared it back to hex digits for comparisons, which cost us about 5% of the run time performance.

We also optimized one of our hash functions Рthe VersionHash that hashes the important fields of a package to recognize packages with the same version, but different content Рto not normalize data to a temporary buffer anymore. This buffer has been the subject of some bugs (overflow, incompleteness) in the recent past, and also caused some slowdown due to the additional writes to the stack. Instead, we now pass the bytes we are interested in directly to our CRC code, one byte at a time.

There were also some other micro-optimisations: For example, the hash tables in the cache used to be ordered by standard compare (alphabetical followed by shortest). It is now ordered by size first, meaning we can avoid data comparisons for strings of different lengths. We also got rid of a std::string that cannot use short string optimisation in a hot path of the code. Finally, we also converted our case-insensitive djb hashes to not use a normal tolower_ascii(), but introduced tolower_ascii_unsafe() which just sets the “lowercase bit” (| 0x20) in the character.

Others

  • Sandboxing now removes some environment variables like TMP from the environment.
  • Several improvements to installation ordering.
  • Support for armored GPG keys in trusted.gpg.d.
  • Various other fixes

For a more complete overview of all changes, consult the changelog.

Introducing DNS66, a host blocker for Android

ic_launcher

I’m proud (yes, really) to announce DNS66, my host/ad blocker for Android 5.0 and newer. It’s been around since last Thursday on F-Droid, but it never really got a formal announcement.

DNS66 creates a local VPN service on your Android device, and diverts all DNS traffic to it, possibly adding new DNS servers you can configure in its UI. It can use hosts files for blocking whole sets of hosts or you can just give it a domain name to block (or multiple hosts files/hosts). You can also whitelist individual hosts or entire files by adding them to the end of the list. When a host name is looked up, the query goes to the VPN which looks at the packet and responds with NXDOMAIN (non-existing domain) for hosts that are blocked.

You can find DNS66 here:

F-Droid is the recommended source to install from. DNS66 is licensed under the GNU GPL 3, or (mostly) any later version.

Implementation Notes

DNS66’s core logic is based on another project, ¬†dbrodie/AdBuster, which arguably has the cooler name. I translated that from Kotlin to Java, and cleaned up the implementation a bit:

All work is done in a single thread by using poll() to detect when to read/write stuff. Each¬†DNS request is sent via a new UDP socket, and poll() polls over all UDP sockets, a Device Socket (for the VPN’s tun device) and a pipe (so we can interrupt the poll at any time by closing the pipe).

We literally redirect your DNS servers. Meaning if your DNS server is 1.2.3.4, all traffic to 1.2.3.4 is routed to the VPN. The VPN only understands DNS traffic, though, so you might have trouble if your DNS server also happens to serve something else. I plan to change that at some point to emulate multiple DNS servers with fake IPs, but this was a first step to get it working with fallback: Android can now transparently fallback to other DNS servers without having to be aware that they are routed via the VPN.

We also need to deal with timing out queries that we received no answer for: DNS66 stores the query into a LinkedHashMap and overrides the removeEldestEntry() method to remove the eldest entry if it is older than 10 seconds or there are more than 1024 pending queries. This means that it only times out up to one request per new request, but it eventually cleans up fine.

 

Introducing TrieHash, a order-preserving minimal perfect hash function generator for C(++)

Abstract

I introduce TrieHash an algorithm for constructing perfect hash functions from tries. The generated hash functions are pure C code, minimal, order-preserving and outperform existing alternatives. Together with the generated header files,they can also be used as a generic string to enumeration mapper (enums are created by the tool).

Introduction

APT (and dpkg) spend a lot of time in parsing various files, especially Packages files. APT currently uses a function called AlphaHash which hashes the last 8 bytes of a word in a case-insensitive manner to hash fields in those files (dpkg just compares strings in an array of structs).

There is one obvious drawback to using a normal hash function: When we want to access the data in the hash table, we have to hash the key again, causing us to hash every accessed key at least twice. It turned out that this affects something like 5 to 10% of the cache generation performance.

Enter perfect hash functions: A perfect hash function matches a set of words to constant values without collisions. You can thus just use the index to index into your hash table directly, and do not have to hash again (if you generate the function at compile time and store key constants) or handle collision resolution.

As #debian-apt people know,¬†I happened to play a bit around with tries this week before guillem suggested¬†perfect hashing. Let me tell you one thing: My trie implementation was very naive, that did not really improve things a lot…

Enter TrieHash

Now, how is this related to hashing? The answer is simple: I wrote a perfect hash function generator that is based on tries. You give it a list of words, it puts them in a trie, and generates C code out of it, using recursive switch statements (see code generation below). The function achieves competitive performance with other hash functions, it even usually outperforms them.

Given a dictionary, it generates an enumeration (a C enum or C++ enum class) of all words in the dictionary, with the values corresponding to the order in the dictionary (the order-preserving property), and a function mapping strings to members of that enumeration.

By default, the first word is considered to be 0 and each word increases a counter by one (that is, it generates a minimal hash function). You can tweak that however:

= 0
WordLabel ~ Word
OtherWord = 9

will return 0 for an unknown value, map “Word” to the enum member WordLabel and map OtherWord to 9. That is, the input list functions like the body of a C enumeration. If no label is specified for a word, it will be generated from the word. For more details see the documentation

C code generation

switch(string[0] | 32) {
case 't':
    switch(string[1] | 32) {
    case 'a':
        switch(string[2] | 32) {
        case 'g':
            return Tag;
        }
    }
}
return Unknown;

Yes, really recursive switches – they directly represent the trie. Now, we did not really do a straightforward translation, there are some optimisations to make the whole thing faster and easier to look at:

First of all, the 32 you see is used to make the check case insensitive in case all cases of the switch body are alphabetical characters. If there are non-alphabetical characters, it will generate two cases per character, one upper case and one lowercase (with one break in it). I did not know that lowercase and uppercase characters differed by only one bit before, thanks to the clang compiler for pointing that out in its generated assembler code!

Secondly, we only insert breaks only between cases. Initially, each case ended with a return Unknown, but guillem (the dpkg developer) suggested it might be faster to let them fallthrough where possible. Turns out it was not faster on a good compiler, but it’s still more readable anywhere.

Finally, we build one trie per word length, and switch by the word length first. Like the 32 trick, his gives a huge improvement in performance.

Digging into the assembler code

The whole code translates to roughly 4 instructions per byte:

  1. A memory load,
  2. an or with 32
  3. a comparison, and
  4. a conditional jump.

(On x86, the case sensitive version actually only has a cmp-with-memory and a conditional jump).

Due to https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77729 this may be one instruction more: On some architectures an unneeded zero-extend-byte instruction is inserted Рthis causes a 20% performance loss.

Performance evaluation

I run the hash against all 82 words understood by APT in Packages and Sources files, 1,000,000 times for each word, and summed up the average run-time:

host arch Trie TrieCase GPerfCase GPerf DJB
plummer ppc64el 540 601 1914 2000 1345
eller mipsel 4728 5255 12018 7837 4087
asachi arm64 1000 1603 4333 2401 1625
asachi armhf 1230 1350 5593 5002 1784
barriere amd64 689 950 3218 1982 1776
x230 amd64 465 504 1200 837 693

Suffice to say, GPerf does not really come close.

All hosts except the x230 are Debian porterboxes. The x230 is my laptop with a a Core i5-3320M, barriere has an Opteron 23xx. I included the DJB hash function for another reference.

Source code

The generator is written in Perl, licensed under the MIT license and available from¬†https://github.com/julian-klode/triehash¬†–¬†I initially prototyped it in Python, but¬†guillem complained that this would add new build dependencies to dpkg, so I rewrote it¬†in Perl.

Benchmark is available from https://github.com/julian-klode/hashbench

Usage

See the script for POD documentation.