Category: Ubuntu

Clarifications and updates on APT + SHA1

The APT 1.2.7 release is out now.

Contrary to what I wrote earlier, we now print warnings for Release files signed with signatures using SHA1 as the digest algorithm. This involved extending the protocol APT uses to communicate with its methods a bit, by adding a new 104 Warning message type.

W: gpgv:/var/lib/apt/lists/apt.example.com_debian_dists_sid_InRelease: The repository is insufficiently signed by key
1234567890ABCDEF0123456789ABCDEF01234567 (weak digest)
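Internally, the gpgv method reports this condition to APT as one of the new 104 Warning messages. Purely as an illustration of the idea, such a message on the method protocol (a status line followed by RFC822-style fields) might look roughly like this; the exact field layout here is my assumption, not copied from the implementation:

    104 Warning
    Message: The repository is insufficiently signed by key 1234567890ABCDEF0123456789ABCDEF01234567 (weak digest)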

Also note that SHA1 support is not dropped; we merely no longer consider it trustworthy. In practice it may feel as if SHA1 support were dropped, because sources without SHA2 will not work; but the SHA1 hashes will still be used in addition to the SHA2 ones, so there is no point in removing them (the same goes for the MD5Sum fields).

We also fixed some small bugs!

Dropping SHA-1 support in APT

Tomorrow, on the anniversary of Caesar's assassination, APT will see a new release, turning off support for SHA-1 checksums in Debian unstable and in Ubuntu xenial, the upcoming LTS release. While I have no knowledge of an imminent attack on our use of SHA1, Xenial (Ubuntu 16.04 LTS) will be supported for 5 years, and the landscape may change a lot in that time. As disabling SHA1 support requires a bit of patching in our test suite, it is best to do that now rather than later, when we would be forced to do it.

This means that, starting tomorrow, some third-party repositories may stop working, such as the one for the web browser I am writing this with. Debian derivatives should be mostly safe from this change if they are registered in the derivatives census, as that already checks for it. This is a bit unfortunate, but we have no real choice: technical restrictions prevent us from just showing a warning in a sensible way.

There is one caveat, however: GPG signatures may still use SHA1. While I have prepared the needed code to reject SHA1-based signatures in APT, a lot of third-party repositories still ship Release files signed with signatures using SHA-1 as the digest algorithm. Some repositories even still use 1024-bit DSA keys.

I plan to enforce SHA2 for GPG signatures some time after the release of xenial, and definitely for Ubuntu 16.10, so around June to August (possibly during DebConf). For xenial, I plan to do the same via an SRU (stable release update) in January (it is just a matter of adding one member to an array). This should give third-party providers a reasonable time frame to migrate to a new digest algorithm for their GPG configuration and possibly to a new repository key.
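To give an idea of what "adding one member to an array" could mean, here is a minimal, self-contained C++ sketch of a digest table with per-algorithm trust states. All names and the layout are hypothetical and only illustrate the concept; this is not APT's actual code:

    #include <cstdio>
    #include <cstring>

    // Hypothetical trust states for signature digest algorithms.
    enum class DigestState { Untrusted, Weak, Trusted };

    struct Digest {
       const char *name;
       DigestState state;
    };

    // Illustrative table: tightening the policy later is then just a
    // matter of changing or adding one entry.
    static const Digest Digests[] = {
       {"MD5",    DigestState::Untrusted},
       {"SHA1",   DigestState::Weak},      // warn today, reject later
       {"SHA256", DigestState::Trusted},
       {"SHA512", DigestState::Trusted},
    };

    static DigestState LookupDigest(const char *name) {
       for (const Digest &d : Digests)
          if (std::strcmp(d.name, name) == 0)
             return d.state;
       return DigestState::Untrusted;      // unknown algorithms are never trusted
    }

    int main() {
       std::printf("SHA1 trusted? %s\n",
                   LookupDigest("SHA1") == DigestState::Trusted ? "yes" : "no");
    }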

Summary

  • Tomorrow: Disabling SHA1 support for Release, Packages, Sources files
  • June/July/August: Disabling SHA1 support for GPG signatures (InRelease/Release.gpg) in development releases
  • January 2017: Disabling SHA1 support for GPG signatures in Ubuntu 16.04 LTS via a stable-release-update.

APT 1.1.8 to 1.1.10 – going “faster”

Not only do I keep incrementing version numbers faster than ever before, APT itself also keeps getting faster. On top of that, some bugs have been fixed, and the cache is now checked with a hash when it is opened.

Important fix for 1.1.6 regression

Since APT 1.1.6, APT uses the configured xz compression level. Unfortunately, the default was set to 9, which requires 674 MiB of RAM, compared to the 94 MiB required at level 6.

This caused the test suite to fail on the Ubuntu autopkgtest servers, but I thought it was just some temporary hiccup on their part, and so I did not look into it for the 1.1.7, 1.1.8, and 1.1.9 releases. When the Ubuntu servers failed again with 1.1.9 (they apparently only started building again on Monday), I noticed something was wrong.

Enter git bisect. I created a script that compiles the APT source code and runs a test with ulimit limits for virtual and resident memory set to 512 MiB (limits that worked with 1.1.5), let it run, and thus found the cause mentioned above.

The solution: APT now defaults to level 6.

New Features

APT 1.1.8 introduces /usr/lib/apt/apt-helper cat-file, which can be used to read files compressed by any compressor understood by APT. It is used in the recent apt-file experimental release, and it prepares us for a future in which files on disk might be compressed with a different compressor (such as LZ4 for Contents files, which will improve rred speed on them by a factor of 7).
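A hypothetical invocation could look like this (the list file path is made up for illustration):

    /usr/lib/apt/apt-helper cat-file /var/lib/apt/lists/deb.example.org_dists_sid_main_Contents-amd64.lz4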

David added a feature that lets servers advertise, by including all in their list of architectures, that they do not want APT to download and use certain Architecture: all contents. This allows archives to drop Architecture: all packages from the architecture-specific content files, avoiding redundant data and (thus) improving the performance of apt-file.
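In a repository's Release file, that presumably just means listing all next to the real architectures. The following snippet is purely illustrative and not taken from a real archive:

    Origin: Example
    Suite: unstable
    Codename: sid
    Architectures: all amd64 i386
    Components: main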

Buffered writes

APT 1.1.9 introduces buffered writing for rred, reducing its runtime by about 50% on a slowish SSD, and maybe more on HDDs. The 1.1.9 release is a bit buggy and might mess things up when a write syscall is interrupted; this is fixed in 1.1.10.
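The bug class in question is the classic partial or interrupted write(2). A minimal sketch of a flush routine that handles both cases, written from scratch for illustration rather than taken from APT, might look like this:

    #include <cerrno>
    #include <cstddef>
    #include <unistd.h>

    // Write the whole buffer, retrying on EINTR and continuing after
    // short writes instead of silently dropping the rest of the buffer.
    static bool WriteAll(int fd, const char *buf, size_t size) {
       while (size > 0) {
          ssize_t done = ::write(fd, buf, size);
          if (done < 0) {
             if (errno == EINTR)   // interrupted before any byte was written
                continue;
             return false;         // a real error
          }
          buf += done;             // short write: advance and keep going
          size -= static_cast<size_t>(done);
       }
       return true;
    }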

Cache generation improvements

APT 1.1.9 and APT 1.1.10 improve the cache generation algorithms in several ways: switching a lookup table from std::map to std::unordered_map, providing an inline isspace_ascii() function, and inlining the tolower_ascii() function. These are tiny functions that are called a lot.
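For illustration, such ASCII-only helpers are tiny enough to live in a header and be inlined into the parser loops; the following is a sketch of the idea, not the exact code in APT:

    // Locale-independent ASCII classification and case folding. Unlike the
    // <ctype.h> functions, these cannot be slowed down by locale lookups.
    static inline bool isspace_ascii(int c) {
       return c == ' ' || c == '\t' || c == '\n'
           || c == '\r' || c == '\v' || c == '\f';
    }

    static inline int tolower_ascii(int c) {
       return (c >= 'A' && c <= 'Z') ? c + ('a' - 'A') : c;
    }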

APT 1.1.10 also switches the cache’s hash function to the DJB hash function and increases the default hash table sizes to the smallest prime larger than 15000, namely 15013. This reduces the average bucket size from 6.5 to 4.5. We might increase this further in the future.
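The DJB hash (djb2) itself is only a few lines. Here is a sketch of how it might be combined with the table size mentioned above, again purely illustrative:

    #include <cstdint>

    // djb2: hash = hash * 33 + c, starting from 5381.
    static uint32_t djb_hash(const char *str) {
       uint32_t hash = 5381;
       for (const char *p = str; *p != '\0'; ++p)
          hash = hash * 33 + static_cast<unsigned char>(*p);
       return hash;
    }

    // Smallest prime larger than 15000, as mentioned above.
    static const unsigned int HashTableSize = 15013;

    static unsigned int BucketOf(const char *name) {
       return djb_hash(name) % HashTableSize;
    }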

Checksum for the cache, but no more syncs

Prior to APT 1.1.10, writing the cache was a multi-step process:

  1. Write the cache to a temporary file with the dirty bit set to true
  2. Call fsync() to sync the cache
  3. Write a new header with the dirty bit set to false
  4. Call fsync() to sync the new header
  5. (Rename the temporary file to the target name)

The last step was obviously not needed: if we write to the final location directly and something goes wrong, we are simply left with a cache whose dirty field is still set to true, and we can detect that and rebuild the cache.

But what matters more is step 2. Synchronizing the entire 40 or 50 MB takes some time. On my HDD system, it consumed 56% of the entire cache generation time, and on my SSD system, it consumed 25% of the time.

APT 1.1.10 does not sync the cache at all. Instead, it embeds a hashsum (Adler-32, chosen for performance reasons) in the cache. This ensures that no matter which parts of the cache actually made it to disk after a failure somewhere, we can still detect the damage with reasonable confidence (and we now catch even more kinds of errors than before).
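To make the scheme concrete, here is a minimal sketch using zlib's adler32() and a made-up header layout (the real cache header is different); the checksum covers the whole image with the stored checksum field treated as zero:

    #include <zlib.h>

    #include <cstdint>
    #include <cstring>
    #include <vector>

    // Hypothetical header; only Dirty and Adler32 matter for this sketch.
    struct CacheHeader {
       uint8_t  Dirty;
       uint32_t CacheFileSize;
       uint32_t Adler32;   // checksum over everything except this field
    };

    // Takes the image by value so we can zero the stored checksum field
    // in the copy before summing, keeping verification self-consistent.
    static uint32_t CacheChecksum(std::vector<unsigned char> image) {
       CacheHeader *hdr = reinterpret_cast<CacheHeader *>(image.data());
       hdr->Adler32 = 0;
       uLong sum = adler32(0L, Z_NULL, 0);
       sum = adler32(sum, image.data(), image.size());
       return static_cast<uint32_t>(sum);
    }

    static bool CacheLooksGood(const std::vector<unsigned char> &image) {
       if (image.size() < sizeof(CacheHeader))
          return false;
       CacheHeader hdr;
       std::memcpy(&hdr, image.data(), sizeof(hdr));
       return hdr.Dirty == 0 && hdr.Adler32 == CacheChecksum(image);
    }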

This means that cache generation is now much faster for a lot of people. On the downside, commands like apt-cache show, which previously took maybe 10 ms to execute, can now take about 80 ms.

Please report back on your performance experience with the 1.1.10 release; I am very interested to see whether it works reasonably well for other people. And if you have any other ideas for how to solve the issue, I would be interested to hear them (all data needs to be written before the header with dirty=0 is written, but we do not want to sync the data).

Future work

We seem to have a lot of temporary (?) std::string objects during cache generation, accounting for about 10% of the run time. I am thinking of introducing a string_view class similar to the one proposed for C++17 and making use of it.
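A minimal sketch of such a non-owning view, much less complete than the std::string_view proposed for C++17, could be as small as this:

    #include <cstddef>
    #include <cstring>

    // A tiny non-owning view over a character range: no allocation and no
    // copying, just a pointer and a length, which is all the parser needs
    // for lookups and comparisons.
    class StringView {
       const char *data_;
       size_t size_;
    public:
       StringView(const char *data, size_t size) : data_(data), size_(size) {}
       explicit StringView(const char *cstr) : data_(cstr), size_(std::strlen(cstr)) {}
       const char *data() const { return data_; }
       size_t size() const { return size_; }
       bool operator==(StringView other) const {
          return size_ == other.size_
              && std::memcmp(data_, other.data_, size_) == 0;
       }
    };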

I also thought about calling posix_fadvise() before starting to parse files, but the cache generation process does not seem to spend a lot of its time in system calls (even with all caches dropped before the run), so I don’t think this will improve things.
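For completeness, the readahead hint I had in mind would be roughly the following (a sketch, not code from APT):

    #include <fcntl.h>

    // Tell the kernel we are about to read the whole file sequentially,
    // so it can start readahead before the parser gets there.
    static void HintWillNeed(int fd) {
       (void)posix_fadvise(fd, 0, 0, POSIX_FADV_SEQUENTIAL);
       (void)posix_fadvise(fd, 0, 0, POSIX_FADV_WILLNEED);
    }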

If anyone has some other suggestions or patches for performance stuff, let me know.

0x15 + 1/365

Yesterday was my 21st birthday, and I received all of the “Hitchhiker’s Guide to the Galaxy” novels: the first five in one volume, and the sixth one, written by Eoin Colfer, in another. Needless to say, the first volume weighs more than an N900. I have not read them yet, so now is the perfect chance to do so. And yes, I did not know that the 25th is Towel Day, sorry about that.

I also bought a Toshiba AC100 before my birthday, a Tegra 2 based notebook/netbook/”web companion” with a 1 GHz dual-core ARM Cortex-A9 chip and 512 MB RAM. It runs Android by default and cost 160€, which is low compared to anything else with a Cortex-A9. From time to time it runs Ubuntu 11.04 with a specialised 2.6.37 kernel, though without sound, accelerated video, or working HDMI. I am mostly waiting for Nvidia to release a new binary blob for the video part (and yes, if you just want to build packages, you can probably be happy without those things).

Another thing that happened last week was the upload of python-apt 0.8.0 to unstable, marking the beginning (or end) of the API transition I started more than a year ago. Almost all packages not supporting the new API have proper Breaks in python-apt [most of them are already fixed; only 2 packages remain, one of which is “maintained” (well, not really maintained right now) by me], but there may be some which do not work correctly despite being fixed (or at least thought to be fixed).

If you know of any other interesting thing I did last week, leave a comment; I have written enough for now. And yes, WordPress wants to write a multiplication sign instead of an x, so I had to use &#120 instead.

Nokia/Intel/Google/Canonical – openness and professionalism in MeeGo, Android, Ubuntu

MeeGo (Nokia/Intel): Openness does not seem to be very important to Nokia and Intel. They develop their stuff behind closed doors and then do a large code drop (once dropped, the code is developed in the open). In terms of professionalism, it does not look much better: if you look at the meego-dev mailing list, you feel like you are in some kind of kindergarten for open source developers. Things like HTML emails and top-posting appear to be completely normal for them; they do not even follow the basic rules for email, and they appear to ignore any advice on the topic. Oh, and writing a platform in GTK+ while pushing Qt as the supported toolkit is not a good idea.

Android (Google): The situation with Android appears to be even worse. Google keeps an internal development tree, and the only time the public sees changes is when a new release is coming and Google does a new code drop. Such behavior is not acceptable.

Ubuntu (Canonical): Canonical does an outstanding job on openness. Canonical employees develop their software completely in the open, and there are no big code drops. They basically forgo any competitive advantage and allow you to use their work for your own project before they use it themselves. Canonical employees largely behave like normal free software developers doing stuff in their free time, the only difference being that they are paid for it. Most Canonical employees also know how to behave on mailing lists and do not post HTML emails or top-post. There are of course exceptions: the design process of the Ubuntu font, for instance, is a disaster, with the font being created using proprietary software running on Windows; for a free software company, that is not acceptable (such people should probably not be hired at all). Furthermore, the quality of Ubuntu packages is often poor compared to that of the corresponding Debian packages (especially in terms of API documentation for Canonical-created software), a result of vastly different development cycles and priorities.

On a scale from 0 to 15 (15 being the best), Nokia/Intel receive 5 points (“D” in U.S. grades), Google receives 3 points (“F” in U.S. grades), and Canonical receives 12 points (“B” in U.S. grades). Please note that these grades only apply to the named projects; the companies may behave better or worse in other projects.

Final tips for the companies:

  • Nokia/Intel: Teach your employees how to behave in email discussions.
  • Nokia/Intel/Google: Don’t do big code drops; develop in the open. If someone else can create something awesome using your work before you can, you are on the right track. Competitive advantage must not exist.
  • Canonical: Fire those who used Windows to create the Ubuntu font and restart using free tools.
  • All: Document your code.

My first upload with the new source format

Yesterday, I uploaded command-not-found 0.2.38-1 (based on version 0.2.38ubuntu4) to Debian unstable, using the “3.0 (quilt)” source format. All steps worked perfectly, including stuff like cowbuilder, lintian, debdiff, dput and the processing on ftp-master. Next steps are reverting my machine from Ubuntu 9.10 to my Debian unstable system and uploading new versions of gnome-main-menu, python-apt (0.7.93, not finished yet) and some other packages.
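For reference, the source format is declared in debian/source/format; for this upload the file simply contains:

    3.0 (quilt)

Direct changes to the upstream source then live as quilt patches under debian/patches/ instead of being folded into one big diff.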

In other news, the development of Ubuntu 10.04 Lucid Lynx has just started. For the first time in Ubuntu’s history, development will be based on the testing tree of Debian rather than on unstable. This is done to increase the stability of the distribution, as this release is going to be a long-term support release. Ubuntu will freeze in February, one month before the freeze of Debian Squeeze. This should give us enough room to collaborate, especially on bugfixes. It also means that I will freeze my packages in February, so they will have the same version in Squeeze and Lucid (applying the earliest freeze to both distributions, with exceptions where needed).

Debian’s new time-based freezes

Overall, having time-based freezes is a good idea. But the chosen cycle is problematic, especially if one considers Ubuntu’s LTS release cycles. The problem is that if Debian releases a new version at approximately the same time as Ubuntu, there will not be much synchronization and Ubuntu will have newer program versions.

Consider the releases of Ubuntu 8.04 LTS (April 2008) and Debian GNU/Linux 5.0 (February 2009). Whereas Ubuntu 8.04 provides GNOME 2.22 including a new Nautilus, Debian provides GNOME 2.22 with nautilus 2.20. The Ubuntu release made at about the same time as Debian 5.0 (Ubuntu 9.04) already included GNOME 2.26.

The reason for this lies in the different development processes. Whereas a Debian release is based on stable upstream versions, the development of Ubuntu releases happens against the newest pre-releases, causing Ubuntu releases to be one generation ahead in most technologies, although released at the same time.

Another difference is the way of freezing and the duration of the total freeze. Ubuntu has a freeze process split into multiple stages. The freeze that is comparable to Debian’s freeze is the feature freeze, which usually happens two months before the release. Debian’s freeze happens six months before the release, at about the time when Ubuntu has only just defined the features to be included in its next release.

Synchronizing the release cycles of Debian and Ubuntu basically means that Ubuntu will always provide the newer features, better hardware support, and so on. Ubuntu is the winner of this decision: it will have fewer bugs because it inherits from a frozen Debian branch, and it can include newer versions where required because it freezes three months later.

To synchronize the package versions shipped in Debian and Ubuntu, you have to make your release schedules asynchronous, such that the Debian freeze happens after the Ubuntu release. This basically means that I expect Debian 6.0 (2010/H1) to be more similar to Ubuntu 9.10 than to Ubuntu 10.04. I would have preferred to freeze Squeeze in December 2010 and release in April 2011, with the Ubuntu LTS re-scheduled for October 2010.

Now let’s say you are a Debian and Ubuntu user (and you prefer neither of them) and you want to set up a new system in 2010/H2 (so the system has half a year to stabilize). This system should be supported for a long time. You have two options: Debian 6.0 and Ubuntu 10.04. Which will you choose? Most people would probably pick Ubuntu, because it provides better support for their hardware and has the newer features. A (partial) solution to this problem would be to make backports an official part of Debian, starting with Debian 6.0.

Furthermore, there is the question of security support. If we want to provide the ability to upgrade directly from Debian 5.0 to Debian 7.0, we will have to support Lenny until 2012/H2 (to give users enough time to upgrade after we release 7.0 in 2012/H1). This means that in 2012 we would have to support Debian 5.0, 6.0, and 7.0 at the same time. Another side effect is that Debian 6.0 would get three years of security support, just like Ubuntu 10.04.

All in all, the decision means that Debian and Ubuntu LTS releases will occur at about the same time, will be supported for about the same length of time, and that Ubuntu will have the newer packages and fewer bugs to fix.