cupt and how to write package managers

cupt is a new package manager written in Perl by Eugene V. Lyubimkin, who previously contributed to APT. But above all, the project makes no sense at all.

First of all, there is a language issue. Implementing a package manager in Perl has some major drawbacks. One of the strengths of APT is that it is written in a lower-level language (C++, which really is below Perl). This made it possible to write applications like Synaptic as well as Python bindings, which in turn led to applications like gnome-app-install or Ubuntu’s new Software Store.

Furthermore, writing a package manager in Perl means that distributions such as Emdebian might not be able to use it, since they have excluded Perl due to its space requirements. This becomes even more important considering that cupt depends on additional Perl libraries. This means that cupt will never be able to replace APT.

Secondly, a package manager should not be designed specifically for one distribution. This is another major drawback of cupt and other package managers such as yum or zypper. The smart package manager, written in Python and funded by Canonical Ltd., is an example of a distribution-neutral package manager.

Now let’s take a look at package management in modern distributions. Usually we have two levels of package managers: level-1 tools like dpkg and rpm, which take care of installing and removing packages, and level-2 package managers, which implement dependency resolvers and package retrieval from remote locations. Recently, distributions started to add a third layer named PackageKit, which is meant to provide distribution-independent package management user interfaces. The project was well received by RPM-based distributions, but failed in Debian-based distributions because it does not support debconf. Furthermore, adding a third layer just increases the possibility of problems.

The right way to do package management is a distribution-independent level-2 package manager written in C. The smart project shows us that this is possible although itself fails to meet the lower-level language (C) requirement. That’s why I decided to write a package manager in Vala, a GObject-based language which gets converted to C and then compiled. If successful, this project will be able to replace most of the current level-2 package managers and will also provide the same distribution-independence as a level-3 package manager such as PackageKit. It is also easy to create bindings for other programming languages such as Python or Perl, thus enabling application developers to choose the language they like most.

The core of this project is a vendor-neutral library, temporarily called libapt (as the project is called APT2 for now). This library contains all the code which is not specific to a vendor, i.e. file retrieval, the dependency resolver, caches, etc. It is then enhanced by several vendor-specific plugins, each implementing a PackageManager interface (the interface to the distribution’s level-1 package manager) and a Repository interface (well, repositories from which you can download packages).
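To make the split between the vendor-neutral core and the vendor plugins more concrete, here is a rough sketch. It is written in Python rather than Vala purely for illustration and is not APT2 code: only the interface names PackageManager and Repository come from the description above, while register_vendor, load_vendor, DpkgManager and DebianRepository are names I made up for this example.

```python
from abc import ABC, abstractmethod


class PackageManager(ABC):
    """Interface to a distribution's level-1 tool (dpkg, rpm, ...)."""

    @abstractmethod
    def install(self, package_file):
        """Install the package stored in package_file."""

    @abstractmethod
    def remove(self, package_name):
        """Remove the installed package called package_name."""


class Repository(ABC):
    """A location from which packages can be downloaded."""

    @abstractmethod
    def package_names(self):
        """Return the names of the packages this repository offers."""


# Vendor-neutral core: maps a vendor name to its two plugin classes.
_PLUGINS = {}


def register_vendor(name, manager_cls, repository_cls):
    """Hypothetical registration hook that a vendor plugin would call."""
    _PLUGINS[name] = (manager_cls, repository_cls)


def load_vendor(name):
    """Instantiate the plugin pair registered for the given vendor."""
    manager_cls, repository_cls = _PLUGINS[name]
    return manager_cls(), repository_cls()


class DpkgManager(PackageManager):
    """Illustrative Debian plugin; it records actions instead of running dpkg."""

    def __init__(self):
        self.log = []

    def install(self, package_file):
        self.log.append(("install", package_file))

    def remove(self, package_name):
        self.log.append(("remove", package_name))


class DebianRepository(Repository):
    """Illustrative repository with a hard-coded package list."""

    def package_names(self):
        return ["apt", "dpkg"]


register_vendor("debian", DpkgManager, DebianRepository)
```

The core only ever talks to the two abstract interfaces, so supporting another vendor would mean registering another pair of classes without touching the resolver or the download code.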

We could even enhance the vendor-independent interface to include more details of a repository. Most repositories nowadays consist of four components: a meta index (Release files for Debian, repomd.xml for Fedora/openSUSE/etc.), a package index (e.g. Packages files for Debian, *primary.xml.gz on Fedora, etc.), a source index (e.g. Sources files in Debian) and a files index (e.g. Contents-*.gz for Debian). I took a look at the repository formats of Slackware, openSUSE and Fedora, and it seems that this concept can be applied to all of them. So maybe all we need are distribution-dependent parsers for those files.
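The four-component idea could be captured in a small vendor-neutral description like the one below. This is only a sketch in Python: the class name RepositoryLayout and its field names are my own invention, and the file names are just the ones mentioned above. Components the post does not name for a format are left as None rather than guessed.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass(frozen=True)
class RepositoryLayout:
    """File name (or pattern) of each index a repository format provides."""

    meta_index: Optional[str]
    package_index: Optional[str]
    source_index: Optional[str] = None
    files_index: Optional[str] = None

    def known_components(self):
        """Return the names of the components that have a file assigned."""
        return [f.name for f in fields(self) if getattr(self, f.name) is not None]


# Debian: all four components are named in the text above.
DEBIAN = RepositoryLayout("Release", "Packages", "Sources", "Contents-*.gz")

# repomd (Fedora/openSUSE): only the first two are named in the text above.
REPOMD = RepositoryLayout("repomd.xml", "*primary.xml.gz")
```

A distribution-dependent parser would then be attached to each component that is not None, and the core could ask a layout which indexes it can expect at all.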

One of the most important issues with APT is its use of mmap() for the cache. Using mmap() makes it hard to grow the cache, which is sometimes needed. We see a lot of bug reports from people whose cache size is too small. We can circumvent this problem by using an embedded database like SQLite, but we would probably lose some speed and it may be harder to maintain a flexible API. We should see what the best option is here; both ways are possible since Vala 0.7.6 includes my patches for adding mmap(), ftruncate(), mremap() and some other functions. Another idea to circumvent the mmap() issue is to gather statistics about the relation between the number of repositories and the size of the cache, and then use a value slightly above the statistical average.
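To illustrate what growing an mmap()-based cache involves, here is a minimal sketch in Python. It is not how APT or APT2 does it: the class GrowableCache and its methods are hypothetical, and instead of mremap() it takes the portable route of unmapping, extending the file with ftruncate() and mapping it again.

```python
import mmap
import os
import tempfile


class GrowableCache:
    """Append-only byte cache backed by a memory-mapped file."""

    def __init__(self, path, initial_size=4096):
        self._fd = os.open(path, os.O_RDWR | os.O_CREAT)
        os.ftruncate(self._fd, initial_size)
        self._map = mmap.mmap(self._fd, initial_size)
        self._used = 0

    def append(self, data):
        """Store data and return its offset, growing the file if needed."""
        needed = self._used + len(data)
        if needed > len(self._map):
            self._grow(needed)
        offset = self._used
        self._map[offset:needed] = data
        self._used = needed
        return offset

    def _grow(self, needed):
        """Double the file size until `needed` bytes fit, then remap."""
        new_size = len(self._map)
        while new_size < needed:
            new_size *= 2
        self._map.flush()
        self._map.close()
        os.ftruncate(self._fd, new_size)
        self._map = mmap.mmap(self._fd, new_size)

    def read(self, offset, length):
        return self._map[offset:offset + length]

    def size(self):
        return len(self._map)

    def close(self):
        self._map.close()
        os.close(self._fd)


if __name__ == "__main__":
    fd, path = tempfile.mkstemp()
    os.close(fd)
    cache = GrowableCache(path, initial_size=16)
    first = cache.append(b"Package: apt\n")
    second = cache.append(b"Package: dpkg\n")
    print(first, second, cache.size())
    cache.close()
    os.remove(path)
```

The sketch hands out offsets rather than addresses, because the mapping may end up at a different address after it has been recreated. Anything that keeps raw pointers into the old mapping would break at that point, which is one reason growing such a cache is awkward.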

The project is not very mature yet; it only includes basic library functions for downloading files, parsing configuration files, etc. You can find the (MIT licensed) code at http://git.debian.org/?p=users/jak/apt2.git. I also have some local code for repository management and multi-threaded file fetching, but it’s just not ready to be merged yet.

55 thoughts on “cupt and how to write package managers”

  1. I think you’ve missed the point of PackageKit. Whilst I welcome a C based level 2 package manager, you’ve got to understand that normal users don’t sit with a command line open.

    Have a look at http://www.packagekit.org/pk-profiles.html and this will show you a non-technical audience for Linux. Note, the distinct lack of technical geeks, as geeks can quite easily drop to the command line and use level 1 or 2 tools and they’ll know what root passwords are, and understand locking. Have a look at the video here http://video.fosdem.org/2008/maintracks/FOSDEM2008-packagekit.ogg for some good examples on why using a library to install software (rather than a service) is a bad idea.

    I also think you’re massively misunderestimating the amount of logic in yum dealing with the Fedora metadata. It’s really not as clear cut as you suggest. I wish it was, but it’s not.

    1. > “[…] on why using a library to install software (rather than a service) is a bad idea.”

      I also plan to write a D-Bus service (org.debian.PackageManager). But there may be cases where you need or simply want direct access to the library, e.g. when creating installer images.

      I envision some kind of a session class. The session provides commands to list/install/remove packages, update the cache, etc. This session class could then be exported via D-Bus (which is easy in Vala, see http://live.gnome.org/Vala/DBusServerSample).

      So the concept would be like this:
      dpkg/rpm
      libapt
      apt
      apt-dbus
      apt-gtk
      Although I think that the “apt” part could be removed and replaced by the D-Bus service, considering that D-Bus will be an essential part of future distribution releases (especially due to upstart using D-Bus). But it is easier to develop ideas and concepts with the library directly first, and afterwards create a service for it.

      > “[…] you’re massively misunderestimating the amount of logic in yum dealing with the Fedora metadata.[..]”

      I always think that things are easier than they are. But I also think that as smart is able to do it, it is possible.

      1. >I also plan to write a D-Bus service (org.debian.PackageManager)

        Why? Why duplicate the functionality in PackageKit? Linux isn’t primarily about choice, it’s about people working together.

        Anyway, I think I’m wasting my time. If you plan on reimplementing apt, PackageKit and the gnome-packagekit/kpackagekit tools I wish you good luck.

      2. (no threading at this level anymore)

        > Why? Why duplicate the functionality in PackageKit?

        I guess if it is so easy to write a D-Bus server in Vala, there is no reason not to do it. Ubuntu even uses a Debian-specific service already (aptdaemon), because of the limitations of PackageKit (non-interactive may be a good idea, but is just not possible at the moment; and there should be a way to get interactive if wanted).

        > Anyway, I think I’m wasting my time. If you plan on reimplementing apt, PackageKit and the gnome-packagekit/kpackagekit tools I wish you good luck.

        I would even like to re-implement dpkg in APT2, but I guess I would fail completely.

  2. Richard, please spare us your bullshit about PackageKit. The idea of a package management library is precisely to be able to build high-level, simple interfaces as well as command-line interfaces, with the *same logic* behind them. This is where update-manager used to fail (it has been fixed recently by Stefan), and this is where PK fails as well.

    Julian, you’re definitely on the right track, I agree with everything that you said. Keep up the good work!

    1. > spare us your bullshit about PackageKit

      I’m not sure it’s bullshit, given that quite a few distros are using PackageKit by default now. Did you even watch the video I linked? How do you intend on installing packages systemwide without the GUI running as root?

      1. > Ever wondered why Ubuntu started to work on aptdaemon
        > instead of embracing PackageKit?

        I think you need to read the mailing list archives and get your facts correct.

      2. > How do you intend on installing packages systemwide without the GUI running as root?

        I’m a little confused here. Aren’t you the guy that started PolicyKit which is aimed at fixing just this sort of problem?
        BTW: pk-gnom/pk-kde suck. Big time. Ubuntu Software Store is a far better design overall and it’s using PolicyKit for installing packages system wide without the GUI running as root.
        I applaud your other Kit-projects because they really are nice and fix fundamental problems, but PackageKit isn’t one of them.

    2. Free Software is about choice, the idea of offering a D-Bus server in *addition* to a library and raw frontends is a good idea, especially for implementing graphical clients targeted at the new users (i.e. parts of the Ubuntu audience). And given the fact that writing a D-Bus server in Vala is as easy as adding one line to the code, there should be no problems. Combined with PolicyKit, we would have the normal “Linux-Desktop” experience.

  3. Hi =)

    I’m not a package manager engineer nor a debian devel… only a guy that found some wrong arguments.

    First, you started by reproaching cupt because it depends on Perl. Then you claim that the smart package manager is good, but ignore that it was written in Python (with some parts in C)… why do you think it is good at all? For me, Python is bigger (bigger than Perl, at least).

    Now you are proposing another package manager, in a different new language. But you forget to say that Vala depends on glib, gobject and all other Vala dependencies (like libGIO). It may be faster than cupt or smart, but your project will be as big as the others.

    I understand you are trying to solve an old problem and would like to thank you for your efforts, but your argument of “bloated package manager” will fail if you decide to use Vala (or perl, python, etc).

    Best regards,
    Carlos.

    1. > you claim that the smart package manager is good, but ignore that it was written in Python

      Quoting the post: “The smart project shows us that this is possible although itself fails to meet the lower-level language (C) requirement.”

      > But you forget to say that Vala depends on glib, gobject and all other Vala dependencies (like libGIO).
      About the size argument: At least on my system
      libstdc++: 1252 KiB
      glib: 2216 KiB (including GIO, GThread, GModule, GObject)
      perl: 18808 KiB

      As you can see, Perl is about 8.5 times larger than glib, and cupt depends on even more packages.

      > your argument of “bloated package manager” will fail if you decide to use Vala (or perl, python, etc).
      Comparing Vala against Perl or Python is complete nonsense. Vala requires no special runtime environment; it is converted to pure C with GObject at compile time and thus only requires glib, gobject and gio (which are all in the glib package).

      1. Hi Julian,

        Sounds interesting… I did a check myself (I’m running debian unstable):
        libvala (7.6): 2228Kb
        libglib (2.22): 2064Kb

        The size is not so big! And your D-Bus idea (mentioned in your comments) is really cool. Go ahead, I’m looking forward to seeing your project working!😀

        Best regards,
        Carlos.

  4. I appreciate that you are also trying to approach an APT replacement. However, why denunciate cupt while you do so? This just casts a bad light on you. Competition is good, and cupt is further than your APT2. So instead of slandering it, I suggest you catch up, add choice, drive progress, so that eventually the better of the two can emerge as future replacement.

    And to be honest, the argument with embedded systems is weak, because they either use stripped down APTs already, or they’ll just be able to stay with APT — or maybe they will use APT2, while cupt might have the edge on other machines. Who knows. But these are exciting times, no need to spread the bad word.

    1. > However, why denunciate cupt while you do so? This just casts a bad light on you.
      I think that cupt has some weaknesses and pointed them out; the “project makes no sense at all” part was maybe a bit too exaggerated.

      > Competition is good, and cupt is further than your APT2.
      I don’t see cupt as a real competition, as cupt is really Debian-only and not a library. cupt will be a competition once APT2 actually has a frontend, but until then, we can discuss ideas and strategies and work together. Some concepts (not code) in APT2 are already derived from cupt, as they are easier than apt’s.

      > And to be honest, the argument with embedded systems is weak, because they either use stripped down APTs already, or they’ll just be able to stay with APT

      Not if we decide to stop the APT project. If either Debian or Ubuntu decides to switch the package manager, APT will be dead unless someone wants to maintain it further (e.g. Nokia, which may need it for Maemo).

      > maybe they will use APT2, while cupt might have the edge on other machines.
      Maybe. But if the implementations reach a state where they can be considered as the standard package manager, we have to make a decision. This may happen in 2012 or 2014, in Debian 7 or 8. But we have to see.

  5. Julian, I am aware that Perl has some drawbacks which prevent integrating cupt with some software/hardware.

    Let’s recall the principle of Free Software. People write the software for themselves and share it with others. No one dictates the 100% replacement or so, just as I wrote in that announcement mail. You are free to use it, to not use it, to develop something else.

    Good luck with APT2.

    1. > I am aware that Perl has some drawbacks which prevent integrating cupt with some software/hardware.

      And I am aware that KDE people don’t like anything starting with G. I just complain about Perl because I can’t write Perl, just Python, and to some extent C, C++ and Vala. Perl is just a bit too ugly for me, although I like the shell-like feeling.

      > People write the software for themselves and share it with others.
      Most do, I write software for others and share it with myself🙂.

      > No one dictates the 100% replacement or so, just as I wrote in that announcement mail.
      Well, once we are ready we can talk about the removal of APT.

      > You are free to use it, to not use it, to develop something else.

      Actually I am not free to use it, but see the bug I just reported (it fails to read my preferences file).

      > Good luck with APT2.
      Good luck with cupt. Let’s be friendly and work together to create the two best package managers in the Debian world.

      1. >I just complain about Perl because I can’t write Perl, just Python

        Aha. And I don’t like writing in Python. Deadlock.

        >Well, once we are ready we can talk about the removal of APT.

        Hardly. Considering hundreds of existing reverse depends.

      2. > >Well, once we are ready we can talk about the removal of APT.
        > Hardly. Considering hundreds of existing reverse depends.
        In fact I meant that once we have a better library, we can start porting applications to it. And after this has been done, we can remove apt. And it is a long time until 2014.

    2. BTW Eugene, if you’re motivated to work on packaging tools and like perl, there’s one area where you can be of great help: debconf. If you could write a D-Bus frontend for debconf, which in turn could be used as an interface for frontends, we could finally ask Debconf questions without an application running as root.

      1. Feel free to send me more detailed info on what should be done for this task (I haven’t worked with D-Bus yet, and the same applies to debconf’s nuts and bolts), and I will see whether I can help.

  6. BTW there’s one thing that will not, IMHO, work as simply as you seem to think: cross package format support.

    While it sounds nice on paper to support both .deb and .rpm packages to allow the same library to be used on different systems, RPM has some serious design flaws (such as the lack of pre-installation scripts) which do not allow writing a frontend that behaves the same with both, without seriously reducing what it can do with a correct packaging format.

    Settling on the lowest common denominator does not work, and PK is the best example of such a failure.

    1. RPM does have pre installation scripts. See the %pre, %post, %preun, and %postun sections of the spec file. In RPM the scripts get inserted into the spec file instead of a separate file like dpkg.

      I am studying both rpm and dpkg, and don’t see one as being better than the other.

      That said, there does not seem to be a way to specify optional dependencies in rpm, as in this line from a dpkg control file, which says that one of these is required at run time:

      Depends: tesseract | ocrad | gocr

      1. AFAIK the pre-installation scripts are executed after unpacking, and so are the pre-upgrade ones. This causes serious trouble with stuff that cannot be upgraded in a stateless way, like GConf.

      2. So Np237, are you talking about debconf when you mean pre installation (before unpacking)?

        Just trying to get a better understanding of the difference between rpm and dpkg, as I am doing work in this area too.

      3. Debconf is usually run in the “config” script, which has very strict requirements in what it can use (basically, debconf and essential packages). The config script is run before the installation or upgrade even starts, so that questions from all packages can be asked in a row.

        The preinst script is run right before unpacking, and the postinst script is run when the package is unpacked and all dependencies have already run their postinst.

        (Note that debconf can also be used in the preinst, but it’s for very special cases.)
        (I’m also omitting triggers, AFAIK rpm has them too.)

      4. Thanks for explaining. Stupidly I always say debconf when I mean the config script. Sometimes I forget that it is used for other stuff than calling debconf.

    2. I never said that it will behave identically on those systems, just that they could be supported. We don’t have to settle on the lowest common denominator.

      Each package format has its own class implementing the installation functionality. The progress handling completely relies on signals and should thus be easy to extend for a new package format. Some formats may not use all those signals, but they can.

      I guess there should be no real problems there, but I will start with .deb packages first and implement the others later (or someone else implements them). The API will only be frozen after we have the major package formats (.deb, .rpm, .txz [Slackware]) implemented, which should result in a usable API providing all the features needed by most package formats.

      1. In this case, I’m afraid you’ll run into lots of problems with backends not implementing some of the features, leading to an increasing number of code paths depending on whether the signals are supported or not.

        I wish you success since you seem to be using the correct approach, but I hope you won’t obtain a result which, while supporting any package format on paper, would only support DEB in practice.

  7. I don’t understand why you think yum is distribution-specific. It is RPM-specific, perhaps, but it is used in many distributions.

      1. Designed for one repository type? That is the same repository type (repomd) that is used by SUSE (and understood by apt-rpm, smart etc) as well. So you are still incorrect about that.

      2. > Designed for one repository type? That is the same repository type (repomd) that is used by SUSE (and understood by apt-rpm, smart etc) as well. So you are still incorrect about that.

        You write about one repository type as well. Why do you contradict yourself?

      1. Also, you seem unaware of how annoying that “feature” is.

        I suppose the correct link is snap.snap.com

        Block it with what? Adblock? I don’t have it.

  8. I would love to see debian or ubuntu embrace a simple apt replacement coded in C. Opkg looks like a good option, and APT2 sounds interesting.

  9. “You write about one repository type as well. Why do you contradict yourself?”

    I don’t. I was merely pointing out that yum and zypper actually use the same repository type and there is absolutely nothing about the repository metadata that is distribution-specific. You have to update your blog post to correct the misleading info.

  10. People (and I mean debian users) are compiling kpackagekit from source… Why? Because, whether you hate it or not, it is the best qt/kde-based GUI for package management that you can use in debian distros.

    Alternatives? The dead Adept? Oh please…

    1. For now, I am still working on a generic format (i.e. classes) to represent the dependencies. Afterwards, I will try to implement a basic solver and external ones.

      I have no plans to write an SAT solver, but the external ones may be SAT-based.
