Blog of Julian Andres Klode

Debian, Ubuntu, Linux in general, and other free software

simple code – clang creates 1600x faster executable than gcc

The following program runs 500 times faster when compiled with clang 1.1 than when compiled with gcc 4.5 (in both cases with -O2):

#include <stdio.h>

#define len 1000000000L

unsigned long f(unsigned long a, unsigned long b) __attribute__((noinline));

int main(void)
{
    printf("%lu\n", f(0, 2*len));
    return 0;
}

/* Sum all integers in the half-open range [a, b). */
unsigned long f(unsigned long a, unsigned long b)
{
    unsigned long sum = 0;
    for (; a < b; a++)
        sum += a;
    return sum;
}
I would like to know what is happening here. I looked at the assembly code both compilers generate, but all I have found so far is that gcc's output is easier to follow: 50 lines (gcc) vs 134 lines (clang). If someone knows the answer, please tell me.

Also see for a C++ version that calls f() via boost::thread.

Update: I also reported a bug at

Written by Julian Andres Klode

October 26, 2010 at 15:58

Posted in General

3 Responses


  1. It’s easy; clang knows that sum(i, i=[a,b>) is (b – a)(a + b – 1)/2, gcc doesn’t. I’m not sure whether it’s a very relevant optimization, though, as it’s not going to trigger on real code unless it’s more generic than I think it is.

    /* Steinar */

    Steinar H. Gunderson

    October 27, 2010 at 21:27

  2. Which is right: the title or the text? Is it 1600x or 500x?

    Marius Gedminas

    October 28, 2010 at 13:28
