Context

We are currently working on integrating a basic model-checker within SimGrid. Actually Cristian developed a version during his PhD, and we integrated it into SimGrid v3.5, one year ago, but it only allows to check local assertions. So Marion works during her PhD on allowing the verification of temporal properties too.

We call it model-checking because this expression is more common but there no model in our version actually: we work with real code implementations, not with abstract models. The expression "dynamic verification" is thus preferable (but less people understand it directly).

I like working on this because it is really challenging to achieve. Also because the offered feature will be tremendously interesting to the end users when it will work, but really, the challenge is really high and thus motivating. The theoretical issues are complex: in short, you want to reduce the amount of states visited to fight the exponential explosion of the state space while granting that you visited all /relevant/ states, so you have to write algorithms dealing with the state of a program and reasoning on the equivalence of code paths. But you cannot begin to work on these problems unless systems issues are solved: you need to control the running program and get as much information about it as you can, by all means.

Multiplying heaps per processes

The basics of the model-checking algorithm is to explore one possible execution path by letting the studied program take this path, reason on it, and then explore another execution path by /rewinding/ the studied path to its initial situation to take another path. In SimGrid, we do so by checkpointing the studied application at the initial path, and then restoring it. That's quite classical.

But since the model-checker and the studied application live in the same system process rewinding the application but not the model-checker induces some specific difficulties. We didn't manage to separate the model-checker and the studied application through /fork/ing because there is unavoidable threads in our settings, and threads and forks don't mix well together. We cannot remove the threads because this is what we are using to fold our distributed applications onto one machine only: each user process of the distributed application is contained into a simple thread. Actually, we can use other containers, such as ucontextes of System V Unixes or a reimplementation of us to reduce the amount of system calls induced by context switching, but it still doesn't mix well with forks. Believe me, we got strange failures when exploring that path.

So, our model-checker and the studied application both live in the same system process, and we want to get the model-checker rewinding the application to a previous point without being affected itself by the process. To that extend, we save the state of the studied application (its memory) and then restore it later on. From a system point of view, we need to save and restore all variable memory maps: the .data areas of both the program and the libraries (that's were the global data live), the stack and the heap (where malloc allocates memory). For the stacks, there is nothing specific to do since our multi-threading solution ensures that all stack live in the heap, actually: this is the only way to have more than one stack in your system process, and it's classically done for example for multi-threaded applications.

Likewise, it is normally impossible to have several heaps per process, but insisting on doing so anyway is less common than multi-stack processes (every multi-threaded application have several stacks). You cannot rely on the sbrk system call to get memory from the operating system, since sbrk can only deal with one area (check your preferred operating system lectures if you need a refresh -- my lectures in French can be found here). So in SimGrid, we went for mmalloc, that is a malloc implementation allowing to use mmap instead of sbrk in order to get more than one heap per application. It was once in the libc, and then in gdb and were also removed from here since then. AFAIK, SimGrid is the last (living) piece of code using mmalloc.

This is all neat and cool: it allows us to redirect all malloc in the user application to a given heap, and keep the ones made by the model-checker in another heap. When we want to rewind the application, we overwrite the first heap while keeping the second one unmodified. Ok, the performance are impacted as our implementation of malloc is not as optimized as the current libc's one, but It Does The Trick.

Motivation to malloc introspection

This was for "simple" model-checking, where only local assertions can be verified. To go further and allow the verification of temporal properties, we need to reason on the application state. Which requires to inspect it. In particular, state equivalence is very important in that context. At system level, one would think that memcmp is the way to go to decide whether 2 states are equivalent or not, but this results in too much false negative: there is a lot of little useless values that may differ between two states without inducing that these states are semantically different.

So we need to introduce a notion of state /distance/ instead of state equality, at least during the development of the tool. We need to understand why our tool decides that two states differ to ensure whether he's right or whether there is a bug to fix. To that extend, we decided to greatly increase the amount of meta-data per allocated chunk of memory.

First, we decided to keep track not only of the size of the allocated chunk, but also of the size that were requested by the user. This is because when you ask for X bytes, malloc returns a block of 2^N bytes, with N as small as possible. So when asking for 42 bytes, you'll get 64 bytes (it helps reducing the memory fragmentation). This is important to us because the remaining bytes are usually not initialized, so if you memcmp them, there is no reason why they should contain the same thing from one run to another. Keeping track of the requested block size allows to ignore those bytes.

Even so, when we detect a discrepancy between the blocks, it is near to impossible to understand what causes this difference. That is why we decided to store also the ... backtrace at which the block was malloc()ed (or lastly realloc()ed). Yeah, that is why I love system programming: when the facts don't fit your theory, you can always change the facts.

Capturing a backtrace in C is less complex than it seems. You "just" have to retrieve the return address of each stack frame, and then some tools such as addr2line allow to get the corresponding location in your source code. To retrieve that address, you can use the GCC builtin called __builtin_return_address, but it's not really user-friendly. Its argument (the level in the stack that you wish to explore) must be a constant integer, so forget about writing a loop to explore all levels here.

A better solution is to use the GNU libc extension backtrace(). It takes an array of pointers as argument, and fill it with all the return addresses, so that's exactly what we need. Life is good.

Add me to the list of Drepper's haters

The only problem is that the current GNU libc implementation of backtrace actually calls indirectly calls malloc() the first time it's called (although the documentation seems to imply that only backtrace_symbol() does, since backtrace_symbols_fd() is said to be malloc-free), so calling backtrace from malloc is not very wise.

Since I really wanted to call backtrace to malloc, I dig a bit further to see whether it were possible circumvent this limitation. Afterall, that's a regression of libc circa 2005, according to the internet. And I'm really astonished by what I've found.

First, there is this bug report about my problem, and that Ulrich Drepper closes with the sentence "live with it" as sole justification. And he closes the bug 3 times without providing any more informative justification... I thought that the issue were really grave to justify this level of aggressivity from the libc maintainer.

And then, the code simply reads:

/* Return backtrace of current program state.
   Copyright (C) 2003, 2004, 2005, 2007, 2009 Free Software Foundation, Inc.
   This file is part of the GNU C Library.
   Contributed by Jakub Jelinek <jakub@redhat.com>, 2003.

   The GNU C Library is free software; you can redistribute it and/or
   modify it under the terms of the GNU Lesser General Public
   License as published by the Free Software Foundation; either
   version 2.1 of the License, or (at your option) any later version.

   The GNU C Library is distributed in the hope that it will be useful,
   but WITHOUT ANY WARRANTY; without even the implied warranty of
   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
   Lesser General Public License for more details.

   You should have received a copy of the GNU Lesser General Public
   License along with the GNU C Library; if not, write to the Free
   Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA
   02111-1307 USA.  */

#include <bits/libc-lock.h>
#include <dlfcn.h>
#include <execinfo.h>
#include <stdlib.h>
#include <unwind.h>

struct trace_arg
{
  void **array;
  int cnt, size;
};

#ifdef SHARED
static _Unwind_Reason_Code (*unwind_backtrace) (_Unwind_Trace_Fn, void *);
static _Unwind_Ptr (*unwind_getip) (struct _Unwind_Context *);
static void *libgcc_handle;

static void
init (void)
{
  libgcc_handle = __libc_dlopen ("libgcc_s.so.1");

  if (libgcc_handle == NULL)
    return;

  unwind_backtrace = __libc_dlsym (libgcc_handle, "_Unwind_Backtrace");
  unwind_getip = __libc_dlsym (libgcc_handle, "_Unwind_GetIP");
  if (unwind_getip == NULL)
    unwind_backtrace = NULL;
}
#else
# define unwind_backtrace _Unwind_Backtrace
# define unwind_getip _Unwind_GetIP
#endif

static _Unwind_Reason_Code
backtrace_helper (struct _Unwind_Context *ctx, void *a)
{
  struct trace_arg *arg = a;

  /* We are first called with address in the __backtrace function.
     Skip it.  */
  if (arg->cnt != -1)
    {
      arg->array[arg->cnt] = (void *) unwind_getip (ctx);

      /* Check whether we make any progress.  */
      if (arg->cnt > 0 && arg->array[arg->cnt - 1] == arg->array[arg->cnt])
    return _URC_END_OF_STACK;
    }
  if (++arg->cnt == arg->size)
    return _URC_END_OF_STACK;
  return _URC_NO_REASON;
}

int
__backtrace (array, size)
     void **array;
     int size;
{
  struct trace_arg arg = { .array = array, .size = size, .cnt = -1 };
#ifdef SHARED
  __libc_once_define (static, once);

  __libc_once (once, init);
  if (unwind_backtrace == NULL)
    return 0;
#endif

  if (size >= 1)
    unwind_backtrace (backtrace_helper, &arg);

  /* _Unwind_Backtrace on IA-64 seems to put NULL address above
     _start.  Fix it up here.  */
  if (arg.cnt > 1 && arg.array[arg.cnt - 1] == NULL)
    --arg.cnt;
  return arg.cnt != -1 ? arg.cnt : 0;
}
weak_alias (__backtrace, backtrace)
libc_hidden_def (__backtrace)


#ifdef SHARED
/* Free all resources if necessary.  */
libc_freeres_fn (free_mem)
{
  unwind_backtrace = NULL;
  if (libgcc_handle != NULL)
    {
      __libc_dlclose (libgcc_handle);
      libgcc_handle = NULL;
    }
}
#endif

The only call in it that leads to malloc is the dlsym part, in the function init. It would be possible to issue a first call to backtrace() with a global flag mmalloc_inited set to 0, and then have mmalloc to call backtrace() on each malloc unless that flag is 0, but actually, there is a better solution. I "reimplemented" backtrace in my code the following way:

#include <unwind.h>
struct trace_arg {
  void **array;
  int cnt, size;
};

static _Unwind_Reason_Code
backtrace_helper (struct _Unwind_Context *ctx, void *a)
{
  struct trace_arg *arg = a;

  /* We are first called with address in the __backtrace function.
     Skip it.  */
  if (arg->cnt != -1)
    {
      arg->array[arg->cnt] = (void *) _Unwind_GetIP(ctx);

      /* Check whether we make any progress.  */
      if (arg->cnt > 0 && arg->array[arg->cnt - 1] == arg->array[arg->cnt])
        return _URC_END_OF_STACK;
    }
  if (++arg->cnt == arg->size)
    return _URC_END_OF_STACK;
  return _URC_NO_REASON;
}


int xbt_backtrace_no_malloc(void **array, int size) {
  struct trace_arg arg = { .array = array, .size = size, .cnt = -1 };

  if (size >= 1)
    _Unwind_Backtrace(backtrace_helper, &arg);

  /* _Unwind_Backtrace on IA-64 seems to put NULL address above
     _start.  Fix it up here.  */
  if (arg.cnt > 1 && arg.array[arg.cnt - 1] == NULL)
    --arg.cnt;
  return arg.cnt != -1 ? arg.cnt : 0;
}

So, the only difference is that instead of calling the gcc functions Unwind_GetIP and Unwind_Backtrace through a call to dlsym, I call them directly. I'm sure there is a good reason, but my feeling is that backtrace relies on malloc (inducing a clear regression) only to change a compile-time dependency on gcc into a run-time dependency. In other words, this regression were considered a lesser pain than updating the configuration scripts of the libc... I really hope there is a real reason, but if not, please add me to the (long) list of Drepper's haters. That's just impossible: gcc is already tested at configure time. There is something I fail to understand here. Too bad that Drepper is the way he is, I won't even try to ask him for some guidance.

Now, I have to prepare a patch and submit it to eglibc instead. Yet more todo :-/