Managing the complexity of component-based systems

I recently gave this talk at RSISE/NICTA at the Australian National University in Canberra.

http://cecs.anu.edu.au/seminars/more/SID/2990

If anybody at linux.conf.au is interested in a chat about these topics, I should be around the entire week. Please drop me a note or look for me on IRC.

ABSTRACT


Free and Open Source Software (FOSS) distributions are rather peculiar instances of component-based software platforms. They are developed rapidly and without tight central coordination, they are huge (tens to thousands of components per platform), and their importance in the Internet computing infrastructure is growing.

Both the construction of a coherent collection of components and the maintenance of installations based on these raise difficult problems for distribution maintainers and system administrators. Distributions evolve rapidly by releasing new component versions and strive for increasingly high Quality Assurance (QA) requirements on their component collections. System upgrades may proceed on different paths depending on the current state of the system and the available components, and system administrators are faced with difficult choices of upgrade paths and with frequent upgrade failures.

The now concluded project MANCOOSI (Managing the Complexity of the Open Source Infrastructure) aimed to solve some of these problems. I will describe current and past work done in the context of MANCOOSI and some future directions.


ocamlbuild stubs and dynamic libraries

Tags: ocaml

The other day I wrote about my experience setting up the build system for a simple library. However, since parmap includes only two simple stubs for system calls, I didn’t have the chance to talk about how to convince ocamlbuild to build stubs that depend on an external dynamic library.

I’m sure that, facing this problem, you have had a look at this example: http://brion.inria.fr/gallium/index.php/Ocamlbuild_example_with_C_stubs

Although it provides all the elements, the page is a bit scarce on details and explanations (at least for me…).

So, suppose you want to build OCaml stubs for a C library called toto. Good practice tells us to put our stubs in a file called toto_stubs.c and then list the corresponding .o file in a clib file (libtoto_stubs.clib) that ocamlbuild is going to use to build our dynamic library.

So far so good. Now we need to add a tag, say use_toto that we will use to compile our binaries and libraries. Our _tags file will look like :

<libtoto_stubs.*>: use_toto

Here I use only one tag. The ocamlbuild example linked above uses two tags, one to compile and one to link.

At this point we need to explain to ocamlbuild how to build our library. First we add a compile rule saying that whenever we compile a C object that uses toto, we must also add its include directory.

       flag ["c"; "use_toto"; "compile"] & S[
         A"-ccopt"; A"-I/usr/include/toto";
       ];

Second, we add a link flag so that the dynamic library records everything it needs to load the other external libraries. This is important because we don’t want to add any other flags anywhere else: when we use -ltoto_stubs, we want all other dependencies to be satisfied automatically by the linker. Note that the libtoto.so referred to by -ltoto is the C library for which we are writing these bindings, and that it sits in /usr/lib/.

       flag ["c"; "use_toto"; "ocamlmklib"] & S[
         A"-ltoto";
       ];

Finally, we add a flag so that whenever we build our OCaml library (say toto.cma), it records all the information needed to load its dynamic component at run time.

       flag ["ocaml"; "use_toto"; "link"; "library"; "byte"] & S[
         A"-dllib"; A"-ltoto_stubs";
       ];

Using ocamlobjinfo we can easily check whether the cma indeed contains this information. The output should look something like this:

$ ocamlobjinfo _build/toto.cma
File _build/toto.cma
Force custom: no
Extra C object files:
Extra C options:
Extra dynamically-loaded libraries: -ltoto_stubs
Unit name: Toto
Force link: no
...

In order to generate the cmxa and .a files we need to specify a few other flags and dependencies, such as:

       dep ["link"; "ocaml"; "link_toto"] ["libtoto_stubs.a"];
       flag ["ocaml"; "use_toto"; "link"; "library"; "native"] & S[
         A"-cclib"; A"-ltoto_stubs";
       ];

As always, if I’m missing something, please leave a note in the comments.


Parse French dates on a en_US machine

Tags: python

Imagine you work in France, but you are really fond of your good old en_US locale. I’m sure that one day you will inevitably face the task of using Python to play with some French text. I just found out that this couldn’t be easier: you just need to create and set the correct locale for your Python script, and voilà!

In this case I need to parse a French date to build an ical file. First, if you haven’t already done it for other reasons, you should rebuild your locales and select a French locale, for example fr_FR.UTF-8.

On Debian, this is just one command away: sudo dpkg-reconfigure locales

Now you are ready to play:

import locale, datetime
# LC_TIME controls the day and month names that strptime recognises.
#locale.setlocale(locale.LC_TIME, 'fr_FR.ISO-8859-1')
locale.setlocale(locale.LC_TIME, 'fr_FR.UTF-8')

date_from = "Dimanche 3 Juin 2012"
DATETIME_FORMAT = "%A %d %B %Y"
d = datetime.datetime.strptime(date_from, DATETIME_FORMAT)
print(d)
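
One caveat: locale.setlocale changes the locale for the whole process, so the rest of the script would see French names too. If that is not what you want, the switch can be limited to the parsing step. Here is a minimal sketch, assuming Python 3; the helper names temporary_locale and parse_localized_date are mine and not part of any library.

```python
import contextlib
import datetime
import locale


@contextlib.contextmanager
def temporary_locale(category, name):
    """Switch `category` to the locale `name`, then restore the old one."""
    previous = locale.setlocale(category)  # no second argument: query only
    locale.setlocale(category, name)
    try:
        yield
    finally:
        locale.setlocale(category, previous)


def parse_localized_date(text, fmt, locale_name):
    """Parse `text` with strptime while LC_TIME is set to `locale_name`."""
    with temporary_locale(locale.LC_TIME, locale_name):
        return datetime.datetime.strptime(text, fmt)
```

With the French locale generated as described above, parse_localized_date("Dimanche 3 Juin 2012", "%A %d %B %Y", "fr_FR.UTF-8") should give the same result as the snippet, and LC_TIME is back to its previous value afterwards. If the locale has not been generated, setlocale raises locale.Error.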

Update

If you want to set the date for a particular time zone, this is equally easy once you discover how to do it with the dateutil library. At the end of the previous snippet add:

from dateutil.tz import gettz
d = d.replace(tzinfo=gettz('Europe/Paris'))
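
On Python 3.9 or later the standard library's zoneinfo module can play the same role as dateutil here. The sketch below is a swapped-in alternative, not what I used in the script; the function name in_paris is hypothetical, and it assumes the system time zone database is installed.

```python
import datetime
from zoneinfo import ZoneInfo


def in_paris(naive):
    """Attach the Europe/Paris time zone to a naive datetime.

    The wall-clock time is kept as it is; only tzinfo is added, exactly
    like the replace() call used with dateutil.
    """
    return naive.replace(tzinfo=ZoneInfo("Europe/Paris"))
```

The UTC offset then follows the date: in_paris(datetime.datetime(2012, 6, 3, 15)).utcoffset() is two hours (summer time), while the same call for a date in January gives one hour.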

This is the script I was working on. It uses the vobject library to generate ical files and itertools.groupby to parse the input file.

import datetime
import locale
import re
from itertools import groupby

import vobject
from dateutil.tz import gettz

# strptime relies on LC_TIME to recognise French day and month names.
locale.setlocale(locale.LC_TIME, 'fr_FR.UTF-8')

def test(line):
    # An event starts with a date line such as "Dimanche 6 Mai 2012".
    return re.match(r"^Dimanche.*\n$", line) is not None

l = []
with open("example", encoding="utf-8") as f:
    # groupby yields, in turn, a run of date lines (key is True) and
    # the run of text lines that follows it (key is False).
    for key, group in groupby(f, test):
        if key:
            a = list(group)
        else:
            l.append(a + list(group))

# The trailing space matches the whitespace at the end of the date line.
DATETIME_FORMAT = "%A %d %B %Y "

cal = vobject.iCalendar()

for ev in l:
    date_from = ev[0]
    d = datetime.datetime.strptime(date_from, DATETIME_FORMAT)
    d = d.replace(tzinfo=gettz('Europe/Paris'))

    vevent = cal.add('vevent')
    vevent.add('categories').value = ["test category"]
    vevent.add('dtstart').value = d.replace(hour=15)
    vevent.add('dtend').value = d.replace(hour=18)
    vevent.add('summary').value = "Test event"
    vevent.add('description').value = " ".join(ev[1:])

icalstream = cal.serialize()
print(icalstream)
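
The grouping step is the only slightly tricky part of the script, so it may help to try it in isolation on a list of lines. This is a sketch under my own assumptions: split_events is a hypothetical helper, and unlike the loop in the script it also keeps a date line that has no text after it.

```python
from itertools import groupby


def split_events(lines, is_header):
    """Split `lines` into records, each starting with a header line.

    groupby yields alternating runs of header lines (key is True) and
    body lines (key is False). Every header opens a new record and the
    body lines that follow are appended to the most recent one. Lines
    that appear before the first header are dropped.
    """
    events = []
    for key, group in groupby(lines, is_header):
        if key:
            events.extend([header] for header in group)
        elif events:
            events[-1].extend(group)
    return events
```

Called with the lines of the input file and a predicate such as lambda line: line.startswith("Dimanche"), it returns one list per event, with the date line first.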

Input :

Dimanche 6 Mai 2012 

- text text
- more text

Dimanche 13 Mai 2012 

- text text
- more text

Dimanche 3 Juin 2012 

- text text
- more text

Dimanche 10 Juin 2012 

- text text
- more text

Output

BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//PYVOBJECT//NONSGML Version 1//EN
BEGIN:VTIMEZONE
TZID:CET
BEGIN:STANDARD
DTSTART:20001029T030000
RRULE:FREQ=YEARLY;BYDAY=-1SU;BYMONTH=10
TZNAME:CET
TZOFFSETFROM:+0200
TZOFFSETTO:+0100
END:STANDARD
BEGIN:DAYLIGHT
DTSTART:20000326T020000
RRULE:FREQ=YEARLY;BYDAY=-1SU;BYMONTH=3
TZNAME:CEST
TZOFFSETFROM:+0100
TZOFFSETTO:+0200
END:DAYLIGHT
END:VTIMEZONE
BEGIN:VEVENT
UID:20111123T133829Z-53948@zed
DTSTART;TZID=CET:20120506T150000
DTEND;TZID=CET:20120506T180000
CATEGORIES:test category
DESCRIPTION:        \n - text text\n - more text\n  \n
SUMMARY:Test event
END:VEVENT
BEGIN:VEVENT
UID:20111123T133829Z-19906@zed
DTSTART;TZID=CET:20120513T150000
DTEND;TZID=CET:20120513T180000
CATEGORIES:test category
DESCRIPTION:        \n - text text\n - more text\n  \n
SUMMARY:Test event
END:VEVENT
BEGIN:VEVENT
UID:20111123T133829Z-70980@zed
DTSTART;TZID=CET:20120603T150000
DTEND;TZID=CET:20120603T180000
CATEGORIES:test category
DESCRIPTION:        \n - text text\n - more text\n  \n
SUMMARY:Test event
END:VEVENT
BEGIN:VEVENT
UID:20111123T133829Z-44400@zed
DTSTART;TZID=CET:20120610T150000
DTEND;TZID=CET:20120610T180000
CATEGORIES:test category
DESCRIPTION:        \n - text text\n - more text\n \n
SUMMARY:Test event
END:VEVENT
END:VCALENDAR

ocamlbuild and C stubs

Tags: ocaml

Today I struggled once again to build a simple library with ocamlbuild, so I decided to write something about it once and for all. I’m sure that next time, googling for an answer, I’ll find this post and shake my head in despair :)

The library in question is parmap, written by Roberto Di Cosmo to speed up computations on modern multi-processor computers. We want to build everything: cma, cmxa and cmxs. Moreover, we want to build a shared library that contains stubs for a couple of bindings to C functions. On top of that, I want to use configure to make sure that the platform I’m using supports a specific syscall.

We start by copying ocaml.m4, the more or less standard autoconf macros for OCaml ( source here ). I prefer to keep a copy of this file in my source tree, as I don’t want to force anybody to download it from a website in order to compile my library. This file ends up in the directory m4. Then, to use it, I need to invoke aclocal as follows: aclocal -I m4. This takes care of making the m4 macros known to the configure script.

The next step is the configure script. I need to check for the standard OCaml tools, whether extlib is known to ocamlfind, and whether the function sched_setaffinity is available on the system. The AC_CHECK_DECLS macro is the standard autoconf way to detect whether a specific function is declared. Together with the AC_CONFIG_HEADERS([config.h]) call, this will define the variable HAVE_DECL_SCHED_SETAFFINITY to 1 in the file config.h, which can later be used in the C source code.

AC_INIT(parmap, 0.9.4, roberto@dicosmo.org)

AC_PROG_OCAML
if test "$OCAMLC" = "no"; then
 AC_MSG_ERROR([You must install the OCaml compiler])
fi

AC_PROG_CAMLP4
AC_SUBST(CAMLP4O)
if test "$CAMLP4" = "no"; then
 AC_MSG_ERROR([You must install the Camlp4 pre-processor])
fi

AC_PROG_FINDLIB
AC_SUBST(OCAMLFIND)
if test "$OCAMLFIND" = "no"; then
 AC_MSG_ERROR([You must install OCaml findlib (the ocamlfind command)])
fi

AC_CHECK_OCAML_PKG([extlib])
if test "$OCAML_PKG_extlib" = "no"; then
 AC_MSG_ERROR([Please install OCaml findlib module 'extlib'.])
fi

AC_HEADER_STDC
AC_CHECK_HEADERS([sched.h],,AC_MSG_ERROR([missing sched.h]))
AC_CHECK_DECLS([sched_setaffinity], [], [], [[
                #define _GNU_SOURCE 1
                #include <sched.h>
                ]])

AC_CONFIG_HEADERS([config.h])
AC_CONFIG_FILES([Makefile])
AC_OUTPUT

Once this is done, I have to generate config.h.in with autoreconf and then run autoconf to generate my configure script. If this goes through, you should be able to run your configure script as usual. So far so good. Now it is time to convince ocamlbuild to play nice with us.

First we need to write a small myocamlbuild.ml file to set a few dependencies and compilation flags. The first rule we add is a dependency to convince ocamlbuild to copy the config.h file into the _build directory. Without this rule, ocamlbuild is unable to figure out the dependency just by looking at the C source file (I’ve found a bug report, http://caml.inria.fr/mantis/view.php?id=5107, on the INRIA bug tracker about this problem).

The second rule adds a couple of compilation options for our C code. I think this can be done on a per-file basis, but since these options do no harm, I use them to compile all my C objects.

The third rule is there to make ocamlbuild aware that if I want to compile a library that uses libparmap, I must specify that libparmap is a dynamically linked library. The fourth rule is the analogous rule for native libraries.

The fifth and sixth rules are there to build the .a file and then to link it. This is needed to correctly generate cmxs objects.

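open Ocamlbuild_plugin ;;

let _ = dispatch begin function
  | After_rules ->
      (* 1. copy config.h into _build before compiling any C file *)
      dep  ["compile"; "c"] ["config.h"];

      (* 2. options passed to the C compiler for every C object *)
      flag ["compile"; "c"] & S[ A"-ccopt"; A"-D_GNU_SOURCE"; A"-ccopt"; A"-fPIC" ];

      (* 3. and 4. record the stubs library in bytecode and native libraries *)
      flag ["link"; "library"; "ocaml"; "byte"; "use_libparmap"] &
        S[A"-dllib"; A"-lparmap_stubs";];
      flag ["link"; "library"; "ocaml"; "native"; "use_libparmap"] &
        S[A"-cclib"; A"-lparmap_stubs"];

      (* 5. and 6. build libparmap_stubs.a and link it (needed for cmxs) *)
      dep ["link"; "ocaml"; "use_libparmap"] ["libparmap_stubs.a"];
      flag ["link"; "ocaml"; "link_libparmap"] (A"libparmap_stubs.a");

  | _ -> ()
end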

But of course this is only part of the ocamlbuild configuration. We also need to specify how to build these libraries and what to include. To build the stubs library we write a .clib file in which we tell ocamlbuild which C objects it has to link together.

This is accomplished by adding the file libparmap_stubs.clib:

$ cat libparmap_stubs.clib
bytearray_stubs.o
setcore_stubs.o

To build parmap.cm{x,}a and parmap.cmxs I need two other files, respectively:

$ cat parmap.mllib
Parmap
Bytearray

$ cat parmap.mldylib
Parmap
Bytearray

Finally, we write the _tags file to associate the correct flags with each component.

$ cat _tags
<*>: annot
<parmap.cm{x,}a>: use_libparmap
<parmap.cmxs>: link_libparmap
<*.{ml,mli}>: package(extlib), package(unix), package(bigarray)

This should build all your goodies in one go:

$ ocamlbuild -use-ocamlfind parmap.cma parmap.cmxa parmap.cmxs parmap.a -classic-display
/usr/bin/ocamlfind ocamlopt -I /usr/lib/ocaml/ocamlbuild unix.cmxa /usr/lib/ocaml/ocamlbuild/ocamlbuildlib.cmxa myocamlbuild.ml /usr/lib/ocaml/ocamlbuild/ocamlbuild.cmx -o myocamlbuild
/usr/bin/ocamlfind ocamlc -ccopt -D_GNU_SOURCE -ccopt -fPIC -c bytearray_stubs.c
/usr/bin/ocamlfind ocamlc -ccopt -D_GNU_SOURCE -ccopt -fPIC -c setcore_stubs.c
/usr/bin/ocamlmklib -o parmap_stubs bytearray_stubs.o setcore_stubs.o
/usr/bin/ocamlfind ocamldep -package bigarray -package extlib -package unix -modules parmap.mli > parmap.mli.depends
/usr/bin/ocamlfind ocamlc -c -annot -package bigarray -package extlib -package unix -o parmap.cmi parmap.mli
/usr/bin/ocamlfind ocamldep -package bigarray -package extlib -package unix -modules parmap.ml > parmap.ml.depends
/usr/bin/ocamlfind ocamldep -package bigarray -package extlib -package unix -modules bytearray.mli > bytearray.mli.depends
/usr/bin/ocamlfind ocamldep -package bigarray -package extlib -package unix -modules setcore.mli > setcore.mli.depends
/usr/bin/ocamlfind ocamlc -c -annot -package bigarray -package extlib -package unix -o bytearray.cmi bytearray.mli
/usr/bin/ocamlfind ocamlc -c -annot -package bigarray -package extlib -package unix -o setcore.cmi setcore.mli
/usr/bin/ocamlfind ocamldep -package bigarray -package extlib -package unix -modules bytearray.ml > bytearray.ml.depends
/usr/bin/ocamlfind ocamlc -c -annot -package bigarray -package extlib -package unix -o parmap.cmo parmap.ml
/usr/bin/ocamlfind ocamlc -c -annot -package bigarray -package extlib -package unix -o bytearray.cmo bytearray.ml
/usr/bin/ocamlfind ocamlc -a -dllib -lparmap_stubs bytearray.cmo parmap.cmo -o parmap.cma
/usr/bin/ocamlfind ocamlopt -c -annot -package bigarray -package extlib -package unix -o bytearray.cmx bytearray.ml
/usr/bin/ocamlfind ocamlopt -c -annot -package bigarray -package extlib -package unix -o parmap.cmx parmap.ml
/usr/bin/ocamlfind ocamlopt -a -cclib -lparmap_stubs bytearray.cmx parmap.cmx -o parmap.cmxa
/usr/bin/ocamlfind ocamlopt -shared libparmap_stubs.a bytearray.cmx parmap.cmx -o parmap.cmxs

Relevant files: http://gitorious.org/parmap/parmap/blobs/pipes/myocamlbuild.ml http://gitorious.org/parmap/parmap/blobs/pipes/configure.ac http://gitorious.org/parmap/parmap/blobs/pipes/Makefile.in  http://gitorious.org/parmap/parmap/blobs/pipes/_tags

All the source code is in the parmap Git repository, in the pipes branch. If I’m doing something wrong, or not in the standard way, please tell me!