Generic Graphml Printer for OcamlGraph

Graphml is a nice and widely used graph description format. This is a micro module to print OcamlGraph - graphs in this format.

The signature is minimal. Since in GraphMl all attributes are typed, we only need two functions to describe the name, type a default value for the attributes of each vertex and edge, and two functions to map the value of each vertex and edge to a key / value list.

module type GraphmlSig =
  sig
    include Graph.Sig.G
    (** the format is (key, type of the key, default value *)
    val default_vertex_properties : (string * string * string option) list
    val default_edge_properties : (string * string * string option) list

    (** the format is (key, value *)
    val data_map_vertex : vertex -> (string * string) list
    val data_map_edge : edge -> (string * string) list
  end
;;

module type GraphmlPrinterSig =
  sig
    type t
    val pp_graph : Format.formatter -> t -> unit
    val to_file : t -> string -> unit
  end

To give a small example, we build a simple graph with three vertex and two edges . In this case we only print the id of the node.

open Graphml

module V = struct
  type t = int
  let compare = compare
  let hash i = i
  let equal = (=)
end

module G = Graph.Imperative.Digraph.ConcreteBidirectional(V)

module Gr = struct
  include G
  let default_vertex_properties = ["id","string",None]
  let default_edge_properties = []
  let data_map_edge e = []
  let data_map_vertex v = ["id",string_of_int v]
end

module GraphPrinter = GraphmlPrinter (Gr) ;;

let print g = GraphPrinter.pp_graph Format.std_formatter g ;;

let g = G.create () in
G.add_vertex g 1 ;
G.add_vertex g 2 ;
G.add_vertex g 3 ;
G.add_edge g 1 2 ;
G.add_edge g 1 3 ;
print g;;

Use use ocamlbuild to compile the lot.

$ocamlbuild -use-ocamlfind -package ocamlgraph test.native

The result looks like this. I agree the formatting is not perfect …

<?xml version="1.0" encoding="UTF-8"?>
<graphml xmlns="http://graphml.graphdrawing.org/xmlns"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://graphml.graphdrawing.org/xmlns
     http://graphml.graphdrawing.org/xmlns/1.0/graphml.xsd">
<key id="id" for="node" attr.name="id" attr.type="string">

<graph id="G" edgedefault="directed">
<node id="n1">
 <data key="id">1</data> </node>
 <node id="n2">
 <data key="id">2</data>
 </node>
 <node id="n3">
 <data key="id">3</data>
 </node>

<edge id="e131199" source="n1" target="n2">
 </edge>
 <edge id="e196798" source="n1" target="n3">
 </edge>

</graph></graphml>

Using GraphTool, you can easily access a zillion more algorithms from the boost graph library. To be sincere, GraphTool accepts also graphs in dot and gml format. Graphml is the default format for GraphTool as it contains precise type information.

#! /usr/bin/env python
from graph_tool.all import *

g = load_graph("ex.xml")
graph_draw(g, pos=None, output_size=(420, 420), output="ex.pdf")

Update

A refined version of the module Graphml is going to be included in the next release of ocamlgraph ! The tgz is attached to this message.

Example


dose has a new git repository and mailing list !

Recently we did a bit of clean up of our git repositories and now thanks to roberto’s efforts we have a new shiny git repository on the inria forge and two mailing lists to discuss development and user questions.

If you are a user, or interested in dose development, please sign up to these mailing lists:

  • dose-discus :http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/dose-discuss
  • dose-devel : http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/dose-devel

if you already have a copy of the latest git repository, you can change the upstream repository issuing the following command :

     git config remote.origin.url git://gforge.inria.fr/dose/dose.git

If you are curious, you can clone dose git clone git://gforge.inria.fr/dose/dose.git and let us know what you think about it.

The API documentation is available here. The man pages or various applications developed on top of the dose library are available here. We are still actively working on the documentation and any contribution is very much welcome. I’m working on a nice homepage…

You can get the latest tar ball release here : https://gforge.inria.fr/frs/?group_id=4395 Old releases will be left on the old forge.

Update

And now we are even social ! follow us on identi.ca : http://identi.ca/group/dose3


two simple tips to speed up ocaml compilation

Date Tags ocaml

The dose3 code base is getting larger and larger and waiting 40 secs each time I want to recompile the library is not acceptable anymore. Recently I tried to understand why the ocaml compiler was so slow .

To compile dose, I use a combination of ocamlbuild, ocamlfind, camlp4 and Makefile to drive everything.

The first important problem was related to my naive use of camlp4. I extensively use the Camlp4MacroParser module for the conditional compilation of certain features and actually I use camlp4 to process all my source code. The bottom line her is that using the byte code version of camlp4o is definitely a bad idea.

To use the native binary you need to change the pre-processing directive in the file _tags :

<*/*.ml{i,}>: pp(camlp4o.opt -I_build Camlp4MacroParser.cmxs)

This will use the opt version of camlp4 and the cmxs instead of the .cma. The next problem is that debian, for some reason (I guess lack of time, more then anything else), does not ship the cmxs libraries of the Camlp4MacroParser. In other to make this available, I added a simple compilation rule in my make file to create the file Camlp4MacroParser.cmxs in the _build directory.

 ocamlopt -shared $(shell ocamlc -where)/camlp4/Camlp4Parsers/Camlp4MacroParser.cmx -o _build/Camlp4MacroParser.cmxs

The second BIG problem I solved yesterday was related to the tag traverse. Since I use git as VCS, every invocation to ocamlbuild was triggering a recursive traverse of all the .git directory with a considerable waste of time.

To solve this problem, I’ve reverted the default ocamlbuild and explicitly add only those directory that I’m really interested to traverse.

true: -traverse
<{common,algo,doseparse,deb,rpm,applications,experimental}/**>: traverse

This is actually needed not to compile the code itself, but to generate the documentation with ocamldoc.


Finding all the elementary circuits of a directed graph

Date Tags ocaml

Below you can find an implementation in ocaml of the algorithm described in this paper by D. B. Johnson

Finding all the elementary circuits of a directed graph. D. B. Johnson, SIAM Journal on Computing 4, no. 1, 77-84, 1975. http://dx.doi.org/10.1137/0204007

the idea of this algorithm is to enumerate all cycles in a strongly connected component. Next step will be to implement the [http://en.wikipedia.org/wiki/Feedback_arc_set “feedback arc set”] of this connected component so to find the optimal way to break these loops in a way that the connected component can be partitioned in smaller DAGs.

This is only the main function, you can find everything else in the git repo given below. If I’ve time I’ll also give a shot to implement tarjan algorithm and to compare the two… Johnson is supposed to be better complexity wise, but I’m curious to check the difference on my domain specific input.

let find_all_cycles_johnson g =
  if not G.is_directed then
    assert false;

  (*  stack of nodes in current path *)
  let path = Stack.create () in

  (* main function. return (true, acc) if in the last iteration we find a new cycle *)
  let rec circuit t result thisnode startnode component =

   (* add the node the candate cycles and block it *)
    Stack.push thisnode path;
    block t thisnode;

    let (closed,result) =
      G.fold_succ (fun nextnode (c,r) ->
        if G.V.equal nextnode startnode then
         (* this is a cycle ! *)
          (true,(Stack.copy path)::r)
        else begin
          if not(is_bloked t nextnode) then
            circuit t r nextnode startnode component
          else
            (c,r)
        end
      ) component thisnode (false,result)
    in
    if closed then
      (* this node is part of a cycles, but we don't want to exclude it from subsequent
          iterations. Longer cycles can contain smaller cycles *)
      unblock t thisnode
    else
      (* since it is not part of a cycles, then it must be part of a portion of the graph that
          does not contain any elementary circuits *)
      G.iter_succ (fun nextnode ->
        let l = get_notelem t nextnode in
        if List.mem thisnode l then
          Hashtbl.replace t.notelem nextnode (thisnode::l)
      ) component thisnode;

    ignore(Stack.pop path);
    (closed,result)
  in

  (* Johnson's algorithm requires some ordering of the nodes. 
      We use the order induced by the compare function associated to the
      vertex type. See ocamlgraph Pack.Diagraph  *)
  let vertex_set = G.fold_vertex SV.add g SV.empty in

  SV.fold (fun s result ->
    (* Build the subgraph induced by s and following nodes in the ordering *)
    let subset = SV.add s (partition vertex_set s) in
    let subgraph = extract_subgraph g subset in

    if G.nb_edges subgraph > 0 then begin
      (* Find the strongly connected component in the subgraph
       * that contains the least node according to the ordering *)
      let scc = G.Components.scc_list subgraph in
      let minnode = SV.min_elt subset in
      let mincomp = List.find (fun l -> List.mem minnode l) scc in

      (* smallest node in the component according to the ordering *)
      let startnode = minnode in
      let component = extract_subgraph subgraph (to_set mincomp) in

      (* init the block table for this component *)
      let t = init_block component in

      snd(circuit t result startnode startnode component);
    end else
      result
  ) vertex_set []
;;

You can get the code here : git clone http://mancoosi.org/~abate/repos/cycles.git . There are two versions of the algorithm. One is a translation in ocaml of the imperative algorithm, the other, presented above, adds a bit of functional flavor …

Update 1

the code repository now contains also two implementations of approximate algorithms the feedback arc set problem

Update 2

Since the number of cycles in a graph can be very large, I’ve added a lazy version of the algorithm using ExtLib enums (but can be easily done without). It would be cool to avoid using a global data structure to keep track of which nodes I’ve already visited like the persistent array of filliatr … the code could be further optimized of course as many operations on lists and set can be avoided with a smarter DS to build the connected component of the subgraph at each iteration … Complete code in the git repo above.

let find_all_cycles_enum g =
  let rec circuit path t thisnode startnode component =                                        
     let rec aux = function                                                                    
       |[] -> Enum.empty ()                                                                    
       |nextnode :: rest -> 
         if G.V.equal nextnode startnode then begin                                            
           let e = aux rest in                                                                 
           unblock t thisnode;
           Enum.push e (List.rev (thisnode::path));                                            
           e 
         end else
           if not(is_bloked t nextnode) then begin
             let e = circuit (thisnode::path) t nextnode startnode component in                
             Enum.append e (aux rest)                                                          
           end else begin                                                                      
             aux rest                                                                          
           end                                                                                 
     in
     block t thisnode;
     let e = aux (G.succ component thisnode) in                                                
     G.iter_succ (fun nextnode ->
       let l = get_notelem t nextnode in                                                       
       if List.mem thisnode l then 
         Hashtbl.replace t.notelem nextnode (thisnode::l)                                      
     ) component thisnode;                                                                     
     e                                                                                         
  in                                                                                           

  let vertex_set = G.fold_vertex SV.add g SV.empty in                                          
  let rec aux = function                                                                       
    |[] -> Enum.empty ()                                                                       
    |s :: rest ->  
      let subset = SV.add s (partition vertex_set s) in                                        
      let subgraph = extract_subgraph g subset in                                              
      (* I need only one... not all scc *)
      (* actually I need only the scc that contains the min *)                                 
      let scc = G.Components.scc_list subgraph in                                              
      let minnode = SV.min_elt subset in
      let mincomp = List.find (fun l -> List.mem minnode l) scc in                             
      let startnode = minnode in
      let component = extract_subgraph subgraph (to_set mincomp) in                            
      let t = init_block component in
      let partial = circuit [] t startnode startnode component in                              
      Enum.append partial (aux rest)                                                           
  in
  aux (SV.elements vertex_set)                                                                 
;;

ocamlbuild stubs and dynamic libraries

Date Tags ocaml

The other day I wrote about my experience to set up the build system for a simple library. However, since parmap includes only two simple stubs to syscall I didn’t have the chance to talk how to convince ocamlbuild to build stubs that depend on an external dynamic library.

I’m sure, facing this problem you had a look at this example : http://brion.inria.fr/gallium/index.php/Ocamlbuild_example_with_C_stubs

Despite providing all elements, the page is a bit scars of details and explanations (at least for me…).

So, suppose you want to build ocaml stubs for a C library called toto . Good practices tell us to put our stubs in a file that is called toto_stubs.c and then add the .o file in a clib file (libtoto_stubs.clib ) that ocamlbuild is going to use to build out dynamic library.

So far so good. Now we need to add a tag, say use_toto that we will use to compile our binaries and libraries. Our _tags file will look like :

<libtoto_stubs.*>: use_toto

Here I use only one tag. In the example of the ocamlbuild they use two tags, one to compile, one to link.

At this point we need to explain to ocamlbuild how to build our library. First we add a compile rule where we say that whenever we compile a c object that use toto, then we must also add its include directory.

       flag ["c"; "use_toto"; "compile"] & S[
         A"-ccopt"; A"-I/usr/include/toto";
       ];

Second we add a link flag to add to the dynamic library all the information it needs to load other external libraries. This is important as we don’t want to add any other flags anywhere else. When we use -ltoto_stubs we want all other dependencies automatically satisfied by the linker. Note the libtoto.so referred by -ltoto is the C library for which we are writing these bindings and that sits in /usr/lib/.

       flag ["c"; "use_toto"; "ocamlmklib"] & S[
         A"-ltoto";
       ];

At the end we add a flag that whenever we try to build out ocaml module ( say toto.cma ), we want to add all necessary information to load at run time its dynamic component.

       flag ["ocaml"; "use_toto"; "link"; "library"; "byte"] & S[
         A"-dllib"; A"-ltoto_stubs";
       ];

Using ocamlobjdump we can easily check if the cma contains indeed this information. The output should look something like this :

$ocamlobjinfo _build/toto.cma
File _build/toto.cma
Force custom: no
Extra C object files:
Extra C options:
Extra dynamically-loaded libraries: -ltoto_stubs
Unit name: Toto
Force link: no
...

In order to generate cmxa and a objects we need to specify few other flags and dependencies like :

       dep ["link"; "ocaml"; "link_toto"] ["libtoto_stubs.a"]
       flag ["ocaml"; "use_toto"; "link"; "library"; "native"] & S[
         A"-cclib"; A"-ltoto_stubs";
       ];

As always, if I’m missing something, please drop a note in the comment page.