ExtLib OptParse module. Options parsing made easy and clean

Date Tags ocaml

I recently discovered the ExtLib OptParse module [1]. It’s a very nice and complete replacement for the good old Arg in the standard library. I’m gonna give a small example of how to use it. I hope this can be useful to somebody.

I first build an Options module to clearly separate the option handling from the rest of my program. To keep it short we add only two options, debug and out. Debug is a boolean flag with a short and a long name; out has only a long name and takes a string argument. We also add two option groups to spice up the example …

open ExtLib

module Options =
struct
    open OptParse
    let debug = StdOpt.store_true ()
    let out = StdOpt.str_option ()

    let options = OptParser.make ()

    open OptParser

    let g = add_group options ~description:"general options" "general" ;;
    let o = add_group options ~description:"output options" "output" ;;

    add options ~group:g ~short_name:'d' ~long_name:"debug" ~help:"Debug information" debug;
    add options ~group:o ~long_name:"out" ~help:"Send output to a file" out;
end

To actually parse the options we have a main function that invokes parse_argv. This stores the option values in the respective variables of the Options module and returns a list of strings containing all the positional arguments given on the command line that were not parsed as options.

let main () =
  let posargs = OptParse.OptParser.parse_argv Options.options in

  if OptParse.Opt.get Options.debug then
     Printf.eprintf "enabling debug\n"
  ;

  (* dump all positional arguments *)
  let ch =
    if OptParse.Opt.is_set Options.out then
      open_out (OptParse.Opt.get Options.out)
    else stdout
  in

  List.iter (Printf.fprintf ch "%s\n") posargs
;;
main ()

#./test.native --help
usage: test.native [options]

options:

  -h, --help            show this help message and exit

  general:

    general options

    -d, --debug         Debug information

  output:

    output options

    --out=STR           Send output to a file

#./test.native -d one two three
one
two
three
enabling debug
#
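
For readers who know Python better than OCaml, roughly the same layout (one flag, one string option, two groups) can be sketched with the standard optparse module. This is only a comparison and not part of ExtLib; the function names build_parser and parse are made up for the example.

```python
from optparse import OptionGroup, OptionParser


def build_parser():
    """Build a parser with a 'general' and an 'output' option group."""
    parser = OptionParser(usage="usage: %prog [options]")

    general = OptionGroup(parser, "general", "general options")
    general.add_option("-d", "--debug", action="store_true", dest="debug",
                       default=False, help="Debug information")
    parser.add_option_group(general)

    output = OptionGroup(parser, "output", "output options")
    output.add_option("--out", dest="out", metavar="STR",
                      help="Send output to a file")
    parser.add_option_group(output)
    return parser


def parse(argv):
    """Return (debug, out, positional arguments) for an argument list."""
    options, posargs = build_parser().parse_args(argv)
    return options.debug, options.out, posargs
```

As in the OCaml version, whatever is not recognised as an option comes back as a list of positional arguments.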

[1] http://ocaml-extlib.googlecode.com/svn/doc/apiref/OptParse.html


upload a file using httplib

I want to share a small snippet of code to upload a file to a remote server as “multipart/form-data”. The function below takes two arguments: the server url (ex: http://server.org/upload) and a filename. First the file is encoded as “form-data”, then we use httplib to POST it to the server. Since httplib wants the host and the path in separate steps (the host when opening the connection, the path when making the request), we have to parse the url using urlparse.

The receiving server must accept the data and return the location of the newly created resource. There are many snippets on the web, but I felt they were all incomplete or too messy. The encode function below is actually part of a snippet I found googling around. Happy uploading.

import httplib
import os
import urlparse

def upload(url,filename):
    def encode (file_path, fields=[]):
        BOUNDARY = '----------bundary------'
        CRLF = '\r\n'
        body = []
        # Add the metadata about the upload first
        for key, value in fields:
            body.extend(
              ['--' + BOUNDARY,
               'Content-Disposition: form-data; name="%s"' % key,
               '',
               value,
               ])
        # Now add the file itself
        file_name = os.path.basename(file_path)
        f = open(file_path, 'rb')
        file_content = f.read()
        f.close()
        body.extend(
          ['--' + BOUNDARY,
           'Content-Disposition: form-data; name="file"; filename="%s"'
           % file_name,
           # The upload server determines the mime-type, no need to set it.
           'Content-Type: application/octet-stream',
           '',
           file_content,
           ])
        # Finalize the form body
        body.extend(['--' + BOUNDARY + '--', ''])
        return 'multipart/form-data; boundary=%s' % BOUNDARY, CRLF.join(body)

    if os.path.exists(filename):
        content_type, body = encode(filename)
        headers = { 'Content-Type': content_type }
        u = urlparse.urlparse(url)
        server = httplib.HTTPConnection(u.netloc)
        server.request('POST', u.path, body, headers)
        resp = server.getresponse()
        server.close()

        if resp.status == 201:
            location = resp.getheader('Location', None)
        else :
            print resp.status, resp.reason
            location = None

        return location
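
The snippet above is Python 2. As a rough sketch of the same two steps in Python 3, where the body has to be bytes, something like the following could work. The names encode_file and split_url are made up for this example, and the network call itself is left out.

```python
import os
from urllib.parse import urlsplit


def encode_file(file_name, file_content, boundary="----------boundary------"):
    """Return (content_type, body) for a single-file multipart/form-data POST."""
    crlf = b"\r\n"
    delimiter = b"--" + boundary.encode("ascii")
    disposition = ('Content-Disposition: form-data; name="file"; filename="%s"'
                   % os.path.basename(file_name))
    parts = [
        delimiter,
        disposition.encode("utf-8"),
        b"Content-Type: application/octet-stream",
        b"",                 # blank line separates the part headers from the content
        file_content,
        delimiter + b"--",   # closing delimiter
        b"",
    ]
    return "multipart/form-data; boundary=%s" % boundary, crlf.join(parts)


def split_url(url):
    """Split a url into the host and the path, as the connection API expects."""
    u = urlsplit(url)
    return u.netloc, u.path
```

The host part would go to the connection constructor and the path to the request, exactly as in the Python 2 code.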

Since I’m working with Django, this is the server part. A few remarks: I create the file name using uuid1(). This is an easy way to create a unique identifier. A bit of overkill maybe. I assume a model myfiles and a form UploadFileForm that you can easily guess. The function handle_uploaded_file is the procedure that actually saves the file on disk. This is standard. I return a “Location” where the user can access the file. You have to create a small view to serve the file.

import uuid
from django.http import HttpResponse
import os
import datetime
from myapp.models import myfiles
from myapp.forms import UploadFileForm

def handle_uploaded_file(f,n):
    destination = open(n, 'wb+')
    for chunk in f.chunks():
        destination.write(chunk)
    destination.close()

def upload(request):
    if request.method == 'POST':
        form = UploadFileForm(request.POST, request.FILES)
        if form.is_valid():
            ip = request.META['REMOTE_ADDR']
            u = str(uuid.uuid1())
            uploaded = datetime.datetime.now()
            # baseupdir (the upload directory) is assumed to be defined elsewhere
            fname = os.path.join(baseupdir, u)
            handle_uploaded_file(request.FILES['file'],fname)
            size = os.path.getsize(fname)

            d = myfiles(fname=fname,size=size,uploaded=uploaded,ip=ip,uuid=u).save()

            response = HttpResponse(content="", status=201)
            response["Location"] = "/file?uuid=%s" % u
            return response # 10.2.2 201 Created
        else :
            return HttpResponse(status=400) # 10.4.1 400 Bad Request
    else :
        return HttpResponse(status=400) # 10.4.1 400 Bad Request
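
The naming and saving steps can be tried without Django. Here is a minimal sketch using only the standard library; store_chunks is a made-up name, and the plain list of byte strings stands in for what the chunks() method of a Django uploaded file yields.

```python
import os
import tempfile
import uuid


def store_chunks(chunks, directory):
    """Write an iterable of byte chunks to a uuid1-named file in directory.

    Returns the generated name, the full path and the size on disk.
    """
    name = str(uuid.uuid1())
    path = os.path.join(directory, name)
    with open(path, "wb") as destination:
        for chunk in chunks:
            destination.write(chunk)
    return name, path, os.path.getsize(path)


if __name__ == "__main__":
    # Throwaway directory, only for trying the function out.
    with tempfile.TemporaryDirectory() as directory:
        print(store_chunks([b"hello ", b"world"], directory))
```

Writing chunk by chunk keeps memory use flat even for large uploads, which is the point of the loop in handle_uploaded_file.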

skype on amd64 (debian unstable)

Date Tags debian

To install skype on a debian unstable machine:

  • get the skype package here http://www.skype.com/go/getskype-linux-ubuntu
  • dpkg -i --force-architecture skype-debian_2.0.0.72-1_i386.deb

Now we need to fix a bunch of dependencies:

  • apt-get -f install
  • apt-get install libqt4-core libqt4-gui ia32-libs-gtk
  • get the 32 bit version of libuuid from here : http://packages.debian.org/sid/i386/libuuid1/download
  • copy the library into /usr/lib32

run skype.

blahhhhhh . It used to be easier… Sometimes I really despise myself for using this closed source software.

  • https://wiki.ubuntu.com/SkypeEthics
  • https://help.ubuntu.com/community/Skype

simple expat based xml parser

Date Tags ocaml

The other day I needed a small xml parser to convert an xml document into a different format. First I tried xml-light. This is a simple parser written entirely in ocaml that stores the parsed xml document in an ocaml data structure. This data structure can be used to access various fields of the xml document. It does not offer a dom-like interface, but actually I consider this a feature. Unfortunately xml-light is terribly slow. Parsing 30K-plus lines of xml takes far too long for it to be considered for my application.

The next logical choice was to try Expat, which is an event-based parser and is extremely fast. Since using an event-based parser can be a bit cumbersome (and I had already written a bit of code using xml-light), I decided to write a small wrapper around expat to provide an xml-light interface to it.

The code is pretty simple and the main idea is taken from the cduce xml loader.

First we provide a small data structure to hold the xml document as we examine it. Nothing deep here. Notice that we use Start and String as we descend the tree and Element when we unwind the stack.

type stack =
  | Element of (Xml.xml * stack)
  | Start of (string * (string * string) list * stack)
  | String of (string * t * stack)
  | Empty
and t =
  | PCData
  | CData

Then we need to provide expat handlers to store xml fragments on the stack as we go down. Note that we have a handler for cdata, but not a handler for pcdata as it is the default.

let pcdata buff = (Xml.PCData buff)
let cdata buff = (Xml.CData buff)

let rec create_elt acc = function
  | String (s,CData, st) -> create_elt (L.push (cdata s) acc) st
  | String (s,PCData, st) -> create_elt (L.push (pcdata s) acc) st
  | Element (x,st) -> create_elt (L.push x acc) st
  | Start (tag,attrlst,st) -> stack := Element(Xml.Element(tag,attrlst,acc),st)
  | Empty -> assert false

let start_cdata_handler () = txt.cdata <- true ;;

let start_element_handler tag attrlst =
  if not (only_ws txt.buffer txt.pos) then begin
    let str = String.sub txt.buffer 0 txt.pos in
    if txt.cdata then
      stack := String (str, CData, !stack)
    else
      stack := String (str, PCData, !stack)
  end
  ;
  txt.pos <- 0;
  txt.cdata <- false;
  stack := Start (tag,attrlst,!stack)
;;

let end_element_handler _ =
  let acc =
    if only_ws txt.buffer txt.pos then L.empty
    else
      let str = String.sub txt.buffer 0 txt.pos in
      if txt.cdata then L.one (cdata str)
      else L.one (pcdata str)
  in
  txt.pos <- 0;
  txt.cdata <- false;
  create_elt acc !stack
;;

let character_data_handler = add_string txt ;;

At the end we just register all handlers with the expat parser and we return the root of the xml document.

let parse_string str =
  let p = Expat.parser_create None in
  Expat.set_start_element_handler p start_element_handler ;
  Expat.set_end_element_handler p end_element_handler ;
  Expat.set_start_cdata_handler p start_cdata_handler ;
  Expat.set_character_data_handler p character_data_handler ;
  ignore (Expat.set_param_entity_parsing p Expat.ALWAYS);
  Expat.parse p str;
  Expat.final p;
  match !stack with
  |Element (x,Empty) -> (stack := Empty; x)
  | _ -> assert false
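
The same idea (push on a start tag, pop and attach to the parent on an end tag) can be sketched with the expat binding in the Python standard library. This is a simplified illustration and not the OCaml code above: it does not distinguish CDATA from PCDATA, and the tuple layout (tag, attrs, children) is just an assumption made for the example.

```python
import xml.parsers.expat


def parse_string(text):
    """Build a nested (tag, attrs, children) tree from an XML string."""
    # Sentinel entry whose children list ends up holding the document element.
    stack = [("", {}, [])]

    def start(tag, attrs):
        # Going down: open a new element on top of the stack.
        stack.append((tag, attrs, []))

    def end(tag):
        # Unwinding: the finished element becomes a child of its parent.
        element = stack.pop()
        stack[-1][2].append(element)

    def chars(data):
        # Skip whitespace-only text, similar in spirit to only_ws above.
        if data.strip():
            stack[-1][2].append(data)

    parser = xml.parsers.expat.ParserCreate()
    parser.buffer_text = True  # deliver contiguous text in one callback
    parser.StartElementHandler = start
    parser.EndElementHandler = end
    parser.CharacterDataHandler = chars
    parser.Parse(text, True)
    return stack[0][2][0]
```

Calling parse_string on a small document gives back the root element with its children nested inside it.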

I’ve copied the xml-light methods to access the document into a different file. I’ve also made everything lazy to save a bit of computing time if it is only necessary to access a part of a huge xml document.

The complete code can be found here: git clone https://www.mancoosi.org/~abate/repos/xmlparser.git

UPDATE

The other day I was made aware that this parser has a serious bug when used on a 32 bit machine. The problem is that the maximal string size on a 32 bit machine is Sys.max_string_length, that is roughly 16MB. If we read a big document at once with IO.read_all and then parse it, we immediately get an exception. The solution is to parse the document incrementally using the new function parse_ch below, which takes a channel instead of a string and runs the expat parser incrementally:

let parser_aux f =
  let p = Expat.parser_create None in
  Expat.set_start_element_handler p start_element_handler ;
  Expat.set_end_element_handler p end_element_handler ;

  Expat.set_start_cdata_handler p start_cdata_handler ;

  Expat.set_character_data_handler p character_data_handler ;
  ignore (Expat.set_param_entity_parsing p Expat.ALWAYS);
  f p;
  match !stack with
  |Element (x,Empty) -> (stack := Empty; x)
  | _ -> assert false

let parse_str str =
  let f p =
    Expat.parse p str;
    Expat.final p;
  in
  parser_aux f

let parse_ch ch =
  let f p =
    try while true do
      Expat.parse p (IO.nread ch 10240)
    done with IO.No_more_input -> Expat.final p ;
  in
  parser_aux f
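
The chunked feeding used by parse_ch can be illustrated with the expat binding in the Python standard library as well. This is a sketch under assumptions: count_elements is a made-up example that only counts start tags, and the chunk size simply mirrors the 10240 used above.

```python
import io
import xml.parsers.expat


def count_elements(stream, chunk_size=10240):
    """Feed a binary stream to expat chunk by chunk and count start tags."""
    counts = {}

    def start(tag, attrs):
        counts[tag] = counts.get(tag, 0) + 1

    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = start
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        parser.Parse(chunk, False)  # more data may follow
    parser.Parse(b"", True)         # signal the end of the document
    return counts


if __name__ == "__main__":
    # A tiny chunk size forces several Parse calls even on a short document.
    print(count_elements(io.BytesIO(b"<r><i/><i/><j/></r>"), chunk_size=4))
```

Only one chunk is held in memory at a time, so the size of the document no longer matters for the string limit.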

apt-get dist-upgrade

Date Tags debian

During the weekend I upgraded my laptop to squeeze. I usually track unstable pretty closely, but in between transitions I gave myself a bit of slack in order to avoid messing with the gnome transition. The result is ok, NetworkManager Just Works !!!, the new kernel seems pretty snappy. I finally get the power status for my network card.

My laptop is an old dell latitude x200. I always had problems with the graphic card and Xorg. With this upgrade I finally motivated myself to find a solution. Not surprisingly it was quite easy. I’ve added these options to my xorg.conf :

Section "Device"
        Identifier "Configured Video Device"
        Driver "intel"
        Option "Tiling" "false"
        Option "FramebufferCompression" "false"
        Option  "XAANoOffscreenPixmaps" "true"
        Option  "AccelMethod" "EXA"
        Option  "ExaNoComposite" "true"
EndSection

I’m not entirely sure if I need them all. I’ve noticed that my screen corruptions already go away with “Tiling” and “FramebufferCompression” set to false. But the life changing options are the accel method (EXA seems much more stable) and “ExaNoComposite”.

What I’ve left to figure out is how to fix the hibernate function, which is still not very reliable as it works 8 out of 10 times.

After 1.3GB of updates, I’m happy I’m again surfing the unstable wave.