My own private info

I’ve just attended the workshop Datastream: My own private info at the Open World Forum in Paris. It is a very sensitive topic nowadays, and the speakers around the table raised a number of interesting points.

Sunil Abraham, Policy Director, Center for Internet and Society (India), started his contribution by pointing out that privacy is very much related to the local culture and history of a country. In India, for example, the expectation of privacy is dramatically different from that in western countries. A very interesting example is the amount of information that is already “encoded” in the name of a person. India being a country based on a caste system, the name of a person gives away not only their social status and religion, but also their sex and location. Very common questions in everyday conversation concern salary, spending and wealth, questions that are somewhat taboo in western countries. This leads to a very different perception and expectation of privacy that is not easily reconcilable with western practice and policies. It also allows the Indian government to establish policies that, from a western point of view, are completely unacceptable.

A different angle was proposed by the sociologist Dominique Cardon, who works for Orange. He pointed out the important difference between government surveillance and collateral surveillance, that is, the stalking done by people in one’s own circle such as parents, neighbors, etc. The large majority of people, when confronted with questions about privacy, show great concern and fear about Big Brother spying on them. However, he pointed out, there is a clear gap between these concerns and the quantity of information that each individual then puts on the web. The problem is the distinction that people unconsciously make between the data they want to share with their social network and the data they share with the world as a whole (composed of governments and unknown individuals). Facebook and other social networks greatly emphasize the idea of a network of friends, giving people the false idea that the data they share is truly private, or restricted to a small circle of friends, when reality shows that these tools are often exploited by other individuals or entities to dig up information and make a profit on personal data.

Somebody in the audience framed the same problem as an identity issue in the digital world. As he put it, people are starting to develop split digital personalities (e.g. a personal facebook account and a work facebook account) in order to defend themselves from snooping and surveillance. There is clearly a need to reconcile these personalities by technological means, creating privacy contexts instead of fostering the creation of completely separate and antithetical digital personae, which can have repercussions on the way people behave in the real world.

Djordje Djokic (European rights and privacy protection on the Internet) gave quite a broad overview of the political issues related to privacy. Despite privacy being a fundamental right upon which many other rights are based, from a legal perspective it is impossible to define privacy. This is due to the fact that national policies are local while digital privacy is a global problem. If we put this together with the cultural differences among states, in the short term it will be very difficult to safeguard citizens against privacy speculators. He also made an interesting point about privacy safe havens, which can attract activists and agencies thanks to better legal protections.

A point I made is about the future. It seems that the entire debate was focused on the current state of affairs. The FLOSS community has debated privacy and technological solutions for a long time now. Enlightening talks, like the ones Eben Moglen gave many times this year, gave rise to interesting projects like the freedom box project. The Diaspora project, which will hopefully get off the ground sometime soon, promises to offer a distributed and decentralized alternative to facebook. Status Net and Identica are also two very interesting platforms built on free and open source software that I hope will take over, or at least pave the way for commercial alternatives.

One of the biggest challenges, of course, is education. People don’t understand the pitfalls of many so-called “free” services like facebook or twitter. These companies effectively make money on your willingness to give away information about you, your friends and your life. The large majority of the community is not aware of these problems. This makes it very difficult for privacy advocates to push policy changes, because of the lack of interest among the general public. Politicians in particular do not really grasp these problems. Now we even start to see regional and national politicians embracing privacy-less social media, making it difficult for the public to move away from them without being excluded from the democratic life of their country.

National education systems certainly do not yet have topics such as privacy and new media in their curricula. Kids often learn from their peers and are enticed by the rich offer of these companies. This state of affairs allows the founders of facebook or google to declare that people today no longer have an expectation of privacy. I personally strongly disagree with this position and I hope this will change in the future with privacy-aware social media, maybe decentralized, but certainly built in a way that lets individuals retain complete ownership and control over their digital life.

The path to reverse the current tendency is certainly steep. A first step, from a technological point of view, is to create something stable and sound. But the second step, to gain weight among today’s big players, is to create new and exciting services. Selling an alternative to somebody who does not understand the problem of privacy is already difficult enough (and the story of linux on the desktop should already have shown why this does not work in this monopolistic world). The strong selling point should be new services, exciting new ways to interact and seamless integration with today’s platforms.


connect django and rfoo

This evening I spent 30 minutes to try out rconsole in the package rfoo . It’s a simple environment to inspect and modify the namespace of a running script.

If you are on debian, you need to install two packages :

sudo aptitude install cython python-dev

Then download the source code. If you want to try it out without installing, you have to compile it with the --inplace option :

python setup.py build_ext --inplace

Now you’re ready to go. Add in your views.py file the following code:

from rfoo.utils import rconsole
rconsole.spawn_server()

In a console type python scripts/rconsole. Keep in mind that you have to adjust your import search path in order to use the rconsole script without installing the library.

You can now directly call all the functions in your views from the console. For example, imagine you have a search view, then you can call it with :

>>> from django.http import HttpRequest
>>> request = HttpRequest()
>>> search(request,"debian")
<django.http.HttpResponse object at 0x2bc7490>
>>> search(request,"debian").content
'<?xml version="1.0" encoding="UTF-8"?>\n<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"\n "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">\n<html xmlns="http://www.w3.org/1999/xhtml">\n  <head>\n    
....
>>> 

I have to say that using rconsole for debugging is not very useful. pdb or winpdb are much more powerful and versatile. It was worth a try anyway…

Update

After getting in touch with the author of rconsole, I think it is important to put this post in context. I tried rconsole with django in mind. On one hand, I was looking for a debugger that I could use in the early development stage of a project. In this context, I think a blocking debugger can do a much better job than rconsole at helping the programmer inspect variables and insert breakpoints. rconsole is a non-blocking debugger, and it is not the right tool for that.

On the other hand, rconsole can be of great help when debugging a live application when you don’t have the luxury to stop your server. In this regard rconsole is very lightweight and unobtrusive, and I think it can be of great help.

I had the impression I’ve been a bit unfair in my judgment…


cut and pasting in gnome / iceweasel

Tags: rant

I’ve always wondered why iceweasel won’t paste my selection into the search area from a gnome terminal. It turns out that, to put something in the gnome clipboard from a gnome terminal, it is not enough to select the piece of text with your mouse: you need to explicitly press “Ctrl + shift + c” . Then in iceweasel you can use “shift + insert” . Kind of brain-dead to me. So if I want to search for a word I have on a terminal I have to :

  • select the word with the mouse
  • “ctrl + insert” OR “Ctrl + shift + c” to store the word in the clipboard
  • “alt + tab” to get focus on iceweasel
  • “ctrl + k” to focus on the search form
  • “shift + insert” OR “Ctrl + v” to paste
  • return

wowwwwww !

This nice post explains the root of the problem related to the gnome terminal and gives a bit of general background.

If you just want to search for something, you can try the gnome applet gspot. It’s nice and it works (I tried it for only 5 minutes), but it doesn’t allow you to configure common short-cuts, so you are always obliged to use the mouse. googlizer is another applet that has the same problem as gspot, and it is limited to google only (while gspot has many search engines built-in).

bah, for the moment I’m a bit stuck …

Update

A very nice solution :

google() {
    # Join all the arguments with "+" to build the query string
    local search=""
    for term in "$@"; do
        search="${search:+$search+}$term"
    done
    xdg-open "https://www.google.com/search?q=$search"
}

shell power !
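
The shell function above does not escape special characters such as & or + in the search terms. As a rough sketch of the same idea in python, using only the standard library (the names google_url and google are mine, and webbrowser.open is just a stand-in for xdg-open), something like this should take care of the encoding:

```python
from urllib.parse import urlencode
import webbrowser


def google_url(*terms):
    """Build a Google search URL, percent-encoding the joined terms."""
    return "https://www.google.com/search?" + urlencode({"q": " ".join(terms)})


def google(*terms):
    """Open the search in the default browser (stand-in for xdg-open)."""
    webbrowser.open(google_url(*terms))


# Only print the URL here; call google(...) to actually open the browser
print(google_url("debian", "c++ & ocaml"))
# → https://www.google.com/search?q=debian+c%2B%2B+%26+ocaml
```

The URL building is kept in its own function so that it can be checked without opening a browser.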


kprintf and failwith

Tags: ocaml

This is just a quickie to start off the day. I often write fatal error messages using a combination of Printf.eprintf, exit 1, failwith, assert false, etc. For example, to throw a fatal exception with a message I would write the overly verbose

failwith (Printf.sprintf "this is a fatal error in module %s" modulename)

I see these idioms everywhere and I find them a bit ugly...

If we use Printf.kprintf we can write the statement above in a bit more compact way as:

let fatal fmt = Printf.kprintf failwith fmt ;;

Moreover, if we add a label to the function fatal and instantiate it once in every module, we get a localized error message for free. Something like :

let fatal label fmt =
  let l = Printf.sprintf "Fatal error in module %s: " label in
  Printf.kprintf (fun s -> failwith (l^s)) fmt
;;
val fatal : string -> ('a, unit, string, 'b) format4 -> 'a = <fun>
# let local_error = fatal "module" ;;
val local_error : ('_a, unit, string, '_b) format4 -> '_a = <fun>
# local_error "aaaa %d %d" 1 1;;
Exception: Failure "Fatal error in module module: aaaa 1 1".
# local_error "message %d %d" 1 1;;
Exception: Failure "Fatal error in module module: message 1 1".

It would be awesome to also get the location in the source code, as with assert, but I don't think this is possible to do in a generic way. Something like : Exception: Failure "Fatal error in module module (line 144, 63): message 1 1".

I guess this can be done with camlp4. We can catch the line and column like :

let (line,col) =
        try assert false
        with Assert_failure (_, line, col) -> (line,col)

and then feed this info into fatal. Maybe I’ll get 5 mins to write this macro. This cannot be done statically, as the line reported by assert false will always be the same …


python itertools and groupby

Tags: python

Whoever said that ignorance is bliss didn’t try python :) This is the assignment : you have a list of dictionaries with a field date and you want to group all these dictionaries in a map date -> list of dictionaries with this date.

The first solution that came to my mind was something ugly like :

def group_by_date(qs):
    by_date = {}
    for r in qs :
        l = by_date.get(r['date'],[])
        l.append(r)
        by_date[r['date']] = l
    return by_date

for example :

In [36]: group_by_date([{'date' : 1},{'date' : 2}]) 
Out[36]: {1: [{'date': 1}], 2: [{'date': 2}]}

6 lines of python !!! Unacceptable. It hurts my eyes and it is not easy to read. The good people on the #python irc channel advised me to check out collections.defaultdict, and this is actually pretty neat. Now I can write something like

from collections import defaultdict
def group_by_date(qs):
    by_date = defaultdict(list)
    for r in qs :
        by_date[r['date']].append(r)
    return by_date

In [47]: group_by_date([{'date' : 1},{'date' : 2}]) 
Out[47]: defaultdict(<type 'list'>, {1: [{'date': 1}], 2: [{'date': 2}]})

Nice, but still … we can do better ! itertools.groupby to the rescue :

from itertools import groupby
qs = [{'date' : 1},{'date' : 2}]
# groupby only merges consecutive items, so sort by the same key first
key = lambda p: p['date']
[(name, list(group)) for name, group in groupby(sorted(qs, key=key), key)]

Out[77]: [(1, [{'date': 1}]), (2, [{'date': 2}])]

Ah ! Nirvana :) I have to admit the most readable solution is the one using defaultdict, but this solution using groupby is a wonderful power-tool. If you understand list comprehensions, this is a very natural solution to the problem.
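
One caveat with groupby: it only groups items that are next to each other, so the input has to be sorted by the grouping key first. Here is a small sketch of what happens with unsorted data, and how to get back the date -> list map from the assignment. The 'id' field is made up for the example, just to tell the records apart.

```python
from itertools import groupby
from operator import itemgetter


def group_by_date(qs):
    """Return {date: [records with that date]} using sorted + groupby."""
    by_date = itemgetter('date')
    return {date: list(group)
            for date, group in groupby(sorted(qs, key=by_date), key=by_date)}


# 'id' is a hypothetical field, only there to distinguish the records
qs = [{'date': 2, 'id': 'a'}, {'date': 1, 'id': 'b'}, {'date': 2, 'id': 'c'}]

# Without sorting, the two records with date 2 are not adjacent
# and end up in two separate groups
print([(date, len(list(group))) for date, group in groupby(qs, key=itemgetter('date'))])
# → [(2, 1), (1, 1), (2, 1)]

print(group_by_date(qs))
# → {1: [{'date': 1, 'id': 'b'}], 2: [{'date': 2, 'id': 'a'}, {'date': 2, 'id': 'c'}]}
```

Since sorted is stable, the records keep their original relative order inside each group. The defaultdict version does not need the sort at all, which is one more reason to prefer it for this particular job.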