Pietro Abate

alphabetic filter with generic views

The other day I decided to add a small alphabetic filter to search among the broken packages in debian weather. Searching the net for a nice solution, I found a few snippets, but none of them struck me as flexible enough for my needs. I also found a django module, but it seemed overly complicated for such a simple thing.

I had a look at its code and generalized the _get_available_letters function which, given a table and a field, gives you back the list of letters used in the table for that specific field. My version integrates better with the django relational model: instead of acting directly on a table (using raw sql), I plug the raw sql expression UPPER(SUBSTR(%s, 1, 1)) into the django query using the extra function. The result is pretty neat, as you don’t need to know the underlying model and you can use this method with an arbitrary queryset. This is of course possible thanks to django’s laziness in performing sql queries…

def alpha(request,obj,field):
    alphabet = u'ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789'
    s = 'UPPER(SUBSTR(%s, 1, 1))' % field
    q = obj.distinct().extra(select={'_letter' : s}).values('_letter')
    letters_used = set([x['_letter'] for x in q])
    default_letters = set([x for x in alphabet])
    all_letters = list( default_letters | letters_used)
    all_letters.sort()
    alpha_lookup = request.GET.get('sw','')

    choices = [{
        'link': '?sw=%s' % letter,
        'title': letter,
        'active': letter == alpha_lookup,
        'has_entries': letter in letters_used,} for letter in all_letters]
    all_letters = [{
        'link': '?sw=all&page=all',
        'title': ('All'),
        # the "All" link itself sets sw=all, so treat it like the empty default
        'active': alpha_lookup in ('', 'all'),
        'has_entries': True
    },]
    return (all_letters + choices)

This function also takes the request object so that it can mark the currently selected letter as active; the template uses this flag when displaying the result of the view.

    queryset = Entry.objects.all()

    #defaults pager + all letters
    element_by_page = None
    letter = request.GET.get('sw','all')

    if (letter != 'all') and (len(letter) == 1):
        queryset = queryset.filter(myfield__istartswith=letter)

    if (request.GET.get('page',None) != 'all') :
        element_by_page = ELEMENT_BY_PAGE

In my specific case I wanted to show all letters with pagination by default, and then be able to switch off pagination or select a specific letter. I use two variables to control all this. The first variable, page, comes with the generic view list_detail.object_list. It is usually a page number, from 1 to the last page, and it controls pagination. I’ve added the value all to switch off pagination altogether by setting element_by_page to None. The second variable is sw, which I use to select the specific letter to display.
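The parameter handling described above can be isolated in a small helper, which makes it easy to check on its own. This is only a sketch: filter_params is a hypothetical name, the value 50 for ELEMENT_BY_PAGE is made up, and a plain dict stands in for request.GET.

```python
ELEMENT_BY_PAGE = 50  # hypothetical page size, not taken from the application


def filter_params(get, per_page=ELEMENT_BY_PAGE):
    """Return (letter, paginate_by) from a dict-like object such as request.GET.

    letter is None when no single letter is selected,
    paginate_by is None when pagination is switched off with page=all.
    """
    letter = get.get('sw', 'all')
    if letter == 'all' or len(letter) != 1:
        letter = None
    paginate_by = None if get.get('page') == 'all' else per_page
    return letter, paginate_by
```

With an empty query string this gives all letters with pagination, while ?sw=all&page=all gives everything on one page.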

    params = {'choices' : alpha(request,Entry.objects,"myfield")}

    return list_detail.object_list(
        request,
        paginate_by = element_by_page,
        queryset = queryset,
        template_name = 'details.html',
        extra_context = params)

If you return a generic view at the end of your view, as above, the only thing left to do is to use the choices variable in your template to display the alphabetic filter. It will look something like this:

<link rel="stylesheet" href="{{ MEDIA_URL }}/css/alphabet.css" type="text/css" />
{% if choices %}
  <br class="clear" />
  <ul class="alphabetfilter">
    {% for choice in choices %}
      <li>
        {% if choice.has_entries %}
          <a href="{{ choice.link }}">
            {% if choice.active %}
              <span class="selected">{{ choice.title }}</span>
            {% else %}
              <span class="inactive">{{ choice.title }}</span>
            {% endif %}
          </a>
        {% else %}
          <span class="inactive">{{ choice.title }}</span>
        {% endif %}
      </li>
    {% endfor %}
  </ul>
  <br class="clear" />
{% endif %}

This is pretty standard: it iterates over the list of letters, linking the ones that have content. You need to associate a small css to display the list horizontally. Put this in a file and embed it wherever you want with an include statement: {% include "forecast/alphabet.html" %}.
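To make the shape of the choices list concrete, here is a database-free sketch of the same logic, run on a plain list of names instead of a queryset. The names first_letters and build_choices are hypothetical, and taking the first character in Python stands in for the UPPER(SUBSTR(...)) expression evaluated by the database.

```python
import string

ALPHABET = string.ascii_uppercase + string.digits


def first_letters(names):
    """Upper-cased first character of every non-empty name."""
    return set(name[:1].upper() for name in names if name)


def build_choices(names, active=''):
    """Build the list of dicts that the template iterates over."""
    used = first_letters(names)
    # Default alphabet plus any other character that actually occurs.
    letters = sorted(set(ALPHABET) | used)
    choices = [{
        'link': '?sw=all&page=all',
        'title': 'All',
        'active': active in ('', 'all'),
        'has_entries': True,
    }]
    for letter in letters:
        choices.append({
            'link': '?sw=%s' % letter,
            'title': letter,
            'active': letter == active,
            'has_entries': letter in used,
        })
    return choices
```

Each entry carries exactly the four keys the template reads: link, title, active and has_entries.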

The code for my application is here if you want to check out more details. You can have a look at the result here.

connect django and rfoo

This evening I spent 30 minutes trying out rconsole from the rfoo package. It’s a simple environment to inspect and modify the namespace of a running script.

If you are on debian, you need to install two packages :

sudo aptitude install cython python-dev

Then download the source code. If you want to try it out without installing it, you have to compile it with the --inplace option:

python setup.py build_ext --inplace

Now you’re ready to go. Add the following code to your views.py file:

from rfoo.utils import rconsole
rconsole.spawn_server()

In a console type python scripts/rconsole. Keep in mind that you have to adjust your import search path in order to use the rconsole script without installing the library.

You can now call all the methods in your views directly from the console. For example, if you have a search view, you can call it with:

>>> from django.http import HttpRequest
>>> request = HttpRequest()
>>> search(request,"debian")
<django.http.HttpResponse object at 0x2bc7490>
>>> search(request,"debian").content
'<?xml version="1.0" encoding="UTF-8"?>\n<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"\n "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">\n<html xmlns="http://www.w3.org/1999/xhtml">\n  <head>\n    
....
>>>

I have to say that using rconsole for debugging is not very useful. pdb or winpdb are much more powerful and versatile. It was worth a try anyway…

Update

After getting in touch with the author of rconsole, I think it is important to put this post in context. I tried rconsole with django in mind. On the one hand, I was looking for a debugger that I could use in the early development stage of a project. In this context, I think a blocking debugger can do a much better job than rconsole at helping the programmer inspect variables and insert breakpoints. rconsole is a non-blocking debugger, and it is not the right tool for this.

On the other hand, rconsole can be of great help when debugging a live application when you don’t have the luxury to stop your server. In this regard rconsole is very lightweight and unobtrusive, and I think it can be of great help.

I had the impression I’ve been a bit unfair in my judgment…

modify a django model without losing your data

If you want to modify your model, the django doc suggests changing the model and then manually running ALTER TABLE statements in your db to propagate the changes. This is a bit annoying if you have more than X modifications to make. The good reason to avoid automation here is that if something goes wrong, you lose your data (and backups ??), and nobody, not even a robot overlord, wants to assume this responsibility. A different strategy, which still requires manual intervention, was suggested to me on the django IRC channel.

Basically we save the data, remove everything, recreate the DB, and reload the data. Let’s go through the procedure:

First we save the data, storing everything in a fixture, that is, a json serialization of the DB, using this command: ./manage.py dumpdata udf > udf.json . Notice that manage.py is often also referred to as django-admin.py.
Now it’s time to clean up the DB. Since we are working on a module, we can automatically generate the DROP TABLE statements using the following command:

./manage.py sqlclear udf
BEGIN;
DROP TABLE "udf_solution";
DROP TABLE "udf_solver";
DROP TABLE "udf_conversion";
DROP TABLE "udf_cudf";
DROP TABLE "udf_dudf";
COMMIT;

Once we have the sql statements, we just need to open the db and cut and paste them on the command line.

OK! It’s finally time to actually modify the models.py file and add attributes. Be careful to specify a default value for all the new attributes. This way, when loading the data, the django infrastructure will know how to fill these fields.
Once done, we can recreate the tables with the command: ./manage.py syncdb.
And finally reload the data: ./manage.py loaddata udf.json
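The procedure above can be written down as two lists of shell commands, one to run before editing the model and one after. This is a sketch under assumptions: the function names are made up, and piping the output of sqlclear into ./manage.py dbshell is an assumption that replaces the manual cut and paste step.

```python
import subprocess


def steps_before_edit(app, fixture):
    """Commands to run while the old model is still in place."""
    return [
        # Save the data of the application in a json fixture.
        "./manage.py dumpdata %s > %s" % (app, fixture),
        # Assumption: feed the generated DROP TABLE statements to the db shell.
        "./manage.py sqlclear %s | ./manage.py dbshell" % app,
    ]


def steps_after_edit(fixture):
    """Commands to run once the model file has been modified."""
    return [
        "./manage.py syncdb",
        "./manage.py loaddata %s" % fixture,
    ]


def run(steps):
    """Run each command in turn and stop at the first failure."""
    for step in steps:
        subprocess.check_call(step, shell=True)
```

Keeping the two halves separate leaves room for the manual part, editing the model, in between. Since the second command drops tables, it is worth checking that the fixture file is not empty before running it.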

I think this recipe only works if you are adding new fields. If you are removing a field, I’m not sure the loaddata routine is smart enough to ignore the field… to be tested. Feedback appreciated.
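If loaddata turns out to choke on fields that no longer exist, one way around it is to strip them from the fixture before reloading. A minimal sketch, assuming the usual fixture layout produced by dumpdata (a json list of objects with model, pk and fields keys); strip_field and rewrite_fixture are hypothetical helpers, and the model label is the lower-case app.model name that appears in the fixture.

```python
import json


def strip_field(objects, model_label, field):
    """Remove one field from every object of the given model, in place."""
    for obj in objects:
        if obj.get('model') == model_label:
            obj['fields'].pop(field, None)
    return objects


def rewrite_fixture(src, dst, model_label, field):
    """Read a fixture, drop the field, and write the result to a new file."""
    with open(src) as f:
        objects = json.load(f)
    strip_field(objects, model_label, field)
    with open(dst, 'w') as f:
        json.dump(objects, f, indent=2)
```

Writing to a new file rather than overwriting udf.json keeps the original dump around in case something goes wrong.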

upload a file using httplib

I want to share a small snippet of code to upload a file to a remote server as “multipart/form-data”. The function below takes two arguments: the server url (ex: http://server.org/upload) and a filename. First the file is encoded as “form-data”, then we use httplib to POST it to the server. Since httplib wants the host and the path in separate steps, we have to parse the url using urlparse.

The receiving server must accept the data and return the location of the newly created resource. There are many snippets on the web, but I felt they were all incomplete or too messy. The encode function below is actually part of a snippet I found googling around. Happy uploading.

import httplib
import os
import urlparse

def upload(url,filename):
    def encode(file_path, fields=()):
        BOUNDARY = '----------boundary------'
        CRLF = '\r\n'
        body = []
        # Add the metadata about the upload first
        for key, value in fields:
            body.extend(
              ['--' + BOUNDARY,
               'Content-Disposition: form-data; name="%s"' % key,
               '',
               value,
               ])
        # Now add the file itself
        file_name = os.path.basename(file_path)
        f = open(file_path, 'rb')
        file_content = f.read()
        f.close()
        body.extend(
          ['--' + BOUNDARY,
           'Content-Disposition: form-data; name="file"; filename="%s"'
           % file_name,
           # The upload server determines the mime-type, no need to set it.
           'Content-Type: application/octet-stream',
           '',
           file_content,
           ])
        # Finalize the form body
        body.extend(['--' + BOUNDARY + '--', ''])
        return 'multipart/form-data; boundary=%s' % BOUNDARY, CRLF.join(body)

    if os.path.exists(filename):
        content_type, body = encode(filename)
        headers = { 'Content-Type': content_type }
        u = urlparse.urlparse(url)
        server = httplib.HTTPConnection(u.netloc)
        server.request('POST', u.path, body, headers)
        resp = server.getresponse()
        server.close()

        if resp.status == 201:
            location = resp.getheader('Location', None)
        else :
            print resp.status, resp.reason
            location = None

        return location
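httplib takes the host when the connection is created and the path when the request is sent, so the url has to be split first. One detail worth handling is a url without a path, since an empty path is not a valid request target. A small sketch of that step; host_and_path is a hypothetical helper, and the import falls back to the Python 2 module name used above.

```python
try:
    from urllib.parse import urlsplit  # Python 3
except ImportError:
    from urlparse import urlsplit      # Python 2, as in the snippet above


def host_and_path(url):
    """Split a url into the (host, path) pair expected by HTTPConnection."""
    parts = urlsplit(url)
    # Fall back to the root when the url has no path at all.
    path = parts.path or '/'
    # Keep the query string, which belongs to the request target.
    if parts.query:
        path += '?' + parts.query
    return parts.netloc, path
```

The host part keeps an explicit port if there is one, which HTTPConnection accepts in the host:port form.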

Since I’m working with Django, this is the server part. A few remarks: I create the file name using uuid1(). This is an easy way to create a unique identifier, if maybe a bit of overkill. I assume a model myfiles and a form UploadFileForm that you can easily guess. The function handle_uploaded_file is the procedure that actually saves the file to disk. This is standard. I return a “Location” where the user can access the file. You have to create a small view to serve the file.

import uuid
from django.http import HttpResponse
import os
import datetime
from myapp.models import myfiles
from myapp.forms import UploadFileForm

def handle_uploaded_file(f,n):
    destination = open(n, 'wb+')
    for chunk in f.chunks():
        destination.write(chunk)
    destination.close()

def upload(request):
    if request.method == 'POST':
        form = UploadFileForm(request.POST, request.FILES)
        if form.is_valid():
            ip = request.META['REMOTE_ADDR']
            u = str(uuid.uuid1())
            uploaded = datetime.datetime.now()
            # baseupdir is the upload directory, assumed to be defined elsewhere
            fname = os.path.join(baseupdir, u)
            handle_uploaded_file(request.FILES['file'],fname)
            size = os.path.getsize(fname)

            # save() returns None, so there is nothing useful to keep
            myfiles(fname=fname,size=size,uploaded=uploaded,ip=ip,uuid=u).save()

            response = HttpResponse(content="", status=201)
            response["Location"] = "/file?uuid=%s" % u
            return response # 10.2.2 201 Created
        else :
            return HttpResponse(status=400) # 10.4.1 400 Bad Request
    else :
        return HttpResponse(status=400) # 10.4.1 400 Bad Request
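The small view that serves the file has to turn the uuid parameter of the Location url back into a path on disk. A framework-free sketch of that step, assuming the files are stored under their uuid as in the upload view; upload_path is a hypothetical helper and the view around it is left out.

```python
import os
import uuid


def upload_path(base_dir, value):
    """Return the path of an uploaded file, or None if value is not a uuid.

    Parsing the value first means that something like '../../etc/passwd'
    never reaches os.path.join.
    """
    try:
        parsed = uuid.UUID(value)
    except (ValueError, TypeError, AttributeError):
        return None
    # str() gives the same canonical form that str(uuid.uuid1()) produced.
    return os.path.join(base_dir, str(parsed))
```

A view built on top of this would typically answer 404 when the function returns None or when the file does not exist.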

add sqlite3 collation with python 2.5 and django

A while ago I wrote about enabling the sqlite3 extension with storm. This is how you do it with the Django ORM. The collation is the same and all details are in the old post. The only tricky part is to establish the connection with cursor = connection.cursor() before calling the function that enables the extension. Failing to do so will result in an error, as the connection object will be null.

    def add_collation():
        from django.db import connection
        import sqlitext
        cursor = connection.cursor()
        sqlitext.enable_extension(connection.connection,1)
        cursor.execute("SELECT load_extension('sqlite/libcollate_debian.so')")