Virtual package for your Python application

When you’ve got a big Python application, you’ll usually split it up into modules. One big annoyance I’ve had is that a module inside a directory cannot (easily) import a module higher up in the tree. E.g. drawers/gtk.py cannot import state/bla.py.

This is usually solved by making the application a package, which allows import myapp.drawers.gtk from everywhere inside your application. To make it a package, though, you need to add the parent directory to the sys.path list. Unfortunately, this also makes all other subdirectories of the parent directory importable as packages.

However, once the package module (e.g. myapp) has been loaded, the path from which myapp was loaded is used to find its submodules (e.g. myapp.drawers.gtk), and sys.path isn’t looked at at all. So, here is the trick:

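import os.path
import sys

# Absolute path of the application root (the directory holding this file).
# abspath() first, so a relative __file__ cannot yield an empty name.
p = os.path.dirname(os.path.abspath(__file__))
# Temporarily expose the parent directory so the root imports as a package.
sys.path.append(os.path.dirname(p))
__import__(os.path.basename(p))
# Drop the entry again so sibling directories don't become importable.
sys.path.pop()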

Note that this script doesn’t work when directly executed, because the __file__ attribute is only available when loaded as a module.

Save this script as loader.py in the root of your application. import loader from the main script in your app, and you’ll be able to import modules by myapp.a.module, where myapp is the root directory name of your application.
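To see the effect without touching a real application, here is a minimal, self-contained sketch. The package name demoapp_vp and its layout are made up for the demonstration, and it builds the tree in a temporary directory instead of using __file__. It only illustrates that a submodule can still be imported after the parent directory has been removed from sys.path again.

```python
import importlib
import os
import shutil
import sys
import tempfile


def demo_virtual_package():
    """Import a submodule after the parent dir was removed from sys.path."""
    parent = tempfile.mkdtemp()
    root = os.path.join(parent, "demoapp_vp")   # hypothetical application root
    state = os.path.join(root, "state")
    os.makedirs(state)
    for directory in (root, state):
        open(os.path.join(directory, "__init__.py"), "w").close()
    with open(os.path.join(state, "bla.py"), "w") as f:
        f.write("VALUE = 42\n")

    before = list(sys.path)
    sys.path.append(parent)                 # same steps as loader.py
    importlib.invalidate_caches()
    importlib.import_module("demoapp_vp")   # load the root package
    sys.path.pop()                          # parent is gone from sys.path

    # Found through demoapp_vp.__path__, not through sys.path.
    bla = importlib.import_module("demoapp_vp.state.bla")
    shutil.rmtree(parent, ignore_errors=True)
    return sys.path == before, bla.VALUE


print(demo_virtual_package())
```

The first element of the result says whether sys.path ended up unchanged, the second is the value read from the submodule.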

Using template variables in extension tags with MediaWiki

MediaWiki, the wiki software used by Wikipedia, is a powerful wiki.

It allows you to include self-made templates to which you can pass variables. A very useful template on Wikipedia is the stub template. When you add {{stub}} at the start of an article, it inserts text telling the reader that the article isn’t finished yet, but is rather a stub. With {{Album_infobox|Name=Sehnsucht|Cover=Sehnsucht_cover.jpg|…}} an infobox is inserted, which Wikipedia uses for all albums.

When the features of MediaWiki don’t suffice, you don’t necessarily have to hack MediaWiki itself to add features. You can add an extension which hooks into a part of the parser to create your own logic for your own custom tags.

I created a wiki to store the lyrics which I already had on a forum, and found templates especially useful.

However, there was a slight annoyance when working with templates. In several cases a template contains a link that can be ambiguous (there are two songs with the same name), but you don’t want an ugly suffix after the name (like “Deliverance (Album of Opeth)” instead of “Deliverance”). To solve this you have to pass both the name of the album and the name of the album’s page, which is annoying to do every time.

MediaWiki knows no logic, so I thought I’d solve this by adding some basic logic in a custom tag, which uses the page name as the link text when no custom text is specified and the custom text otherwise. But that just wouldn’t work.
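The fallback itself is tiny. As a rough sketch of the idea (in Python purely for illustration, since the real extension hooks into MediaWiki’s PHP parser; the function name wiki_link is hypothetical):

```python
def wiki_link(page, text=None):
    """Build a wiki link, falling back to the page name as the label."""
    if not text or text == page:
        # No custom text given: a plain link is enough.
        return "[[%s]]" % page
    # Custom text given: use the piped-link form [[page|label]].
    return "[[%s|%s]]" % (page, text)


print(wiki_link("Sehnsucht"))
print(wiki_link("Deliverance (Album of Opeth)", "Deliverance"))
```

The second call produces a piped link, so the page shows “Deliverance” while still pointing at the disambiguated page name.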

MediaWiki, when parsing a page, first escapes all nowiki elements, extensions, comments, links and so on by replacing each with a unique ID. Then it parses the formatting and replaces variables with their values. After that it simply swaps the unique IDs back for the proper content, which in this case included the output of my extension. That output was flawed for two reasons: the variables hadn’t been replaced before the extension tag was escaped out, and the wiki link ([[pagename]]) wasn’t converted because the extension was still escaped at that point. So I ended up with [[{{{album}}}]] on my page instead of a Sehnsucht link.
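The ordering problem can be reproduced with a toy model. This is not MediaWiki’s code, just a sketch under simplifying assumptions: a hypothetical <ext> tag whose “output” is simply its inner text, {{{name}}} variables, and no link rendering at all. The expand_first flag mimics the idea of expanding variables before extensions are escaped.

```python
import re


def render(text, variables, expand_first):
    """Toy parser: stash <ext> tags, expand {{{vars}}}, then restore."""
    def expand(t):
        # Replace {{{name}}} with its value; unknown names stay as-is.
        return re.sub(r"\{\{\{(\w+)\}\}\}",
                      lambda m: variables.get(m.group(1), m.group(0)), t)

    if expand_first:
        text = expand(text)          # variables are filled in before escaping

    stash = {}

    def strip(m):
        key = "\x07UNIQ%d\x07" % len(stash)
        stash[key] = m.group(1)      # toy extension: output equals inner text
        return key

    text = re.sub(r"<ext>(.*?)</ext>", strip, text)   # escape by unique ID
    text = expand(text)                               # normal variable pass
    for key, value in stash.items():
        text = text.replace(key, value)               # put content back
    return text


page = "{{{album}}}: <ext>[[{{{album}}}]]</ext>"
print(render(page, {"album": "Sehnsucht"}, expand_first=False))
print(render(page, {"album": "Sehnsucht"}, expand_first=True))
```

With expand_first=False the variable inside the tag survives untouched, which is exactly the [[{{{album}}}]] symptom; with expand_first=True the tag already sees the expanded value.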

Trying to fetch template variables inside the extension just wouldn’t work either, because the extension was only evaluated once it had already been included in the main page. Avoiding this by disabling the cache would require parsing the whole template each time it is viewed, which is very expensive (each link on a wiki page has to be looked up during parsing to see whether its target exists).

So I googled around a bit and found that Mikesch Nepomuk had submitted a patch on one of the mailing lists to fix this by escaping extensions separately, after the variables were expanded. Apparently it hadn’t been approved, or maybe it just wasn’t included in my release yet, so I tried to apply it, but it failed. The patch was a bit too old. So I did it by hand instead of with GNU patch, and with some tinkering I got it to work.

You can download the patch for the 1.5 beta4 here, if you are interested.

And I thought I would never touch PHP again.

Python URL Encoding

I looked in the Python Library Reference for a function that encodes special characters in strings so they can be put in URLs (and URIs), but found none.

Maybe I missed it?

Anyways, here is an implementation I wrote based on this article:

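HexCharacters = "0123456789abcdef"

def UrlEncode(s):
    """Percent-encode every character that is not in the safe set."""
    r = ''
    for c in s:
        o = ord(c)
        # Safe: digits, lowercase and uppercase letters, and $ - _ . + ! * ' ( ) ,
        if (o >= 48 and o <= 57) or \
           (o >= 97 and o <= 122) or \
           (o >= 65 and o <= 90) or \
           o == 36 or o == 45 or o == 95 or \
           o == 46 or o == 43 or o == 33 or \
           o == 42 or o == 39 or o == 40 or \
           o == 41 or o == 44:
            r += c
        else:
            r += '%' + CleanCharHex(c)
    return r

def CleanCharHex(c):
    """Return the two-digit lowercase hex code of c (character codes below 256)."""
    o = ord(c)
    # Integer division, so the index stays an int on Python 3 as well.
    r = HexCharacters[o // 16]
    r += HexCharacters[o % 16]
    return r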

Note: I compared character numbers instead of the characters themselves so I could use greater-than and less-than for the alphanumeric ranges. Maybe testing with a bitmask would be more efficient.

Almost every time I work with Python I have to write my own something-to-hex function that, unlike the built-in hex, doesn’t add the ‘0x’ in front.

Either I can’t search or Python hasn’t got enough batteries included. I guess the former; if not, I’ll submit my batteries.
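For what it’s worth, current Python 3 does seem to have both batteries: urllib.parse.quote for the encoding and format() for prefix-free hex. Below is a sketch that approximates the hand-written version; the names url_encode, char_hex and SAFE are mine. Details differ: quote emits uppercase hex digits, encodes non-ASCII text as UTF-8 first, and recent versions always leave ~ alone.

```python
from urllib.parse import quote

# Punctuation treated as safe by the hand-written check (letters and digits
# are always left alone by quote()).
SAFE = "$-_.+!*'(),"


def url_encode(s):
    """Percent-encode s, keeping only SAFE and alphanumerics unescaped."""
    return quote(s, safe=SAFE)


def char_hex(c):
    """Two-digit lowercase hex of a character, without the '0x' prefix."""
    return format(ord(c), "02x")


print(url_encode("a b/c"))
print(char_hex("\n"))
```

Passing safe explicitly matters here, because by default quote leaves / unescaped.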

Base64 encoding/decoding algorithm

I’ve made some Python functions to encode/decode base64. I’ve been trying to develop my own algorithm for base64 for the email protection script which can be found here.

Python has again proved itself to be a great language for quickly developing stuff.

def tobase64(s, padd = False):
    b64s = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"
    b64p = "="
    ret = ""
    left = 0
    for i in range(0, len(s)):
        if left == 0:
            ret += b64s[ord(s[i]) >> 2]
            left = 2
        else:
            if left == 6:
                ret += b64s[ord(s[i - 1]) & 63]
                ret += b64s[ord(s[i]) >> 2]
                left = 2
            else:
                index1 = ord(s[i - 1]) & (2 ** left - 1)
                index2 = ord(s[i]) >> (left + 2)
                index = (index1 << (6 - left)) | index2
                ret += b64s[index]
                left += 2
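    # Flush the bits still pending from the last byte.
    if left != 0:
        ret += b64s[(ord(s[len(s) - 1]) & (2 ** left - 1)) << (6 - left)]
    # Optionally pad the output with "=" to a multiple of four characters.
    if padd:
        for i in range(0, (4 - len(ret) % 4) % 4):
            ret += b64p
    return ret


def frombase64(s):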
    b64s = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"
    b64p = "="
    ret = ""
    s2 = s.replace(b64p, "")
    left = 0
    for i in range(0, len(s2)):
        if left == 0:
            left = 6
        else:
            value1 = b64s.index(s2[i - 1]) & (2 ** left - 1)
            value2 = b64s.index(s2[i]) >> (left - 2)
            value = (value1 << (8 - left)) | value2
            ret += chr(value)
            left -= 2
    return ret

By default the algorithm doesn’t add the required = padding while encoding (pass padd = True for that), nor does it require the padding while decoding.
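A handy way to cross-check functions like these is the standard-library base64 module. The sketch below wraps it so that it follows the same padding convention (no = unless asked for); the wrapper names are made up, and unlike the functions above it works on bytes rather than strings.

```python
import base64


def reference_b64(data, pad=False):
    """Encode bytes with the stdlib; strip '=' padding unless pad is set."""
    encoded = base64.b64encode(data).decode("ascii")
    return encoded if pad else encoded.rstrip("=")


def reference_unb64(text):
    """Decode base64 text whose '=' padding may be missing."""
    # b64decode insists on padding, so add back whatever is missing.
    padded = text + "=" * (-len(text) % 4)
    return base64.b64decode(padded)


print(reference_b64(b"Mann"))
print(reference_b64(b"Mann", pad=True))
print(reference_unb64("TWFubg"))
```

Comparing tobase64("Mann") against reference_b64(b"Mann") for a handful of inputs of different lengths should catch most bit-shifting mistakes.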

RGB to Hex (and why the python interactive mode is so damned handy)

Update 2010-02-01: Thanks to Sameer for pointing out the short way of doing this: return "#%02X%02X%02X" % (r,g,b)

def tohex(r,g,b):
	hexchars = "0123456789ABCDEF"
	# Integer division keeps the indices ints on Python 3 as well.
	return "#" + hexchars[r // 16] + hexchars[r % 16] + hexchars[g // 16] + hexchars[g % 16] + hexchars[b // 16] + hexchars[b % 16]
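For completeness, here is the short form from the update wrapped in a function, with a couple of values to try in the interactive prompt. The name tohex_short is just for this sketch. Unlike the table lookup, it does not fail on values above 255; it simply prints more digits.

```python
def tohex_short(r, g, b):
    """Format three 0-255 channel values as #RRGGBB using %02X."""
    return "#%02X%02X%02X" % (r, g, b)


print(tohex_short(255, 128, 0))
print(tohex_short(0, 0, 0))
```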

Python is very convenient when you need to program a simple algorithm. I programmed this RGB to Hex converter in less than 1 minute, including using the function for a few RGB values I needed to convert.

Usually I just grab a pen and paper and do the calculations myself, because getting the Windows calculator to show up and actually calculate stuff properly (looking at the screen, writing down, forgetting to press C, starting again…), or writing a program in a language like C#, would just take too much time.

I’ve been using IDLE for quite some time as a very good replacement for both my calculator and my pen and paper.

Parsing $_SERVER['PATH_INFO']

The PHP global variable $_SERVER['PATH_INFO'] contains the path suffixed to a PHP script. If I were to call the URL:

http://domain.ext/path/to/script.php/foo/bar.htm?a=b&c=d

Then $_SERVER['PATH_INFO'] would contain:

/foo/bar.htm

Traditionally, $_GET variables are used for parameters such as the page to display:

http://domain.ext/page.php?page=about.htm

This method is easy to program, but it not only looks strange, it is also very search-engine unfriendly. Most search engines ignore the query string (the part of the URL after the ?), and therefore would index the first page.php?page=x they find and ignore the rest.
Some search engines, like Google, do not ignore the query string, but would give a much higher ranking to pages that serve different content without using a query string.

Parsing $_SERVER['PATH_INFO'] is relatively easy; this code does most of the work just fine:

if (!isset($_SERVER['PATH_INFO'])){
	$pathbits= array('');
}else{
	$pathbits = explode("/",  $_SERVER['PATH_INFO']);
}

If a path info was provided, the $pathbits array always has an empty string as its first element (the part before the leading /), followed by the path segments; otherwise it is an array containing a single empty string.

Here is a quite simple example which parses the path info to decide which file to include:

<?php
if (!isset($_SERVER['PATH_INFO'])){
	$pathbits= array('');
}else{
	$pathbits = explode("/",  $_SERVER['PATH_INFO']);
}
if (!isset($pathbits[1]) || $pathbits[1] == ""){
	$page = "default";
}else{
	$page = basename($pathbits[1]);
}
$file = "./pages/{$page}.php";
if (!is_file($file)){
	echo "File not found";
}else{
	require $file;
}
?>
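To poke at the page-selection logic outside a web server, here is a rough Python transliteration of it. The function name page_for is made up, and str.split is assumed to behave like explode for these inputs. It only mirrors the decision of which page name to use, not the include itself.

```python
import os.path


def page_for(path_info):
    """Map a PATH_INFO string (or None when unset) to a page name."""
    # "/foo/bar.htm".split("/") gives ['', 'foo', 'bar.htm']: the first
    # element is the empty string before the leading slash.
    pathbits = path_info.split("/") if path_info is not None else [""]
    if len(pathbits) < 2 or pathbits[1] == "":
        return "default"
    # basename() strips any directory part from the chosen segment.
    return os.path.basename(pathbits[1])


print(page_for(None))
print(page_for("/about.htm"))
print(page_for("/foo/bar.htm"))
```

Note that only the first path segment picks the page; anything after it (bar.htm in the last call) is ignored by this logic.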