Unicode to ASCII (1)

When generating usernames from real names, which can contain non-ASCII characters, you can’t simply drop those characters. For instance, danielle@blaat.org is the right e-mail address for Daniëlle; danille@blaat.org isn’t.

There’s a trick. Unicode has a single code point for ë itself, but it also has a combining code point which (simplified) adds ¨ on top of the previous character. The Unicode standard defines a normal form in which (at least) every character that can be written as a base character plus such combining marks is written that way. If you then simply drop the code points that ASCII can’t represent, you’ll get the desired result.

In Python: unicodedata.normalize('NFKD', txt).encode('ASCII', 'ignore').
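Spelled out as a small function, the one-liner could look like the sketch below. The name to_ascii is only chosen for illustration, and the final decode step is an addition that assumes Python 3, where encode() returns bytes rather than text.

```python
import unicodedata


def to_ascii(txt):
    """Strip accents by decomposing characters and dropping what ASCII can't hold."""
    # NFKD splits e.g. 'ë' into 'e' followed by the combining diaeresis U+0308.
    decomposed = unicodedata.normalize('NFKD', txt)
    # 'ignore' silently drops every code point outside ASCII, including U+0308.
    return decomposed.encode('ASCII', 'ignore').decode('ASCII')


print(to_ascii('Daniëlle'))  # Danielle
```

Note that characters without a decomposition are dropped entirely, so the result can be shorter than the input.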

However, this isn’t the right solution in every language. For instance, in German, ue is preferred over u as a replacement for ü.
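One way to handle such cases is to apply a language-specific replacement table before normalizing. The sketch below is only an illustration: the table GERMAN_REPLACEMENTS and the function german_to_ascii are hypothetical names, and the table covers just the common German conventions. It also handles ß, which has no decomposition and would otherwise be dropped.

```python
import unicodedata

# Hypothetical table of common German transliterations (an assumption,
# not an exhaustive or official list).
GERMAN_REPLACEMENTS = str.maketrans({
    'ä': 'ae', 'ö': 'oe', 'ü': 'ue',
    'Ä': 'Ae', 'Ö': 'Oe', 'Ü': 'Ue',
    'ß': 'ss',
})


def german_to_ascii(txt):
    """Apply the German replacements first, then fall back to NFKD stripping."""
    replaced = txt.translate(GERMAN_REPLACEMENTS)
    decomposed = unicodedata.normalize('NFKD', replaced)
    return decomposed.encode('ASCII', 'ignore').decode('ASCII')


print(german_to_ascii('Müller'))  # Mueller
print(german_to_ascii('Straße'))  # Strasse
```

The catch is that you need to know the language of the name up front: the same ü would be handled differently for a German name than for, say, a Turkish one.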

GStreamer: accurate duration

When decoding, for instance, a variable-bitrate MP3, the durations GStreamer reports are, to say the least, estimates. I’ve tried to get a better result in a few ways. First off, some files yield a duration tag, but even if you’re lucky and it is there, there are no guarantees about its precision. After that I tried seeking to the end of the stream (GST_SEEK_END) and querying the position, which GStreamer didn’t like. Finally, routing the audio into a fakesink, waiting for the end of stream and then querying the position gives the right result. It’s not the prettiest method, but it works.

This is a Python script that prints the duration of a media file to stdout.
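A minimal sketch of such a script, not necessarily identical to the original. It assumes the GStreamer 1.0 Python bindings (PyGObject) are installed and uses playbin with fakesinks for convenience. The names measure_duration and ns_to_seconds are chosen for illustration only.

```python
NANOSECONDS_PER_SECOND = 1_000_000_000  # GStreamer reports time in nanoseconds


def ns_to_seconds(ns):
    """Convert a GStreamer time value (nanoseconds) to seconds."""
    return ns / NANOSECONDS_PER_SECOND


def measure_duration(uri):
    """Decode `uri` to the end into fakesinks and return the final position in seconds."""
    # Imported here so ns_to_seconds stays usable without GStreamer installed.
    import gi
    gi.require_version('Gst', '1.0')
    from gi.repository import Gst

    Gst.init(None)
    pipeline = Gst.ElementFactory.make('playbin', None)
    pipeline.set_property('uri', uri)
    # Discard all decoded data; fakesink does not sync to the clock by default,
    # so decoding runs as fast as the machine allows.
    pipeline.set_property('audio-sink', Gst.ElementFactory.make('fakesink', None))
    pipeline.set_property('video-sink', Gst.ElementFactory.make('fakesink', None))

    pipeline.set_state(Gst.State.PLAYING)
    try:
        # Block until the stream has been decoded completely or an error occurs.
        bus = pipeline.get_bus()
        msg = bus.timed_pop_filtered(
            Gst.CLOCK_TIME_NONE,
            Gst.MessageType.EOS | Gst.MessageType.ERROR,
        )
        if msg.type == Gst.MessageType.ERROR:
            err, _debug = msg.parse_error()
            raise RuntimeError(err.message)
        # At end of stream the position is the actual decoded length.
        ok, position = pipeline.query_position(Gst.Format.TIME)
        if not ok:
            raise RuntimeError('position query failed')
        return ns_to_seconds(position)
    finally:
        pipeline.set_state(Gst.State.NULL)
```

Calling print(measure_duration('file:///path/to/file.mp3')) with a real file URI should then print the length in seconds. Since the whole file is decoded, this takes noticeably longer than reading a tag.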

“waiting for x server to begin accepting connections”

Try DisallowTCP = false under [Security] in /etc/X11/Sessions/Gnome/custom.conf, or the equivalent for other window managers. Obviously this isn’t a very desirable solution; emerge --emptytree gnome might do the trick too.

(and obviously this might be just one of the many underlying causes for the very generic symptom of X not accepting connections)