Bye Bye Spam

I just installed Hash Cash, an anti-spam plugin for WordPress.

Hash Cash protects this blog from spam by requiring the client to execute JavaScript that calculates a checksum of the content from a seed which is very hard to extract.
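As a rough illustration of the idea only (this is not the plugin's code, which runs as JavaScript in the browser), the check could be sketched in Python like this. The seed value, the function names and the use of SHA-256 are all assumptions made up for the sketch.

```python
import hashlib

# Hypothetical seed; the real plugin hides its seed in the page's JavaScript.
SEED = "not-the-real-seed"

def checksum(content, seed=SEED):
    """Checksum the client would compute from the comment text and the seed."""
    return hashlib.sha256((seed + content).encode("utf-8")).hexdigest()

def accept_comment(content, submitted_checksum):
    """Server side: accept the comment only if the checksum matches."""
    return submitted_checksum == checksum(content)
```

A spam bot that posts the form directly without running the script has no valid checksum to send, so its comment is rejected.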

Since I installed it I haven’t got any spam comments :-).

The downside is that anyone without a JavaScript-enabled browser can’t post a comment.

Now I still need a good way to combat trackback spam. Just putting trackbacks under moderation isn’t good enough, for they keep coming.

Shiftings in Architectures

Cell, the hype

Sony is working on a new kind of processor which they call Cell. A single Cell processor is said to have 1/75th of the power of the Blue Gene/L supercomputer, measured in teraflops, and that for 400 dollars.

The new Cell processor is hyped, a lot. People are shouting that the performance of the PlayStation 3, which will feature a Cell processor, will blow away that of its competitors: the Xbox and the Revolution, both using PowerPC processors that are claimed to be slow.

Teraflops

So what are these TeraFLOPS? Why aren’t they talking about Hertz anymore?

One TeraFLOPS is 1,000,000,000,000 FLoating point Operations Per Second (TeraFLOPS on Wikipedia).

A normal computer manages a few GigaFLOPS (one TeraFLOPS is 1000 GigaFLOPS).

The Cell processor in the PS3 will have 1.8 TeraFLOPS. It is very easy to believe that the PS3 will inherently be about 500 times faster than a normal computer. But that is false.
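The "about 500 times" figure is nothing more than a division. A quick check in Python, assuming a normal computer manages 3.6 GigaFLOPS (a number picked for illustration to stand in for "a few", not a measurement):

```python
GIGA = 10**9
TERA = 10**12

ps3_flops = 1.8 * TERA      # figure quoted for the PS3
normal_flops = 3.6 * GIGA   # assumed value for "a few GigaFLOPS"

ratio = ps3_flops / normal_flops
print(round(ratio))  # 500
```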

The performance of a computer isn’t all about how fast it can calculate with floating point numbers; a lot more comes into play. Simply doubling the memory performance (not specifically memory bandwidth, but memory as a whole) would have a bigger impact than doubling the number of TeraFLOPS, even on applications that rely heavily on floating point operations.

So why have a lot of TeraFLOPS then? It is quite simple: the new platforms haven’t got separate GPUs (Graphics Processing Units); their task is now integrated into the CPU. And what was their task? Calculating, a lot, with floating point numbers.
The number of TeraFLOPS quoted for the new platforms is the sum of the TeraFLOPS of both the GPU and the CPU. A GPU needs to do a lot of number crunching for 3D rendering.

Your normal computer already has a lot more FLOPS in its GPU than those few GigaFLOPS in its CPU. Still, the leap to over one TeraFLOPS is certainly impressive.

Marketing

Supercomputers need to do a lot of FLOPS too; they calculate models that involve a lot of floating point numbers. So why put the Cell processor in the PS3 instead of directly in some supercomputer? Simple: marketing. Almost everyone who superficially follows the computer news keeps telling me that the Cell processor is the most incredible thing there is. It is hyped.

Linux on PS3

Also, the Cell processor uses a whole new architecture which still has to be tested a lot to mature. Sony will let you install Linux on your PS3. Why? Simple: because they want the Linux community to add decent support for the Cell architecture.

New Architectures & CLI

I guess it is only a matter of time before more new architectures arrive. A problem with a lot of new architectures is that there isn’t much support for them yet. The solution? Managed runtimes like Java and the CLI behind .NET could provide it.

Microsoft is rumoured to have MSIL running (semi)natively on their Xbox 360, and making Longhorn rely more and more on .NET (the applications on top, not the kernel of course) means a lot less architecture dependency.

I wonder what will happen, but one thing is sure.. things are shifting.

Python Url Encoding

I looked in the Python Library Reference for a function to encode special characters in strings so they can be put in URLs (and URIs), but found none.

Maybe I missed it?

Anyway, here is an implementation I wrote based on this article:

HexCharacters = "0123456789abcdef"

def UrlEncode(s):
    """Percent-encode every character outside the safe set."""
    r = ''
    for c in s:
        o = ord(c)
        # Safe: digits, lowercase, uppercase and the marks $-_.+!*'(),
        if (o >= 48 and o <= 57) or \
            (o >= 97 and o <= 122) or \
            (o >= 65 and o <= 90) or \
            o == 36 or o == 45 or o == 95 or \
            o == 46 or o == 43 or o == 33 or \
            o == 42 or o == 39 or o == 40 or \
            o == 41 or o == 44:
            r += c
        else:
            r += '%' + CleanCharHex(c)
    return r

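def CleanCharHex(c):
    """Two lowercase hex digits for a character code below 256."""
    o = ord(c)
    # Integer division: plain / yields a float on Python 3 and cannot index.
    r = HexCharacters[o // 16]
    r += HexCharacters[o % 16]
    return r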

Note: I compared character numbers instead of the characters themselves so I could use greater than and less than for the alphanumeric ranges. Maybe testing with a bitmask would be more efficient.
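A minimal sketch of what such a bitmask test could look like, using one big Python integer with one bit per character code. The names SAFE, SAFE_MASK and is_safe are made up for this sketch, and whether it is actually faster is untested.

```python
# Characters that are passed through unencoded (same set as UrlEncode above).
SAFE = ("0123456789"
        "abcdefghijklmnopqrstuvwxyz"
        "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
        "$-_.+!*'(),")

# One bit per character code: bit n is set when chr(n) is safe.
SAFE_MASK = 0
for ch in SAFE:
    SAFE_MASK |= 1 << ord(ch)

def is_safe(c):
    """True when the character may appear unencoded."""
    return ((SAFE_MASK >> ord(c)) & 1) == 1
```

The long chain of comparisons in UrlEncode then shrinks to a single is_safe(c) call.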

Almost every time I work with Python I have to write my own something-to-hex function that doesn’t add the ‘0x’ in front like the built-in hex does.
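For the record, the built-in format function and the % operator both produce hex digits without the prefix. A small sketch; the to_hex helper name is just something made up here:

```python
def to_hex(n, width=2):
    """Hex digits of n, zero-padded to width, without a '0x' prefix."""
    return format(n, "0%dx" % width)

print(to_hex(10))    # 0a
print(to_hex(255))   # ff
print("%02x" % 32)   # 20
```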

Either I can’t search or Python hasn’t got enough batteries included. I guess the first; if not, I’ll submit my batteries.
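It does look like the first: the standard library has a quote function for exactly this, found in urllib.parse on Python 3 (on Python 2 it was urllib.quote). Its set of safe characters is not the same as in my version, so the output can differ for some punctuation.

```python
from urllib.parse import quote

print(quote("a b&c"))          # a%20b%26c
print(quote("a/b", safe=""))   # a%2Fb
```

By default quote leaves '/' alone; passing safe="" encodes it as well.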

BSD

BSD is Unix.. said to be the professional cousin of Linux.

This piece of propaganda for BSD and against Linux argues that Linux is bad because it is supposedly just hacked together. The reasoning is that the people who develop Linux are just community members who put in a little bit of time to get a feature (one they probably want themselves) added, and who don’t really care about making it perfect.

They took some of the sarcastic ‘todo’ comments in the kernel as an example, claiming that if that stuff is in the kernel, Linux can’t be trusted at all.

But why hasn’t BSD got the widespread hardware support Linux has? They blame big companies like IBM, for instance. I just wonder whether their 60 dedicated BSD programmers could code all the hardware drivers that the thousands of contributors to Linux have coded, even if that code were as bad as they say it is.

I bet that *BSD has got more than enough of those comments in its code too. If it hasn’t, they are just hiding the truth and hiding points of improvement, for these TODOs and FIXMEs are fixed in the end. And even if they get rid of all the TODOs and FIXMEs before they release anything, they waste a lot of time in which the code could already have been used (less efficiently, though it usually doesn’t make such a big difference).

Services, a different approach

A Server is a computer (or a group of them) that offers services.

Such services could be a webmarket, a website, a game server and so on.

These services all tend to be implemented quite differently, hardly interacting at all. Each one requires a different approach, a different way to manage it. This creates both overhead and chaos.

I’m working on a small, very flexible framework intended to host services.

Framework
A service isn’t a single process as it usually is, although it could be put in its own separate process. A service is rather just an instance of its service type somewhere in the framework, meaning you can have multiple instances of one service with different configurations scattered across processes and even machines as you like.

You can edit the settings and manage the services using the same API (and likely the same configuration tool).

A service isn’t aware of where it is hosted; from its perspective it is just in a framework with other services and service modules, with which it can always interact in the same way.

You as administrator, however, are aware of the services and their allocation. You can allocate a service (even at runtime) to different locations, which don’t necessarily have to be processes but can also be lightweight processes such as .NET appdomains, or processes on a different machine.

Having a separate process for each service decreases performance but increases stability, and vice versa.
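To make the idea a bit more concrete, here is a minimal Python sketch of instances and their allocation. Everything in it (the Service and Framework classes, the location labels) is hypothetical and only models the bookkeeping; it does not start or move any real process.

```python
class Service:
    """One configured instance of a service type."""
    def __init__(self, name, config):
        self.name = name
        self.config = config

class Framework:
    """Tracks which location (process, appdomain, machine) hosts each instance."""
    def __init__(self):
        self.services = {}    # instance name -> Service
        self.locations = {}   # instance name -> location label

    def add(self, service, location):
        self.services[service.name] = service
        self.locations[service.name] = location

    def move(self, name, location):
        # Reallocate at runtime; the service itself is not told where it runs.
        self.locations[name] = location

fw = Framework()
fw.add(Service("web-a", {"port": 80}), "process-1")
fw.add(Service("web-b", {"port": 8080}), "process-1")
fw.move("web-b", "machine-2/process-1")
print(fw.locations["web-b"])  # machine-2/process-1
```

Two instances of the same service type run with different configurations, and the administrator moves one of them without the instance itself changing.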

Interaction
Services shouldn’t be restricted to their own process but should also be able to interact with other services by using their exposed interfaces.

An example could be a webserver service which you provide with the name of the logger service you want it to use. The webserver could just interact by casting an object in the public interface dictionary of the logger to a log interface. The framework takes care of the required underlying communication, like remoting in case they are hosted in different processes or even on different machines.
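A minimal sketch of that webserver and logger example in Python, with both services living in one process so the remoting part is left out. The class names, the registry dictionary and the "log" key are assumptions for illustration, not an existing API.

```python
class Logger:
    """Hypothetical logger service."""
    def __init__(self):
        self.lines = []
        # Public interface dictionary: interface name -> object implementing it.
        self.interfaces = {"log": self}

    def write(self, message):
        self.lines.append(message)

class WebServer:
    """Hypothetical webserver service that is configured with a logger name."""
    def __init__(self, registry, logger_name):
        # Look the logger up by name and take its "log" interface.
        self.log = registry[logger_name].interfaces["log"]

    def handle(self, uri):
        self.log.write("GET " + uri)

registry = {"main-logger": Logger()}
server = WebServer(registry, "main-logger")
server.handle("/index")
print(registry["main-logger"].lines)  # ['GET /index']
```

In the real framework the object handed back from the interface dictionary could just as well be a proxy to another process or machine; the webserver would not notice the difference.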

A tighter type of interaction would be modules. A module of a webserver could be a webmarket. The webmarket would just access the object of the webserver which handles hooking to certain URIs, add its hooks and await a visit. Other services could do the same, but a module is specifically bound to the service it is attached to.

Furthermore, for functionality that is module-like, modules would be an easier way for administrators to manage it than filling in instance paths in the settings of a certain service.
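The module idea could be sketched in Python roughly like this. The WebServer and WebMarket classes and their attach, register and visit methods are hypothetical names chosen for the sketch.

```python
class WebServer:
    """Hypothetical webserver that lets modules hook URIs."""
    def __init__(self):
        self.hooks = {}     # URI -> handler function
        self.modules = []

    def attach(self, module):
        # A module is bound to this one server and registers its own hooks.
        self.modules.append(module)
        module.register(self)

    def visit(self, uri):
        handler = self.hooks.get(uri)
        return handler(uri) if handler else "404"

class WebMarket:
    """Hypothetical webmarket module."""
    def register(self, server):
        server.hooks["/market"] = self.show

    def show(self, uri):
        return "market page"

server = WebServer()
server.attach(WebMarket())
print(server.visit("/market"))  # market page
print(server.visit("/other"))   # 404
```

The administrator only attaches the module to a server; no instance paths have to be filled in anywhere.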

Portability
I’m still in doubt whether I should make it cross-language. It would be a good thing, but it would certainly be a headache to make everything run smoothly and alike in all languages. Not to mention the overhead of loading each language’s runtime.

PowerPC dumped

Apple to ditch IBM, switch to Intel chips

When Apple switches to x86 and ia64 it will be possible to run Mac OS X on your own home-bought computer, provided they don’t prevent it by changing some opcodes.

One other thing is sure: they will lose some power. As long as they provide binaries, and provide one set for all Apple computers, they will have to support both PPC and x86 in one binary distribution, which will undoubtedly be a lot bigger, if not slower too.

It will be interesting to see how this issue will evolve.

Subversion

I’ve recently been using version control a lot, more specifically Subversion. It makes it a lot easier to share source code. Where it used to be impossible to work on the same project, or even on another project that depends on the first, because sharing source code is tricky, Subversion was the solution.

mount..?
As far as I know there isn’t a program that allows you to mount a Subversion repository into the Linux filesystem. This would make managing a Subversion repository a bit easier, for at the moment you need to use svn add to add every single file, svn cp to copy, svn rm to remove, and so on, all of which could be integrated when wrapping a repository up as a filesystem.
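The core of such a wrapper would be a translation from filesystem operations to svn subcommands. A minimal Python sketch of that translation only; the svn_command function and the operation names are made up, and nothing is executed or mounted here.

```python
def svn_command(operation, *paths):
    """Translate a filesystem-style operation into an svn command line."""
    subcommands = {"create": "add", "copy": "cp", "remove": "rm", "rename": "mv"}
    return ["svn", subcommands[operation]] + list(paths)

print(svn_command("create", "notes.txt"))     # ['svn', 'add', 'notes.txt']
print(svn_command("copy", "a.txt", "b.txt"))  # ['svn', 'cp', 'a.txt', 'b.txt']
```

A real mount would have to hook this into the filesystem layer and decide when to commit, which is the hard part and is not shown.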

It is possible I haven’t been searching in the right place. Searching for it just gave me an enormous list of other people craving it.

pruning
To my knowledge there is no really easy way to get rid of older versions. Of course it would go against the basic idea behind the system, storing every single version, but sometimes keeping everything just isn’t practical. In case you are dealing with extra non-source files, like images or data files, which are usually quite large in comparison with source code, you’ll get a really big and slow repository.

Adding a feature that stores every file version before version x, unless it is the top version, in a compressed archive of some type would be really nice. Although it would make access to the older versions drastically slower, it does decrease the space used and speeds up access to the top versions, which are used a lot more than the older ones.
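A toy sketch of that trade-off in Python. It has nothing to do with how Subversion really stores its data: revisions are just a dictionary of bytes here, and archive_old and read_archived are invented names.

```python
import pickle
import zlib

def archive_old(revisions, x):
    """Compress every revision below x, except the newest one, into one blob."""
    top = max(revisions)
    old = {r: data for r, data in revisions.items() if r < x and r != top}
    live = {r: data for r, data in revisions.items() if r not in old}
    return live, zlib.compress(pickle.dumps(old))

def read_archived(blob, revision):
    """Slow path: decompress the whole archive to read one old revision."""
    return pickle.loads(zlib.decompress(blob))[revision]

revisions = {1: b"a" * 1000, 2: b"b" * 1000, 3: b"c" * 1000}
live, blob = archive_old(revisions, 3)
print(sorted(live))                            # [3]
print(read_archived(blob, 1) == revisions[1])  # True
```

Reading an old revision now means unpacking the whole archive, while the top revision stays directly available.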