Strange pagerank

Google uses pagerank to give a site a ranking of importance. Pagerank is strange.

One thing to note is that pagerank is logarithmic-ish: a pagerank of 2 is worth a lot more than just double a pagerank of 1.
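To get a feel for what "logarithmic-ish" means, here is a tiny sketch. The base of 10 is purely an assumption for illustration (the real scale isn't public), and `relative_weight` is a made-up helper, not anything Google provides.

```python
# Assumption: base 10 is a guess for illustration only; the real scale is not public.
ASSUMED_BASE = 10


def relative_weight(pagerank, base=ASSUMED_BASE):
    """Rough 'raw importance' behind a displayed pagerank on a log scale."""
    return base ** pagerank


for rank in (1, 2, 3, 5):
    # Each extra point multiplies the weight by the base instead of adding to it.
    print(rank, relative_weight(rank))
```

Under that assumption, a pagerank of 5 would sit a factor of a hundred above a pagerank of 3, not a factor of 5/3.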

By far the most visited site on my server, this blog, has a pagerank of 3.

Other sites on my server that get occasional visits, like w-nz.com and board.w-nz.com, have a pagerank of 2.

What caught my attention is that xr12.com, which is basically a filler page containing a link to the xr12 wiki, has a pagerank of 5. That is the same pagerank as a big site like Newgrounds!

Maybe Google values xr12.com so much because it is about a single topic and is the only site about that topic, whereas this blog has tons of links about practically everything from widely varying sources.

Pagerank itself isn’t what Google sorts its results by; the context matters more, although pagerank is still an indicator. Maybe Google values a few very specific links above tons of links about very different topics.

Using template variables in extension tags with MediaWiki

MediaWiki, the wiki software used by Wikipedia, is a powerful wiki.

It allows you to include self-made templates to which you can pass variables. A very useful template on Wikipedia is the stub template. When you add {{stub}} at the start of an article it inserts text explaining to the reader that the article isn’t finished yet, but rather a stub. With {{Album_infobox|Name=Sehnsucht|Cover=Sehnsucht_cover.jpg|…}} you insert the infobox that Wikipedia uses for all albums.

When the features of MediaWiki don’t suffice you don’t necessarily have to hack MediaWiki itself to add features. You can add an extension which hooks into a part of the parser to create your own logic for your own custom tags.

I created a wiki to store the lyrics which I had already collected on a forum, and found templates especially useful.

However, there was a slight annoyance when working with templates. In several cases a template contains a link that can be ambiguous (for instance, two songs with the same name), but you don’t want an ugly suffix after the name (like “Deliverance (Album of Opeth)” instead of “Deliverance”). To solve this you have to pass both the display name of the album and the name of the album’s page, which is annoying to do every time.

MediaWiki templates know no logic, so I thought I would solve this by adding some basic logic in a custom tag: it uses the page name as the link text when no custom text is specified, and otherwise uses the custom text. But that just wouldn’t work.
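The logic itself is tiny. As a rough sketch in Python (a real MediaWiki extension would be written in PHP, and `album_link` is a made-up name for illustration), it boils down to choosing between a plain wiki link and a piped one:

```python
def album_link(page, text=None):
    """Build a wiki link, falling back to the page name as the link text.

    Hypothetical helper for illustration; not part of MediaWiki.
    """
    if text is None or text == page:
        return f"[[{page}]]"          # plain link: page name is shown
    return f"[[{page}|{text}]]"       # piped link: custom text is shown


print(album_link("Sehnsucht"))
print(album_link("Deliverance (Album of Opeth)", "Deliverance"))
```

The second call hides the ugly suffix while still pointing at the unambiguous page.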

MediaWiki, when parsing a page, first replaces all nowiki elements, extensions, comments, links and so on with unique IDs. Then it parses the formatting and replaces variables with their values. After that it simply swaps the unique IDs back for the proper content, which in this case included the output of my extension. That output was flawed, because the variables hadn’t been replaced before the extension tag was escaped out, and also because the wiki link ([[pagename]]) was never converted, since the extension was still escaped at that point. So I ended up with [[{{{album}}}]] on my page instead of a Sehnsucht link.

Trying to fetch the template variables inside the extension just wouldn’t work, because the extension was only evaluated at the moment it had already been included in the main page. Avoiding this by disabling the cache would require parsing the whole template each time it is viewed, which is very expensive (each link on a wiki page has to be looked up when parsed to see whether its target exists).

So I googled around a bit and found that Mikesch Nepomuk had submitted a patch on one of the mailing lists to fix this by escaping extensions separately, after the variables were expanded. Apparently it hadn’t been approved, or maybe it just wasn’t included in my release yet, so I tried to apply it, but it failed. The patch was a bit too old. So I did it by hand instead of with GNU patch, and with some tinkering I got it to work.
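The ordering problem is easier to see in a toy model. The sketch below is not MediaWiki code: `<ext>` stands in for a custom extension tag, all function names are invented for illustration, and link rendering is left out entirely. It only shows why stripping the extension before expanding variables leaves {{{album}}} untouched, and how swapping the two steps changes that.

```python
import re

# Toy model only: <ext>...</ext> stands in for a custom extension tag,
# and these helpers are made up for illustration, not MediaWiki functions.
EXT = re.compile(r"<ext>(.*?)</ext>", re.S)
VAR = re.compile(r"\{\{\{(\w+)\}\}\}")


def expand_variables(text, args):
    """Replace {{{name}}} with the template argument of that name."""
    return VAR.sub(lambda m: args.get(m.group(1), m.group(0)), text)


def strip_extensions(text, store):
    """Swap each extension tag for a unique ID, keeping its output aside."""
    def repl(match):
        key = f"\x07UNIQ{len(store)}\x07"
        store[key] = "[[" + match.group(1) + "]]"  # the "extension output"
        return key
    return EXT.sub(repl, text)


def unstrip(text, store):
    """Put the stored extension output back in place of the unique IDs."""
    for key, value in store.items():
        text = text.replace(key, value)
    return text


def parse_strip_first(text, args):
    store = {}
    text = strip_extensions(text, store)   # extension hidden too early
    text = expand_variables(text, args)    # never sees the tag's content
    return unstrip(text, store)


def parse_expand_first(text, args):
    store = {}
    text = expand_variables(text, args)    # variables filled in first
    text = strip_extensions(text, store)   # extension now gets real values
    return unstrip(text, store)


template = "Album: <ext>{{{album}}}</ext>"
print(parse_strip_first(template, {"album": "Sehnsucht"}))
print(parse_expand_first(template, {"album": "Sehnsucht"}))
```

With the strip-first order the extension only ever sees the literal {{{album}}}; with the expand-first order it receives "Sehnsucht".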

You can download the patch for the 1.5 beta4 here, if you are interested.

And I thought I would never touch PHP again.

Dual Cores, Traffic and Gaming

Dual cores are the new thing in the processor business. They are claimed to be perfect for people who want to do different things at the same time, but in reality they are just a cheap way to deliver more.

The truth is that two 500 MHz processors just don’t deliver the calculation power of one 1000 MHz processor, given of course that they are of the same architecture. This is because a lot of additional logic is required to let two processors work together without getting in each other’s way. When two processors both want to write different things to the same spot in memory, you get unpredictable results. There are all kinds of ways to solve this, for instance locking a region of memory for a while and making the other processor wait for the first one to finish. But all of this creates a lot of overhead and complexity.
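As a minimal sketch of the locking idea, with Python threads standing in for the two processors (an assumption for illustration; the names `add_many` and `run` are made up), each worker has to take a lock before touching the shared counter, so the other one waits its turn:

```python
import threading

counter = 0                 # the shared "spot of memory"
lock = threading.Lock()     # guards every write to counter


def add_many(times):
    """Increment the shared counter, taking the lock for each write."""
    global counter
    for _ in range(times):
        with lock:          # the other worker blocks here until we are done
            counter += 1


def run(workers=2, times=100_000):
    """Start the workers, wait for them, and return the final count."""
    global counter
    counter = 0
    threads = [threading.Thread(target=add_many, args=(times,))
               for _ in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter


print(run())
```

The result is always workers times times, but every single increment now pays for acquiring and releasing the lock, which is exactly the kind of overhead meant above.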

You could compare it with driving a car. Driving a car is relatively easy. You just have to steer around some static obstacles, no big deal. When you know the way you could even do it with your eyes closed. That is, unless there are other people driving too. Then you are keeping an eye not so much on the road as on the other people on it. This not only lowers your maximum speed and decreases efficiency (you can’t just drive full speed across a junction), it also increases the complexity and the likelihood of errors.

The same thing goes for dual-core processors (and even hyper-threaded and normal multi-processor platforms). The comparison isn’t entirely valid, though, because having two processors doesn’t mean that you have to do two things at the same time. The issue is rather that what you would normally do is done by only one processor, so you are wasting the other’s capability.

A good example of this is games. Games tend to be single-threaded, which gives the best performance because no processor time is wasted on multithreading, and it is the easiest thing to do because multithreading is rather complicated. Complicated enough that there have been a few lengthy discussions on the Mono mailing lists about how to lock a simple object.

Because we are getting dual-core, and probably ‘more’-core, processors later on, since the companies are too lazy to make decent processors [1], games should become multithreaded to exploit the full capability of the machine they run on.

Although it makes creating high-performance applications a lot more difficult, it will surely benefit distributed computing, because the step from multiple processors to multiple machines is smaller than the step from one processor to several.

[1] Natively threaded CPUs, like dual-core, multi-processor and hyper-threaded ones, are ideal for server applications where many short-lived requests have to be handled. Switching and allocating on a software-threaded processor would create too much overhead for such simple requests.