Syslets and other untrusted kernelspace programming

In the wake of the effort to create true asynchronous I/O in the Linux kernel, Ingo Molnar worked on so-called syslets. A syslet is nothing more than a single asynchronous system call that chains together a few arbitrary system calls. An example syslet would be:

1. Read from file descriptor 123
2. If 1 returned with an error, break
3. Write what was read to file descriptor 321
4. If 3 succeeded, jump to 1; otherwise break

This is represented by a few structs linked together, of which the first is passed to the kernel with the syslet system call. The system call returns almost immediately, and later the application can be notified of (or can wait for) the completion of the syslet. This could save an enormous number of system calls. That means far fewer context switches, which is very good for performance.
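The linked structs can be pictured with a small userspace simulation. This is only a sketch: `struct atom`, `struct ctx`, `run_chain` and `copy_with_chain` are names made up for illustration and are not the real syslet interface, and two memory buffers stand in for file descriptors 123 and 321. It also assumes that reaching the end of the input counts as a failure, so that the loop ends.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define CHUNK 4  /* bytes moved per read/write pair; small on purpose */

/* State shared by all steps; stands in for the two file descriptors. */
struct ctx {
    const char *src; size_t src_len, src_pos;  /* "fd 123" */
    char *dst;       size_t dst_cap, dst_pos;  /* "fd 321" */
    char buf[CHUNK];
    size_t buffered;                           /* bytes held in buf */
};

/* One link in the chain (hypothetical layout, not the kernel's). */
struct atom {
    long (*step)(struct ctx *);     /* a result > 0 means success */
    const struct atom *on_success;  /* NULL means break */
    const struct atom *on_failure;  /* NULL means break */
};

static long step_read(struct ctx *c)
{
    size_t n = c->src_len - c->src_pos;
    if (n > CHUNK)
        n = CHUNK;
    memcpy(c->buf, c->src + c->src_pos, n);
    c->src_pos += n;
    c->buffered = n;
    return n ? (long)n : -1;  /* assumption: end of input counts as failure */
}

static long step_write(struct ctx *c)
{
    assert(c->buffered <= CHUNK);
    if (c->dst_cap - c->dst_pos < c->buffered)
        return -1;  /* destination full */
    memcpy(c->dst + c->dst_pos, c->buf, c->buffered);
    c->dst_pos += c->buffered;
    return (long)c->buffered;
}

/* Follow the links until one of them is NULL. */
static void run_chain(const struct atom *a, struct ctx *c)
{
    while (a != NULL)
        a = (a->step(c) > 0) ? a->on_success : a->on_failure;
}

/* Build the four-step loop from the list above and run it once. */
static size_t copy_with_chain(const char *src, size_t len, char *dst, size_t cap)
{
    struct atom rd, wr;
    struct ctx c = { src, len, 0, dst, cap, 0, {0}, 0 };

    rd = (struct atom){ step_read,  &wr, NULL };  /* steps 1 and 2 */
    wr = (struct atom){ step_write, &rd, NULL };  /* steps 3 and 4 */
    run_chain(&rd, &c);
    return c.dst_pos;  /* number of bytes copied */
}
```

Here the application only hands over the first atom; every later step is reached by following pointers. With a real syslet that walk would happen inside the kernel, so the whole copy loop would cost one submission instead of one system call per step.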

Syslets herald, in a very primitive form, kernelspace scripting. But hey, wasn’t kernelside scripting the thing that all Linux devs dreaded? Wasn’t it Linus who joked that he would be in a mental institution by the time Linux 3 was released with a VB message pump? Yes, they’re afraid of putting untrusted programs or scripts in kernelspace, and they’ll barely acknowledge that syslets are the first step.

The problem with the current full-featured scripting languages is that they are, well, full-featured gone wrong: they’re bloated and not really secure. In kernelspace you can’t allow a script to access any memory except its own, not to mention the restrictions needed on resources, and virtual memory won’t help you there. Most scripting languages weren’t developed with these restrictions in mind. Most have evil functions in dark corners of the standard library that allow you to do really evil stuff with memory.

As far as I know, only .NET (and thus Mono) has a fairly rigorous trust framework built in. .NET is bloated and proprietary, and Mono still isn’t (and probably never will be) feature complete, while still being very bloated.

What we need is a very simple, safe language, with a compiler daemon running as a system service through which untrusted userspace programs can have scripts run in the kernel. I’m tempted to use one of the brainf*ck JITs; they’re small enough to review thoroughly :).

A kernelspace interpreter would do too, though, as a PoC.

Shifts in Architectures

Cell, the hype

Sony is working on a new kind of processor which they call Cell. A single Cell processor is said to have 1/75th of the power of the Blue Gene/L supercomputer, measured in TeraFLOPS, and that for 400 dollars.

The new Cell processor is hyped, a lot. People are shouting that the performance of the PlayStation 3, which will feature a Cell processor, will blow away that of its competitors: the Xbox and the Revolution, both of which use PowerPC processors that are claimed to be slow.

Teraflops

So what are these TeraFLOPS? Why aren’t they talking about hertz anymore?

One TeraFLOPS is 1,000,000,000,000 Floating point Operations Per Second. (See TeraFLOPS on Wikipedia.)

A normal computer has a few GigaFLOPS (one TeraFLOPS is 1000 GigaFLOPS).

The Cell processor in the PS3 will have 1.8 TeraFLOPS. It is very easy to believe that the PS3 will inherently be about 500 times faster than a normal computer. But that is false.
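The "about 500 times" comes from simply dividing the two peak figures. A minimal sketch of that arithmetic follows; the function name `flops_ratio` is made up, and the 3.6 GigaFLOPS used for a normal computer is only an assumption that stands in for "a few GigaFLOPS".

```c
/* Ratio between a peak figure in TeraFLOPS and one in GigaFLOPS. */
static double flops_ratio(double teraflops, double gigaflops)
{
    /* 1 TeraFLOPS = 1000 GigaFLOPS */
    return (teraflops * 1000.0) / gigaflops;
}
```

With those assumed numbers, `flops_ratio(1.8, 3.6)` comes out at roughly 500. That is a ratio of peak floating point throughput only, which is exactly why it says so little about overall speed.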

The performance of a computer isn’t all about how fast it can calculate with floating point numbers; it depends on a lot more than that. Simply doubling the memory performance (not specifically memory bandwidth, but the whole) would have a bigger impact than doubling the number of TeraFLOPS, even on applications that rely heavily on floating point operations.

So why have a lot of TeraFLOPS then? It is quite simple: the new platforms haven’t got separate GPUs (Graphics Processing Units); their task is now integrated into the CPU. And what was their task? Calculating, a lot, with floating point numbers.
The TeraFLOPS figure for the new platforms is the sum of the FLOPS of both the GPU and the CPU. A GPU needs to do a lot of number crunching for 3D rendering.

Your normal computer already has a lot more in its GPU than the few GigaFLOPS in its CPU, although the leap past one TeraFLOPS is certainly impressive.

Marketing

Supercomputers need to do a lot of FLOPS too; they calculate models which involve a lot of floating point numbers. So why put the Cell processor in the PS3 instead of directly in some supercomputer? Simple: marketing. Almost everyone who is superficially following the computer news is telling me over and over again that the Cell processor is the most incredible thing there is. It is hyped.

Linux on PS3

Also, the Cell processor uses a whole new architecture which still has to be tested a lot before it matures. Sony will let you install Linux on your PS3. Why? Simple: because they want the Linux community to add decent support for the Cell architecture.

New Architectures & CLI

I guess it is only a matter of time before new architectures arrive. A problem with a lot of new architectures is that there isn’t much support for them yet. The solution? CLIs like Java and .NET could provide it.

Microsoft is rumoured to have MSIL running (semi-)natively on the Xbox 360 and is making Longhorn rely more and more on .NET (the applications on top, not the kernel of course), which means a lot less architecture dependency.

I wonder what will happen, but one thing is sure: things are shifting.

Introducing Paradox

Kaja Fumei and I are currently developing a lightweight, rich, JIT-compiled language called Paradox.

The main feature of Paradox will be that it is very lightweight in memory use and startup time.

It will be great for scripting in other applications which demand high performance, for instance games. For normal scripting purposes or normal applications it will also be far more suitable than normal interpreted languages.

Basically it will feature a JIT compiler, a generational GC and a rich (modular) base library, and it will be very extensible.

It will probably not perform as well as the .NET framework, but will rather come in the range of Mono’s and Java’s performance, which is very high compared to the performance of normal scripts.

When Kaja, who is currently working on the JIT compiler and the generational GC, has finished the base, I’ll start working on the core library and help with optimizing, and I will post a preview of its capabilities.

(for those wondering how long it will take:

The Unknown – 森の中に木がない。 says:
reeally long week

)

Negative .NET myths busted

There are a lot of negative myths about .NET which people tend to use to favor traditional languages like C++ over .NET. I’ve busted the ones I read most frequently:

  • The GC is really slow
    malloc is way slower! The garbage collector of .NET can actually be faster than unmanaged code, because it knows whether a value is a reference (pointer) and can therefore move objects around in memory. The GC puts objects of about the same age (generation) close to each other in memory, and objects tend to refer to and use objects of the same generation. The processor doesn’t load a single value from memory directly but loads a whole block of adjacent bytes (a cache line) into the cache. When the objects that one object uses end up in the cache together, the program simply runs a lot faster, because working from the cache is much faster than re-caching different parts of memory over and over again. The latter is what happens with unmanaged languages, which just put objects wherever there is free space.
  • Interpreting that stupid Intermediate Language is damned slow
    .NET doesn’t interpret its IL; it compiles and optimizes the IL at runtime.
  • Compiling at runtime is very slow anyway
    (That compiling C++ is slow doesn’t mean that .NET is slow.) It actually saves a lot of time, because compiling at runtime allows great optimizations, like getting rid of unreachable code and inlining depending on the current runtime variables. Operations can also be compiled with processor-specific optimizations from one IL source. Most of the resource-intensive compiling is done at application startup; it is done while the program is running too, but that really makes it a lot faster instead of slower.
  • If I write assembly myself it will be way superior to anything .Net can generate
    .NET can’t make every possible optimization, because analysing the code would take longer than the optimization would gain. But it usually still creates very optimized code. The big problem with writing very optimized assembly yourself is that the most optimized code is very processor specific and would be very hard to port, and even worse to maintain. Wanting to add one little extra feature could force you to rewrite the whole code. (Yes, I have indeed written programs in assembly.) Languages which avoid this a bit, like C++, still require you to make a different build for every specific processor when fully optimizing. It is also nearly impossible to debug fully optimized unmanaged code, whereas .NET still provides you with at least the name of the function in which the error happened, along with the offset (try to accomplish that with C++ in release mode).
  • The runtime is soooo damned big, it sucks
    20 MB isn’t a lot. It only has to be downloaded once, and the .NET framework is in Windows Update, so everyone who updates their computer should have it installed by now. Usually there is enough room on your software installation CD to include .NET; it is more than worth those 20 MB. Also, languages like C++ require certain runtimes which aren’t that cooperative. Does ‘DLL Hell’ ring a bell?
  • The .NET library naming SUCKS
    Yeah.. its naming is different from what MFC uses. At least the naming is very consistent, which is way more important than ‘nice naming’, although when I see some of the C++ API names in use I still wonder how anyone could prefer those over the clear .NET naming.
  • The .NET library itself sucks
    Really? Like what? What can’t it do?
  • You can’t use API calls like CreateFile
    Now I can’t…
    [DllImport("kernel32.dll", SetLastError = true)]
    public static extern IntPtr CreateFile(
        string lpFileName,
        uint dwDesiredAccess,
        uint dwShareMode,
        IntPtr lpSecurityAttributes,
        uint dwCreationDisposition,
        uint dwFlagsAndAttributes,
        IntPtr hTemplateFile);
    … now I can!
  • .NET sucks because it is Microsoft
    Yeah, so what? .NET is an ECMA standard, so you are pretty free to use it, and if there is a catch then it hasn’t been exploited yet, because on Linux people are happily using Mono to run .NET stuff.