Bertha: no-nonsense blob storage

For a project I need to store blobs of data (~10MB) and access them over TCP. I don’t need any features like removing, updating, authentication, statistics or replication: it’s simply not required or already handled by some other part of the project.

I only want to be able to store a blob and receive a key for it; retrieve a blob by its key and list all keys of the blobs stored.

I couldn’t find anything that fit it. Thus I created bertha. Lets delve right into it, shall we:

To run the server:

./berthad-vfs 1234 tmp data

There’s a python client

>>> from bertha import BerthaClient
>>> c = BerthaClient('serf', 1234)
>>> list(c.list())
>>> key = c.put_str('Hello world')
>>> key
>>> c.get(key).read()
'Hello world'
>>> list(c.list())
>>> ctx = c.put()
>>> ctx.f.write("Do some")
>>> ctx.f.write("Streaming")
>>> ctx.finish()
>>> c.get('975001fb9bdc0f72a78ca6326c55af86348d4c84da7ba47b7ed785a03f6803b0').read()
'Do someStreaming'

Which also install a bertha commandline tool:

$ bertha list
$ bertha get 975001fb9bdc0f72a78ca6326c55af86348d4c84da7ba47b7ed785a03f6803b0
Do someStreaming
$ echo Hi | bertha put
$ bertha get c01a4cfa25cb895cdd0bb25181ba9c1622e93895a6de6f533a7299f70d6b0cfb tmp
$ cat tmp

The berthad code is pretty small: at the moment under a thousand lines of C with lots of comments. GETs are pretty fast: berthad uses Linux’ splice syscall, which usually makes the network card read directly from the buffer the harddisk wrote to.

Leave a Reply

Your email address will not be published.