A few months back I was looking over a lot of different object-storage systems, giving them mini-reviews, and trying them out in turn.
While many were overly complex, some were simple. Simplicity is always appealing, providing it works.
My review of camlistore was generally positive, because I like the design. Unfortunately it also highlighted a lack of documentation about how to use it to scale, replicate, and rebalance.
How hard could it be to write something similar, while taking care to keep it as simple as possible? Well, perhaps it was too easy.
First of all we write a blob-storage system. We allow three operations to be carried out:
- Retrieve a chunk of data, given an ID.
- Store the given chunk of data, with the specified ID.
- Return a list of all known IDs.
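To make those three operations concrete, here is a minimal in-memory sketch in Python. It is not the Perl implementation, just an illustration; the class name `BlobStore` and the method names `put`, `get` and `list_ids` are made up for this example, and a real blob server would sit behind a network API and write to disk.

```python
class BlobStore:
    """Hypothetical in-memory stand-in for a single blob server."""

    def __init__(self):
        # Maps an ID (string) to the chunk of data (bytes) stored under it.
        self._blobs = {}

    def put(self, blob_id, data):
        """Store the given chunk of data under the specified ID."""
        self._blobs[blob_id] = data

    def get(self, blob_id):
        """Retrieve a chunk of data by ID; raises KeyError if it is unknown."""
        return self._blobs[blob_id]

    def list_ids(self):
        """Return a sorted list of all known IDs."""
        return sorted(self._blobs)


store = BlobStore()
store.put("example-id", b"some data")
data = store.get("example-id")
known = store.list_ids()
```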
API Server
We write a second server that consumers actually use, though it is implemented in terms of the blob-storage server listed previously.
The public API is trivial:
- Upload a new file, returning the ID which it was stored under.
- Retrieve a previous upload, by ID.
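A rough sketch of how such an API layer could sit on top of the blob servers, again in Python rather than the original Perl. Several things here are assumptions made for illustration only: each blob server is modelled as a plain dict, the ID is the SHA-1 hash of the content, and the upload target is picked from that hash. The post itself does not say how IDs are generated or how a blob server is chosen.

```python
import hashlib


class ApiServer:
    """Hypothetical public API built on top of a list of blob servers."""

    def __init__(self, blob_servers):
        # Assumption: each blob server is modelled as a dict of ID -> bytes.
        self.blob_servers = blob_servers

    def upload(self, data):
        """Store data on exactly one blob server and return its ID."""
        # Assumption: content-derived ID; the real system may do otherwise.
        blob_id = hashlib.sha1(data).hexdigest()
        # Pick one server deterministically from the ID (illustrative choice).
        target = self.blob_servers[int(blob_id, 16) % len(self.blob_servers)]
        target[blob_id] = data
        return blob_id

    def download(self, blob_id):
        """Return a previous upload by asking each blob server in turn."""
        for server in self.blob_servers:
            if blob_id in server:
                return server[blob_id]
        raise KeyError(blob_id)


servers = [{}, {}, {}]
api = ApiServer(servers)
upload_id = api.upload(b"hello, world")
fetched = api.download(upload_id)
```

Note that `upload` writes to a single blob server only, so any extra copies have to come from somewhere else.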
Replication Support
The previous two services are sufficient to write an object storage system, but they don't necessarily provide replication. You could add immediate replication; an upload of a file could involve writing that data to N blob-servers, but servers rarely crash, so why not replicate in the background? You save time if you only save uploaded content to one blob-server, at the cost of a short window in which only a single copy exists.
Replication can be implemented purely in terms of the blob-servers:
- For each blob server, get the list of objects stored on it.
- For each object in that list, look for it on each of the other servers. If it is found on N of them we're good.
- If there are fewer copies than we'd like, then download the data, and upload it to another server.
- Repeat until each object is stored on a sufficient number of blob-servers.
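The steps above can be sketched in a few lines of Python. As before this is an illustration under assumptions, not the actual Perl code: each blob server is modelled as a dict of ID to data, and the function name `replicate` and its `copies` parameter (the N from the list) are invented for the example.

```python
def replicate(servers, copies):
    """Copy objects between servers until each is held by `copies` of them.

    `servers` is a list of dicts (ID -> bytes), one per blob server.
    Returns the number of copies that had to be made.
    """
    # We can never hold more copies than there are servers.
    copies = min(copies, len(servers))
    made = 0
    for source in servers:
        # Snapshot the ID list, as a real server would return it in one call.
        for blob_id in list(source):
            holders = sum(1 for s in servers if blob_id in s)
            for target in servers:
                if holders >= copies:
                    break  # enough copies already, we're good
                if blob_id not in target:
                    # "Download" from the source, "upload" to another server.
                    target[blob_id] = source[blob_id]
                    holders += 1
                    made += 1
    return made


servers = [{"a": b"first"}, {"b": b"second"}, {}]
copies_made = replicate(servers, copies=2)
```

Because it only relies on listing, fetching and storing, a job like this can run on a schedule without the API servers knowing anything about it.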
My code is reliable, and the implementation is almost painfully simple. The only difference from the design described above is that rather than having a single API-server which allows both "uploads" and "downloads", I split it into two. That means you can leave your "download" server open to the world, so that it can be useful, while your upload-server is firewalled to only allow a few hosts to access it.
The code is Perl-based, because Perl is good, and is available here on GitHub:
TODO: Rewrite the thing in #golang to be cool.