The second tutorial I attended on Tuesday, and the last one of the conference, was Hands-on Cassandra. I missed the first half of it, for reasons I explain in my Tuesday recap post.
I’ve been told by those who attended the full tutorial that the first half wasn’t really worth attending. When I arrived at the beginning of the second half, I caught the tail end of the presenter demonstrating how he recreated Twitter using Cassandra, something he dubbed Twissandra. Recreating Twitter seems to be the exercise of choice for demonstrating any distributed system. In a way, that’s smart: take a highly distributed system everyone is familiar with, explain the challenges such a system faces, then demonstrate how effectively the software in question solves them.
In any case, the second half of the tutorial was mostly dedicated to an explanation of how Cassandra distributes its data. Neither the material nor, frankly, the delivery was that interesting to me, so I didn’t follow the discussion closely. It was too high level to keep my interest.
I still think Cassandra deserves some investigation. I have a project in mind that it may be perfect for. At my day job, we have what is essentially a distributed, key-based data store, and we’ve had to implement all of the data replication functionality ourselves. If Cassandra can relieve us of the need to design and implement our own data replication and integrity systems, we can put more effort into the final delivery of the data instead of its transmission.