My second session of the day is Beautiful Concurrency with Erlang. I’m here for two reasons. First, Erlang looks cool; second, the speaker, Kevin Scaldeferri, is a friend of mine.
Erlang is a functional language with single-assignment variables, strong dynamic typing, and syntax similar to Prolog and ML. It isn't strictly pure (sending a message is a side effect), but side effects are kept to a minimum. Most notably, it contains concurrency primitives, which is what we’re here to hear about today.
Erlang's concurrency primitives include spawn, to create a process; !, to send a message to a process; and receive, to wait for a message. These are not operating-system processes, but lightweight processes managed by the Erlang runtime. It’s a lot like using fork in imperative languages, but less messy.
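To make the three primitives concrete, here is a minimal sketch of my own (not taken from Kevin's slides). The module name echo_sketch and the message shapes are made up for illustration: a process is spawned, sent a message with !, and replies to whoever asked.

```erlang
-module(echo_sketch).
-export([start/0, ping/1]).

%% Spawn a new Erlang process running loop/0 and return its pid.
start() ->
    spawn(fun loop/0).

%% Wait for a message; echo it back to the sender and keep listening.
loop() ->
    receive
        {From, Msg} ->
            From ! {self(), {echo, Msg}},
            loop();
        stop ->
            ok
    end.

%% Send a message to the echo process and wait up to a second for the reply.
ping(Pid) ->
    Pid ! {self(), hello},
    receive
        {Pid, Reply} -> Reply
    after 1000 ->
        timeout
    end.
```

Calling echo_sketch:ping(echo_sketch:start()) from the shell should come back with {echo, hello}.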
Erlang, like many functional languages, can implement quick sort in three lines of code. I was having a discussion with a friend of mine about this topic yesterday. It’s very nice, and demonstrates the power of functional languages to trivially solve an already solved set of problems, but is it any use in the real world? Maybe. While I’ve not seen any non-trivial examples, I’m reserving judgment.
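For reference, the well-known list-comprehension version looks roughly like this. It is the textbook form rather than necessarily what was on the slide, and the module name qsort_sketch is my own.

```erlang
-module(qsort_sketch).
-export([qsort/1]).

%% Pivot on the head, then sort the smaller and larger elements recursively.
qsort([]) -> [];
qsort([Pivot | Rest]) ->
    qsort([X || X <- Rest, X < Pivot]) ++ [Pivot] ++ qsort([X || X <- Rest, X >= Pivot]).
```

Not counting the module boilerplate, that really is three lines: qsort_sketch:qsort([3,1,5,2,4]) returns [1,2,3,4,5].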
The first example is a demonstration of how simple it is to parallelize the quick sort algorithm. It’s not a worthwhile example; in fact, it’s a particularly bad idea, but it serves as a reasonable illustration of how easy Erlang's concurrency features are to use. So far, it seems like changing a map call (something I love from Perl) to pmap.
The pmap function is not a built-in function (BIF), but a library function built on top of the built-in concurrency primitives. The code implementing it is actually quite simple, and should be in the slides, which will be available at the end of the conference. Conceptually, it spawns as many processes as necessary and uses them to call the function being mapped. It then gathers the results, waiting for each process to complete. It’s quite similar to code I’ve written to do scientific processing using MPI, but I’ve always thought functionally when coding.
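I don't have the slide in front of me, so the following is a sketch of the commonly seen form of pmap, not necessarily Kevin's exact code. It assumes one process per list element, and the module name pmap_sketch is made up.

```erlang
-module(pmap_sketch).
-export([pmap/2]).

pmap(F, List) ->
    Parent = self(),
    %% Spawn one process per element; each sends its result back tagged with its own pid.
    Pids = [spawn(fun() -> Parent ! {self(), F(X)} end) || X <- List],
    %% Gather results in the original order by waiting on each pid in turn.
    [receive {Pid, Result} -> Result end || Pid <- Pids].
```

Because each receive matches on a specific pid, the results come back in list order no matter which process finishes first. One caveat with a version this small: if F crashes in a worker, the caller waits forever, so real code would want links or monitors.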
After explaining concurrency, we make the jump to distributed systems. What’s everyone’s favorite distributed system? Twitter! Twitter, while not designed as such, is essentially a messaging system. Erlang does message passing very well, and almost all programs are designed using this paradigm. So Kevin took a stab at implementing a Twitter-like system in Erlang, the key ideas of which he will present to us.
The lightweight and convenient process architecture of Erlang lends itself to the problem. Every user can be represented as a process. Each process can then send and receive messages. In effect, the problem—the messaging part anyway—is now solved. But, what about scaling to multiple machines?
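As a rough illustration of the user-as-process idea, here is a sketch of my own. The module name user_sketch, the message tuples, and the follower list are all assumptions for the example and not taken from the talk.

```erlang
-module(user_sketch).
-export([start/1, follow/2, tweet/2, timeline/1]).

%% Each user is a process holding its name, its followers, and its timeline.
start(Name) ->
    spawn(fun() -> loop(Name, [], []) end).

%% Follower starts following User.
follow(User, Follower) -> User ! {follow, Follower}, ok.

%% User posts Text, which is forwarded to every follower.
tweet(User, Text) -> User ! {tweet, Text}, ok.

%% Ask a user process for the messages it has received so far.
timeline(User) ->
    User ! {timeline, self()},
    receive
        {User, Timeline} -> Timeline
    after 1000 ->
        timeout
    end.

loop(Name, Followers, Timeline) ->
    receive
        {follow, Pid} ->
            loop(Name, [Pid | Followers], Timeline);
        {tweet, Text} ->
            lists:foreach(fun(F) -> F ! {deliver, Name, Text} end, Followers),
            loop(Name, Followers, Timeline);
        {deliver, From, Text} ->
            loop(Name, Followers, [{From, Text} | Timeline]);
        {timeline, Caller} ->
            Caller ! {self(), Timeline},
            loop(Name, Followers, Timeline)
    end.
```

Delivery is asynchronous, so a timeline request made immediately after a tweet may arrive before the tweet has been forwarded.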
It turns out to be easy (but you knew it would, right?). All we need to do is pull in the global module, and we can bind our users not only to a process identifier, but to a process on a given machine as well.
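A minimal sketch of what that might look like, using global:register_name/2 and global:whereis_name/1 from the standard library. The module name global_sketch, the {user, Name} naming scheme, and the toy inbox process are my own assumptions.

```erlang
-module(global_sketch).
-export([register_user/1, send_to/2]).

%% Start an inbox process and register it under a cluster-wide name.
register_user(Name) ->
    Pid = spawn(fun inbox/0),
    yes = global:register_name({user, Name}, Pid),
    Pid.

%% Look the user up by name; the process may live on any connected node.
send_to(Name, Msg) ->
    case global:whereis_name({user, Name}) of
        undefined -> {error, no_such_user};
        Pid -> Pid ! Msg, ok
    end.

%% Toy inbox: print whatever arrives and keep waiting.
inbox() ->
    receive
        Msg ->
            io:format("got ~p~n", [Msg]),
            inbox()
    end.
```

The sending code is the same whether the pid is local or on another node, which is the point: once the name is registered globally, message passing is location-transparent.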
However, we still don’t have a reliable system. If a process dies, that user is no longer in the system. So it really is a lot like Twitter.
OTP, the Open Telecom Platform (a legacy name from Erlang’s history at Ericsson), provides a set of common behaviors and patterns for writing reliable and distributed systems. The programmer simply declares which behavior they would like to use, then implements the set of callbacks defined for that behavior. Reminds me a bit of roles (because I have an unhealthy need to relate everything back to Perl).
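To show the declare-then-implement-callbacks pattern, here is a small sketch using the gen_server behavior. The module name user_server_sketch and its state layout are my own invention, and I am assuming a reasonably recent OTP release where handle_info, terminate, and code_change are optional callbacks.

```erlang
-module(user_server_sketch).
-behaviour(gen_server).

%% Client API
-export([start_link/1, post/2, timeline/1]).
%% gen_server callbacks
-export([init/1, handle_call/3, handle_cast/2]).

start_link(Name) -> gen_server:start_link(?MODULE, Name, []).

%% Asynchronous: add a message to this user's timeline.
post(Pid, Text) -> gen_server:cast(Pid, {post, Text}).

%% Synchronous: fetch the timeline, oldest message first.
timeline(Pid) -> gen_server:call(Pid, timeline).

%% State is {Name, Messages}, with the newest message at the head.
init(Name) -> {ok, {Name, []}}.

handle_call(timeline, _From, {_Name, Msgs} = State) ->
    {reply, lists:reverse(Msgs), State}.

handle_cast({post, Text}, {Name, Msgs}) ->
    {noreply, {Name, [Text | Msgs]}}.
```

The receive loop, the reply protocol, and the timeouts are all handled by gen_server; the module only supplies the callbacks. A process started this way can also be placed under an OTP supervisor, which restarts it if it dies.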
As with everything in Erlang, it is almost impossibly easy to set up this reliability. I still can’t get over how well the syntax maps to how I actually think about code.
A question was raised about how to go about setting up the necessary cluster of hosts used in Erlang’s mesh network. Kevin went into it briefly, but it’s unfortunately out of scope for this session.
And, with that, it’s time for lunch. Thanks, Kevin!
[tags]oscon, oscon08, oscon2008, Erlang, concurrency, programming[/tags]