OSCON 2008: Beautiful Concurrency with Erlang

My second session of the day is Beautiful Concurrency with Erlang. I’m here for two reasons. First, Erlang looks cool; second, the speaker, Kevin Scaldeferri, is a friend of mine.

Erlang is a pure functional language (and thus no side-effects) with strong dynamic typing and syntax similar to Prolog and ML. Most notably, it contains concurrency primitives, which is what we’re here to hear about today.

Erlang concurrency primitives include spawn, to create a process, !, to send a message to a process, and receive, to listen for a message. These are not system level processes, but other Erlang processes. It’s a lot like using fork in imperative languages, but less messy.

Erlang, like many functional languages, can implement quick sort in three lines of code. I was having a discussion with a friend of mine about this topic yesterday. It’s very nice, and demonstrates the power of functional languages to trivially solve an already solved set of problems, but is it any use in the real world? Maybe. While I’ve not seen any non-trivial examples, I’m reserving judgment.

The first example is a demonstration on how simple it is to parallelize the quick sort algorithm. It’s not a worthwhile example, in fact, it’s a particularly bad idea, but it serves as a reasonable example of the ease of use of the concurrent features in Erlang. So far, it seems like changing a map call—something I love from Perl—to pmap.

The pmap function is not a built in function (BIF), but a library function built on top of the built in concurrency primitives. The code implementing the function is actually quite simple, and should be available in the slides available at the end of the conference. Conceptually, it spawns as many processes as necessary and uses them to call the function being mapped. It then gathers the results, waiting for each process to complete. It’s quite similar to code I’ve written to do scientific processing using MPI, but I’ve always thought functionally when coding.

After explaining concurrency, we make the jump to distributed systems. What’s everyone’s favorite distributed system? Twitter! Twitter, while not designed as such, is essentially a messaging system. Erlang does message passing very well, and almost all programs are designed using this paradigm. So Kevin took a stab at implementing a Twitter-like system in Erlang, the key ideas of which he will present to us.

The lightweight and convenient process architecture of Erlang lends itself to the problem. Every user can be represented as a process. Each process can then send and receive messages. In effect, the problem—the messaging part anyway—is now solved. But, what about scaling to multiple machines?

It turns out to easy (but you knew it would, right?). All we need to do is pull in the global module and we can bind our users not only to a process identifier, but combine that with a given machine as well.

However, we still don’t have a reliable system. If a process dies, that user is no longer in the system. So it really is a lot like Twitter.

OTP, the Open Telecom Platform (a legacy name from Erlang’s history at Ericcson), provides a set of common behaviors and patterns for writing reliable and distributed system. The programmer simply declares what interface they would like to use, then implement a set of callbacks defined for that behavior. Reminds me a bit of roles (because I have an unhealthy need to relate everything back to Perl).

As with everything in Erlang, it is almost impossibly easy to set up this reliability. I still can’t get over how well the syntax maps to how I actually think about code.

A question was raised about how to go about setting up the necessary cluster of hosts used in Erlang’s mesh network. Kevin went into it briefly, but it’s unfortunately out of scope for this session.

And, with that, it’s time for lunch. Thanks, Kevin!

[tags]oscon, oscon08, oscon2008, Erlang, concurrency, programming[/tags]

OSCON 2008: Practical Erlang Programming

After lunch and our trip to the Apple Store, I’m sitting in Portland 256 for the Practical Erlang Programming. It’s being taught by Francesco Cesarini of Erlang Training and Consulting Ltd.

Over 90 people registered for this tutorial, and the room is almost full. Save for the handful of available chairs, I’d feel guilty about auditing it instead of attending the Real Time 3D on the Web with Open Source I had originally registered for. This will be a two and a half day course compressed into three hours. Should be fun, and useful for Kevin’s session tomorrow, Beautiful Concurrency with Erlang. After seriously considering the relative merits and general usefulness of the tutorials, I decided Erlang would be much more interesting. I had made my original choice with the equivalent of a dart board, so I don’t feel too bad about changing my mind.

The tutorial started with a quick tour of Erlang’s syntax. It looks odd, but I’ve used Lisp and ML in the past, and I’m a rather good Perl hacker, so it isn’t proving too difficult to pick up. The concept of pattern matching intrigues me. It appears to use equivalency, in the mathematical sense to handle both boolean and assignment operations with the same syntax. For example,

[A,B,C] = [1,2,3]    % A is 1, B is 2, C is 3
[A,B,C] = [1,2]      % error, size mismatch
[A,B,A] = [1,2,3]    % error, A already bound to 1
[A,B,A] = [1,2,1]    % okay, A bound to 1, then equivalent to 1

Shortly into the discussion of syntax, Francesco asked that anyone who hasn’t yet installed Erlang do so. I executed yum install erlang, which pulled in unixODBC, tcl, and tk as dependencies. Well, 45 megabytes and 45 minutes later—an impressive speed of 1 MBpm—I now have Erlang installed and ready to run. Just in time for a 10 minute break.

During this first break, we were asked to do a simple exercise in Erlang: write a module, boolean.erl, that implements b_not(), b_and(), b_or(), and b_nand(), without using the built in logical operators. I’ve been able to define the structure of the module, but I don’t know how boolean values are represented in Erlang, so I may have to wait until he gives us the answer. Vim’s syntax highlighting tells me that true and false are reserved words, so I can use those.

The solution for this involves writing a simple truth table. In Erlang, functions are subject to pattern matching in the same way that many programming languages allow for function overloading. For the logical or, we start with the basic truth table:

b_or(true,true)   -> true;
b_or(true,false)  -> true;
b_or(false,true)  -> true;
b_or(false,false) -> false.

That’s downright simple and extremely easy to grasp on a conceptual level, particularly for anyone with any background in mathematics. However, and this appeals to me as Perl hacker, Erlang allows the programmer to be lazy, but in a good way. The null variable—as I’m calling it due to the analogy with /dev/null on Unix-like systems (or undef in Perl)—_, allows a kind of lazy matching:

b_or(false,false) -> false;    % the only false case with OR
b_or(_,_)         -> true.     % any other case is true

The other functions can be written in a similar way.

Back from the break, and the population of the room has thinned very slightly. Francesco immediately jumped into conditional evaluation, starting with the case clause. I suspect this may be one of the answers to the exercise. He followed that with the if clause. I find it interesting that he’s done it in that order. In most languages, the if statement is a much simpler case (no pun intended) and is covered first, before moving into more complex territory. I think I understand why, the two clauses are implemented in a very similar fashion. I’m not sure how equivalent they are, I’d have to play with them a bit.

As with any functional language, Erlang has strong support for recursion as well as a handful of built in functions (BIFs) implemented in C to accomplish things that are difficult or impossible to do directly in Erlang. After all, at a certain point, things like date and time require system calls. Also available are convenience functions to do things like convert tuples to lists or back.

At the second, official, break—taken after an official entered the room to scold Francesco for being 15 minutes late—we were presented with two more exercises. First, to write a function, sum/1, which, given a positive integer N, will return the sum of all the integers between 1 and N. As an extension, write a function, sum/2, which, given two integers N and M, return the sum of the interval between them, first ensuring N <= M. Second, write a function, create/1, which will return the list 1 through N given N as its argument. As an extension, write a function, reverse_create/1, which does the same in reverse.

As I suspected, both exercises are perfect candidates for recursion, which is quite simple to do in Erlang:

sum(N) when N > 0 ->
    N + sum(N-1);
sum(0) ->
    0.

The simpler list creation function is actually the second, and is solved similarly, but by accumulating a list instead of adding to a sum (which is, actually, also a method of accumulation):

reverse_create(0) ->
    [];
reverse_create(N) ->
    [N|reverse_create(N-1)].

The first thing I notice is, again, how mathematical Erlang is. The solution is written in exactly the same way I do it when I’m jotting down notes while thinking about how to solve the problem. To me, the syntax is quite elegant.

After going over the solutions to the exercises, we moved into concurrency. As with most languages worth using, Erlang has a spawn() BIF, used to create processes. What’s interesting about spawning processes in Erlang is that the function to do it does not take a system command. Rather, it takes another Erlang function to run. It’s quite a bit more elegant (there’s that word again) than the equivalent fork() dance done in most imperative languages.

Communication between Erlang processes is done via message passing; data is never shared. As with everything else, the method for doing so is quite elegant: Pid2 ! {self(), foo}. Okay, maybe someone has to be me to find that elegant.

The whole process concept in Erlang is quite nice and, again, elegant. It’s plain that it is the primary method by which systems in Erlang are designed. So far, though, we’ve only seen trivial examples. That’s okay, because this is only a three hour tutorial. However, as Larry Wall once said about Perl: It makes the easy things easy and the hard things possible. It’s a good litmus test for any language. It’s far too early for me to pass any judgment on Erlang. I’d like to use it in anger sometime, to see how it performs for me. Perhaps I can get my local Perl Mongers interested in chatting about it.