OSCON 2008: Beautiful Concurrency with Erlang

My second session of the day is Beautiful Concurrency with Erlang. I’m here for two reasons. First, Erlang looks cool; second, the speaker, Kevin Scaldeferri, is a friend of mine.

Erlang is a functional language (variables are single-assignment and processes share no state, though it is not strictly pure, since sending a message is a side effect) with strong dynamic typing and syntax similar to Prolog and ML. Most notably, it contains concurrency primitives, which is what we’re here to hear about today.

Erlang concurrency primitives include spawn, to create a process, !, to send a message to a process, and receive, to listen for a message. These are not system level processes, but other Erlang processes. It’s a lot like using fork in imperative languages, but less messy.

Erlang, like many functional languages, can implement quick sort in three lines of code. I was having a discussion with a friend of mine about this topic yesterday. It’s very nice, and demonstrates the power of functional languages to trivially solve an already solved set of problems, but is it any use in the real world? Maybe. While I’ve not seen any non-trivial examples, I’m reserving judgment.
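For reference, here is a minimal sketch of the kind of quick sort being described. It is the well-known list-comprehension version, not necessarily the code from Kevin’s slides, and the module name qsort_sketch is my own placeholder.

```erlang
-module(qsort_sketch).
-export([qsort/1]).

%% Partition the tail around the head (the pivot), sort each side recursively.
qsort([]) -> [];
qsort([Pivot | Rest]) ->
    qsort([X || X <- Rest, X < Pivot]) ++ [Pivot] ++ qsort([X || X <- Rest, X >= Pivot]).
```

It is short, but it copies lists freely and always picks the first element as the pivot, so it is a demonstration rather than something to use in anger.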

The first example is a demonstration of how simple it is to parallelize the quick sort algorithm. It’s not a worthwhile example; in fact, it’s a particularly bad idea, but it serves as a reasonable illustration of how easy the concurrent features in Erlang are to use. So far, it seems to amount to changing a map call (something I love from Perl) to pmap.

The pmap function is not a built in function (BIF), but a library function built on top of the built in concurrency primitives. The code implementing the function is actually quite simple, and should be available in the slides available at the end of the conference. Conceptually, it spawns as many processes as necessary and uses them to call the function being mapped. It then gathers the results, waiting for each process to complete. It’s quite similar to code I’ve written to do scientific processing using MPI, but I’ve always thought functionally when coding.
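The slides were not available to me while writing, so the following is only a sketch of how a pmap like the one described could be built from spawn, ! and receive. The module name pmap_sketch is hypothetical, and a production version would also need to handle a worker that crashes (this one would wait forever).

```erlang
-module(pmap_sketch).
-export([pmap/2]).

pmap(F, List) ->
    Parent = self(),
    %% Spawn one process per element; each sends its result back,
    %% tagged with its own pid so the parent can tell results apart.
    Pids = [spawn(fun() -> Parent ! {self(), F(X)} end) || X <- List],
    %% Gather in spawn order, so the results line up with the input list.
    [receive {Pid, Result} -> Result end || Pid <- Pids].
```

Calling pmap_sketch:pmap(fun(X) -> X * 2 end, [1,2,3]) returns [2,4,6], the same as lists:map/2 would, only with each element computed in its own process.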

After explaining concurrency, we make the jump to distributed systems. What’s everyone’s favorite distributed system? Twitter! Twitter, while not designed as such, is essentially a messaging system. Erlang does message passing very well, and almost all programs are designed using this paradigm. So Kevin took a stab at implementing a Twitter-like system in Erlang, the key ideas of which he will present to us.

The lightweight and convenient process architecture of Erlang lends itself to the problem. Every user can be represented as a process. Each process can then send and receive messages. In effect, the problem—the messaging part anyway—is now solved. But, what about scaling to multiple machines?

It turns out to be easy (but you knew it would, right?). All we need to do is pull in the global module and we can bind our users not only to a process identifier, but combine that with a given machine as well.
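To make the user-as-a-process idea concrete, here is a rough sketch under my own assumptions; it is not Kevin’s implementation. The module name user_sketch, the {user, Name} naming scheme, and the message shapes are all invented for illustration. The only library calls used are global:register_name/2 and global:send/2.

```erlang
-module(user_sketch).
-export([start/1, post/2, timeline/1]).

%% Start a process for a user and register it under a cluster-wide name.
start(Name) ->
    Pid = spawn(fun() -> loop([]) end),
    yes = global:register_name({user, Name}, Pid),
    Pid.

%% Deliver a message to the user, whichever node their process lives on.
post(Name, Text) ->
    global:send({user, Name}, {post, Text}),
    ok.

%% Ask the user process for everything it has received so far.
timeline(Name) ->
    global:send({user, Name}, {timeline, self()}),
    receive
        {timeline, Messages} -> Messages
    after 1000 -> timeout
    end.

%% The process state is simply the list of messages, newest first.
loop(Messages) ->
    receive
        {post, Text} ->
            loop([Text | Messages]);
        {timeline, From} ->
            From ! {timeline, lists:reverse(Messages)},
            loop(Messages)
    end.
```

Because the name is registered globally, callers never need to know which machine a user’s process is running on. Nothing here restarts a user process that dies.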

However, we still don’t have a reliable system. If a process dies, that user is no longer in the system. So it really is a lot like Twitter.

OTP, the Open Telecom Platform (a legacy name from Erlang’s history at Ericsson), provides a set of common behaviors and patterns for writing reliable and distributed systems. The programmer simply declares what interface they would like to use, then implements a set of callbacks defined for that behavior. Reminds me a bit of roles (because I have an unhealthy need to relate everything back to Perl).
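As an illustration of “declare a behavior, implement its callbacks,” here is a small sketch using the standard gen_server behavior. The module name user_server_sketch and the post/timeline API are my own assumptions, not code from the talk. It shows only the callback half of the story; the reliability comes from running such servers under a supervisor, which is omitted here.

```erlang
-module(user_server_sketch).
-behaviour(gen_server).

%% Public API
-export([start_link/0, post/2, timeline/1]).
%% Callbacks required by the gen_server behaviour
-export([init/1, handle_call/3, handle_cast/2]).

start_link() -> gen_server:start_link(?MODULE, [], []).
post(Pid, Text) -> gen_server:cast(Pid, {post, Text}).
timeline(Pid) -> gen_server:call(Pid, timeline).

%% The server state is the list of messages, newest first.
init([]) -> {ok, []}.

handle_call(timeline, _From, Messages) ->
    {reply, lists:reverse(Messages), Messages}.

handle_cast({post, Text}, Messages) ->
    {noreply, [Text | Messages]}.
```

The receive loop, the reply plumbing, and the timeout handling all live in the behavior; the module supplies only what is specific to this server.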

As with everything in Erlang, it is almost impossibly easy to set up this reliability. I still can’t get over how well the syntax maps to how I actually think about code.

A question was raised about how to go about setting up the necessary cluster of hosts used in Erlang’s mesh network. Kevin went into it briefly, but it’s unfortunately out of scope for this session.

And, with that, it’s time for lunch. Thanks, Kevin!

[tags]oscon, oscon08, oscon2008, Erlang, concurrency, programming[/tags]

OSCON 2008: Strawberry Perl: Achieving Win32 Platform Equality

My first session of the day is Strawberry Perl: Achieving Win32 Platform Equality, presented by Adam Kennedy. Originally, I had considered a Parrot talk, but I saw a similar talk at SCALE6x, and I happened upon Adam on IRC this morning. I chatted briefly with him about his talk, and he happens to be in communication with a friend of mine, who is working on Camelbox, a Windows build of Perl originally targeted as a way to easily distribute applications written with Gtk front ends (I hope I got the motivation correct).

Recently, Adam has been funded by The Perl Foundation, Perl in Israel, and Stonehenge to use Perl from nothing but his flash drive. This provides an excellent motivation to get Strawberry Perl working in a highly portable way.

Originally, Perl was awesome and worked everywhere—except Windows. That was okay, because Windows didn’t matter. No one did any real work on Windows. Then, around 1995, Windows started to matter. A brief history of Perl on Windows followed, resulting in what is today ActiveState.

Much of what Adam wrote for PPI does not work in ActivePerl, which makes it a non-starter for him, as he tends to work on Windows. Anything depending on Scalar::Util or List::MoreUtils modules will not work with the ActivePerl build system. This led to an embarrassing problem for Adam when he gave a talk three years ago at OSCON. He couldn’t give his demo, because PPI would not build in ActivePerl. In fact, ActiveState’s package manager has gotten so much worse that almost any module that is at all useful does not exist—and thus nothing useful can be done on Windows (big surprise).

Moving away from ActiveState, this talk is essentially about Adam trying to get his own laptop to work. That’s really all he wants. It’s a modest desire. More importantly, the CPAN module has to work. Without that, what’s the use of Perl?

So Adam offered a prize: a yard-high stack of cases of any beer desired by the first person who could provide a fully-installable and working (by the above definition of working) version of Perl for Windows. After six months and no sign of a winner, he changed the prize to “craploads” of beer. In 24 hours, he received two entries. The winner cheated a lot, but the loser was Vanilla Perl, which has become a testing ground for experimentation.

Strawberry Perl is the Perl for Windows designed for people who don’t use Windows. That is, the people who do all of their work on Unix or Unix-like systems—Linux, Solaris, and Mac OS X. The main goal of the project is to make it easy—it is Perl, after all.

In the future will come Chocolate Perl—completing the holy trinity of Neapolitan flavors—for people who know Windows, but don’t know Perl, and thus the Unix-like characteristics of Perl.

The target of Adam’s financial support is Portable Perl: Perl for flash drives. Carry it around, install CPAN modules onto, or from, the flash drive. It’s network-aware, does the right thing, and juliennes fries. An excellent standard being developed for portable apps is, in fact, PortableApps.com, where applications such as Firefox or Putty can be downloaded and installed to those ever-growing flash drives.

Available Thursday at the Perl Foundation’s booth in the expo hall will be branded flash drives with Portable Perl on them. At least, I think I heard that correctly.

I really like the work Adam is doing. He’s accomplished so much to get Perl everywhere. That’s a cause I can get behind.

“The main problem today is Vista.”
— Adam Kennedy

Okay, I took that out of context, but I couldn’t resist capturing the quote. What he really means is that changes made to Windows in Vista have made things not work, in particular the access control. It’s not an unusual problem when upgrading to new systems, but it is more difficult with proprietary platforms, which Open Source authors have very little access to.

OSCON 2008: Wednesday Morning Keynotes

Kicking off the official start of OSCON on Wednesday morning is Allison Randal welcoming us to the 10th annual O’Reilly Open Source Conference. She gave us an overview of what we could expect from this year’s conference. Mostly, it’s about open systems this year, not just open source programs. She then introduced the program co-chair and the man behind the personal schedule feature on the conference web site, Edd Dumbill. He started off by getting an idea of how long the audience had been coming to OSCON. Quite a few people have attended half a dozen or more. Impressive. Next, he pimped the OSCON photo contest on Flickr. He’s a very big proponent of the social networking aspects of OSCON: Flickr, Twitter, and IRC in particular.

Allison is back to tell us that the morning break will be sponsored by Intel, and lunch is sponsored by Google. That gives me some hope for a decent lunch, at least. Don’t let me down, Google.

Next up, Tim O’Reilly with an update on Open Source on the O’Reilly Radar. He started out with an overview of the history of this conference, in particular the predecessors: the Freeware conference, and the Perl conference.

He offers an important safety tip: keep your history. Be an e-pack-rat. Some day you’ll look back and appreciate that you have it. It’s like the photo album on the coffee table. It’s the story of us and how we became who we are today. So keep everything. Please. Even if it’s embarrassing. Those are always the best memories, the ones that make us laugh.

The big point he’s here to make today is how far Open Source has come in the last decade. But, don’t become complacent. There are three big challenges and opportunities coming up: cloud computing, the (open) programmable Web, and open mobile.

Cloud computing is on the tip of everyone’s tongue today, from Amazon Web Services to Google’s App Engine. Individuals and start-ups now have the ability to build applications on top of these wonderful, decentralized, and most importantly cheap platforms.

Web does not mean “http.” It is, in fact, the entire Internet, the “web” of systems that communicate and inter-operate. There are Web applications that provide platform-agnostic solutions, but there is also XMPP, mobile devices, and even non-Web APIs for those very Web applications that are often so impressive.

“The Web is 72 subsystems in search of an Operating System.”
— Tim O’Reilly

Data is the value-add by so many of the so-called open web companies. While the APIs are open and the data can be queried, the data itself is owned by the provider, to do with as they please. We need a truly Open Web Platform. Apple, as popular as the iPhone is, has created an essentially closed platform. Google, with Android, understands this. Without a truly open mobile platform, all of Google’s market share could potentially disappear overnight.

Back to Allison who introduced our next speaker, Christine Peterson. She takes the stage to tell us about Open Source Physical Security: Can We Have Both Privacy and Safety?

We passed up an opportunity with “e-voting.” The Open Source community should have been able to rise up and solve that problem. I’m not sure how or in what way. I’ve had many discussions with friends on the subject, and we’re still not convinced that computers are even a good idea when it comes to voting.

This is the political activism segment of the conference. That said, she brings up very real concerns. There are very real reasons to care about detecting weapons or other hazards. But, the very same technologies, in particular surveillance, that are used to defend against very real dangers can be used—abused—to monitor law-abiding citizens.

Terrorism is a “bottom-up” problem, which the state is attempting to solve with “top-down” solutions. We need so-called bottom-up solutions. The solutions that involve the very same openness, security and privacy that the Open Source community is already so concerned about and already so vocal about.

The take home message, if there is one, is that all this public sensing data and the information they gather should be open. Our elected officials (this is a very US-centric talk) are well-meaning, but do not have the tools or the knowledge or the experience to really understand the need for all of this to be open.

“No secret software for sensing public data.”
— Christine Peterson

Allison came back on stage to introduce our last, but certainly not least, speaker, Dirk Hohndel, Intel’s Chief Linux and Open Source Technologist. He’s here to talk about Moblin, Linux for Next Generation Mobile Internet. Given that I work for Qualcomm, this is, or at least should be, a very interesting topic for me (I work in support of the engineers, who do the actual work).

Intel is putting their money where their mouth is with Moblin (Mobile Linux, get it?). There is a new class of computers on the market, which have become affordable for the mass market: ultra portable notebooks, hand-held tablet computers, and “smart” phones. The driving force making these devices so successful is the Internet. They are connected and our data is accessible from anywhere.

But what about vendor lock-in of the platform and the data? Intel believes that the platform should be open. This is where Moblin comes in. It’s Intel’s idea of an open platform and an open software stack, allowing the community to develop applications and create new systems and services.

It’s excellent preaching to the choir, but I suspect that from a business perspective, it’s also a way of getting other people to do work for free and really get entrenched in the mobile market. After all, Intel is not the giant in the mobile space the same way that they are in the server, desktop, or notebook spaces. In fact, Qualcomm has a very impressive microprocessor, called Snapdragon, targeting the mobile market (shameless plug).

Allison is back, once again introducing Tim O’Reilly, who will be talking to Monty Widenius and Brian Aker about their work with MySQL and the acquisition by Sun Microsystems. This is a Q&A session, and I always find these difficult to blog. With any luck, a summary or transcript will be posted to the O’Reilly Radar site.

That brings us to the end of this morning’s keynotes. I’ll drop by the expo hall for a few minutes before my first session. But first, I really need to find a restroom.

Oh, Brad also wrote a few words about the keynote.

OSCON 2008: Day 3

It’s Wednesday, which means it’s day three of OSCON—day one for those here only for the sessions or expo hall. The tutorials and the Tuesday Night Extravaganza are behind us. Three days of sessions and two days of expo hall are ahead.

The morning keynotes begin in approximately 45 minutes. After that, I have only a vague idea of which sessions I’d like to attend. My current line up looks a little like this,

Of course, any of this is subject to change without notice.

OSCON 2008: Tuesday Night Extravaganza

It’s Tuesday evening and all of the tutorials are behind us. I’ve learned things about Perl no mere mortal should be trusted with, and I found out that Erlang is a really cool language. Now I’m in the Tuesday evening keynotes, or extravaganza, if you believe the marketing hype. They’ve started out with a real bang. Someone, whose name I didn’t catch, is talking about Python. As Allison Randal, the OSCON program chair, said, “We have three of my favorite speakers, but first,” there’s this guy. Actually, I’m sure he’s a perfectly decent chap, I just have very little interest in Python.

Originally, I hadn’t planned on arriving at the keynote until 9:00pm, when Damian Conway is scheduled to speak on Temporally Quaquaversal Virtual Nanomachine Programming In Multiple Topologically Connected Quantum-Relativistic Parallel Timespaces…Made Easy! I mean, granted, I’m sure I already know all there is to know about it, but it still might be a little interesting.

Anyway, the keynotes got started with Mark Shuttleworth, the founder of the Ubuntu project. He’s here to speak to us about “Free software and the art of software engineering.” It (whatever “it” is) boils down to three things: innovation, methodologies, and economics.

Innovation. Society has a responsibility to stimulate it. Innovation is extremely non-linear and the key to this is disclosure, as is done in (or was once done in) academia. Free Software is the scaffolding for innovation. The real successes are accessible. The Mozilla products are examples of wildly successful open platforms, with the extension architecture they have provided.

Methodologies. The purpose of methodologies is to organize talent. How is Free Software changing the direction of these methodologies? The Free Software people, that is us, are organized and motivated by interest. A second driving factor is that developers are almost never located near each other, so things like pair programming completely fall apart. Creating architecture for collaboration and participation is essential to the success of any Free Software process. While a common set of tools can never be forced upon the community, the ability for a diverse set of tools to communicate with each other is vital.

Economics. It is the combination of the technical change and innovation in economics that really moves the world forward. For example, we had the Web for years before the business models started to spring up around it and really drove us forward, both technologically and economically. Today, there is an increasing use of online services, which both drive technology forward and allow platforms to work together, and more often than not, these services are built on Free Software.

Our great task over the next two years is to lift the Linux desktop from something that is stable and works and is not-so-pretty, to something that is art. At this point, someone started clapping, and a couple of people joined in. As Jamie Zawinski once said, “We should design software that helps our users get laid.” But really, we need to make software that is phenomenally usable, beautiful, and functional.

Next up, Chris DiBona, the Open Source program manager at Google, joined Allison on stage to present the Google O’Reilly Open Source Awards.

Next up, with Exceptional Software Explained: Embrace Error is Robert “r0ml” Lefkowitz. He is fast becoming one of my favorite speakers. He’s here to talk about software development methodologies in Open Source. This talk is almost a sequel to one he gave last year, An Open Source Lexicon. He has a real penchant for language, particularly classical language, and how to apply it to themes in the Open Source community. Unfortunately, because of this very quality, it’s extremely difficult to write about it as he speaks. It’s hard to summarize as he speaks, and he’s far too entertaining to chance missing what he’ll say next.

Josh McAdams then took the stage to continue the long standing tradition—10 years now—of the White Camel Awards. So here’s something I don’t understand. What is it that drives people to design award trophies that have a high potential for lethality? Honestly, don’t run with them. They’re worse than scissors.

Finally, it’s time for Damian’s keynote. But you know what? I’m not going to miss any of it to write about it here. If you missed it, well, you should have been here.

OSCON 2008: Practical Erlang Programming

After lunch and our trip to the Apple Store, I’m sitting in Portland 256 for the Practical Erlang Programming tutorial. It’s being taught by Francesco Cesarini of Erlang Training and Consulting Ltd.

Over 90 people registered for this tutorial, and the room is almost full. Save for the handful of available chairs, I’d feel guilty about auditing it instead of attending the Real Time 3D on the Web with Open Source I had originally registered for. This will be a two and a half day course compressed into three hours. Should be fun, and useful for Kevin’s session tomorrow, Beautiful Concurrency with Erlang. After seriously considering the relative merits and general usefulness of the tutorials, I decided Erlang would be much more interesting. I had made my original choice with the equivalent of a dart board, so I don’t feel too bad about changing my mind.

The tutorial started with a quick tour of Erlang’s syntax. It looks odd, but I’ve used Lisp and ML in the past, and I’m a rather good Perl hacker, so it isn’t proving too difficult to pick up. The concept of pattern matching intrigues me. It appears to use equivalency, in the mathematical sense, to handle both boolean and assignment operations with the same syntax. For example,

[A,B,C] = [1,2,3]    % A is 1, B is 2, C is 3
[A,B,C] = [1,2]      % error, size mismatch
[A,B,A] = [1,2,3]    % error, A already bound to 1
[A,B,A] = [1,2,1]    % okay, A bound to 1, then equivalent to 1

Shortly into the discussion of syntax, Francesco asked that anyone who hasn’t yet installed Erlang do so. I executed yum install erlang, which pulled in unixODBC, tcl, and tk as dependencies. Well, 45 megabytes and 45 minutes later—an impressive speed of 1 MBpm—I now have Erlang installed and ready to run. Just in time for a 10 minute break.

During this first break, we were asked to do a simple exercise in Erlang: write a module, boolean.erl, that implements b_not(), b_and(), b_or(), and b_nand(), without using the built in logical operators. I’ve been able to define the structure of the module, but I don’t know how boolean values are represented in Erlang, so I may have to wait until he gives us the answer. Vim’s syntax highlighting tells me that true and false are reserved words, so I can use those.

The solution for this involves writing a simple truth table. In Erlang, functions are subject to pattern matching in the same way that many programming languages allow for function overloading. For the logical or, we start with the basic truth table:

b_or(true,true)   -> true;
b_or(true,false)  -> true;
b_or(false,true)  -> true;
b_or(false,false) -> false.

That’s downright simple and extremely easy to grasp on a conceptual level, particularly for anyone with any background in mathematics. However, and this appeals to me as a Perl hacker, Erlang allows the programmer to be lazy, but in a good way. The null variable, _ (as I’m calling it, by analogy with /dev/null on Unix-like systems or undef in Perl), allows a kind of lazy matching:

b_or(false,false) -> false;    % the only false case with OR
b_or(_,_)         -> true.     % any other case is true

The other functions can be written in a similar way.
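For completeness, here is one way the whole boolean.erl exercise might look, following the same truth-table approach. This is my own sketch rather than Francesco’s reference answer.

```erlang
-module(boolean).
-export([b_not/1, b_and/2, b_or/2, b_nand/2]).

b_not(true)  -> false;
b_not(false) -> true.

b_and(true, true) -> true;     % the only true case with AND
b_and(_, _)       -> false.

b_or(false, false) -> false;   % the only false case with OR
b_or(_, _)         -> true.

%% NAND is just NOT applied to AND, so reuse the two functions above.
b_nand(A, B) -> b_not(b_and(A, B)).
```

One caveat with the lazy version: _ matches anything, so b_or(banana, 42) happily returns true. The full truth table is stricter, because a non-boolean argument fails to match any clause.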

Back from the break, and the population of the room has thinned very slightly. Francesco immediately jumped into conditional evaluation, starting with the case clause. I suspect this may be one of the answers to the exercise. He followed that with the if clause. I find it interesting that he’s done it in that order. In most languages, the if statement is a much simpler case (no pun intended) and is covered first, before moving into more complex territory. I think I understand why: the two clauses are implemented in a very similar fashion. I’m not sure how equivalent they are; I’d have to play with them a bit.
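A small sketch of the two constructs side by side, written after the fact; the module and function names are mine. As far as I can tell, the main difference is that case matches a value against patterns, while if only evaluates guard expressions and has no value to match against.

```erlang
-module(cond_sketch).
-export([b_and/2, sign/1]).

%% case: match a value (here a tuple of both arguments) against patterns.
b_and(A, B) ->
    case {A, B} of
        {true, true} -> true;
        _            -> false
    end.

%% if: each branch is a guard; the atom true acts as the catch-all branch.
sign(N) ->
    if
        N > 0 -> positive;
        N < 0 -> negative;
        true  -> zero
    end.
```

If no branch of an if succeeds, Erlang raises an error rather than falling through, which is why the final true branch is there.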

As with any functional language, Erlang has strong support for recursion as well as a handful of built in functions (BIFs) implemented in C to accomplish things that are difficult or impossible to do directly in Erlang. After all, at a certain point, things like date and time require system calls. Also available are convenience functions to do things like convert tuples to lists or back.

At the second, official, break (taken after an official entered the room to scold Francesco for being 15 minutes late), we were presented with two more exercises. First, to write a function, sum/1, which, given a positive integer N, will return the sum of all the integers between 1 and N. As an extension, write a function, sum/2, which, given two integers N and M, returns the sum of the interval between them, first ensuring N <= M. Second, write a function, create/1, which will return the list 1 through N given N as its argument. As an extension, write a function, reverse_create/1, which does the same in reverse.

As I suspected, both exercises are perfect candidates for recursion, which is quite simple to do in Erlang:

sum(N) when N > 0 ->
    N + sum(N-1);
sum(0) ->
    0.

The simpler list creation function is actually the second, and is solved similarly, but by accumulating a list instead of adding to a sum (which is, actually, also a method of accumulation):

reverse_create(0) ->
    [];
reverse_create(N) ->
    [N|reverse_create(N-1)].

The first thing I notice is, again, how mathematical Erlang is. The solution is written in exactly the same way I do it when I’m jotting down notes while thinking about how to solve the problem. To me, the syntax is quite elegant.
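The two remaining functions from the exercise, sum/2 and create/1, can be sketched in the same style. These are my own attempts, not the solutions shown in class, and the module name recursion_sketch is a placeholder. Note that Erlang spells less-than-or-equal as =<, not <=.

```erlang
-module(recursion_sketch).
-export([sum/2, create/1]).

%% Sum of the integers from N to M inclusive. Calling it with N > M
%% matches no clause, so it fails instead of recursing forever.
sum(N, N) -> N;
sum(N, M) when N < M -> N + sum(N + 1, M).

%% Build [1, ..., N] by counting down and prepending to an accumulator,
%% so the list comes out in ascending order without a final reverse.
create(N) when N >= 0 -> create(N, []).

create(0, Acc) -> Acc;
create(N, Acc) -> create(N - 1, [N | Acc]).
```

For example, recursion_sketch:sum(1, 3) gives 6 and recursion_sketch:create(3) gives [1,2,3].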

After going over the solutions to the exercises, we moved into concurrency. As with most languages worth using, Erlang has a spawn() BIF, used to create processes. What’s interesting about spawning processes in Erlang is that the function to do it does not take a system command. Rather, it takes another Erlang function to run. It’s quite a bit more elegant (there’s that word again) than the equivalent fork() dance done in most imperative languages.

Communication between Erlang processes is done via message passing; data is never shared. As with everything else, the method for doing so is quite elegant: Pid2 ! {self(), foo}. Okay, maybe someone has to be me to find that elegant.
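Putting spawn, ! and receive together, a minimal echo process might look like the sketch below. The module name echo_sketch and the {echo, Msg} reply shape are my own inventions for illustration.

```erlang
-module(echo_sketch).
-export([start/0, ping/1]).

%% spawn takes an Erlang function to run, not a system command.
start() -> spawn(fun loop/0).

%% Send our own pid along with the message so the other side can reply.
ping(Pid) ->
    Pid ! {self(), foo},
    receive
        {Pid, Reply} -> Reply
    after 1000 -> timeout
    end.

%% Wait for a message, reply to its sender, then wait for the next one.
loop() ->
    receive
        {From, Msg} ->
            From ! {self(), {echo, Msg}},
            loop()
    end.
```

The two processes never share any data: everything the echo process knows about its caller arrives inside the message.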

The whole process concept in Erlang is quite nice and, again, elegant. It’s plain that it is the primary method by which systems in Erlang are designed. So far, though, we’ve only seen trivial examples. That’s okay, because this is only a three hour tutorial. However, as Larry Wall once said about Perl: It makes the easy things easy and the hard things possible. It’s a good litmus test for any language. It’s far too early for me to pass any judgment on Erlang. I’d like to use it in anger sometime, to see how it performs for me. Perhaps I can get my local Perl Mongers interested in chatting about it.

OSCON 2008: Perl Worst Practices

I’m sitting in Portland 252 for my first tutorial of the day, Perl Worst Practices with Damian Conway. He’s started off by complimenting us on our intelligence and our ability to convince our bosses or significant others that paying for a worst practices course was a good idea.

Most of us are, of course, aware of the concept of best practice when coding. Writing code that’s maintainable, predictable, and follows the rules. Oh, and uses Java.

Worst practice is, by contrast, code that is obfuscated, unmaintainable, and breaks all of the rules. Today, we will be studying code that Damian has submitted to the Obfuscated Perl contest. This promises to be very, very scary.

Damian’s entry to this contest was SelfGOL, a program capable of self-replication, rewriting other Perl programs to themselves self-replicate, detecting un-rewritable programs, playing Conway’s “Game of Life,” and, as if that wasn’t enough, animating any text as a cycling marquee banner. The main constraint of the contest is that the entry must be under 1,000 bytes of code, so it shouldn’t be too difficult to understand. Obviously it doesn’t use any modules, because that would be too easy. Not only that, but it doesn’t use a single control structure. This is going to be great.

Following an amusing demonstration of SelfGOL, we moved into treating it as a case study for a set of principles. Principles that will focus on the very practices SelfGOL embodies, and why they should never, ever be used. As I intend to enjoy the discussion, I won’t spend much time writing about the discussion and examples accompanying these principles, but rather simply note the principles for my own benefit (documentation for the win). After all, sharing all my new tips and tricks would suck all the fun out of it.

Principle 1: Sane and consistent layout makes code more maintainable (but it isn’t a magic bullet if the code itself is beyond help).

Principle 2: Using built-in features isn’t necessarily smarter or cleaner (even though fellow developers’ futile struggles to recall those features can be highly amusing).

Principle 3: Obscure obsolete features are obscure and obsolete for a reason (and retasking them for even more obscure purposes is not helping).

Principle 4: Each statement should do one thing only (since that’s the upper limit most brains can comprehend).

Principle 5: Relying on default behavior makes code very slightly easier to write and vastly harder to read (because most readers can see better than they can think).

Principle 6: Randomly placed subroutine definitions are static (in the radio interference sense).

Principle 7: Choose data structures that simplify your task (even if the task is to make those data structures incomprehensible).

Principle 8: Just because you use some operation frequently doesn’t mean it should be in a utility function (especially if it’s in a function merely to abbreviate its name).

Principle 9: Encapsulating the familiar can decrease maintainability (refactoring isn’t a substitute for sanity).

Principle 10: Treat any clever one-line solution as an alarm bell (or as an antipersonnel mine with a six-month delay fuse).

Principle 11: Familiarity breeds comprehension (it breeds contempt (but hey, what doesn’t?)).

Principle 12: Table-driven solutions are clean, efficient, and extensible (as long as you don’t mind losing a little comprehensibility).

Principle 13: Building a messy data structure and then cleaning it up is often easier than building it cleanly in the first place (and to hell with the purists).

Principle 14: Some code is better compiled at run-time (but the urge to use eval is Nature’s way of letting you know there’s not yet enough pain or misery in your life).

Principle 15: Parentheses are our friends (cos, if you can remember all 24 levels of Perl’s precedence, you gotta get a life, dude!).

Principle 16: Edge cases suck (and edge cases of familiar constructs suck worst of all).

Principle 17: Code should do what it seems to be doing (especially when it seems to be doing something subtle).

Principle 18: Conceptual elegance is no guarantee of actual maintainability (nor a good substitute for it).

Principle 19: If you’re going to have default values, define them near the place they may actually be used (or, at least, somewhere they have a slim chance of being discovered).

Principle 20: No matter how good you think your error messages are, they’re still too brief, too obscure, and too hard to decipher (even if you’ve already taken Principle 20 into account).

Principle 21: Avoid using obsolete and arcane magic punctuation variables with unfamiliar default values and unexpected global effects (even if you happen to enjoy a little self-inflicted pain in other recreational activities).

Principle 22: The fundamental complexity of any problem is irreducible (optimizations merely redistribute the pain differently).

Principle 23: Code that breaks when it’s reformatted is already broken (though on a much more profound and interesting level).

Principle 24: If it’s impossible to understand, it’ll be impossible to maintain (on the bright side, of course, such code is highly stable).

This last one should, but often doesn’t, go without saying.

Principle 25: Phenomimetic retrodeterministic nominativism generally does not improve code comprehension (then again, did it sound like it would?).

Principle 26: Don’t allow dynamic behavior to violate static expectations (and the easiest way to do that is reusing over-scoped variables for unrelated purposes).

Principle 27: Explicit behaviors are better than implicit behaviors (especially when the specification of the implicit behavior is syntactically baroque and hard-to-spot, and the behavior itself is unknown to the majority of developers).

At this late point of the tutorial, Brad pointed out to me that all of these principles are in the included materials. Now that I’ve already transcribed so much from the slides, I don’t have the heart to delete it all. Of course, since I haven’t been commenting on all of the black magic to this point, there would then be very little in the end to post. Brad also has a much better post about this tutorial, since he actually took real notes.

Principle 28: Code that pre-caches or precomputes its data is much easier to maintain than code that caches or computes on-the-fly (when you’re running at multiple gigahertz, acquiring your data a few thousand operations early is still plenty JIT enough).

Principle 29: Coding is an art, but code shouldn’t be art (evolution made programmers boring, pedestrian, and aesthetically challenged for good reasons).

It’s mesmerizing to listen to the thought process behind Damian’s obfuscated code. I can’t help but wonder if this well-organized, well-thought-out explanation is anything close to how Damian designed this program. Or, rather, if there are extremely convoluted, scary, and most importantly, evil gears grinding away inside his head. In fact, I suspect this entire tutorial may have been designed purely as a way of documenting SelfGOL so Damian himself can remember how it works. Clever.

This kind of programming is silly and fun, but it serves a real purpose. Pushing the limits of a language teaches about its dark places. The understanding that comes from it vastly improves the skills of the programmer, even if—especially if—the bad things are never, ever used. Perl, even more than other languages, encourages this kind of play, thanks to its rich diversity and culture.

Important safety tip: keep these tricks and contrivances for recreational purposes only.

I don’t know what’s more disturbing, how much of the tutorial I understood, or how much I already knew coming in.

[tags]oscon, oscon08, Perl, Damian Conway[/tags]

OSCON 2008: Day 2

It’s Tuesday morning in Portland and, after last night’s festivities, I’m glad there is fruit and coffee available for breakfast at the Oregon Convention Center. The coffee is Starbucks and the fruit isn’t ripe, but it’s a welcome sustenance this morning. With approximately an hour before the morning tutorials, people are slowly beginning to filter into the expo hall in search of food.

I have a fun day lined up. This morning I will attend Perl Worst Practices in Portland 252. I’m looking forward to this tutorial, particularly because it’s being taught by Damian Conway. I—as well as my boss, I’m sure—am excited about the prospect of putting these practices to work when I return to my job next week.

After the lunch break, which will probably be spent across the river again, I am signed up for Real Time 3D on the Web with Open Source in E143/144, being taught by Matthew Edwards. A week prior to the conference, I received an e-mail instructing me to download a set of programs, including Blender and Inkscape. This is well out of the ordinary for me, so I’m not sure what to expect. I hope it will be fun, but if not, I may duck out and into Practical Erlang Programming in Portland 256, which Al is attending.

A half hour now until my first tutorial. Time enough for more coffee.

Monday Night Entertainment

After the tutorials on Monday, talk on the #oscon IRC channel turned to dinner. Brad, Al, and I decided we should go in search of beer, regardless of what people wanted to do for dinner. After dropping our conference crap off in our respective hotel rooms, we met up at the conference center MAX station. Joining our party was Jonathan, from my San Diego Perl Mongers group, and Alice, Brad’s wife.

We started the night at Kells Irish Restaurant and Pub on the other side of the Willamette. The hostess there was extremely attractive, even if some in our party made note of how young she appeared. As it’s rude to ask a woman her age, I refrained from doing so. After a few beers and sweet potato fries, we needed to find food. So we decided on Italian, and Mama Mia Trattoria fit the bill. Near the end of dinner, I received a text message from Dan. He and his fellow Tierranet attendees were at Paddy’s Bar and Grill. So we made our way over there for a few more pints.

We called it a night before the MAX stopped running, and made our way back to our respective hotels. Dan and I happen to both be staying at the Marriott and, as we passed by the bar, we saw his coworkers. Not only that, but the barmaid, at that very moment, announced last call. Not wanting to pass up such a coincidence, Dan and I sat down for another pint.

Not satisfied with the early hour, Dan and I decided to walk down to American Cowgirls, a bar across the street from the Oregon Convention Center. Unfortunately, the bar is closed on Sunday and Monday, so we ended up calling it a night and heading back to our rooms.

Ah, but it’s only Monday night, and OSCON runs through Friday. It will be a good week.

OSCON 2008: Perl Security

After lunch, I wandered over to Portland 255 with Brad and Al for the Perl Security tutorial, presented by Paul Fenwick. Straight away I can tell that he’s going to be a lively and entertaining presenter. His slides go by quickly, as they are merely short counterpoints to his commentary, which is also very quick and slightly witty. I don’t expect to have any trouble paying attention. If anything, I’m worried that I’ll fail to pay attention to my writing and, of course, to the #oscon IRC channel.

“A computer is secure if you can depend on it and its software to behave as you expect.”
—Simson Garfinkel and Gene Spafford in Practical UNIX & Internet Security

In a nutshell, that’s what security is. If a computer behaves as expected, it is secure. That is, unless it’s expected to be insecure, I suppose. I’m not sure I’d enjoy that situation, so I’ll assume the assumption of expected behavior is both expected and secure.

Most security boils down to common sense. Unfortunately, this mythical state of being is far less common than its name would imply. Sad, but true. People are often lazy or distracted, and these usually lead to really stupid mistakes.

There is a key acronym when thinking about security: CIA. No, not that CIA. Yes, I thought so, too, at first. What it really means is Confidentiality, Integrity, and Availability. Confidentiality, because information will not remain secure if it does not remain confidential. Integrity, because information must remain known and trusted to remain secure. Availability, because denial of access to information may result in insecurity. I may not have done justice to this acronym, because the tutorial moved on quickly after this point. I’m sure there are web sites dedicated to security that can better define the term.

Perhaps the most important piece of advice for the unwitting Perl programmer is to always perform data validation. Never, ever trust input, regardless of where it came from. Fortunately, Perl provides Taint Mode, which forces the program to mistrust input.

Paul shared with us a variety of examples to demonstrate why input should not be trusted, as well as a number of examples of how to properly untaint data. As with anything, it’s easy to become lazy when untainting data, which can sometimes be as bad as not using Taint Mode at all.
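To make the untainting idea concrete, here is a minimal sketch of my own, not one of Paul’s examples. It assumes a hypothetical username field that may contain only letters, digits, and underscores; the sub name and the 32-character limit are invented for illustration. Under Taint Mode, data extracted through a regex capture is considered clean, so the pattern has to be strict.

```perl
use strict;
use warnings;

# Hypothetical validator: returns the captured (and therefore untainted,
# when running under -T) username, or undef if the input is unacceptable.
sub untaint_username {
    my ($input) = @_;
    return unless defined $input;

    # Whitelist what is allowed instead of blacklisting what is not.
    # \A and \z anchor the whole string, so a trailing newline is rejected.
    if ($input =~ /\A([A-Za-z0-9_]{1,32})\z/) {
        return $1;
    }
    return;
}

for my $candidate ('alice', 'bob; rm -rf /', "carol\n") {
    my $clean = untaint_username($candidate);
    print defined $clean ? "accepted: $clean\n" : "rejected\n";
}
```

The lazy alternative is a pattern like /(.*)/, which untaints everything and validates nothing. That is the trap described above.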

The tutorial continued into what is essentially a list of best practices to follow when programming securely with files.

  • Do: Use the three argument version of open(), to prevent attacks using file names with magic characters in them.
  • Do: Use sysopen() instead of open() when creating files. Its O_EXCL flag refuses to overwrite an existing file, which helps prevent the symlink attacks that exploit race conditions.
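The two practices above can be sketched as follows. This is my own illustration rather than code from the slides; the file name report.txt and the create_exclusively helper are made up, and the file lives in a scratch directory so the example cleans up after itself.

```perl
use strict;
use warnings;
use Fcntl qw(O_WRONLY O_CREAT O_EXCL);
use File::Temp qw(tempdir);

my $dir  = tempdir(CLEANUP => 1);
my $path = "$dir/report.txt";

# Create the file only if it does not already exist. O_EXCL makes the
# existence check and the creation a single atomic step, so an attacker
# cannot slip a symlink in between them.
sub create_exclusively {
    my ($file) = @_;
    sysopen(my $fh, $file, O_WRONLY | O_CREAT | O_EXCL, 0600)
        or return;
    return $fh;
}

my $out = create_exclusively($path) or die "Cannot create $path: $!";
print {$out} "first line\n";
close $out or die "Cannot close $path: $!";

# A second attempt fails instead of clobbering the existing file.
print "second create refused\n" unless create_exclusively($path);

# Three-argument open: the mode is separate from the name, so characters
# such as '>' or '|' in the name are never interpreted as instructions.
open(my $in, '<', $path) or die "Cannot read $path: $!";
my $line = <$in>;
close $in;
print $line;
```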

The common attack vector in so many of the examples given so far has been via file names. Wouldn’t it be great if we could write programs without file names at all? Well, when working in a Unix-like environment, we can. Perl has the ability to use anonymous files by passing undef as the third argument to open(). Paul was even kind enough to provide us with a way of passing these anonymous file handles to child processes, by disabling the close-on-exec flag prior to calling system(). Sorry, the slide went by too quickly for me to transcribe the method. It, along with all the other examples, is available online.
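Since I missed the slide, what follows is not Paul’s code. It is a minimal sketch of one common way to do both things with the core Fcntl module: create an anonymous file, then clear its close-on-exec flag so that a child process could inherit the descriptor. The variable names are my own.

```perl
use strict;
use warnings;
use Fcntl qw(F_GETFD F_SETFD FD_CLOEXEC);

# undef as the third argument asks Perl for an anonymous temporary file:
# it is opened read/write and has no name anyone else can attack.
open(my $tmp, '+>', undef) or die "Cannot create anonymous file: $!";

print {$tmp} "scratch data\n";
seek($tmp, 0, 0) or die "Cannot rewind: $!";
my $contents = <$tmp>;
print $contents;

# Perl normally marks such handles close-on-exec, so a child started
# with system() or exec() would not inherit them. Clearing the flag
# keeps the descriptor open across exec.
my $flags = fcntl($tmp, F_GETFD, 0);
defined $flags or die "Cannot read descriptor flags: $!";
fcntl($tmp, F_SETFD, ($flags + 0) & ~FD_CLOEXEC)
    or die "Cannot clear close-on-exec: $!";

my $fd = fileno($tmp);
print "descriptor $fd would now survive an exec\n";
```

A child would then need to be told the descriptor number, for example on its command line, in order to use the file.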

Calling system() and using backticks make Paul really, really angry. Why? Because doing it right is hard. In fact, just correctly checking the result in $? requires 10 lines of code, according to the documentation for system() in the perlfunc manual page. So, 10 lines just to verify that a single line of code executed successfully.
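For reference, the check in question looks roughly like this. It is a sketch that follows the pattern in the perlfunc entry for system(); the describe_status wrapper is my own, added so the three cases are easy to see.

```perl
use strict;
use warnings;

# Interpret the pieces packed into $? after system() returns, following
# the pattern shown in the perlfunc entry for system().
sub describe_status {
    my ($status, $os_error) = @_;
    if ($status == -1) {
        return "failed to execute: $os_error";
    }
    elsif ($status & 127) {
        my $signal = $status & 127;
        my $core   = ($status & 128) ? 'with' : 'without';
        return "died with signal $signal, $core coredump";
    }
    return 'exited with value ' . ($status >> 8);
}

# $^X is the running perl, so this needs no external program.
system($^X, '-e', 'exit 3');
print describe_status($?, "$!"), "\n";
```

The low seven bits hold the signal number, the eighth bit flags a core dump, and the real exit value sits in the high byte. It is easy to see why so much code skips the check or gets it wrong.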

I briefly became distracted by news of a fire back home. However, what I was able to get is that Paul has written a module, IPC::System::Simple, which, as the name implies, makes the process of calling system commands quite simple.

After the mid-afternoon break, we ventured into setuid and setgid programs. Perl provides ways to determine who is really running the program ($<, $() and who is effectively running the program ($>, $)). Perl is, however, ignorant of the saved UID, which is the third UID in Unix, along with real and effective. Unfortunately, the standard for setuid scripts is confusing and implemented differently on various systems, so don’t use it. Really.
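A small sketch of reading those four variables, again my own rather than from the slides. The running_with_elevated_ids helper is hypothetical; it simply compares the real and effective IDs.

```perl
use strict;
use warnings;

# $< and $> are the real and effective UIDs. $( and $) hold
# space-separated lists whose first entry is the real or effective GID.
my ($real_uid, $effective_uid) = ($<, $>);
my ($real_gid)      = split ' ', $(;
my ($effective_gid) = split ' ', $);

# Hypothetical helper: true when the process runs with borrowed
# privileges, i.e. when the real and effective IDs differ.
sub running_with_elevated_ids {
    my ($ruid, $euid, $rgid, $egid) = @_;
    return ($ruid != $euid || $rgid != $egid) ? 1 : 0;
}

printf "real uid=%d effective uid=%d real gid=%d effective gid=%d\n",
    $real_uid, $effective_uid, $real_gid, $effective_gid;
print "setuid or setgid in effect\n"
    if running_with_elevated_ids($real_uid, $effective_uid,
                                 $real_gid, $effective_gid);
```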

Even worse, the $< and $> variables are cached by Perl, so they may lie to the program, especially when using the setresuid() system call to properly drop privileges, as recommended. Fortunately, another useful module from Paul, Proc::UID, provides a solution to this caching problem.

Now we move into DBI security. SQL injection attacks are very similar to the file name or shell attacks covered previously. Any database programmer worth his salt should be aware of the hazards of composing SQL, so I won’t go into the examples here. Programmers should, of course, use placeholders if they’re available. The DBI module itself provides its own Taint Mode, both for input and output, adding all the benefits of Perl Taint Mode to database interface code. Even better, it can be controlled on a per-statement basis.

All of this careful taint checking we’ve done and Perl may end up sabotaging us anyway. When presented with files on the command line, Perl is happy to just open them using the simplistic, dangerous, single argument open() call. Typically, this is done when using the <> operator in a while loop. Also, everyone forgets to use Taint Mode in cron jobs. Don’t do that. Really.
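One way around the <> problem is to loop over the file names explicitly. This is a minimal sketch under my own naming (read_files_safely is hypothetical), not something shown in the tutorial.

```perl
use strict;
use warnings;

# Read every named file with an explicit three-argument open, so that
# a name such as "echo pwned |" is looked up as a literal file name
# instead of being run as a command, which is what <> would do.
sub read_files_safely {
    my (@names) = @_;
    my @lines;
    for my $name (@names) {
        my $fh;
        unless (open($fh, '<', $name)) {
            warn "Skipping '$name': $!\n";
            next;
        }
        push @lines, <$fh>;
        close $fh;
    }
    return @lines;
}

my @lines = read_files_safely(@ARGV);
print @lines;
```

Perl 5.22 and later also offer the <<>> operator, which behaves like <> but opens each name with the three-argument form.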

Because Perl is written in C, the null byte becomes very interesting. While it is a perfectly valid character in Perl strings, it marks the end of a C string. In most circumstances, this is not a problem. However, it can mean bad things when making system calls, which are written in C. Null bytes normally don’t occur in input typed at a terminal, but input from the Web is another matter. There, a null byte can be trivially represented by the %00 escape sequence.
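Here is a small sketch of the mismatch, using a made-up request value. Both subs are my own illustration; url_decode stands in for whatever decoding a web framework performs before the program sees the input.

```perl
use strict;
use warnings;

# Decode %XX escapes the way a web framework would before handing
# the value to the program.
sub url_decode {
    my ($text) = @_;
    $text =~ s/%([0-9A-Fa-f]{2})/chr(hex($1))/ge;
    return $text;
}

# Hypothetical check: refuse any value containing a null byte.
sub contains_null_byte {
    my ($text) = @_;
    return index($text, "\0") >= 0 ? 1 : 0;
}

my $requested = url_decode('secret.pl%00.html');

# Perl sees the whole string, so a naive extension test passes...
print "looks like HTML to Perl\n" if $requested =~ /\.html\z/;

# ...but a C function stops at the first null byte.
my ($seen_by_c) = split /\0/, $requested;
print "C would see: $seen_by_c\n";

print "rejected: null byte in input\n" if contains_null_byte($requested);
```

Rejecting null bytes outright during validation is simpler than trying to reason about what each system call will make of them.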

I need to go through the list of Paul’s modules, since they appear to be ideal for the type of programming I tend to do, as an IT developer. In fact, he’d like to see some Solaris patches for Proc::UID, so I can probably help him with that.

I noticed during the tutorial that Paul must read the Fail Blog and I Can Has Cheezburger, or at least knows someone who does. Quite a few of the images that have appeared on his slides have graced the pages of those web sites.

As an added bonus, the tutorial ended 40 minutes early, and Paul had bonus material. What a guy.

The tutorial, and with it the day, is now over. It’s time for dinner, then maybe a BOF session or maybe just a trip to a pub.

[tags]oscon, oscon08, perl, security[/tags]