Tuesday, August 9, 2011

Erlang's biggest missing feature: globals


There's one problem with Erlang I didn't touch upon in my previous rant, whose symptom can be found in some Erlang code called mochiglobal.erl. To understand what mochiglobal does we can read the description found in the source code's header:

Abuse module constant pools as a "read-only shared heap"

Erlang has a global centralized "code server" which maps module names to the corresponding BEAM bytecode containing all of the functions in a given module (including a limited number of previous versions, so it has some MVCC properties). The code server stores the state which represents the functions of a given module in a separate shared heap. The code server has pretty decent semantics for changing its contents, which all focus on the specific case of managing the actual Erlang code behind a particular application and allowing hot deploys.

What mochiglobal does is flip the old Lisp aphorism of "code is data" on its head. As the Erlang "code server" is exclusively focused on managing Erlang modules, if you want to use it to store a global value, you must convert it into Erlang code, compile it, and put it into the code server.  This is exactly what mochiglobal does.

It's not as if mochiglobal is some arcane hack. It's used by some prominent Erlang applications, including Riak. Global shared state, something the Erlang VM has specifically been designed to exclude, is necessary in some real-world problems that are otherwise highly conducive to the Erlang VM. How many prominent Erlang applications have "mochiglobal" or some other "mochi*" code in them to provide features which really should be included in OTP?

Erlang could've done better here by providing some mechanism for managing global shared state. The Erlang code server is just one specific case where global shared state is needed, and that's the only one the language designers tackled.

A common Erlang reaction to the suggestion that the Erlang VM should support shared state is "how do you tolerate faults?" I don't think tolerating faults is particularly hard. We can look at one model for how to tolerate faults in this case: the Erlang code server itself. If you crash the Erlang code server, it brings down the entire VM. This probably isn't the best solution, but it's one solution. Let's do better.

Another solution would be to treat the global state as a sort of database, or perhaps more specifically a key-value store. This store can contain the code which implements a particular module in addition to shared global values. Programs could modify its contents in atomic sections, which are committed when they complete or aborted and rolled back if an exception is raised. I don't know about you, but it's beginning to sound a lot like STM.

I think shared-nothing systems like actors and STM are two ways of managing state which together build a more complete picture of how you can build general purpose programs. Code should just be another kind of global state, and global state needs some kind of concurrency story, which STM addresses. Like peanut butter and chocolate, a language which combined actor-like ideas with STM ideas has a good way to manage not just code, but global shared data.