Monday, March 19, 2012

Don't use bcrypt

(Edit: Some numbers for you people who like numbers)

If you're already using bcrypt, relax, you're fine, probably. However, if you're looking for a key derivation function (or in bcrypt's case, password encryption function) for a new project, bcrypt is probably not the best one you can pick. In fact, there are two algorithms which are each better in a different way than bcrypt, and also widely available across many platforms.

I write this post because I've noticed a sort of "JUST USE BCRYPT" cargo cult (thanks Coda Hale!). This is absolutely the wrong attitude to have about cryptography. Even though people who know much more about cryptography than I do have done an amazing job packaging these algorithms into easy-to-use libraries, use of cryptography is not something you undertake lightly. Please know what you're doing when you're using it, or else it isn't going to help you.

The first algorithm I'd suggest you consider besides bcrypt is PBKDF2. It's ubiquitous and time-tested with an academic pedigree from RSA Labs, you know, the guys who invented much of the cryptographic ecosystem we use today. Like bcrypt, PBKDF2 has an adjustable work factor. Unlike bcrypt, PBKDF2 has been the subject of intense research and still remains the best conservative choice.
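
To make the "adjustable work factor" concrete, here is a minimal sketch using the PBKDF2 implementation in Ruby's standard OpenSSL bindings. The helper name `derive_key`, the SHA-256 digest, the 32-byte key length, and the iteration count are all illustrative assumptions, not recommendations from this post.

```ruby
require "openssl"
require "securerandom"

KEY_LENGTH = 32 # bytes; illustrative choice

# Hypothetical helper: derive a key from a password with PBKDF2-HMAC-SHA256.
# The iteration count is the work factor: doubling it roughly doubles the
# cost of every guess an attacker makes.
def derive_key(password, salt, iterations)
  OpenSSL::PKCS5.pbkdf2_hmac(password, salt, iterations, KEY_LENGTH,
                             OpenSSL::Digest.new("SHA256"))
end

salt = SecureRandom.random_bytes(16) # store this alongside the derived key
key  = derive_key("correct horse battery staple", salt, 100_000)
puts key.unpack1("H*")
```

The salt and iteration count have to be stored next to the derived key so the same derivation can be repeated when a password is checked later.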

There has been considerably less research into the soundness of bcrypt as a key derivation function as compared to PBKDF2, and simply for that reason alone bcrypt is much more of an unknown as to what future attacks may be discovered against it. bcrypt has a higher theoretical-safety-to-compute-time factor than PBKDF2, but that won't help you if an attack is discovered which undercuts bcrypt's computational complexity. Such attacks have been found in the past against ciphers like 3DES. Where 3DES uses a 168-bit key, various attacks have reduced that key size's effectiveness to 80 bits.

PBKDF2 is used by WPA, popular password safes like 1Password and LastPass, and full-disk encryption tools like TrueCrypt and FileVault. While I often poke fun at Lamer News as a Sinatra antipattern, I have to applaud antirez on his choice of PBKDF2 when he got bombarded with a "just use bcrypt!" attack (although bro, antirez, there's a PBKDF2 gem you can use, you don't have to vendor it).

The second algorithm to consider is scrypt. Not only does scrypt give you more theoretical safety than bcrypt per unit compute time, but it also allows you to configure the amount of space in memory needed to compute the result. Where algorithms like PBKDF2 and bcrypt work in-place in memory, scrypt is a "memory-hard" algorithm, and thus makes a brute-force attacker pay penalties both in CPU and in memory. While scrypt's cryptographic soundness, like bcrypt's, is poorly researched, from a pure algorithmic perspective it's superior on all fronts.
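
As a rough sketch of what "configuring the amount of memory" looks like, the snippet below uses `OpenSSL::KDF.scrypt` from Ruby's standard OpenSSL bindings (it needs a reasonably recent Ruby and OpenSSL, hence the guard). The helper names and the parameter values are assumptions for illustration; the memory estimate uses the usual approximation of 128 * N * r bytes for scrypt's main working buffer.

```ruby
require "openssl"

# Approximate size of scrypt's main working buffer in bytes.
# N is the CPU/memory cost parameter and r is the block size parameter.
def scrypt_memory_bytes(n, r)
  128 * n * r
end

# Hypothetical helper: returns nil when the linked OpenSSL lacks scrypt.
def derive_scrypt_key(password, salt, n, r, p)
  return nil unless OpenSSL::KDF.respond_to?(:scrypt)
  OpenSSL::KDF.scrypt(password, salt: salt, N: n, r: r, p: p, length: 32)
end

n, r, p = 16_384, 8, 1 # illustrative parameters
puts scrypt_memory_bytes(n, r) / (1024 * 1024) # prints 16 (MiB)

key = derive_scrypt_key("password", "a fixed demo salt", n, r, p)
puts key.unpack1("H*") if key
```

Doubling N doubles both the memory an attacker needs per guess and the time each guess takes, which is the penalty "both in CPU and in memory" described above.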

The next time you need to pick a key derivation function, please, don't use bcrypt.

Thursday, March 8, 2012

Announcing Lightrail: Lightweight Rails stack for HTML5/JS applications

There's been a lot of debate lately surrounding Rails' suitability for the server stack underlying modern HTML5/JS applications. Having used Rails for some four years for this purpose, and worked with a number of Rails core members, I think this is a problem the Ruby community has solved wonderfully, yet some are confused as to what solutions are available or the way forward.

Rails 3 provided enormous advances in terms of letting you specialize what Rails provides to the problem at hand. However, to a certain extent this goes against the Rails mantra of "convention over configuration". While Rails 3 provides ample opportunities for configuration, as Rubyists, we shouldn't have to configure anything, right?

I'm both a huge believer in Rails' suitability as a backend for modern client-heavy HTML5/JS applications and someone experienced in building such applications. Rails is, was, and continues to be a game-changer for modern web development. ActionController::Metal provides the bare minimum needed to build apps which don't need the complete set of HTML-generating abstractions provided by ActionView, but need more tools than a more minimalistic framework like Sinatra makes available. Between Sinatra and Rails lies an unaddressed middle ground, one where, in theory, you should be able to build an ActionController::Metal stack appropriate to your needs, but maybe this is too daunting a task.

For you, the JSON API builder, who wants more than Sinatra but less than Rails... I have what you desire. Introducing Lightrail:


What is Lightrail? Lightrail is Strobe's ActionController::Metal stack for HTML5 applications, originally used to provide the backend APIs for Strobecorp.com and its frontend HTML5/JS application authored with SproutCore (which has been superseded by Ember.js).

Lightrail contains everything you need to build lightweight applications on the Rails stack which serve only JSON, and furthermore, contains an innovative system for building JSON APIs around your objects. Rather than adding a fat #to_json method on your models, or using a template to construct your JSON, Lightrail allows you to map the JSON serializations of your objects to specific wrappers that know how to serialize specific objects. Like using #to_json, this makes it easy to recursively serialize nested objects to JSON without having to use ActionView voodoo like invoking other renderers. Besides that, it still separates the concerns of what your domain objects are and how they serialize to JSON. If you've tried to build JSON APIs in Rails and found the existing mechanisms for JSON serialization lacking, please try out Lightrail::Wrapper and let me know what you think.
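
To illustrate the general idea of mapping objects to wrappers, here is a plain-Ruby sketch of the pattern. It is not Lightrail::Wrapper's actual API: the `Wrappers` registry, the wrapper classes, and the `Author`/`Post` structs are all hypothetical names made up for this example.

```ruby
require "json"

# Hypothetical domain objects; plain Structs stand in for models.
Author = Struct.new(:name)
Post   = Struct.new(:title, :author)

# Registry mapping a domain class to the wrapper that serializes it.
module Wrappers
  REGISTRY = {}

  def self.register(klass, wrapper)
    REGISTRY[klass] = wrapper
  end

  # Look up the wrapper for an object and ask it for a JSON-ready hash.
  def self.wrap(object)
    REGISTRY.fetch(object.class).new(object).as_hash
  end
end

class AuthorWrapper
  def initialize(author)
    @author = author
  end

  def as_hash
    { "name" => @author.name }
  end
end

class PostWrapper
  def initialize(post)
    @post = post
  end

  # Nested objects are serialized by delegating back to the registry.
  def as_hash
    { "title" => @post.title, "author" => Wrappers.wrap(@post.author) }
  end
end

Wrappers.register(Author, AuthorWrapper)
Wrappers.register(Post, PostWrapper)

def to_api_json(object)
  JSON.generate(Wrappers.wrap(object))
end

puts to_api_json(Post.new("Hello", Author.new("Ada")))
# prints {"title":"Hello","author":{"name":"Ada"}}
```

The domain classes know nothing about JSON, and each wrapper knows about exactly one class, which is the separation of concerns described above.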

Lightrail is something of an experiment. I didn't write it, but it's software I believe in so much I'd like to support it and see if people are interested in it. Rather than competing with Rails, Lightrail takes the latest, greatest Rails stack and reconfigures it for lightweight applications that provide a JSON API exclusively. Lightrail builds upon all of the modularity that Rails 3 brings to the table, and simply and easily delivers a lightweight stack which is still suitable for complex applications.

Please let me know if Lightrail seems like a good idea to you and if you'd like to help support it. As I have my hands in an awful lot of other open source projects, Lightrail isn't the sort of thing I can support full time. However, if you have some time to spare and ideas to contribute, I am definitely looking for people to help maintain and improve this project.

If you're interested in using Lightrail or helping out with its development, sign up for the mailing list. Just send any message to lightrail@librelist.com to join.

Monday, March 5, 2012

Why critics of Rails have it all wrong (and Ruby's bright multicore future)

Edit: Contrary to what I said here, José Valim is not stepping down from Rails core, he is merely on sabbatical. My bad.

Lately I've been getting the feeling the Ruby community has gotten a bit emo. The enthusiasm surrounding how easy Ruby makes it to write clean, concise, well-tested web applications quickly is fading. Rails has become merely a day job for many. Whatever hype surrounded Rails at its inception has died down into people who are just getting work done.

Meanwhile, Node.js is the new hotness, and many in the Node community have sought to build Node up by bringing Ruby and Rails down. I know that once upon a time Ruby enthusiasts were doing this sort of thing to Java. However, the tables have turned, and where Ruby used to be the mudslinging hype-monkey, it's now become the whipping boy and Node.js the new provocateur.

The sad thing is many of these people are former or current Rubyists who have taken a liking to Node and build it up by spreading blatant untruths about Ruby. I won't go as far as to call them liars, but at the very least, they are extremely misinformed, ignorant of the state of the Ruby ecosystem, and pushing their own agendas.

Jeremy Ashkenas, the creator of CoffeeScript, recently trashed Rails 3 and claimed "Node.js won":


The idea that Rails 3 was a major step backward was recently reiterated by both Giles Bowkett and Matt Aimonetti. Both of them painted building ActionController::Metal applications as some sort of byzantine, impossible task which can only be accomplished by a Rails core member. Are people actually building lightweight Rails applications using the newfound modularity of Rails 3?


José Valim, a (now former) Rails core member, published a small, simple gist illustrating how to build barebones apps on ActionController::Metal (one of the most forked gists I've ever seen) which is further documented in his book Crafting Rails Applications. In just 50 lines of code you can strip Rails down to its core, making it ideal for use in modern client-heavy HTML5 applications. The funny thing about this gist is that while the idea of a 50-line Rails app seems pretty impressive, the basis of that gist is what Rails 3 puts into your config/boot.rb, environment.rb, and application.rb, just combined into a single file. Did I just blow your mind? Sadly, all the (in my opinion completely undeserved) bad press seems to have made José emo as well, and he has stepped down from Rails to pursue his Elixir language.

ActionController::Metal-based applications (along with apps written in Clojure) were the basis of our backend at Strobe, where we sought to ease the pains of people building modern client-heavy HTML5/JS applications with frameworks including SproutCore/Ember, Backbone, and Spine. ActionController::Metal provided a great, fully-featured, mature, and modular platform for us to build applications on top of, and Strobe's ActionController::Metal stack for client-heavy HTML5/JS applications is available on Github. The apps we built with the Strobe ActionController::Metal stack talked only JSON and our frontend was an HTML5/JS application written with SproutCore.

Before Strobe, I worked at a company building rich HTML/JS applications for digital television. Our backend was written in Rails. Our frontends were Flash and HTML/JS applications, the latter of which were single-page client-heavy HTML/JS apps that were packaged in .zip files and installed onto digital televisions and set top boxes, a sort of weird hybrid of web technologies and installable applications. Our Rails application didn't have any views, but provided only a JSON API for the HTML/JS frontend to consume.

Rails was great for this, because it provided the sort of high level abstractions we needed in order to be productive, ensure our application was well-tested, and above all else provided the necessary foundation for clean, maintainable code. I was doing this in 2008, and even then this was already considered old hat in the Rails community. In case you're not paying attention, that's one year before Node even existed.

Modern HTML5/JS apps depend on beautiful, consistent RESTful JSON APIs. This is a great way to develop rich interactive applications, because it separates the concerns of what the backend business logic is doing from the view layer entirely. Separate teams, each specialized in their role, can develop the frontend and backend independently, the frontend people concerned with creating a great user experience, and the backend people concerned with building a great API the frontend application can consume.

Rails is great for JSON APIs.

And yet this meme persists, that somehow Rails is actually bad at JSON APIs. Those who propagate this meme insist that Rails has lost its edge, and that only Node understands the needs of these sorts of modern client-heavy web applications. Giles recently attempted to argue this position:


Giles recently blogged about this issue at length. Let's look at what he has to say about ActionController::Metal and the new level of modularity and clean design that Rails 3 brings to the table:


So José wrote a great book about the incredible power of Rails 3's new modular APIs... but... but... but what?

WARD CUNNINGHAM BITCHES. TWEETS > BOOKS. NODE WINS. QED.

Hurrrrrrrr? Ward Cunningham is a cool guy and his concept of a Wiki was a transformative technology for the web, but what the fuck does that have to do with Rails 3's new modular APIs or José's book? I think that's what people in logical debate circles call a "non sequitur".

Perhaps there's still a cogent argument to be had here. Let's dig deeper:


Okay, so the problem is there's not a damn simple way to do websockets. OH WAIT, THERE IS:


Cramp is an awesome, easy-to-use websockets/server-sent events framework (with socket.io support) which runs on Rainbows or Thin, and Thin is a great web server. According to my benchmarks it's approximately the same speed as Node's web server:

Web Server         Throughput    Latency
----------         ----------    -------
Thin    (1.2.11)   8747 reqs/s   7.3 ms/req
Node.js (0.6.5)    9023 reqs/s   7.1 ms/req

Yes folks, Node isn't significantly faster than Ruby at web server performance. They're about the same.
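
For the numbers people, a quick back-of-the-envelope check of the figures in the table. The relative difference follows directly from the throughput column; the "implied concurrency" line is an assumption on my part, using throughput times latency (Little's law) to guess how many requests were in flight, since the benchmark settings aren't stated here.

```ruby
# Figures copied from the benchmark table above.
THIN = { reqs_per_sec: 8747.0, latency_ms: 7.3 }
NODE = { reqs_per_sec: 9023.0, latency_ms: 7.1 }

# How much faster a is than b, as a percentage of b.
def percent_faster(a, b)
  (a - b) / b * 100.0
end

# Little's law: requests in flight = throughput * time per request.
def implied_concurrency(reqs_per_sec, latency_ms)
  reqs_per_sec * latency_ms / 1000.0
end

puts percent_faster(NODE[:reqs_per_sec], THIN[:reqs_per_sec]).round(1)
puts implied_concurrency(THIN[:reqs_per_sec], THIN[:latency_ms]).round(1)
puts implied_concurrency(NODE[:reqs_per_sec], NODE[:latency_ms]).round(1)
```

The gap works out to roughly 3%, and both servers come out at about 64 requests in flight, which is consistent with the two runs having used the same load settings.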

Giles also bemoans Bundler: typing "bundle exec" represents ceremony, using any of the myriad solutions to avoid typing "bundle exec", such as Bundler binstubs or rvm gemsets, represents configuration which violates the Rails mantra of "convention over configuration", and npm is supposedly that much easier. I'm sure we would all love to not have to add a one-line .rvmrc file to each project to avoid typing "bundle exec", but uhh Giles, bro, mountain out of a molehill much?

Meanwhile, let's check out how convention over configuration is going in the JavaScript world:


But enough about Giles... what kinds of awesome, modern HTML5 applications are people using Rails to build?

I think one of the best examples of this sort of application is Travis CI. Travis is an open source distributed build system with an Ember-based frontend and a Rails backend. Travis's interface shows, in real time, the state of all builds across the entire (distributed) system, allows you to interactively explore the history, see the distributed build matrix completing jobs in realtime, and even have it stream the console output of builds in progress directly to your browser as they complete. It's an amazing, modern client-heavy HTML5/JS application, and it's built on Rails.

Who else is using Ruby/Rails for their frontend? Oh, just Twitter, LivingSocial, Groupon, Heroku, EngineYard, Github, Square, Zendesk, Shopify, Yammer, Braintree, Boundary, Stripe, Parse, Simple, and of course let's not forget 37signals. Rails is the technology underlying the frontend web stack of many huge businesses. Many of these companies have client-heavy HTML5/JS applications which consume a JSON API coming out of Rails. Many of them have APIs that are routinely cited as archetypical RESTful JSON APIs. Many of them have top notch engineering teams that choose the best tools for the job and use many languages for many different tasks. Many of them were founded "post-Node" and had the opportunity to choose Node as their frontend web technology, and while they may use Node in some capacity, their main consumer-facing sites are written with Rails.

Node is three years old now. Where are the Node.js success stories? Who's built a brand on top of Node? Nodejitsu? Hubot? Is Node anything more than a pyramid scheme or a platform for Campfire bots? Where Rails' selling points eschewed performance and instead focused on clear code, rapid development, extensive testing, and quick time-to-market, Node's selling points seem to universally revolve around its insanely fast, destroy-the-internet-fast performance (benchmarks not provided). Meanwhile, code quality is de-emphasized, and large Node programs degrade into incomprehensible, byzantine structures of callbacks and flow-control libraries instead of being written as sequential code, you know, the code you can read:

 

What about Ruby in general? What advancements in the Ruby ecosystem are worth getting excited about?

JRuby is maturing into a high-performance Ruby implementation which taps the JVM's advanced features including the HotSpot compiler, multiple pluggable garbage collectors, and parallel multithreading which makes it suitable for multicore applications. One thing I think sets JRuby apart is that it's the most mature language on the JVM which didn't start there. Other projects to implement non-JVM languages on top of the JVM, such as Rhino and Jython, have languished, while JRuby keeps going strong.

The most exciting development in JRuby is Java 7's new InvokeDynamic feature. The Java Virtual Machine was originally designed for the statically-typed Java language, but has its roots in dynamic languages, namely Smalltalk. With InvokeDynamic, the JVM has come full circle and now natively supports dynamic languages like Ruby. InvokeDynamic provides the necessary information to the JVM's HotSpot compiler to generate clean native code whenever Ruby methods are called, in addition to many other potential optimizations. So how much faster will InvokeDynamic make Ruby?


Rubinius, a clean-room Ruby virtual machine based on the Smalltalk-80 architecture, is also a very exciting prospect for the Ruby community as it matures and reaches production quality. It features an LLVM-based JIT compiler, parallel thread execution, and advanced garbage collection, also making it suitable for multicore applications. Beyond being an awesome Ruby implementation, Rubinius has evolved into a true polyglot platform and now features multiple Rubinius-specific language implementations including Fancy and Atomy.

MacRuby also eliminated the GIL from their implementation and now supports parallel thread execution along with an LLVM-based JIT compiler.

There are no fewer than three Ruby implementations which now support thread-level parallelism and thus multicore CPUs. This is especially relevant in a time when computing is undergoing a sort of phase transition from single-threaded sequential applications to massively multithreaded concurrent applications and distributed systems made out of these multithreaded applications.

It wasn't too long ago that having even four CPU cores in your home computer seemed like a lot, and now 16-core commodity AMD CPUs are available. The future is multicore, and if your programming language doesn't have a multicore strategy, its usefulness is vanishing. Following Moore's Law, the number of cores in a CPU is set to explode exponentially. Is your programming language prepared?

Thanks to JRuby and Rubinius, Ruby can take advantage of multicore CPUs. This still leaves the small matter that multithreaded programming is, uhh, hard. Fortunately I have some ideas about that.

Celluloid is an actor-based concurrent object system that tries to pick up on the concurrent object research that was hot in the mid-'90s but died shortly after the web gained popularity. In the '90s concurrent objects were ahead of their time, but with the advent of massively multicore CPUs I believe it's an area of computer science research that's worth reviving.

Celluloid wraps Ruby's core concurrency features in a simple, easy-to-use package that doesn't require any modifications to the language. Where many functional languages solve the issues surrounding concurrency with immutable state, Celluloid solves them with encapsulation (more information is available on the Celluloid GitHub page).
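
To show what "solving concurrency with encapsulation" can mean, here is a minimal actor sketch in plain Ruby, built only on `Thread` and `Queue`. It is not Celluloid's API or implementation; `CounterActor` and `count_concurrently` are hypothetical names, and the design (one thread per object, messages through a mailbox) is just the simplest version of the idea.

```ruby
# A minimal actor: its state is touched by exactly one thread and is
# reachable from the outside only through messages in a mailbox.
class CounterActor
  def initialize
    @count   = 0
    @mailbox = Queue.new
    @thread  = Thread.new { run }
  end

  # Asynchronous call: enqueue the message and return immediately.
  def increment
    @mailbox << [:increment, nil]
    self
  end

  # Synchronous call: enqueue a request and block until the actor replies.
  def value
    reply = Queue.new
    @mailbox << [:value, reply]
    reply.pop
  end

  def stop
    @mailbox << [:stop, nil]
    @thread.join
  end

  private

  # The actor's own loop: messages are processed one at a time, in order.
  def run
    loop do
      message, reply = @mailbox.pop
      case message
      when :increment then @count += 1
      when :value     then reply << @count
      when :stop      then break
      end
    end
  end
end

# Many threads hammer one actor; no locks appear in the calling code.
def count_concurrently(callers, increments_each)
  actor = CounterActor.new
  threads = Array.new(callers) do
    Thread.new { increments_each.times { actor.increment } }
  end
  threads.each(&:join)
  total = actor.value
  actor.stop
  total
end

puts count_concurrently(8, 1_000) # prints 8000
```

Because only the actor's own thread ever mutates `@count`, the callers never need a mutex, which is the trade this style makes in place of immutable state.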

Celluloid takes advantage of many of the features of Ruby, including parallel threads, fibers (coroutines), method_missing (proxy objects), and duck typing. There aren't many other languages with this particular mix of features. Python probably comes the closest, aside from multicore execution due to its GIL. Jython supports parallel thread execution thanks to the JVM but seems abandoned. For what it's worth, Python once had a concurrent object system quite similar to Celluloid back in the '90s called ATOM; unfortunately, the source code has been lost.

Ruby is by far the best language available today to implement a system like Celluloid, and that alone makes me excited to be a Rubyist. Where Node.js gives you a hammer, the single-threaded event loop, Celluloid gives you a large toolbox and provides a singular framework of interoperable components which can be used to build arbitrary hybrids of concurrent multithreaded applications, event-based nonblocking applications (that are callback-free!), and distributed systems.

Ruby is a language which can survive the massively multicore future. Whether Node will stick around remains to be seen.