Wednesday, June 6, 2012

Ruby is faster than Python, PHP, and Perl

There's a pervasive myth that Ruby is slow, and moreover, that it's the slowest language in popular use. Everyone knows Ruby is slow. Right? Who would possibly disagree that Ruby is slow? Here's an example IRC discussion on freenode's #postgres which happened just yesterday:

16:57 sobel: i can't find it now, but arstechnica benched all the popular dynamic languages
16:58 sobel: using C/C++ as the standard (1.0) they ranked other languages as multiples of C/C++ performance
16:58 sobel: java was a 2
16:58 sobel: twice as slow as C. and it was the winner, head and shoulders above the rest
16:58 sobel: i think Erlang placed somewhat favorably
16:59 sobel: python/perl were near the middle, at something like 25-35x slower than C
16:59 sobel: Ruby: a full 72x slower than C

Ruby loses against invisible Ars Technica benchmarks in the sky with unknown URLs! 72x slower than C! Over twice as slow as Python and Perl!

Fortunately, we don't have to rely on some one-off benchmark Ars Technica may or may not have done at some indeterminate point in time whose URL cannot be located offhand, because there's a site with a fairly decent reputation which provides ongoing benchmarks across multiple programming languages using implementations submitted by fans of said language. It's been around for awhile and is relatively well-trusted.

That site is the Programming Languages Shootout, and unlike the alleged Ars Technica benchmark, you can actually visit their web site at shootout.alioth.debian.org. What do they have to say about programming language performance?


According to this benchmark suite, JRuby is 34.5 times slower than (not C, not C++, but) Fortran. Ruby 1.9 (MRI/YARV) is 43.80 times slower than (not C, not C++, but) Fortran. Both JRuby and Ruby 1.9 beat Python, PHP, and Perl by a considerable margin. The nearest competitor is Python 3, at 47.93 times slower than (not C, not C++, but) Fortran. By the way, did I mention that the fastest language on their benchmark is... not C... not C++, but Fortran? (nothing personal sobel, but unsubstantiated hearsay is bad!)

Yes, that's right folks: according to the Programming Languages Shootout, Python, PHP, and Perl are all slower than Ruby. Did you think Ruby was slower than Python? Guess what, you're wrong. Ruby used to be one of the slowest popular languages, but that has changed. Ruby performance has advanced considerably over the years, so if you're still repeating some offhand information you may or may not have gotten from Ars Technica at some point but can't find the link to as your metric of Ruby performance, you may want to try again, and find modern, relevant information you can actually get a link to.

There are many future VM improvements in the pipe for Ruby, Python, and PHP (and I guess Perl users might continue dreaming of a Parrot-powered Perl 6). Rubyists can look forward to the upcoming JRuby 1.7 release which features InvokeDynamic support and allows for Java-speed method dispatch... in Ruby. InvokeDynamic is a game changer for the JVM in general, and it's a game changer for Ruby, because InvokeDynamic makes JRuby dispatch potentially as fast as Java.

Python users can look forward to PyPy, which is posting some incredibly impressive numbers, especially around numerical computing. Python users can also look forward to resumed work on Jython, which is adding InvokeDynamic support which can potentially make Python as fast as Java. Finally, PHP users can look forward to the HipHop VM developed at Facebook, which will provide much improved performance for PHP. These are all great projects, but none of them are really ready for general consumption yet (including JRuby 1.7).

All that said, the Programming Language Shootout doesn't include any of these unreleased development versions in the benchmarks you see when you visit their site. They show the numbers for the latest production releases, and those numbers show Ruby is faster than Python, PHP, and Perl.

The game has changed: you just weren't paying attention.

Last but not least, if you've seen some benchmark somewhere, even if you have an eidetic memory and remember but the numbers were, but can't even dredge up a link to it, please, please, don't quote said benchmark, even if you have an eidetic memory and remember what the numbers were.

For benchmarks to be remotely scientific, they must be both reproducible and falsifiable, and hopefully in addition to both those things they have a good methodology. If you can't even dredge up a link to the benchmark in question, please don't go quoting numbers off the top of your head to people who might be influenced by them.

Let's advance computer science beyond the state of witch doctors telling people to bleed themselves with leeches because at some point someone said they might make you feel better maybe.