Wednesday, May 11, 2011

Introducing Celluloid: a concurrent object framework for Ruby

I've spent a lot of time recently working on a new concurrency library for Ruby called Celluloid. In short, its goal is to make working with threads as easy as working with Ruby objects in most cases, while still remaining loaded with all sorts of power user features for the edge cases. It's heavily inspired by Erlang and the actor model (hence "celluloid") and represents my best effort to port as many of these concepts over while making them more Ruby-like. This is the first in what I hope will be a series of blog posts describing various concurrency problems you might encounter in Ruby and how Celluloid can help solve them.

If you're already sold on using threads in Ruby, feel free to skip the next section of this post. However, as threads have remained a perpetual pariah in Ruby as a language, I feel some explanation is in order as to why you might actually consider using them.

Ruby and Threads

Rubyists generally don't like threads. There are plenty of good reasons to dislike threads: they're error prone for end users, and the original implementation of threads in the Matz Ruby Interpreter was pretty crappy and broken in multiple ways. Even with the latest YARV interpreter found in Ruby 1.9, a global lock prevents multiple threads from running Ruby code in parallel.

On the flip side, if you need multicore concurrency, Ruby processes are cheap and there are some pretty good libraries like DRb for allowing Ruby VMs to work together. But even then, most people are using Ruby to write stateless webapps that store all state in a database, so you can just run multiple Ruby VMs which all have the same application loaded to leverage multiple CPUs in a machine.

I used to be in the thread-hater camp, having cut my teeth on multithreaded C programs which left a bitter taste in my mouth, but recently I've changed my tune. This is mainly due to the great work of the JRuby and Rubinius teams to add true multicore concurrency to their Ruby implementations. JRuby has supported multicore concurrency via threads for a while, and Rubinius is adding it in their hydra branch. With these Ruby interpreters, you can run one virtual machine per host and the threads you create will be automatically load balanced among all available CPU cores.

This has immediate benefits for things like Rails applications which enable thread safe mode. Rails will automatically create a new thread per request, allowing one VM to service multiple requests simultaneously. On interpreters like JRuby and Rubinius Hydra, this means you can run just a single VM per host and your application will utilize all available CPU cores. All the memory overhead of loading multiple copies of your application is mitigated, and as an added benefit you can take advantage of the better garbage collection these VMs (particularly the JVM) offer.

There is a catch: libraries can't share state across threads without using some form of thread synchronization. This is often trotted out as a persistent worry of those who prefer to run their Rails apps in the standard single threaded mode. Those gem authors, who knows what they're doing? Maybe they're using global variables! People don't think about this sort of stuff in Ruby, so shouldn't we just assume that 100% of Ruby libraries aren't thread safe by default?

The truth, at least for things like Rails apps, is that the general way they operate typically sidesteps thread safety issues. Ruby as a language favors object creation over mutating existing objects, and webapps generally create a new set of objects per request and, being stateless by nature, don't provide mechanisms for sharing state between connections or threads.

If you do intend to go thread safe on your Rails app, you should certainly do your due diligence for auditing the libraries you use for unsafe usage of global and class variables, but in general I think the worries about running Rails apps in multithreaded mode are overblown. Ruby has much better semantics for promoting thread safety than other languages that have made the leap from single threaded to multithreaded (e.g. C/C++), and those languages have managed to make the transition with countless applications running in a multithreaded mode.

In the two years I've been deploying thread safe Rails applications, I've encountered exactly one thread safety bug. It was in a library that originally claimed to have a specific thread safe mode but removed it from later releases, and I unfortunately didn't catch that they had done so. The fix was simple: just create a thread-specific instance of an object I needed rather than sharing one across all threads. I won't say finding the bug was easy peasy, but all in all I don't think one bug was a bad price to pay for all the benefits of moving to a multithreaded deployment.

Concurrent Objects: How do they work?

Celluloid's concurrent objects work like a combination of normal Ruby objects and threads. You can call methods on them just like normal Ruby objects. To create a concurrent object, declare a class that includes the Celluloid::Actor module:

Then call the spawn method on the class:

This creates a new concurrent Charlie Sheen object. Calling the current_status method on it returns the normal value we'd expect from a method call. If an exception is raised, it will likewise be raised in the scope of the caller. But behind the scenes, all these things are happening in a separate thread.

Let's say things aren't going so well for Charlie. Instead of winning, Charlie is fired:

How can we help Charlie win again?

Calling Sheen#win! here does something special: it executes the method asynchronously. Adding a ! to the end of any method name sends a message to a concurrent object to execute a method, but doesn't wait for a response and thus will always return nil. You can think of this like signaling an object to do something in the background, and perhaps you'll check on the result later using normal synchronous method calls.

Using a ! to call a method asynchronously follows the Ruby convention of methods with a bang on the end being "dangerous." While there are certain dangers of asynchronous methods (namely in how errors are handled), providing thread safe access to instance variables is not one of them. Charlie is running in his own thread, but there's no need to synchronize access to his private variables. This is where Celluloid's secret sauce comes in.

Charlie maintains a queue of pending method calls and executes them one at a time in sequence. Celluloid uses an asynchronous messaging layer that you can communicate with using normal Ruby method call syntax. However, when you call a method on a concurrent object in Celluloid, the "message" you send is quite literal and takes the form of a request object which waits for a response object (instances of Celluloid::Call and Celluloid::Response respectively).
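To make the queue-of-method-calls idea concrete, here is a minimal sketch in plain Ruby. It is not Celluloid's implementation: TinyActor, its call and async methods, and this stand-in Sheen class are all hypothetical names invented for illustration. It only assumes Ruby's built-in Thread and Queue classes.

```ruby
require 'thread'

# Hypothetical illustration only; this is NOT how Celluloid is implemented.
# It wraps an ordinary object and runs all of its methods on one thread.
class TinyActor
  def initialize(target)
    @target  = target
    @mailbox = Queue.new                       # pending method calls
    @thread  = Thread.new { process_mailbox }  # the actor's own thread
  end

  # Synchronous call: enqueue a request, then block until the reply arrives.
  # Exceptions raised by the target are re-raised in the caller's thread.
  def call(method, *args)
    reply = Queue.new
    @mailbox << [method, args, reply]
    status, value = reply.pop
    raise value if status == :error
    value
  end

  # Asynchronous call: enqueue the request and return nil immediately.
  def async(method, *args)
    @mailbox << [method, args, nil]
    nil
  end

  private

  # Requests are executed strictly one at a time, in arrival order, so the
  # target's instance variables never need locking.
  def process_mailbox
    loop do
      method, args, reply = @mailbox.pop
      begin
        result = @target.public_send(method, *args)
        reply << [:ok, result] if reply
      rescue StandardError => e
        # Nobody is waiting on an async call, so its error is dropped here.
        reply << [:error, e] if reply
      end
    end
  end
end

# Stand-in for the post's Charlie Sheen object (hypothetical definition).
class Sheen
  def initialize
    @status = :fired
  end

  def win
    @status = :winning
  end

  def current_status
    @status
  end
end

charlie = TinyActor.new(Sheen.new)
charlie.async(:win)            # returns nil without waiting
charlie.call(:current_status)  # queued behind win, so it sees the new status
```

Because the mailbox is first-in, first-out and only one thread drains it, the synchronous call at the end cannot run until the earlier asynchronous one has finished. The silent handling of errors in asynchronous calls is a simplification in this sketch, and it hints at why error handling is the real danger of fire-and-forget messages.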

This approach is largely inspired by the gen_server abstraction within the Erlang/OTP framework. For you Erlang nerds who might be worried Celluloid tries to jam everything into gen_server-shaped boxes, let me say right away that isn't the case, but you will have to wait for a further installment of my blog to find out why.

Celluloid by Example: Parallel Map

Let's start with a simple, practical, real-world example. If you're interested in digging deeper into the guts of Celluloid before starting this, I'd suggest you check out the README. That said, let's start with a simple problem: how can we implement a parallel map? That is to say, how can we reimplement Enumerable#map such that all of the map operations are performed in parallel rather than sequentially?

As this is a contrived and relatively simple problem, I'll go ahead and share with you how you might do it using Ruby threads as opposed to using Celluloid:

This version alone winds up being all you need to accomplish simple parallel map operations in Ruby. Here are some examples of using it from irb:

This pmap implementation behaves just like we'd expect map to. It returns the value of the block for each element if everything succeeds correctly, and raises an exception if anything goes wrong along the way.
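The behaviour described above can be sketched with nothing but Ruby's standard Thread class. This is a minimal illustration under that assumption, not necessarily the exact listing this post refers to: one thread is started per element, and Thread#value is used to collect results because it waits for the thread and re-raises any exception the block raised.

```ruby
module Enumerable
  # Run the block for every element in its own thread and return the
  # results in the original order.
  def pmap(&block)
    threads = map { |element| Thread.new(element, &block) }
    # Thread#value joins the thread; if the block raised, the exception
    # is re-raised here in the calling thread.
    threads.map(&:value)
  end
end

[1, 2, 3].pmap { |n| n * 2 }  # => [2, 4, 6]
```

Starting one thread per element keeps the sketch short, but it is only reasonable for small collections; a larger input would want a bounded pool of workers.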

Now I'd like to show you how to refactor this code to fit into the concurrent object pattern Celluloid uses. Let's start by trying to represent this same code using an object to perform the computation:

To turn this into a concurrent object, we first need to include Celluloid::Actor. To achieve concurrency, we need to make the method that performs the computation callable asynchronously. The initialize method is called synchronously by spawn (in case something goes wrong during initialization), so we'll need to create a separate method that actually calls the given block:

After that we can rewrite Enumerable#pmap using this class:

This creates a new Mapper actor for each element and calls Mapper#run asynchronously on each of them. Once all of them are executing, we iterate over them again, this time collecting each return value. Since actors can only process one method call at a time, the call to Mapper#value will block until Mapper#run has completed, even though Mapper#run was called asynchronously.

This approach of allowing a value to be computed in the background and then only blocking when the value is requested is called a future. You've now seen how to implement a future, but it's also baked directly into Celluloid itself. Here's how to implement Enumerable#pmap using Celluloid::Future:

Like Mapper, Celluloid::Future takes arguments, passes them to a block, then runs that block in the background asynchronously. Only when the value is requested does it block the current thread.
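The same idea can be shown without Celluloid at all. The sketch below is a hand-rolled future built on a plain Ruby thread; SimpleFuture and future_map are hypothetical names chosen for this illustration and are not part of Celluloid's API.

```ruby
# Hypothetical stand-in for a future: takes arguments, passes them to a
# block, and runs that block in a background thread.
class SimpleFuture
  def initialize(*args, &block)
    @thread = Thread.new(*args, &block)
  end

  # Blocks the calling thread only if the result is not ready yet.
  # An exception raised inside the block is re-raised here.
  def value
    @thread.value
  end
end

module Enumerable
  # Parallel map expressed with futures: start everything first, then
  # collect the values in order.
  def future_map(&block)
    futures = map { |element| SimpleFuture.new(element, &block) }
    futures.map(&:value)
  end
end

future = SimpleFuture.new(20, 22) { |a, b| a + b }
future.value                          # => 42
[1, 2, 3].future_map { |n| n * n }    # => [1, 4, 9]
```

The important design point is the two-pass structure in future_map: creating every future before asking for any value is what lets the work overlap.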

Now that we have a pmap function, what can we do with it? How about we compare the time it takes to do a bit of Google Image Search screen scraping for different color images for a particular search term using regular map vs. pmap?

The performance metrics vary across Ruby implementations, but in general, the parallel version goes approximately 3.5x faster, and the Celluloid version is 5-10ms slower than the version written using raw Ruby threads.
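For a rough feel of this kind of comparison without touching the network, the sketch below times a sequential map against a thread-per-element version. Everything in it is an assumption made for illustration: slow_fetch, time_sequential and time_parallel are hypothetical helpers, and sleep stands in for the latency of an HTTP request. It does not reproduce the measurements quoted above.

```ruby
require 'benchmark'

# Hypothetical stand-in for a slow network request.
def slow_fetch(color)
  sleep 0.1
  "results for #{color}"
end

# Time a plain sequential map over the given colors.
def time_sequential(colors)
  Benchmark.realtime { colors.map { |c| slow_fetch(c) } }
end

# Time the same work with one thread per color.
def time_parallel(colors)
  Benchmark.realtime do
    colors.map { |c| Thread.new { slow_fetch(c) } }.map(&:value)
  end
end

colors = %w[red green blue yellow]
puts format("sequential: %.3fs", time_sequential(colors))
puts format("parallel:   %.3fs", time_parallel(colors))
```

Because the simulated work is pure waiting, the threaded version overlaps the waits even on interpreters with a global lock; CPU-bound blocks would only speed up on implementations that run threads on multiple cores.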

While this example is fairly trivial, in the next installment of this blog I'd like to demonstrate how to write a Sinatra-based chat server similar to node_chat.

What does this mean for Revactor?

In 2008 I wrote another actor library called Revactor, based on Fibers and evented I/O. While those ideas have grown increasingly popular, Revactor never did. I attribute this largely to Revactor's API, which was a fairly literal translation of Erlang's APIs into Ruby form with too little focus on putting an easy and accessible face on things. If you saw Mike Perham's recent article on actors in Rubinius (Revactor used a mostly identical API, as did MenTaLguY's Omnibus library), the code can be a little daunting, to the point you might need to learn a little Erlang just to figure out how to use it.

Celluloid is the official successor to Revactor. While Celluloid is primarily based on thread-based actors, it's been designed from the ground up with the idea of eventually incorporating event-based actors as well which can interoperate with an event library like EventMachine or cool.io. I know I originally dissed Scala for having both thread-based and event-based actors, but short of an Erlang-like process abstraction, it's not a bad compromise.

What about Reia?

One of the projects I've been working on the longest is Reia, a Ruby-like programming language for the Erlang VM. Today happens to be Reia's third birthday, and I do have to say after three years it's not where I thought it would be. It's generally usable but still missing some big features I feel are needed to make it a "complete" language. The main thing I feel is missing from Reia is a concurrent object model, and you can think of Celluloid as being an in-Ruby prototype of how it would work. I started to work on this in the legacy branch of Reia, but felt like it was too big of a project to tackle until I had fleshed out some of the other fundamentals of the language.

After I feel comfortable with how Celluloid is working I would like to try reimplementing it in Reia. After that I think Reia will evolve into a truly useful language which bridges the gap between object oriented languages and Erlang-style concurrency.

I think Celluloid has the potential to be a truly useful library in Ruby on its own, however. It provides most of the underpinnings needed for more Erlang-like concurrent applications without having to switch to a different language.

27 comments:

peak said...

Wonderful!

But there is something wrong -- it seems that "Mapper" objects are not garbage-collected.

E.g. using ruby 1.9.2p180, the following fails:

a = (1 .. 1000).map { |i| i * 8 }
3.times {
  b = a.pmap { |n| n / 8 }
  p b.length
}

Thanks.
peak@princeton.edu

chris said...

Well Tony, just when I had given up on Ruby as my choice language, you just gave it a new life.

I love the ruby syntax, but I needed it to be BOTH evented and threaded. Up to now, it was not possible, at least easily.

After reading about 2 hours, my guess is that we can probably get 2-3K standard requests/s (I mean not "hello world") out of a normal quadcore box, which is just what I need.

Tony, you're a genius.

My first question is, why didn't I hear from it before ? Such a thing should have spread like fire.

My guess is:
- Missing 3 examples : one for mysql2, one for redis, and one for HTTP request (as EM-http_request)

- also, but this is not a big deal, the "future" should be the default, as most want to program synchronously ( res=mysql("select * from tb"); puts res; )

Also, I tested your Reel server with ab, and it seems that the first request always hangs with concurrency (don't pay attention to the actual rps, it is an old machine) :

Concurrency Level: 100
Time taken for tests: 9.701 seconds
Complete requests: 10000
Failed requests: 0
Write errors: 0
Total transferred: 700000 bytes
HTML transferred: 120000 bytes
Requests per second: 1030.77 [#/sec] (mean)
Time per request: 97.015 [ms] (mean)
Time per request: 0.970 [ms] (mean, across all concurrent requests)
Transfer rate: 70.46 [Kbytes/sec] received

Connection Times (ms)
min mean[+/-sd] median max
Connect: 0 6 133.9 0 3000
Processing: 0 13 226.3 4 6702
Waiting: 0 13 226.3 4 6701
Total: 1 19 337.2 4 9699

Percentage of the requests served within a certain time (ms)
50% 4
66% 4
75% 4
80% 4
90% 4
95% 12
98% 12
99% 12
100% 9699 (longest request)

Is it my server, or something you've seen before ?

Anyway, I am back to Ruby, thanks to you. Cheers !!!

chris said...

Tony, another thought:

I remember that fibers in jruby are quite expensive, as they open a new thread for each one.

I think there were plans to sort it with coroutines, but I can't find anywhere that it actually was. Is it ? Is rbx treating fibers the same way as jruby ?
