Proposal of a new concurrency model for Ruby 3 [pdf]

pmontra · on Sept 8, 2016

Tl;dr The goal is to keep compatibility with Ruby 2. It introduces the concept of guilds and channels to send objects between guilds. The bullet points below are quoted from a couple of slides, the other text is mine:

* Guild has at least one thread (and a thread has at least one fiber)

* Threads in different guilds can run in parallel

* Threads in a same guild can not run in parallel because of GVL (or GGL: Giant Guild Lock)

A guild can't access the objects of other guilds.

About channels:

* We have Guild::Channel to communicate each other

* 2 communication methods

1. Copy

2. Transfer membership or Move in short

Copy is a deep copy and the object is duplicated into the destination guild. A transfer removes an object from a guild and makes it available to another.
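To make the copy mode concrete, here is a rough plain-Ruby sketch. It assumes a Marshal round trip as a stand-in for the deep copy; the slides do not say how the copy is implemented, and `deep_copy` is a made-up helper name.

```ruby
# Hypothetical helper: a Marshal round trip stands in for the channel's deep copy.
def deep_copy(obj)
  Marshal.load(Marshal.dump(obj))
end

original = { name: String.new("ko1"), tags: [String.new("ruby")] }
copy = deep_copy(original)

# The copy shares no mutable state with the original, so mutating one
# side is invisible to the other.
copy[:tags] << "guild"
copy[:name] << "!"

p original  # the original hash is unchanged
p copy
```

A move would skip the duplication and hand over the very same object, which is why the sender must lose access to it afterwards.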

There are also immutable objects that are available to all guilds. Obvious examples are numbers (which are objects in Ruby), booleans and symbols. I think that other objects are frozen with https://ruby-doc.org/core-2.3.1/Object.html#method-i-freeze
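A small sketch of what freezing looks like in today's Ruby. `deep_freeze` is a hypothetical helper, not something from the slides, and it only handles arrays, hashes and their leaf objects; it is here because `Object#freeze` on its own is shallow.

```ruby
# Small immediates are always frozen, so sharing them needs no copying.
p 42.frozen?      # Integer
p :guild.frozen?  # Symbol
p true.frozen?    # boolean
p nil.frozen?

# Object#freeze is shallow: the array is frozen, its elements are not.
list = [String.new("a"), String.new("b")].freeze
p list.frozen?        # the container
p list.first.frozen?  # an element

# Hypothetical helper: freeze a whole graph of arrays and hashes so that
# nothing mutable stays reachable from the root.
def deep_freeze(obj)
  case obj
  when Array then obj.each { |e| deep_freeze(e) }
  when Hash  then obj.each { |k, v| deep_freeze(k); deep_freeze(v) }
  end
  obj.freeze
end

p deep_freeze({ "k" => [String.new("v")] })["k"].first.frozen?
```

Objects with instance variables would need the same treatment for each variable; that part is left out of the sketch.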

They already did some encouraging benchmarks.

catnaroek · on Sept 8, 2016

> A guild can't access the objects of other guilds.

> 2. Transfer membership or Move in short

How is this enforced? What exactly happens at runtime if a guild tries to manipulate an object that belongs to another? (Absent a compile-time check, this is always a possibility.)

dragonwriter · on Sept 8, 2016

> How is this enforced? What exactly happens at runtime if a guild tries to manipulate an object that belongs to another? (Absent a compile-time check, this is always a possibility.)

It would seem that:

(1) Guild ownership would have to be tracked in the runtime, obviously.

(2) For any access from Ruby code, the runtime would also know which Guild the access request came from, as well as the Guild that owns the object being accessed.

(3) The runtime would be required to fail in some well-defined way (presumably, raising an exception in the requester) when the rules were violated.

It should be reasonably straightforward to assure this for all accesses within the runtime, since you can just make sure that there is no method to request access which isn't always attached to the Guild that the request comes from. It may be possible to break the runtime with poorly-behaved extension code that subverts the normal mechanisms, and it may be impossible to fully protect against that, but that's pretty much always a potential with extension code.
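The three points above can be sketched as a toy model in plain Ruby. Everything here (`ToyRuntime`, `ToyGuild`, `OwnershipError`) is a made-up name for illustration; the real check would live inside the VM, not in Ruby code.

```ruby
# Toy model only: ToyRuntime, ToyGuild and OwnershipError are made-up names.
class OwnershipError < StandardError; end

ToyGuild = Struct.new(:name)

class ToyRuntime
  def initialize
    # (1) The runtime tracks which guild owns each object, keyed by identity.
    @owner = {}.compare_by_identity
  end

  def register(obj, guild)
    @owner[obj] = guild
    obj
  end

  # (2) Every access carries the requesting guild;
  # (3) a mismatch fails in a well-defined way.
  def access(requester, obj)
    unless @owner[obj].equal?(requester)
      raise OwnershipError, "#{requester.name} does not own this object"
    end
    yield obj
  end
end

runtime = ToyRuntime.new
g1 = ToyGuild.new("g1")
g2 = ToyGuild.new("g2")
str = runtime.register(String.new("foo"), g1)

p runtime.access(g1, str) { |s| s.upcase }
begin
  runtime.access(g2, str) { |s| s.upcase }
rescue OwnershipError => e
  puts "rejected: #{e.message}"
end
```

Because `access` is the only way in, there is no path to an object that is not tagged with the requesting guild, which is the property described above.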

catnaroek · on Sept 8, 2016

How would you transfer ownership of big linked data structures? See my other comment: https://news.ycombinator.com/item?id=12455566

I'm not particularly worried about C extensions. I already know those are a lost case.

dragonwriter · on Sept 8, 2016

> How would you transfer ownership of big linked data structures?

Very carefully?

More seriously, I think with guilds, what you absolutely don't want to do is build yourself into a position where you ever want to move a big linked data structure (that is, unless you know you are only going to use it in one guild, you never want to build a big linked data structure of mutable objects.)

Big structures of mutable objects should be guild local (or external, in a store that has its own controls for concurrent access.)

_ko1 · on Sept 8, 2016

absolutely.

pmontra · on Sept 8, 2016

It's similar to Rust's transfer of ownership. Rust is compiled so you get a compile time error if you try to access something you can't access anymore.

Check the code at http://rustbyexample.com/scope/move.html and run it (inside the page). There is a commented out println towards the end. Comment it in, run the code again and see the compiler error.

More about transfer of ownership at https://doc.rust-lang.org/book/ownership.html

catnaroek · on Sept 8, 2016

I already know very well how Rust works. What I'm not convinced of is that ownership can be correctly and efficiently enforced purely by runtime mechanisms. (Presumably Ruby's implementors aren't interested in introducing static checks anytime soon, right?)

For example, how would you transfer ownership of a big linked data structure from one guild to another?

(0) In Rust, this is as easy as handing ownership of the root node (an O(1) operation) to another thread. Ownership is transitive: whoever owns the root node also owns whatever the root node owns.

(1) In Ruby, I don't see how transitive ownership could possibly work. If I understand the proposal correctly, every object is owned directly by a guild, never by a parent object. Thus, you would have to traverse the entire linked data structure to transfer ownership of every node. This is O(n) work. Making things worse, you would have to make sure that the ownership transfer is atomic - no other part of the program should see the data structure in a “partially transferred” state.

Another example: Say you initially have a guild with three objects, Foo, Bar and Qux, where Foo and Bar point to Qux. If I transfer ownership of Foo, should Qux be transferred as well?

(0) In Rust, the type system forces me to explicitly distinguish between the following possibilities:

(0.a) Foo exclusively owns Qux, and Bar merely borrows it. In this case, Foo and Qux are frozen, and thus can't be transferred until Bar's borrow ends.

(0.b) Bar exclusively owns Qux, and Foo merely borrows it. In this case, transferring Foo doesn't change the fact Qux is owned by Bar.

(0.c) Foo and Bar jointly own Qux (using an Arc). In this case, transferring Foo doesn't change the fact Qux is jointly owned.

(1) In Ruby, what exactly should happen here? Have these guys really thought about the possibilities?

_ko1 · on Sept 8, 2016

> Thus, you would have to traverse the entire linked data structure to transfer ownership of every node. This is O(n) work.

Correct.

> Making things worse, you would have to make sure that the ownership transfer is atomic - no other part of the program should see the data structure in a “partially transferred” state.

Correct. Guild::Channel#transfer_ownership() does it.

Basically, sharing big linked data among multiple threads is difficult (we simply need to lock every access).

> Another example: Say you initially have a guild with three objects, Foo, Bar and Qux, where Foo and Bar point to Qux. If I transfer ownership of Foo, should Qux be transferred as well?

Foo -> Qux; Bar -> Qux

Yes, Qux is also moved. The programmer can find out via an "exception" when accessing Qux via Bar after the transfer.

This "realization" is the key to Guild. With threads, we can't notice that Qux is shared and mutable.
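A toy model of the Foo/Bar/Qux case, as a sketch only. `Node`, `OWNER`, `transfer`, `read` and `MovedError` are illustrative names, guilds are plain symbols, and atomicity of the transfer is ignored here.

```ruby
# Toy model only: Node, OWNER, transfer, read and MovedError are made-up names.
class MovedError < StandardError; end

Node = Struct.new(:name, :refs)

OWNER = {}.compare_by_identity  # object => owning guild (a Symbol here)

# Walk everything reachable from root and hand it to the new guild.
# Each reachable node is visited once, hence the O(n) cost.
def transfer(root, to)
  stack = [root]
  seen = {}.compare_by_identity
  until stack.empty?
    node = stack.pop
    next if seen[node]
    seen[node] = true
    OWNER[node] = to
    stack.concat(node.refs)
  end
end

def read(guild, node)
  raise MovedError, "#{node.name} now belongs to #{OWNER[node]}" unless OWNER[node] == guild
  node.name
end

qux = Node.new("Qux", [])
foo = Node.new("Foo", [qux])
bar = Node.new("Bar", [qux])
[foo, bar, qux].each { |n| OWNER[n] = :g1 }

transfer(foo, :g2)

p read(:g2, foo)
p read(:g2, qux)  # Qux moved along with Foo
p read(:g1, bar)  # Bar itself stayed behind
begin
  read(:g1, bar.refs.first)  # reaching Qux via Bar from the old guild
rescue MovedError => e
  puts "error: #{e.message}"
end
```

The exception in the last step is the "realization": the program is told that Qux was shared between Foo and Bar.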

catnaroek · on Sept 8, 2016

Thanks. This clarifies a lot of things.

parenthephobia · on Sept 8, 2016

> For example, how would you transfer ownership of a big linked data structure from one guild to another?

Slowly.

Objects well-suited to transfer would be shallow mutable structures with immutable data "underneath". Immutable data can be shared, so they'd be ignored by the transfer logic.

Another approach likely to be common would be not transferring objects at all, but sending proxy objects to other guilds which transparently marshal method invocations between those guilds and the object's home guild. For large mutable complexes which are intertwined with everything a guild does, that's probably a more manageable approach.
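A minimal sketch of such a proxy in plain Ruby, assuming a Thread as a stand-in for the home guild and Queues as stand-ins for channels. `GuildProxy` and `Tally` are made-up names, and blocks are not forwarded.

```ruby
# Toy model only: a Thread stands in for the "home guild"; GuildProxy and
# Tally are made-up names. Only the home thread ever touches the real object.
class Tally
  def initialize
    @total = 0
  end

  def add(n)
    @total += n
  end

  def total
    @total
  end
end

class GuildProxy
  def initialize(target)
    @interface = target.class.public_instance_methods
    @inbox = Queue.new
    @home = Thread.new do
      while (msg = @inbox.pop)
        name, args, reply = msg
        begin
          reply << [:ok, target.public_send(name, *args)]
        rescue StandardError => e
          reply << [:error, e]
        end
      end
    end
  end

  # Marshal each call to the home thread and block until it answers.
  def method_missing(name, *args)
    return super unless @interface.include?(name)
    reply = Queue.new
    @inbox << [name, args, reply]
    status, value = reply.pop
    raise value if status == :error
    value
  end

  def respond_to_missing?(name, include_private = false)
    @interface.include?(name) || super
  end
end

tally = GuildProxy.new(Tally.new)
callers = Array.new(4) { Thread.new { 100.times { tally.add(1) } } }
callers.each(&:join)
p tally.total
```

All calls are serialized through one thread, so no locking is needed around `Tally`, but that thread is also the only place where work on the object can happen. Return values would need to be immutable or copied; here they are Integers.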

> Making things worse, you would have to make sure that the ownership transfer is atomic - no other part of the program should see the data structure in a “partially transferred” state

This is easy. You just run the transfer operation with the guild's mutex locked. No other guild can have a reference to a mutable object, and an immutable object doesn't need to be transferred.

> In Ruby, what exactly should happen here? Have these guys really thought about the possibilities?

It seems obvious that every linked mutable object is invalidated, so Qux will be transferred.

catnaroek · on Sept 8, 2016

> Objects well-suited to transfer would be shallow mutable structures with immutable data "underneath". Immutable data can be shared, so they'd be ignored by the transfer logic.

How often are data structures designed like this in Ruby?

> Another approach likely to be common would be not transferring objects at all, but sending proxy objects to other guilds which transparently marshal method invocations between those guilds and the object's home guild.

What if said “home guild” ends up overburdened with requests coming from all over the place?

> This is easy. You just run the transfer operation with the guild's mutex locked.

You mean both the sender and the receiver's mutexes? I'm worried about the receiver being able to observe partial transfers.

eregon · on Sept 15, 2016

The object graph would only be handed to the receiver once it's fully transferred I would imagine.
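A sketch of that ordering in plain Ruby: re-tag everything under the sender's lock, then publish. The owner table, the Mutex and the Queue are all stand-ins chosen for illustration, not the proposal's actual mechanism.

```ruby
# Toy model only: a Hash of owners, a Mutex for the sending guild and a Queue
# as the channel are all stand-ins.
OWNERS = {}.compare_by_identity
SENDER_LOCK = Mutex.new
CHANNEL = Queue.new

def transfer_graph(nodes, to)
  SENDER_LOCK.synchronize do
    nodes.each { |n| OWNERS[n] = to }  # re-tag every node first
  end
  CHANNEL << nodes                     # publish only once re-tagging is done
end

nodes = Array.new(1_000) { |i| [i] }
nodes.each { |n| OWNERS[n] = :sender }

receiver = Thread.new do
  got = CHANNEL.pop  # blocks until the sender publishes
  got.all? { |n| OWNERS[n] == :receiver }
end

transfer_graph(nodes, :receiver)
p receiver.value  # whether the receiver saw a fully transferred graph
```

The receiver has no reference to any node before the `pop` returns, so there is no window in which it could observe a half-transferred structure.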

riffraff · on Sept 8, 2016

you get a runtime error

catnaroek · on Sept 8, 2016

Is it a guaranteed runtime error, or a best-effort thing, à la Java's ConcurrentModificationException?

riffraff · on Sept 8, 2016

as I understand it, a guaranteed error, it's on page 50 of the slides

    • Accessing from the source guild is invalidated
    • Cause exceptions and so on
    • ex) obj = "foo"
          ch.transfer_membership(obj)
          obj.upcase #=> Error!!
          p(obj) #=> Error!
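The slide's behaviour can be imitated in plain Ruby with a handle that is poisoned on transfer. This is only a sketch: `Handle`, `ToyChannel` and `ToyMovedError` are made-up names, and the proposal invalidates the object itself rather than a wrapper.

```ruby
# Toy model only: Handle, ToyChannel and ToyMovedError are made-up names.
class ToyMovedError < StandardError; end

# User code talks to the payload only through the handle.
class Handle
  def initialize(payload)
    @payload = payload
    @moved = false
  end

  # Called by the channel: give up the payload and poison this handle.
  def release!
    raise ToyMovedError, "already moved" if @moved
    @moved = true
    payload = @payload
    @payload = nil
    payload
  end

  def method_missing(name, *args, &block)
    raise ToyMovedError, "object was moved to another guild" if @moved
    @payload.public_send(name, *args, &block)
  end

  def respond_to_missing?(name, include_private = false)
    !@moved && @payload.respond_to?(name, include_private)
  end
end

class ToyChannel
  def initialize
    @queue = Queue.new
  end

  def transfer_membership(handle)
    @queue << Handle.new(handle.release!)
  end

  def receive
    @queue.pop
  end
end

ch = ToyChannel.new
obj = Handle.new(String.new("foo"))
p obj.upcase
ch.transfer_membership(obj)
begin
  obj.upcase
rescue ToyMovedError => e
  puts "Error: #{e.message}"
end
p ch.receive.upcase  # the receiving side gets a live handle
```

Unlike the slide, `p(obj)` would not raise in this sketch, because `inspect` is defined on the wrapper itself.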

rst · on Sept 8, 2016

Slides also note that treatment of some state in pre-existing Ruby code, e.g., instance variables of class objects, gets messy. (Class variables are listed as per-Guild, but a fair amount of Ruby code uses instance vars on class objects instead.)

_ko1 · on Sept 8, 2016

So we need to rewrite to support multi-guilds application.

bad_user · on Sept 8, 2016

So guilds are effectively OS-managed processes?

audunw · on Sept 8, 2016

No, the heap is shared among guilds. See the last slide.

Moving data between guilds is cheap because data does not have to be copied. Referencing frozen (immutable) data is cheap too.

It seems it will track ownership of objects to make sure guilds don't access other guilds data. But it doesn't seem that it uses OS-level data protection.

pmontra · on Sept 8, 2016

From the slides I got the idea that there is only one OS process with guilds and possibly threads within guilds. But it could be that the language doesn't care and that's an implementation detail. We'll see.

azr79 · on Sept 8, 2016

sounds reasonable to me

_ko1 · on Sept 8, 2016

Could you link to http://www.atdot.net/~ko1/activities/2016_rubykaigi.pdf ? current one is on temporary file space (will be removed soon).

tenderlove · on Sept 8, 2016

Apparently I can't change the link. I'm sorry! :-(

lake99 · on Sept 8, 2016

The mods can do it. I once left a comment about changing the link. They saw it and changed it on their own. I don't know how to notify them though.

scott_s · on Sept 8, 2016

The mods will tend to see questions in threads on the main page, but the way to contact them directly is hn@ycombinator.com. (From the guidelines, https://news.ycombinator.com/newsguidelines.html)

yxhuvud · on Sept 8, 2016

I'm reasonably certain they are notified if the article is flagged, but I suppose it may not be the correct way of doing it.

nitrogen · on Sept 8, 2016

AIUI flagging the article can also affect its ranking or kill it entirely, so it's probably not the best way.

kent1 · on Sept 8, 2016

I worked on a similar proposal during my PhD thesis. It is formalized for a Java-like language and implemented in the Jikes RVM. We also carried out a proof of isolation using Coq.

https://tel.archives-ouvertes.fr/tel-00933072

mattnewton · on Sept 8, 2016

That looks really useful. If you have time, please chime in on the proposal!

kent1 · on Sept 9, 2016

The ownership check is required for each access to an object. However, it is straightforward to see that successive checks of the same object can be optimized out if the object has not been passed to another owner. In this thesis I describe dynamic and static analyses to remove the unnecessary checks.

jph · on Sept 8, 2016

The key points IMHO:

1. This Ruby 3 proposal says that Ruby 2 compatibility is mission critical, therefore this proposal rejects concurrency solutions from other languages (e.g. Erlang) and concepts (e.g. functions) and data structures (e.g. immutable collections).

2. Instead the proposal is to create a fast copy-on-write with rules to "deep freeze" some kinds of objects and primitives into an immutable sharable state.

nateberkopec · on Sept 8, 2016

> This Ruby 3 proposal says that Ruby 2 compatibility is mission critical

Matz has been very public about his fear of a "Python 3" situation occurring in the Ruby community.

awj · on Sept 8, 2016

And rightly so, I should think. Given the presence of languages like Elixir and Go, creating a situation where you are breaking people's code to introduce multicore programming systems is a pretty bad idea.

I can easily see how people might (rightly or wrongly) say "Ruby 3 broke my code, I'm rewriting in Go".

readittwice · on Sept 8, 2016

Hmm, I am wondering how moving ownership would work in a GC'ed system. You could have arbitrarily many references to the moved object (or subobjects). The slides say that an exception is thrown if an object of a different guild is accessed, but doesn't that mean that Ruby needs to check the guild at every object access?

Transferring ownership would probably also mean that Ruby not only needs to move one object but probably all subobjects recursively as well. I assume here that "moving" just means updating the guild field for each object.

Is this really feasible or wouldn't just copying the object be faster... I don't know of any system with gc that uses moving to transfer mutable objects between threads. Do such systems exist? Are there better ways of implementing this?

chrisseaton · on Sept 8, 2016

> The slides say that an exception is thrown if an object of a different guild is accessed, but doesn't that mean that Ruby needs to check the guild at every object access?

Ruby is already checking the class of the object on every access. You could combine the guild and the class into a tuple and compare against that instead, so it adds no extra overhead.

There is a paper at OOPSLA this year on doing just that http://2016.splashcon.org/event/splash-2016-oopsla-efficient...
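A toy illustration of the combined guard, written in plain Ruby. It assumes the (guild, class) pair is interned into one small integer so the call site does a single comparison; `SHAPES`, `ToyObj`, `CallSite` and `GuardMiss` are made-up names, and nothing here reflects how MRI or any other VM is actually built.

```ruby
# Toy model only: SHAPES, ToyObj, CallSite and GuardMiss are made-up names.
class GuardMiss < StandardError; end

# Intern each [guild, class] pair as one small integer id.
SHAPES = Hash.new { |h, pair| h[pair] = h.size }

ToyObj = Struct.new(:shape_id, :value)

def make(guild, klass, value)
  ToyObj.new(SHAPES[[guild, klass]], value)
end

class CallSite
  def initialize(guild, klass, &fast_path)
    @expected = SHAPES[[guild, klass]]
    @fast_path = fast_path
  end

  def call(obj)
    # A single integer comparison checks class and guild together.
    raise GuardMiss, "wrong guild or class" unless obj.shape_id == @expected
    @fast_path.call(obj.value)
  end
end

site = CallSite.new(:g1, String) { |s| s.upcase }
p site.call(make(:g1, String, "foo"))
begin
  site.call(make(:g2, String, "foo"))  # same class, wrong guild
rescue GuardMiss => e
  puts "guard miss: #{e.message}"
end
```

The point of the sketch is only that the guild test can ride along with a class test the interpreter already performs at each call site.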

aardvark179 · on Sept 8, 2016

I can't comment on that paper since it doesn't appear to be available publicly yet, so I'm just going to talk about guilds as proposed.

Adding a guild word to each object header is certainly a way to check ownership, and should be a cheap check to perform in the interpreter, but will obviously add some extra overhead to standard program execution.

The thing that concerns me is that explicit ownership passing can introduce as many bugs as it solves. If I have two objects A and B, with A holding a reference to B, then I can freeze A and freely pass it between guilds, but if I try and touch B I'll get an error until that too has been frozen or its ownership transferred. The same problem occurs with explicit ownership transfer of a non-frozen A, which leaves you with the slower option of a deep-copy or a recursive ownership transfer which can have equally unexpected consequences.

The "Ruby global data" slide also gives me the screaming heebie-jeebies, as did finding stack overflow answers on how to unfreeze objects in MRI. I'm sure nothing will go wrong. :-)

Having said all that, it probably can work nicely for the common use cases of balancing requests between a group of worker guilds where the request is a simple data structure whose ownership can be safely transferred, but it would be hard to do a general work stealing solution that was always safe.

parenthephobia · on Sept 8, 2016

> Adding a guild word to each object header is certainly a way to check ownership, and should be a cheap check to perform in the interpreter, but will obviously add some extra overhead to standard program execution

It needn't be done this way. When an object is invalidated it could have its class pointer sneakily changed into a special "invalid object" class. Any attempt to do anything concrete with the object would be rebuffed, but normal object accesses wouldn't be changed.

> The thing that concerns me is that explicit ownership passing can introduce as many bugs as it solves. If I have two objects A and B, with A holding a reference to B, then I can freeze A and freely pass it between guilds, but if I try and touch B I'll get an error until that too has been frozen or its ownership transferred.

At least you get an error. Ultimately, the only alternative with comparable performance is sharing mutable references. That avoids this specific problem but is open to the full assortment of problems that can occur with concurrent mutable state, many of which aren't automatically detectable in principle.

> The "Ruby global data" slide also gives me the screaming heebie-jeebies, as did finding stack overflow answers on how to unfreeze objects in MRI. I'm sure nothing will go wrong. :-)

If this proposal is adopted it's a simple matter to prohibit unfreezing objects that have been shared. :)

Not that it would really be necessary. If you're reaching into MRI's internals to unfreeze an object, then it's up to you to make sure that things don't break.

Roboprog · on Sept 8, 2016

Things might generate an error, but they shouldn't just quietly munge stuff. God knows how many times I've had to clean up Java servlet stuff that cross-talked between threads. Yuck.

I really hope they make this work in Ruby 3. If you program in a "functional" style anyway, this approach ("relaxed Erlang" style messaging) should fit nicely, as you would not be mutating things willy-nilly anyway. Of course, FP practices are a hard sell to the "OOP is the one true abstraction" (COBOL with encapsulated DATA DIVISIONs) crowd.

chrisseaton · on Sept 8, 2016

If you want to read the paper now I'm sure the authors will email you a draft.

VeejayRampay · on Sept 8, 2016

Hi Chris, you've mentioned that you didn't like discussing design choices made or attempted to be made in Ruby in the past, preferring to focus on the technical side of implementing the language, so no worries if you don't want to weigh in on this, but how do you feel about this proposal in general, in terms of feasibility and upside/downside of the technique described?

chrisseaton · on Sept 8, 2016

Yes I think it looks like a very sensible way forward. It will let the VM share and optimise a lot behind the scenes but provide the illusion of clean isolation between the parallel guilds.

VeejayRampay · on Sept 8, 2016

Glad to hear it. And thanks for all the hard work on the language I use every day.

Roboprog · on Sept 8, 2016

Hmm. I was under the impression that when you transferred the membership of an object, the "guild-local" or local lexical scope variable would be nulled out. Little to check if that's the case.

ergl · on Sept 8, 2016

That's exactly what Pony (http://www.ponylang.org/) does. It has a gc, with all the actors in the system communicating through shared memory. It uses the concept of 'capabilities' to check the owner of any given reference and disallow read or write permissions to other objects / actors.

rurban · on Sept 8, 2016

Well, yes. But the pony capabilities system checks ownership at compile-time already and has a much faster GC and smaller objects (actors), while in Ruby 3 you defer the deadlock or race errors to run-time.

_ko1 · on Sept 8, 2016

It's magic of transferring membership. I omitted details on slides.

transfire · on Sept 8, 2016

Hope they improve the syntax, it looks horrid -- code in strings and all.

_ko1 · on Sept 8, 2016

That's because of a current limitation. We'll improve it.

masterleep · on Sept 8, 2016

How would you use this to parallelize Rails requests? I guess you would need a pool of guilds, each with its own set of controllers, etc.

Since the requests would not be in the "main" guild, it might be painful to call into gems.

artellectual · on Sept 8, 2016

I guess you could boot up a pool of guilds in your process or, better yet, generate them on demand as requests come in, and kill each guild off once its request has been processed, since the request object shouldn't be shared.

It all really depends on how much overhead there is to create and destroy guilds. If it's easy then ideally you could start 100s of guilds or 1000s should your hardware allow it.

I see guilds as a subprocess with its own isolated resources.
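A rough sketch of the pool idea in plain Ruby. Threads stand in for guilds and Queues for `Guild::Channel`; `serve` is a made-up helper, and each request is deep-copied on the way in to mimic the "copy" send mode.

```ruby
# Toy model only: Threads stand in for guilds and Queues for channels.
def serve(paths, pool_size: 3)
  requests = Queue.new
  responses = Queue.new

  workers = Array.new(pool_size) do |id|
    Thread.new do
      # pop returns nil once the queue is closed and drained, ending the loop.
      while (req = requests.pop)
        responses << { worker: id, path: req[:path], body: "handled #{req[:path]}" }
      end
    end
  end

  # Deep-copy each request so no worker shares mutable state with the sender.
  paths.each { |path| requests << Marshal.load(Marshal.dump({ path: path })) }
  requests.close
  workers.each(&:join)

  Array.new(paths.size) { responses.pop }
end

results = serve(%w[/users /posts /comments /login])
p results.map { |r| r[:path] }.sort
```

Which worker handles which request depends on scheduling, so only the set of handled paths is stable from run to run.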

pmontra · on Sept 8, 2016

Ideally guilds could be equivalent to lightweight processes at application level (not OS), much like in Erlang. Then they could be scheduled to run concurrently using OS threads (multiple guilds per thread) and take advantage of multiple cores. That's part of BEAM, the Erlang VM. I think it's going to take a while.

_ko1 · on Sept 8, 2016

Similar to Erlang process, but more heavy weight (because it creates OS thread per Guild).

_ko1 · on Sept 8, 2016

More precisely, one OS thread is made per Ruby thread, and creating a Guild makes one Ruby thread.

ivoras · on Sept 8, 2016

Ok, "guilds"? Is the principle behind this so much different than everything done before that it requires repurposing a completely new word?

On par with Rust's "crates". Gives the impression some people just want to be remembered as inventing names.

dragonwriter · on Sept 8, 2016

> "guilds"? Is the principle behind this so much different than everything done before that it requires repurposing a completely new word?

Pretty much. I mean, if there is a standard name for a thing between a process and a thread that is not a thread group, I haven't heard it.

DougBarth · on Sept 8, 2016

If I'm reading this proposal correctly, locks will still be needed within multithreaded guilds to guard mutations against complex object graphs.

Here's my reasoning. Since the GVL is insufficient to guard against data races on Ruby 2, under the guild system, locks would be needed to guard against concurrency issues if multiple threads are present.

It would seem like the intention would be to replace usages of Thread with Guild to avoid the concurrency issues inherent with threaded code. Will there be API support to create a Guild that only allows a single thread?

dragonwriter · on Sept 8, 2016

> locks will still be needed within multithreaded guilds

It seems to me that is the intent; that is, any Ruby code that exists now is single-guild Ruby 3 code -- if it's multithreaded, it needs locks, for the same reason it does now.

> It would seem like the intention would be to replace usages of Thread with Guild to avoid the concurrency issues inherent with threaded code

I think that'll be a common use case, though running what amount to multiple "legacy" Ruby 2 multithreaded systems in separate Guilds in the same Ruby 3 process seems also to be an intended supported use case.

> Will there be API support to create a Guild that only allows a single thread?

It certainly sounds like a good idea.

nateberkopec · on Sept 8, 2016

> If I'm reading this proposal correctly, locks will still be needed within multithreaded guilds to guard mutations against complex object graphs.

That is correct. You'll still need to use locks if doing multi-threading inside of Guilds.

It looks like Guilds can have 1 to X threads.
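For reference, this is the kind of locking that would still be needed among threads that share a guild. It is ordinary Ruby threading code; `count_with_lock` is a made-up helper name.

```ruby
# Several threads updating shared state: the read-modify-write in
# `counter += 1` is not atomic, so it is guarded with a Mutex.
def count_with_lock(threads: 4, per_thread: 1_000)
  counter = 0
  lock = Mutex.new

  workers = Array.new(threads) do
    Thread.new do
      per_thread.times do
        lock.synchronize { counter += 1 }
      end
    end
  end
  workers.each(&:join)

  counter
end

p count_with_lock
```

With the lock in place the result is always `threads * per_thread`, regardless of how the threads are scheduled.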

DanWaterworth · on Sept 8, 2016

This is interesting. It doesn't mention GC, but since frozen objects can be shared between guilds, I assume the GC remains global. Perhaps this will trigger interest in immutable datastructures in ruby.

_ko1 · on Sept 8, 2016

Quoted from slides:

> GC/Heap

> * Share it. Do stop the world parallel marking- and lazy concurrent sweeping.

> * Synchronize only at page acquire timing. No any synchronization at creation time.

DanWaterworth · on Sept 9, 2016

I stand corrected.

zeckalpha · on Sept 8, 2016

How does this compare to the ongoing efforts to remove the GIL in Python? It looks like the Ruby GVL would stay, but be scoped to a Guild, rather than a Process?

the_mitsuhiko · on Sept 8, 2016

The proposal in the PDF looks like what I tried to implement many years ago in Python but I gave up in agony due to some stupid design decisions in Python (in particular non heap types and how type checks in the c level work).

Python's attempts to remove the GIL are not going anywhere really.

sanxiyn · on Sept 8, 2016

What do you think of PyParallel's approach of "removing" GIL?

http://pyparallel.org/

claudiug · on Sept 8, 2016

maybe you should give them some insight :) So they don't follow the same path of painful mistakes :)

viraptor · on Sept 8, 2016

Are there any ongoing efforts? There's pypy trying to use STM, but don't know of any other attempts - definitely not in cpython.

zeckalpha · on Sept 10, 2016

Yes. There was an excellent talk at Pycon 2016, entitled the GILectomy: https://www.youtube.com/watch?v=P3AyI_u66Bw

artellectual · on Sept 8, 2016

Anyone correct me if I'm wrong here.

Seems like a guild is just a subprocess with its own resources. And you copy objects over as needed. And when the guild is done it will get garbage collected. Like other objects.

the_mitsuhiko · on Sept 8, 2016

> Seems like a guild is just a subprocess with its own resources.

In an ideal world the guild is the interpreter state which would be very far from processes. How far down you can go there largely depends on what promises the API made to C extensions and other things in the past.

Someone · on Sept 8, 2016

I think I would consider implementing it as 1:1 threading where every thread=guild runs its own set of green threads.

That likely would be faster than having OS threads in each guild that use OS locks to prevent running >1 of them concurrently.

_ko1 · on Sept 8, 2016

CRuby/MRI supports C extensions, which can use TLS (thread-local storage). So each Ruby thread runs on one OS thread.

_ko1 · on Sept 8, 2016

Like sub-process, but share many things like bytecodes (ISeq in MRI context), class and module objects (and method tables) and so on. Also we can share immutable objects (deeply frozen objects) like threads.

Roboprog · on Sept 8, 2016

The Guild concept feels much like a mix of actors and messaging from Erlang and Go. Not as restrictive as Erlang, but not as permissive as Go. (there was a nod to Elixir/Erlang in the slides)

_ko1 · on Sept 8, 2016

Yes.

Actually, I think Go is a kind of multi-threading: a goroutine is only a useful mechanism on top of "threads" and can't help avoid the difficulties of multi-threading (but this design helps to reduce them).

Roboprog · on Sept 18, 2016

I would like to thank the team behind this proposal, very much, by the way. I have not tinkered with Ruby in a few years, but this looks like a major breakthrough.

pmontra · on Sept 8, 2016

Yes, it becomes a guild global lock.

sciurus · on Sept 8, 2016

This reminds me in some ways of Eric Snow's (rejected, afaik) proposal to extend "subinterpreters" to allow parallelism in Python.

https://lwn.net/Articles/650489/

jellymann · on Sept 13, 2016

The PDF appears to have been removed. I'm getting a "Not Found" page.

gamesbrainiac · on Sept 8, 2016

Any idea where the video of the talk is?

steveklabnik · on Sept 8, 2016

Ruby Kaigi has just started, so I'm guessing it will be a while.

claudiug · on Sept 8, 2016

do we have any date for this new way of doing concurrency in ruby?

pkmiec · on Sept 8, 2016

the new concurrency is part of ruby 3. matz says he wishes for it to be out by 2020. but who knows :).

cutler · on Sept 8, 2016

So 4 years to go. Not quite Perl 6 but it could be a bit late in the day considering the rate at which Ruby is losing mindshare.

porges · on Sept 8, 2016

I ♥ COM