The GIL and MRI

In the last chapter, you learned the key differences between concurrency and parallelism. This is an important concept because MRI allows concurrent execution of Ruby code, but prevents parallel execution of Ruby code.

I’ll pre-emptively answer a question I’m sure you have: do JRuby and Rubinius have a GIL? Do they allow parallel execution of Ruby code? Neither JRuby nor Rubinius has a GIL, and both allow parallel execution of Ruby code.

At this point, the term GIL might be unfamiliar to you. Let’s tease it apart.

The global lock

The term GIL stands for Global Interpreter Lock. It’s sometimes called the GVL (Global VM Lock) or just The Global Lock. All three of these terms refer to the same thing. I’ll continue to use the term GIL from now on.

So what’s the GIL? The GIL is a global lock around the execution of Ruby code.

Think of it this way: there is one, and only one, GIL per instance of MRI (or per MRI process). This means that if you spawn a bunch of MRI processes from your terminal, each one will have its own GIL.

If one of those MRI processes spawns multiple threads, that group of threads will share the GIL for that process.
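A sketch of the process-level picture (assuming a Unix-like system where fork(2) is available): each forked child is a separate MRI instance with its own GIL, so the kernel is free to run the children’s Ruby code on separate cores at once.

```ruby
# Each child process is an independent copy of MRI with its own GIL.
pids = 2.times.map do
  fork do
    100_000.times { Math.sqrt(rand) }  # CPU-bound Ruby work
  end
end

# Wait for both children; they may have run truly in parallel,
# because each one held its own, separate GIL.
pids.each { |pid| Process.wait(pid) }
```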

If one of these threads wants to execute some Ruby code, it will have to acquire this lock. One, and only one, thread can hold the lock at any given time. While one thread holds the lock, other threads need to wait for their turn to acquire the lock.
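The behaviour can be sketched with an ordinary Ruby Mutex. This is conceptual only: the real GIL is a lock inside MRI’s C internals, not a Ruby object, and the constant name GIL here is made up.

```ruby
# Hypothetical stand-in for MRI's real, internal lock.
GIL = Mutex.new

execution_log = []

threads = 3.times.map do |i|
  Thread.new do
    # Acquire the "GIL" before running our (pretend) Ruby code.
    GIL.synchronize do
      # Only one thread can be inside this block at a time; the
      # others sleep until the mutex becomes available again.
      execution_log << i
    end
  end
end
threads.each(&:join)

execution_log.sort  # all three threads ran, but only one at a time
```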

This has some very important implications for MRI. The biggest implication is that Ruby code will never run in parallel on MRI. The GIL prevents it.

An inside look at MRI

I want to walk you through the viewpoint of a thread trying to execute some Ruby code inside the MRI virtual machine. This will give you a better understanding of exactly how the GIL works and where it fits in.

Let’s pretend that MRI is executing this bit of code:

require 'digest/md5'

3.times.map do
  Thread.new do
    Digest::MD5.hexdigest(rand.to_s)
  end
end.each(&:value)

Nothing too exciting. Each thread will generate an MD5 digest based on a random number.

For simplicity’s sake, I’m going to skip right to the interesting part and assume that the threads have already been spawned. We’ll jump in assuming we have three threads spawned and ready to execute their block of Ruby code.

Remember that each MRI thread is backed by a native thread, and from the kernel’s point of view, they’re all executing in parallel. The GIL is a detail inside of MRI and doesn’t come into play except when executing Ruby code.

Since all three threads want to execute Ruby code, they all attempt to acquire the GIL. The GIL is implemented as a mutex (something you’ll see more of very soon). The operating system will guarantee that one, and only one, thread can hold the mutex at any time.

So all three threads attempt to acquire the GIL, but only one thread actually acquires it. The other two threads are put to sleep until that mutex becomes available again.

The thread that acquired the GIL (let’s call it Thread A), now has the exclusive right to execute Ruby code inside of MRI. Until Thread A releases the GIL, the other threads won’t get a chance to execute any Ruby code.

At this point, Thread A executes some arbitrary amount of Ruby code. How much? That’s unspecified, and left up to the MRI internals. After a certain interval, Thread A releases the GIL. This triggers the thread scheduler to wake up the other two threads that were sleeping, waiting on this mutex.

Now both of these threads vie for the GIL, and the kernel must decide again which thread will acquire it.

It’s given to a new thread; let’s call it Thread B. Now Thread B has exclusive ownership of the GIL. It can execute the Ruby code that it needs to. Meanwhile, Thread A has gone back and attempted to acquire the GIL again. So once again, the other two threads are sleeping, blocked waiting for their turn to acquire the GIL.

This should make it crystal clear how the GIL prevents parallel execution of Ruby code. It’s only possible for one thread to execute Ruby code at any given time.
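To see the implication concretely, compare CPU-bound work done serially against the same work split across two threads. On MRI the threaded version is no faster, because the GIL serializes the Ruby code; on JRuby or Rubinius you would expect it to take roughly half the time. The workload and iteration count here are arbitrary.

```ruby
require 'benchmark'
require 'digest/md5'

# A lambda doing pure CPU-bound Ruby work.
work = ->(n) { n.times { Digest::MD5.hexdigest('gil') } }

# The same total work, done serially...
serial = Benchmark.realtime { 2.times { work.call(50_000) } }

# ...and split across two threads.
threaded = Benchmark.realtime do
  2.times.map { Thread.new { work.call(50_000) } }.each(&:join)
end

# On MRI, threaded is roughly equal to serial: no parallel speedup.
```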

The special case: blocking IO

Before I get to the motivations for this behaviour, I have to tell you about a special case: blocking IO. I’ve been saying the GIL prevents parallel execution of Ruby code, but blocking IO is not Ruby code.

In the above walkthrough, what happens when a thread executes some Ruby code that blocks on IO? Let’s say our example looked like this:

require 'open-uri'

3.times.map do
  Thread.new do
    URI.open('http://zombo.com')
  end
end.each(&:value)

This Ruby code will trigger an HTTP request to be sent to the zombo.com server. Depending on network conditions and the status of zombo.com, this may take a long time to finish. Thankfully, MRI doesn’t let a thread hog the GIL when it hits blocking IO.

This is a no-brainer optimization for MRI. When a thread is blocked waiting for IO, it won’t be executing any Ruby code. Hence, when a thread is blocking on IO, it releases the GIL so another thread can execute Ruby code.
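You can observe this release behaviour directly. Here sleep stands in for any blocking call: while a thread sleeps it executes no Ruby code, so MRI releases the GIL and all three waits overlap.

```ruby
start = Time.now

3.times.map do
  Thread.new { sleep 0.2 }  # blocking call: the GIL is released here
end.each(&:join)

elapsed = Time.now - start
# elapsed is close to 0.2 seconds, not 0.6: the three waits overlapped
```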

For completeness, let’s quickly run through the earlier walkthrough with this blocking IO example. We’re already at the point where all the threads have been spawned. Now they all attempt to acquire the GIL to execute Ruby code.

Thread A gets the GIL. It starts executing Ruby code. It gets down to Ruby’s Socket APIs and attempts to open a connection to zombo.com. At this point, while Thread A is waiting for its response, it releases the GIL. Now Thread B acquires the GIL and goes through the same steps.

Meanwhile, Thread A is still waiting for its response. Remember that the threads can execute in parallel, so long as they’re not executing Ruby code. So it’s quite possible for Thread A and Thread B to both have initiated their connections, and both be waiting for a response.

Under the hood, each thread is using a ppoll(2) system call to be notified when their connection attempt succeeds or fails. When the ppoll(2) call returns, the socket will have some data ready for consumption. At this point, the threads will need to execute Ruby code to process the data. So now the whole process starts over again. I think you get the idea.

Why?

This walkthrough has painted a bit of a bleak picture of MRI internals. It seems that MRI is intentionally placing a huge restriction on parallel code execution. Why would it do this?

Surely, this isn’t meant as a malicious decision against you. Indeed, MRI core developers have been calling the GIL a feature for some time now, rather than a bug. In other words, the MRI team has expressed no intention of getting rid of the GIL.

There are three reasons that the GIL exists:

  1. To protect MRI internals from race conditions

    I’ve only covered this topic briefly thus far, but I’ve stressed that race conditions in your code can cause issues. The same issues that can happen in your Ruby code can happen in MRI’s C code. When MRI runs in a multi-threaded context, it needs to protect critical parts of its internals with some kind of synchronization mechanism.

    The easiest way to reduce the number of race conditions that can affect the internals is to prevent multiple threads from executing in parallel.

  2. To facilitate the C extension API

    The ‘C extension API’ is a C interface to MRI’s internal functions, often used when people want to interface Ruby with a C library.

    Calling an MRI function from C is subject to the same GIL that the equivalent Ruby code would be subject to.

    This Ruby code:

    array = Array.new
    array.pop
    

    is equivalent to this C code:

    VALUE ary = rb_ary_new();
    VALUE last_element = rb_ary_pop(ary);
    

    The important thing to note is that these calls into the C extension API are subject to the GIL, just like the corresponding Ruby code.

    The other reason the GIL exists for C extensions is so that MRI can function safely even in the presence of C extensions that may not be thread-safe. This matters especially when integrating an existing C library with Ruby, where thread safety is not always guaranteed.

  3. To reduce the likelihood of race conditions in your Ruby code

    Just as the easiest way to protect MRI internals from race conditions is to disallow real parallel threading, it’s also the easiest way to protect your Ruby code from race conditions. In this regard, the GIL reduces the likelihood that you shoot yourself in the foot when using multiple threads. However, the cost for this reduction in entropy is high.

    In many situations, Ruby provides a lot of power to you, the user of the language. It trusts that you will do the right thing, knowing that it’s possible to really get things wrong. However, when it comes to multi-threading, MRI takes the opposite approach. They remove the ability of the system to do parallel threading.

    It’s a bit like wearing full body armour to walk down the street: it really helps if you get attacked, but most of the time it’s just confining.

    It’s important to note that the GIL only reduces entropy here; it can’t rule it out altogether. The next section goes into more detail on this.

Misconceptions

Now that you have an understanding of what the GIL is, and why it exists, this is probably the most important section of this chapter.

Over time, people have made false assumptions and incredible claims about the GIL. Unfortunately, it’s a misunderstood part of MRI. This leaves many people with an unwitting impression of the GIL, negative or positive, without really understanding what it does or what it guarantees.

Up until now, I’ve given you some idea about how it works; now I’ll try to dispel two myths in the community.

Myth: the GIL guarantees your code will be thread-safe.

This isn’t true. Throughout this chapter, I’ve been careful to say that the GIL reduces the likelihood of a race condition, but can’t prevent it. When multiple threads are running in parallel, and a race condition is possible, the likelihood of it happening is higher. There are simply more opportunities for things to go wrong.

But with a GIL, two threads are never running Ruby code in parallel. This greatly reduces the likelihood of things going wrong, but still can’t prevent it. Let’s use this Ruby code as an example:

@counter = 0

5.times.map do
  Thread.new do
    temp = @counter
    temp = temp + 1

    @counter = temp
  end
end.each(&:join)

puts @counter

This probably isn’t code that you would write every day, but it illustrates the point. This is the expanded version of the += operator.

With no synchronization, even with a GIL, it’s possible that a context switch happens between incrementing temp and assigning it back to @counter. If that happens, two threads can assign the same value to @counter, and the final result of this little snippet could be less than 5.

It’s rare to get an incorrect answer from this snippet on MRI, but it’s almost guaranteed on JRuby or Rubinius. If you insert a puts in the middle of the block passed to Thread.new, it becomes very likely that even MRI will produce an incorrect result: MRI’s behaviour around blocking IO encourages a context switch while the thread waits for its write to stdout.
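As a preview of the tools covered in upcoming chapters, the conventional fix is to make the read-increment-write sequence atomic with a Mutex. With this synchronization in place, the snippet prints 5 on MRI, JRuby, and Rubinius alike.

```ruby
@counter = 0
@lock = Mutex.new

5.times.map do
  Thread.new do
    # The mutex makes the read-modify-write sequence atomic:
    # no context switch can interleave another thread's update.
    @lock.synchronize do
      temp = @counter
      temp = temp + 1

      @counter = temp
    end
  end
end.each(&:join)

puts @counter  # always 5
```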

As you get a better understanding of race conditions and thread safety in the upcoming chapters, the absurdity of the claim that the GIL guarantees thread safety will become clearer.

Myth: the GIL prevents concurrency.

This is a misunderstanding of terms.

The GIL prevents parallel execution of Ruby code, but it doesn’t prevent concurrent execution of Ruby code. Remember that concurrent code execution is possible even on a single core CPU by giving each thread a turn with the resources. This is the situation with MRI and the GIL.

So the GIL prevents parallel execution of Ruby code, but not concurrent execution.

The other important thing to remember is the caveat: blocking IO. The GIL allows multiple threads to be simultaneously blocked on IO. This means that you can use multiple threads to parallelize code that is IO-bound.

MRI’s multi-threading behaviour with respect to blocking IO is actually very similar to that of JRuby and Rubinius. All of these implementations will allow blocking IO to be performed in parallel.