Processes Are (CoW) Friendly

Let’s take a step back from looking at code for a minute to talk about a higher level concept and how it’s handled in different Ruby implementations.

Being CoW Friendly

As mentioned in the forking chapter, fork(2) creates a new child process that’s an exact copy of the parent process. This includes a copy of everything the parent process has in memory.

Physically copying all of that data can add considerable overhead, so modern Unix systems employ copy-on-write (CoW) semantics to avoid it.

As you may have guessed from the name, CoW delays the actual copying of memory until that memory needs to be written to.

So a parent process and a child process will actually share the same physical data in memory until one of them needs to modify it, at which point the kernel copies the affected memory pages so that proper separation between the two processes can be preserved.

arr = [1,2,3]

fork do
  # At this point the child process has been initialized.
  # Using CoW this process doesn't need to copy the arr variable.
  # Since it hasn't modified any shared values, it can continue reading
  # from the same memory location as the parent process.
  p arr
end

arr = [1,2,3]

fork do
  # At this point the child process has been initialized.
  # Because of CoW the arr variable hasn't been copied yet.
  arr << 4
  # The above line of code modifies the array, so a copy of
  # the array will need to be made for this process before
  # it can modify it. The array in the parent process remains
  # unchanged.
end
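
CoW itself happens inside the kernel and can't be observed directly from Ruby, but the separation it preserves can. The following is a minimal sketch that sends the child's view of the array back to the parent over a pipe; the method name array_after_child_write is made up for this illustration.

```ruby
# Returns [parent_view, child_view] of an array after the child appends to it.
# (Hypothetical helper, named for this illustration only.)
def array_after_child_write
  arr = [1, 2, 3]
  reader, writer = IO.pipe

  pid = fork do
    reader.close
    arr << 4                         # the write lands in the child's memory only
    writer.write(Marshal.dump(arr))  # report the child's view to the parent
    writer.close
  end

  writer.close
  child_view = Marshal.load(reader.read)
  reader.close
  Process.wait(pid)

  [arr, child_view]
end

parent_view, child_view = array_after_child_write
p parent_view  # => [1, 2, 3]
p child_view   # => [1, 2, 3, 4]
```

The parent's array is untouched even though the child appended to what started out as the same memory.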

This is a big win when using fork(2) as it saves on resources. It means that fork(2) is fast, since it doesn’t need to copy the parent’s physical memory up front. It also means that child processes only get their own copy of the data they modify; the rest can be shared.

To benefit from CoW semantics, you need a Ruby implementation that is written in such a way that it doesn’t clobber this feature provided by the kernel. Versions of MRI >= 2.0 are written in such a way that they respect and preserve these semantics. Versions of MRI <= 1.9 did not preserve them.

But how?

MRI's garbage collector uses a 'mark-and-sweep' algorithm. In a nutshell this means that when the GC is invoked it must traverse the graph of live objects, and for each one the GC must 'mark' it as alive.

In MRI <= 1.9, this 'mark' step was implemented as a modification to that object in memory. So when the GC was invoked right after a fork, all live objects were modified, forcing the OS to make copies of all live Ruby objects and forgoing any benefit from CoW semantics.

MRI >= 2.0 still uses a mark-and-sweep GC, but preserves CoW semantics by storing all of the 'marks' in a small data structure in a separate region of memory. So when the GC runs after a fork, only this small region of memory must be copied, and the graph of live Ruby objects can be shared between parent and child until your code modifies an object.
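
The difference between the two marking strategies can be sketched with a toy model. This is not MRI's implementation: the Node class and both method names are invented for illustration, and a Ruby Set stands in for the separate region that holds the marks.

```ruby
require "set"

# Toy stand-in for a heap object. Not how MRI represents objects.
class Node
  attr_reader :name, :children
  attr_accessor :marked # used only by the in-object strategy

  def initialize(name, children = [])
    @name = name
    @children = children
    @marked = false
  end
end

def build_graph
  Node.new("root", [Node.new("a"), Node.new("b")])
end

# Strategy A: the mark is a flag inside each object, so marking
# writes to every live object. Returns the number of objects written.
def mark_in_object(root)
  writes = 0
  stack = [root]
  until stack.empty?
    node = stack.pop
    next if node.marked
    node.marked = true
    writes += 1
    stack.concat(node.children)
  end
  writes
end

# Strategy B: marks live in a separate structure, so the objects
# themselves are only read. Returns the set of marked object ids.
def mark_in_side_table(root)
  marks = Set.new
  stack = [root]
  until stack.empty?
    node = stack.pop
    next unless marks.add?(node.object_id) # add? returns nil if already marked
    stack.concat(node.children)
  end
  marks
end

puts mark_in_object(build_graph)          # 3 objects written
puts mark_in_side_table(build_graph).size # 3 marks recorded, 0 objects written
```

Both strategies find the same three live objects. Only the first one writes to them, and after a fork it is those writes that would force the kernel to copy the memory holding each object.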

What does this mean for you?

If you’re building something, or using tools, that depend heavily on fork(2), you should expect much better memory utilization with MRI 2.0 or later than with earlier versions.
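
One pattern that stands to benefit is building data once in the parent and then forking workers that only read it. The sketch below assumes a made-up lookup table and hypothetical helper names (build_table, run_readonly_workers). How much memory actually stays shared depends on the Ruby version and on what the workers go on to do.

```ruby
# Hypothetical lookup table, built once in the parent before forking.
def build_table(size)
  Array.new(size) { |i| i * 2 }
end

# Forks worker_count children that only read the inherited table.
# Each child reports one entry back over its own pipe.
def run_readonly_workers(table, worker_count)
  workers = worker_count.times.map do |index|
    reader, writer = IO.pipe
    pid = fork do
      reader.close
      writer.write(table[index].to_s) # read-only access to inherited data
      writer.close
    end
    writer.close
    [pid, reader]
  end

  workers.map do |pid, reader|
    result = reader.read.to_i
    reader.close
    Process.wait(pid)
    result
  end
end

p run_readonly_workers(build_table(1_000), 3) # => [0, 2, 4]
```

Because the workers never write to the table, nothing in this code forces it to be duplicated for each child.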