Zombie Processes

At the beginning of the last chapter we looked at an example that used a child process to handle a task asynchronously, in a fire-and-forget manner. We need to revisit that example and ensure that we clean up that child process appropriately, lest it become a zombie!

Good Things Come to Those Who wait(2)

In the last chapter I showed that the kernel queues up status information about child processes that have exited. So even if you call Process.wait long after the child process has exited, its status information is still available. I’m sure you can smell a problem here…

The kernel will retain the status of exited child processes until the parent process requests that status using Process.wait. If the parent never requests the status then, for as long as the parent keeps running, the kernel can never reap that status information. So creating fire-and-forget child processes without collecting their status information is a poor use of kernel resources.
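To make the retention concrete, here is a minimal sketch. The helper name reap_late is made up for illustration. It lets a child sit unreaped for half a second, collects its status anyway, and then shows that a second wait fails because the kernel has let go of the entry.

```ruby
# `reap_late` is a hypothetical helper, not part of Ruby.
def reap_late(exit_code, delay = 0.5)
  # exit! skips at_exit handlers, keeping the child as simple as possible.
  pid = fork { exit!(exit_code) }

  # The child has exited long before this sleep finishes...
  sleep delay

  # ...yet the kernel has held on to its status for us.
  _, status = Process.wait2(pid)

  # Once the status is collected the kernel forgets the child,
  # so waiting on the same pid again raises Errno::ECHILD.
  begin
    Process.wait(pid)
    still_known = true
  rescue Errno::ECHILD
    still_known = false
  end

  [status.exitstatus, still_known]
end

p reap_late(3)   # prints [3, false]
```

The half second between exit and wait is exactly the period during which the child is taking up a slot in the kernel’s process table.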

If you’re not going to wait for a child process to exit using Process.wait (or the technique described in the next chapter) then you need to ‘detach’ that child process. Here’s the fire-and-forget example from the last chapter, corrected to properly detach the child process:

message = 'Good Morning'
recipient = 'tree@mybackyard.com'

pid = fork do
  # In this contrived example the parent process forks a child to take
  # care of sending data to the stats collector. Meanwhile the parent
  # process has continued on with its work of sending the actual payload.

  # The parent process doesn't want to be slowed down with this task, and
  # it doesn't matter if this would fail for some reason.
  StatsCollector.record message, recipient
end

# This line ensures that the process performing the stats collection
# won't become a zombie.
Process.detach(pid)

What does Process.detach do? It simply spawns a new thread whose sole job is to wait for the child process specified by pid to exit. This ensures that the kernel doesn’t hang on to any status information we don’t need.
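Since the return value of Process.detach is that waiting thread, you can still ask it for the child’s status later if you change your mind. A small sketch, where detach_and_report is a made-up helper name:

```ruby
# `detach_and_report` is a hypothetical helper, not part of Ruby.
def detach_and_report(exit_code)
  pid = fork { exit!(exit_code) }

  # Process.detach hands back the thread that is waiting on the child.
  watcher = Process.detach(pid)

  # Thread#value blocks until the child exits, then returns its
  # Process::Status.
  [watcher.is_a?(Thread), watcher.value.exitstatus]
end

p detach_and_report(7)   # prints [true, 7]
```

In true fire-and-forget code you would simply ignore the returned thread, as the example above does.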

What Do Zombies Look Like?

# Create a child process that exits after 1 second.
pid = fork { sleep 1 }
# Print its pid.
puts pid
# Put the parent process to sleep so we can inspect the 
# process status of the child
sleep 5

Run the following command at a terminal, using the pid printed by the last snippet, after the child has exited (about one second in) but before the parent wakes up and exits. It will print the status of that zombie process. The status should say ‘Z’ or ‘Z+’, meaning that the process is a zombie.

ps -ho pid,state -p [pid of zombie process]

In The Real World

Notice that any dead process whose status hasn’t been waited on is a zombie process. So every child process that dies while its parent is still active will be a zombie, if only for a short time. Once the parent process collects the status from the zombie, the zombie effectively disappears and no longer consumes kernel resources.
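That lifecycle can be watched from inside Ruby as well. The sketch below assumes a ps that accepts -o state= (the trailing = suppresses the header line); process_state is a made-up helper name.

```ruby
# Ask ps for the state of a process. Returns nil when ps prints
# nothing, which is what happens once the pid is gone.
# `process_state` is a hypothetical helper, not part of Ruby.
def process_state(pid)
  out = `ps -o state= -p #{pid}`.strip
  out.empty? ? nil : out
end

pid = fork { exit!(0) }
sleep 0.5                 # give the child time to exit

# Dead but not yet waited on: ps should report a state beginning with Z.
puts process_state(pid).inspect

Process.wait(pid)         # collect the status, reaping the zombie

# Reaped: ps no longer has an entry for the pid.
puts process_state(pid).inspect
```

The exact string in the first line of output varies by platform, which is why the text above mentions both ‘Z’ and ‘Z+’.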

It’s fairly uncommon to fork child processes in a fire-and-forget manner, never collecting their status. If work needs to be offloaded to the background it’s much more common to do that with a dedicated background queueing system.

That being said, there is a RubyGem called spawnling (https://github.com/tra/spawnling) that provides this exact functionality. Besides providing a generic API over processes or threads, it ensures that fire-and-forget processes are properly detached.

System Calls

There’s no system call for Process.detach because it’s implemented in Ruby simply as a thread and Process.wait. The implementation in Rubinius is stark in its simplicity.
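Taking that description at face value, a hand-rolled equivalent might look like the sketch below. This is not the MRI or Rubinius source, just an illustration of the idea, and detach_by_hand is a made-up name.

```ruby
# A hypothetical stand-in for Process.detach, for illustration only.
def detach_by_hand(pid)
  Thread.new do
    # Block this thread (and only this thread) until the child exits,
    # then hand back its Process::Status as the thread's value.
    _, status = Process.wait2(pid)
    status
  end
end

pid = fork { exit!(0) }
watcher = detach_by_hand(pid)

# The parent is free to carry on; joining here is only for the demo.
puts watcher.value.success?   # prints true
```

Everything interesting happens in the wait call inside the thread, which is why there is no dedicated system call to point to.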