Processes Can Wait
In the examples of fork(2) up until now we have let the parent process continue on in parallel with the child process. In some cases this led to weird results, such as when the parent process exited before the child process.
That kind of scenario is really only suitable for one use case: fire and forget. It’s useful when you want a child process to handle something asynchronously, but the parent process still has its own work to do.
message = 'Good Morning'
recipient = 'tree@mybackyard.com'
fork do
  # In this contrived example the parent process forks a child to take
  # care of sending data to the stats collector. Meanwhile the parent
  # process has continued on with its work of sending the actual payload.
  # The parent process doesn't want to be slowed down with this task, and
  # it doesn't matter if this would fail for some reason.
  StatsCollector.record message, recipient
end
# send message to recipient
Babysitting
For most other use cases involving fork(2) you’ll want some way to keep tabs on your child processes. In Ruby, one technique for this is provided by Process.wait. Let’s rewrite our orphan-inducing example from the last chapter to perform with fewer surprises.
fork do
  5.times do
    sleep 1
    puts "I am an orphan!"
  end
end
Process.wait
abort "Parent process died..."
This time the output will look like:
I am an orphan!
I am an orphan!
I am an orphan!
I am an orphan!
I am an orphan!
Parent process died...
Not only that, but control will not be returned to the terminal until all of the output has been printed.
So what does Process.wait do? Process.wait is a blocking call instructing the parent process to wait for one of its child processes to exit before continuing.
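To see the blocking behaviour for yourself, here is a minimal sketch that times how long the parent spends inside Process.wait. The variable names are mine, chosen for illustration, and the one-second sleep is an arbitrary stand-in for real work in the child.

```ruby
# Note the time just before forking (a monotonic clock avoids wall-clock jumps).
started_at = Process.clock_gettime(Process::CLOCK_MONOTONIC)

child_pid = fork do
  # The child stays alive for about one second.
  sleep 1
end

# The parent stops here and does nothing else until the child exits.
waited_pid = Process.wait

elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - started_at
puts "Parent was blocked for roughly #{elapsed.round(2)}s waiting on #{waited_pid}"
```

The parent cannot reach the final puts until the child is done, so the elapsed time it reports should be about as long as the child's sleep.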
Process.wait and Cousins
I mentioned something key in that last statement: Process.wait blocks until any one of its child processes exits. If you have a parent that’s babysitting more than one child process and you’re using Process.wait, you need to know which one exited. For this, you can use the return value.
Process.wait returns the pid of the child that exited. Check it out.
# We create 3 child processes.
3.times do
  fork do
    # Each one sleeps for a random number of seconds less than 5.
    sleep rand(5)
  end
end
3.times do
  # We wait for each child process to exit and print the pid that
  # gets returned.
  puts Process.wait
end
Communicating with Process.wait2
But wait! Process.wait has a cousin called Process.wait2!
Why the name confusion? It makes sense once you know that Process.wait returns 1 value (pid), but Process.wait2 returns 2 values (pid, status).
This status can be used as communication between processes via exit codes. In our chapter on Exit Codes we mentioned that you can use exit codes to encode information for other processes. Process.wait2 gives you direct access to that information.
The status returned from Process.wait2 is an instance of Process::Status. It has a lot of useful information attached to it for figuring out exactly how a process exited.
# We create 5 child processes.
5.times do
  fork do
    # Each generates a random number. If even they exit
    # with a 111 exit code, otherwise they use a 112 exit code.
    if rand(5).even?
      exit 111
    else
      exit 112
    end
  end
end
5.times do
  # We wait for each of the child processes to exit.
  pid, status = Process.wait2
  # If the child process exited with the 111 exit code
  # then we know they encountered an even number.
  if status.exitstatus == 111
    puts "#{pid} encountered an even number!"
  else
    puts "#{pid} encountered an odd number!"
  end
end
Communication between processes without the filesystem or network!
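The exit code is only one of the things a Process::Status can tell you. As a rough sketch, the snippet below compares a child that exits on its own with one that is killed by a signal. Killing the child with Process.kill is my own addition to produce the second case; it isn't part of the examples above.

```ruby
# First child: exits normally with a custom exit code.
fork { exit 3 }
_, exited = Process.wait2

# Second child: sleeps forever until the parent kills it with SIGKILL.
victim = fork { sleep }
Process.kill(:KILL, victim)
_, killed = Process.wait2

puts exited.exited?             # true: the child called exit
puts exited.exitstatus          # 3
puts exited.success?            # false: only exit code 0 counts as success

puts killed.signaled?           # true: the child was ended by a signal
puts killed.termsig             # the signal number, same as Signal.list["KILL"]
puts killed.exitstatus.inspect  # nil: the child never got to call exit
```

Since only one child is alive at a time here, each Process.wait2 call can only return the status of the child forked just before it.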
Waiting for Specific Children
But wait! The Process.wait cousins have two more cousins: Process.waitpid and Process.waitpid2.
You can probably guess what these do. They function the same as Process.wait and Process.wait2 except, rather than waiting for any child to exit, they only wait for a specific child to exit, specified by pid.
favourite = fork do
  exit 77
end
middle_child = fork do
  abort "I want to be waited on!"
end
pid, status = Process.waitpid2 favourite
puts status.exitstatus
Although it appears that Process.wait and Process.waitpid provide different behaviour, don't be fooled! They are actually aliased to the same thing. Both will accept the same arguments and behave the same.
You can pass a pid to Process.wait in order to get it to wait for a specific child, and you can pass -1 as the pid to Process.waitpid to get it to wait for any child process.
The same is true for Process.wait2 and Process.waitpid2.
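Here is a small sketch of that interchangeability, deliberately using each method the "wrong" way round. The variable names and exit codes are arbitrary choices for illustration.

```ruby
first  = fork { exit 10 }
second = fork { exit 20 }

# Process.wait2 accepts a pid, so it can wait for one specific child...
pid_a, status_a = Process.wait2(second)

# ...and Process.waitpid2 accepts -1, so it can wait for any child.
# Only `first` is left at this point, so that is the one it returns.
pid_b, status_b = Process.waitpid2(-1)

puts "#{pid_a} exited with #{status_a.exitstatus}"
puts "#{pid_b} exited with #{status_b.exitstatus}"
```

This works, but it reads backwards, which is exactly why it's worth picking the method name that matches what you mean.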
Just like with Process.pid vs. $$, I think it's important that, as programmers, we use the provided tools to reveal our intent where possible. Although these methods are identical, you should use Process.wait when you're waiting for any child process and use Process.waitpid when you're waiting for a specific process.
Race Conditions
As you look at these simple code examples you may start to wonder about race conditions.
What if the code that handles one exited process is still running when another child process exits? What if I haven’t gotten back around to Process.wait and another process exits? Let’s see:
# We create two child processes.
2.times do
  fork do
    # Both processes exit immediately.
    abort "Finished!"
  end
end
# The parent process waits for the first process, then sleeps for 5 seconds.
# In the meantime the second child process has exited and is no
# longer running.
puts Process.wait
sleep 5
# The parent process asks to wait once again, and amazingly enough, the second
# process' exit information has been queued up and is returned here.
puts Process.wait
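As you can see, this technique is free from race conditions. The kernel holds on to the information about each exited process until the parent asks for it, so none of it gets lost. Just don't read too much into the ordering: when several children have already exited by the time the parent calls Process.wait, the order in which they are reported is not guaranteed to match the order in which they exited.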
So even if the parent is slow at processing each exited child it will always be able to get the information for each exited child when it’s ready for it.
Take note that calling any variant of Process.wait when there are no child processes will raise Errno::ECHILD. It's always a good idea to keep track of how many child processes you have created so you don't encounter this exception.
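As a quick sketch of both halves of that advice, the snippet below keeps a count of the children it forks, waits exactly that many times, and then makes one wait too many to show the exception. The counter and the other variable names are hypothetical.

```ruby
# Keep a running count of every child we create.
child_count = 0
2.times do
  fork { exit 0 }
  child_count += 1
end

# Wait exactly as many times as there are children.
reaped = []
child_count.times { reaped << Process.wait }

# One wait too many: there is nothing left to wait for.
begin
  Process.wait
  raised = false
rescue Errno::ECHILD
  raised = true
  puts "No child processes left to wait for"
end
```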
In the Real World
The idea of looking in on your child processes is at the core of a common Unix programming pattern. The pattern is sometimes called babysitting processes, master/worker, or preforking.
At the core of this pattern is the concept that you have one process that forks several child processes, for concurrency, and then spends its time looking after them: making sure they are still responsive, reacting if any of them exit, etc.
For example, the Unicorn web server (http://unicorn.bogomips.org) employs this pattern. You tell it how many worker processes you want it to start up for you, 5 for instance.
Then a unicorn process will boot up that will fork 5 child processes to handle web requests. The parent (or master) process maintains a heartbeat with each child and ensures that all of the child processes stay responsive.
This pattern allows for both concurrency and reliability. Read more about Unicorn in its Appendix at the end of the book.
For an alternative usage of this technique read through the Lookout class in the attached Spyglass project.
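To make the shape of the babysitting pattern concrete, here is a heavily simplified sketch. It is not how Unicorn or Spyglass are implemented: the worker body, the pool size, and the cap on replacements are all assumptions made so the example is short and terminates on its own. A real master would loop indefinitely and would also check that its workers are responsive.

```ruby
worker_count = 3  # hypothetical pool size

# Hypothetical worker: a real one would sit in a loop handling requests.
start_worker = lambda do
  fork do
    sleep 0.1
    exit 0
  end
end

# The master forks its initial pool of workers and remembers their pids.
workers = Array.new(worker_count) { start_worker.call }
respawned = 0

# Babysit: whenever a worker exits, start a replacement.
# This sketch stops after two replacements so that it finishes.
while respawned < 2
  pid, status = Process.wait2
  workers.delete(pid)
  puts "worker #{pid} exited with #{status.exitstatus}, starting a replacement"
  workers << start_worker.call
  respawned += 1
end

# Shut down cleanly by waiting for the workers that are still around.
workers.size.times { Process.wait }
```

The pid returned from Process.wait2 is what lets the master tell which worker went away, so it can keep its list of workers accurate.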
System Calls
Ruby’s Process.wait and cousins map to waitpid(2).