Processes Can Get Signals
In the last chapter we looked at Process.wait. It provides a nice way for a parent process to keep tabs on its child processes. However it is a blocking call: it will not return until a child process dies.
What’s a busy parent to do? Not every parent has the luxury of waiting around on their children all day. There is a solution for the busy parent! And it’s our introduction to Unix signals.
Trapping SIGCHLD
Let’s take a simple example from the last chapter and rewrite it for a busy parent process.
child_processes = 3
dead_processes = 0
# We fork 3 child processes.
child_processes.times do
  fork do
    # They sleep for 3 seconds.
    sleep 3
  end
end
# Our parent process will be busy doing some intense mathematics.
# But still wants to know when one of its children exits.
# By trapping the :CHLD signal our process will be notified by the kernel
# when one of its children exits.
trap(:CHLD) do
  # Since Process.wait queues up any data that it has for us we can ask for it
  # here, since we know that one of our child processes has exited.
  puts Process.wait
  dead_processes += 1
  # We exit explicitly once all the child processes are accounted for.
  exit if dead_processes == child_processes
end
# Work it.
loop do
  (Math.sqrt(rand(44)) ** 8).floor
  sleep 1
end
SIGCHLD and Concurrency
Before we go on I must mention a caveat. Signal delivery is unreliable. By this I mean that if your code is handling a CHLD signal while another child process dies you may or may not receive a second CHLD signal.
This can lead to inconsistent results with the code snippet above. Sometimes the timing will be such that things will work out perfectly, and sometimes you’ll actually ‘miss’ an instance of a child process dying.
This behaviour only happens when receiving the same signal several times in quick succession; you can always count on at least one instance of the signal arriving. This same caveat is true for other signals you handle in Ruby; read on to hear more about those.
To properly handle CHLD you must call Process.wait in a loop and look for as many dead child processes as are available, since you may have received multiple CHLD signals since entering the signal handler. But... isn’t Process.wait a blocking call? If there’s only one dead child process and I call Process.wait again how will I avoid blocking the whole process?
Now we get to the second argument to Process.wait. In the last chapter we looked at passing a pid to Process.wait as the first argument, but it also takes a second argument, flags. One such flag that can be passed tells the kernel not to block if no child has exited. Just what we need!
There’s a constant that represents the value of this flag, Process::WNOHANG, and it can be used like so:
Process.wait(-1, Process::WNOHANG)
Easy enough.
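To see the difference between the two calls, here is a minimal sketch. The method name nonblocking_wait_demo is made up for illustration, and it waits on one specific pid rather than -1 so that the result is easy to follow. While the child is still running the non-blocking call should return nil right away; the plain call blocks until the child exits.

```ruby
# Hypothetical helper, not from the original text.
def nonblocking_wait_demo
  # Fork a child that stays alive for a moment.
  pid = fork { sleep 0.5 }

  # The child is still running, so the non-blocking wait returns nil
  # immediately instead of blocking.
  early = Process.wait(pid, Process::WNOHANG)

  # A plain, blocking wait returns the pid once the child has exited.
  late = Process.wait(pid)

  [pid, early, late]
end

pid, early, late = nonblocking_wait_demo
puts "child #{pid}: early=#{early.inspect} late=#{late}"
```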
Here’s a rewrite of the code snippet from the beginning of this chapter that won’t ‘miss’ any child process deaths:
child_processes = 3
dead_processes = 0
# We fork 3 child processes.
child_processes.times do
  fork do
    # They sleep for 3 seconds.
    sleep 3
  end
end
# Sync $stdout so the call to #puts in the CHLD handler isn't
# buffered. Buffered output can cause a ThreadError if a signal
# handler is interrupted after calling #puts. It's always a good
# idea to do this if your handlers will be doing IO.
$stdout.sync = true
# Our parent process will be busy doing some intense mathematics.
# But still wants to know when one of its children exits.
# By trapping the :CHLD signal our process will be notified by the kernel
# when one of its children exits.
trap(:CHLD) do
  # Since Process.wait queues up any data that it has for us we can ask for it
  # here, since we know that one of our child processes has exited.
  # We loop over a non-blocking Process.wait to ensure that any dead child
  # processes are accounted for.
  begin
    while (pid = Process.wait(-1, Process::WNOHANG))
      puts pid
      dead_processes += 1
    end
  rescue Errno::ECHILD
    # No child processes are left to wait for.
  end
end
loop do
  # We exit ourself once all the child processes are accounted for.
  exit if dead_processes == child_processes
  sleep 1
end
One more thing to remember is that Process.wait, even this variant, will raise Errno::ECHILD if no child processes exist. Since signals might arrive at any time it’s possible for the last CHLD signal to arrive after the previous CHLD handler has already called Process.wait twice and gotten the last available status. This asynchronous stuff can be mind-bending. Any line of code can be interrupted with a signal. You’ve been warned!
So you must handle the Errno::ECHILD exception in your CHLD signal handler. Also if you don’t know how many child processes you are waiting on you should rescue that exception and handle it properly.
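One way to keep that rescue in a single place is to pull the reaping loop into a helper. This is only a sketch: reap_exited_children is a hypothetical name, and treating Errno::ECHILD as "nothing left to reap" is one reasonable way to handle it, not the only one.

```ruby
# Hypothetical helper: reap every child that has already exited,
# without blocking. Returns the pids that were reaped.
def reap_exited_children
  reaped = []
  begin
    while (pid = Process.wait(-1, Process::WNOHANG))
      reaped << pid
    end
  rescue Errno::ECHILD
    # No child processes exist at all, so there is nothing to wait for.
  end
  reaped
end

# With no children this rescues Errno::ECHILD and reports nothing reaped.
p reap_exited_children

fork { exit }
sleep 0.2 # give the child time to exit
p reap_exited_children
```

A CHLD handler could then call the helper and count the pids it returns.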
Signals Primer
This was our first foray into Unix signals. Signals are a form of asynchronous communication. When a process receives a signal from the kernel it can do one of the following:
- ignore the signal
- perform a specified action
- perform the default action
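In Ruby all three choices go through trap. The sketch below is illustrative only: the method name is made up, and USR1 and USR2 were picked because they are the least likely to get in the way of anything else. It ignores one signal, attaches a block to another, sends both to the current process, and finally hands both back to their default actions.

```ruby
# Hypothetical helper, not from the original text.
def try_signal_actions
  received = []

  # Perform a specified action: run this block when USR2 arrives.
  trap(:USR2) { received << :USR2 }

  # Ignore the signal: an untrapped USR1 would terminate the process.
  trap(:USR1, "IGNORE")

  Process.kill(:USR1, Process.pid) # ignored
  Process.kill(:USR2, Process.pid) # handled by the block above

  # Handlers run asynchronously, so wait until ours has fired.
  sleep 0.01 while received.empty?
  received
ensure
  # Perform the default action: hand both signals back to the system.
  trap(:USR1, "DEFAULT")
  trap(:USR2, "DEFAULT")
end

p try_signal_actions
```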
Where do Signals Come From?
Technically signals are sent by the kernel, just like text messages are sent by a cell phone carrier. But text messages have an original sender, and so do signals. Signals are sent from one process to another process, using the kernel as a middleman.
The original purpose of signals was to specify different ways that a process should be killed. Let’s start there.
Let’s start up two ruby programs and we’ll use one to kill the other.
For these examples we won't use irb because it defines its own signal handlers that get in the way of our demonstrations. Instead we'll just use the ruby program itself.
Give this a try: launch the ruby program without any arguments. Enter some code. Hit Ctrl-D.
This executes the code that you entered and then exits.
Start up two ruby processes using the technique mentioned above and we’ll kill one of them using a signal.
- In the first ruby session execute the following code:
    puts Process.pid
    sleep # so that we have time to send it a signal
- In the second ruby session issue the following command to kill the first session with a signal:
    Process.kill(:INT, <pid of first session>)
So the second process sent an “INT” signal to the first process, causing it to exit. “INT” is short for “INTERRUPT”.
The system default when a process receives this signal is that it should interrupt whatever it’s doing and exit immediately.
The Big Picture
Below is a table showing signals commonly supported on Unix systems. Every Unix process will be able to respond to these signals, and any signal can be sent to any process that the sender has permission to signal.
When naming signals the SIG portion of the name is optional. The Action column in the table describes the default action for each signal:
- Term means that the process will terminate immediately
- Core means that the process will terminate immediately and dump core (a snapshot of the process's memory)
- Ign means that the process will ignore the signal
- Stop means that the process will stop (i.e. pause)
- Cont means that the process will resume (i.e. unpause)
Signal      Value     Action   Comment
-------------------------------------------------------------------------
SIGHUP         1       Term    Hangup detected on controlling terminal
                               or death of controlling process
SIGINT         2       Term    Interrupt from keyboard
SIGQUIT        3       Core    Quit from keyboard
SIGILL         4       Core    Illegal Instruction
SIGABRT        6       Core    Abort signal from abort(3)
SIGFPE         8       Core    Floating point exception
SIGKILL        9       Term    Kill signal
SIGSEGV       11       Core    Invalid memory reference
SIGPIPE       13       Term    Broken pipe: write to pipe with no readers
SIGALRM       14       Term    Timer signal from alarm(2)
SIGTERM       15       Term    Termination signal
SIGUSR1   30,10,16     Term    User-defined signal 1
SIGUSR2   31,12,17     Term    User-defined signal 2
SIGCHLD   20,17,18     Ign     Child stopped or terminated
SIGCONT   19,18,25     Cont    Continue if stopped
SIGSTOP   17,19,23     Stop    Stop process
SIGTSTP   18,20,24     Stop    Stop typed at tty
SIGTTIN   21,21,26     Stop    tty input for background process
SIGTTOU   22,22,27     Stop    tty output for background process
The signals SIGKILL and SIGSTOP cannot be trapped, blocked, or ignored.
This table might seem a bit out of left field, but it gives you a rough idea of what to expect when you send a certain signal to a process. You can see that, by default, most of the signals terminate a process.
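Where the table lists several values for one signal, the number depends on the platform. Rather than memorizing them you can ask Ruby which numbers apply on the machine you are using. A small sketch using Signal.list, which maps signal names (without the SIG prefix) to their numbers; the selection of names printed here is arbitrary:

```ruby
# Signal.list returns a Hash of signal name => signal number
# for the platform this interpreter is running on.
signals = Signal.list

%w[HUP INT QUIT KILL TERM USR1 USR2 CHLD].each do |name|
  puts format("SIG%-5s %d", name, signals[name])
end
```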
It’s interesting to note the SIGUSR1 and SIGUSR2 signals. These are signals whose action is meant specifically to be defined by your process. We’ll see shortly that we’re free to redefine any of the signal actions that we please, but those two signals are meant for your use.
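As a sketch of what "meant for your use" can look like, the example below gives USR1 the made-up meaning "please report in". Everything about the protocol (the method name, the pipe, the "ready" line, the wording of the report) is an assumption for illustration. Note that the handler only sets a flag, and the real work happens outside the handler.

```ruby
# Hypothetical example: the parent asks a child for a report via USR1.
def ask_child_for_report
  reader, writer = IO.pipe

  child = fork do
    reader.close
    report_requested = false

    # Give USR1 a meaning of our own. The handler only records the request.
    trap(:USR1) { report_requested = true }

    writer.puts "ready" # tell the parent the handler is installed
    sleep 0.05 until report_requested
    writer.puts "report from #{Process.pid}"
  end

  writer.close
  reader.gets # block until the child is ready
  Process.kill(:USR1, child)
  report = reader.gets.chomp
  reader.close
  Process.wait(child)
  [child, report]
end

child, report = ask_child_for_report
puts "child #{child} said: #{report}"
```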
Redefining Signals
Let’s go back to our two ruby sessions and have some fun.
- In the first ruby session use the following code to redefine the behaviour of the INT signal:
    puts Process.pid
    trap(:INT) { print "Na na na, you can't get me" }
    sleep # so that we have time to send it a signal
  Now our process won’t exit when it receives the INT signal.
- In the second ruby session issue the following command and notice that the first process is taunting us!
    Process.kill(:INT, <pid of first session>)
- You can try using Ctrl-C to kill that first session, and notice that it responds the same!
- But as the table said there are some signals that cannot be redefined. SIGKILL will show that guy who’s boss.
    Process.kill(:KILL, <pid of first session>)
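The same experiment can be run from a single script by forking the stubborn process. This is a sketch under a few assumptions: the method name and the pipe used for the "ready" handshake are invented here, and the 0.2 second pause is just a generous guess at how long the child needs to react to INT.

```ruby
# Hypothetical helper: fork a child that shrugs off INT, then KILL it.
def kill_stubborn_child
  reader, writer = IO.pipe

  child = fork do
    reader.close
    trap(:INT) { }      # redefine INT so that it no longer exits
    writer.puts "ready" # tell the parent the handler is installed
    loop { sleep 1 }
  end

  writer.close
  reader.gets # block until the child is ready
  reader.close

  Process.kill(:INT, child)
  sleep 0.2 # give the child a chance to (not) die
  survived_int = Process.wait(child, Process::WNOHANG).nil?

  # KILL cannot be trapped, so this always works.
  Process.kill(:KILL, child)
  Process.wait(child)
  [survived_int, $?.termsig]
end

survived_int, termsig = kill_stubborn_child
puts "survived INT: #{survived_int}, terminated by signal #{termsig}"
```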
Ignoring Signals
- In the first ruby session use the following code:
    puts Process.pid
    trap(:INT, "IGNORE")
    sleep # so that we have time to send it a signal
- In the second ruby session issue the following command and notice that the first process isn’t affected.
    Process.kill(:INT, <pid of first session>)
The first ruby session is unaffected.
Signal Handlers are Global
Signals are a great tool and are the perfect fit for certain situations. But it’s good to keep in mind that trapping a signal is a bit like using a global variable: you might be overwriting something that some other code depends on. And unlike global variables, signal handlers can’t be namespaced.
So make sure you read this next section before you go and add signal handlers to all of your open source libraries :)
Being Nice about Redefining Signals
There is a way to preserve handlers defined by other Ruby code, so that your signal handler won’t trample any other ones that are already defined. It looks something like this:
trap(:INT) { puts 'This is the first signal handler' }
old_handler = trap(:INT) {
  old_handler.call
  puts 'This is the second handler'
  exit
}
sleep 5 # so that we have time to send it a signal
Just send it a Ctrl-C to see the effect. Both signal handlers are called.
Now let’s see if we can preserve the system default behaviour. Hit the code below with a Ctrl-C.
system_handler = trap(:INT) {
  puts 'about to exit!'
  system_handler.call
}
sleep 5 # so that we have time to send it a signal
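:/ It blew up that time. For the system default, trap hands back a string such as "DEFAULT" rather than something callable, so system_handler.call raises a NoMethodError. So we can’t preserve the system default behaviour with this technique, but we can preserve other Ruby code handlers that have been defined.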
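You can check what trap hands back before deciding whether it is safe to call. In this sketch current_handler is a hypothetical helper: it swaps in a throwaway handler just long enough to see what was there and then puts the original back. The exact string returned for a default handler is an MRI detail and may differ between signals and Ruby versions.

```ruby
# Hypothetical helper: peek at the handler currently installed for a signal.
def current_handler(signal)
  previous = trap(signal) { }
  # Put things back. A nil return means the previous handler was not
  # installed by Ruby; falling back to "DEFAULT" is an assumption here.
  trap(signal, previous || "DEFAULT")
  previous
end

handler = current_handler(:INT)
puts "previous INT handler: #{handler.inspect}"
puts "callable? #{handler.respond_to?(:call)}"
```

A handler that was defined with a block comes back as a Proc, which is why the first example could call old_handler.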
In terms of best practices your code probably shouldn't define any signal handlers unless it's a server, that is, a long-running process that's booted from the command line. It's very rare that library code should trap a signal.
# The 'friendly' method of trapping a signal.
old_handler = trap(:QUIT) {
  # do some cleanup
  puts 'All done!'
  old_handler.call if old_handler.respond_to?(:call)
}
This handler for the QUIT signal will preserve any previous QUIT handlers that have been defined. Though this looks ‘friendly’ it’s not generally a good idea. Imagine a scenario where a Ruby server tells its users they can send it a QUIT signal and it will do a graceful shutdown. You tell the users of your library that they can send a QUIT signal and it will draw an ASCII rainbow. Now if a user sends the QUIT signal both handlers will be invoked. This violates the expectations of both libraries.
Whether or not you decide to preserve previously defined signal handlers is up to you; just make sure you know why you’re doing it. If you simply want to wire up some behaviour to clean up resources before exiting you can use an at_exit hook, which we touched on in the chapter about exit codes.
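Here is a rough sketch of the at_exit approach. It assumes MRI's default handling of TERM, under which the interpreter unwinds and runs at_exit hooks before terminating. The child script, the CHILD_SCRIPT constant and the method name are all made up for the demonstration, and a fresh interpreter is spawned so that the hook is the only one registered.

```ruby
require "rbconfig"

# The script run by the child interpreter: it registers a cleanup hook
# instead of trapping a signal, then waits to be terminated.
CHILD_SCRIPT = <<~'RUBY'
  $stdout.sync = true
  at_exit { puts "cleaned up" }
  puts "ready"
  sleep
RUBY

# Hypothetical helper: start the child, send it TERM, and return
# whatever it printed on the way out.
def cleanup_output_after_term
  IO.popen([RbConfig.ruby, "-e", CHILD_SCRIPT]) do |child|
    child.gets # wait for "ready"
    Process.kill(:TERM, child.pid)
    child.read
  end
end

puts cleanup_output_after_term
```

No handler was trapped, so nothing that other code depends on could have been overwritten.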
When Can’t You Receive Signals?
Your process can receive a signal anytime. That’s the beauty of them! They’re asynchronous.
Your process can be pulled out of a busy for-loop into a signal handler, or even out of a long sleep. Your process can even be pulled from one signal handler to another if it receives one signal while processing another. But, as expected, it will always go back and finish the code in all the handlers that are invoked.
In the Real World
With signals, any process can communicate with any other process on the system, so long as it knows its pid and has permission to signal it. This makes signals a very powerful communication tool. It’s common to send signals from the shell using kill(1).
In the real world signals are mostly used by long running processes like servers and daemons. And for the most part it will be the human users who are sending signals rather than automated programs.
For instance, the Unicorn web server (http://unicorn.bogomips.org) responds to the INT signal by killing all of its processes and shutting down immediately. It responds to the USR2 signal by re-executing itself for a zero-downtime restart. It responds to the TTIN signal by incrementing the number of worker processes it has running.
See the SIGNALS file included with Unicorn for a full list of the signals it supports and how it responds to them.
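To get a feel for how a server might wire up a signal like TTIN, here is a toy sketch. It is not Unicorn's code: the method name, the pending list and the worker counter are all invented for illustration, and no workers are actually forked. The handler only records that the signal arrived, and the main loop does the bookkeeping.

```ruby
# Hypothetical master loop: each TTIN means "add one worker".
def scale_up(times)
  worker_count = 1
  pending = []

  # Keep the handler tiny: just record that the signal arrived.
  previous = trap(:TTIN) { pending << :TTIN }

  times.times do
    Process.kill(:TTIN, Process.pid)
    # Wait for the handler to run before sending the next signal, since
    # identical signals sent in quick succession may be merged.
    sleep 0.01 while pending.empty?
    pending.shift
    worker_count += 1 # a real server would fork a worker here
  end

  worker_count
ensure
  # Restore whatever was installed before we started.
  trap(:TTIN, previous || "DEFAULT")
end

puts scale_up(2)
```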
The memprof project has an interesting example of being a friendly citizen when handling signals.
System Calls
Ruby’s Process.kill maps to kill(2), and Kernel#trap maps roughly to sigaction(2). signal(7) is also useful.