Non-blocking IO

This chapter is about non-blocking IO. Note: this is different from asynchronous or evented IO. If you don’t know the difference, it should become clear as you progress through the rest of the book.

Non-blocking IO goes hand-in-hand with the next chapter on Multiplexing Connections, but we’ll look at this first in isolation because it can be useful on its own.

Non-blocking Reads

Do you remember a few chapters back when we looked at read? I noted that read blocked until it received EOF or was able to receive a minimum number of bytes. This may result in a lot of blocking when a client doesn’t send EOF. This blocking behaviour can be partly circumvented by readpartial, which returns any available data immediately. But readpartial will still block if there’s no data available. For a read operation that will never block you want read_nonblock.

Much like readpartial, read_nonblock requires an Integer argument specifying the maximum number of bytes to read. Remember that read_nonblock, like readpartial, might return fewer than the maximum number of bytes if that’s all that’s available. It works something like this:

require 'socket'

Socket.tcp_server_loop(4481) do |connection|
  loop do
    begin
      puts connection.read_nonblock(4096)
    rescue Errno::EAGAIN
      retry
    rescue EOFError
      break
    end
  end

  connection.close
end

Boot up the same client we used previously that never closes its connection:

$ tail -f /var/log/system.log | nc localhost 4481

Even when there’s no data being sent to the server the call to read_nonblock is still returning immediately. In fact, it’s raising an Errno::EAGAIN exception. Here’s what my manpages have to say about EAGAIN:

The file was marked for non-blocking I/O, and no data were ready to be read.

Makes sense. This differs from readpartial, which would have just blocked in that situation.
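If you’d like to see this without booting a separate client, here’s a minimal sketch that uses a connected socket pair inside one process. The pair (UNIXSocket.pair) is my stand-in for a client and server, not something from the example above. It also assumes a reasonably modern Ruby: the exception raised by read_nonblock carries the IO::WaitReadable module (Ruby 1.9.2 and later), and the exception: false keyword (Ruby 2.1 and later) returns a Symbol instead of raising.

```ruby
require 'socket'

# Assumption: a connected socket pair stands in for the client and server.
reader, writer = UNIXSocket.pair

# Nothing has been written yet, so there is nothing to read.
idle_error =
  begin
    reader.read_nonblock(4096)
    nil
  rescue IO::WaitReadable => e
    e
  end

# The keyword form reports the same condition with a Symbol.
idle_symbol = reader.read_nonblock(4096, exception: false)

# Once data is waiting, the same call hands it back straight away.
writer.write('hello')
IO.select([reader])
data = reader.read_nonblock(4096)

puts idle_error.class
puts idle_symbol.inspect
puts data
```

Rescuing IO::WaitReadable rather than a specific Errno class is a design choice: it covers whichever "would block" error the platform reports.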

So what should you do when you get this error and your socket would otherwise block? In this example we entered a busy loop and continued to retry over and over again. This was just for demonstration purposes and isn’t the proper way to do things.

The proper way to retry a blocked read is using IO.select:

begin
  connection.read_nonblock(4096)
rescue Errno::EAGAIN
  IO.select([connection])
  retry
end

This achieves the same effect as spamming read_nonblock with retry, but with fewer wasted cycles. Calling IO.select with an Array of sockets as the first argument will block until one of the sockets becomes readable. So, retry will only be called when the socket has data available for reading. We’ll cover IO.select in more detail in the next chapter.

In this example we’ve re-implemented a blocking read method using non-blocking methods. This, in itself, isn’t useful. But using IO.select gives the flexibility to monitor multiple sockets simultaneously or periodically check for readability while doing other work.
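To make the "periodically check for readability while doing other work" idea concrete, here’s a rough sketch. It relies on the optional fourth argument to IO.select, a timeout in seconds, after which the call returns nil. The socket pair, the ticks counter, and the write that happens on the third tick are all stand-ins I made up for the demonstration.

```ruby
require 'socket'

# Assumption: a socket pair stands in for a real connection.
reader, writer = UNIXSocket.pair
ticks = 0
data = nil

until data
  # Wait at most 50ms for the socket to become readable.
  ready = IO.select([reader], nil, nil, 0.05)

  if ready
    data = reader.read_nonblock(4096)
  else
    # Timed out: IO.select returned nil, so do a slice of other work.
    ticks += 1
    # Hypothetical peer that only sends something on the third tick.
    writer.write('ping') if ticks == 3
  end
end

puts "read #{data.inspect} after #{ticks} rounds of other work"
```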

When would a read block?

The read_nonblock method first checks Ruby’s internal buffers for any pending data. If there’s some there it’s returned immediately.

It then asks the kernel if there’s any data available for reading using select(2). If the kernel says that there’s some data available, whether it be in the kernel buffers or over the network, that data is then consumed and returned. Any other condition would cause a read(2) to block and, thus, raise an exception from read_nonblock.
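Here’s a small sketch of the first step, Ruby’s own buffer being consulted before the kernel. It assumes that the read(2) behind gets pulls in every byte that was available, which is typical for a small message but not something I’d promise. The socket pair is again just a stand-in for a real connection.

```ruby
require 'socket'

reader, writer = UNIXSocket.pair
writer.write("first line\nleftover")

# gets returns only the first line. The bytes after the newline
# typically end up sitting in Ruby's internal read buffer.
line = reader.gets

# This read is satisfied with the leftover bytes.
buffered = reader.read_nonblock(4096)

# Now there is nothing left anywhere, so this read would block.
empty = reader.read_nonblock(4096, exception: false)

puts line.inspect
puts buffered.inspect
puts empty.inspect
```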

Non-blocking Writes

Non-blocking writes have some very important differences from the write call we saw earlier. The most notable is that it’s possible for write_nonblock to return a partial write, whereas write will always take care of writing all of the data that you send it.

Let’s boot up a throwaway server using netcat to show this behaviour:

$ nc -l localhost 4481

Then we’ll boot up this client that makes use of write_nonblock:

require 'socket'

client = TCPSocket.new('localhost', 4481)
payload = 'Lorem ipsum' * 10_000

written = client.write_nonblock(payload)
puts written < payload.size # prints true when the write was partial

When I run those two programs against each other, I routinely see true being printed out from the client side. In other words it’s returning an Integer that’s less than the full size of the payload data. The write_nonblock method returned because it entered some situation where it would block, so it didn’t write any more data and returned an Integer, letting us know how much was written. It’s now our responsibility to write the rest of the data that remains unsent.

The behaviour of write_nonblock is the same as the write(2) system call. It writes as much data as it can and returns the number of bytes written. This differs from Ruby’s write method, which may call write(2) several times to write all of the data requested.

So what should you do when one call couldn’t write all of the requested data? Try again to write the missing portion, obviously. But don’t do it right away. If the underlying write(2) would still block then you’ll get an Errno::EAGAIN exception raised. The answer lies again with IO.select: it can tell us when a socket is writable, meaning it can write without blocking.

require 'socket'

client = TCPSocket.new('localhost', 4481)
payload = 'Lorem ipsum' * 10_000

begin
  loop do
    bytes = client.write_nonblock(payload)

    break if bytes >= payload.size
    payload.slice!(0, bytes)
    IO.select(nil, [client])
  end

rescue Errno::EAGAIN
  IO.select(nil, [client])
  retry
end

Here we make use of the fact that calling IO.select with an Array of sockets as the second argument will block until one of the sockets becomes writable.

The loop in the example deals properly with partial writes. When write_nonblock returns an Integer less than the size of the payload we slice that data from the payload and go back around the loop when the socket becomes writable again.
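One detail worth knowing about the loop above: write_nonblock reports bytes, while slice! counts characters. For a plain ASCII payload those are the same thing, but for multibyte text they aren’t. Here’s a sketch of one way to package the retry logic so that it counts in bytes. The write_fully method, its chunk_size parameter, and the consumer thread are all hypothetical names of my own, and the exception: false keyword assumes Ruby 2.1 or later.

```ruby
require 'socket'

# Hypothetical helper: keep calling write_nonblock until all of
# `data` has been handed to the kernel. Offsets are tracked in bytes.
def write_fully(socket, data, chunk_size = 64 * 1024)
  offset = 0

  while offset < data.bytesize
    chunk = data.byteslice(offset, chunk_size)
    result = socket.write_nonblock(chunk, exception: false)

    if result == :wait_writable
      # The write would block; wait until the socket is writable.
      IO.select(nil, [socket])
    else
      # Possibly a partial write; advance by what was accepted.
      offset += result
    end
  end

  offset
end

# Assumption: a socket pair and a reader thread stand in for a
# real client and server.
reader, writer = UNIXSocket.pair
payload = "caf\u00e9 " * 200_000 # 5 characters but 6 bytes per repeat

consumer = Thread.new { reader.read } # reads until EOF
sent = write_fully(writer, payload)
writer.close
received = consumer.value

puts "sent #{sent} bytes, received #{received.bytesize} bytes"
```

Writing in fixed-size chunks is just a convenience here; it avoids copying the whole remaining payload on every pass through the loop.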

When would a write block?

The underlying write(2) can block in two situations:

  1. The receiving end of the TCP connection has not yet acknowledged receipt of pending data, and we’ve sent as much data as is allowed. TCP’s congestion control algorithms limit how much unacknowledged data may be in flight so that the network is never flooded with packets. If the data is taking a long time to reach the receiving end of the TCP connection, the sender holds back rather than pushing more data than the network can handle.
  2. The receiving end of the TCP connection cannot yet handle more data. Even once the other end acknowledges receipt of the data it still must clear its data ‘window’ in order that it may be refilled with more data. This refers to the kernel’s read buffers. If the receiving end is not processing the data it’s receiving then TCP’s flow control will force the sending end to block until the receiver is ready for more data.
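The second situation is easy to provoke on your own machine. This sketch assumes that loopback TCP is available and lets the kernel pick a free port by binding to port 0. It writes until write_nonblock reports that it would block, then has the receiving end drain everything, then writes again. The 64KB chunk size is arbitrary, and the exception: false keyword assumes Ruby 2.1 or later.

```ruby
require 'socket'

server = TCPServer.new('127.0.0.1', 0) # port 0: any free port
writer = TCPSocket.new('127.0.0.1', server.addr[1])
reader = server.accept

chunk = 'x' * 65_536
sent = 0

# Nobody is reading from `reader`, so the kernel buffers fill up
# and eventually a write would block.
loop do
  result = writer.write_nonblock(chunk, exception: false)
  break if result == :wait_writable
  sent += result
end

# The receiving end finally consumes what was sent.
drained = 0
while drained < sent
  IO.select([reader])
  result = reader.read_nonblock(65_536, exception: false)
  drained += result.bytesize if result.is_a?(String)
end

# With the receiver's buffers cleared, the socket is writable again.
IO.select(nil, [writer])
after_drain = writer.write_nonblock(chunk, exception: false)

puts "blocked after #{sent} bytes, wrote #{after_drain} more after draining"
```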

Non-blocking Accept

There are non-blocking variants of other methods, too, besides read and write, though those two are the most commonly used.

accept_nonblock is very similar to a regular accept. Remember how I said that accept just pops a connection off of the listen queue? Well, if there’s nothing on that queue then accept would block. In this situation accept_nonblock would raise an Errno::EAGAIN rather than blocking.

Here’s an example:

require 'socket'

server = TCPServer.new(4481)

loop do
  begin
    connection = server.accept_nonblock
  rescue Errno::EAGAIN
    # do other important work
    retry
  end
end
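Rather than retrying in a tight loop, you can wait on the listening socket with IO.select, since a listening socket counts as readable once a connection is sitting on its queue. The sketch below assumes loopback TCP on a kernel-chosen port, and it creates its own client in the same process purely as a stand-in for a remote one. The exception: false keyword on accept_nonblock assumes Ruby 2.3 or later.

```ruby
require 'socket'

server = TCPServer.new('127.0.0.1', 0) # port 0: any free port
port = server.addr[1]

# The listen queue is empty, so there is nothing to accept yet.
nothing_yet = server.accept_nonblock(exception: false)

# Stand-in for a remote client connecting.
client = TCPSocket.new('127.0.0.1', port)

# Blocks until a connection is waiting on the listen queue.
IO.select([server])
connection = server.accept_nonblock

puts nothing_yet.inspect
puts connection.remote_address.ip_port == client.local_address.ip_port
```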

Non-blocking Connect

Think you can guess what the connect_nonblock method does by now? Then you’re in for a surprise! connect_nonblock behaves a bit differently than the other non-blocking IO methods.

Whereas the other methods either complete their operation or raise an appropriate exception, connect_nonblock leaves its operation in progress and raises an exception.

If connect_nonblock cannot make an immediate connection to the remote host, then it actually lets the operation continue in the background and raises Errno::EINPROGRESS to notify us that the operation is still in progress. In the next chapter we’ll see how we can be notified when this background operation completes. For now, a quick example:

require 'socket'

socket = Socket.new(:INET, :STREAM)
remote_addr = Socket.pack_sockaddr_in(80, 'google.com')

begin
  # Initiate a nonblocking connection to google.com on port 80.
  socket.connect_nonblock(remote_addr)
rescue Errno::EINPROGRESS
  # Operation is in progress.
rescue Errno::EALREADY
  # A previous nonblocking connect is already in progress.
rescue Errno::ECONNREFUSED
  # The remote host refused our connect.
end
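As a rough preview of how the background operation can be finished off, here’s a sketch of a common idiom: wait for the socket to become writable, then call connect_nonblock a second time to learn the outcome. It assumes a local server on a kernel-chosen port in place of google.com so that it runs without outside network access. It also assumes that the in-progress error carries the IO::WaitWritable module, which is the case on modern Rubies.

```ruby
require 'socket'

# Assumption: a local listener stands in for the remote host.
server = TCPServer.new('127.0.0.1', 0)
remote_addr = Socket.pack_sockaddr_in(server.addr[1], '127.0.0.1')

socket = Socket.new(:INET, :STREAM)

begin
  socket.connect_nonblock(remote_addr)
rescue IO::WaitWritable
  # The connect is in progress; the socket becomes writable once
  # the attempt has finished, successfully or not.
  IO.select(nil, [socket])

  begin
    # Ask again to find out how it went.
    socket.connect_nonblock(remote_addr)
  rescue Errno::EISCONN
    # Already connected, which is the outcome we wanted.
  end
end

connected_port = socket.remote_address.ip_port
puts "connected to port #{connected_port}"
```

If the attempt had failed, the second connect_nonblock would raise the relevant error (Errno::ECONNREFUSED, for example) instead.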