Buffering

Here we’ll answer a few key questions: How much data should I read/write with one call? If write returns successfully does that mean that the other end of the connection received my data? Should I split a big write into a bunch of smaller writes? What’s the impact?

Write Buffers

Let’s first talk about what really happens when you write data on a TCP connection.

When you call write and it returns, without raising an exception, this does not mean that the data has been successfully sent over the network and received by the client socket. When write returns, it acknowledges that you have left your data in the capable hands of Ruby’s IO system and the underlying operating system kernel.

There is at least one layer of buffers between your application code and actual network hardware. Let’s pinpoint where those are and then we’ll look at how to work around them.

When write returns successfully, the only guarantee you have is that the data has been buffered by the kernel. The kernel may send it across the network immediately, or it may hold on to it and combine it with other pending data for efficiency.
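Here's a runnable sketch of this behaviour using a loopback connection (the setup names here are my own, not anything special from the standard library). Note that write returns immediately, even though the other end hasn't read a single byte yet:

```ruby
require 'socket'

# Set up a local connection so the example is self-contained.
server = TCPServer.new('127.0.0.1', 0) # port 0: let the OS pick a free port
client = TCPSocket.new('127.0.0.1', server.addr[1])
remote = server.accept

# The remote end hasn't called read, but write still returns right away.
# Its return value is the number of bytes handed off to the kernel's
# buffers, not the number of bytes received by the peer.
bytes_buffered = client.write('gekko')
bytes_buffered # => 5

client.close
remote.close
server.close
```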

By default, Ruby sockets set sync to true. This skips Ruby’s internal buffering which would otherwise add another layer of buffers to the mix.
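You can verify this for yourself. A quick loopback check (again, the connection setup is just scaffolding for the example):

```ruby
require 'socket'

server = TCPServer.new('127.0.0.1', 0)
socket = TCPSocket.new('127.0.0.1', server.addr[1])

# sync is enabled by default on sockets, so writes skip Ruby's
# userspace buffer and go straight to the kernel.
sync_default = socket.sync # => true

socket.close
server.close
```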

Why buffer at all?

All layers of IO buffering are in place for performance reasons, and they usually offer big improvements.

Sending data across the network is slow, really slow. Buffering allows calls to write to return almost immediately. Then, behind the scenes, the kernel can collect all the pending writes, group them, and optimize when they're sent for maximum performance and to avoid flooding the network. At the network level, sending many small packets incurs a lot of overhead, so the kernel batches small writes together into larger ones.

How Much to Write?

Given what we now know about buffering we can pose this question again: should I do many small write calls or one big write call?

Thanks to buffers, we don’t really have to think about it much. In the general case, you’ll get the best performance from writing everything you have to write in one go and letting the kernel decide how to chunk or split the data. The exception is a really big payload, think large files, where you’d be better off writing the data in chunks rather than loading it all into memory first. Obviously, the only way to be certain is to profile your application.
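Here's one way to sketch the big-payload case. This example streams a large chunk of data through a loopback connection in fixed-size pieces; StringIO stands in for a large file so the sketch is self-contained, and the background reader thread is just there to drain the other end:

```ruby
require 'socket'
require 'stringio'

server = TCPServer.new('127.0.0.1', 0)
client = TCPSocket.new('127.0.0.1', server.addr[1])
remote = server.accept

# Drain the receiving end in a background thread so our writes never stall.
received = Thread.new { remote.read }

# Pretend this is a big file on disk; StringIO keeps the sketch runnable.
source = StringIO.new('x' * 100_000)

CHUNK_SIZE = 1024 * 16
# Write the payload in fixed-size chunks so the whole thing never has
# to live in memory at once.
while chunk = source.read(CHUNK_SIZE)
  client.write(chunk)
end
client.close # EOF lets the reader thread finish

total = received.value.bytesize # => 100000
```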

Read Buffers

It’s not just writes; reads are buffered too.

When you ask Ruby to read data from a TCP connection and pass a maximum read length, Ruby may actually be able to receive more data than your limit allows.

In this case that ‘extra’ data will be stored in Ruby’s internal read buffers. On the next call to read, Ruby will look first in its internal buffers for pending data before asking the OS kernel for more data.
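You can see the effect with a small loopback example. The client sends 14 bytes, but the receiving end asks for only 5 at a time; the surplus is held for the next read:

```ruby
require 'socket'

server = TCPServer.new('127.0.0.1', 0)
client = TCPSocket.new('127.0.0.1', server.addr[1])
remote = server.accept

client.write('gekko is great')
client.close

first  = remote.read(5) # => "gekko"
# The surplus data is likely already sitting in Ruby's internal read
# buffer, so this next read can be served without another system call.
second = remote.read(9) # => " is great"
```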

How Much to Read?

The answer to this question isn’t quite as straightforward as it was with write buffering, but we’ll take a look at the issues and best practices.

Since TCP provides a stream of data, we don’t know how much is coming from the sender. This means that we’ll always be making a guess when we decide on a read length.

Why not just specify a huge read length to make sure we always get all of the available data? When we specify a read length, the kernel allocates some memory for us to receive the data. If we specify more than we need, we end up allocating memory that goes unused. That’s a waste of resources.

If we specify a small read length which requires many reads to consume all of the data, we incur overhead for each system call.

So, as with most things, you’ll get the best performance if you tune your program based on the data it receives. Going to receive lots of big data chunks? Then you should probably specify a bigger read length.

There’s no silver-bullet answer, but I’ve cheated a bit and taken a survey of various Ruby projects that use sockets to see what the consensus on this question is.

I’ve looked at Mongrel, Unicorn, Puma, Passenger, and Net::HTTP, and all of them do a readpartial(1024 * 16). In other words, all of these web projects use 16KB as their read length.

Conversely, redis-rb uses 1KB as its read length.

You’ll always get the best performance by tuning your server to the data at hand but, when in doubt, 16KB seems to be a generally agreed-upon read length.
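To close out, here's a sketch of the consensus pattern: loop on readpartial with a 16KB read length until the other side closes the connection. As before, the loopback setup exists only to make the example self-contained:

```ruby
require 'socket'

server = TCPServer.new('127.0.0.1', 0)
client = TCPSocket.new('127.0.0.1', server.addr[1])
connection = server.accept

client.write('a' * 50_000)
client.close

READ_LENGTH = 1024 * 16 # the 16KB used by Mongrel, Unicorn, Puma, et al.
received = +''

begin
  loop do
    # readpartial returns as soon as any data is available, up to 16KB.
    received << connection.readpartial(READ_LENGTH)
  end
rescue EOFError
  # The client closed its end; everything has been consumed.
end

received.bytesize # => 50000
```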