James Stanley


How to read from a TCP socket (but were too afraid to ask)

Sat 10 February 2024
Tagged: software, protohackers

You can get surprisingly far, before it bites you, with only a fuzzy and incorrect understanding of how you should read from a TCP socket. I see this often in (failing) Protohackers solutions. Once you are over the initial hurdle of reading enough documentation to actually get a TCP session connected, there are 2 key things you need to understand:

  1. TCP gives you a stream of bytes, not packets.
  2. read() can give you fewer bytes than you asked for.

If either of those was a surprise: keep reading.

Byte streams

The main misconception people have is that when they're reading from a TCP socket, they are receiving packets. This is the wrong way to think about it. If you're writing anything higher level than the TCP implementation itself, then you should forget about packets. TCP is exposed to you via a pair of byte streams.

Typically both streams are on the same file handle (they have the same file descriptor), but remember there are two underlying streams: writing puts bytes into the stream that is sent to the other side, and reading gets bytes out of the stream coming from the other side.

There are a few reasons that people are not immediately disabused of packet-oriented thinking:

If you write a program that tries to read "packets", there are a handful of potential issues you can encounter:

Just stop trying to think about packets. The kernel will deal with packets. You will deal with byte streams.

To transfer discrete messages over a byte stream, you need some sort of message encoding. Some simple schemes include:

You can do whatever you want as long as you can work out where the message boundaries are. If you want to transfer structured data I suggest JSON lines for a text format or length-prefixed protobuf for a binary format.
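As a rough illustration of finding message boundaries in a byte stream, here is a minimal Python sketch of newline-delimited framing. The `LineReader` class and its `read_line()` method are hypothetical names made up for this example, and `socket.socketpair()` stands in for a real TCP connection so that the snippet is self-contained. The point is that the reader keeps a buffer and only hands back a message once it has seen the delimiter, no matter how the bytes were chopped up on the way.

```python
import socket


class LineReader:
    """Hypothetical helper: buffers bytes from a stream socket and returns whole lines."""

    def __init__(self, sock):
        self.sock = sock
        self.buf = bytearray()

    def read_line(self):
        """Return the next line without its newline, or None at end of stream."""
        while True:
            idx = self.buf.find(b"\n")
            if idx >= 0:
                line = bytes(self.buf[:idx])
                del self.buf[:idx + 1]  # drop the line and its delimiter
                return line
            chunk = self.sock.recv(4096)
            if not chunk:
                return None  # end of stream (any unterminated partial line is dropped)
            self.buf += chunk


# A connected pair of stream sockets stands in for a TCP session.
sender, receiver = socket.socketpair()

# The writes deliberately do not line up with the message boundaries.
sender.sendall(b"hel")
sender.sendall(b"lo\nwor")
sender.sendall(b"ld\n")
sender.close()

reader = LineReader(receiver)
lines = []
while (line := reader.read_line()) is not None:
    lines.append(line)
receiver.close()

print(lines)
```

The three writes carry two messages, and the reader recovers exactly those two messages regardless of how many bytes each `recv()` call happens to return.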

Read semantics

The other misconception is about the semantics of reading from a socket.

Depending on the platform you are using, this could bite you in 2 different ways. Normally read() takes an argument saying how many bytes you want, and then there are 2 common ways for it to work:

  1. it gives you back any number of bytes, up to the maximum you gave
  2. it blocks, and keeps reading, until either the end of the stream, or it gives back exactly the number of bytes you asked for

The actual read() system call is "type 1": if there are some bytes available immediately, but not as many as you asked for, you'll just get back whatever is available immediately.

C's fread() works the second way: it blocks until either the end of the stream, or it has exactly the number you asked for.

Neither of these semantics is necessarily "better" than the other. In the first case, you have to manually make sure you have all the bytes you want (i.e. keep calling read() until you have enough). In the second case, you have to make sure you don't block the entire program in the course of getting all the bytes you want (e.g. if you want 5 bytes but the kernel only has 1 to give you, fread() will block even though select() told you the socket was readable).
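To make "keep calling read() until you have enough" concrete, here is a small Python sketch. Python's `socket.recv()` has the first kind of semantics (it returns up to the requested number of bytes), so getting an exact count means looping. `recv_exact` is a hypothetical helper name, not a standard library function, and `socket.socketpair()` is used in place of a real TCP connection.

```python
import socket


def recv_exact(sock, n):
    """Hypothetical helper: call recv() repeatedly until exactly n bytes have arrived."""
    buf = bytearray()
    while len(buf) < n:
        # Ask only for what is still missing; recv() may return less than that.
        chunk = sock.recv(n - len(buf))
        if not chunk:
            # An empty result means the stream ended before we got n bytes.
            raise EOFError(f"stream ended after {len(buf)} of {n} bytes")
        buf += chunk
    return bytes(buf)


sender, receiver = socket.socketpair()
sender.sendall(b"ab")
sender.sendall(b"cde")

data = recv_exact(receiver, 5)
print(data)

sender.close()
receiver.close()
```

Note that this turns "type 1" semantics into "type 2" semantics, and so it inherits the blocking caveat described above: if the remaining bytes never arrive, the loop waits.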

The following properties are common to both types:

Here are some classifications that I'm aware of:

If you want to write reliable software, you need to find out what read semantics you're using. Don't just guess.

(Also, for what it's worth, the write() system call has the same property as read(), in that it may write fewer bytes than you asked it to: check the return value to find out how many bytes were actually written, and then try again to send the rest.)
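The same retry loop for writing looks roughly like this in Python, where `socket.send()` returns the number of bytes it accepted. `send_all` is a hypothetical name for this sketch; in practice Python already ships `socket.sendall()`, which does the same job, so the loop is spelled out only to show the mechanism.

```python
import socket


def send_all(sock, data):
    """Hypothetical helper: keep calling send() until every byte has been accepted."""
    view = memoryview(data)  # lets us slice without copying
    total = 0
    while total < len(view):
        # send() may accept only part of what we offer; it tells us how much.
        sent = sock.send(view[total:])
        total += sent
    return total


sender, receiver = socket.socketpair()
payload = b"x" * 1000

written = send_all(sender, payload)
sender.close()

# Drain the other end until end of stream to check that everything arrived.
received = bytearray()
while chunk := receiver.recv(4096):
    received += chunk
receiver.close()

print(written, len(received))
```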

Half-close

While you're here, I have an axe to grind...

There's one more thing that I want you to know: half-close. You remember how we agreed that there are 2 separate byte streams? Well, a corollary of that is that the 2 streams can be closed independently: you can close the stream you are writing to, even while you still want to read from the other side. If you understand the stream abstraction, this should be natural and good.
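Here is a minimal Python sketch of a half-close, assuming a `socket.socketpair()` as a stand-in for a real TCP session (the names `client` and `server` are just labels for the two ends). `shutdown(socket.SHUT_WR)` closes only the outgoing stream; the incoming stream stays usable.

```python
import socket

client, server = socket.socketpair()

client.sendall(b"ping")
client.shutdown(socket.SHUT_WR)  # close only the client-to-server stream

# The server sees the data, then end of stream (recv() returns b"").
request = bytearray()
while chunk := server.recv(4096):
    request += chunk

# The server-to-client stream is still open, so the reply gets through.
server.sendall(b"pong")
server.close()

reply = bytearray()
while chunk := client.recv(4096):
    reply += chunk
client.close()

print(bytes(request), bytes(reply))
```

The server only knows the request is complete because it saw the end of the stream, and it can still answer afterwards. That is the pattern a broken proxy destroys.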

Sadly, some "transparent proxies" inadvertently break half-close, by tearing down connections as soon as they see either of the streams get closed. Please don't do this!

The correct thing to do is to propagate the half-close onwards. If you imagine the TCP session as a pipeline running from the client, through the proxy, to the server, and then back through the proxy to the client, then it is obvious that the half-close should be propagated on through the pipeline in the same path that normal messages would take. A half-close shouldn't "jump ahead" of any pending data in the pipeline by closing the session immediately.
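A toy version of a proxy that propagates half-close might look like the Python sketch below. It is a simplified illustration under stated assumptions, not a production proxy: `pump` is a hypothetical function name, two `socket.socketpair()` calls stand in for the client-side and server-side TCP connections, and one thread is used per direction. The key line is the `shutdown(socket.SHUT_WR)` that runs only after everything read so far has been forwarded.

```python
import socket
import threading


def pump(src, dst):
    """Hypothetical helper: copy src to dst, then pass end-of-stream on as a half-close."""
    while chunk := src.recv(4096):
        dst.sendall(chunk)
    # Only now, with all pending data forwarded, close the onward stream.
    # The opposite direction is left alone.
    dst.shutdown(socket.SHUT_WR)


# client <-> proxy_front | proxy_back <-> server
client, proxy_front = socket.socketpair()
proxy_back, server = socket.socketpair()

threads = [
    threading.Thread(target=pump, args=(proxy_front, proxy_back)),
    threading.Thread(target=pump, args=(proxy_back, proxy_front)),
]
for t in threads:
    t.start()

# The client sends its request and half-closes.
client.sendall(b"hello")
client.shutdown(socket.SHUT_WR)

# The server reads until end of stream, and only then replies.
request = bytearray()
while chunk := server.recv(4096):
    request += chunk
server.sendall(request.upper())
server.shutdown(socket.SHUT_WR)

# The client can still read the reply through the proxy.
reply = bytearray()
while chunk := client.recv(4096):
    reply += chunk

for t in threads:
    t.join()
for s in (client, proxy_front, proxy_back, server):
    s.close()

print(bytes(reply))
```

If `pump` closed both sockets outright as soon as it saw the first end of stream, the server's reply would have nowhere to go, which is the failure mode described above.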

(I wanted to draw a neat animation showing a client, a proxy, and a server, with water pipes connecting them, and in the good case, when the flow from the client to the proxy is shut off, all the valves will get shut off one by one, in order, through the whole pipeline, following the last of the flowing water; and in the bad case, as soon as the first valve is shut off, the rest of the pipes all get closed at once and the water they contain is dumped on the ground. But making animations is too much trouble, so please imagine it instead).

There is a blog post from Excentis about how some NAT proxies break half-close.

Nmap's ncat breaks half-close. I have submitted a patch, but it has been ignored.

Ngrok breaks half-close. I found out because people tried to solve the Protohackers Smoke Test using ngrok, and couldn't.

If you're writing any kind of proxy, please implement half-close properly.

Conclusion

It's not that hard to read from a TCP socket, but it is easy to get it subtly wrong if you don't have the right mental models.

And (this is turning into more of a Protohackers ad than intended, but) if you want to test your understanding, you could try solving some of the Protohackers problems. If you really want a challenge: Problem 7 has you implement a basic TCP-like protocol on top of UDP.


