Square pegs and round holes

I just wrote this as a comment in one of the source files in the ssl-nio branch, but I think it’s interesting enough to repost here. Here, I’m trying to figure out how to properly code the SSL handshake output. Complete with ASCII-art diagrams:

XXX what we need to do here is generate a “stream” of handshake messages, and pack them into whatever fragment space we have available. A handshake message can span multiple records, and a single record can carry multiple handshake messages.

So, we can have one of two states:

  1. We have enough space in the record we are creating to push out everything we need to on this round. This is easy; we just repeatedly fill in these messages in the buffer, so we get something that looks like this:
                 ________________________________
       records: |________________________________|
    handshakes: |______|__|__________|
  2. We can put part of one handshake message in the current record, but we must put the rest of it in the following record, or possibly more than one following record. So here, we’d see this:
                 ________________________
       records: |_______|_______|________|
    handshakes: |____|_______|_________|
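
Both states fall out of the same loop if we treat the handshake messages as one long byte stream and cut it wherever the fragment fills up. Here is a minimal sketch of that idea, not the code in the branch: `HandshakePacker`, `pack`, and `maxFragment` are names I made up for illustration, the messages are assumed to be already encoded as byte arrays, and record headers are left out entirely.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: packs already-encoded handshake messages into
// fragments of at most maxFragment bytes, ignoring message boundaries.
final class HandshakePacker {
    static List<byte[]> pack(List<byte[]> messages, int maxFragment) {
        if (maxFragment <= 0) {
            throw new IllegalArgumentException("maxFragment must be positive");
        }
        List<byte[]> records = new ArrayList<>();
        ByteBuffer fragment = ByteBuffer.allocate(maxFragment);
        for (byte[] msg : messages) {
            int off = 0;
            while (off < msg.length) {
                // Copy as much of this message as the current fragment can hold.
                int n = Math.min(msg.length - off, fragment.remaining());
                fragment.put(msg, off, n);
                off += n;
                if (!fragment.hasRemaining()) {
                    records.add(drain(fragment)); // fragment full: start a new record
                }
            }
        }
        if (fragment.position() > 0) {
            records.add(drain(fragment)); // last, partially filled record
        }
        return records;
    }

    // Copies the filled part of the fragment out and resets it for reuse.
    private static byte[] drain(ByteBuffer fragment) {
        fragment.flip();
        byte[] out = new byte[fragment.remaining()];
        fragment.get(out);
        fragment.clear();
        return out;
    }
}
```

With messages of 6, 2, and 10 bytes and a 32-byte fragment, everything lands in one 18-byte record (state 1). With messages of 4, 7, and 9 bytes and 8-byte fragments, we get records of 8, 8, and 4 bytes, with the second and third messages each split across a record boundary (state 2).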

We could make this a lot easier by only ever emitting one handshake message per call, but that could waste a lot of space in each record, and a lot of TCP packets. What we want is to make the most of the resources given to us, and to use as much space in the present fragment as we can.

Note that we pretty much have to support this, anyway, because SSL provides no guarantees that the record size is large enough to accommodate even one handshake message. Also, callers could call on us with a short buffer, even though they aren’t supposed to.

This is somewhat complicated by the fact that we don’t know, a priori, how large a handshake message will be until we’ve built it, and our design builds the message around the byte buffer.
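
One common way to cope with not knowing the length up front is to reserve the header, write the body, and then go back and patch the length in. A handshake message starts with a one-byte type followed by a three-byte length, so a rough sketch could look like the following. `LengthBackpatch` and `writeMessage` are hypothetical names, and this assumes the whole message fits in the buffer, which is exactly the assumption that breaks down across record boundaries.

```java
import java.nio.ByteBuffer;
import java.util.function.Consumer;

// Hypothetical helper: writes one handshake message in place, filling in
// the 3-byte length only after the body has been written.
final class LengthBackpatch {
    static int writeMessage(ByteBuffer out, int type, Consumer<ByteBuffer> body) {
        int start = out.position();
        out.put((byte) type);
        // Skip the 3 length bytes for now (throws if they don't fit).
        out.position(start + 4);
        // The body writer may throw BufferOverflowException if it runs out of room.
        body.accept(out);
        int len = out.position() - (start + 4);
        // Absolute puts: patch the length without moving the position.
        out.put(start + 1, (byte) (len >>> 16));
        out.put(start + 2, (byte) (len >>> 8));
        out.put(start + 3, (byte) len);
        return out.position() - start; // total bytes written, header included
    }
}
```

Calling `LengthBackpatch.writeMessage(out, 1, b -> b.put(new byte[] {9, 8, 7}))` writes seven bytes: the type, a length of 3, and the three body bytes.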

Some ways to handle this:

  1. Write our outgoing handshake messages to a private buffer, big enough per message (and, if we run out of space, resize that buffer) and push (possibly part of) this buffer out to the outgoing buffer. This isn’t that great because we’d need to store and copy things unnecessarily.
  2. Build outgoing handshake objects “virtually,” that is, store them as collections of objects, then compute the length, and then write them to a buffer, instead of making the objects views on ByteBuffers for both input and output. This would complicate the protocol objects a bit (although, it would amount to doing separation between client objects and server objects, which is pretty OK), and we still need to figure out how exactly to chunk those objects across record boundaries.
  3. Try to build these objects on the buffer we’re given, but detect when we run out of space in the output buffer, and split the overflow message. This sounds like the best option, but also probably the hardest to code.
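
For comparison, option 1 is simple enough to sketch in a few lines. This is only an illustration of the idea, with made-up names (`HandshakeStaging`, `append`, `drainTo`) and an arbitrary initial size of 64 bytes; it assumes each message is handed over as a finished byte array. The extra copy the option costs us is plain to see in `drainTo`.

```java
import java.nio.ByteBuffer;

// Hypothetical staging buffer: holds finished handshake messages and hands
// them out in whatever chunk sizes the caller's buffer allows.
final class HandshakeStaging {
    // Always left in "write mode": position marks the end of the pending bytes.
    private ByteBuffer pending = ByteBuffer.allocate(64);

    /** Appends one complete handshake message, growing the buffer if needed. */
    void append(byte[] message) {
        if (pending.remaining() < message.length) {
            int needed = pending.position() + message.length;
            ByteBuffer bigger =
                ByteBuffer.allocate(Math.max(needed, pending.capacity() * 2));
            pending.flip();
            bigger.put(pending); // carry over what was already pending
            pending = bigger;
        }
        pending.put(message);
    }

    /** Moves as many pending bytes as fit into out; returns how many moved. */
    int drainTo(ByteBuffer out) {
        pending.flip();
        int n = Math.min(pending.remaining(), out.remaining());
        int savedLimit = pending.limit();
        pending.limit(pending.position() + n); // expose only what fits
        out.put(pending);
        pending.limit(savedLimit);
        pending.compact(); // keep the leftover bytes for the next call
        return n;
    }

    boolean hasPending() {
        return pending.position() > 0;
    }
}
```

The caller would keep calling `drainTo` once per outgoing record until `hasPending()` returns false, so a message that doesn’t fit simply continues in the next record.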

Fun, fun, fun!

Edit: cleaned up the wording a little. Me talk pretty.