An amazing talk, but I don't understand this part: first they say that delimiters are insecure because of injections, then they go on to say that delimiters are actually more (and provably) secure than length fields.
It seems that length fields somehow make protocols Turing complete, or something like that, but I really don't understand it.
Can someone explain that to me?

Hmm, I think I understand:

TL;DR: use delimiters if you want provable security.

The science of insecurity:
* protocols are languages;
* languages need to be interpreted;
* language interpreters are often messy;
* messy interpreters cause different implementations to do different things on the same data;
* doing different things on the same data can cause security bugs (like issuing an SSL certificate for a different domain than you might think);
* language interpreters can be tidy by generating them from a grammar;
* tidy interpreters can be proven to do the same thing to the same data;
* doing the same thing to the same data prevents security bugs like the above.
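
To make the "same data, different interpreters" point concrete, here is a minimal sketch in the spirit of the certificate example above. The wire format (one length byte followed by the name) and both function names are made up for illustration; the embedded NUL byte is just one well-known way two readers can end up disagreeing.

```python
# Hypothetical wire format: one length byte, followed by that many bytes of name.

def read_name_by_length(buf: bytes) -> bytes:
    """Interpreter A: trusts the length field."""
    n = buf[0]
    return buf[1:1 + n]

def read_name_c_style(buf: bytes) -> bytes:
    """Interpreter B: skips the length byte and stops at the first NUL, like a C string."""
    return buf[1:].split(b"\x00", 1)[0]

name = b"bank.example\x00.attacker.example"
message = bytes([len(name)]) + name

seen_by_a = read_name_by_length(message)  # the full name, NUL included
seen_by_b = read_name_c_style(message)    # only the part before the NUL

print(seen_by_a)
print(seen_by_b)
```

Both functions are "correct" by their own rules, but they disagree about what the message says, and that disagreement is where the bug lives.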

The two ways of sending non-fixed length data:
* delimiters: including any necessary escape sequences, can be part of a grammar;
* length fields: cannot, in general, be expressed in a regular or context-free grammar.

* delimiters are theoretically (provably!) more secure than length fields;
* this theory carries over into practice if, and only if, the interpreters are generated from a grammar;
* otherwise, delimiters are prone to injection attacks (like SQL injection).
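
A rough sketch of the difference, assuming a toy record format with `;` as the delimiter and `\` as the escape character (both choices, and all function names, are mine, not from the talk). The naive pair is injectable; the second pair derives its recognizer directly from a small regular grammar and rejects anything outside the language. A hand-written regex is only a stand-in for a real parser generator here.

```python
import re

def naive_encode(fields):
    # No escaping: a ';' inside a field is indistinguishable from a separator.
    return ";".join(fields)

def naive_decode(line):
    return line.split(";")

# Assumed grammar (hypothetical, for illustration):
#   record := field (';' field)*
#   field  := (plain | escape)*
#   plain  := any character except ';' and '\'
#   escape := '\' followed by any character
FIELD = r"(?:[^;\\]|\\.)*"
RECORD_RE = re.compile(rf"{FIELD}(?:;{FIELD})*", re.DOTALL)
FIELD_RE = re.compile(FIELD, re.DOTALL)

def encode(fields):
    # Escape the escape character first, then the delimiter.
    return ";".join(f.replace("\\", "\\\\").replace(";", "\\;") for f in fields)

def decode(line):
    # Reject input that is not in the language before touching it.
    if RECORD_RE.fullmatch(line) is None:
        raise ValueError("input is not in the language")
    fields, pos = [], 0
    while True:
        m = FIELD_RE.match(line, pos)  # always matches, possibly the empty string
        fields.append(re.sub(r"\\(.)", r"\1", m.group(), flags=re.DOTALL))
        pos = m.end()
        if pos == len(line):
            return fields
        pos += 1  # skip the ';' separator

fields = ["alice", "x;role=admin"]
injected = naive_decode(naive_encode(fields))  # a third field appears
round_trip = decode(encode(fields))            # the original two fields come back
```

With the naive pair, the attacker-controlled second field smuggles in an extra `role=admin` field. With the grammar-based pair the same input round-trips unchanged, and a malformed line such as one ending in a lone `\` is rejected instead of guessed at.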

* length fields are in theory faster than delimiters (can allocate memory beforehand); in practice, this is not an issue;
* fixed length would be even better, but is of course not always possible.

@BartG95 Length prefixes are often the only viable option for efficiency in hardware (where RAM is scarce) or in stream processing. You will find TLV (type, length, value) everywhere, from video/audio/image formats (PNG, the MP4/MOV container, H.264) to modern protocols like Apple HomeKit. With a length prefix you can start parsing your data as it streams in, instead of waiting to find the delimiter and then backtracking from there to allocate your processing memory.
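
A minimal sketch of that streaming behaviour, assuming a made-up TLV layout of a 1-byte type followed by a 2-byte big-endian length (real formats differ in field widths and byte order). `TLVReader` and `read_all` are hypothetical names.

```python
import struct

HEADER = struct.Struct(">BH")  # assumed layout: 1-byte type, 2-byte big-endian length

class TLVReader:
    """Incremental reader: feed it bytes as they arrive, get complete records back."""

    def __init__(self):
        self._buf = bytearray()

    def feed(self, data: bytes):
        """Add newly arrived bytes and return every record that is now complete."""
        self._buf += data
        records = []
        while len(self._buf) >= HEADER.size:
            rtype, length = HEADER.unpack_from(self._buf)
            end = HEADER.size + length
            if len(self._buf) < end:
                break  # the value has not fully arrived yet
            records.append((rtype, bytes(self._buf[HEADER.size:end])))
            del self._buf[:end]
        return records

def read_all(stream: bytes, chunk_size: int):
    """Simulate a network stream by feeding the reader small slices."""
    reader, out = TLVReader(), []
    for i in range(0, len(stream), chunk_size):
        out.extend(reader.feed(stream[i:i + chunk_size]))
    return out

stream = (HEADER.pack(1, 5) + b"hello"
          + HEADER.pack(2, 0)
          + HEADER.pack(3, 2) + b"ok")
records = read_all(stream, chunk_size=4)
```

As soon as the header is in, the reader knows exactly how many bytes to wait for, and the value can contain any byte without escaping. The flip side is that the length is trusted as-is; a real implementation would want to cap it.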


@BartG95 From a developer standpoint I find it easier too: you can build tools that extract just the chunks pretty easily, without knowing the complete grammar (think framing vs. parsing).
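
As a sketch of framing without parsing, the snippet below walks the chunks of a PNG stream (8-byte signature, then chunks of 4-byte big-endian length, 4-byte type, payload, 4-byte CRC) without understanding any payload. `iter_png_chunks` and `make_chunk` are hypothetical helper names, and the sample bytes only have the right framing; they are not a displayable image.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"

def iter_png_chunks(data: bytes):
    """Yield (type, payload) for each chunk without interpreting any payload."""
    if not data.startswith(PNG_SIGNATURE):
        raise ValueError("not a PNG stream")
    pos = len(PNG_SIGNATURE)
    while pos < len(data):
        if pos + 8 > len(data):
            raise ValueError("truncated chunk header")
        length, ctype = struct.unpack_from(">I4s", data, pos)
        start = pos + 8
        end = start + length
        if end + 4 > len(data):
            raise ValueError("truncated chunk")
        yield ctype, data[start:end]
        pos = end + 4  # skip the 4-byte CRC (not verified in this sketch)

def make_chunk(ctype: bytes, payload: bytes) -> bytes:
    """Build one chunk, only to produce sample data for the demo."""
    crc = zlib.crc32(ctype + payload)
    return struct.pack(">I", len(payload)) + ctype + payload + struct.pack(">I", crc)

sample = (PNG_SIGNATURE
          + make_chunk(b"IHDR", bytes(13))
          + make_chunk(b"tEXt", b"Comment\x00hi")
          + make_chunk(b"IEND", b""))

chunk_types = [ctype for ctype, _ in iter_png_chunks(sample)]
```

The extractor never needs to know what `IHDR` or `tEXt` mean; the length field alone is enough to step from one chunk to the next.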
