DNS Decompression (RFC 1035)
Learnings from implementing a toy DNS parser.
DNS uses a clever little trick to compress the domain names carried in a message. It is explained neatly in RFC 1035 (section 4.1.4, "Message compression").
When a name is encoded, each label is written as its length followed by the label itself. For example, foo becomes 3foo, where the 3 is a single byte holding the value 3, not the character "3".
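To make the length-prefix idea concrete, here is a minimal Python sketch. The function name `encode_name` is made up for this post and the sketch assumes plain ASCII labels; it does not do any compression yet.

```python
def encode_name(name: str) -> bytes:
    """Encode a dotted name as length-prefixed labels ending in a zero byte."""
    out = bytearray()
    for label in name.rstrip(".").split("."):
        if not label:
            continue              # the root name "." has no labels
        raw = label.encode("ascii")
        out.append(len(raw))      # one length octet
        out += raw                # followed by that many octets
    out.append(0)                 # the null root label terminates the name
    return bytes(out)


print(encode_name("foo.com"))  # b'\x03foo\x03com\x00'
```

Note how every name ends with a zero length byte, which is how the parser knows where the name stops.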
On top of that encoding, the protocol uses a clever little pointer scheme to compress the data, and the parser has to follow those pointers to decompress it.
A pointer takes up two octets: the first two bits are 1 and the remaining 14 bits are the offset.
The offset is the location, counted in bytes from the start of the message, where the rest of the name can be found.
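E.g. the two octets 11000000 00010100 form a pointer with offset 20.
Here we need to go to location 20 in the message,
where we will find ordinary length-prefixed labels, and keep reading the name from there.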
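Pulling the offset out of a pointer is a bit of masking and shifting. A small Python sketch, assuming the two pointer bytes are already in hand (`pointer_offset` is a hypothetical helper name, not something from RFC 1035):

```python
def pointer_offset(first: int, second: int) -> int:
    """Return the 14-bit offset carried by a two-octet compression pointer."""
    if (first & 0xC0) != 0xC0:
        raise ValueError("not a compression pointer")
    # Drop the two marker bits, then append the second octet.
    return ((first & 0x3F) << 8) | second


print(pointer_offset(0b11000000, 0b00010100))  # 20
```

Fourteen bits means a pointer can reach any offset from 0 to 16383.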
Domain names and labels
Domain names in messages are expressed in terms of a sequence of labels. Each label is represented as a one octet length field followed by that number of octets. Since every domain name ends with the null label of the root, a domain name is terminated by a length byte of zero. The high order two bits of every length octet must be zero, and the remaining six bits of the length field limit the label to 63 octets or less.
To simplify implementations, the total length of a domain name (i.e., label octets and label length octets) is restricted to 255 octets or less.
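The two limits quoted above are easy to check while walking an uncompressed name. This is only a sketch under my own naming (`wire_name_length`, `MAX_LABEL` and `MAX_NAME` are not from the RFC), and it deliberately rejects pointers because it only handles plain labels:

```python
MAX_LABEL = 63    # six usable bits in the length octet
MAX_NAME = 255    # label octets plus length octets


def wire_name_length(wire: bytes) -> int:
    """Walk an uncompressed name and return its total length in octets."""
    pos = 0
    while True:
        if pos >= len(wire):
            raise ValueError("name is not terminated by a zero length octet")
        length = wire[pos]
        if length > MAX_LABEL:        # also true when either high bit is set
            raise ValueError("label longer than 63 octets")
        pos += 1 + length             # skip the length octet and the label
        if pos > MAX_NAME:
            raise ValueError("name longer than 255 octets")
        if length == 0:
            return pos


print(wire_name_length(b"\x03foo\x03com\x00"))  # 9
```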
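When parsing the data, the pointer bits are identified with a binary AND
of the length byte against 11000000,
which is the value 192 (0xC0),
and observing the result. In a well-formed message a result which is not zero means that both bits are set, so it’s a pointer and we can go and decompress it. Strictly speaking the result should be compared with 192 rather than with zero, because RFC 1035 reserves the 10 and 01 combinations for future use.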
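In Python the check is a one-liner. This is a sketch with names of my own choosing (`POINTER_MASK`, `is_pointer`), using the stricter comparison against 192:

```python
POINTER_MASK = 0b11000000  # 192, i.e. 0xC0


def is_pointer(octet: int) -> bool:
    """True when both high-order bits of the octet are set."""
    return (octet & POINTER_MASK) == POINTER_MASK


print(is_pointer(0xC0), is_pointer(3))  # True False
```

The parentheses around the AND are not needed in Python, but they are in C-like languages where `==` binds tighter than `&`, so it is a good habit to keep them.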
Snippet from my code below
A clever little algorithm. Full code for my project can be found on GitHub.
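Putting the pieces together, the whole decompression loop might look like the Python sketch below. This is an illustration only, not the code from my project: `read_name` is a hypothetical name, the message is a made-up byte string with 20 bytes of padding in front, and a truncated message simply raises `IndexError`.

```python
from typing import Tuple


def read_name(message: bytes, pos: int) -> Tuple[str, int]:
    """Return (name, position just past the name where it was first read)."""
    labels = []
    end = None    # where parsing resumes once the first pointer was followed
    jumps = 0
    while True:
        length = message[pos]
        if (length & 0xC0) == 0xC0:            # compression pointer
            if end is None:
                end = pos + 2                  # a pointer is two octets long
            pos = ((length & 0x3F) << 8) | message[pos + 1]
            jumps += 1
            if jumps > len(message):           # guard against pointer loops
                raise ValueError("too many compression pointers")
            continue
        if length & 0xC0:
            raise ValueError("reserved label type")
        pos += 1
        if length == 0:                        # null root label: name is done
            break
        labels.append(message[pos:pos + length].decode("ascii"))
        pos += length
    return ".".join(labels) or ".", (end if end is not None else pos)


# "foo.com" sits at offset 20; "www" plus a pointer to offset 20 follows it.
msg = bytes(20) + b"\x03foo\x03com\x00" + b"\x03www\xc0\x14"
print(read_name(msg, 20))  # ('foo.com', 29)
print(read_name(msg, 29))  # ('www.foo.com', 35)
```

The second value returned is the position right after the name in its original place, so the caller can carry on parsing the next field even though the name itself was finished somewhere else in the message.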