Intradependent encoding

We call an encoding \(E(m)\) of a message \(m\) “intradependent” if the fact that \(E(m)\) encodes \(m\) can be deduced without looking at the whole encoding. For example, imagine that you know you’re going to see an 8-letter English word, and you see the letters “aardv”. You can then deduce that the message is “aardvark” without looking at the last three letters, because that’s the only valid message that starts with “aardv”.

(Seeing “aard” is not enough, because aardwolves exist.)
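
Here is a minimal sketch of this kind of prefix decoding in Python, assuming the decoder knows the full set of valid messages in advance; `decode_prefix` and the three-word list are illustrative stand-ins, not anything from the original:

```python
from typing import Optional

def decode_prefix(prefix: str, messages: list[str]) -> Optional[str]:
    """Return the unique message consistent with `prefix`, if one exists."""
    candidates = [m for m in messages if m.startswith(prefix)]
    if len(candidates) == 1:
        return candidates[0]
    return None  # zero or several candidates: the prefix is not yet decisive

# Toy message space: 8-letter English words (a stand-in for a real word list).
words = ["aardvark", "aardwolf", "abdicate"]

print(decode_prefix("aard", words))   # None: "aardvark" and "aardwolf" both match
print(decode_prefix("aardv", words))  # "aardvark": decided before the last 3 letters
```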

In an intradependent encoding, some parts of the encoding carry information about the other parts. For example, a priori there are \(26^4 = 456976\) possible combinations of the last four letters, but seeing “aard” cuts the space down to just two possibilities: “vark” and “wolf”. This means that the first four letters carry \(\log_2(26^4) - 1 \approx 17.8\) bits of information about the last four. (The fifth letter carries one final bit of information, in the choice between “v” and “w”. The last three letters carry no new information.)

Formally, the intradependence of an encoding is defined to be the sum, over the symbols of the encoding, of the mutual information between each symbol and all of the symbols preceding it. Given an intradependent encoding, it is possible in principle to design a more efficient encoding; see Intradependent compression.
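
Reading the encoding as a sequence of random symbols \(X_1, \ldots, X_n\) (notation assumed here, not from the original; the \(i = 1\) term is dropped because conditioning on nothing gives zero mutual information), one way to write this definition is:

\[
\operatorname{Intradep}(E) \;=\; \sum_{i=2}^{n} I\bigl(X_i \,;\, X_1, X_2, \ldots, X_{i-1}\bigr)
\]

And a small Python sketch of that sum over a toy uniform distribution of equal-length messages; `intradependence` is a hypothetical helper name, and the two-word message space is a stand-in:

```python
import math
from collections import Counter

def entropy(counts: Counter) -> float:
    """Shannon entropy (bits) of the distribution given by `counts`."""
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def intradependence(messages: list[str]) -> float:
    """Sum over positions i of I(X_i ; X_1..X_{i-1}), for a uniform
    distribution over the given equal-length messages."""
    n = len(messages[0])
    total_mi = 0.0
    for i in range(1, n):  # position 0 has no preceding symbols
        # Marginal entropy H(X_i).
        h_marginal = entropy(Counter(m[i] for m in messages))
        # Conditional entropy H(X_i | X_1..X_{i-1}), averaged over prefixes.
        h_cond = 0.0
        prefixes = Counter(m[:i] for m in messages)
        for prefix, count in prefixes.items():
            following = Counter(m[i] for m in messages if m[:i] == prefix)
            h_cond += (count / len(messages)) * entropy(following)
        total_mi += h_marginal - h_cond
    return total_mi

print(intradependence(["aardvark", "aardwolf"]))  # 3.0
```

For the two-word space this returns 3.0 bits: the last four positions each carry one bit of marginal uncertainty, but the whole word carries only one bit, so three of those bits are redundant, which is exactly what the sum measures.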