# Most complex things are not very compressible

Although the halting problem means we can’t prove it doesn’t happen, it would nonetheless be extremely surprising if some 100-state Turing machine turned out to print the exact text of Shakespeare’s Romeo and Juliet. Unless something was specifically generated by a simple algorithm, the Vast supermajority of data structures that look like they have high algorithmic complexity actually do have high algorithmic complexity. Since there are at most $$2^{101}$$ programs that can be specified with at most 100 bits (in any particular language), we can’t fit all the 1000-bit data structures into all the 100-bit programs. While Romeo and Juliet is certainly highly compressible, relative to most random bitstrings of the same length, it would be shocking for it to compress all the way down to a 100-state Turing machine. There just aren’t enough 100-state Turing machines for one of their outputs to be Romeo and Juliet. Similarly, if you start with a 10-kilobyte text file, and 7zip compresses it down to 2 kilobytes, no amount of time spent trying to further compress the file using other compression programs will ever get that file down to 1 byte. For any given compressor there are at most 256 starting files that can ever be compressed down to 1 byte, and your 10-kilobyte text file almost certainly isn’t one of them.
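Both halves of this argument are easy to check directly. Below is a minimal sketch in Python, using the standard library's `zlib` as a stand-in for 7zip: the counting step is exact arithmetic, and the compression step shows that structured text shrinks while random bytes of the same length essentially don't.

```python
import os
import zlib

# Counting argument: there are 2^0 + 2^1 + ... + 2^100 = 2^101 - 1
# bitstrings of length at most 100 bits, hence at most that many
# programs -- far fewer than the 2^1000 distinct 1000-bit outputs.
programs = sum(2**k for k in range(101))
assert programs == 2**101 - 1
assert programs < 2**1000  # pigeonhole: can't cover all 1000-bit data

# Structured text compresses well; random bytes essentially don't.
structured = b"O Romeo, Romeo! wherefore art thou Romeo? " * 250
random_data = os.urandom(len(structured))

print(len(structured))                  # 10,500 bytes of input
print(len(zlib.compress(structured)))   # a small fraction of that
print(len(zlib.compress(random_data)))  # roughly the full input size
```

The point of the last two lines is the asymmetry: the repetitive text was generated by a simple rule, so it compresses enormously; the random bytes, like the Vast supermajority of strings, have no shorter description to find.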

This takes on defensive importance with respect to refuting the probability-theoretic fallacy, “Oh, sure, Occam’s Razor seems to say that this proposition is complicated. But how can you be sure that this apparently complex proposition wouldn’t turn out to be generated by some very simple mechanism?” If we consider a partition of 10,000 possible propositions, collectively having a 0.01% probability on average, then all the arguments in the world for why various propositions might have unexpectedly high probability must still add up to an average probability of 0.01%. It can’t be the case that after considering that proposition 1 might have secretly high probability, and considering that proposition 2 might have secretly high probability, and so on, we end up assigning 5% probability to each proposition, because that would be a total probability of 500, when probabilities can sum to at most 1. If we assign prior probabilities using an algorithmic-complexity Occam prior as in Solomonoff induction, then the observation that “most apparently complex things can’t be further compressed into an amazingly simple Turing machine” is the same observation as that “most apparently Occam-penalized propositions can’t turn out to be simpler than they look” or “most apparently subjectively improbable things can’t turn out to have unseen clever arguments that would validly make them more subjectively probable”.
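The bookkeeping here is just conservation of probability mass. A quick check of the arithmetic, using exact `fractions` to avoid floating-point noise:

```python
from fractions import Fraction

# 10,000 mutually exclusive propositions averaging 0.01% apiece
# exhaust the partition's probability mass exactly.
n = 10_000
average = Fraction(1, 10_000)  # 0.01%
assert n * average == 1

# Arguments may redistribute mass between propositions, but assigning
# 5% to each would total 500 -- probabilities can sum to at most 1.
assert n * Fraction(5, 100) == 500
```

Any argument that raises one proposition's probability must, on pain of incoherence, lower some other proposition's share of the same fixed mass.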

Parents:

• Algorithmic complexity

When you compress the information, what you are left with determines the complexity.

• Not 2^100?

• 2^100 + 2^99 + 2^98 + … + 1 = 2^101 − 1.

• Is there a difference between any particular 99-bit program and a 100-bit program with a 0 as the first bit?

• Yes. That includes both the case where the length is specified outside the program, and the case where we use a prefix-free encoding. Actually I’m not sure what you’re asking here—why wouldn’t prepending 0 to a program change its behavior in whatever Universal Turing Machine you used? If the first bit always has to be 1, it might as well be omitted.

• Aren’t programs for Turing machines specified as marks on an infinite tape?

I was interpreting 100-bit program as one where up to 100 cells on the tape have marks in them (and there’s only one kind of mark, so a cell can only have a mark or not). Maybe I’ve got the wrong picture though. I haven’t studied Turing machines in much depth.

• “Similarly, if you start with a 10-kilobyte text file, and 7zip compresses it down to 2 kilobytes, no amount of time spent trying to further compress the file using other compression programs will *ever* get that file down to 1 byte.” (Emphasis added.)

This claim seems too strong. Do you mean that an arbitrary compression algorithm is unlikely to compress the file to one byte?

Taking what’s written at face value, if I assume that every compression I apply monotonically decreases the size of the file, then I need only find a large enough sequence of compressions and compose them to compress my file to one byte. This is of course silly, since the composition is just a huge, highly specialized compression program optimized for this instance, but I can spend enough time to make it happen.

• No individual compressor can monotonically decrease file sizes: a lossless compressor is injective, so by the same pigeonhole argument as above, some inputs must come out the same size or larger.

And we count the string of .rar.7z.zip.rar… in your filename into the total size of your file.

• I only assumed the sequence was monotonic.

The second comment is fair. I think when I first read this, I ignored the bit about 7zip and understood the sentence as claiming “no Turing machine takes an input smaller than one byte and prints out a 10kb file”.
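The composed-compressors loophole discussed above can also be probed empirically. A small sketch, again using `zlib` as the stand-in compressor: the first pass removes the visible structure, and subsequent passes see near-random bytes, so the monotone sequence the objection needs never materializes.

```python
import zlib

# Re-"compressing" already-compressed data doesn't keep shrinking it:
# after the first pass the output looks random, and further passes
# add a few bytes of container overhead instead of making progress.
data = b"But soft, what light through yonder window breaks? " * 200
sizes = [len(data)]
for _ in range(4):
    data = zlib.compress(data)
    sizes.append(len(data))

print(sizes)  # one big drop, then no further progress toward 1 byte
```

This matches the reply in the thread: each `.rar.7z.zip` layer also has to be counted toward the total description length, so chaining compressors buys nothing once the structure is gone.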