# Data capacity

The data capacity of an object is defined to be the logarithm of the number of different distinguishable states the object can be placed into. For example, a coin that can be placed heads or tails has a data capacity of $$\log(2)$$ units of data. The choice of base for the logarithm determines the unit of data; common units include the bit (of data), the nat, and the decit (corresponding to bases 2, e, and 10, respectively). For example, the coin has a data capacity of $$\log_2(2)=1$$ bit, and a pair of dice (which can be placed into 36 distinguishable states) has a data capacity of $$\log_2(36) \approx 5.17$$ bits. Note that the data capacity of an object depends on the ability of an observer to distinguish its different states: If the coin is a penny, and I’m able to tell whether you placed the image of Abraham Lincoln facing north, south, east, or west (regardless of whether the coin is heads or tails), then that coin has a data capacity of $$\log_2(8) = 3$$ bits when used to transmit a message from you to me.
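As a sketch, the capacities above can be computed directly from the definition. The helper name `data_capacity` is hypothetical, chosen for illustration:

```python
import math

def data_capacity(num_states, base=2):
    """Data capacity of an object with `num_states` distinguishable states,
    in units set by `base` (2 = bits, e = nats, 10 = decits)."""
    return math.log(num_states) / math.log(base)

print(data_capacity(2))   # a coin: 1 bit
print(data_capacity(36))  # a pair of dice: ~5.17 bits
print(data_capacity(8))   # a penny with 4 orientations x 2 faces: ~3 bits
```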

The data capacity of an object is closely related to the channel capacity of a communications channel. The difference is that channel capacity is the amount of data the channel can transmit per unit time (measured, e.g., in bits per second), while the data capacity of an object is the amount of data that can be encoded by putting the object into a particular state (measured, e.g., in bits).

Why is data capacity defined as the logarithm of the number of states an object can be placed into? Intuitively, because a 2GB hard drive is supposed to carry twice as much data as a 1GB hard drive. More concretely, note that if you have $$n$$ copies of a physical object that can each be placed into $$b$$ different states, then you can use them to encode $$b^n$$ different messages. For example, with three coins, you can encode eight different messages: HHH, HHT, HTH, HTT, THH, THT, TTH, and TTT. The number of messages the objects can encode grows exponentially with the number of copies. Thus, if we want a unit of message-carrying capacity that grows linearly in the number of copies of an object (such that 3 coins hold 3x as much data as 1 coin, a 9GB hard drive holds 9x as much data as a 1GB hard drive, and so on), then data must grow logarithmically with the number of messages.
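The three-coin enumeration above can be checked mechanically, confirming that $$n$$ coins yield $$2^n$$ messages while the capacity in bits grows linearly with $$n$$:

```python
from itertools import product
import math

# Three coins, each heads (H) or tails (T): 2**3 = 8 distinct messages.
messages = [''.join(p) for p in product('HT', repeat=3)]
print(messages)                  # ['HHH', 'HHT', ..., 'TTT']
print(len(messages))             # 8 messages
print(math.log2(len(messages)))  # 3.0 bits: 3x the capacity of one coin
```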

The data capacity of an object bounds the length of the message that you can send using that object. For example, it takes about 5 bits of data to encode a single letter A-Z (since $$\log_2(26) \approx 4.7$$), so if you want to transmit an 8-letter word to somebody, you need an object with a data capacity of $$5 \cdot 8 = 40$$ bits. In other words, if you have 40 coins in a coin jar on your desk, and if we worked out an encoding scheme ahead of time (such as a 5-bit code for the letters A-Z), then you can tell me any 8-letter word using those coins.
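This coin-jar scheme can be sketched as follows. The 5-bit code here (A = 00000 through Z = 11001, with 0 mapped to heads and 1 to tails) is one hypothetical choice of pre-agreed encoding; any other would work as well:

```python
def word_to_coins(word):
    """Encode an A-Z word as a string of coin placements,
    5 bits per letter, 0 -> heads (H), 1 -> tails (T)."""
    bits = ''.join(format(ord(c) - ord('A'), '05b') for c in word.upper())
    return bits.replace('0', 'H').replace('1', 'T')

def coins_to_word(coins):
    """Decode a string of coin placements back into letters."""
    bits = coins.replace('H', '0').replace('T', '1')
    return ''.join(chr(int(bits[i:i + 5], 2) + ord('A'))
                   for i in range(0, len(bits), 5))

coins = word_to_coins('CAPACITY')
print(len(coins))            # 40 coins suffice for an 8-letter word
print(coins_to_word(coins))  # CAPACITY
```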

What does it mean to say that an object “can” be placed into different states, and what does it mean for those states to be “distinguishable”? Information theory is largely agnostic about the answers to those questions. Rather, given a set of states that you claim you could put an object into, which you claim I can distinguish, information theory can tell you how to use those objects in a clever way to send messages to me. For more on what it means for two physical subsystems to communicate with each other by encoding messages into their environment, see Communication.