Fractional bits
It takes \(\log_2(8) = 3\) bits of data to carry one message from a set of 8 possible messages. Similarly, it takes \(\log_2(1024) = 10\) bits to carry one message from a set of 1024 possibilities. How many bits does it take to carry one message from a set of 3 possibilities? By the definition of “bit,” the answer is \(\log_2(3) \approx 1.58.\) What does this mean? That it takes about one and a half yes-or-no questions to single out one thing from a set of three? What is “half of a yes-or-no question”?
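The arithmetic above can be checked directly; a quick sketch using Python’s `math.log2`:

```python
import math

# Bits needed to single out one message from a set of n equally likely messages.
for n in [8, 1024, 3]:
    print(f"log2({n}) = {math.log2(n):.2f}")
```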
Fractional bits can be interpreted in terms of the expected cost of transmitting information: If you want to single out one thing from a set of 3 using bits (in, e.g., the GalCom thought experiment), then you’ll have to purchase two bits, but sometimes you’ll be able to sell one of them back, which means you can push your expected cost below 2. How low can you push it? The lower bound is \(\log_2(3),\) which is the minimum expected cost of adding a 3-message to a long message when encoding your message as cleverly as possible. For more on this interpretation, see Fractional bits: Expected cost interpretation.
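One way to see the expected cost dropping toward \(\log_2(3)\) is to batch many 3-messages into a single block. A sketch (the one-at-a-time prefix code \(\{0, 10, 11\}\) is an illustrative choice, not the only one):

```python
import math

# A prefix code for a single 3-message: 0, 10, 11.
# Assuming all three messages are equally likely, the expected length is
# (1 + 2 + 2) / 3 ≈ 1.67 bits -- already below 2, but above log2(3) ≈ 1.585.
code = {0: "0", 1: "10", 2: "11"}
expected = sum(len(code[m]) for m in code) / 3
print(f"one-at-a-time cost: {expected:.3f} bits per 3-message")

# Packing k 3-messages into one block of ceil(k * log2(3)) bits drives the
# per-message cost toward the lower bound.
for k in [1, 5, 20, 100]:
    bits = math.ceil(k * math.log2(3))
    print(f"k = {k:>3}: {bits / k:.4f} bits per 3-message")
print(f"lower bound: {math.log2(3):.4f}")
```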
Fractional bits can also be interpreted as conversion rates between types of information: “A 3-message carries about 1.58 bits” can be interpreted as “one trit is worth about 1.58 bits.” To understand this, see Exchange rates between digits, or How many bits is a trit?
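A quick sanity check on that exchange rate: \(n\) trits can name \(3^n\) things, which is worth \(n \log_2(3)\) bits. For example:

```python
import math

# 5 trits can name 3**5 = 243 things: that fits inside 8 bits (256 things)
# but not inside 7 bits (128 things), matching 5 * log2(3) ≈ 7.92.
print(3**5, 2**7, 2**8)
print(f"5 trits = {5 * math.log2(3):.2f} bits")
```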
Fractional units of data can also be interpreted as a measure of how much we’re using the digits allocated to encoding a number. For example, working with fractional decits instead of fractional bits, it only takes about 2.7 decits to carry a 500-message, despite the fact that the number 500 clearly requires 3 decimal digits to write down. What’s going on here? Well, we could declare that all numbers that start with a digit \(n \ge 5\) will be interpreted as if they start with the digit \(n - 5.\) Then we have two ways of representing each number (for example, 132 can be represented as both 132 and 632). Thus, if we have 3 decits but we only need to encode a 500-message, we have one bit to spare: We can encode one extra bit in our message according to whether we use the low representation or the high representation of the intended number. Thus, the amount of data it takes to communicate a 500-message is one bit lower than the amount of data it takes to encode a 1000-message — for a total cost of 3 decits minus one bit, which comes out to about 2.70 decits (or just short of 9 bits). For more on this interpretation, see Fractional digits.
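The two-representations trick above can be sketched as a small encoder (the function names here are hypothetical, chosen for illustration): a 500-message \(m\) plus one spare bit \(b\) are packed into a single 3-digit decimal number, since adding 500 shifts the leading digit up by 5.

```python
import math

def encode(m: int, b: int) -> int:
    # Pack a 500-message m (0..499) and a spare bit b into one 3-decit number:
    # the "low" representation if b == 0, the "high" one if b == 1.
    assert 0 <= m < 500 and b in (0, 1)
    return m + 500 * b

def decode(n: int) -> tuple[int, int]:
    # Recover the message and the spare bit.
    return n % 500, n // 500

print(encode(132, 0), encode(132, 1))  # the two representations of 132
print(decode(632))
# The total cost: 3 decits minus one bit, i.e. 3 - log10(2) ≈ 2.70 decits,
# which is exactly log10(500).
print(f"{3 - math.log10(2):.4f} decits vs log10(500) = {math.log10(500):.4f}")
```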