Exchange rates between digits

Let’s say that you want to store a lot of data, using physical objects such as coins, dice, or digit wheels. For concreteness, let’s say that you’re going to be given an arbitrary number between one and \(2^\text{3,000,000,000,000}\) (a.k.a. a terabyte of data), and you need to make sure you have enough physical objects to write that number down.

Let’s say that a coin costs 1¢. Because \(n\) coins can be used to encode any one of \(2^n\) different possibilities, you could get the job done with three trillion coins, which would let you encode the number (in binary) for the low low cost of thirty billion dollars. (At the time of writing this, three trillion bits of data will actually run you about $60, but leave that aside for now. Pretend you can’t use modern transistors, and you have to use comparatively large objects such as coins to store data.)

Instead of buying coins, you could also buy digit wheels, and use those instead. Because digit wheels can store more values than coins, it takes fewer digit wheels than coins to do the job. If digit wheels also cost 1¢, you should just buy digit wheels; that will cost a mere $9B. If instead digit wheels cost a million dollars apiece, you should probably buy coins instead. Somewhere in between those two prices, digit wheels switch from being a good deal to being a bad deal. The question is, if coins are worth 1¢, then how much is a digit wheel worth? At what price should you buy wheels instead of coins, and at what price should you buy coins instead of wheels?

At a first glance, it may seem like the fair price is 5¢ per wheel. A coin can store two values (by being placed ‘heads’ or ‘tails’), while a digit wheel can store ten. Ten is five times larger than 2, so perhaps the price of a digit wheel should be five times the price of a coin. However, if digit wheels cost 5¢, you should ignore digit wheels entirely and buy coins. Why? Because for only 4¢, you could buy four coins, which can hold \(2^4=16\) different values. Four coins store a 16-digit, and a 16-digit is worth more than a 10-digit, so you would have no reason to even consider buying digit wheels at a price of 5¢ per wheel.

So what is the fair price for a digit wheel? 4¢ is still too high, because that makes a 10-digit cost the same as a 16-digit, and the 16-digit is always better at that price. What about 3¢? At that price, the answer is much less clear. On the one hand, spending 3¢ on coins gets you the ability to write down only 8 possibilities, while spending 3¢ on wheels lets you write down 10 different possibilities. On the other hand, if you’re trying to store the number 101, you need either 7 coins (7¢, because \(2^6 < 101 < 2^7\)) or 3 wheels (one per digit, because a wheel stores a digit, for a total of 9¢), in which case the coins are better.

Given that digit wheels can store ten values and cost 3¢ each, and coins can store two values and cost 1¢ each:

Question: When storing the number 8000, should you buy coins or wheels?

Wheels. To store 8000, you need 13 coins (because \(2^{12} < 8000 < 2^{13}\)) for a cost of 13¢, but only 4 wheels (because 8000 is 4 digits long) for a cost of 12¢.

Question: When storing the number 15,000, should you buy coins or wheels?

Coins. Storing 15,000 costs 14 coins ($2^{13} < 15,000 < 2^{14}$) or 5 wheels; given that wheels cost 3¢ and coins cost 1¢, the 14 coins are cheaper.

Question: When storing the number 230,000, should you buy coins or wheels?

It doesn’t matter. Storing 230,000 takes either 18 coins or 6 wheels, which costs 18¢ either way.

At 3¢ per digit wheel, whether you should buy coins or wheels depends on which number you want to store.

But once the number you’re storing gets large enough, the digit wheels become a better deal, and stay that way. In this case, the final transition happens when the number you’re storing is 11 digits long or longer. Why? Say that the number \(x\) you want to store is \(n\) digits long in decimal notation.\(n\) digit wheels can always encode more possibilities than \(3n\) coins (because \(10^n > 2^{3n}\) for all \(n\)), but sometimes \(x\) can be written using fewer than \(3n\) coins (as in the case of 15,000). This stops happening once \(n\) is so large than \(10^n\) is \(2^3\) times larger than \(2^{3n}\), at which point no \(n\)-digit number can be encoded using \(2^{3(n-1)}\) or fewer coins. This happens once \(n \ge 11,\) i.e., once \(x\) is at least 11 digits long. (Proof.)

This probably isn’t too surprising: For 3¢ you can either buy 3 coins or one digit wheel. The coins store 8 possibilities, the wheel stores 10, and it’s no surprise that a huge collection of 10-digits is unambiguously better than an equally sized collection of 8-digits.

So digit wheels are worth more than 3x as much as a coin, and less than 4x. How can we find the right price exactly? We could use exactly the same argument as above to check whether 3.5¢ is too high or too low, except that it’s not clear what it would mean to buy “three and a half coins.” We can get around that problem by considering ten-packs of coins vs ten-packs of digit wheels. If wheels cost 3.5¢ and coins cost 1¢, then with 35¢, you could either buy 10 wheels or 35 coins. Which encodes more possibilities? The coins, because \(10^{10} < 2^{35}.\) Thus, by a generalized version of the argument above, for numbers sufficiently large, the coins are a better deal than the wheels — at that price, you should buy wheels.

As you can verify, \(2^{33} < 10^{10} < 2^{34},\) so the fair price must be between 3.3¢ and 3.4¢. And \(2^{332} < 10^{100} < 2^{333},\) so it must be between 3.32¢ and 3.33¢. We could keep going, and keep getting digits of the fair price. And we could keep going forever, because the fair price is irrational. (Why?)

In general, if digit wheels cost \(p\) times as much as coins, the general argument we’ve been using says that if you’re storing a large number and \(2^p > 10\) then you should buy coins, whereas if \(2^p < 10\) then you should buy digit wheels. Thus, the fair price must be the number \(p\) such that \(2^p = 10,\) and its first three digits are 3.32.

In other words, while a digit wheel holds 5 times as many values as a coin, it is only worth about 3.32 times as much as a coin (in terms of its ability to store data). Why? Because it takes about 3.32 coins to emulate a digit-wheel. What does that mean? Well, 3 coins is not enough to emulate a digit wheel (3 give you only 8 possibilities), 4 coins is more than enough (4 give you 16 possibilities). And 33 coins is not enough to emulate 10 digit wheels, but 34 coins are more than enough. And 332 coins are not enough to emulate 100 digit wheels, but 333 coins are more than enough.

A digit wheel isn’t worth 5 times as much as a coin because, when you buy multiple coins, their data storage capacity multiplies, rather than adding. Each new coin lets you say twice as many things, not just two more things. With five coins you can store \(2 \cdot 2 \cdot 2 \cdot 2 \cdot 2 = 32\) different values, not \(2 + 2 + 2 + 2 + 2 = 10.\) When finding the exchange rate between coins and digit wheels, the question is not “How many more values does a digit wheel have written on it?” Rather, the question is “How many coins does it take to emulate a digit wheel?” The answer to that question is “More than 332 coins per hundred digit wheels, less than 333 per hundred.”

The fair price \(p\) of digit wheels (in terms of coins) is the number \(p\) such that \(2^p = 10\) exactly. We have a name for this number, and it’s \(\log_2(10),\) and the argument described above is a simple algorithm for computing logarithms. (Logarithm outputs are “difficult to calculate but useful to have,” in general.)

Thus, we can look at \(\log_b(x)\) as describing the fair price of \(x\)-digits in terms of \(b\)-digits. For example, if coins are worth 1¢, then dice are worth \(\log_2(6) \approx 2.58\) cents, as you can verify by checking that \(2^2 < 6 < 2^3\) and \(2^{25} < 6^{10} < 2^{26}\) and \(2^{258} < 6^{100} < 2^{259}.\) (Exercise: Find the next digit of \(\log_2(6)\) using this method.)

This interpretation is a great intuition pump for what \(\log_b(x)\) “really means” if \(b\) and \(x\) are whole-numbers. Imagine two sets of objects, one (say, weird dice) that can be placed in \(b\) different states, and another (say, weird wheels) that can be placed in \(x\) different states. Imagine you’re storing a really really large number, and imagine that the \(b\)-objects cost 1¢. At what price of \(x\)-objects would you be indifferent between using \(x\)-objects versus \(b\)-objects to store data? In other words, how many \(b\)-objects does it take to “emulate” an \(x\)-object?

Probing this interpretation also explains many of the properties of the logarithm. For example, the fact that \(\log_b(x)\) is the fair price of \(x\)-digits in terms of \(b\)-digits immediately implies that \(\log_x(b) = \frac{1}{\log_b(x)}\), because if an \(x\)-digit is worth three \(b\)-digits then a \(b\)-digit must be worth a third of an \(x\)-digit.

This interpretation still doesn’t shed light onto what logarithms are doing when their inputs are not whole numbers. For example, what’s \(\log_{1.5}(2.5)\)? What the heck would an object that can be placed into “1.5 different states” even be? As we will see shortly, this notion of there being a “natural exchange rate” between digits reveals an interpretation of the logarithm that explains what it’s doing with fractional inputs.