# Log as generalized length

Here are a handful of examples of how the logarithm base 10 behaves. Can you spot the pattern?

\begin{align} \log_{10}(2) &\ \approx 0.30 \\ \log_{10}(7) &\ \approx 0.85 \\ \log_{10}(22) &\ \approx 1.34 \\ \log_{10}(70) &\ \approx 1.85 \\ \log_{10}(139) &\ \approx 2.14 \\ \log_{10}(316) &\ \approx 2.50 \\ \log_{10}(123456) &\ \approx 5.09 \\ \log_{10}(654321) &\ \approx 5.82 \\ \log_{10}(123456789) &\ \approx 8.09 \\ \log_{10}(\underbrace{987654321}_\text{9 digits}) &\ \approx 8.99 \end{align}

Every time the input gets one digit longer, the output goes up by one. In other words, the output of the logarithm is roughly the length, measured in digits, of the input. (Why?)
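
One quick way to check this pattern is to print the digit count of each input next to its base-10 logarithm. The sketch below uses only the standard `math.log10`; the helper name `decimal_length` is an illustrative choice, not something taken from the text.

```python
import math

def decimal_length(n: int) -> int:
    """Number of decimal digits needed to write the positive integer n."""
    return len(str(n))

examples = [2, 7, 22, 70, 139, 316, 123456, 654321, 123456789, 987654321]
for n in examples:
    # For these inputs the log sits just below the digit count:
    # decimal_length(n) - 1 <= log10(n) < decimal_length(n).
    print(f"{n:>9}  digits={decimal_length(n)}  log10={math.log10(n):.2f}")
```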

Why is it the log base 10 (rather than, say, the log base 2) that roughly measures the length of a number? Because numbers are normally represented in decimal notation, where each new digit lets you write down ten times as many numbers. The logarithm base 2 would measure the length of a number if each digit only gave you the ability to write down twice as many numbers. In other words, the log base 2 of a number is roughly the length of that number when it’s represented in binary notation (where $$13$$ is written $$\texttt{1101}$$ and so on):

\begin{align} \log_2(3) = \log_2(\texttt{11}) &\ \approx 1.58 \\ \log_2(7) = \log_2(\texttt{111}) &\ \approx 2.81 \\ \log_2(13) = \log_2(\texttt{1101}) &\ \approx 3.70 \\ \log_2(22) = \log_2(\texttt{10110}) &\ \approx 4.46 \\ \log_2(70) = \log_2(\texttt{1000110}) &\ \approx 6.13 \\ \log_2(139) = \log_2(\texttt{10001011}) &\ \approx 7.12 \\ \log_2(316) = \log_2(\texttt{100111100}) &\ \approx 8.30 \\ \log_2(1000) = \log_2(\underbrace{\texttt{1111101000}}_\text{10 digits}) &\ \approx 9.97 \end{align}

If you aren’t familiar with the idea of numbers represented in other number bases besides 10, and you want to learn more, see the number base tutorial.

Here’s an interactive visualization which shows the link between the length of a number expressed in base $$b$$, and the logarithm base $$b$$ of that number:

As you can see, if $$b$$ is an integer greater than 1, then the logarithm base $$b$$ of $$x$$ is pretty close to the number of digits it takes to write $$x$$ in base $$b.$$
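
The same comparison can be sketched for an arbitrary integer base. The function `length_in_base` below is a hypothetical helper that counts digits by repeated integer division; `math.log(x, b)` is the standard two-argument logarithm.

```python
import math

def length_in_base(x: int, b: int) -> int:
    """Count the digits of the positive integer x when written in base b (b >= 2)."""
    digits = 0
    while x > 0:
        x //= b  # drop the lowest base-b digit
        digits += 1
    return digits

for x, b in [(13, 2), (70, 2), (316, 2), (1000, 2), (316, 7), (316, 10)]:
    # The logarithm lands between (length - 1) and length for these inputs.
    print(f"x={x:>4} base={b:>2} length={length_in_base(x, b):>2} "
          f"log={math.log(x, b):.2f}")
```

Exact powers of the base (such as 1000 in base 10) are left out of the list because floating-point rounding in `math.log(x, b)` can land a hair on either side of the whole number.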

Pretty close, but not exactly. The most obvious difference is that the outputs of logarithms generally have a fractional portion: the logarithm of $$x$$ always falls a little short of the length of $$x.$$ This is because, insofar as logarithms act like the “length” function, they generalize the notion of length, making it continuous.

What does this fractional portion mean? Roughly speaking, logarithms measure not only how long a number is, but also how much that number is really using its digits. 12 and 97 are both two-digit numbers, but intuitively, 12 is “barely” two digits long, whereas 97 is “nearly” three digits. Logarithms formalize this intuition, and tell us that 12 is really only using about 1.08 digits, while 97 is using about 1.99.

Where are these fractions coming from? Also, looking at the examples above, notice that $$\log_{10}(316) \approx 2.5.$$ Why is it 316, rather than 500, that logarithms claim is “2.5 digits long”? What would it even mean for a number to be 2.5 digits long? It very clearly takes 3 digits to write down “316,” namely, ‘3’, ‘1’, and ‘6’. What would it mean for a number to use “half a digit”?

Well, here’s one way to approach the notion of a “partial digit.” Let’s say that you work in a warehouse recording data using digit wheels like they used to have on old desktop computers.

Let’s say that one of your digit wheels is broken, and can’t hold numbers greater than 4: every notch 5-9 has been stripped off, so if you try to set it to a number between 5 and 9, it just slips down to 4. Let’s call the resulting digit a 5-digit, because it can still be stably placed into 5 different states (0-4). We could easily call this 5-digit a “partial 10-digit.”

The question is, how much of a partial 10-digit is it? Is it half a 10-digit, because it can store 5 out of the 10 values that a “full 10-digit” can store? That would be a fine way to measure fractional digits, but it’s not the method used by logarithms. Why? Well, consider a scenario where you have to record lots and lots of numbers on these digits (such that you can tell someone how to read off the right data later), and let’s say also that you have to pay me one dollar for every 10-digit that you use. Now let’s say that I only charge you 50¢ per 5-digit. Then you should do all your work in 5-digits! Why? Because two 5-digits can be used to store 25 different values (00, 01, 02, 03, 04, 10, 11, …, 44) for \$1, which is way more data-stored-per-dollar than you would have gotten by buying a 10-digit.

(Note: You may be wondering, are two 5-digits really worth more than one 10-digit? Sure, you can place them in 25 different configurations, but how do you encode “9” when none of the digits have a “9” symbol written on them? If so, see The symbols don’t matter.)

In other words, there’s a natural exchange rate between $$n$$-digits, and a 5-digit is worth more than half a 10-digit. (The actual price you’d be willing to pay is a bit short of 70¢ per 5-digit, for reasons that we’ll explore shortly.) A 4-digit is also worth a bit more than half a 10-digit (two 4-digits let you store 16 different numbers), and a 3-digit is worth a bit less than half a 10-digit (two 3-digits let you store only 9 different numbers).

We now begin to see what the fractional answer that comes out of a logarithm actually means (and why 300 is closer to 2.5 digits long than 500 is).

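
The counting behind this argument can be checked directly. The sketch below multiplies digit sizes to get the number of storable values, then compares a pair of identical $$n$$-digits against the ten values of a single 10-digit. It also prints $$\log_{10}(n)$$, which matches the prices quoted in the text (just under 70¢ for a 5-digit); why that is the right price is not derived here. The function name `states` is an illustrative assumption.

```python
import math

def states(*digit_sizes: int) -> int:
    """Number of distinct values storable on a row of digits with the given sizes."""
    return math.prod(digit_sizes)

for n in (3, 4, 5):
    pair = states(n, n)
    verdict = "more" if pair > 10 else "fewer"
    # log10(n) is the quoted price of an n-digit, measured in 10-digits.
    print(f"two {n}-digits store {pair} values, {verdict} than one 10-digit; "
          f"log10({n}) = {math.log10(n):.3f}")
```
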
The logarithm base 10 of $$x$$ is not answering “how many 10-digits does it take to store $$x$$?” It’s answering “how many digits-of-various-kinds does it take to store $$x$$, where as many digits as possible are 10-digits; and how big does the final digit have to be?” The fractional portion of the output describes how large the final digit has to be, using this natural exchange rate between digits of different sizes.

For example, the number 200 can be stored using only two 10-digits and one 2-digit. $$\log_{10}(200) \approx 2.301,$$ and a 2-digit is worth about 0.301 10-digits. In fact, a 2-digit is worth exactly $$(\log_{10}(200) - 2)$$ 10-digits. As another example, $$\log_{10}(500) \approx 2.7$$ means “to record 500, you need two 10-digits, and also a digit worth at least $$\approx$$70¢”, i.e., two 10-digits and a 5-digit.

This raises a number of additional questions:

Question: Wait, there is no digit that’s worth 50¢. As you said, a 3-digit is worth less than half a 10-digit (because two 3-digits can only store 9 things), and a 4-digit is worth more than half a 10-digit (because two 4-digits store 16 things). If $$\log_{10}(316) \approx 2.5$$ means “you need two 10-digits and a digit worth at least 50¢,” then why not just have the $$\log_{10}$$ of everything between 301 and 400 be 2.60? They’re all going to need two 10-digits and a 4-digit, aren’t they?

Answer: The natural exchange rates between digits are actually way more interesting than they first appear. If you’re trying to store either “301” or “400”, and you start with two 10-digits, then you have to purchase a 4-digit in both cases. But if you start with a 10-digit and an 8-digit, then the digit you need to buy is different in the two cases. In the “301” case you can still make do with a 4-digit, because the 10, 8, and 4-digits together give you the ability to store any number up to $$10\cdot 8\cdot 4 = 320$$.
But in the “400” case you now need to purchase a 5-digit instead, because the 10, 8, and 4 digits together aren’t enough. The logarithm of a number tells you about every combination of $$n$$-digits that would work to encode the number (and more!). This is an idea that we’ll explore over the next few pages, and it will lead us to a much better understanding of logarithms.

Question: Hold on, where did the 2.60 number come from above? How did you know that a 5-digit costs 70¢? How are you calculating these exchange rates, and what do they mean?

Answer: Good question. In Exchange rates between digits, we’ll explore what the natural exchange rate between digits is, and why.

Question: $$\log_{10}(100)=2,$$ but clearly, 100 is 3 digits long. In fact, $$\log_b(b^k)=k$$ for any integers $$b > 1$$ and $$k \ge 0$$, but $$k+1$$ digits are required to represent $$b^k$$ in base $$b$$ (as a one followed by $$k$$ zeroes). Why is the logarithm making these off-by-one errors?

Answer: Secretly, the logarithm of $$x$$ isn’t answering the question “how hard is it to write $$x$$ down?”, it’s answering something more like “how many digits does it take to record a whole number less than $$x$$?” In other words, the $$\log_{10}$$ of 100 is the number of 10-digits you need to be able to name any one of a hundred numbers, and that’s two digits (which can hold anything from 00 to 99).

Question: Wait, but what about when the input has a fractional portion? How long is the number 100.87? And also, $$\log_{10}(100.87249072)$$ is just a hair higher than 2, but 100.87249072 is way harder to write down than 100. How can you say that their “lengths” are almost the same?

Answer: Great questions! The length interpretation on its own doesn’t shed any light on how logarithm functions handle fractional inputs.
We’ll soon develop a second interpretation of logarithms which does explain the behavior on fractional inputs, but we aren’t there yet.

Meanwhile, note that the question “how hard is it to write down an integer between 0 and $$x$$ using digits?” is very different from the question “how hard is it to write down $$x$$?” For example, 3 is easy to write down using digits, while $$\pi$$ is very difficult to write down using digits. Nevertheless, the log of $$\pi$$ is very close to the log of 3. The concept for “how hard is this number to write down?” goes by the name of complexity; see the Kolmogorov complexity tutorial to learn more on this topic.

Question: Speaking of fractional inputs, if $$0 < x < 1$$ then the logarithm of $$x$$ is negative. How does that square with the length interpretation? What would it even mean for the length of the number $$\frac{1}{10}$$ to be $$-1$$?

Answer: Nice catch! The length interpretation crashes and burns when the inputs are less than one.

The “logarithms measure length” interpretation is imperfect. The connection is still useful to understand, because you already have an intuition for how slowly the length of a number grows as the number gets larger. The “length” interpretation is one of the easiest ways to get a gut-level intuition for what logarithmic growth means.
If someone says “the amount of time it takes to search my database is logarithmic in the number of entries,” you can get a sense for what this means by remembering that logarithmic growth is like how the length of a number grows with the magnitude of that number.

The interpretation doesn’t explain what’s going on when the input is fractional, but it’s still one of the fastest ways to make logarithms start feeling like a natural property of numbers, rather than just some esoteric function that “inverts exponentials.”

Length is the quick-and-dirty intuition behind logarithms. For example, I don’t know what the logarithm base 10 of 2,310,426 is, but I know it’s between 6 and 7, because 2,310,426 is seven digits long.

$$\underbrace{\text{2,310,426}}_\text{7 digits}$$

In fact, I can also tell you that $$\log_{10}(\text{2,310,426})$$ is between 6.30 and 6.48. How? Well, I know it takes six 10-digits to get up to 1,000,000, and then we need something more than a 2-digit and less than a 3-digit to get to a number between 2 and 3 million. The natural exchange rates for 2-digits and 3-digits (in terms of 10-digits) are 30¢ and 48¢ respectively, so the cost of 2,310,426 in terms of 10-digits is between \$6.30 and \$6.48.
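
The bracketing estimate for 2,310,426 can be reproduced in a few lines. This is a minimal sketch, assuming the prices of a 2-digit and a 3-digit are $$\log_{10}(2)$$ and $$\log_{10}(3)$$ as quoted above; the variable names are illustrative.

```python
import math

n = 2_310_426
whole_digits = len(str(n)) - 1       # six full 10-digits reach 1,000,000
leading = n // 10**whole_digits      # leading decimal digit, here 2

# The final partial digit costs more than a 2-digit and less than a 3-digit.
lower = whole_digits + math.log10(leading)
upper = whole_digits + math.log10(leading + 1)
print(f"{lower:.2f} <= log10({n}) = {math.log10(n):.4f} < {upper:.2f}")
```

The bounds come only from the digit count and the leading digit, so this level of precision is available without computing the logarithm of the full number.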

Next up, we’ll be exploring this idea of an exchange rate between different types of digits, and building an even better interpretation of logarithms which helps us understand what they’re doing on fractional inputs (and why).


• What’s $$n$$ exactly?

• I think you may need to spell out this 10 times as many numbers part. This is a large unexplained step in explaining why the log is the length.

• This is slightly confusing, because it’s the first digit that’s a 2.

• I might write this as, “whereas, when multiplying 1 by 10 to get to x, you might have to multiply by 10 a fractional number of times (if x is not a power of 10), so the log base 10 of x can include a fractional part while the number of digits in the base 10 representation of x is always a whole number.”

Rationale: in the previous sentence you’re comparing the number of digits needed to write x to the number of times to multiply 1 by 10. So when the next sentence starts with, “the only difference is…” I’m expecting it to be comparing numbers of digits and numbers of times to multiply. I can figure out that you’ve switched to talking about “computing logs” because logs count the number of times to multiply by 10, but it feels like one extra step of mental effort.

(This is a less confident suggestion than the amount of text I’ve used suggests.)

• How about, “because I’m going to need six 10-digits to get up to a million, and something more than a 2-digit and less than a 3-digit to get from there to a number between 2 and 3 million.”

I’m not sure that would be the right way to say it, but I still feel like the current text is problematic, because:

1) Whether you say last digit, or seventh digit, in either case I’m reading right-to-left and my first thought is that you’re talking about the ones place.

2) Even if you said something like left-most digit, that wouldn’t be right, because it’s not that 2 is between 2 and 3, it’s that the value of the whole number is greater than $$2 \cdot 10^6$$ and less than $$3 \cdot 10^6$$.

I think you’re referring to a digit in an abstract sense that doesn’t directly map to the digits we write down, so you may have to go out of your way to avoid confusing nth digit with a particular one of the numerals that are written above.

• Is this paragraph needed? I find myself wanting to skip past it.

• You use an example of “99” then switch to “97”.