The characteristic of the logarithm

Consider the interpretation of logarithms as the cost of communicating a message. Every time the number of possible messages to send doubles, your communication costs increase by the price of a coin, or whatever cheaper storage medium you have that can communicate one of two messages. It doesn’t matter whether the number of possible messages goes from 4 to 8 or whether it goes from 4096 to 8192; in both cases, your costs go up by the price of a coin. It is the factor by which the set grew (or shrank) that affects the cost; not the absolute number of messages added (or removed) from the space of possibilities. If the space of possible messages halves, your costs go down by one coin, regardless of how many possibilities there were before the halving.

Algebraically, writing \(f\) for the function that measures your costs, \(c(x \cdot 2) =\) \(c(x) + c(2),\) and, in general, \(c(x \cdot y) =\) \(c(x) + c(y),\) where we can interpret \(x\) as the number of possible messages before the increase, \(y\) as the factor by which the possibilities increased, and \(x \cdot y\) as the number of possibilities after the increase.

This is the key characteristic of the logarithm: It says that, when the input goes up by a factor of \(y\), the quantity measured goes up by a fixed amount (that depends on \(y\)). When you see this pattern, you can bet that \(c\) is a logarithm function. Thus, whenever something you care about goes up by a fixed amount every time something else doubles, you can measure the thing you care about by taking the logarithm of the growing thing. For example:

Consider the problem of checking whether a date is contained in a gigantic sorted list of dates. You can do this by jumping to the middle of the list, seeing whether your date is earlier or later than the date in the middle, and thereby cutting the search space in half. Each time you do this, you cut the list of dates you’re searching for in half, and so the total number of elements you need to look at goes up by one every time the size of the list doubles. Thus, the cost of searching an ordered list grows logarithmically in the size of the list. See also binary search.
Consider a colony of bacteria where each bacterium in the colony reproduces once per day. Thus, the size of the colony roughly doubles each day. If you care about how long this colony of bacteria has been growing, you can measure the days by taking the logarithm of the number of bacteria in the colony. The logarithm (base 2) counts how many times the colony has doubled (and the log base 3 counts how many times it has tripled, and so on).
The length of a number in Decimal notation grows more-or-less logarithmically in the magnitude of the number: When the magnitude of the number goes up by a factor of 10, the number of digits it takes to write the number down grows by 1. However, this analogy is not perfect: Sometimes, multiplying a number by two does not increase its length (consider the number 300), and sometimes, dividing a number by 10 does not decrease its length by one digit (consider the number 1). See also Length isn’t quite logarithmic.

Conversely, whenever you see a \(\log_2\) in an equation, you can deduce that someone wants to measure some sort of thing by counting the number of doublings that another sort of thing has undergone. For example, let’s say you see an equation where someone takes the \(\log_2\) of a relative likelihood. What should you make of this? Well, you should conclude that there is some quantity that someone wants to measure which can be measured in terms of the number of doublings in that likelihood ratio. And indeed there is! It is known as (Bayesian) evidence, and the key idea is that the strength of evidence for a hypothesis \(A\) over its negation \(\lnot A\) can be measured in terms of \(2 : 1\) updates in favor of \(A\) over \(\lnot A\). (For more on this idea, see What is evidence?).

In fact, a given function \(f\) such that \(f(x \cdot y) = f(x) + f(y)\) is almost guaranteed to be a logarithm function — modulo a few technicalities.

This puts us in a position where you can derive all the main properties of the logarithm (such as \(\log_b(x^n) = n \log_b(x)\) for any \(b\)) yourself. Check this box if that’s something you’re interested in doing. y: path: [4bz n: ]

Conditional text depending what’s next on the path.