Properties of the logarithm
With a solid interpretation of logarithms under our belt, we are now in a position to look at the basic properties of the logarithm and understand what they are saying. The defining characteristic of logarithm functions is that they are real-valued functions \(f\) such that
Property 1: Multiplying inputs adds outputs.
\(f(x \cdot y) = f(x) + f(y)\) for all \(x, y \in \mathbb R^{>0}.\)
Property 2: 1 is mapped to 0.
\(f(1) = 0.\)
This says that the amount the output changes if the input grows by a factor of 1 is zero — i.e., the output does not change if the input changes by a factor of 1. This is obvious, as “the input changed by a factor of 1” means “the input did not change.”
Exercise: Prove (2) from (1).
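One possible route, sketched as a worked equation. It uses only (1), applied with both inputs equal to \(1\):

```latex
\begin{align*}
f(1) &= f(1 \cdot 1) \\
     &= f(1) + f(1) && \text{by (1)} \\
\implies f(1) &= 0 && \text{subtracting } f(1) \text{ from both sides}
\end{align*}
```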
Property 3: Reciprocating the input negates the output.
\(f\left(\frac{1}{x}\right) = -f(x).\)
This says that the way that growing the input by a factor of \(x\) changes the output is exactly the opposite from the way that shrinking the input by a factor of \(x\) changes the output. In terms of the “communication cost” interpretation, if doubling (or tripling, or \(n\)-times-ing) the possibility space increases costs by \(c\), then halving (or thirding, or \(n\)-parts-ing) the space decreases costs by \(c.\)
Exercise: Prove (3) from (2) and (1).
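A sketch of one way to do this, writing \(1\) as \(x \cdot \frac{1}{x}\) and then applying (2) and (1):

```latex
\begin{align*}
0 &= f(1) && \text{by (2)} \\
  &= f\left(x \cdot \tfrac{1}{x}\right) \\
  &= f(x) + f\left(\tfrac{1}{x}\right) && \text{by (1)} \\
\implies f\left(\tfrac{1}{x}\right) &= -f(x)
\end{align*}
```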
Property 4: Dividing inputs subtracts outputs.
\(f\left(\frac{x}{y}\right) = f(x) - f(y).\)
This follows immediately from (1) and (3).
Exercise: Give an interpretation of (4).
\(f\left(x \cdot \frac{1}{y}\right) = f(x) - f(y),\) i.e., shrinking the input by a factor of \(y\) is the opposite of growing the input by a factor of \(y.\)
\(f\left(z \cdot \frac{x}{y}\right) = f(z) + f(x) - f(y),\) i.e., growing the input by a factor of \(\frac{x}{y}\) affects the output just like growing the input by \(x\) and then shrinking it by \(y.\)
Try translating these into the communication cost interpretation if it is not clear why they’re true.
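For readers who prefer a numeric spot check, here is a minimal sketch in Python. It assumes the base-2 logarithm `math.log2` as a stand-in for \(f\); the helper name `satisfies_basic_properties` and the sample inputs are illustrative and not part of the original text.

```python
import math

def satisfies_basic_properties(f, x, y, tol=1e-9):
    """Check properties (1)-(4) for f at the positive inputs x and y.

    Hypothetical helper for illustration; a numeric spot check, not a proof.
    """
    def close(a, b):
        return math.isclose(a, b, rel_tol=tol, abs_tol=tol)

    return (
        close(f(x * y), f(x) + f(y))      # (1) multiplying inputs adds outputs
        and close(f(1.0), 0.0)            # (2) 1 is mapped to 0
        and close(f(1.0 / x), -f(x))      # (3) reciprocating negates the output
        and close(f(x / y), f(x) - f(y))  # (4) dividing inputs subtracts outputs
    )

print(satisfies_basic_properties(math.log2, 6.0, 3.0))  # log2 passes all four checks
print(satisfies_basic_properties(math.sqrt, 6.0, 3.0))  # sqrt fails already at (1)
```

The square root is included only as a contrast: it is a perfectly nice function, but multiplying its inputs multiplies its outputs rather than adding them.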
Property 5: Exponentiating the input multiplies the output.
\(f\left(x^n\right) = n \cdot f(x).\)
This says that multiplying the input by \(x\) a total of \(n\) times incurs \(n\) identical changes to the output. In terms of the communication cost metaphor, this is saying that you can emulate an \(x^n\)-digit using \(n\) different \(x\)-digits.
Exercise: Prove (5).
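For \(n \in \mathbb N,\) one possible argument for the claim \(f(x^n) = n \cdot f(x)\) is induction on \(n\) using only (1). A sketch:

```latex
\begin{align*}
f\left(x^1\right) &= f(x) = 1 \cdot f(x) && \text{base case} \\
f\left(x^{n+1}\right) &= f\left(x^n \cdot x\right) \\
  &= f\left(x^n\right) + f(x) && \text{by (1)} \\
  &= n \cdot f(x) + f(x) && \text{induction hypothesis} \\
  &= (n+1) \cdot f(x)
\end{align*}
```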
For \(n \in \mathbb Q,\) this is a bit more difficult; we leave it as an exercise to the reader. Hint: Use the proof of (6) below, for \(n \in \mathbb N,\) to bootstrap up to the case where \(n \in \mathbb Q.\)
For \(n \in \mathbb R,\) this is actually not provable from (1) alone; we need an additional assumption (such as continuity) on \(f\).
Property 5 is actually false, in full generality — it’s possible to create a function \(f\) that obeys (1), and obeys (5) for \(n \in \mathbb Q,\) but which exhibits pathological behavior on irrational numbers. For more on this, see pathological near-logarithms.
This is the first place that property (1) fails us: (5) is true for \(n \in \mathbb Q,\) but if we want to guarantee that it’s true for \(n \in \mathbb R,\) we need \(f\) to be continuous, i.e., we need to ensure that if \(f\) follows (5) on the rationals, it cannot behave pathologically on the irrationals.
Property 6: Rooting the input divides the output.
\(f\left(\sqrt[n]{x}\right) = \frac{f(x)}{n}.\)
This says that, to change the output one \(n\)th as much as you would if you multiplied the input by \(x\), multiply the input by the \(n\)th root of \(x\). See Fractional digits for a physical interpretation of this fact.
Exercise: Prove (6).
As with (5), (6) is always true if \(n \in \mathbb Q,\) but not necessarily always true if \(n \in \mathbb R.\) To prove (6) in full generality, we additionally require that \(f\) be continuous.
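As a numeric illustration of (5) and (6) with rational \(n,\) here is a short Python sketch. It assumes the base-10 logarithm as a stand-in for a continuous \(f\) satisfying (1); the helper names `power_rule_holds` and `root_rule_holds` and the sample values are hypothetical, and floating-point comparison makes this a spot check rather than a proof.

```python
import math
from fractions import Fraction

def f(x):
    """Stand-in for a continuous function satisfying (1): the base-10 logarithm."""
    return math.log10(x)

def power_rule_holds(x, n, tol=1e-9):
    """Check (5): f(x^n) = n * f(x), for a rational exponent n."""
    n = float(n)
    return math.isclose(f(x ** n), n * f(x), rel_tol=tol, abs_tol=tol)

def root_rule_holds(x, n, tol=1e-9):
    """Check (6): f(x^(1/n)) = f(x) / n, for a nonzero rational n."""
    n = float(n)
    return math.isclose(f(x ** (1.0 / n)), f(x) / n, rel_tol=tol, abs_tol=tol)

# Natural and rational choices of n, applied to an arbitrary positive input.
for n in (2, 5, Fraction(3, 7), Fraction(22, 7)):
    assert power_rule_holds(12.5, n)
    assert root_rule_holds(12.5, n)
```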
Property 7: The function is either trivial, or sends some input to 1.
Either \(f(x) = 0\) for all \(x,\) or there exists some \(b \neq 1\) such that \(f(b) = 1.\)
This says that either \(f\) is very boring (and does nothing regardless of its inputs), or there is some particular factor \(b\) such that when the input changes by a factor of \(b\), the output changes by exactly \(1\). In the communication cost interpretation, this says that if you’re measuring communication costs, you’ve got to pick some unit (such as \(b\)-digits) with which to measure.
Exercise: Prove (7).
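If \(f\) is non-trivial, there is some \(x\) with \(f(x) = y \neq 0.\) By (6) with \(n = y\) (which, for irrational \(y,\) needs the continuity assumption), \(f\left(\sqrt[y]{x}\right) = \frac{f(x)}{y} = 1,\) so \(b\) is \(\sqrt[y]{x}.\) We know that \(b \neq 1\) because \(f(b) = 1\) whereas, by (2), \(f(1) = 0\).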
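The construction of \(b\) can be tried out numerically. The Python sketch below assumes an example function \(f(x) = 3 \ln x\) (the factor 3 is arbitrary) and a hypothetical helper `unit_base`; neither comes from the original text.

```python
import math

def f(x):
    """Example non-trivial function satisfying (1); the factor 3 is arbitrary."""
    return 3.0 * math.log(x)

def unit_base(f, x):
    """Return b = x^(1/y), where y = f(x), so that f(b) = 1."""
    y = f(x)
    if y == 0:
        raise ValueError("f(x) is 0; pick an x with f(x) != 0")
    return x ** (1.0 / y)

b = unit_base(f, 5.0)
print(math.isclose(f(b), 1.0))  # True: growing the input by a factor of b adds 1
```

Any starting input with \(f(x) \neq 0\) should lead to the same \(b\) here, up to rounding, since this \(f\) is strictly increasing and so takes the value \(1\) at only one point.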
Property 8: If the function is continuous, it is either trivial or a logarithm.
\(f(b) = 1 \implies f\left(b^x\right) = x.\)
This property follows immediately from (5): if \(f(b) = 1,\) then \(f(b^x) = x \cdot f(b) = x.\) Thus, (8) is always true if \(x\) is rational, and if \(f\) is continuous then it’s also true when \(x\) is irrational.
Property (8) states that if \(f\) is non-trivial, then it inverts exponentials with base \(b.\) In other words, \(f\) counts the number of \(b\)-factors in \(x\). In other words, \(f\) counts how many times you need to multiply \(1\) by \(b\) to get \(x\). In other words, \(f = \log_b\)!
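The "counting \(b\)-factors" reading can be made concrete for the easy case where \(x\) is an exact integer power of \(b.\) The Python sketch below uses a hypothetical helper `count_factors` and compares its result against the standard library's `math.log`:

```python
import math

def count_factors(x, b):
    """Count how many times 1 must be multiplied by b to reach x.

    Only handles integers b >= 2 and x >= 1 where x is an exact power of b.
    """
    if b < 2 or x < 1:
        raise ValueError("expected b >= 2 and x >= 1")
    value, count = 1, 0
    while value < x:
        value *= b
        count += 1
    if value != x:
        raise ValueError("x is not an exact power of b")
    return count

print(count_factors(1000, 10))  # 3
print(count_factors(32, 2))     # 5
print(math.isclose(math.log(1000, 10), count_factors(1000, 10)))  # True
```

The comparison uses `math.isclose` rather than `==` because the floating-point logarithm may be off by a rounding error.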
Many texts take (8) to be the defining characteristic of the logarithm. As we just demonstrated, one can also define logarithms by (1) as continuous non-trivial functions whose outputs grow by a constant (that depends on \(y\)) whenever their inputs grow by a factor of \(y\). All other properties of the logarithm follow from that.
If you want to remove the “continuous” qualifier, you’re still fine as long as you stick to rational inputs. If you want to remove the “non-trivial” qualifier, you can interpret the function \(z\) that sends everything to zero as \(\log_\infty\). Allowing \(\log_\infty\) and restricting ourselves to rational inputs, every function \(f\) that satisfies equation (1) is isomorphic to a logarithm function.
In other words, if you find a function whose output changes by a constant (that depends on \(y\)) whenever its input grows by a factor of \(y\), there is basically only one way it can behave. Furthermore, that function only has one degree of freedom — the choice of \(b\) such that \(f(b)=1.\) As we will see next, even that degree of freedom is rather paltry: All logarithm functions behave in essentially the same way. As such, if we find any \(f\) such that \(f(x \cdot y) = f(x) + f(y)\) (or any physical process well-modeled by such an \(f\)), then we immediately know quite a bit about how \(f\) behaves.
I’m having trouble parsing interpretation #1 -- which part is supposed to map onto the right hand side of equation (4)?
output?
May need to build the intuition that knowing how f(x) behaves tells me how f(c*x) is different from f(c).
(You’re using the language of “growing the input,” but I just see a static input called x.)
(8) doesn’t follow from (5). The assumption in (5) was that \(n\) ranged over naturals, not reals. In fact, (1) only implies (8) if you also require the function to be continuous.
(1) essentially says \(f\) is a homomorphism from \((\mathbb{R}^{>0},\cdot)\) to \((\mathbb{R},+)\). To generate a function satisfying (1) but not (8), we need only compose \(\log\) (choose a base) with an automorphism of the additive group and show that the composition is not a multiple of a logarithm. We can get such an automorphism by considering \(\mathbb{R}\) as an infinite-dimensional vector space over the rationals and, for example, swapping two dimensions.
(5) was intended to assume that \(n \in \mathbb R^{\ge 1},\) or possibly \(\in \mathbb R^{\ge 0}\) if you want an easy way to prove (6). In that case, how does (8) not follow from (5)? (If \(f(x^y)=yf(x)\) in general, then \(f(b^n)=nf(b)\) and \(f(b)=1 \implies f(b^n)=n,\) unless I’m missing something.)
The proof of (5) only goes through for \(n\in\mathbb{N}\).
You can prove a version of (8) from (5), namely, \(f(b)=1\Rightarrow f(b^q)=q\) for \(q\in\mathbb{Q}\), but this doesn’t pin down \(f\) completely, unless you include a continuity condition.
tl;dr: I did some reading on related topics, and it turns out that (1) may be sufficient to define logarithms if we take as an axiom that every set is Lebesgue measurable (which is incompatible with the axiom of choice). Otherwise, we need to add an additional condition to (1).
(1) states that \(f(x\cdot y)=f(x)+f(y)\). Given a function \(g\) satisfying this condition, we can generate an additional function satisfying this condition by composing \(g\) with a function \(h\), where \(h(x+y)=h(x)+h(y)\):
\(h(g(x \cdot y)) = h(g(x) + g(y)) = h(g(x)) + h(g(y)).\)
\(h\), as defined, is a solution to Cauchy’s functional equation. The functions given by \(h(x)=cx\) for some constant \(c\) are always solutions, giving the usual logarithm family. The existence of other solutions is independent of ZF. When they do exist they are always pathological and generate non-Lebesgue measurable sets (for more, see this stackexchange link).
We can prove the existence of such solutions in ZFC by noting that the solutions of the Cauchy functional equation are exactly the homomorphisms from the additive group of \(\mathbb{R}\) to itself. The real numbers form an infinite-dimensional vector space over the field \(\mathbb{Q}\). Linear transformations from the vector space to itself translate into homomorphisms from the group to itself. Since the axiom of choice implies that any vector space has a basis, we can, for example, find a non-trivial linear transformation by swapping two basis vectors. This in turn induces a homomorphism from the group to itself. (The Wikipedia page gives the general form of a solution to this functional equation; the solutions turn out to be all the linear transformations on the vector space.)
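The basis-swapping idea can be illustrated on a small piece of \(\mathbb{R}\) without any appeal to choice. The Python sketch below is only an assumption-laden toy: it restricts attention to numbers of the form \(a + b\sqrt{2}\) with \(a, b \in \mathbb{Q},\) stores each one as the pair \((a, b),\) and swaps the two basis vectors \(1\) and \(\sqrt{2}.\) The names `add`, `h`, and `to_float` are hypothetical. Extending such a map to all of \(\mathbb{R}\) is the step that needs a full basis.

```python
from fractions import Fraction
import math

# An element a + b*sqrt(2), with a and b rational, is stored as the pair (a, b).
def add(u, v):
    return (u[0] + v[0], u[1] + v[1])

def h(u):
    """Swap the two basis vectors: h(a + b*sqrt(2)) = b + a*sqrt(2)."""
    return (u[1], u[0])

def to_float(u):
    return float(u[0]) + float(u[1]) * math.sqrt(2)

u = (Fraction(1, 3), Fraction(5, 2))
v = (Fraction(-4), Fraction(7, 9))

# h is additive, exactly (rational arithmetic, no rounding).
assert h(add(u, v)) == add(h(u), h(v))

# But h is not multiplication by a constant: the ratio h(x)/x differs between inputs.
one, root2 = (Fraction(1), Fraction(0)), (Fraction(0), Fraction(1))
ratio_at_one = to_float(h(one)) / to_float(one)        # sqrt(2) / 1
ratio_at_root2 = to_float(h(root2)) / to_float(root2)  # 1 / sqrt(2)
assert not math.isclose(ratio_at_one, ratio_at_root2)
```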
(I’m not saying that this article should discuss axiomatizations of set theory, but it doesn’t seem good to make statements that are only true if you assume, e.g., an unusual alternative to the axiom of choice.)
Wikipedia proves that the pathological solutions must all be dense in \(\mathbb{R}\), so to exclude them, we can adopt any number of conditions. Wikipedia points at “\(f\) is continuous”, “\(f\) is monotonic on any interval”, and “\(f\) is bounded on any interval”. Continuity seems to be most intuitive; once we have defined the value of the function on the rationals (which we can do with basically the arguments already on this page), the rest of its values are determined.
How are these changes? (starting at prop 5, through the end)