Life in logspace

The log lattice hints at the reason that engineers, scientists, and AI researchers find logarithms so useful.

At this point, you may have a pretty good intuition for what logarithms are doing, and you may be saying to yourself:

Ok, I understand logarithms, and I see how they could be a pretty useful concept sometimes. Especially if, like, someone happens to have an out-of-control bacteria colony lying around. I can also see how they’re useful for measuring representations of things (like the length of a number when you write it down, or the cost of communicating a message). If I squint, I can sort of see why it is that the brain encodes many percepts logarithmically (such that the tone that I perceive is the \(\log_2\) of the frequency of vibration in the air), as it’s not too surprising that evolution occasionally hit upon this natural tool for representing things. I guess logs are pretty neat.

Then you start seeing logarithms in other strange places — say, you learn that AlphaGo primarily manipulated logarithms of the probability that the human would take a particular action instead of manipulating the probabilities directly. Then maybe you learn that scientists, engineers, and navigators of yore used to pay large sums of money for giant, pre-computed tables of logarithm outputs.

At that point, you might say:

Hey, wait, why do these logs keep showing up all over the place? Why are they that ubiquitous? What’s going on here?

The reason is that addition is easier than multiplication. This is true both for navigators of yore (who had to do lots of calculations to keep their ships on course) and for modern AI algorithms.

Let’s say you have a whole bunch of numbers, and you know you’re going to have to perform lots and lots of operations on them that involve taking some numbers you already have and multiplying them together. There are two ways you could do this. One is to bite the bullet and multiply them all, which gets pretty tiresome (if you’re doing it by hand) and/or pretty expensive (if you’re training a gigantic neural network on millions of games of Go). Alternatively, you could put all your numbers through a log function (a one-time cost), then perform all your operations using addition (much cheaper!), and then transform the results back using exponentiation when you’re done (another one-time cost).
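The three steps of the second approach (take logs, add, exponentiate) can be sketched in a few lines of Python. This is only an illustration: the function name `product_via_logs` and the sample numbers are made up for the example, and it assumes every input is strictly positive, since the log of zero or of a negative number is not a real number.

```python
import math

def product_via_logs(values):
    """Multiply strictly positive numbers by adding their logarithms.

    Hypothetical helper for illustration only.
    """
    # Step 1 (one-time cost): move every number into log space.
    logs = [math.log(v) for v in values]
    # Step 2: the chain of multiplications becomes a chain of additions.
    log_total = sum(logs)
    # Step 3 (one-time cost): move the answer back to normal space.
    return math.exp(log_total)

numbers = [0.5, 0.25, 0.125]
via_logs = product_via_logs(numbers)
direct = math.prod(numbers)  # ordinary multiplication, for comparison

# The two answers agree up to floating-point rounding.
print(via_logs, direct, math.isclose(via_logs, direct))
```

In a real workload the numbers would usually stay in log space across many operations, so steps 1 and 3 are paid only once each.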

Empirically, the second method tends to be faster, cheaper, and more convenient (at the cost of some precision, given that most log outputs are transcendental and so have to be rounded before they can be written down or stored).
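To get a feel for the precision cost, here is a rough sketch that imitates a printed log table by rounding each base-10 logarithm to four decimal places. The helper names `table_log10` and `multiply_with_table`, and the choice of four digits, are assumptions made for this example rather than a description of any particular historical table.

```python
import math

def table_log10(x, digits=4):
    """Stand-in for a printed log table: log10(x) rounded to `digits` places."""
    return round(math.log10(x), digits)

def multiply_with_table(a, b, digits=4):
    """Multiply two positive numbers by adding their rounded logs."""
    log_sum = table_log10(a, digits) + table_log10(b, digits)
    return 10 ** log_sum  # convert back out of log space

approx = multiply_with_table(2, 3)
print(approx)           # close to 6, but not exactly 6
print(abs(approx - 6))  # the error introduced by rounding the logs
```

Asking for more digits (for example `multiply_with_table(2, 3, digits=10)`) shrinks the error, which is one reason more detailed tables were worth more.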

This is the last insight about logarithms contained in this tutorial, and it is a piece of practical advice: If you have to perform a large chain of multiplications, you can often do it more cheaply by first transforming your problem into “log space”, where multiplication acts like addition (and exponentiation acts like multiplication). Then you can perform a bunch of additions instead, and transform the answer back into normal space at the end.
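Written out as identities, with \(b\) standing for whichever base you choose (a symbol introduced here for illustration) and \(x\) and \(y\) positive, the two correspondences are:

\[ \log_b(x \cdot y) = \log_b(x) + \log_b(y), \qquad \log_b(x^n) = n \cdot \log_b(x) \]

For example, in base 2: \(\log_2(8 \cdot 4) = \log_2(8) + \log_2(4) = 3 + 2 = 5\), and \(2^5 = 32\). Likewise \(\log_2(8^2) = 2 \cdot \log_2(8) = 2 \cdot 3 = 6\), and \(2^6 = 64\).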