In ZFC without Replacement, the number of functions from an $N$-element set to an $M$-element set is $M \times (M - 1) \times \ldots \times (M - (N - 1))$.

# The Fundamental Theorem of Telegraphy

Applying insights from category theory, we may decompose each telegram essentially uniquely as an epigram followed by a monogram.

# The "Meta-formula" for $1^n + 2^n + 3^n + ... + x^n$

You’ve probably seen formulas such as $1 + 2 + 3 + \ldots + x = \frac{x^2}{2} + \frac{x}{2}$, $1^2 + 2^2 + 3^2 + \ldots + x^2 = \frac{x^3}{3} + \frac{x^2}{2} + \frac{x}{6}$, and so on, and wondered where these come from, and whether the same thing can be done for sums of powers of any degree.

Well, the answer (er, to the second question) is a resounding YES! Here is a general technique that will quite readily, quite easily give you the formula for the sum of consecutive values of any polynomial, as another polynomial. Indeed, here is a “meta-formula” for the desired formula.

Actually, I will explain this same underlying idea twice, in somewhat different ways. Once a little less abstractly, and once a little more abstractly. You may read whichever of these tickles your fancy best; whichever gives you the clearest understanding.

**The less abstract presentation:**

Suppose you already have a nice formula for $R(x) = 1^n + 2^n + 3^n + \ldots + x^n$ (in other words, $R(0) = 0$ and $R(x) - R(x - 1) = x^n$). How can we find a similar formula for $(n + 1)$th powers instead of $n$th powers?

Well, $x^{n + 1}$ is what we get if we integrate $x^n$ (with an initial value of $0$) and multiply by $n + 1$. And if we do this to $R$, so that $Q’ = (n + 1)R$, this $Q$ will ALMOST give us exactly what we need: we will then have that the derivative of $Q(x) - Q(x - 1)$ is $(n + 1)(R(x) - R(x - 1)) = (n + 1) x^n$, which is the derivative of $x^{n + 1}$; it follows that $Q(x) - Q(x - 1)$ and $x^{n + 1}$ differ by at most a constant.

But we want them to match exactly! So to cancel out this constant, we can add an extra linear term to $Q$; as we add $cx$ to $Q$, we end up adding $c$ to $Q(x) - Q(x - 1)$, and so by using the appropriate $c$, we can get the exact equality we desire.

So that’s it, then. To get a formula for $1^{n + 1} + 2^{n + 1} + 3^{n + 1} + \ldots + x^{n + 1}$, we take a formula for $1^n + 2^n + 3^n + \ldots + x^n$, integrate it (with an initial value of $0$) and multiply by $n + 1$, then add $cx$ where the constant $c$ is chosen to ensure that we get the right values.

Thus, since we know that the formula for the sum of the first $x$ many $0$th powers $1 + 1 + 1 + \ldots$ is $x$, we find that the formula to sum the first $x$ many $1$st powers $1 + 2 + 3 + \ldots + x$ is $\frac{x^2}{2} + cx$. What should we choose $c$ to be? We should choose it to give a total of $1$ when $x = 1$; thus, $c = \frac{1}{2}$, and our formula when $n = 1$ is $\frac{x^2}{2} + \frac{x}{2}$.

Next, to get the formula for $1^2 + 2^2 + 3^2 + \ldots + x^2$, we integrate, multiply by $2$, and add an unknown linear term, to get $\frac{x^3}{3} + \frac{x^2}{2} + cx$. Again, we should choose $c$ so that the total comes out to $1$ when $x = 1$; thus, $c$ now must be $\frac{1}{6}$.

Next, to get the formula for $1^3 + 2^3 + 3^3 + \ldots + x^3$, we do this all again, integrating, multiplying by $3$, and adding an unknown linear term, to get $\frac{x^4}{4} + \frac{x^3}{2} + \frac{x^2}{4} + cx$. In this case, to get the total to come out to $1$ when $x = 1$, we take $c = 0$.
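
In fact, this recipe is entirely mechanical, so we can hand it to a computer. Here is a minimal sketch in Python, using exact rational arithmetic via the standard `fractions` module (the function names here are my own, purely for illustration):

```python
from fractions import Fraction

def sum_of_powers_formula(n):
    """Coefficients [c_1, ..., c_{n+1}] of the polynomial
    c_1 x + c_2 x^2 + ... + c_{n+1} x^{n+1} equal to 1^n + 2^n + ... + x^n."""
    coeffs = [Fraction(1)]  # degree 0: the sum of x many 1s is just x
    for m in range(1, n + 1):
        # integrate (with initial value 0) and multiply by m ...
        new = [Fraction(0)] + [m * c / (i + 2) for i, c in enumerate(coeffs)]
        # ... then choose the linear term so the formula gives 1 at x = 1
        new[0] = 1 - sum(new)
        coeffs = new
    return coeffs

def evaluate(coeffs, x):
    return sum(c * x ** (i + 1) for i, c in enumerate(coeffs))
```

For example, `sum_of_powers_formula(2)` comes out to the coefficient list $[\frac{1}{6}, \frac{1}{2}, \frac{1}{3}]$, matching the formula $\frac{x^3}{3} + \frac{x^2}{2} + \frac{x}{6}$ above.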

And so on and so on; continuing in this way, we get the formulas for any degree we like.

**The more abstract presentation:**

Consider the linear “forward difference” operator which sends the polynomial $P(x)$ to the polynomial $P(x + 1) - P(x)$. Call this operator $\Delta$. Given a polynomial $P$, our goal is to find some polynomial $Q$ such that $\Delta Q = P$. Then $Q(a) - Q(b)$ will be $P(b) + P(b + 1) + P(b + 2) + \ldots + P(a - 1)$, which will allow us to easily solve any problem of this sort. (We could just as well think about “backward differences” or any such thing, but my own idiosyncratic preference is to have to keep track of fewer minus signs…)

But how do we find such a $Q$, given $P$?

Well, let’s consider two other linear operators of note on polynomials, closely related to this difference operator. First of all, there’s “differentiation” (in the sense of taking the derivative), whose name already suggests its closeness; this is the familiar operator from calculus which sends $x^n$ to $n x^{n - 1}$. We’ll call this operator $\delta$. Secondly, there’s “integration” in the sense which sends each $x^n$ to $\frac{x^{n + 1}}{n + 1}$. We’ll call this operator $\int$. Note that integrating followed by differentiating is the same as doing nothing at all (i.e., $\delta \int P = P$).

Let’s make more explicit the sense in which differentiation and forward difference are related. In particular, let’s note that we have, by Taylor expansion, that $P(x + 1) = P(x) + \frac{\delta^1 P (x)}{1!} + \frac{\delta^2 P(x)}{2!} + \frac{\delta^3 P(x)}{3!} + \ldots = e^{\delta} P(x)$. That is, $\Delta P(x) = P(x + 1) - P(x) = (e^{\delta} - 1) P(x)$, which is to say, $\Delta = e^{\delta} - 1$. Conversely, we must have $\delta = \ln(1 + \Delta)$, which, by Taylor series expansion again, comes out to $\Delta - \frac{\Delta^2}{2} + \frac{\Delta^3}{3} - \frac{\Delta^4}{4} + \ldots$.

(For what it’s worth, though these are phrased as infinite series here, when applied to any particular polynomial, only finitely many terms are nonzero, as $\delta$ or $\Delta$ applied more than $n$ times to a polynomial of degree $n$ results in zero. So there are no convergence issues which can arise, and all the results which we know to hold for Taylor series as abstract formal objects (e.g., that $e^x - 1$ and $\ln(1 + x)$ act as inverse functions) will necessarily work just as well for these series expansions in terms of our operators as applied to particular polynomials.)

Let’s see how this helps us with our goal. We wanted to find a $Q$ such that $\Delta Q = P$. But we can rewrite this as $\Delta Q = \delta\int P$, which is to say, $\Delta Q = \left( \Delta - \frac{\Delta^2}{2} + \frac{\Delta^3}{3} - \frac{\Delta^4}{4} + \ldots \right) \int P$.

Now it’s easy enough to solve. Factoring a $\Delta$ out, we see that it suffices to take $Q = \left(1 - \frac{\Delta}{2} + \frac{\Delta^2}{3} - \frac{\Delta^3}{4} + \ldots \right) \int P = \frac{\ln(1 + \Delta)}{\Delta} \int P$. And as noted above, this is a finitary calculation for any particular $P$.

Actually, because $\delta^n$ is easier to compute for polynomials in standard representation than $\Delta^n$ (since $\delta$ takes monomials to monomials), it is often more convenient to rewrite $\frac{\ln(1 + \Delta)}{\Delta}$ as $\frac{\delta}{e^{\delta} - 1} = 1 - \frac{\delta}{2} + \frac{\delta^2}{12} - \frac{\delta^4}{720} + \ldots$. The Taylor series expansion here, as for any such symbolic expression, can be evaluated to any desired length by straightforward calculus, although there are cleverer ways as well. [The coefficients so produced happen to have been studied in their own right, in the theory of the closely related “Bernoulli numbers”, in case you care to read up more on these.]

So we have that $Q = \frac{\delta}{e^{\delta} - 1} \int P \; = \left(1 - \frac{\delta}{2} + \frac{\delta^2}{12} - \frac{\delta^4}{720} + \ldots\right) \int P $ $= \left(\int - \frac{1}{2} + \frac{\delta}{12} - \frac{\delta^3}{720} + \ldots\right) P$. And, remember, we can use this to calculate $P(b) + P(b + 1) + P(b + 2) + \ldots + P(a - 1)$ as $Q(a) - Q(b)$.

This completes our general technique. Let’s apply it to, for example, the particular question of a formula for $1^2 + 2^2 + 3^2 + \ldots + x^2$.

Suppose $P(x) = x^2$. Applying $\left(\int - \frac{1}{2} + \frac{\delta}{12} - \frac{\delta^3}{720} + \ldots\right)$ to this, we get $\frac{x^3}{3} - \frac{x^2}{2} + \frac{x}{6} + 0 + 0 + 0 + \ldots$.

Taking this to be $Q(x)$, we can now compute $1^2 + 2^2 + 3^2 + \ldots + x^2$ as $Q(x + 1) - Q(1)$. Actually, in the terms we’ve set things up in, it’ll be slightly simpler to think of this as $0^2 + 1^2 + 2^2 + \ldots + (x - 1)^2 + x^2 = Q(x) - Q(0) + x^2$, which is $\frac{x^3}{3} + \frac{x^2}{2} + \frac{x}{6}$. Ta-da! This is our final formula.

But the great value of the discussion in this post is that we do not need to apply new cleverness to solving this type of problem again for each new polynomial or degree of powers to be summed. We can mechanically produce the answers for each, simply applying $\frac{\delta}{e^{\delta} - 1} \int = \left(\int - \frac{1}{2} + \frac{\delta}{12} - \frac{\delta^3}{720} + \ldots\right)$ to the given polynomial.
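
For instance, here is a sketch of that mechanical procedure in Python (names mine, purely for illustration), computing the coefficients $\frac{B_k}{k!}$ of $\frac{\delta}{e^{\delta} - 1}$ via the standard Bernoulli-number recursion and applying the resulting operator to a polynomial given as a coefficient list:

```python
from fractions import Fraction
from math import comb

def bernoulli(m):
    """Bernoulli numbers B_0..B_m (with B_1 = -1/2), via the standard
    recursion sum over j <= k of C(k+1, j) B_j = 0."""
    B = [Fraction(1)]
    for k in range(1, m + 1):
        B.append(-sum(comb(k + 1, j) * B[j] for j in range(k)) / (k + 1))
    return B

def deriv(p):
    """The operator delta on coefficient lists (p[i] = coefficient of x^i)."""
    return [i * c for i, c in enumerate(p)][1:] or [Fraction(0)]

def integ(p):
    """The integration operator, with initial value 0."""
    return [Fraction(0)] + [Fraction(c) / (i + 1) for i, c in enumerate(p)]

def antidifference(p):
    """Q with Q(x+1) - Q(x) = P(x), as Q = (delta / (e^delta - 1)) integ P
    = sum over k of (B_k / k!) delta^k integ P."""
    B = bernoulli(len(p))
    term, q, fact = integ(p), [Fraction(0)] * (len(p) + 1), 1
    for k, b in enumerate(B):
        if k > 0:
            term, fact = deriv(term), fact * k
        for i, c in enumerate(term):
            q[i] += b * Fraction(c) / fact
    return q
```

Here `antidifference([0, 0, 1])` (i.e., $P(x) = x^2$) returns the coefficient list of $\frac{x}{6} - \frac{x^2}{2} + \frac{x^3}{3}$, matching the worked example above. (The constant term of $Q$ is irrelevant, since only differences $Q(a) - Q(b)$ matter.)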

This technique is general enough that it even works for some functions which aren’t polynomials, producing answers as convergent infinite series, which then can be used to interpolate/extrapolate the corresponding “sum consecutive values” function to new contexts beyond where this sum would ordinarily be interpretable. For example, these kinds of ideas can be used to generalize the factorial to non-integer arguments (by generalizing the logarithmic factorial, which is a sum of consecutive logarithms; of course, we’ve seen already how to generalize the factorial anyway), and to extend the Riemann and Hurwitz zeta functions to arbitrary inputs (as with every topic I mention in passing, I promise I will write more on this in a later post...). It’s a very valuable idea to be aware of.

# Elementary Prime Counting

Consider the question of how many primes are $\leq n$; let us call this quantity $\pi(n)$, as is traditional.

Understanding $\pi(n)$ is not just a random curiosity, but in fact ubiquitously useful for answering other questions in number theory, as the primes entirely determine the multiplicative structure of the integers. This is by virtue of the Fundamental Theorem of Arithmetic, which tells us that each positive integer has a unique prime factorization. The existence of these factorizations is easy to see (simply keep splitting an integer up into any possible smaller factors, and splitting those in turn, until nothing can be split any more). The uniqueness of these factorizations is a subtler, trickier point (it’s not nearly as obvious as people often naively take it to be!). I will write more about the Fundamental Theorem of Arithmetic in a separate post, but at any rate, take my word for it, it makes understanding the distribution of the prime numbers important! So let's see what we can quickly determine about $\pi(n)$.

The very first most basic question is, does $\pi(n)$ eventually grow larger than any finite value? That is, are there infinitely many primes?

Well, this was answered already for us in ancient times, by Euclid. Suppose we are given any finitely many primes. Let's see if we can find some other prime not among them. Well, consider taking the product of all those primes, and then adding $1$. The result is some integer $> 1$ which is indivisible by each of those primes. However, any integer $> 1$ has SOME prime factor (by the “keep splitting it up into smaller factors till you can’t anymore” method noted above; that is, by the existence half of the Fundamental Theorem of Arithmetic). Thus, the prime factors of this integer, of which at least one exists, are primes not among our original finitely many. We've shown there must be infinitely many primes.

This is often viewed as a merely qualitative argument, but we can consider it quantitatively as well.

Euclid's argument tells us that each next prime is at most $1 +$ the product of all the previous primes. Better yet, making no essential difference to the argument, we see in the same way that each next prime is at most $-1 +$ the product of all the previous primes, after the first two small primes $2$ and $3$. For convenience, let's not bother worrying about the $-1$, and just note that (after the first two small primes $2$ and $3$) each prime is at most the product of all the previous primes. This means the sequence of primes grows, at fastest, like the sequence starting with $2$ and $3$ in which each further term is the product of all the previous ones. This sequence is $2, 3, 6^{2^0}, 6^{2^1}, 6^{2^2}, 6^{2^3}, ...$, and thus we can conclude that $\pi(n) \geq \lfloor \log_{2}(\log_{6}(n)) \rfloor + 3$.

This gives us a loose lower-bound on the growth rate of $\pi(n)$. But we can do much better with a different simple argument, due to Chebyshev (this argument will also establish the infinitude of the primes as a consequence, in an entirely different manner to Euclid, with a much more impressive lower-bound on the growth rate of $\pi(n)$).

Our approach is as follows: we will find a positive integer-valued function $f(n)$ with the property that each of its prime power factors is $\leq n$. As every positive integer is the product of the prime powers in its prime factorization (again, this is the existence half of the Fundamental Theorem of Arithmetic), we can conclude that $f(n) \leq n^{\pi(n)}$, which is to say, that $\pi(n)$ is at least $\log_{n}(f(n))$. If furthermore $f(n)$ grows super-polynomially, this tells us that $\pi(n)$ grows without bound.

Specifically, we will take $f(n)$ to be $\frac{\lfloor n \rfloor!}{\lfloor n/2 \rfloor !^2}$. This certainly grows super-polynomially (expanding out the factorials, there are about $n$ factors in both numerator and denominator, with the former about twice as large as the latter, so this is approximately $2^n$; at any rate, it is at least as large as $\binom{\lfloor n \rfloor}{\lfloor n/2 \rfloor}$, which is in turn at least as large as $2^{\lfloor n \rfloor}/(\lfloor n \rfloor + 1)$), so what remains is to see that each prime power dividing $f(n)$ is at most $n$.

In order to establish this, let's first note that the exponent of each $p$ in the prime factorization of $\lfloor n \rfloor!$ is the sum of $\lfloor n/p^e \rfloor$ over each positive integer $e$. (This is because the exponent of $p$ in the prime factorization of any positive integer $m$ is the same as the number of $p^e$ which divide $m$. The exponent in the prime factorization of $\lfloor n \rfloor!$ will then be the sum of that quantity over all $m \leq n$, which is to say, the number of pairs $(m, p^e)$ where $p^e$ divides $m$ which in turn is $\leq n$. For each $p^e$, there will be $\lfloor n/p^e \rfloor$ many choices of $m$, giving the stated result.)

Let's also note that $\lfloor x \rfloor - 2\lfloor x/2 \rfloor$ is the remainder when $\lfloor x \rfloor$ is divided by $2$, which is to say, $0$ or $1$ according as to whether $\lfloor x \rfloor$ is even or odd, respectively.

Putting those together, we find that the exponent of $p$ in the prime factorization of $f(n)$ is the number of $\lfloor n/p^e \rfloor$ which are odd. This is upper-bounded by the number of $\lfloor n/p^e \rfloor$ which are $\geq 1$, which amounts to the largest $e$ such that $p^e \leq n$. Thus, every prime power factor of $f(n)$ is $\leq n$, completing the proof.
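
As a numerical sanity check, here is a quick Python sketch (function names mine) comparing $\pi(n)$ against both lower bounds derived above: the Euclid-style $\lfloor \log_{2}(\log_{6}(n)) \rfloor + 3$ and the Chebyshev-style $\log_n(f(n))$:

```python
from math import factorial, floor, log

def prime_count(n):
    """pi(n), by simple trial division (fine for small n)."""
    def is_prime(m):
        return m > 1 and all(m % d for d in range(2, int(m ** 0.5) + 1))
    return sum(is_prime(m) for m in range(2, n + 1))

def euclid_bound(n):
    """The Euclid-style lower bound floor(log2(log6(n))) + 3."""
    return floor(log(log(n, 6), 2)) + 3

def chebyshev_bound(n):
    """The Chebyshev-style lower bound log_n(f(n)), with f(n) = n! / (n//2)!^2."""
    f = factorial(n) // factorial(n // 2) ** 2
    return log(f) / log(n)
```

At $n = 1000$, for instance, $\pi(n) = 168$; the Euclid-style bound only promises $4$ primes, while the Chebyshev-style bound promises about $100$, illustrating how much stronger the second argument is.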

How good is our bound $\pi(n) \geq \log_n(f(n))$? It's actually not too hard to show that the ratio of $\pi(n)$ to $\log_n(f(n))$ is also upper-bounded by a constant (perhaps I'll add that to this post later). So it's quite good! That having been said, we can go even further: the Prime Number Theorem says that this ratio approaches precisely $\frac{1}{\ln(2)}$ as $n$ grows large [though it is usually phrased in more generally relevant terms than that].

And should anyone ever prove the Riemann Hypothesis, that would give us even tighter bounds on $\pi(n)$. (What a wonderful story that would make, to crack the most ancient chestnut of prime infinitude with the most advanced sledgehammer of the Riemann Hypothesis-turned-Theorem! It'd be the greatest saga of mathematics, spanning all the way from what's considered its most classic proof to what's considered its most sought-after one.)

# The irrationality of $\pi^2$, and therefore of $\pi$

There are many “different” proofs of the irrationality of $\pi$ which are all based on the same underlying idea. The proof I describe in this post is also based on the same underlying idea as all the other ones you'll find in the wild, but in a different presentation, one I personally find clearest for helping me understand what is going on and in what directions it generalizes. Incidentally, this simple proof shows not only the irrationality of $\pi$ but also the irrationality of $\pi^2$.

Let $f(x) = \cos(\sqrt{x})$, and, as is conventional, let $f’(x)$ denote its first derivative and more generally $f^{(N)}(x)$ denote its $N$-th derivative (with respect to $x$).

Note that $f^{(N)} = P(y)f + Q(y)f'$, where $y = \frac{1}{4x}$ and $P$ and $Q$ are integer-coefficient polynomials (depending on $N$) whose degrees grow at most linearly in $N$.

[This follows inductively from $f^{(2)}$ having this form, which is just the fact that $\cos'' = -\cos$, translated to this reparametrization; note also that $\frac{dy}{dx} = -4y^2$, so that differentiating a polynomial in $y$ yields another polynomial in $y$.]
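
In more detail: writing $f^{(N)} = P_N(y) f + Q_N(y) f'$, the reparametrized form of $\cos'' = -\cos$ reads $f'' = -y f - 2y f'$, and from $y = \frac{1}{4x}$ we get $\frac{dy}{dx} = -4y^2$. Differentiating once then gives the recurrence:

```latex
f^{(N+1)} = \underbrace{\left(-4y^2 P_N'(y) - y\, Q_N(y)\right)}_{P_{N+1}(y)} f
          + \underbrace{\left(P_N(y) - 2y\, Q_N(y) - 4y^2 Q_N'(y)\right)}_{Q_{N+1}(y)} f',
```

which visibly preserves integer coefficients and increases the degrees by at most one per step.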

But, for any given $x$, we have that $\left|f^{(N)}(x)\right| \sim \left|f^{(N)}(0)\right| = \frac{N!}{(2N)!}$ as $N$ grows large.

[This can be seen from the easy Taylor series expansion of $f^{(N)}(x)$; the $=$ is immediate, and the $\sim$ follows from each term dominating the next as $N$ grows large (if you're worried about commuting limits here, it's justified by dominated convergence).]

Given this super-exponential rate of decay, it cannot be that both $y$ is rational and $f$ and $f'$ are in rational ratio.

[If $y$ were rational with lowest-terms denominator $d$, then the smallest nonzero value integer polynomials of degree $O(N)$ in $y$ could produce is $\frac{1}{d^{O(N)}}$, which is a merely exponential decay rate; any fixed linear combination of such polynomials with weights in rational ratio (i.e., integer multiples of some common weight) would also therefore have a merely exponential decay rate.]

In particular, plug in $\sqrt{x} = \frac{\pi}{2}$, at which $f$ vanishes, to conclude $y = \frac{1}{\pi^2}$ is irrational, which is to say, $\pi^2$ is irrational.

More generally, this establishes that wherever $\frac{\tan(z)}{z}$ is rational or infinite, $\frac{1}{z^2}$ is irrational, which is to say for nonzero such $z$ that $z^2$ is irrational. (This includes, for what it's worth, the case of complex $z$; indeed, the whole thing might as well have been phrased in terms of $\cosh$, but people find $\cos$ more familiar.)

Of course, the corollary people talk about most is that thus $\pi$ itself is irrational.

As it happens, we know even more than this; we know that $\pi$ is not merely the square root of an irrational, but is in fact a transcendental number (that is, it is not a root of ANY nonzero polynomial with rational coefficients). The proof of that, and many further transcendentality results, is actually along similar lines as seen here, but worked out to greater extent and sophistication. I shall hope to digest and write that up to my satisfaction as well, some later time.

# Another Simple Geometric Proof of the Wallis Product

The $n$-dimensional unit sphere (in the indexing in which the Earth is a 3-dimensional sphere) has an inner volume which is its surface area divided by $n$, by considering each tiny patch of its surface as the base of a figure tapering down to its center. At the same time, by the argument that projecting two dimensions outwards to the enveloping "cylinder" is area-preserving (stretching latitudinally and longitudinally in precise cancellation, as in the Lambert map projection), we find that the unit $n$-sphere's surface area is $\tau$ times the unit $(n - 2)$-sphere's volume.

[Footnote: Here, whenever I say “(inner) volume”, I mean in the sense appropriate to the number of dimensions, and similarly for “surface area”; thus, for example the “surface area” of a circle is its perimeter, and the “volume” of a circle is its area]

Putting these last two facts together, the unit $n$-sphere's volume is $\frac{\tau}{n}$ times the unit $(n - 2)$-sphere's volume ($\tau$ here being the circumference of a unit circle).

Letting $V(n)$ be the unit $n$-sphere's volume, and letting $D(n) = \frac{V(n + 1)}{V(n)}$, we get that, for even $n$, $\frac{2}{3} \times \frac{4}{5} \times \frac{6}{7} \times \ldots \times \frac{n}{n + 1}$ comes out to $\frac{V(n + 1)/V(1)}{V(n) / V(0)} = \frac{V(0)}{V(1)} D(n) = \frac{1}{2} D(n)$. And of course $\frac{2}{1} \times \frac{4}{3} \times \frac{6}{5} \times \ldots$, with as many factors, is just this same thing times $n + 1$. Putting these together, the Wallis product up through $\frac{n}{n + 1} \times \frac{n}{n - 1}$ is $\frac{n + 1}{4} D(n)^2$.

But considering the $(n + 1)$-sphere's volume in terms of its cross-sectional $n$-spheres, we also have that $D(n)$ is twice the average value of $f(x)^n$ as $x$ runs from $0$ to $1$, where $f(x) = \sqrt{1 - x^2}$ is the cross-sectional radius at height $x$ above the center of a unit sphere. Thus $D(n)$ is a decreasing function of $n$. And our recurrence $V(n) = \frac{\tau}{n} \times V(n - 2)$ is equivalently the recurrence $D(n - 2) \times D(n - 1) = \frac{\tau}{n}$. As a monotonic sequence satisfying this recurrence, we get that $D(n)^2 \sim \frac{\tau}{n}$ as $n$ goes to infinity (since $D(n)^2$ is in between $D(n - 1) \times D(n)$ and $D(n) \times D(n + 1)$, and our recurrence tells us each of these is $\sim \frac{\tau}{n}$).

Combining the last two paragraphs, we get that the Wallis product in toto comes out to $\frac{\tau}{4}$ (i.e., $\frac{\pi}{2}$). Ta-da!
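
For the skeptical, the pieces here are easy to check numerically. A small Python sketch (names mine), building the volumes $V(n)$ from the recurrence and comparing against the partial Wallis products:

```python
from math import pi

tau = 2 * pi  # the circumference of a unit circle

def unit_sphere_volumes(n_max):
    """V[n] = the unit n-sphere's volume, via the recurrence V(n) = (tau/n) V(n-2);
    V(0) = 1 (a single point), V(1) = 2 (the interval [-1, 1])."""
    V = [1.0, 2.0]
    for n in range(2, n_max + 1):
        V.append(tau / n * V[n - 2])
    return V

def wallis_partial(n):
    """The Wallis product up through (n/(n+1)) * (n/(n-1)), for even n."""
    prod = 1.0
    for k in range(2, n + 1, 2):
        prod *= k * k / ((k - 1) * (k + 1))
    return prod
```

One can then confirm that `wallis_partial(n)` agrees with $\frac{n + 1}{4} D(n)^2$, and that it creeps up toward $\frac{\tau}{4} \approx 1.5708$ as $n$ grows.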

This is actually related to Wallis's original proof (which was done not in terms of spheres but in terms of some messy integrals; even messier perhaps by virtue of being worked out before the general integral calculus had been developed!), but that is as this seen through a glass very darkly. I would be very curious to see if anyone has written on seeing the Wallis product this way in terms of sphere volumes before.

# The Generalized Factorial

Suppose you wanted to extend the factorial function to arbitrary arguments. How might you do it?

Well, of course, there are a million ways to do it. (Where "a million" = "infinitely many"). You could say the factorial function is the normal thing at natural numbers, and $\sqrt{7}$ everywhere else. This wouldn't be a very useful extension, but it technically qualifies.

What would make a more useful extension, then? Well, we want an extension that preserves the key properties of the factorial function. For example, that $\frac{x!}{(x - 1)!} = x$.

This still isn't enough to pin down an extension, though. There are still infinitely many extensions of that sort. (You could pick any values you want for input fractions between 0 and 1, for example, and then extend these using the recurrence to corresponding values elsewhere.)

But there's another interesting property of the factorial function: $\frac{x!}{(x - r)!}$ is the number of ways to pick a sequence of $r$ items from $x$ choices, with no repetition. This is similar to, albeit less than, the number of ways to do it if you allow repetition, $x^r$. And as $x$ gets larger and larger, the probability of repetition gets negligible; we find that the ratio between $\frac{x!}{(x - r)!}$ and $x^r$ approaches 1 as $x$ grows large while $r$ is held fixed. (For that matter, the same thing happens to the ratio between $x - r$ and $x$).

In other words, if the difference between $a$ and $b$ is held fixed while their individual values grow large, then the ratio between $\frac{a!}{b!}$ and $b^{a - b}$ approaches 1.

This is a very useful property. If we continue to demand this for our extension, on top of the basic $0! = 1$ and $\frac{x!}{(x - 1)!} = x$, we *will* pin down a unique function, like so:

$x! = \frac{x!}{(x + N)!} \times \frac{(x + N)!}{N!} \times N!$.

If $N$ is a natural number, then $\frac{x!}{(x + N)!}$ and $N!$ are easy to calculate as rising products; combining these two factors produces $\frac{1}{x + 1} \times \frac{2}{x + 2} \times ... \times \frac{N}{x + N}$.

Furthermore, our newest demand is that the middle factor, $\frac{(x + N)!}{N!}$, become replaceable with $N^x$ as $N$ grows large.

Thus, we have that $x!$ is the limit, as the natural number $N$ grows large, of $N^x \times \frac{1}{x + 1} \times \frac{2}{x + 2} \times ... \times \frac{N}{x + N}$. And this definition makes sense for all kinds of $x$, not just natural numbers.

(When $x$ is a negative integer, there will be a division by zero, but for all other $x$, this limit will be well-defined. Put another way, the reciprocal of $x!$ is (just reciprocating all the factors from before) the limit as $N$ grows large of $N^{-x} \times \left(\frac{x}{1} + 1 \right) \times \left(\frac{x}{2} + 1\right) \times \dots \times \left(\frac{x}{N} + 1\right)$, and this does indeed converge to a finite limit for every finite $x$, though that value is $0$ at negative integers.)

This defines the usual extension of the factorial function to arbitrary inputs (fractions, complex numbers, even matrices, and so on). As we demonstrated, this is the unique way to do so while satisfying our key properties. As it turns out, other definitions accomplish the same effect (and therefore are equal to this one); for example, the famous integral $\displaystyle \int_{t = 0}^{\infty} t^x e^{-t} \; dt$ (which I will discuss more later!). But this product formula falls right out of our key properties, making it often superior to that famous integral for thinking about the generalized factorial.
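
As a sanity check, here is this product formula sketched in Python, with a finite cutoff $N$ standing in for the limit (names mine), compared against `math.gamma` (recalling that $x! = \Gamma(x + 1)$):

```python
from math import gamma

def gen_factorial(x, N=10 ** 5):
    """Approximate x! by the limit formula N^x * prod_{k=1}^{N} k / (x + k),
    truncated at a finite N."""
    prod = float(N) ** x
    for k in range(1, N + 1):
        prod *= k / (x + k)
    return prod
```

For instance, `gen_factorial(0.5)` comes out near $0.8862$, which is $\frac{\sqrt{\pi}}{2}$, agreeing with `gamma(1.5)`.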

One last note: the so-called Gamma ($\Gamma$) function is just this extension of the factorial function, shifted over by one. The shifting over by one is of no importance at all. It's just a stupid historical convention. So don't worry about it. All that actually matters is the argument above, constructing and establishing the uniqueness of a suitable interpretation of factorial for general (non-integer) inputs.

**Connection to the sine function**

There is also a wonderful connection between the generalized factorial and the sine function. As we just saw, we have this product representation for the generalized factorial:

$\frac{1}{x!} = \displaystyle \lim_{N \to \infty} N^{-x} \prod_{k = 1}^{N} \left(1 + \frac{x}{k} \right)$.

And as we saw in a previous video and post, we also have a similar product representation for the sine function:

$\frac{\sin(x \pi)}{x \pi} = \displaystyle \prod_{k = 1}^{\infty} \left(1 + \frac{x}{k} \right) \left(1 + \frac{x}{-k} \right)$

There are only two differences between these: in the former, we have a factor of $N^{-x}$ around to make our limit converge to a finite value, which we don’t need in the latter. And in the former, we only go through factors corresponding to positive $k$, while in the latter, we’ve bundled these together with factors corresponding to negative $k$.

But now consider what happens when we multiply $\frac{1}{x!}$ by $\frac{1}{(-x)!}$. The $N^{-x}$ and $N^x$ factors cancel out, and the rest of the factors pair up, each $\left(1 + \frac{x}{k}\right)$ factor from $\frac{1}{x!}$ bundling together with a corresponding $\left(1 + \frac{x}{-k}\right)$ factor from $\frac{1}{(-x)!}$. We get precisely the product that equals $\frac{\sin(x \pi)}{x \pi}$.

And so we can conclude, $\frac{1}{x!} \times \frac{1}{(-x)!} = \frac{\sin(x \pi)}{x \pi}$, or, put another way, $x! (-x)! = \frac{x \pi}{\sin(x \pi)}$. This is often called “the reflection formula”, and also leads to related “reflection formulas” for other important functions in mathematics, which I may have occasion to describe here eventually. But I’ll leave that as just a teaser of the future for now…
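
A quick numerical spot-check of the reflection formula, using `math.gamma` and $x! = \Gamma(x + 1)$:

```python
from math import gamma, pi, sin

# Spot-check x! * (-x)! == x*pi / sin(x*pi) at an arbitrary non-integer point,
# writing the two factorials as gamma(1 + x) and gamma(1 - x).
x = 0.3
lhs = gamma(1 + x) * gamma(1 - x)  # x! * (-x)!
rhs = x * pi / sin(x * pi)
```

The two sides agree to machine precision, as they do at any non-integer $x$.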

# Solving the Basel Problem with Calc 101

The Basel problem can be solved by simple integration!

Recall that the Basel problem is to determine the value of $\frac{1}{1^2} + \frac{1}{2^2} + \frac{1}{3^2} + \ldots$.

Consider $f(x) = \ldots + e^{-3x} + e^{-2x} + e^{-1x} + e^{0x} + e^{1x} + e^{2x} + e^{3x} + \ldots$. By double integrating $f$, we get $h(x) = \ldots + \frac{e^{-3x}}{3^2} + \frac{e^{-2x}}{2^2} + \frac{e^{-1x}}{1^2} + \frac{x^2}{2} + \frac{e^{1x}}{1^2} + \frac{e^{2x}}{2^2} + \frac{e^{3x}}{3^2} + \ldots + C_1x + C_2 $. If we can figure out what $h$ amounts to with $C_1$ and $C_2$ chosen to be zero, then we can readily solve the Basel problem, as the Basel sum then amounts to simply $h(0)/2$.

Let us go through this in detail. First, observe that $f(x)$, which is to say $\ldots + e^{-3x} + e^{-2x} + e^{-1x} + e^{0x} + e^{1x} + e^{2x} + e^{3x} + \ldots$, is unchanged by multiplication by $e^x$; thus, wherever $e^x \neq 1$, we must have $f(x) = 0$, at least in some sense.

(When $e^x$ *does* equal $1$, on the other hand, $f(x)$ is an infinite sum of $1$s; this can be made sense of, with $f$ turning out to be interpretable as a kind of integrable entity slightly more general than familiar finite-valued functions, but we won't need to worry about this for now.

For those who care pedantically about formal technicalities, I should also note that our $f(x) = 0$ result doesn't hold in the traditional formalization of series convergence, as $f(x)$ is not a convergent series (its terms do not approach $0$, for example), but this equation still holds in a good enough sense for our purposes; for example, $f$ is "Cesàro summable", a slight extension of traditional summation-by-convergence which agrees with traditional summation where it is well-defined, and which suffices to formally carry out our argument, should we care to be formal.)

Next, consider integrating $f$ once, to obtain the particular antiderivative $g(x) = \ldots + \frac{e^{-3x}}{-3} + \frac{e^{-2x}}{-2} + \frac{e^{-1x}}{-1} + x + \frac{e^{1x}}{1} + \frac{e^{2x}}{2} + \frac{e^{3x}}{3} + \ldots$ [note the middle $x$ term, slightly different from all the rest].

On any range where $e^x \neq 1$, we must have that $g(x)$ is constant (as its derivative is $f(x) = 0$). In particular, we must have that $g(x)$ is constant as $x$ ranges between $0$ and $2 \pi i$, the range to which we will restrict all attention from hereon out. But what particular constant is $g(x)$ on this range? Well, the answer is the same as the average value of $g(x)$ on this range.

But the average value of $e^{nx}$ over this range for nonzero $n$ is zero (as that average value must be invariant under multiplication by any value of $e^{nx}$ inside the range, and some (indeed, most!) of those values will differ from $1$; essentially, $e^{nx}$ spins around a circle $n$ times, whose average value is the origin, zero). So in computing the average value of $g(x)$ on our range, all terms vanish except for the average value of $x$, which will be $\pi i$ (halfway between the range's two endpoints).

Thus, $g(x) = \pi i$ for $x$ between $0$ and $2 \pi i$. (The same reasoning also tells us that $g(x) = - \pi i$ for $x$ between $0$ and $-2 \pi i$, for what it's worth, and more generally, that $g(x)$ steps up by $2 \pi i$ whenever $x$ passes a multiple of $2 \pi i$)

Integrating again, we get the particular antiderivative $h(x) = \ldots + \frac{e^{-3x}}{3^2} + \frac{e^{-2x}}{2^2} + \frac{e^{-1x}}{1^2} + \frac{x^2}{2} + \frac{e^{1x}}{1^2} + \frac{e^{2x}}{2^2} + \frac{e^{3x}}{3^2} + \ldots$. This must equal $\pi i x + C$ for $x$ between $0$ and $2 \pi i$.

Again, to determine the value of this constant $C$, we can consider the question of the average value of $h(x)$ throughout our range of interest, which again reduces to the average value of the one non-exponential term $\frac{x^2}{2}$. By the "power rule" of calculus, the average value of $\frac{x^2}{2}$ between $0$ and $2 \pi i$ is $\frac{(2 \pi i)^2/3}{2} = -\frac{2}{3} \pi^2$. This must therefore also equal the average value of $\pi i x + C$ on our range, which is $(\pi i)^2 + C = - \pi^2 + C$. Thus, we get $C = \pi^2 - \frac{2}{3} \pi^2 = \frac{1}{3} \pi^2$.

And therefore $h(0) = \frac{1}{3} \pi^2$. And since, as we noted before, the Basel sum comes out to $h(0)/2$, we find that the Basel sum is $\frac{1}{6} \pi^2$. Ta-da!

What's more, we can carry on integrating in the same way, determining a polynomial expression in $x$ for $\ldots + \frac{e^{-3x}}{3^n} + \frac{e^{-2x}}{2^n} + \frac{e^{-1x}}{1^n} + \frac{x^n}{n!} + \frac{e^{1x}}{1^n} + \frac{e^{2x}}{2^n} + \frac{e^{3x}}{3^n} + \ldots$ for each natural number $n$, and by evaluating this at $x = 0$, determining the value of $\ldots + \frac{1}{(-3)^n} + \frac{1}{(-2)^n} + \frac{1}{(-1)^n} + 0 + \frac{1}{1^n} + \frac{1}{2^n} + \frac{1}{3^n} + \ldots$. This will come out to zero for odd $n$ (the negatively indexed and positively indexed terms cancelling each other out as negated mirror images), but for even $n$, we will get interesting results (which we can divide by two, in just the same fashion as we did here, to restrict attention to just the positively indexed terms of this summation).

For example, in just the same way, with two more integrations, we find that $\frac{1}{1^4} + \frac{1}{2^4} + \frac{1}{3^4} + \ldots = \frac{1}{90} \pi^4$. And with yet two more integrations, we find that $\frac{1}{1^6} + \frac{1}{2^6} + \frac{1}{3^6} + \ldots = \frac{1}{945} \pi^6$. And in general, for each (positive) even $n$, we find that $\frac{1}{1^n} + \frac{1}{2^n} + \frac{1}{3^n} + \ldots$ is some rational coefficient times $\pi^n$ (with a simple recursive computation for that rational coefficient). Ta-da again! Infinitely many "Ta-da!"-worthy results, all at once!
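If you'd like to see these closed forms corroborated numerically, here is a quick Python sketch comparing partial sums against them (`zeta_partial` is just our own throwaway helper):

```python
import math

def zeta_partial(n, terms=200_000):
    # Partial sum of 1/1^n + 1/2^n + 1/3^n + ...
    return sum(1 / k ** n for k in range(1, terms + 1))

# The three closed forms derived above.
assert abs(zeta_partial(2) - math.pi ** 2 / 6) < 1e-4
assert abs(zeta_partial(4) - math.pi ** 4 / 90) < 1e-8
assert abs(zeta_partial(6) - math.pi ** 6 / 945) < 1e-8
```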

# The Wallis Product and Sine Product (supplement to video)

**Commuting limits using Dominated Convergence:**

This kind of interchange of limits and infinitary arithmetic isn't actually always valid for arbitrary sequences: it often holds, but sometimes fails. Luckily, mathematicians have spent a lot of time thinking about these phenomena, and have developed tools for quickly recognizing conditions under which such an interchange works. In this case, a particular standard result known as “Dominated Convergence” quickly assures us that we are indeed allowed to use this sleight-of-hand here. Let’s see the details of how to use it in this argument:

Although Dominated Convergence is normally phrased entirely in terms of addition, we will use a form of it tailored to multiplication: specifically, the Dominated Convergence result we will use states that if you have a multiplicative series whose $k$-th factor at time $N$ is $S_k(N)$, and you want to show that $\displaystyle \lim_{N \to \infty} \prod_{k} S_k(N) = \prod_{k} \lim_{N \to \infty} S_k(N)$, it suffices to find an additive series of positive terms $B_k$ such that $\sum_{k} B_k$ converges, and such that $|S_k(N) - 1| \leq B_k$ for each $k$ regardless of $N$. (The “for each $k$” can even be relaxed to “for all but finitely many $k$”).

To apply this to our situation, recall that at the time when we have $N$ lighthouses, the contribution from the $k$-th lighthouse to the Sailor : Keeper distance-product ratio is $\mathrm{Chord}((k - f)/N) : \mathrm{Chord}(k/N)$, if the $k$-th lighthouse is present. And, presuming $N$ is always odd for convenience, we can take the lighthouses which are present to correspond to those nonzero integers $k$ for which $N > |2k|$. (When $N$ is even, there is also the question of whether to use $+N/2$ or $-N/2$ to index the lighthouse diametrically opposite “the Keeper”, but we will ignore this wart; nothing is lost by restricting attention to the behavior for odd $N$.) Another way of saying that the $k$-th factor only enters once $N > |2k|$ is to say that the $k$-th factor is $1$ when this condition is not satisfied.

So this gives us our $S_k(N) = \frac{\mathrm{Chord}((k - f)/N)}{\mathrm{Chord}(k/N)}$ (when $N > |2k|$, and $1$ otherwise).

We also already know that $\displaystyle \lim_{N \to \infty} S_k(N) = 1 - \frac{f}{k}$, and we also have just seen that the product over all nonzero integers $k$ of $S_k(N)$ is equal to $\frac{\mathrm{Chord}(f)}{N \times \mathrm{Chord}(f/N)}$, whose limiting value for large $N$ is $\frac{\mathrm{Chord}(f)}{f \times 2\pi} = \frac{\sin(f\pi)}{f\pi}$.
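Here's a quick Python sketch corroborating these claims numerically. It assumes the concrete scaling $\mathrm{Chord}(x) = 2 \sin(\pi x)$ (the scaling consistent with the $\mathrm{Chord}'(0) = 2\pi$ used just above); the values of `N` and `f` are arbitrary choices for illustration:

```python
import math

def chord(x):
    # Assumed scaling: Chord(x) = 2*sin(pi*x), so that Chord'(0) = 2*pi.
    return 2 * math.sin(math.pi * x)

N = 10_001   # odd, as in the text
f = 0.3      # an arbitrary fraction strictly between 0 and 1

product = 1.0
for k in range(1, (N - 1) // 2 + 1):   # nonzero k with N > |2k|, taken in +/- pairs
    product *= chord((k - f) / N) / chord(k / N)
    product *= chord((-k - f) / N) / chord(-k / N)

# The product over all present lighthouses equals Chord(f)/(N * Chord(f/N))...
assert abs(product - chord(f) / (N * chord(f / N))) < 1e-9
# ...which for large N approaches sin(f*pi)/(f*pi).
assert abs(product - math.sin(f * math.pi) / (f * math.pi)) < 1e-6
```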

So to complete the proof of our sine product result, we just need to be able to commute the limits here. In order to use our Dominated Convergence tool to commute these limits, we need to find a series of positive values $B_k$, such that the sum of the $B_k$ converges, and $|S_k(N) - 1|$ is bounded by $B_k$ regardless of $N$.

Alas… this is impossible: the limiting value of (and thus the minimal size of any potential bound on) $|S_k(N) - 1|$ is on the order of $|k|^{-1}$, and the sum of these famously diverges (as with the harmonic series) rather than converging.

BUT! If we bundle our factors together, bundling the $k$-th and $-k$-th factor together as $S’_k(N) = S_k(N) \times S_{-k}(N)$, and then consider this series (now indexed only by positive integers) instead, now we have a shot, as the limiting value of $|S’_k(N) - 1|$ is now on the order of $k^{-2}$, whose sum converges. Let’s see if we can get an actual bound, irrespective of $N$, of this same quadratic order in $k$, which will allow us to use Dominated Convergence on the product with factors bundled in this way.

Expanding out $S’_k(N)$, it is $\frac{\mathrm{Chord}((k - f)/N)}{\mathrm{Chord}(k/N)} \times \frac{\mathrm{Chord}((-k - f)/N)}{\mathrm{Chord}(-k/N)}$ when $N > 2k$ (and $1$ otherwise). With a little trig identity elbow-grease, and keeping in mind that $\mathrm{Chord}$ is basically the same thing as $\sin$ (just with its inputs and outputs re-scaled by constant factors), we get that $S’_k(N)$ equals $1 - \left( \frac{\mathrm{Chord}(f/N)}{\mathrm{Chord}(k/N)} \right)^2$ (under, as always, the condition that $N > 2k$, and being simply $1$ otherwise).
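If you'd rather not push through the trig identity elbow-grease by hand, here is a quick numerical corroboration in Python (again assuming the concrete scaling $\mathrm{Chord}(x) = 2 \sin(\pi x)$, and an arbitrary sample value of $f$):

```python
import math

def chord(x):
    # Assumed scaling: Chord(x) = 2*sin(pi*x).
    return 2 * math.sin(math.pi * x)

f = 0.3
max_error = 0.0
for N in (7, 101, 1001):                    # a few odd values of N
    for k in range(1, (N - 1) // 2 + 1):    # positive k with N > 2k
        bundled = (chord((k - f) / N) / chord(k / N)) \
                * (chord((-k - f) / N) / chord(-k / N))
        identity = 1 - (chord(f / N) / chord(k / N)) ** 2
        max_error = max(max_error, abs(bundled - identity))

assert max_error < 1e-9   # the two expressions agree to machine precision
```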

Our goal now is to bound $\left( \frac{\mathrm{Chord}(f/N)}{\mathrm{Chord}(k/N)} \right) ^2$ for $N > 2k$ by some function of $k$ on the order of $k^{-2}$ for large $k$, which is to say, we want to bound $\frac{\mathrm{Chord}(f/N)}{\mathrm{Chord}(k/N)}$ by a function of $k$ on the order of $k^{-1}$.

To show that such a bound holds, we begin by rewriting $\frac{\mathrm{Chord}(f/N)}{\mathrm{Chord}(k/N)}$ as $\frac{\mathrm{Chord}(r x)} {\mathrm{Chord}(x)}$ where $x = k/N < \frac{1}{2}$ and $r = f/k$.

Letting $D(x) = \mathrm{Chord}(x)/x$, we can furthermore rewrite this as $\frac{D(r x)}{D(x)} \times r$. As $r$ itself is proportional to $k^{-1}$, what remains is to show that $\frac{D(r x)}{D(x)}$ is bounded by a constant for sufficiently large $k$. Since $x \in [0, \frac{1}{2}]$ and $D(x)$ is a continuous function which never takes the value $0$ in this range (including at $x = 0$, where $D(x)$ equals the nonzero derivative of $\mathrm{Chord}$ at $0$), we have that $\frac{1}{D(x)}$ is bounded by a constant. Furthermore, since $x$ itself is bounded by a constant, and $r$ is bounded by a value proportional to $k^{-1}$, we have that for sufficiently large $k$, $rx$ is arbitrarily close to $0$, and thus $D(rx)$ is arbitrarily close to the constant $D(0)$. This completes the proof that $\frac{D(r x)}{D(x)}$ is bounded by a constant for sufficiently large $k$. And, retracing back through what has taken us here, it completes the proof that $|S’_k(N) - 1|$ is bounded by some function of $k$ which is additively convergent over all $k$, thus allowing us to apply Dominated Convergence to $S’_k$, filling in the final rigor on our proof of the sine product formula in general and the Wallis product in particular.
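For the record, here is a Python spot-check of one concrete such dominating series: $B_k = \left( \frac{\pi f}{2k} \right)^2$, which follows from the standard bounds $\frac{2t}{\pi} \leq \sin(t) \leq t$ on $[0, \frac{\pi}{2}]$. (This particular choice of $B_k$ is our own for illustration; the argument above only needs that *some* convergent bound of quadratic order exists.)

```python
import math

def chord(x):
    # Assumed scaling: Chord(x) = 2*sin(pi*x).
    return 2 * math.sin(math.pi * x)

f = 0.3  # an arbitrary fraction strictly between 0 and 1

# Check that |S'_k(N) - 1| = (Chord(f/N)/Chord(k/N))^2 <= (pi*f/(2k))^2
# for every odd N and every positive k with N > 2k that we try.
worst_slack = float("inf")
for N in range(3, 1002, 2):                 # odd N
    for k in range(1, (N - 1) // 2 + 1):    # N > 2k
        deviation = (chord(f / N) / chord(k / N)) ** 2
        bound = (math.pi * f / (2 * k)) ** 2
        worst_slack = min(worst_slack, bound - deviation)

assert worst_slack >= 0   # the candidate bound B_k was never violated
```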

You may be wondering why our particular trick of bundling positive and negative index factors into a single factor was helpful, and indeed necessary, for getting around the initial obstruction to using Dominated Convergence here. One way of looking at this is as follows: the fact that we could not use Dominated Convergence with our pre-bundled product corresponds to how our pre-bundled product is only conditionally convergent, not absolutely convergent; re-ordering its factors wildly could give different limiting values. But the bundling we engaged in made our product absolutely convergent, obviating these issues and in so doing also yielding a series to which we could apply our Dominated Convergence tool.

**The relationship to the Basel problem:**

Not only is our sine product cool in its own right, but we can also use it to solve the Basel problem (and indeed, this was the way that the Basel problem was first solved by Euler, though he discovered the sine product in a different way than we've shown here):

Remember, the Basel problem is to understand what the sum of the reciprocal squares comes out to, $\frac{1}{1^2} + \frac{1}{2^2} + \frac{1}{3^2} + \dots$. How does this relate to our sine product?

Well, as we've just seen, we know that $\sin(f \pi)$ is $f \pi$ times the product of $\left(1 - \frac{f}{k} \right) \times \left(1 - \frac{f}{-k} \right) = 1 - \frac{f^2}{k^2}$ over all positive integers $k$. But we also know, from the fact that $\sin$ is the negation of its own second derivative, and from the values at $0$ of $\sin$ and its derivative, that we have the Taylor series expansion $\sin(f \pi) = f \pi - \frac{(f \pi)^3}{3!} + \frac{(f \pi)^5}{5!} - \frac{(f \pi)^7}{7!} + \dots$.
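Here's a quick numerical spot-check of the sine product in Python, truncating the infinite product at finitely many factors (the sample value of $f$ is an arbitrary choice):

```python
import math

f = 0.3
terms = 100_000

# Partial product f*pi * (1 - f^2/1^2) * (1 - f^2/2^2) * ... * (1 - f^2/terms^2).
product = f * math.pi
for k in range(1, terms + 1):
    product *= 1 - f ** 2 / k ** 2

assert abs(product - math.sin(f * math.pi)) < 1e-4
```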

So the coefficients of $f$ in both of these series must be the same. In particular, let's look at the coefficient of $f^3$:

In the Taylor series $\sin(f \pi) = f \pi - \frac{(f \pi)^3}{3!} + \frac{(f \pi)^5}{5!} - \frac{(f \pi)^7}{7!} + \dots$, the coefficient of $f^3$ is $-\frac{\pi^3}{3!}$.

In our sine product $\sin(f \pi) = f \pi \displaystyle \prod_{k \geq 1} \left(1 - \frac{f^2}{k^2} \right)$, on the other hand, if we were to rewrite this as a "power series" by distributing out the giant multiplication, we would, among other terms, obtain the terms $f \pi \times \left(-\frac{f^2}{1^2} -\frac{f^2}{2^2} - \frac{f^2}{3^2} - \dots \right)$; these are the terms in our giant multiplication that come from choosing the "$-\frac{f^2}{k^2}$" from one factor and the "$1$" from all other factors. These are the terms which are multiples of $f^3$; all the other terms would correspond to higher or lower degree in $f$. So the coefficient of $f^3$ in our sine product series is $-\pi$ times the Basel sum $\frac{1}{1^2} + \frac{1}{2^2} + \frac{1}{3^2} + \dots$.

And thus, equating the results of the two last paragraphs, $-\pi$ times the Basel sum must equal $-\frac{\pi^3}{3!}$. This means the Basel sum must equal $\frac{\pi^2}{3!} = \frac{\pi^2}{6}$. Ta-da!

By equating coefficients at higher powers of $f$ between these two series, we can also go further and discover the sum of $\frac{1}{1^n} + \frac{1}{2^n} + \frac{1}{3^n} + \dots$ for each even $n \geq 2$, in each case finding it to be some rational multiple of $\pi^n$.
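As a sketch of the next case (going a bit beyond what the post spells out): equating the coefficients of $f^5$ tells us that $\sum_{j < k} \frac{1}{j^2 k^2} = \frac{\pi^4}{120}$, and the identity $\sum_k \frac{1}{k^4} = \left( \sum_k \frac{1}{k^2} \right)^2 - 2 \sum_{j < k} \frac{1}{j^2 k^2}$ (one of "Newton's identities") then recovers $\frac{\pi^4}{90}$:

```python
import math

# From the f^3 coefficient: sum of 1/k^2 is pi^2/6.
e1 = math.pi ** 2 / 6
# From the f^5 coefficient: sum over j < k of 1/(j^2 * k^2) is pi^4/120,
# since the Taylor coefficient of f^5 in sin(f*pi) is pi^5/5! = pi^5/120.
e2 = math.pi ** 4 / 120

# Newton's identity: sum of 1/k^4 = e1^2 - 2*e2.
zeta4 = e1 ** 2 - 2 * e2
assert abs(zeta4 - math.pi ** 4 / 90) < 1e-12

# Cross-check against a direct partial sum.
assert abs(zeta4 - sum(1 / k ** 4 for k in range(1, 100_000))) < 1e-9
```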

(But for a much simpler way of calculating these results, which does not require our sine product, or much anything beyond Calc 101 integration, see this other post!)

**Alternative proof of the Wallis product:**

Check out another beautifully simple geometric proof of the Wallis product directly in terms of circles and spheres (and higher-dimensional spheres…) in this post! No complex numbers or tricky polynomial algebra required!

**Much more to come:**

[There's much more material to come on Euler's original method of discovery of the sine product, other methods of establishing it as well, what happens when we re-order our Wallis product to interleave its two halves at different speeds, other nice series for trig functions which follow from the sine product, further connections between these and the Basel problem, and more! This post will be updated continuously over time. Stay tuned!]