Chapter 3

Power Rule through geometry

Introduction to the derivatives of polynomial terms thought about geometrically and intuitively. The goal is for these formulas to feel like something the student could have discovered, rather than something to be memorized.

Apr 30, 2017

Lesson by Grant Sanderson

Text adaptation by Kurt Bruns

You know, for a mathematician, he did not have enough imagination. But he has become a poet and now he is fine.
— David Hilbert

After introducing the derivative and its relation to rates of change in the last lesson, the next step is to learn how to compute the derivatives of functions that are explicitly given with some formula. A typical calculus student spends quite a bit of time drilling on how to compute derivatives, often without the context of a concrete rate of change problem. For example, a worksheet may ask you to compute the derivative of this function:

 $f(x) = \frac{\sin(x)}{x^2}$

But there may be no indication of what physical process this function describes, or what significance its rate of change has. That's not necessarily a bad thing. It's analogous to how we often learn the multiplication table by drilling on many facts like $7 \times 6 = 42$ without perpetually having to put each such equation in context.

Still, before diving in, it may be worth emphasizing why all these formulas are worth learning in the first place, and why exercises asking students to drill on them are worth the effort. We model many real-world phenomena, especially those in physics, with polynomials, trigonometric functions, exponentials, and the like, so building up a fluency with actually computing derivatives of these functions gives you a language to readily understand the rates of change for these phenomena.

Think of how knowing the multiplication table by heart frees up a student to think about more complicated ideas in arithmetic and algebra.

So in this chapter and the next, the aim is to show how you can think about a few of these rules intuitively and geometrically, and I really want to encourage you to never forget about the tiny nudges at the heart of derivatives.

All that said, even if there's value to drilling on these formulas and memorizing them, the goal of this series is for you to feel like these are facts you could have discovered yourself. So for the next few chapters, let's put ourselves in a mindset of patience and discovery. Discovering these formulas can be a beautiful exercise in creativity, requiring you to sniff out how tiny changes to one quantity influence tiny changes to another.

Even if you don't need to think through these derivations every time you compute a derivative, going through them can reinforce the core idea of what derivatives are all about.

Monomial Terms

Derivative of $f(x) = x^2$

Let's start with a function like $f(x) = x^2$ . What is its derivative? That is to say, if you look at some value of $x$ , like $x = 2$ , and compare it to a value slightly bigger, just $dx$ bigger, what's that corresponding change in the value of the function, $df$ ?

In particular, what is $df$ divided by $dx$ , the rate at which this function changes per unit change in $x$ ? As a first step for intuition, you can think of the ratio $\frac{df}{dx}$ as the slope of a tangent line to the graph of $x^2$ . From that, we can see that the slope generally increases as $x$ increases. At $x = 0$ , the tangent line is flat, so the slope is $0$ . At $x = 1$ , it's something steeper, and at $x = 2$ , it's steeper still.

But looking at graphs isn't generally the best way to understand the precise formula for a derivative. For that, it's best to take a more literal look at what $x^2$ actually means. In this case, let's picture a square whose side length is $x$ .

If you increase $x$ by a tiny nudge $dx$ , what is the resulting change to the area of the square? That slight change in area represents $df$ : the tiny increase in the value of $f(x)=x^2$ caused by increasing $x$ by a tiny nudge $dx$ .

There are three new bits of area in this diagram: two thin rectangles and a minuscule square. The two thin rectangles each have side lengths $x$ and $dx$ , so together they account for $2 \cdot x \cdot dx$ units of new area. For example, if $x$ were $3$ and $dx$ were $0.01$ , the new area from these thin rectangles would be $2 \cdot 3 \cdot 0.01$ , which is $0.06$ , or $6$ times the size of $dx$ .

That little square has area $dx^2$ , but you should think of this as being really tiny; negligibly tiny. For example, if $dx$ was $0.01$ , it would be $0.0001$ . I'm drawing $dx$ with a fair bit of width here, so we can see it, but always remember in principle $dx$ should be thought of as a truly tiny amount.

Phrased more precisely, our final consideration will always be what happens as the size of this $dx$ approaches $0$ , and as that happens the proportion of this yellow area $df$ which is accounted for by the tiny $dx^2$ corner will go to $0$ .
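To make those sizes concrete, here is a small Python sketch. The function name `square_area_change` is just an illustrative choice, not something from the lesson. It splits the new area into the two thin rectangles and the corner square, then shows the corner's share of the new area shrinking as $dx$ shrinks. Exact fractions are used so floating-point noise doesn't muddy the picture.

```python
from fractions import Fraction


def square_area_change(x, dx):
    """Split the area gained by growing a square's side from x to x + dx.

    Returns (rectangles, corner, total), where total == (x + dx)**2 - x**2.
    """
    rectangles = 2 * x * dx   # two thin rectangles, each x by dx
    corner = dx * dx          # the tiny dx-by-dx square
    return rectangles, corner, rectangles + corner


x = Fraction(3)
for dx in (Fraction(1, 100), Fraction(1, 1000), Fraction(1, 10000)):
    rectangles, corner, total = square_area_change(x, dx)
    # Sanity check: the pieces add up to the true change in area.
    assert total == (x + dx) ** 2 - x ** 2
    share = corner / total    # fraction of the new area sitting in the corner
    print(f"dx={float(dx):g}  rectangles={float(rectangles):g}  "
          f"corner={float(corner):g}  corner share={float(share):.6f}")
```

With $x = 3$ and $dx = 0.01$ the first row reproduces the $0.06$ and $0.0001$ from the text, and each later row shows the corner accounting for a smaller and smaller portion of $df$ .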

The full unapproximated change $df$ represented by all the yellow area above looks like $df = 2x \cdot dx + (dx)^2$ . So you might begin thinking of the expression for the derivative like this:

 $\frac{df}{dx} = \frac{2x \cdot dx + (dx)^2}{dx} = 2x + dx$

Remember, if we're using this " $d$ " notation, the implicit meaning is that we consider what happens as $dx \to 0$ . So in this case, our final expression would look like this.

 $\frac{df}{dx} = 2x$

Notice how we could have simply ignored the $(dx)^2$ term: dividing it by $dx$ still leaves a factor of $dx$ , which vanishes as $dx \to 0$ . A good rule of thumb is that you can ignore anything which includes a $dx$ raised to a power greater than one; that is, a tiny change squared is a negligible change.
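If you'd like to watch the leftover $dx$ fade away numerically, here is a minimal sketch. The helper names `difference_quotient` and `square` are illustrative choices, not part of the lesson. It computes the ratio $\frac{df}{dx}$ for $f(x) = x^2$ at $x = 3$ with smaller and smaller nudges, again using exact fractions.

```python
from fractions import Fraction


def difference_quotient(f, x, dx):
    """Return (f(x + dx) - f(x)) / dx, the ratio df/dx before dx -> 0."""
    return (f(x + dx) - f(x)) / dx


def square(x):
    return x * x


x = Fraction(3)
for k in range(1, 6):
    dx = Fraction(1, 10 ** k)
    ratio = difference_quotient(square, x, dx)
    # For f(x) = x^2 the ratio is exactly 2x + dx, so the leftover is dx.
    assert ratio == 2 * x + dx
    print(f"dx = {float(dx):g}   df/dx = {float(ratio):.5f}")
```

The printed ratios creep toward $2x = 6$ , and the gap between each ratio and $6$ is exactly the $dx$ that was used.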

Derivative of $f(x) = x^3$

Let's try a different simple function, $f(x) = x^3$ . This will be the geometric view of what you and I went through algebraically in the last chapter for the function $x^3$ . We can think of $x^3$ geometrically as the volume of a cube with side lengths $x$ .

When you increase $x$ by a tiny nudge, $dx$ , the volume increases as shown in the figure below. That represents all the volume in a cube with side length $x + dx$ that's not already in the original cube with side length $x$ .

Remember that we are interested in what happens as $dx$ approaches $0$ . The length $dx$ is drawn much larger than it should be, only so that the change in volume it introduces is visible.

This figure shows the increase in volume of a cube when its side length $x$ is increased by a small nudge $dx$ . It's nice to think of this new volume broken up into multiple components, but almost all of it comes from the three square faces; or, said a little more precisely, as $dx$ approaches $0$ , those three squares make up a portion closer and closer to 100% of the new volume.

Each of those thin square slabs has a volume of $x^2 \cdot dx$ , the area of the face times the thickness $dx$ , so in total this gives us $3x^2dx$ of volume change. There are some other slivers of volume along the edges, and in the corner, but their volume will be proportional to $dx^2$ or $dx^3$ , so they can be ignored. Again, this is because ultimately they will be divided by $dx$ , and if there's still any $dx$ remaining, these terms won't survive the process of letting $dx$ approach $0$ .

 $(x + dx)^3 = x^3 + \color{#fc6255} 3x^2 \color{black} dx \color{#AAAAAA} + 3xdx^2 + dx^3$
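The expansion above can be read as a parts list for the new volume. As a rough sketch (the function name `cube_volume_pieces` is made up for illustration), the following Python adds up the three slabs, the three edge slivers, and the corner cube, checks that they account for all of the new volume, and reports how much of it comes from the slabs alone.

```python
from fractions import Fraction


def cube_volume_pieces(x, dx):
    """Break (x + dx)**3 - x**3 into the pieces visible in the picture."""
    faces = 3 * x**2 * dx     # three thin slabs, each x by x by dx
    edges = 3 * x * dx**2     # three slivers along edges, each x by dx by dx
    corner = dx**3            # one tiny cube in the corner
    return faces, edges, corner


x = Fraction(2)
for dx in (Fraction(1, 10), Fraction(1, 100), Fraction(1, 1000)):
    faces, edges, corner = cube_volume_pieces(x, dx)
    total = faces + edges + corner
    # The pieces must account for all of the new volume.
    assert total == (x + dx) ** 3 - x ** 3
    print(f"dx = {float(dx):g}   share from the three slabs = "
          f"{float(faces / total):.6f}")
```

As $dx$ shrinks, the share printed on each row climbs toward $1$ , which is the "closer and closer to 100%" statement in numbers.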

So the derivative of $x^3$ , the rate at which $x^3$ changes per unit of change in $x$ , is $3x^2$ . Looking at the graph, this means the slope of the graph of $x^3$ at each point $x$ is exactly $3x^2$ .

Graphical intuition with slope can tell us why this derivative is high on the left, $0$ at the origin, and high on the right, but just thinking in terms of graphs would not land us on the precise quantity $3x^2$ . For that, we had to take a much more direct look at the actual meaning of the function.

Derivative of $f(x) = x^n$

In practice, you wouldn't necessarily think of the square every time you're taking a derivative of $x^2$ , nor would you necessarily think of the cube when taking a derivative of $x^3$ . Instead, thinking like a mathematician, can you generalize this approach to see if a pattern emerges? Can you invent a tool to find the derivative of any polynomial?

Let's look at the pattern of the first three monomial functions, where I've included the monomial of degree one for completeness. So far, each of these monomial functions has had a nice geometric meaning. Nudging the input $x$ by a small amount $dx$ has allowed us to see how the geometry changes and find the derivative.

However, from a geometric perspective, we have hit a roadblock. How do we visualize four dimensions? One path forward is to continue with algebra, using our geometric intuition to inform the base cases. For example, we can expand higher degree monomial expressions, where the input $x$ has been nudged by a small amount $dx$ . This gives us the following series of expressions.

Polynomial	Expansion
$f(x) = x^1$	$(x + dx)^1 = x + \color{#fc6255}{1} \color{black} dx$
$f(x) = x^2$	$(x + dx)^2 = x^2 + \color{#fc6255} 2x \color{black} dx \color{#AAAAAA} + dx^2$
$f(x) = x^3$	$(x + dx)^3 = x^3 + \color{#fc6255} 3x^2 \color{black} dx \color{#AAAAAA} + 3xdx^2 + dx^3$
$f(x) = x^4$	$(x + dx)^4 = x^4 + \color{#fc6255} 4x^3 \color{black} dx \color{#AAAAAA} + 6x^2dx^2 + 4xdx^3 + dx^4$
$f(x) = x^5$	$(x + dx)^5 = x^5 + \color{#fc6255} 5x^4 \color{black} dx \color{#AAAAAA} + 10x^3dx^2 + 10x^2dx^3 + 5xdx^4 + dx^5$

Focusing on the rate of change introduced by the very small nudge, highlighted in red, and ignoring terms containing $dx^2$ or higher powers of $dx$ , the pattern that emerges is what's known in the business as the "power rule". Given the input $x$ raised to some power $n$ , applying the rule gives us the derivative of the function.
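One way to check that the pattern in the table is not a coincidence of the first five rows is to generate the expansions mechanically. The sketch below assumes the binomial theorem for the coefficients and uses Python's `math.comb` ; the function name `expansion_terms` is an illustrative choice. It confirms that the term containing a single $dx$ always has coefficient $n$ and leaves $x$ raised to the power $n - 1$ .

```python
from math import comb


def expansion_terms(n):
    """Return [(coefficient, power of x, power of dx), ...] for (x + dx)**n."""
    return [(comb(n, k), n - k, k) for k in range(n + 1)]


for n in range(1, 6):
    terms = expansion_terms(n)
    coefficient, x_power, dx_power = terms[1]   # the term with a single dx
    # The pattern from the table: that term is n * x**(n - 1) * dx.
    assert (coefficient, x_power, dx_power) == (n, n - 1, 1)
    print(f"n={n}: coefficients {[c for c, _, _ in terms]}, "
          f"dx term = {coefficient} * x^{x_power} * dx")
```

Changing `range(1, 6)` to a larger range runs the same check for higher powers, where drawing a picture is no longer an option.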

Power Rule Definition

 $\frac{d}{dx} x^n = n x^{n-1}$

Even though in practice you will find yourself performing this derivative quickly and symbolically, imagining that exponent hopping down to the front, every now and then it's nice to step back and remember why this rule works. Not just because it's pretty, and not just because it helps to remind us that math actually makes sense and isn't just a pile of formulas to memorize, but because it flexes that very important muscle of thinking about derivatives in terms of tiny nudges.

Derivative of $f(x) = \frac{1}{x}$

As another example, think of the function $f(x) = \frac{1}{x}$ . Now, on the one hand, you could blindly try applying the power rule, since $\frac{1}{x}$ is the same as writing $x^{-1}$ . That would involve letting that $-1$ hop down to become a coefficient, leaving behind one less than itself in the exponent, $-2$ . But let's have some fun and see if we can reason this geometrically, rather than just plugging it through a formula.

The value $1/x$ is asking "what number multiplied by $x$ equals $1$ ", so here's how I'd visualize it: Imagine a little rectangular puddle of water in two dimensions with area $1$ . Let's say that its width is $x$ , which means its height must be $1/x$ , since the total area is $1$ .

For example, if you increase $x$ to $3$ , the other side must be squished down to $\frac{1}{3}$ . And if $x=2$ , the other side is forced to be $\frac{1}{2}$ .

This is a nice way to think about the graph of $1/x$ , by the way. If you think of the width $x$ of this puddle in the $xy$ -plane, the corresponding output $1/x$ , the height of the graph above that point, is whatever height the puddle must have to maintain an area of $1$ .

For the derivative, imagine nudging the input $x$ up by a value $dx$ . How must the height of this rectangle change so that the area remains unchanged at $1$ ? That is, increasing the width by $dx$ adds some new area to the right here, so the puddle must decrease in height by some $d(1/x)$ so that the area lost off the top here cancels that out.

You should think of that $d(1/x)$ as being some tiny negative value, since it's decreasing the height of this rectangle. And once you work out $d(1/x)/dx$ , compare it to what happens if you apply the power rule purely symbolically to $x^{-1}$ .

 $\frac{d(1 / x)}{d x}= ? ? ?$
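Once you have a candidate answer, a numerical experiment can help you test it. The sketch below is only a checking aid under the puddle picture described above, with the hypothetical helper name `puddle_height_change` . It widens a puddle of area $1$ from $x$ to $x + dx$ , computes the height change needed to keep the area at $1$ , and prints the ratio $d(1/x)/dx$ for shrinking nudges at $x = 2$ .

```python
from fractions import Fraction


def puddle_height_change(x, dx):
    """Height change that keeps area 1 when the width grows from x to x + dx."""
    return 1 / (x + dx) - 1 / x


x = Fraction(2)
for k in range(1, 6):
    dx = Fraction(1, 10 ** k)
    dh = puddle_height_change(x, dx)
    assert dh < 0                          # the puddle gets shorter
    assert (x + dx) * (1 / x + dh) == 1    # the area is still exactly 1
    print(f"dx = {float(dx):g}   d(1/x)/dx = {float(dh / dx):.6f}")
```

Compare the value these ratios settle toward with what the symbolic power rule gives for $x^{-1}$ at the same point.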

Exercises

Here are a couple of questions to test your knowledge of derivatives and the past chapters.

Derivative of $f(x) = \sqrt{x}$

See if you can reason your way through the derivative of $f(x) = \sqrt{x}$ , which can also be written as $x^{\frac{1}{2}}$ . By far the easiest way to compute this is to apply the power rule: the exponent hops down as a coefficient, leaving behind $\frac{1}{2} - 1 = -\frac{1}{2}$ in the exponent.

 $\frac{df}{dx} = \frac{1}{2}x^{-1 / 2}$

But is this valid? And following our current playful and geometric spirit, is there a way to read what this really means?

Approaching this question with geometry is, by far, overkill. Frankly, it's a bit of a mind warp. But it does offer a satisfying explanation of an otherwise mostly symbolic fact, and more than that it's one more opportunity to flex our muscles in reasoning about how small nudges to one value can affect another.

Unlike in previous problems, here $dx$ does not represent a tiny change in length. It represents a tiny change in area, since $x$ is the area of a square whose side length is $\sqrt{x}$ .
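If you want to compare notes, here is one possible line of reasoning, sketched under the assumption that we picture a square whose area is $x$ , so that its side length is $\sqrt{x}$ . Nudging the area by $dx$ grows each side by some tiny amount $d\sqrt{x}$ , and just as in the $x^2$ picture, the new area is made of two thin rectangles and a tiny corner square:

 $dx = 2\sqrt{x} \cdot d\sqrt{x} + (d\sqrt{x})^2$

Ignoring the negligible squared term and rearranging gives

 $\frac{d\sqrt{x}}{dx} = \frac{1}{2\sqrt{x}} = \frac{1}{2}x^{-1 / 2}$

which agrees with what the power rule produced symbolically.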


Thanks

Special thanks to those below for supporting this lesson.

Meshal Alshammari

Ali Yahya

CrypticSwarm

Yu Jun

Shelby Doolittle

Dave Nicponski

Damion Kistler

Juan Benet

Othman Alikhan

Markus Persson

Dan Buchoff

Derek Dai

Joseph John Cox

Luc Ritchie

Guido Gambardella

Jerry Ling

Mark Govea

Vecht

Jonathan Eppele

Shimin Kuang

Rish Kundalia

Achille Brighton

Kirk Werklund

Ripta Pasay

Felipe Diniz

Soufiane Khiat

dim85

Chris

David Wyrick

Rahul Suresh

Lee Burnette

John C. Vesey

Patrik Agné

Alvin Khaled

ScienceVR

Chris Willis

Michael Rabadi

Alexander Juda

Mads Elvheim

Joseph Cutler

Curtis Mitchell

Bright

Myles Buckley

Andy Petsch

Otavio Good

Karthik T

Steve Muench

Viesulas Sliupas

Steffen Persch

Brendan Shah

Andrew Mcnab

Matt Parlmer

Dan Davison

Jose Oscar Mur-Miranda

Aidan Boneham

Henry Reich

Sean Bibby

Paul Constantine

Justin Clark

Mohannad Elhamod

Ben Granger

Jeffrey Herman

Jacob Young