3Blue1Brown

Chapter 7: Implicit differentiation, what's going on here?

"Do not ask whether a statement is true until you know what it means."
- Errett Bishop

Let me share with you something I found particularly weird when I was a student first learning calculus.

Implicit Curves

Let's say you have a circle with radius $5$ centered at the origin of the $xy$-coordinate plane. This shape is defined using the equation $x^2 + y^2 = 5^2$. That is, all points on this circle are a distance $5$ from the origin, as encapsulated by the Pythagorean theorem, where the sum of the squares of the legs of a right triangle with side lengths $x$ and $y$ equals the square of the hypotenuse, $5^2$.

And suppose you want to find the slope of a tangent line to this circle, maybe at the point $(x, y) = (3, 4)$.

Now, if you're savvy with geometry, you might already know that this tangent line is perpendicular to the radius line touching that point. But let's say you don't already know that, or that you want a technique that generalizes to curves other than circles.

As with other problems about the slopes of tangent lines to curves, the key thought here is to zoom in close enough that the curve basically looks just like its own tangent line, then ask about a tiny step along that curve.

The $y$-component of that little step along the curve is what you might call $dy$, and the $x$-component is a little $dx$, so the slope that we want is the rise over run $dy/dx$.

But unlike other tangent-slope problems in calculus, this curve is not the graph of a function. We cannot take a simple derivative, like we have in the past, by asking about the size of a tiny nudge to the output of a function caused by some tiny nudge to the input. In this equation, $x$ is not an input and $y$ is not an output; they're both just interdependent values related by some equation.

This is called an "implicit curve"; it's just the set of all points $(x, y)$ that satisfy some property written in terms of the two variables $x$ and $y$.

The procedure for how you actually find $dy/dx$ for curves like this is the thing I found very weird as a calculus student. You take the derivative of both sides of this equation like this:

  • For the derivative of $x^2$ you write $2x \cdot dx$.
  • Similarly, $y^2$ becomes $2y \cdot dy$.
  • The derivative of the constant $5^2$ on the right is just $0$.

You can see why this feels strange, right? What does it mean to take a derivative of an expression with multiple variables? And why are we tacking on the little $dy$ and $dx$ in this way? Unlike with functions, there isn't a clear flow from input to output. But if you just blindly move forward with what you get here, you can rearrange this equation to find an expression for $dy/dx$, which in this case comes out to $-x/y$.
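If it helps to see that rearrangement spelled out, here is one way to write it. It assumes $y \neq 0$ so that dividing by $y$ is allowed, a condition the discussion above leaves unstated:

```latex
\begin{align*}
2x \, dx + 2y \, dy &= 0 \\
2y \, dy &= -2x \, dx \\
\frac{dy}{dx} &= -\frac{x}{y} \qquad (y \neq 0)
\end{align*}
```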

So at a point with coordinates $(x, y) = (3, 4)$, that slope would be $-3/4$, evidently. This strange process is called "implicit differentiation". Don't worry, I have an explanation for how you can interpret taking a derivative of an expression with two variables like this. But first, I want to set aside this particular problem, and show how this is related to a different type of calculus problem, something called a related rates problem.

Imagine a $5$ meter long ladder held up against a wall, where the top of the ladder starts off $4$ meters above the ground, which, by the Pythagorean theorem, means the bottom is $3$ meters away from the wall.

And say it's slipping down the wall in such a way that the top of the ladder is dropping at $1$ meter per second. In that initial moment, what is the rate at which the bottom of the ladder is moving away from the wall?

It's interesting, right? That distance from the bottom of the ladder to the wall is $100\%$ determined by the distance between the top of the ladder and the floor, so we should have enough information to figure out how the rates of change for each value depend on each other, but it might not be entirely clear how exactly you relate those two.

First things first, it's always nice to give names to the quantities we care about. So label the distance from the top of the ladder to the ground $y(t)$, written as a function of time because it's changing. Likewise, label the distance between the bottom of the ladder and the wall $x(t)$. The key equation that relates these terms is the Pythagorean theorem: $x(t)^2 + y(t)^2 = 5^2$. What makes this equation powerful is that it's true at all points in time.

One way to solve this would be to isolate $x(t)$, figure out what $y(t)$ must be based on this $1$ meter/second drop rate, then take a derivative of the resulting function to get $dx/dt$, the rate at which $x$ is changing with respect to time.

And that's fine; if you're willing to apply the chain rule a few times, this method will definitely work for you. But I want to show a different way to think about the same problem.

This left-hand side of the equation is a function of time, right? Which just so happens to equal a constant, meaning this value evidently doesn't change while time passes.

However, this expression still depends on time, so we can manipulate it like any other function with $t$ as an input. In particular, we can take a derivative of the left-hand side, which is a way of saying "If I let a little bit of time, $dt$, pass, which causes $y$ to slightly decrease, and $x$ to slightly increase, how much does this expression change?"

On the one hand, we know that derivative should be $0$, since this expression is a constant, and constants don't care about your tiny nudge to time; they remain unchanged. But on the other hand, what do you get when you compute the derivative of this as a function of time?

The derivative of $x(t)^2$ is $2 \cdot x(t)$ times the derivative of $x$. That's the chain rule in action. $2x \cdot dx$ represents the size of a change to $x^2$ caused by a change to $x$, and then we're dividing out by $dt$. Likewise, the rate at which $y(t)^2$ is changing is $2 \cdot y(t)$ times the derivative of $y$.
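Written out in one line, the computation just described looks roughly like this. The $\frac{d}{dt}$ notation for "derivative with respect to time" is an assumption about notation, chosen to match the $dx/dt$ used elsewhere in this lesson:

```latex
\frac{d}{dt}\left( x(t)^2 + y(t)^2 \right)
  = 2x(t) \frac{dx}{dt} + 2y(t) \frac{dy}{dt}
  = 0
```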

Now, evidently, this whole expression must be zero, which is equivalent to saying $x^2 + y^2$ must not change while the ladder moves. And at the very start, $t = 0$, the height $y(t)$ is $4$ meters, the distance $x(t)$ is $3$ meters, and since the top of the ladder is dropping at a rate of $1$ meter per second, that derivative $dy/dt$ is $-1$ meters/second.

Now this gives us enough information to isolate the derivative $dx/dt$, which, when you work it out, comes out to be $4/3$ meters per second.
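If you'd like to sanity-check that number, here is a minimal Python sketch comparing the two routes mentioned above. It assumes the top of the ladder keeps dropping at a constant rate, so that $y(t) = 4 - t$; the function name `ladder_bottom_distance` and the step size `h` are made up for this illustration.

```python
import math

def ladder_bottom_distance(t, length=5.0, start_height=4.0, drop_rate=1.0):
    """x(t): distance from the wall to the ladder's bottom.

    Assumes a constant drop rate, i.e. y(t) = start_height - drop_rate * t.
    """
    y = start_height - drop_rate * t
    return math.sqrt(length**2 - y**2)

# Implicit route: 2x * dx/dt + 2y * dy/dt = 0  =>  dx/dt = -(y / x) * dy/dt
x0, y0, dy_dt = 3.0, 4.0, -1.0
dx_dt_implicit = -(y0 / x0) * dy_dt

# Explicit route: isolate x(t), then approximate its derivative at t = 0
# with a central finite difference.
h = 1e-6
dx_dt_numeric = (ladder_bottom_distance(h) - ladder_bottom_distance(-h)) / (2 * h)

print(dx_dt_implicit, dx_dt_numeric)
```

Both values should land very close to $4/3$; the finite difference is only an approximation, so expect agreement to several decimal places rather than exactly.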

Implicit Differentiation

Now compare this to the problem of finding the slope of the tangent line to the circle. In both cases, we had the equation $x^2 + y^2 = 5^2$, and in both cases we ended up taking the derivative of each side of this expression. But for the ladder problem, these expressions were functions of time, so taking the derivative has a clear meaning: it's the rate at which this expression changes as time changes.

What makes the circle situation strange is that rather than saying a small amount of time $dt$ has passed, which causes $x$ and $y$ to change, the derivative has the tiny nudges $dx$ and $dy$ both just floating free, not tied to some other common variable like time.

Let me show you how you can think about this: Give this expression $x^2 + y^2$ a name, maybe $S$. $S$ is essentially a function of two variables; it takes every point $(x, y)$ on the plane and associates it with a number.

For points on this circle, that number happens to be $25$.

If you step off that circle away from the center, that value would be bigger. For other points $(x, y)$ closer to the origin, that value is smaller.

What it means to take a derivative of this expression, a derivative of $S$, is to consider a tiny change to both these variables, some tiny change $dx$ to $x$, and some tiny change $dy$ to $y$. And not necessarily one that keeps you on this circle, by the way; it's just some tiny step in any direction on the $xy$-plane.

From there you ask how much the value of $S$ changes. That difference in the value of $S$, from the original point to the nudged point, is what we would call "$dS$" or "the change to the function $S$".

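Expanding $(x + dx)^2 + (y + dy)^2 - (x^2 + y^2)$ gives $2x \cdot dx + 2y \cdot dy + dx^2 + dy^2$. As we saw in previous lessons, the terms with $dx$ and $dy$ raised to the second power are negligible as the nudges approach zero, which leaves $2x \cdot dx + 2y \cdot dy$.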

For example, say we're starting at the point $(3, 4)$, and let's just say that step $dx$ is $-0.02$, and that $dy$ is $-0.01$. Then the decrease to $S$, the amount that $x^2 + y^2$ changes over that step, will be around $2(3)(-0.02) + 2(4)(-0.01)$.
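Here is a small Python sketch of that same arithmetic, comparing the estimate against the true change in $x^2 + y^2$. The function name `S` follows the text; the variable names `approx_change` and `actual_change` are just labels picked for this illustration.

```python
def S(x, y):
    """The two-variable function S(x, y) = x^2 + y^2 from the text."""
    return x**2 + y**2

x, y = 3.0, 4.0
dx, dy = -0.02, -0.01

# Estimate from the derivative expression 2x*dx + 2y*dy
approx_change = 2 * x * dx + 2 * y * dy

# True change in S between the original point and the nudged point
actual_change = S(x + dx, y + dy) - S(x, y)

print(approx_change, actual_change)
```

The estimate comes out to about $-0.2$ while the true change is about $-0.1995$; the gap is exactly the $dx^2 + dy^2$ that the approximation ignores.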

That's what this derivative expression $2x \cdot dx + 2y \cdot dy$ means. It tells you how much the value $x^2 + y^2$ changes, as determined by the point $(x, y)$ where you started, and the tiny step $(dx, dy)$ that you take.

As with all things derivative, this is only an approximation, but it's one that gets more and more true for smaller and smaller choices of $dx$ and $dy$.

The key point is that when you restrict yourself to steps along this circle, you're essentially saying you want to ensure that this value $S$ doesn't change; it starts at a value of $25$, and you want to keep it at a value of $25$; that is, $dS$ should be $0$. So setting this expression $2x \cdot dx + 2y \cdot dy$ equal to $0$ is the condition under which a tiny step stays on the circle.

Again, this is only an approximation. Speaking more precisely, that condition keeps you on a tangent line of the circle, not the circle itself, but for tiny enough steps those are essentially the same thing.
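To get a feel for "essentially the same thing", here is a rough Python sketch that takes steps satisfying $2x \cdot dx + 2y \cdot dy = 0$ from the point $(3, 4)$ and measures how far $x^2 + y^2$ drifts from $25$. The helper name `tangent_step` and the particular step sizes are assumptions made for this illustration.

```python
def S(x, y):
    """S(x, y) = x^2 + y^2, as in the text."""
    return x**2 + y**2

def tangent_step(x, y, dx):
    """Return the dy that satisfies 2x*dx + 2y*dy = 0 (requires y != 0)."""
    return -x * dx / y

x, y = 3.0, 4.0
drifts = []
for dx in (0.1, 0.01, 0.001):
    dy = tangent_step(x, y, dx)
    # How far the value of S has moved away from 25 after the step
    drift = S(x + dx, y + dy) - 25.0
    drifts.append(drift)
    print(dx, dy, drift)
```

Each time the step shrinks by a factor of $10$, the drift off the circle shrinks by roughly a factor of $100$, which is why the tangent line and the circle become indistinguishable for tiny enough steps.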

Another example

Of course, there's nothing special about the expression $x^2 + y^2 = 5^2$ here. You could have some other expression involving $x$'s and $y$'s, representing some other curve, and taking the derivative of both sides like this would give you a way to relate $dx$ to $dy$ for tiny steps along that curve.

It's always nice to think through more examples, so consider the expression $\sin(x) \cdot y^2 = x$, which corresponds to many U-shaped curves on the plane. Those curves represent all the points $(x, y)$ of the plane where the value of $\sin(x) \cdot y^2$ equals the value of $x$.

Now imagine taking some tiny step with components $(dx, dy)$, and not necessarily one that keeps you on the curve.

Taking the derivative of each side of the equation $\sin(x) \cdot y^2 = x$ will tell us how much the value of that side changes during this step. On the left side, the product rule tells us that this should be "left d-right plus right d-left": $\sin(x)$ times the change to $y^2$, which is $2y \cdot dy$, plus $y^2$ times the change to $\sin(x)$, which is $\cos(x) \cdot dx$. The right side is simply $x$, so the size of a change to the value is exactly $dx$, right?

Setting these two sides equal to each other is a way of saying "whatever your tiny step with coordinates $(dx, dy)$ is, if it's going to keep us on this curve, the values of both the left-hand side and the right-hand side must change by the same amount." That's the only way this top equation can remain true.
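Putting the two sides together as described, the resulting equation should look like this:

```latex
\sin(x) \cdot 2y \, dy + y^2 \cos(x) \, dx = dx
```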

From there, depending on what problem you're solving, you could manipulate further with algebra, where perhaps the most common goal is to find $dy$ divided by $dx$.
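For this curve, rearranging $\sin(x) \cdot 2y \cdot dy + y^2 \cos(x) \cdot dx = dx$ would give $dy/dx = (1 - y^2 \cos(x)) / (2y \sin(x))$, provided $y \sin(x) \neq 0$. Here is a minimal Python sketch that checks this against a finite-difference slope. It assumes we stay on the branch with $y > 0$ near $x = 1$, where the curve can be written as $y = \sqrt{x / \sin(x)}$; the function names are made up for this illustration.

```python
import math

def slope_implicit(x, y):
    """dy/dx obtained by rearranging sin(x)*2y*dy + y^2*cos(x)*dx = dx.

    Only valid where y * sin(x) != 0.
    """
    return (1 - y**2 * math.cos(x)) / (2 * y * math.sin(x))

def upper_branch(x):
    """Positive-y solution of sin(x) * y^2 = x, valid where x / sin(x) > 0."""
    return math.sqrt(x / math.sin(x))

x0 = 1.0
y0 = upper_branch(x0)

# Central finite difference along the curve, as an independent check
h = 1e-6
slope_numeric = (upper_branch(x0 + h) - upper_branch(x0 - h)) / (2 * h)

print(slope_implicit(x0, y0), slope_numeric)
```

The two printed slopes should agree to several decimal places, which is a reasonable sign that the algebra on $dx$ and $dy$ really does describe tiny steps along the curve.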

Finding the derivative of ln(x)

As one more example, let me show how you can use this technique to help find new derivative formulas. I've mentioned in a footnote video that the derivative of $e^x$ is itself, but what about the derivative of its inverse function, the natural log of $x$?

The graph of $\ln(x)$ can be thought of as an implicit curve: all the points on the $xy$-plane where $y = \ln(x)$. It just happens to be the case that the $x$'s and $y$'s of this equation aren't as intermingled as they were in other examples. The slope of this graph, $dy/dx$, should be the derivative of $\ln(x)$, right?

Well, to find that, first rearrange this equation $y = \ln(x)$ to be $e^y = x$. This is exactly what the natural log of $x$ means; it's saying "$e$ to the what equals $x$?"

Graphically this should make sense too, since you can draw a diagonal line and reflect the functions to see that they are inverses of each other.

Since we know the derivative of $e^y$, we can take the derivative of both sides of this equation $e^y = x$, effectively asking how a tiny step with components $(dx, dy)$ changes the value of each side.

To ensure the step stays on the curve, the change to the left side of the equation, which is $e^y \cdot dy$, must equal the change to the right side, which is $dx$.

Rearranging, this means $dy/dx$, the slope of our graph, equals $1/e^y$.

And when we're on this curve, $e^y$ is by definition the same as $x$, so evidently the slope is $1/x$.

An expression for the slope of the graph of a function in terms of $x$ like this is the derivative of that function, so the derivative of $\ln(x)$ is $1/x$.
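As a quick numerical check of that conclusion, here is a short Python sketch that estimates the slope of $\ln(x)$ with a central finite difference and prints it next to $1/x$. The helper `numeric_derivative`, its step size, and the sample points are arbitrary choices made for this illustration.

```python
import math

def numeric_derivative(f, x, h=1e-6):
    """Central finite-difference estimate of f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)

for x in (0.5, 1.0, 2.0, 10.0):
    # Estimated slope of ln at x, next to the claimed value 1/x
    print(x, numeric_derivative(math.log, x), 1 / x)
```

For each sample point the two numbers should match to many decimal places, though a finite difference is only ever an approximation of the true slope.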

Exercises

What is the slope of the tangent line at the point $(1, \sqrt{2})$ for the implicit curve defined by the equation $y^2 - x^2 = 1$? Compare this with the derivative at the point $x = 1$ for the function $f(x) = \sqrt{1 + x^2}$.

Next Lesson

By the way, all of this is a little peek into multivariable calculus, where you consider functions with multiple inputs, and how they change as you tweak those inputs. The key, as always, is to have a clear image in your head of what tiny nudges are at play, and how exactly they depend on each other.

Next up, I'll talk about what exactly a limit is, and how it's used to formalize the idea of a derivative.

Thanks

Special thanks to those below for supporting the original video behind this post, and to current patrons for funding ongoing projects. If you find these lessons valuable, consider joining.

Ali YahyaMeshal AlshammariCrypticSwarmNathan PellegrinKaran BhargavaJustin HelpsAnkit AgarwalYu JunDave NicponskiDamion KistlerJuan Batiz-BenetOthman AlikhanMarkus PerssonDan BuchoffDerek DaiJoseph John CoxLuc RitchieDaan SmedingaJonathan EppeleAlbert Villeneuve NguyenMustafa MahdiNils SchneiderMathew BramsonJerry LingMark GoveaVecht世珉 匡Rish KundaliaAchille BrightonKirk WerklundRipta PasayFelipe DinizSoufiane KHIATdim85ChrisJim LauridsonJim MussaredGabriel CunhaPedro F PardoLoro LukicDavid WyrickRahul SureshLee BurnetteJohn C. VeseyPatrik AgnéAlvin KhaledScienceVRChris WillisMichael RabadiMads ElvheimJoseph CutlerCurtis MitchellMyles BuckleyAndy PetschOtavio GoodViesulas SliupasBrendan ShahAndrew McnabMatt ParlmerDan DavisonJose Oscar Mur-MirandaAidan BonehamHenry ReichSean BibbyPaul ConstantineJustin ClarkMohannad ElhamodBen GrangerJeffrey Herman
