Linear Algebra
Chapter 12

Cramer's rule, explained geometrically

What Cramer's rule is, and a geometric reason it's true

Mar 17, 2019
Lesson by Grant Sanderson
Text adaptation by Kurt Bruns & James Schloss

Jerry: Ah, you're crazy!
Kramer: Am I? Or am I so sane that you just blew your mind?
Jerry: It's impossible!
Kramer: Is it?! Or is it so possible your head is spinning like a top?

In a previous chapter, we talked about linear systems of equations, and sort of brushed aside the discussion of actually computing solutions to these systems. While it's true that number-crunching is something we typically leave to the computers, digging into some of these computational methods is a good test for whether or not you understand what's going on since this is where the rubber meets the road.

Image

Here we want to describe the geometry behind a certain method for computing solutions to these systems, known as Cramer's rule. The relevant background needed here is an understanding of determinants, dot products, and linear systems of equations, so be sure to read the relevant chapters on those topics if you're unfamiliar or rusty.

But first! We should say up front that Cramer's rule is not the best way to compute solutions to linear systems of equations. Gaussian elimination, for example, will generally be faster, especially for larger matrices. So why learn it?

Think of this as a sort of cultural excursion; it's a helpful exercise in deepening your knowledge of the theory of these systems. Wrapping your mind around this concept will help consolidate ideas from linear algebra, like the determinant and linear systems, by seeing how they relate to each other. Also, from a purely artistic standpoint, the ultimate result is just really pretty to think about, much more so than Gaussian elimination.

The setup here will be some linear system of equations, say with two unknowns, x and y, and two equations. In principle, everything we're talking about will work for systems with a larger number of unknowns, and the same number of equations. But for simplicity, a smaller example is nicer to hold in our heads.

Image

As we talked about in a previous chapter, you can think of this setup geometrically as a certain known matrix transforming an unknown vector, \left[\begin{array}{c} x \\ y \end{array}\right], where you know what the output is going to be, in this case \left[\begin{array}{c} -4 \\ -2 \end{array}\right]. Remember, the columns of this matrix tell you how the matrix acts as a transform, each one telling you where the basis vectors of the input space land.

Image

This is a puzzle. What input \left[\begin{array}{c} x \\ y \end{array}\right] is going to give you this output \left[\begin{array}{c} -4 \\ -2 \end{array}\right]?

Image

Assume Determinant is Nonzero

Remember, the type of answer you get here can depend on whether or not the transformation squishes all of space into a lower dimension. That is, if it has zero determinant. In that case, either none of the inputs land on our given output, or there are a whole bunch of inputs landing on that output.

ImageImage

For this chapter, we'll limit our view to the case of a non-zero determinant, meaning the output of this transformation still spans the full n-dimensional space it started in; every input lands on one and only one output, and every output has one and only one input.

ImageImage
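
As a minimal sketch of that distinction, here is a determinant check in Python. The matrix `A` is the one that shows up in this chapter's sanity check; the matrix `S` is a hypothetical example chosen here because its second column is twice its first, and the helper name `det2` is our own label.

```python
def det2(m):
    """Determinant of a 2x2 matrix given as a list of rows [[a, b], [c, d]]."""
    (a, b), (c, d) = m
    return a * d - b * c

# The matrix from this chapter's sanity check.
A = [[2, -1],
     [0, 1]]

# A hypothetical singular matrix: its second column is twice its first,
# so it squishes the plane onto a line.
S = [[1, 2],
     [2, 4]]

print(det2(A))  # 2: nonzero, so every output has exactly one input
print(det2(S))  # 0: either no input or a whole line of inputs per output
```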

One way to think about our puzzle is that we know the given output vector is some linear combination of the columns of the matrix; x\cdot\text{(the vector where }\hat{\imath}\text{ lands)} + y\cdot\text{(the vector where }\hat{\jmath}\text{ lands)}, but we wish to compute what exactly x and y are.

Image
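
To make the "linear combination of the columns" reading concrete, here is a small sketch. It uses the matrix and input vector from this chapter's sanity check; the function name `apply` is our own.

```python
def apply(matrix, v):
    """Apply a 2x2 matrix (a list of rows) to v = (x, y),
    written as x * (first column) + y * (second column)."""
    x, y = v
    col1 = (matrix[0][0], matrix[1][0])  # where i-hat lands
    col2 = (matrix[0][1], matrix[1][1])  # where j-hat lands
    return (x * col1[0] + y * col2[0],
            x * col1[1] + y * col2[1])

A = [[2, -1],
     [0, 1]]

print(apply(A, (3, 2)))  # (4, 2)
```

Solving the system means running this backwards: given `(4, 2)`, find the `(3, 2)` that produced it.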

What about dot products with basis vectors?

As a first pass, let's show an idea that is wrong, but in the right direction.

The x-coordinate of this mystery input vector is what you get by taking its dot product with \hat{\imath}. Likewise the y-coordinate is what you get by dotting it with \hat{\jmath}.

Image

Maybe you hope that after the transformation, the dot products of the transformed version of the mystery vector with the transformed versions of the basis vectors will also be these coordinates x and y.

Image

That'd be fantastic, because we know the transformed versions of each of these vectors. There's just one problem with this: it's not at all true! For most linear transformations, the dot product before and after the transformation will be very different.

For example, you could have two vectors generally pointing in the same direction, with a positive dot product, which get pulled away from each other during the transformation, in such a way that they then have a negative dot product.

Image

Likewise, if things start off perpendicular, with dot product zero, like the two basis vectors, there's no guarantee that they will stay perpendicular after the transformation, preserving that zero dot product.

Image
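
You can see this failure numerically with a quick sketch. It borrows the matrix and input vector from this chapter's sanity check rather than the one pictured; the helper names `dot` and `apply` are our own.

```python
def dot(u, v):
    """Dot product of two 2D vectors."""
    return u[0] * v[0] + u[1] * v[1]

def apply(matrix, v):
    """Apply a 2x2 matrix, given as a list of rows, to the vector v."""
    return (matrix[0][0] * v[0] + matrix[0][1] * v[1],
            matrix[1][0] * v[0] + matrix[1][1] * v[1])

A = [[2, -1],
     [0, 1]]
i_hat, j_hat = (1, 0), (0, 1)
v = (3, 2)

print(dot(i_hat, j_hat))                      # 0: perpendicular before
print(dot(apply(A, i_hat), apply(A, j_hat)))  # -2: not perpendicular after
print(dot(v, i_hat))                          # 3: the x-coordinate of v
print(dot(apply(A, v), apply(A, i_hat)))      # 8: no longer the x-coordinate
```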

In the example we were looking at, dot products certainly aren't preserved. They tend to get bigger, since most vectors are getting stretched. In fact, transformations which do preserve dot products are special enough to have their own name: orthonormal transformations (more commonly called orthogonal transformations). These are the ones which leave all the basis vectors perpendicular to each other with unit lengths.

Image

You can often think of these as rotation matrices, possibly combined with a reflection. They correspond to rigid motion, with no stretching, squishing or morphing.

Solving a linear system with an orthonormal matrix is very easy: Since dot products are preserved, taking the dot product between the output vector and all the columns of your matrix will be the same as taking the dot products between the input vector and all the basis vectors, which is the same as finding the coordinates of the input vector.

In that very special case, x would be the dot product of the first column with the output vector, and y would be the dot product of the second column with the output vector.

Image
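
Here is a rough sketch of that special case, assuming a rotation by 30 degrees (an angle picked purely for illustration) and a hidden input of (3, 2). The variable names are our own.

```python
import math

def dot(u, v):
    """Dot product of two 2D vectors."""
    return u[0] * v[0] + u[1] * v[1]

theta = math.radians(30)  # hypothetical rotation angle
col1 = (math.cos(theta), math.sin(theta))   # where i-hat lands
col2 = (-math.sin(theta), math.cos(theta))  # where j-hat lands

# Pretend we don't know this input and only see the output.
x, y = 3, 2
output = (x * col1[0] + y * col2[0],
          x * col1[1] + y * col2[1])

# Because the columns are perpendicular unit vectors, dotting the output
# with each column recovers the corresponding input coordinate.
recovered_x = dot(col1, output)
recovered_y = dot(col2, output)
print(recovered_x, recovered_y)  # approximately 3 and 2, up to rounding
```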

A Better Approach

Now, even though this idea breaks down for most linear systems, it points us in the direction of something to look for: Is there an alternate geometric understanding for the coordinates of our input vector which remains unchanged after the transformation?

If your mind has been mulling over determinants, you might think of this clever idea: Take the parallelogram defined by the first basis vector, \hat{\imath}, and the mystery input vector \left[\begin{array}{c} x \\ y \end{array}\right]. The area of this parallelogram is its base, 1, times the height perpendicular to that base, which is the y-coordinate of our input vector.

Image

The area of this parallelogram is sort of a screwy roundabout way to describe the vector's y-coordinate; it's a wacky way to talk about coordinates, but run with us.

Actually, to be more accurate, you should think of the signed area of this parallelogram, in the sense described in the determinant chapter. That way, a vector with negative y-coordinate would correspond to a negative area for this parallelogram.

Image

Symmetrically, if you look at the parallelogram spanned by the vector and the second basis vector, \hat{\jmath}, its area will be the x-coordinate of the vector. Again, it's a strange way to represent the x-coordinate, but you'll see what it buys us in a moment.

Image
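
If you'd like to check this claim, here is a short sketch. The vector (3, -2) is a hypothetical example chosen to have a negative y-coordinate, and `signed_area` is our own name for the 2x2 determinant with the two vectors as columns.

```python
def signed_area(u, v):
    """Signed area of the parallelogram spanned by u, then v:
    the determinant of the 2x2 matrix whose columns are u and v."""
    return u[0] * v[1] - v[0] * u[1]

i_hat, j_hat = (1, 0), (0, 1)
v = (3, -2)  # hypothetical vector with a negative y-coordinate

print(signed_area(i_hat, v))  # -2: the y-coordinate
print(signed_area(v, j_hat))  # 3: the x-coordinate
```

Notice that the order of the two vectors matters: the basis vector you are replacing is the one whose slot the mystery vector takes.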

Here's what this would look like in three dimensions: Ordinarily the way you might think of one of a vector's coordinates, say its z-coordinate, would be to take its dot product with the third standard basis vector, \hat{k}. But instead, consider the parallelepiped it creates with the other two basis vectors, \hat{\imath} and \hat{\jmath}.

Image

If you think of the square with area 1 spanned by \hat{\imath} and \hat{\jmath} as the base of this guy, its volume is the same as its height, which is the third coordinate of our vector.

Image

Likewise, the wacky way to think about any other coordinate of this vector is to form the parallelepiped between this vector and all the basis vectors other than the one you're looking for, and get its volume.

ImageImage

Or, rather, we should talk about the signed volume of these parallelepipeds, in the sense described in the determinant chapter, where the order in which you list the three vectors matters and you're using the right-hand rule. That way negative coordinates still make sense.
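
The same check works in three dimensions. In this sketch the signed volume is computed as a scalar triple product, which equals the determinant of the matrix whose columns are the three vectors; the vector (2, -3, 5) and the name `signed_volume` are assumptions made for illustration.

```python
def signed_volume(a, b, c):
    """Signed volume of the parallelepiped spanned by a, b, c (in that order),
    computed as a . (b x c), the determinant with columns a, b, c."""
    return (a[0] * (b[1] * c[2] - b[2] * c[1])
            - a[1] * (b[0] * c[2] - b[2] * c[0])
            + a[2] * (b[0] * c[1] - b[1] * c[0]))

i_hat, j_hat, k_hat = (1, 0, 0), (0, 1, 0), (0, 0, 1)
v = (2, -3, 5)  # hypothetical vector

# Put v in the slot of the basis vector whose coordinate you want.
print(signed_volume(v, j_hat, k_hat))  # 2: the x-coordinate
print(signed_volume(i_hat, v, k_hat))  # -3: the y-coordinate
print(signed_volume(i_hat, j_hat, v))  # 5: the z-coordinate
```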

Follow this into the output space.

Okay, so why think of coordinates as areas and volumes like this? As you apply some matrix transformation, the areas of the parallelograms don't stay the same; they may get scaled up or down. But, and this is a key idea of determinants, all these areas get scaled by the same amount, namely the determinant of our transformation matrix.

Image

For example, if you look at the parallelogram spanned by the vector where your first basis vector lands, which is the first column of the matrix, and the transformed version of \left[\begin{array}{c} x \\ y \end{array}\right], what is its area?

Well, this is the transformed version of that parallelogram we were looking at earlier, whose area was the y-coordinate of the mystery input vector. So its area will be the determinant of the transformation multiplied by that value.

Image

The y-coordinate of our mystery input vector is the area of this parallelogram, spanned by the first column of the matrix and the output vector, divided by the determinant of the full transformation.

y=\frac{\text { Area }}{\operatorname{det}(A)}
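
Here is a small numerical sketch of that scaling argument, using the matrix and input vector from this chapter's sanity check. The helper names `signed_area` and `apply` are our own.

```python
def signed_area(u, v):
    """Determinant of the 2x2 matrix whose columns are u and v."""
    return u[0] * v[1] - v[0] * u[1]

def apply(matrix, v):
    """Apply a 2x2 matrix, given as a list of rows, to the vector v."""
    return (matrix[0][0] * v[0] + matrix[0][1] * v[1],
            matrix[1][0] * v[0] + matrix[1][1] * v[1])

A = [[2, -1],
     [0, 1]]
i_hat, j_hat = (1, 0), (0, 1)
v = (3, 2)

# The determinant of A is the signed area spanned by its two columns.
det_A = signed_area(apply(A, i_hat), apply(A, j_hat))

area_before = signed_area(i_hat, v)                      # the y-coordinate
area_after = signed_area(apply(A, i_hat), apply(A, v))   # scaled by det(A)

print(det_A, area_before, area_after)  # 2 2 4
print(area_after / det_A)              # 2.0: the y-coordinate again
```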

And how do you get this area? Well, we know the coordinates of where the mystery input vector lands; that's the whole point of a linear system of equations. So create a matrix whose first column is the same as that of our matrix, and whose second column is the output vector, and take its determinant.

Image

Look at that; just using data from the output of the transformation, namely the columns of the matrix and the coordinates of our output vector, we can recover the y-coordinate of our mystery input vector.

Likewise, the same idea can get you the x-coordinate. Look at that parallelogram we defined earlier which encodes the x-coordinate of the mystery input vector, spanned by the input vector and \hat{\jmath}. The transformed version of this guy is spanned by the output vector and the second column of the matrix, and its area will have been multiplied by the determinant of the matrix.

Image

The x-coordinate of our mystery input vector is this area divided by the determinant of the transformation. Similar to what we did before, you can compute the area of that output parallelogram by creating a new matrix whose first column is the output vector, and whose second column is the same as that of the original matrix.

Image

Again, just using data from the output space, the numbers we see in our original linear system, we can recover the x-coordinate of our mystery input vector. This formula for finding the solutions to a linear system of equations is known as Cramer's rule.
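
Written out for a general system with two unknowns, the rule looks as follows. The letters a, b, c, d for the matrix entries and e, f for the output vector are labels chosen here for illustration, not notation from the lesson, and the formula assumes the determinant ad - bc is nonzero.

```latex
\left[\begin{array}{cc} a & b \\ c & d \end{array}\right]
\left[\begin{array}{c} x \\ y \end{array}\right]
=
\left[\begin{array}{c} e \\ f \end{array}\right]
\quad \Longrightarrow \quad
x=\frac{\operatorname{det}\left(\left[\begin{array}{cc} e & b \\ f & d \end{array}\right]\right)}{\operatorname{det}\left(\left[\begin{array}{cc} a & b \\ c & d \end{array}\right]\right)}=\frac{e d-b f}{a d-b c},
\qquad
y=\frac{\operatorname{det}\left(\left[\begin{array}{cc} a & e \\ c & f \end{array}\right]\right)}{\operatorname{det}\left(\left[\begin{array}{cc} a & b \\ c & d \end{array}\right]\right)}=\frac{a f-e c}{a d-b c}
```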

Sanity check

Just to sanity check ourselves, let's plug in the numbers. The determinant of the top altered matrix is 4+2=6 and the bottom determinant is 2, so the x-coordinate should be \frac{6}{2} = 3.

x=\frac{\text { Area }}{\operatorname{det}(A)}=\frac{\operatorname{det}\left(\left[\begin{array}{rr}
4 & -1 \\
2 & 1
\end{array}\right]\right)}{\operatorname{det}\left(\left[\begin{array}{rr}
2 & -1 \\
0 & 1
\end{array}\right]\right)}=\frac{(4)(1)-(-1)(2)}{(2)(1)-(-1)(0)}=\frac{6}{2}=3

And indeed, looking back at that input vector we started with, its x-coordinate is 3.

Image

Likewise, Cramer's rule suggests the y-coordinate should be \frac{4}{2} = 2 and that is indeed the y-coordinate of the input vector we started with here.

y=\frac{\text { Area }}{\operatorname{det}(A)}=\frac{\operatorname{det}\left(\left[\begin{array}{ll}
2 & 4 \\
0 & 2
\end{array}\right]\right)}{\operatorname{det}\left(\left[\begin{array}{rr}
2 & -1 \\
0 & 1
\end{array}\right]\right)}=\frac{(2)(2)-(4)(0)}{(2)(1)-(-1)(0)}=\frac{4}{2}=2
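
The two computations above can be sketched in a few lines of Python. The function names `det2` and `cramer_2x2` are our own, and this is a minimal illustration under the nonzero-determinant assumption rather than a recommended solver.

```python
def det2(m):
    """Determinant of a 2x2 matrix given as a list of rows."""
    (a, b), (c, d) = m
    return a * d - b * c

def cramer_2x2(A, out):
    """Solve A [x, y] = out with Cramer's rule, assuming det(A) != 0."""
    det_A = det2(A)
    if det_A == 0:
        raise ValueError("zero determinant: Cramer's rule does not apply")
    # Replace the first column with the output vector to get x.
    x_matrix = [[out[0], A[0][1]],
                [out[1], A[1][1]]]
    # Replace the second column with the output vector to get y.
    y_matrix = [[A[0][0], out[0]],
                [A[1][0], out[1]]]
    return det2(x_matrix) / det_A, det2(y_matrix) / det_A

A = [[2, -1],
     [0, 1]]
print(cramer_2x2(A, (4, 2)))  # (3.0, 2.0)
```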

In three dimensions

Image

The case with three dimensions is similar, and we highly recommend you pause to think it through yourself. Here, we'll even give you a little momentum.

Image

We have this known transformation, given by a 3x3 matrix, and a known output vector, given by the right side of our linear system, and we want to know what input vector lands on this output vector.

If you think of, say, the z-coordinate of the input vector as the volume of this parallelepiped spanned by \hat{\imath}, \hat{\jmath}, and the mystery input vector, what happens to the volume of this parallelepiped after the transformation? How can you compute that new volume?

Image

Really, pause and take a moment to think through the details of generalizing this to higher dimensions; finding an expression for each coordinate of the solution to larger linear systems. Thinking through more general cases and convincing yourself that it works is where all the learning will happen, much more so than passively consuming the lesson again.
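
Once you have worked it out on paper, one way to check your answer is a sketch like the one below. It assumes integer entries so that exact fractions can be used, and it computes determinants by cofactor expansion, which is fine for small systems but far slower than Gaussian elimination for large ones. The 3x3 system and the names `det` and `cramer` are hypothetical, chosen only for illustration.

```python
from fractions import Fraction

def det(m):
    """Determinant by cofactor expansion along the first row.
    Fine for small matrices, far too slow for large ones."""
    n = len(m)
    if n == 1:
        return m[0][0]
    total = 0
    for j in range(n):
        # Minor: drop the first row and column j.
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * m[0][j] * det(minor)
    return total

def cramer(A, out):
    """Solve A v = out for a square integer matrix A with det(A) != 0."""
    det_A = det(A)
    if det_A == 0:
        raise ValueError("zero determinant: Cramer's rule does not apply")
    solution = []
    for k in range(len(A)):
        # Copy A, replacing column k with the output vector.
        A_k = [row[:k] + [out[i]] + row[k + 1:] for i, row in enumerate(A)]
        solution.append(Fraction(det(A_k), det_A))
    return solution

# Hypothetical system whose solution is (1, 2, 3).
A = [[2, 0, 1],
     [1, 1, 0],
     [0, 3, 1]]
out = [5, 3, 9]
print(cramer(A, out))  # [Fraction(1, 1), Fraction(2, 1), Fraction(3, 1)]
```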

Thanks

Special thanks to those below for supporting this lesson.

CrypticSwarm
Juan Benet
Ali Yahya
Burt Humburg
Damion Kistler
Markus Persson
Yu Jun
Dave Nicponski
Kaustuv DeBiswas
Joseph John Cox
Yana Chernobilsky
Luc Ritchie
Achille Brighton
Rish Kundalia
世珉 匡
Desmos
Mathew Bramson
Mayank M. Mehrotra
Lukas Biewald
Jerry Ling
Mustafa Mahdi
Meshal Alshammari
Robert Teed
Samantha D. Suplee
Cooper Jones
Mark Govea
John Haley
Julian Pulgarin
Jeff Linse
Boris Veselinovich
Ryan Dahl
Matt Parlmer
Henry Reich
Ben Granger
V
otavio good
Eric Lavault
Mohannad Elhamod
Ripta Pasay
John C. Vesey
Lee Burnette
Chloe Zhou
Ross Garber
Andy Petsch
Andrew Busey
Gabriel Cunha
Jim Mussared
Awoo
Dr . David G. Stork
Linh Tran
Jim Lauridson
James H. Park
Devin Scott
Tomohiro Furusawa
Myles Buckley
Alan Stein
Patrick JMT
Tianyu Ge
Jason Hise
Bernd Sing
Alvin Khaled
Chris
Mathias Jansson
David Clark
Ankalagon
James Golab
Kevin Norris
Manuel Garcia
Florian Ragwitz
Mikko
Mads Elvheim
Michael Gardner
Chad Hurst
Hadrien Pierre
sidwill
Felix Tripier
Arthur Zey
David Kedmey
Jonathan Eppele
Clark Gaebel
Ted Suzman
Dan Davison
Raghavendra Kotikalapudi
Ryan Atallah
Marcelo Gómez
Jordan Scales
supershabam
Steve Cohen
Guy rosen
George John
Kenneth Larsen
Psylence
Thomas Tarler
Denis
Eurgh SireAwe
Brooks Ryba
Oliver Steele
Dave B
1stViewMaths
Jacob Magnuson
Loro Lukic
Valentin Mayer-Eichberger
Jake Vartuli - Schonberg
Jasim Schluter
Alex Samarin
Alexander Feldman
Norton Wang
Kevin Le
Isak Hietala
Eldar Gaynetdinov
Andreas Nautsch
Sergei
Chris Connett
Britt Selvitelle
Jonathan Wilson
Waleed Hamied
Thomas Peter Berntsen
Chas Leichner
Sebastian Braunert
Christopher Lorton
Eric Younge
Prasant Jagannath
Yaw Etse
Delton Ding
Akash Kumar
Tim Robinson
Nikolay Dubina
Sean Gallagher
George Chiesa
Alec Larsen
Mike Dussault
Gokcen Eraslan
Richard Barthel
Yixiu Zhao
Steven Tomlinson
Ignacio Freiberg
Zhilong Yang
David MacCumber
Tino Adams
孟子易
David House
Roman Pinchuk
Mike Dour
tatjana dzambazova
Brian Sletten
Britton Finley
David J Wu
Chandra Sripada
Chris Carrigan Brolly
Alex Frieder
Isaac Shamie
Victor Lee
Bong Choung
Dan Esposito (Guardion)
Giovanni Filippi
James Thornton
Kristoff Kiefer
Gabi Ghita
Tao Lu
Fred Ehrsam
Dorn Hetzel
Martin Sergio H. Faester
Jacob Wallingford
Andrew Poelstra
Andy Tran
Nicholas Loh
Dmitry Chepuryshkin
Max Mitchell
Richard Burgmann
John Griffith
Jameel Syed
Sean Barrett
Stephen Michael Hartley
Tlas
Alexander Juda
Keith Smith
Hoàng Tùng Lâm
James Hughes
John V Wertheim
Chris Giddings
Song Gao
William Fritzon
zheng zhang
Matt Langford
Cody Brocious
Victor Kostyuk
Andy
Patch Kessler
Emma & Brice
Günther Köckerandl
Mohammad Kabir
Claudio Corbetta