Matrix Multiplication

Composing transformations through matrices

Matrices as Transformations

Before we can understand matrix multiplication, we need to see what a single matrix does. A 2×2 matrix is not just a grid of numbers—it is a complete description of how to transform space.

When we multiply a matrix by a vector, we are asking: where does this vector land after the transformation? The answer comes from a beautifully simple rule: the columns of the matrix tell you where the basis vectors î and ĵ end up.

Interactive: Matrix-Vector Multiplication

\vec{v} = (1.0, 1.0) \qquad A\vec{v} = (1.0, 2.0)

Drag the white vector to see how the matrix transforms it

The formula for transforming a vector is:

\begin{bmatrix} a & b \\ c & d \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} = x \begin{bmatrix} a \\ c \end{bmatrix} + y \begin{bmatrix} b \\ d \end{bmatrix}

Read this carefully: we take x copies of the first column and y copies of the second column, then add them together. The columns of the matrix are literally the transformed basis vectors.
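This column rule translates directly into code. Below is a minimal sketch in plain Python (`mat_vec` is an illustrative helper, not from the text; the example matrix is one choice consistent with the readout in the interactive above, which maps (1.0, 1.0) to (1.0, 2.0)):

```python
def mat_vec(A, v):
    """Apply a 2x2 matrix A = [[a, b], [c, d]] to v = (x, y):
    take x copies of the first column plus y copies of the second."""
    (a, b), (c, d) = A
    x, y = v
    return (x * a + y * b, x * c + y * d)

# An example matrix consistent with the readout above: it maps (1, 1) to (1, 2).
A = [[1, 0],
     [0, 2]]

print(mat_vec(A, (1, 1)))  # (1, 2)
print(mat_vec(A, (1, 0)))  # (1, 0) -- the first column: where î lands
print(mat_vec(A, (0, 1)))  # (0, 2) -- the second column: where ĵ lands
```

Feeding in the basis vectors returns the matrix's columns unchanged, which is exactly the rule stated above.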

Reading a Matrix

This insight changes how we should think about matrices. Instead of memorizing a formula, we can read what a matrix does directly from its columns.

The first column tells us where î = (1, 0) lands. The second column tells us where ĵ = (0, 1) lands. Everything else follows from linearity.

Interactive: Build Your Own Transformation

A = \begin{bmatrix} 1.0 & 0.0 \\ 0.0 & 1.0 \end{bmatrix}

The columns of the matrix are where î and ĵ land after transformation

Try setting a = 0, b = -1, c = 1, d = 0. You will see a 90° rotation. Set a = 2, b = 0, c = 0, d = 2 for uniform scaling. The numbers directly encode the geometry.
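Both settings can be checked numerically. A quick sketch in plain Python (the `mat_vec` helper mirrors the column rule from earlier and is illustrative only):

```python
def mat_vec(A, v):
    """Apply a 2x2 matrix [[a, b], [c, d]] to (x, y) via its columns."""
    (a, b), (c, d) = A
    x, y = v
    return (x * a + y * b, x * c + y * d)

rotate_90 = [[0, -1],
             [1, 0]]   # a=0, b=-1, c=1, d=0
scale_2   = [[2, 0],
             [0, 2]]   # a=2, b=0, c=0, d=2

print(mat_vec(rotate_90, (1, 0)))  # (0, 1): î rotated 90° counterclockwise
print(mat_vec(scale_2, (3, 4)))    # (6, 8): every vector doubled
```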

Composition: One Transformation After Another

Now for the key insight that unlocks matrix multiplication. Suppose we have two transformations, each described by its own matrix. What if we want to apply one, then the other?

We could transform a vector by the first matrix, then take that result and transform it by the second matrix. But there is a better way: we can combine the two matrices into a single matrix that captures the entire composed transformation.

Interactive: Composing Two Transformations

Original vectors î and ĵ
A = \begin{bmatrix} 0.87 & -0.5 \\ 0.5 & 0.87 \end{bmatrix}

Rotation

B = \begin{bmatrix} 1.5 & 0 \\ 0 & 0.5 \end{bmatrix}

Scaling

When we write BA, we mean: first apply A, then apply B. The resulting matrix BA is computed by asking where the basis vectors land after both transformations.

(BA)\vec{v} = B(A\vec{v})

The product matrix BA captures this composition. Its first column is where î ends up after both transformations. Its second column is where ĵ ends up.
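This claim can be verified numerically with the rotation and scaling matrices from the interactive above. A sketch in plain Python (`mat_vec` and `mat_mul` are illustrative helpers; `mat_mul` builds BA column by column, exactly as described):

```python
import math

def mat_vec(M, v):
    """Apply a 2x2 matrix [[a, b], [c, d]] to (x, y)."""
    (a, b), (c, d) = M
    x, y = v
    return (x * a + y * b, x * c + y * d)

def mat_mul(B, A):
    """Product BA: each column of A, transformed by B."""
    i_hat = mat_vec(B, (A[0][0], A[1][0]))  # where î lands after A, then B
    j_hat = mat_vec(B, (A[0][1], A[1][1]))  # where ĵ lands after A, then B
    return [[i_hat[0], j_hat[0]],
            [i_hat[1], j_hat[1]]]

A = [[0.87, -0.5], [0.5, 0.87]]  # rotation (from the interactive above)
B = [[1.5, 0.0], [0.0, 0.5]]     # scaling

v = (1.0, 2.0)
left  = mat_vec(mat_mul(B, A), v)   # one combined matrix
right = mat_vec(B, mat_vec(A, v))   # two transformations in sequence
assert all(math.isclose(p, q) for p, q in zip(left, right))
```

Whether we build the combined matrix first or transform the vector in two steps, the result is the same point, up to floating-point rounding.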

The Multiplication Formula

To compute the product, we transform each column of the second matrix by the first matrix:

\begin{bmatrix} a & b \\ c & d \end{bmatrix} \begin{bmatrix} e & f \\ g & h \end{bmatrix} = \begin{bmatrix} ae + bg & af + bh \\ ce + dg & cf + dh \end{bmatrix}

Each column of the result comes from multiplying the left matrix by the corresponding column of the right matrix. This is not an arbitrary formula—it is the natural consequence of composing transformations.
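The formula can be implemented directly. In this sketch (plain Python; `mat_mul` and the example matrices are illustrative), each entry of the result follows the pattern above:

```python
def mat_mul(M1, M2):
    """2x2 product [[a, b], [c, d]] times [[e, f], [g, h]],
    following the entry-by-entry formula."""
    (a, b), (c, d) = M1
    (e, f), (g, h) = M2
    return [[a*e + b*g, a*f + b*h],
            [c*e + d*g, c*f + d*h]]

shear = [[1, 1], [0, 1]]
scale = [[2, 0], [0, 3]]
print(mat_mul(shear, scale))  # [[2, 3], [0, 3]]
```

Note that each column of the output is just `M1` applied to the corresponding column of `M2`, so the entry formula and the column picture agree.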

Order Matters

Here is something crucial that catches many people off guard: matrix multiplication is not commutative. In general, AB ≠ BA.

Think about it geometrically. Rotating then shearing gives a different result than shearing then rotating. The order of transformations matters, so the order of matrix multiplication must matter too.

Interactive: AB vs BA

A = \begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix}

Shear

B = \begin{bmatrix} 0 & -1 \\ 1 & 0 \end{bmatrix}

90° Rotation

AB = \begin{bmatrix} 1 & -1 \\ 1 & 0 \end{bmatrix} \qquad BA = \begin{bmatrix} 0 & -1 \\ 1 & 1 \end{bmatrix}

Notice how the two results are completely different!

This is not a quirk of the notation—it reflects a deep truth about transformations in space. When you shear first, you are shearing the original coordinate system. When you rotate first, the shear operates on an already-rotated system. The outcomes are genuinely different.
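The two orders can be checked directly with the shear and rotation matrices from the interactive above (plain Python sketch; `mat_mul` implements the 2×2 entry formula from earlier and is illustrative only):

```python
def mat_mul(M1, M2):
    """2x2 product via the entry-by-entry formula."""
    (a, b), (c, d) = M1
    (e, f), (g, h) = M2
    return [[a*e + b*g, a*f + b*h],
            [c*e + d*g, c*f + d*h]]

A = [[1, 1], [0, 1]]   # shear
B = [[0, -1], [1, 0]]  # 90° rotation

print(mat_mul(A, B))  # [[1, -1], [1, 0]]: rotate first, then shear
print(mat_mul(B, A))  # [[0, -1], [1, 1]]: shear first, then rotate
```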

The Identity Matrix

There is one special matrix that deserves attention: the identity matrix. It leaves every vector exactly where it is.

I = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}

The identity matrix sends î to (1, 0) and ĵ to (0, 1), exactly where they started. For any matrix A:

AI = IA = A

Multiplying by the identity is like multiplying a number by 1. It is the do-nothing transformation, and it plays the role of the neutral element in matrix multiplication.
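Multiplying any matrix by I on either side returns it unchanged, which is easy to confirm (plain Python sketch; `mat_mul` is the same illustrative 2×2 helper as before, and the example matrix is arbitrary):

```python
def mat_mul(M1, M2):
    """2x2 product via the entry-by-entry formula."""
    (a, b), (c, d) = M1
    (e, f), (g, h) = M2
    return [[a*e + b*g, a*f + b*h],
            [c*e + d*g, c*f + d*h]]

I = [[1, 0], [0, 1]]
A = [[2.0, 1.0], [0.5, 1.5]]  # an arbitrary example matrix

# Multiplying by 1s and 0s reproduces A's entries exactly.
assert mat_mul(A, I) == A
assert mat_mul(I, A) == A
```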

Interactive: From Identity to Transformation

\begin{bmatrix} 2.00 & 1.00 \\ 0.50 & 1.50 \end{bmatrix}

Full transformation applied

Why This Matters

Matrix multiplication is the engine behind countless applications. In computer graphics, we chain transformations (translate, rotate, scale) into a single matrix. In machine learning, neural networks are essentially long compositions of linear transformations.

The geometric interpretation gives us intuition. The algebraic formulas give us computation. Together, they make matrices one of the most powerful tools in applied mathematics.

Key Takeaways

  • A matrix transforms space by moving the basis vectors to new positions
  • The columns of a matrix are where î and ĵ land after the transformation
  • Matrix multiplication composes transformations: BA means first A, then B
  • Order matters: AB ≠ BA in general
  • The identity matrix I leaves all vectors unchanged