Proof: Every matrix transformation is a linear transformation

Showing that any matrix transformation is a linear transformation is, overall, a pretty simple proof (though we should be careful using the word “simple” when it comes to linear algebra!). But it gives us the chance to really think about how the argument is structured and what is or isn’t important to include – all of which are critical skills when it comes to proof writing.


Needed definitions and properties

Since we want to show that a matrix transformation is linear, we must be clear about what it means to be a matrix transformation and what it means to be linear. From there, we can determine whether we need more information to complete the proof.

Definition of a linear transformation

For a transformation to be linear, it must satisfy the following rule for any vectors \(\vec{u}\) and \(\vec{v}\) in the domain and for any scalars \(c\) and \(d\).

\(T(c\vec{u} + d\vec{v}) = cT(\vec{u}) + dT(\vec{v})\)

Our goal will be to show that this has to hold for any matrix transformation, regardless of the domain, codomain, or specific matrix.

Definition of a matrix transformation

A matrix transformation is any transformation T which can be written in terms of multiplying a matrix and a vector. That is, for any \(\vec{x}\) in the domain of T:

\(T(\vec{x}) = A\vec{x}\) for some matrix \(A\)

We will likely need to use this definition when it comes to showing that this implies the transformation must be linear.

Properties that hold when multiplying a matrix and a vector

Since the definition relies on a matrix multiplying a vector, it might be useful to note some of the associated properties.

  • For any scalar \(c\) and vector \(\vec{x}\): \(A(c\vec{x}) = cA\vec{x}\)
  • For any two vectors \(\vec{x}\) and \(\vec{y}\): \(A(\vec{x} + \vec{y}) = A\vec{x} + A\vec{y}\)

These properties are typically proven when you first learn matrix multiplication. So, it is OK for us to use them without additional proof.
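As a quick sanity check, we can verify both properties numerically. The matrix and vectors below are arbitrary choices made just for this illustration (a minimal sketch using NumPy, not part of the proof itself):

```python
import numpy as np

# A hypothetical 3x2 matrix mapping R^2 to R^3, chosen for illustration.
A = np.array([[1.0, 2.0],
              [3.0, 4.0],
              [5.0, 6.0]])
x = np.array([1.0, -2.0])
y = np.array([0.5, 3.0])
c = 4.0

# Property 1: A(cx) = c(Ax)
assert np.allclose(A @ (c * x), c * (A @ x))

# Property 2: A(x + y) = Ax + Ay
assert np.allclose(A @ (x + y), A @ x + A @ y)
```

Of course, checking specific numbers is not a proof – that is exactly why we need the general argument below.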

The idea

Looking at the properties of matrix–vector multiplication and the definition of a linear transformation, you can see that they are almost identical statements. All we need to do is work carefully with the notation to combine the rules. Since we want to show that \(T(c\vec{u} + d\vec{v}) = cT(\vec{u}) + dT(\vec{v})\), and we know that \(T(\vec{x}) = A\vec{x}\) for some matrix \(A\) (we are assuming T is a matrix transformation for the sake of the proof), what we really need to show is:

\(A(c\vec{u} + d\vec{v}) = cA\vec{u} + dA\vec{v}\)

By carefully applying the properties of matrix–vector multiplication, this should fall right into place. We just need to organize it all!

Remember when writing your proof to always define anything you use. You will notice that is one of the first things done in the proof below.

The proof

Every matrix transformation is a linear transformation.

Suppose that T is a matrix transformation, so that \(T(\vec{x}) = A\vec{x}\) for some matrix \(A\), and that the vectors \(\vec{u}\) and \(\vec{v}\) are in the domain of T. Then for arbitrary scalars \(c\) and \(d\):

\(\begin{align} T(c\vec{u} + d\vec{v}) &= A(c\vec{u} + d\vec{v})\\ &= A(c\vec{u}) + A(d\vec{v})\\ &= cA\vec{u} + dA\vec{v}\\ &= cT(\vec{u}) + dT(\vec{v})\end{align}\)

As \(T(c\vec{u} + d\vec{v}) = cT(\vec{u}) + dT(\vec{v})\), T must be a linear transformation. \(\blacksquare\)

That’s it! All that build-up for nothing, huh? Well, not quite. Organizing our thoughts beforehand allowed us to write a nice, succinct proof. That’s the idea – we want people to be able to read this and see why the statement must be true.
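The chain of equalities in the proof can also be spot-checked numerically. In the sketch below, T, the matrix A, and the vectors and scalars are all hypothetical examples, not anything fixed by the proof:

```python
import numpy as np

# A hypothetical matrix transformation T(x) = Ax from R^3 to R^2.
A = np.array([[2.0, 0.0, 1.0],
              [-1.0, 3.0, 0.0]])

def T(x):
    return A @ x

u = np.array([1.0, 2.0, 3.0])
v = np.array([-1.0, 0.0, 2.0])
c, d = 2.5, -4.0

# The linearity condition from the proof holds for these inputs.
assert np.allclose(T(c * u + d * v), c * T(u) + d * T(v))
```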

Now that we have this property, we can show a transformation is linear by finding a matrix A that implements the mapping (when that’s possible – see important note below). How to do this is explained here:
Finding the matrix of a linear transformation
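The basic idea is that the columns of A are the images of the standard basis vectors, \(T(\vec{e}_1), \dots, T(\vec{e}_n)\). Here is a small sketch of that construction; the particular transformation T below is a made-up example for illustration:

```python
import numpy as np

# A hypothetical linear transformation from R^2 to R^2, for illustration.
def T(x):
    return np.array([2.0 * x[0] - x[1], x[0] + x[1]])

# Build A column by column: column i is T applied to the i-th
# standard basis vector (the rows of the identity matrix).
n = 2
A = np.column_stack([T(e) for e in np.eye(n)])

# The recovered matrix reproduces T on an arbitrary vector.
v = np.array([3.0, -1.0])
assert np.allclose(A @ v, T(v))
```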


While every matrix transformation is a linear transformation, not every linear transformation is a matrix transformation. That means we may have a linear transformation for which we can’t find a matrix that implements the mapping. However, as long as our domain and codomain are \(\mathbb{R}^n\) and \(\mathbb{R}^m\) (for some m and n), this won’t be an issue.

With that domain and codomain, we CAN say that every linear transformation is a matrix transformation. It is only when we are dealing with general vector spaces that this will not always be true.