Linear Algebra

Proof: Every matrix transformation is a linear transformation

Showing that any matrix transformation is a linear transformation is overall a pretty simple proof (though we should be careful using the word “simple” when it comes to linear algebra!). But it gives us the chance to really think about how the argument is structured and what is or isn’t important to include – all of which are critical skills when it comes to proof writing.


Needed definitions and properties

Since we want to show that a matrix transformation is linear, we must make sure to be clear what it means to be a matrix transformation and what it means to be linear. From there, we can determine if we need more information to complete the proof.

Definition of a linear transformation

For a transformation to be linear, it must satisfy the following rule for any vectors \(\vec{u}\) and \(\vec{v}\) in the domain and for any scalars \(c\) and \(d\).

\(T(c\vec{u} + d\vec{v}) = cT(\vec{u}) + dT(\vec{v})\)

Our goal will be to show that this has to hold for any matrix transformation, regardless of the domain, codomain, or specific matrix.

Definition of a matrix transformation

A matrix transformation is any transformation T which can be written in terms of multiplying a matrix and a vector. That is, for any \(\vec{x}\) in the domain of T:

\(T(\vec{x}) = A\vec{x}\) for some matrix \(A\)

We will likely need to use this definition when it comes to showing that this implies the transformation must be linear.

Properties that hold when multiplying a matrix and a vector

Since the definition relies on a matrix multiplying a vector, it might be useful to note some of the associated properties.

  • For any scalar \(c\) and vector \(\vec{x}\): \(A(c\vec{x}) = cA\vec{x}\)
  • For any two vectors \(\vec{x}\) and \(\vec{y}\): \(A(\vec{x} + \vec{y}) = A\vec{x} + A\vec{y}\)

These properties are typically proven when you first learn matrix multiplication. So, it is OK for us to use them without additional proof.
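If you like to experiment, these two properties can also be checked numerically. Here is a minimal sketch in Python with NumPy; the matrix, vectors, and scalar are arbitrary choices (any would work), not values from this article.

```python
import numpy as np

# An arbitrary matrix, vectors, and scalar -- any choices work here.
A = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
x = np.array([1.0, -2.0])
y = np.array([0.5, 3.0])
c = 7.0

# Property 1: A(cx) = c(Ax)
assert np.allclose(A @ (c * x), c * (A @ x))
# Property 2: A(x + y) = Ax + Ay
assert np.allclose(A @ (x + y), A @ x + A @ y)
```

Of course, a check with specific numbers only illustrates the properties; the general proofs are the ones you saw when first learning matrix multiplication.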

The idea

Looking at the properties of matrix-vector multiplication and the definition of a linear transformation, you can see that they are almost identical statements. All we need to do is work carefully with the notation to combine the rules. Since we want to show that \(T(c\vec{u} + d\vec{v}) = cT(\vec{u}) + dT(\vec{v})\), and we know that \(T(\vec{x}) = A\vec{x}\) for some matrix \(A\) (we are assuming T is a matrix transformation for the sake of the proof), we really need to show:

\(A(c\vec{u} + d\vec{v}) = cA\vec{u} + dA\vec{v}\)

By carefully applying the properties of multiplication, this should fall right into place. We just need to organize it all!

Remember when writing your proof to always define anything you use. You will notice that is one of the first things done in the proof below.

The proof

Every matrix transformation is a linear transformation.

Suppose that T is a matrix transformation such that \(T(\vec{x}) = A\vec{x}\) for some matrix \(A\) and that the vectors \(\vec{u}\) and \(\vec{v}\) are in the domain. Then for arbitrary scalars \(c\) and \(d\):

\(\begin{align} T(c\vec{u} + d\vec{v}) &= A(c\vec{u} + d\vec{v})\\ &= A(c\vec{u}) + A(d\vec{v})\\ &= cA\vec{u} + dA\vec{v}\\ &= cT(\vec{u}) + dT(\vec{v})\end{align}\)

As \(T(c\vec{u} + d\vec{v}) = cT(\vec{u}) + dT(\vec{v})\), T must be a linear transformation. \(\blacksquare\)

That’s it! All that build-up for nothing, huh? Well, not quite. Organizing our thoughts beforehand allowed us to write a nice and succinct proof. That’s the idea – we want people to be able to read this and see why the statement must be correct.
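A numerical spot-check is no substitute for the proof, but it can catch slips in the algebra. Here is a quick sketch in Python, with a hypothetical matrix and vectors chosen arbitrarily (not from the article):

```python
import numpy as np

# A (hypothetical) matrix transformation T(x) = Ax.
A = np.array([[2.0, -1.0, 0.0], [1.0, 3.0, 4.0]])

def T(x):
    return A @ x

u = np.array([1.0, 0.0, -2.0])
v = np.array([3.0, 5.0, 1.0])
c, d = 2.0, -3.0

# The identity established in the proof: T(cu + dv) = cT(u) + dT(v)
assert np.allclose(T(c * u + d * v), c * T(u) + d * T(v))
```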

Now that we have this property, we can show a transformation is linear by finding a matrix A that implements the mapping (when that’s possible – see important note below). How to do this is explained here:
Finding the matrix of a linear transformation

Important

While every matrix transformation is a linear transformation, not every linear transformation is a matrix transformation. That means that we may have a linear transformation where we can’t find a matrix to implement the mapping. However, as long as our domain and codomain are \(R^n\) and \(R^m\) (for some m and n), then this won’t be an issue.

With that domain and codomain, we CAN say that every linear transformation is a matrix transformation. It is when we are dealing with general vector spaces that this will not always be true.

Matrix transformations

A matrix transformation is a transformation whose rule is based on multiplication of a vector by a matrix. This type of transformation is of particular interest to us in studying linear algebra as matrix transformations are always linear transformations. Further, we can use the matrix that defines the transformation to better understand other properties of the transformation itself.

Mathematically, a transformation T is a matrix transformation if we can write \(T(\vec{x}) = A\vec{x}\) for some matrix \(A\).


Example of a matrix transformation

Let \(T(\vec{x}) = A\vec{x}\) where \(A = \begin{bmatrix} -1 & 2 & 0\\ 0 & 4 & 1\\ \end{bmatrix}\).

Observations

The example above shows a matrix transformation, since T is defined through multiplying the matrix \(A\) and the input vector \(\vec{x}\). To better understand this transformation, we will make a few observations.

The domain of this transformation is \(\mathbb{R}^3\)

If we “plug in” a vector \(\vec{x}\), we find its image through multiplying the vector by \(A\). But, this multiplication is not always defined! This will only be defined if the number of entries in \(\vec{x}\) matches the number of columns in \(A\). This means that when we write out the size of the matrix and the size of the vector, the “inner numbers” must match.

[Figure: the size of \(A\) (2 × 3) written next to the size of \(\vec{x}\) (3 × 1), with the matching “inner numbers” (3 and 3) highlighted]

The codomain of this transformation is \(\mathbb{R}^2\)

When multiplying a matrix and a vector, the result is determined by the “outer numbers” in our diagram above.

[Figure: the same size diagram, with the “outer numbers” (2 and 1) giving the size of the output vector]

What is happening mathematically to make this true?

If we let \(\vec{x} = \begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ \end{bmatrix}\), then:


\(T(\vec{x}) = A\vec{x} = \begin{bmatrix} -1 & 2 & 0\\ 0 & 4 & 1\\ \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ \end{bmatrix} = x_1 \begin{bmatrix} -1 \\ 0\\ \end{bmatrix} + x_2\begin{bmatrix} 2 \\ 4\\ \end{bmatrix} + x_3\begin{bmatrix} 0 \\ 1\\ \end{bmatrix}\)

In other words, the image of \(\vec{x}\) will be a linear combination of the columns of \(A\). Since the columns each have 2 entries, we know that the columns are vectors in \(\mathbb{R}^2\), and so any linear combination of them will also be in \(\mathbb{R}^2\). (Written as matrices, these vectors have size 2 × 1.)

From these two observations, we know that:


\(T: \mathbb{R}^3 \rightarrow \mathbb{R}^2\)

That is, T maps vectors in \(\mathbb{R}^3\) to vectors in \(\mathbb{R}^2\).

To find the image of any vector, we just need to multiply

Suppose that we wanted to find the image of the vector \(\begin{bmatrix} 6 \\ 1 \\ 3 \\ \end{bmatrix}\) under T. Then, we would simply “plug” this vector into T.


\(T\left(\begin{bmatrix} 6 \\ 1 \\ 3 \\ \end{bmatrix}\right) = \begin{bmatrix} -1 & 2 & 0\\ 0 & 4 & 1\\ \end{bmatrix} \begin{bmatrix} 6 \\ 1 \\ 3 \\ \end{bmatrix} = 6 \begin{bmatrix} -1 \\ 0\\ \end{bmatrix} + 1\begin{bmatrix} 2 \\ 4\\ \end{bmatrix} + 3\begin{bmatrix} 0 \\ 1\\ \end{bmatrix} = \begin{bmatrix} -4 \\ 7 \\ \end{bmatrix}\)

Since \(T\left(\begin{bmatrix} 6 \\ 1 \\ 3 \\ \end{bmatrix}\right) = \begin{bmatrix} -4 \\ 7 \\ \end{bmatrix}\), the image of \(\begin{bmatrix} 6 \\ 1 \\ 3 \\ \end{bmatrix}\) under T is \(\begin{bmatrix} -4 \\ 7 \\ \end{bmatrix}\).
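The computation above translates directly to code. A quick sketch with NumPy, computing the image both by direct multiplication and as a linear combination of the columns of \(A\):

```python
import numpy as np

# The matrix A from the example.
A = np.array([[-1, 2, 0], [0, 4, 1]])
x = np.array([6, 1, 3])

# Direct multiplication...
image = A @ x
# ...and the same image as a linear combination of the columns of A.
combo = x[0] * A[:, 0] + x[1] * A[:, 1] + x[2] * A[:, 2]

assert np.array_equal(image, np.array([-4, 7]))
assert np.array_equal(image, combo)
```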

T must be linear

Every matrix transformation is a linear transformation. You can review a proof of this idea here: Proof that every matrix transformation is a linear transformation

Introduction to linear transformations

In linear algebra, a transformation between two vector spaces is a rule that assigns to each vector in one space a vector in the other space. Linear transformations are transformations that satisfy a particular property involving addition and scalar multiplication. In this lesson, we will look at the basic notation of transformations, what is meant by “image” and “range”, as well as what makes a linear transformation different from other transformations.

Table of Contents

  1. The idea of a mapping
  2. Terminology: domain, codomain, image, and range for linear and other transformations
    1. Example of finding the image under a transformation
  3. What makes a transformation linear?


The idea of a mapping

In mathematics, we sometimes use the word mapping to describe the same idea as a transformation. You are already familiar with mappings. For example, we could make up a rule that maps the real numbers to the real numbers. One such rule could be “multiply by 10”. Then 8 would be mapped to 80, 3 would be mapped to 30, and so on.

Transformations in linear algebra are mappings as well, but they map vectors to vectors. This can be done with a rule described using a formula, or in the case of mappings between \(R^n\) and \(R^m\), maybe a matrix.

Remember that not all transformations are linear, but many that you study in linear algebra will be, and that yields a lot of useful theorems and problem-solving techniques.

In this lesson, we will only consider transformations between the vector spaces \(R^n\) and \(R^m\) (for some m and n). See: Euclidean space.

Terminology: domain, codomain, image, and range for linear and other transformations

When a transformation T “maps” vectors in \(R^n\) to vectors in \(R^m\), we write:

\(T: R^n \rightarrow R^m\)

We then call \(R^n\) the domain and \(R^m\) the codomain. That is, T maps vectors in the domain to vectors in the codomain.


The image of a vector under a transformation and the range of a transformation

Suppose that our rule assigns the vector \(\vec{x}\) to the vector \(\vec{y}\). Then, just like in our algebra and calculus classes, we can write:

\(T(\vec{x}) = \vec{y}\)

We would then say that \(\vec{y}\) is the image of \(\vec{x}\) under T. The set of all images under T is called the range of T, denoted range(T).

Note that the range of T is a subset, or part of, the codomain. These two sets of vectors may or may not be equal. This is something we will study more when we look at 1-1 and onto transformations.

Example

Let \(T\left(\begin{bmatrix} x \\ y \\ z \\ \end{bmatrix}\right) = \begin{bmatrix} 2x \\ 5y \\ 3z \\ \end{bmatrix}\)

Find the image of \(\vec{v} = \begin{bmatrix} -1 \\ 1 \\ 4 \\ \end{bmatrix}\)

Solution

This is just asking us to find \(T(\vec{v})\). Using the rule:

\(\begin{align}T\left(\begin{bmatrix} -1 \\ 1 \\ 4 \\ \end{bmatrix}\right) &= \begin{bmatrix} 2(-1) \\ 5(1) \\ 3(4) \\ \end{bmatrix} \\ &= \boxed{\begin{bmatrix} -2 \\ 5 \\ 12 \\ \end{bmatrix}}\end{align}\)

This shows us that T maps the vector \(\begin{bmatrix} -1 \\ 1 \\ 4 \\ \end{bmatrix}\) to the vector \(\begin{bmatrix} -2 \\ 5 \\ 12 \\ \end{bmatrix}\). Thus, we also know that \(\begin{bmatrix} -2 \\ 5 \\ 12 \\ \end{bmatrix}\) is in range(T).
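The rule from this example can also be written as a short Python function, which makes it easy to check the hand computation:

```python
import numpy as np

# The rule from the example, written as a function.
def T(w):
    x, y, z = w
    return np.array([2 * x, 5 * y, 3 * z])

v = np.array([-1, 1, 4])

# The image of v under T, matching the hand computation above.
assert np.array_equal(T(v), np.array([-2, 5, 12]))
```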

What makes a transformation linear?

For a transformation to be linear, the following must hold for all vectors \(\vec{u}\) and \(\vec{v}\) in the domain and any scalar \(c\).

\(T(c\vec{v}) = cT(\vec{v})\)

\(T(\vec{u} + \vec{v}) = T(\vec{u}) + T(\vec{v})\)

These two rules can be combined into a single rule that must hold for any scalars \(c\) and \(d\).

\(T(c\vec{u} + d\vec{v}) = cT(\vec{u}) + dT(\vec{v})\)

This means that we can factor out scalars before applying T and break T up over addition or subtraction. Linear transformations make up a whole class of transformations that are studied in linear algebra. For a more in-depth look at this rule, you can read the following article:

How to show a transformation is linear using the definition.

Showing a transformation is linear using the definition

When we say that a transformation is linear, we are saying that we can “pull” constants out before applying the transformation and break the transformation up over addition and subtraction. Mathematically, this means that the following two rules hold for any vectors \(\vec{u}\) and \(\vec{v}\) in the domain and any scalar \(c\).

  1. \(T(c\vec{v}) = cT(\vec{v})\)
  2. \(T(\vec{u} + \vec{v}) = T(\vec{u}) + T(\vec{v})\)

These two rules can be combined into the following equivalent rule, which must hold for any scalars \(c\) and \(d\).

\(T(c\vec{u} + d\vec{v}) = cT(\vec{u}) + dT(\vec{v})\)


Using this rule to prove a transformation is linear

Let’s use an example to see how you would use this definition to prove a given transformation is linear.

Example

Show that \(T\left(\begin{bmatrix} x \\ y \\ z \\ \end{bmatrix}\right) = \begin{bmatrix} x \\ 5y \\ x + z \\ \end{bmatrix}\) is a linear transformation, using the definition.

Solution

Looking at the rule, this transformation takes vectors in \(R^3\) to vectors in \(R^3\), as the input and output vectors both have 3 entries. We must show that the definition above holds for ANY vectors in the domain \(R^3\) and any scalars, \(c\) and \(d\).

Anytime you start a proof like this, make sure you define any variables you use. Notice that this is essentially the first line of the proof below.

Overall, since our goal is to show that \(T(c\vec{u} + d\vec{v}) = cT(\vec{u}) + dT(\vec{v})\), we will calculate one side of this equation and then the other, finally showing that they are equal.

Proof

Let \(\vec{u} = \begin{bmatrix} u_1 \\ u_2 \\ u_3 \\ \end{bmatrix}\) and \(\vec{v} = \begin{bmatrix} v_1 \\ v_2 \\ v_3 \\ \end{bmatrix}\) be vectors in \(R^3\) and \(c\) and \(d\) be scalars.

Then:

\(T(c\vec{u} + d\vec{v}) =\)

Idea: combine the vectors and then apply the rule for T.

\(\begin{align}T\left(c\begin{bmatrix} u_1 \\ u_2 \\ u_3 \\ \end{bmatrix} + d\begin{bmatrix} v_1 \\ v_2 \\ v_3 \\ \end{bmatrix}\right) &= T\left(\begin{bmatrix} cu_1 + dv_1 \\ cu_2 + dv_2 \\ cu_3 + dv_3\\\end{bmatrix}\right)\\ &= \begin{bmatrix} cu_1 + dv_1 \\ 5(cu_2 + dv_2) \\ (cu_1 + dv_1) + (cu_3 + dv_3)\\\end{bmatrix}\end{align}\)

And:

\(cT(\vec{u}) + dT(\vec{v}) =\)

Idea: apply the rule for T first and then combine the vectors.

\(\begin{align} cT\left(\begin{bmatrix} u_1 \\ u_2 \\ u_3 \\ \end{bmatrix}\right) + dT\left(\begin{bmatrix} v_1 \\ v_2 \\ v_3 \\ \end{bmatrix}\right) &= c\begin{bmatrix} u_1 \\ 5u_2 \\ u_1 + u_3 \\ \end{bmatrix} + d\begin{bmatrix} v_1 \\ 5v_2 \\ v_1 + v_3 \\ \end{bmatrix}\\ &= \begin{bmatrix} cu_1 + dv_1\\ 5cu_2 + 5dv_2 \\ c(u_1 + u_3) + d(v_1 + v_3) \\ \end{bmatrix}\end{align}\)

Idea: to conclude the proof, explain how this shows the definition holds and show the two calculations are equal.

Since:

\(\begin{bmatrix} cu_1 + dv_1 \\ 5(cu_2 + dv_2) \\ (cu_1 + dv_1) + (cu_3 + dv_3)\\\end{bmatrix} = \begin{bmatrix} cu_1 + dv_1 \\ 5cu_2 + 5dv_2 \\ (cu_1 + cu_3) + (dv_1 + dv_3)\\\end{bmatrix} = \begin{bmatrix} cu_1 + dv_1\\ 5cu_2 + 5dv_2 \\ c(u_1 + u_3) + d(v_1 + v_3) \\ \end{bmatrix}\)

we have shown that \(T(c\vec{u} + d\vec{v}) = cT(\vec{u}) + dT(\vec{v})\). Thus, by definition, the transformation is linear. \(\blacksquare\)

Important

Notice that the proof above did not use any specific vectors like \(\begin{bmatrix} 1 \\ 5 \\ 2 \\ \end{bmatrix}\) or \(\begin{bmatrix} 6 \\ 9 \\ 0 \\ \end{bmatrix}\). Showing the rule works for specific vectors or scalars only shows that it works for those specific values! The idea is that we must show it works for ANY GENERAL vectors and scalars.
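That said, once you have written a general proof, plugging in specific values is a handy way to catch an algebra slip. A quick sketch, using the specific vectors mentioned above with arbitrary scalars (this checks the proof's conclusion, it does not replace the proof):

```python
import numpy as np

# The transformation from the example.
def T(w):
    x, y, z = w
    return np.array([x, 5 * y, x + z])

u = np.array([1.0, 5.0, 2.0])
v = np.array([6.0, 9.0, 0.0])
c, d = 3.0, -2.0

# Consistent with (but not a proof of) linearity:
assert np.allclose(T(c * u + d * v), c * T(u) + d * T(v))
```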

Other methods

When a transformation maps vectors from \(R^n\) to \(R^m\) for some n and m (like the one above, for instance), then we have other methods that we can apply to show that it is linear. For example, we can show that T is a matrix transformation, since every matrix transformation is a linear transformation. To learn how to find such a matrix, check out this article: Finding the standard matrix for a linear transformation.

Row reduction with the TI83 or TI84 calculator (rref)

Row reducing a matrix can help us find the solution to a system of equations (in the case of augmented matrices), understand the properties of a set of vectors, and more. Knowing how to use row operations to reduce a matrix by hand is important, but in many cases, we simply need to know what the reduced matrix looks like. In these cases, technology like a graphing calculator is a great tool to use!


We will go through the steps using this matrix:

[Image: the 3 × 3 example matrix]

Step 1: Go to the matrix menu on your calculator.

Press [2nd][x^-1] to enter the matrix menu. Note that some older calculators have a button that simply says [MATRX]. Press the right arrow until you are under the EDIT menu.


Press [ENTER] and you can now edit matrix A.

Step 2: Enter your matrix into the calculator.

The first information you are asked is the size of the matrix. This matrix has 3 rows and 3 columns, so it is a 3 x 3 matrix. Type these numbers, pressing [ENTER] after each.


Now you can enter each number by typing it and pressing [ENTER].


Step 3: Quit out of the matrix editing screen.

This is a strange step, but if we don’t do it, the calculator tries to put a reduced matrix INSIDE of this matrix. It’s a quirk for sure, but one we can work around.

Press [2nd] and then [MODE] to quit. You will end up at a blank screen.

Step 4: Go to the matrix math menu.

Press [2nd][x^-1] to enter the matrix menu again, but this time go over to MATH.


Scroll down to “rref” (reduced row echelon form) and press [ENTER].


Step 5: Select matrix A and finally row reduce!

To select matrix A, you need to go back into the matrix menu by pressing [2nd][x^-1] but stay under the NAMES menu.


Now press [ENTER] to select matrix A.


Close your parentheses by pressing [ ) ] and then press [ENTER] to get the reduced matrix.

(Note: you don’t have to close the parentheses for this to work, but it is a good habit – or maybe leaving parentheses open just drives me crazy… one of those things.)

[Image: the rref result, the 3 × 3 identity matrix]

Interesting! This matrix turns out to be row equivalent to the identity matrix. If you are currently studying linear algebra, you know that this is a useful fact to know about a matrix. There are many properties that are now automatically true for our matrix A!

Either way, we are done. We now have row reduced matrix A.

Fractions vs. Decimals

One thing that can happen when you row reduce is that you end up with messy decimals. With one additional step, you can convert these to fractions. So, suppose we reduced a matrix and ended up with the following:

[Image: a reduced matrix whose entries are messy decimals]

Before pressing any other buttons, press [MATH] and then select ►Frac by pressing [ENTER].


Now press [ENTER] to get a reduced matrix with fractions instead of decimals.

[Image: the same reduced matrix with exact fraction entries]

This is much better! You will want to use this for most mathematical problems, as the fractions are exact values and the decimals before were approximations.
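If you have Python available, SymPy's `rref` plays the same role as the calculator's rref with ►Frac, returning exact fractions rather than decimal approximations. A quick sketch, with an arbitrary example matrix (not the one from the screenshots above):

```python
from sympy import Matrix, eye

# An arbitrary 3 x 3 example matrix.
M = Matrix([[2, 1, -1],
            [1, 3, 2],
            [4, -1, 0]])

# rref() returns the reduced row echelon form (with exact rational
# entries) and the tuple of pivot column indices.
R, pivots = M.rref()

# Like the matrix in this article, this one reduces to the identity.
assert R == eye(3)
assert pivots == (0, 1, 2)
```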


Additional Reading

Now that you are comfortable with row reduction using the calculator, you can learn how to multiply matrices with the TI83/84 or find the inverse with your calculator as well!

Row operations

Row operations are calculations we can do using the rows of a matrix in order to solve a system of equations, or later, simply row reduce the matrix for other purposes. There are three row operations that we can perform, each of which will yield a row equivalent matrix. This means that if we are working with an augmented matrix, the solution set to the underlying system of equations will stay the same.


The row operations

In the following examples, the symbol ~ means “row equivalent”.

Swap two rows

When working with a system of equations, the order you write the equations in doesn’t affect the solution. Since an augmented matrix represents a system of equations, with each row being an equation, we can swap two rows.

[Figure: two row equivalent matrices, with a double arrow marking the swap of rows 1 and 3]

Notice the notation with the double arrow. When you are performing row operations, use notation like this to keep track of what you did. It is very easy to make an arithmetic mistake, and if this happens, this notation lets you go back and find it easily.

The formal notation for this row operation (as used in some books) would be: \(R_1 \leftrightarrow R_3\).

Multiply a row by a nonzero constant

There will be times when it will be useful to multiply a row by something like 2 or 1/3. Doing this will not change the solution to the underlying system of equations since multiplying any equation by a nonzero constant results in an equivalent equation (as long as you multiply BOTH sides of the equation).

[Figure: two row equivalent matrices, with each entry of row 2 multiplied by 3]

In this example, each entry in row 2 was multiplied by the constant. A fair question here would be “Why would you do that row operation?”. We will get into that when we talk about Gauss-Jordan elimination and row reduction, but for now, I chose multiplying row 2 by 3 just for the sake of showing you how it would work.

The formal notation for this particular row operation: \(3R_2 \rightarrow R_2\). (Think: multiply row 2 by 3 and put it back where the original row 2 was.)

Multiply a row by a nonzero constant and add it to another row

Think back to when you first learned how to solve systems of equations. You likely learned how to eliminate a variable by multiplying one equation by a number and then adding the two equations. We can translate that same idea into a row operation.

[Figure: two row equivalent matrices, with \(-5R_1\) added to row 2]

Notice that all the work happened in row 2. Because of this, our shorthand notation has \(-5R_1\) next to row 2. To do the actual row operation, we took each value in row 1, multiplied it by \(-5\), and then added it to the corresponding entry in row 2.

The formal notation for this would be: \(-5R_1 + R_2 \rightarrow R_2\). (The arrow points to where all the work will go.)

The big picture

The last example shows the true power of row operations. By choosing to add \(-5R_1\) to row 2, we eliminated \(x_1\) in the 2nd equation. If we continued this process and did similar row operations for other variables, then we would be able to eliminate variables in a way that lets us see the solution to the underlying system. In fact, this is exactly what we will study when talking about Gauss-Jordan elimination and row reduction.
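The three row operations are also easy to mimic in code, which can help when checking hand work. A sketch with NumPy, applying one operation of each type to an arbitrarily chosen matrix:

```python
import numpy as np

# An arbitrary example matrix.
M = np.array([[1.0, 2.0, 3.0],
              [5.0, 4.0, 0.0],
              [2.0, -1.0, 1.0]])

# R1 <-> R3: swap rows 1 and 3.
M[[0, 2]] = M[[2, 0]]

# 3R2 -> R2: multiply row 2 by 3.
M[1] = 3 * M[1]

# -2R1 + R2 -> R2: add -2 times row 1 to row 2.
M[1] = -2 * M[0] + M[1]

assert np.array_equal(M, np.array([[2.0, -1.0, 1.0],
                                   [11.0, 14.0, -2.0],
                                   [1.0, 2.0, 3.0]]))
```

Each step replaces the matrix with a row equivalent one, so an augmented matrix processed this way keeps the same solution set.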

Augmented matrices and systems of linear equations

You can think of an augmented matrix as being a way to organize the important parts of a system of linear equations. These “important parts” would be the coefficients (numbers in front of the variables) and the constants (numbers not associated with variables).


Writing the augmented matrix for a system

Let’s look at two examples and write out the augmented matrix for each, so we can better understand the process. The key is to keep it so each column represents a single variable and each row represents a single equation. The augment (the part after the line) represents the constants.

Example

Write the augmented matrix for the system of equations:
\(\begin{array}{l}3x_1 + 5x_2 - x_3 = 10\\ x_1 + 4x_2 + x_3 = 7\\ 9x_1 + 2x_3 = 1\\ \end{array}\)

Solution

There are three variables, and so we will need a column for each. Be careful – notice that the last equation doesn’t have an \(x_2\). That will be represented with a 0.

Each row represents an equation and each column holds the coefficients of a single variable, with the constants in the augment:

\(\left[ \begin{array}{ccc|c} 3 & 5 & -1 & 10 \\ 1 & 4 & 1 & 7 \\ 9 & 0 & 2 & 1 \\ \end{array} \right]\)
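As a preview of where this is headed (row reduction is covered in later articles), SymPy can reduce this augmented matrix and read off the solution of the system. A sketch:

```python
from sympy import Matrix, Rational

# The augmented matrix from the example, entered row by row.
M = Matrix([[3, 5, -1, 10],
            [1, 4, 1, 7],
            [9, 0, 2, 1]])

R, _ = M.rref()

# The reduced form encodes the solution x1 = 1/5, x2 = 9/5, x3 = -2/5.
assert R == Matrix([[1, 0, 0, Rational(1, 5)],
                    [0, 1, 0, Rational(9, 5)],
                    [0, 0, 1, Rational(-2, 5)]])
```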

Example

Write the augmented matrix for the system of equations:

\(\begin{array}{l} x_1 - 2x_2 + 8x_3 + x_4 + x_5 = 2\\ 3x_1 - x_2 + x_3 + 2x_4 + 2x_5 = -3\\ \end{array} \)

Solution

Even though this is not the type of system we are used to seeing in our usual algebra classes, we can still write an augmented matrix to represent it. The augmented matrix for this system would be:

\(\left[ \begin{array}{ccccc|c} 1 & -2 & 8 & 1 & 1 & 2 \\ 3 & -1 & 1 & 2 & 2 & -3 \\ \end{array} \right]\)

Common Questions

Does the order that I write the rows in matter?

No. In algebra, when you were solving a system like \(3x + y = 5\) and \(2x + 4y = 7\), it didn’t matter if you wrote one equation first or second. The solution to the problem didn’t change. The same is true when you have more than two equations. Since each row represents an equation, the order that you write the rows in doesn’t matter.

What is this used for?

Putting a system of equations in this form will allow us to use a new idea called row operations to find its solution (if one exists), describe the solution set (when there are infinitely many solutions), and more. Row operations can help us organize a way to do this regardless of how many variables or how many equations we are given. This will be studied in later articles.

How to Determine if a Vector is a Linear Combination of Other Vectors

The idea of a linear combination of vectors is very important to the study of linear algebra. We can use linear combinations to understand spanning sets, the column space of a matrix, and a large number of other topics. One of the most useful skills when working with linear combinations is determining when one vector is a linear combination of a given set of vectors.


Suppose that we have a vector \(\vec{v}\) and we want to know the answer to the question “is \(\vec{v}\) a linear combination of the vectors \(\vec{a}_{1}\), \(\vec{a}_{2}\), and \(\vec{a}_{3}\)?”. Using the definition of a linear combination of vectors, this question can be restated to the following:

Are there scalars \(x_{1}\), \(x_{2}\), and \(x_{3}\) such that:
\(\vec{v} = x_1\vec{a}_{1 }+x_2\vec{a}_{2}+ x_3\vec{a}_{3}\)?

If the vectors are in \(R^n\) for some \(n\), then this is a question that can be answered using the equivalent augmented matrix:

\(\left[ \begin{array}{ccc|c} \vec{a}_1 & \vec{a}_2 & \vec{a}_3 & \vec{v} \\ \end{array} \right]\)

If this matrix represents a consistent system of equations, then we can say that \(\vec{v}\) is a linear combination of the other vectors.

Example

Determine if the vector \(\begin{bmatrix} 5 \\ 3 \\ 0 \\ \end{bmatrix}\) is a linear combination of the vectors:
\(\begin{bmatrix} 2 \\ 0 \\ 1 \\ \end{bmatrix}\), \(\begin{bmatrix} 1 \\ 4 \\ 3 \\ \end{bmatrix}\), \(\begin{bmatrix} 8 \\ 1 \\ 1 \\ \end{bmatrix}\), and \(\begin{bmatrix} -4 \\ 6 \\ 1 \\ \end{bmatrix}\)

Solution

Remember that this means we want to find constants \(x_{1}\), \(x_{2}\), \(x_{3}\), and \(x_{4}\) such that:

\(\begin{bmatrix} 5 \\ 3 \\ 0 \\ \end{bmatrix} = x_{1}\begin{bmatrix} 2 \\ 0 \\ 1 \\ \end{bmatrix} + x_{2}\begin{bmatrix} 1 \\ 4 \\ 3 \\ \end{bmatrix} + x_{3}\begin{bmatrix} 8 \\ 1 \\ 1 \\ \end{bmatrix} + x_{4}\begin{bmatrix} -4 \\ 6 \\ 1 \\ \end{bmatrix}\)

This vector equation is equivalent to an augmented matrix. Setting this matrix up and row reducing, we find that:

\(\left[ \begin{array}{cccc|c} 2 & 1 & 8 & -4 & 5 \\
0 & 4 & 1 & 6 & 3 \\
1 & 3 & 1 & 1 & 0 \\
\end{array} \right]
\)

Is equivalent to:

\(\left[ \begin{array}{cccc|c} 1 & 0 & 0 & -\frac{103}{29} & -\frac{74}{29} \\
0 & 1 & 0 & \frac{42}{29} & \frac{13}{29} \\
0 & 0 & 1 & \frac{6}{29} & \frac{35}{29} \\
\end{array}\right]\)

While it isn’t pretty, this matrix does NOT contain a row such as \(\begin{bmatrix} 0 & 0 & 0 & 0 & c \\ \end{bmatrix}\) where \(c \neq 0\), which would indicate the underlying system is inconsistent. Therefore the underlying system is consistent (has a solution), which means the vector equation is also consistent.

So, we can say that \(\begin{bmatrix} 5 \\ 3 \\ 0 \\ \end{bmatrix}\) is a linear combination of the other vectors.
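The row reduction above can be reproduced in SymPy, with the consistency check expressed in terms of pivot columns: a pivot in the augment column would signal an inconsistent system. A sketch:

```python
from sympy import Matrix

# Columns: the four given vectors, then the target vector as the augment.
M = Matrix([[2, 1, 8, -4, 5],
            [0, 4, 1, 6, 3],
            [1, 3, 1, 1, 0]])

R, pivots = M.rref()

# No pivot in the last column (index 4) means the system is consistent,
# so the target vector IS a linear combination of the other four.
assert 4 not in pivots
```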

The step-by-step process

In general, if you want to determine if a vector \(\vec{u}\) is a linear combination of vectors \(\vec{v}_{1}\), \(\vec{v}_{2}\), … , \(\vec{v}_{p}\) (for any whole number \(p \geq 2\)) you will do the following.

Step 1

Set up the augmented matrix

\(\left[ \begin{array}{cccc|c} \vec{v}_1 & \vec{v}_2 & \cdots & \vec{v}_p & \vec{u} \\ \end{array} \right]\)

and row reduce it.

Step 2

Use the reduced form of the matrix to determine if the augmented matrix represents a consistent system of equations. If so, then \(\vec{u}\) is a linear combination of the others. Otherwise, it is not.

In the second step, it is important to remember that a system of equations is consistent if there is one solution OR many solutions. The number of solutions is not important – only that there IS at least one solution. That means there is at least one way to write the given vector as a linear combination of the others.
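The two steps translate into a short helper function. A sketch using SymPy (the function name is ours, for illustration only, not a standard routine):

```python
from sympy import Matrix

def is_linear_combination(vectors, u):
    # Step 1: build [v1 v2 ... vp | u] with the vectors as columns,
    # then row reduce it.
    M = Matrix.hstack(*[Matrix(v) for v in vectors], Matrix(u))
    _, pivots = M.rref()
    # Step 2: the system is consistent exactly when the augment column
    # is NOT a pivot column.
    return (M.cols - 1) not in pivots

# The example from earlier in this article:
assert is_linear_combination(
    [[2, 0, 1], [1, 4, 3], [8, 1, 1], [-4, 6, 1]], [5, 3, 0])
```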

Writing a Vector as a Linear Combination of Other Vectors

Sometimes you might be asked to write a vector as a linear combination of other vectors. This requires the same work as above with one more step. You need to use a solution to the vector equation to write out how the vectors are combined to make the new vector.

Let’s start with an easier case than the one we did before and then come back to that one, since it is a bit more complicated.

Example

Write the vector \(\vec{v} = \begin{bmatrix} 2 \\ 4 \\ 2 \\ \end{bmatrix}\) as a linear combination of the vectors:
\(\begin{bmatrix} 2 \\ 0 \\ 1 \\ \end{bmatrix}\), \(\begin{bmatrix} 0 \\ 1 \\ 0 \\ \end{bmatrix}\), and \(\begin{bmatrix} -2 \\ 0 \\ 0 \\ \end{bmatrix}\)

Solution

Step 1

We set up our augmented matrix and row reduce it.

\(
\left[ \begin{array}{ccc|c} 2 & 0 & -2 & 2 \\
0 & 1 & 0 & 4 \\
1 & 0 & 0 & 2 \\
\end{array} \right]
\)

is equivalent to

\(
\left[ \begin{array}{ccc|c} 1 & 0 & 0 & 2 \\
0 & 1 & 0 & 4 \\
0 & 0 & 1 & 1 \\
\end{array} \right]
\)

Step 2

We determine if the matrix represents a consistent system of equations.

Based on the reduced matrix, the underlying system is consistent. Again, this is because there is no row with all zeros in the coefficient part of the matrix and a nonzero value in the augment. (You could also use the number of pivots to make the argument.)

Unlike before, we don’t only want to verify that we have a linear combination. We want to show the linear combination itself. This means that we need an actual solution. In this case, there is only one:

\(x_1 = 2\), \(x_2 = 4\), \(x_3 = 1\)

Using these values, we can write \(\vec{v}\) as:

\(\vec{v} = \begin{bmatrix} 2 \\ 4 \\ 2 \\ \end{bmatrix} = (2)\begin{bmatrix} 2 \\ 0 \\ 1 \\ \end{bmatrix} + (4)\begin{bmatrix} 0 \\ 1 \\ 0 \\ \end{bmatrix} + (1)\begin{bmatrix} -2 \\ 0 \\ 0 \\ \end{bmatrix}\)
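A quick check in Python that these coefficients really do rebuild the vector:

```python
import numpy as np

a1 = np.array([2, 0, 1])
a2 = np.array([0, 1, 0])
a3 = np.array([-2, 0, 0])

# The linear combination found above: v = 2*a1 + 4*a2 + 1*a3.
v = 2 * a1 + 4 * a2 + 1 * a3
assert np.array_equal(v, np.array([2, 4, 2]))
```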

Now let’s go back to our first example (the one with the crazy fractions) but change the instructions a bit.

Example

Write the vector \(\vec{v} = \begin{bmatrix} 5 \\ 3 \\ 0 \\ \end{bmatrix}\) as a linear combination of the vectors:
\(\begin{bmatrix} 2 \\ 0 \\ 1 \\ \end{bmatrix}\), \(\begin{bmatrix} 1 \\ 4 \\ 3 \\ \end{bmatrix}\), \(\begin{bmatrix} 8 \\ 1 \\ 1 \\ \end{bmatrix}\), and \(\begin{bmatrix} -4 \\ 6 \\ 1 \\ \end{bmatrix}\)

When we did step 1, we had the following work. This showed that the equivalent vector equation was consistent and verified that \(\vec{v}\) was a linear combination of the other vectors.

\(\left[ \begin{array}{cccc|c} 2 & 1 & 8 & -4 & 5 \\
0 & 4 & 1 & 6 & 3 \\
1 & 3 & 1 & 1 & 0 \\
\end{array} \right]
\)

Is equivalent to:

\(\left[ \begin{array}{cccc|c} 1 & 0 & 0 & -\frac{103}{29} & -\frac{74}{29} \\
0 & 1 & 0 & \frac{42}{29} & \frac{13}{29} \\
0 & 0 & 1 & \frac{6}{29} & \frac{35}{29} \\
\end{array}\right]\)

What if we wanted to write out the linear combination? This is different from the previous example in that there are infinitely many solutions to the vector equation.

Looking more closely at this augmented matrix, we can see that there is one free variable \(x_{4}\). If we write out the equations, we have:

\(x_1 - \left(\frac{103}{29}\right)x_4 = -\frac{74}{29}\)

\(x_2 + \left(\frac{42}{29}\right)x_4 = \frac{13}{29}\)

\(x_3 + \left(\frac{6}{29}\right)x_4 = \frac{35}{29}\)

Since \(x_{4}\) is a free variable, we can let it have any value and find a solution to this system of equations. A really “nice” value would be zero. If \(x_4 = 0\), then:

\(x_1 - \frac{103}{29}(0) = -\frac{74}{29}\)

\(x_2 + \frac{42}{29}(0) = \frac{13}{29}\)

\(x_3 + \frac{6}{29}(0) = \frac{35}{29}\)

Using this solution \(\left(x_1 = -\frac{74}{29},\ x_2 = \frac{13}{29},\ x_3 = \frac{35}{29},\ x_4 = 0\right)\), we can write \(\vec{v}\) as a linear combination of the other vectors.

\(\vec{v} = \begin{bmatrix} 5 \\ 3 \\ 0 \\ \end{bmatrix} = \left(-\frac{74}{29}\right)\begin{bmatrix} 2 \\ 0 \\ 1 \\ \end{bmatrix} + \left(\frac{13}{29}\right)\begin{bmatrix} 1 \\ 4 \\ 3 \\ \end{bmatrix} + \left(\frac{35}{29}\right)\begin{bmatrix} 8 \\ 1 \\ 1 \\ \end{bmatrix} + (0)\begin{bmatrix} -4 \\ 6 \\ 1 \\ \end{bmatrix}\)

This would be one solution, but because \(x_4\) is free, there are infinitely many. For each possible value of \(x_4\), you have another correct way to write \(\vec{v}\) as a linear combination of the other vectors. For example, if \(x_4 = 1\):

\(\begin{align}x_1 &= -\frac{74}{29} + \frac{103}{29} \\ &= \frac{29}{29} \\ &= 1\end{align}\)

\(\begin{align}x_2 &= \frac{13}{29} - \frac{42}{29}\\ &= -\frac{29}{29} \\ &= -1\end{align}\)

\(\begin{align}x_3 &= \frac{35}{29} - \frac{6}{29}\\ &= \frac{29}{29} \\ &= 1\end{align}\)

Using this, we can also write:

\(\vec{v} = \begin{bmatrix} 5 \\ 3 \\ 0 \\ \end{bmatrix} = (1)\begin{bmatrix} 2 \\ 0 \\ 1 \\ \end{bmatrix} + (-1)\begin{bmatrix} 1 \\ 4 \\ 3 \\ \end{bmatrix} + (1)\begin{bmatrix} 8 \\ 1 \\ 1 \\ \end{bmatrix} + (1)\begin{bmatrix} -4 \\ 6 \\ 1 \\ \end{bmatrix}\)

How nice is that? (Note: normally, we wouldn't write out the 1s in the equation showing the linear combination. I left them there so you could see where each number from the solution ended up.)

Again, a problem like this has infinitely many answers. All you have to do is pick a value for the free variables and you will have one particular solution you can use in writing the linear combination.
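Since the general solution is parametrized by \(x_4\), any particular choice can be checked mechanically. Here is a sketch using exact fraction arithmetic so the 29ths stay exact (the function name solution_for is just for this illustration):

```python
from fractions import Fraction as F

def solution_for(x4):
    """One particular solution of the system, for a chosen value of the free variable x4."""
    x1 = F(-74, 29) + F(103, 29) * x4
    x2 = F(13, 29) - F(42, 29) * x4
    x3 = F(35, 29) - F(6, 29) * x4
    return [x1, x2, x3, F(x4)]

vectors = [[2, 0, 1], [1, 4, 3], [8, 1, 1], [-4, 6, 1]]
target = [5, 3, 0]

# Every choice of x4 gives a valid way to write v as a linear combination.
for x4 in (0, 1, 7):
    weights = solution_for(x4)
    combo = [sum(w * vec[i] for w, vec in zip(weights, vectors))
             for i in range(3)]
    assert combo == target
print(solution_for(1))  # the "nice" integer solution: x1 = 1, x2 = -1, x3 = 1, x4 = 1
```

The loop confirms that \(x_4 = 0\), \(x_4 = 1\), and even an arbitrary value like \(x_4 = 7\) all reproduce \(\vec{v}\) exactly.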

When the Vector is NOT a Linear Combination of the Others

It is worth seeing one example where a vector is not a linear combination of some given vectors. When this happens, we will end up with an augmented matrix indicating an inconsistent system of equations.

Example

Determine if the vector \(\begin{bmatrix} 1 \\ 2 \\ 1 \\ \end{bmatrix}\) is a linear combination of the vectors:
\(\begin{bmatrix} 1 \\ 1 \\ 0 \\ \end{bmatrix}\), \(\begin{bmatrix} 0 \\ 1 \\ -1 \\ \end{bmatrix}\), and \(\begin{bmatrix} 1\\ 2 \\ -1 \\ \end{bmatrix}\).

Solution

Step 1

We set up our augmented matrix and row reduce it.

\(
\left[ \begin{array}{ccc|c} 1 & 0 & 1 & 1 \\
1 & 1 & 2 & 2 \\
0 & -1 & -1 & 1 \\
\end{array} \right]
\)

is equivalent to:

\(
\left[ \begin{array}{ccc|c} 1 & 0 & 1 & 0 \\
0 & 1 & 1 & 0 \\
0 & 0 & 0 & 1 \\
\end{array} \right]
\)

Step 2

We determine if the matrix represents a consistent system of equations.

Given the form of the last row, this matrix represents an inconsistent system of equations. That means there is no way to write this vector as a linear combination of the other vectors. That's that; nothing else to say! This will be our conclusion any time row reduction results in a row of zeros in the coefficient part with a nonzero value in the augmented column.
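If you want to automate this consistency check, a naive Gauss-Jordan reduction over exact fractions will do; inconsistency shows up exactly as described above. This is a minimal sketch, not a production-grade routine, and the function name is mine:

```python
from fractions import Fraction as F

def is_consistent(aug):
    """Row reduce an augmented matrix and report whether the system is consistent."""
    m = [[F(x) for x in row] for row in aug]
    rows, cols = len(m), len(m[0]) - 1  # last column is the augment
    r = 0
    for c in range(cols):
        # Find a pivot in column c at or below row r.
        pivot = next((i for i in range(r, rows) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        m[r] = [x / m[r][c] for x in m[r]]          # scale pivot row to leading 1
        for i in range(rows):
            if i != r and m[i][c] != 0:             # clear the rest of the column
                m[i] = [a - m[i][c] * b for a, b in zip(m[i], m[r])]
        r += 1
    # Inconsistent iff some row is all zeros except a nonzero augment.
    return not any(all(x == 0 for x in row[:-1]) and row[-1] != 0 for row in m)

aug = [[1, 0, 1, 1],
       [1, 1, 2, 2],
       [0, -1, -1, 1]]
print(is_consistent(aug))  # False: no linear combination exists
```

Running this on the example's augmented matrix produces a row whose coefficient entries are all zero with a nonzero augment, so the function reports the system as inconsistent.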

Study guide – linear combinations and span


Need more practice with linear combinations and span? This 40-page study guide will help! It includes explanations, examples, practice problems, and full step-by-step solutions.


Get the study guide

Linear Combinations of Vectors – The Basics

In linear algebra, we define the concept of linear combinations in terms of vectors. But, it is actually possible to talk about linear combinations of anything as long as you understand the main idea of a linear combination:


(scalar)(something 1) + (scalar)(something 2) + (scalar)(something 3)

These “somethings” could be “everyday” variables like \(x\) and \(y\) (\(3x + 2y\) is a linear combination of \(x\) and \(y\), for instance) or something more complicated like polynomials. In general, a linear combination is a particular way of combining things (variables, vectors, etc.) using scalar multiplication and addition.


Working with vectors

Now back to vectors. Let’s say we have the following vectors:

\(\vec{v}_1 = \left[ \begin{array}{c}1\\ 2\\ 3\end{array} \right]\), \(\vec{v}_2 = \left[ \begin{array}{c}3\\ 5\\ 1\end{array} \right]\), \(\vec{v}_3 = \left[ \begin{array}{c}0\\ 0\\ 8\end{array} \right]\)

What would linear combinations of these vectors look like? Well, a linear combination of these vectors would be any combination of them using addition and scalar multiplication. A few examples would be:

The vector \(\vec{b} = \left[ \begin{array}{c}3\\ 6\\ 9\end{array} \right]\) is a linear combination of \(\vec{v}_1\), \(\vec{v}_2\), \(\vec{v}_3\).

Why is this true? This vector can be written as a combination of the three given vectors using scalar multiplication and addition. Specifically,

\(\left[ \begin{array}{c}3\\ 6\\ 9\end{array} \right] = 3\left[ \begin{array}{c}1\\ 2\\ 3\end{array} \right] + 0\left[ \begin{array}{c}3\\ 5\\ 1\end{array} \right] + 0\left[ \begin{array}{c}0\\ 0\\ 8\end{array} \right]\)

Or, using the names given to each vector:

\(\vec{b} = 3\vec{v}_1 + 0\vec{v}_2 + 0\vec{v}_3\)

The vector \(\vec{x} = \left[ \begin{array}{c}2\\ 3\\ -6\end{array} \right]\) is a linear combination of \(\vec{v}_1\), \(\vec{v}_2\), \(\vec{v}_3\).

Once again, we can show this is true by showing that you can combine the vectors \(\vec{v}_1\), \(\vec{v}_2\), and \(\vec{v}_3\) using addition and scalar multiplication such that the result is the vector \(\vec{x}\).

\(\left[ \begin{array}{c}2\\ 3\\ -6\end{array} \right] = -1\left[ \begin{array}{c}1\\ 2\\ 3\end{array} \right] + 1\left[ \begin{array}{c}3\\ 5\\ 1\end{array} \right] + \left(-\dfrac{1}{2}\right)\left[ \begin{array}{c}0\\ 0\\ 8\end{array} \right]\)

or again, equivalently

\(\vec{x} = -1\vec{v}_1 +1\vec{v}_2 + \left(-\dfrac{1}{2}\right)\vec{v}_3\)
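Both expansions above can be verified by a quick script. Here is a sketch using exact fractions so the weight \(-\frac{1}{2}\) stays exact (combine is a helper name for this illustration):

```python
from fractions import Fraction as F

v1, v2, v3 = [1, 2, 3], [3, 5, 1], [0, 0, 8]

def combine(weights, vectors):
    """Return the componentwise sum of weight * vector."""
    return [sum(w * vec[i] for w, vec in zip(weights, vectors))
            for i in range(len(vectors[0]))]

b = combine([3, 0, 0], [v1, v2, v3])          # b = 3*v1 + 0*v2 + 0*v3
x = combine([-1, 1, F(-1, 2)], [v1, v2, v3])  # x = -1*v1 + 1*v2 - (1/2)*v3
print(b)                 # [3, 6, 9]
assert x == [2, 3, -6]   # Fractions compare equal to the integer components
```

Any other choice of weights would give another element of the span of \(\vec{v}_1\), \(\vec{v}_2\), \(\vec{v}_3\).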

Of course, we could keep going for a long time, as there are many different choices for the scalars and ways to combine the three vectors. In general, the set of ALL linear combinations of these three vectors is referred to as their span. This would be written as \(\textrm{Span}\left(\vec{v}_1, \vec{v}_2, \vec{v}_3\right)\). The two vectors above are elements, or members, of this set.

Formal Definition

Now that we have seen a couple of examples and the general idea, let’s finish with the formal definition of a linear combination of vectors. Let \(\vec{v}_1, \vec{v}_2, \ldots, \vec{v}_p\) be vectors in \(\mathbb{R}^{n}\) and let \(c_1, c_2, \ldots, c_p\) be scalars. Then the vector \(\vec{b} = c_1\vec{v}_1 + c_2\vec{v}_2 + \cdots + c_p\vec{v}_p\) is called a linear combination of \(\vec{v}_1, \vec{v}_2, \ldots, \vec{v}_p\). The scalars \(c_1, c_2, \ldots, c_p\) are commonly called the “weights”.

Again, this is stating that \(\vec{b}\) is a result of combining the vectors using scalar multiplication (the c’s) and addition.
