Math 314

Topics for second exam

Technically, everything covered by the first exam plus

Chapter 2 § 6 Determinants

(Square) matrices come in two flavors: invertible (all Ax = b have a solution) and non-invertible (Ax = 0 has a non-trivial solution). It is an amazing fact that one number identifies this difference: the determinant of A.

For 2×2 matrices

    A = ( a  b )
        ( c  d ),

this number is det(A) = ad-bc; if det(A) ≠ 0, A is invertible, if det(A) = 0, A is non-invertible (= singular).

For larger matrices, there is a similar (but more complicated formula):

A= n×n matrix, M_ij(A) = matrix obtained by removing ith row and jth column of A.

det(A) = Σ_{i = 1}ⁿ (-1)^(i+1) a_i1 det(M_i1(A))

(this is called expanding along the first column)
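The first-column expansion can be written as a short recursive function. This is only a sketch: the names minor and det are made up for illustration, matrices are plain lists of rows, and indices are 0-based, so the sign (-1)^(i+1) for 1-based i becomes (-1)^i.

```python
def minor(a, i, j):
    """Return M_ij(A): A with row i and column j removed (0-based indices)."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(a) if k != i]


def det(a):
    """Determinant of a square matrix by expanding along the first column."""
    n = len(a)
    if n == 1:
        return a[0][0]
    # For 0-based i, the sign (-1)^(i+1) of the 1-based formula is (-1)^i.
    return sum((-1) ** i * a[i][0] * det(minor(a, i, 0)) for i in range(n))


print(det([[1, 2], [3, 4]]))                   # ad - bc = 1*4 - 2*3 = -2
print(det([[2, 0, 1], [1, 3, 2], [1, 1, 4]]))  # 18
```

This is fine for small matrices, but the number of terms grows like n!, which is why row operations are the practical way to compute determinants.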

Amazing properties:

If A is upper triangular, then det(A) = product of the entries on the diagonal

If you multiply a row of A by c to get B, then det(B) = cdet(A)

If you add a mult of one row of A to another to get B, then det(B) = det(A)

If you switch a pair of rows of A to get B, then det(B) = -det(A)

In other words, we can understand exactly how each elementary row operation affects the determinant. In particular,

A is invertible iff det(A) ≠ 0; and in fact, we can use row operations to calculate det(A) (since the RREF of a square matrix is upper triangular), as long as we keep track of how each operation changed the determinant along the way.
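As a sketch of computing det(A) by row operations: reduce to upper triangular form using only row switches and adding multiples of rows, flip the sign once per switch, and multiply the diagonal at the end. The function name det_by_elimination is hypothetical, and Fraction is used only to keep the arithmetic exact.

```python
from fractions import Fraction


def det_by_elimination(a):
    """Determinant via row operations, tracking how each one changes det."""
    m = [[Fraction(x) for x in row] for row in a]
    n = len(m)
    sign = 1
    for col in range(n):
        # Look for a row at or below the diagonal with a non-zero entry here.
        pivot = next((r for r in range(col, n) if m[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)  # no pivot in this column, so A is singular
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            sign = -sign  # switching two rows multiplies det by -1
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            # Adding a multiple of one row to another leaves det unchanged.
            m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    product = Fraction(1)
    for i in range(n):
        product *= m[i][i]  # upper triangular: det = product of the diagonal
    return sign * product


print(det_by_elimination([[0, 2], [3, 4]]))                   # -6
print(det_by_elimination([[2, 0, 1], [1, 3, 2], [1, 1, 4]]))  # 18
```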

More interesting facts:

det(AB) = det(A)det(B) ; det(A^T) = det(A)

We can expand along other columns than the first:

det(A) = Σ_{i = 1}ⁿ (-1)^(i+j) a_ij det(M_ij(A))

(expanding along jth column)

And since det(A^T) = det(A), we could expand along rows, as well....

A formula for the inverse of a matrix:

If we define A_c to be the matrix whose (i,j)th entry is (-1)^(i+j) det(M_ij(A)), then A_c^T A = (det A) I (A_c^T is called the adjoint of A). So if det(A) ≠ 0, then we can write the inverse of A as

A^-1 = [1/det(A)]A_c^T (This is very handy for 2×2 matrices...)
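A minimal sketch of the adjoint formula, assuming matrices are lists of rows with integer or Fraction entries. The helper names (minor, det, inverse_by_adjoint) are illustrative, and the determinant is computed by first-column expansion, with the usual convention that the empty matrix has determinant 1.

```python
from fractions import Fraction


def minor(a, i, j):
    """M_ij(A): A with row i and column j removed (0-based indices)."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(a) if k != i]


def det(a):
    """Determinant by expansion along the first column."""
    if len(a) == 0:
        return 1  # convention: the empty matrix has determinant 1
    return sum((-1) ** i * a[i][0] * det(minor(a, i, 0)) for i in range(len(a)))


def inverse_by_adjoint(a):
    """Return A^-1 = [1/det(A)] A_c^T, or raise if A is singular."""
    n = len(a)
    d = det(a)
    if d == 0:
        raise ValueError("det(A) = 0, so A is not invertible")
    # A_c: the (i,j)th entry is (-1)^(i+j) det(M_ij(A)).
    cof = [[(-1) ** (i + j) * det(minor(a, i, j)) for j in range(n)]
           for i in range(n)]
    # Transpose A_c (swap the roles of i and j) and divide by det(A).
    return [[Fraction(cof[j][i]) / d for j in range(n)] for i in range(n)]


print(inverse_by_adjoint([[1, 2], [3, 4]]))  # rows (-2, 1) and (3/2, -1/2)
```

For a 2×2 matrix this reduces to swapping a and d, negating b and c, and dividing by ad-bc.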

The same approach allows us to write an explicit formula for the solution to Ax = b, when A is invertible:

If we write B_i = A with its ith column replaced by b, then the (unique) solution to Ax = b has ith coordinate equal to

[(det(B_i))/det(A)]
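This formula (usually called Cramer's rule) is easy to try on a small system. The sketch below assumes integer or Fraction entries; the names det and cramer are made up for illustration.

```python
from fractions import Fraction


def det(a):
    """Determinant by expansion along the first column."""
    if len(a) == 1:
        return a[0][0]
    total = 0
    for i in range(len(a)):
        # Remove row i and the first column to get M_i1(A).
        sub = [row[1:] for k, row in enumerate(a) if k != i]
        total += (-1) ** i * a[i][0] * det(sub)
    return total


def cramer(a, b):
    """Solve Ax = b for invertible A: ith coordinate is det(B_i)/det(A)."""
    d = det(a)
    if d == 0:
        raise ValueError("det(A) = 0, so the formula does not apply")
    x = []
    for i in range(len(a)):
        # B_i = A with its ith column replaced by b.
        b_i = [row[:i] + [b[k]] + row[i + 1:] for k, row in enumerate(a)]
        x.append(Fraction(det(b_i)) / d)
    return x


# 2x + y = 3, x + 3y = 5 has the solution x = 4/5, y = 7/5.
print(cramer([[2, 1], [1, 3]], [3, 5]))
```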

Chapter 3: Vector Spaces

§ 1:: Basic concepts

Basic idea: a vector space V is a collection of things you can add together, and multiply by scalars (= numbers)

V = things for which v,w ∈ V implies v+w ∈ V ; a ∈ R and v ∈ V implies a·v ∈ V

E.g., V=R², add and scalar multiply componentwise

V=all 3-by-2 matrices, add and scalar multiply entrywise

V={ax²+bx+c : a,b,c ∈ R} = polynomials of degree ≤ 2; add, scalar multiply as functions

The standard vector space of dimension n : Rⁿ = {(x₁,…,x_n) : x_i ∈ R all i}

An abstract vector space is a set V together with some notion of addition and scalar multiplication, satisfying the `usual rules': for u,v,w ∈ V and c,d ∈ R we have

u+v ∈ V, cu ∈ V

u+v = v+u, u+(v+w) = (u+v)+w

There is 0 ∈ V and -u ∈ V with 0+u = u all u, and u+(-u) = 0

c(u+v) = cu+cv, (c+d)u = cu+du, (cd)u = c(du), 1u = u

Examples: R^m,n = all m×n matrices, under matrix addition/scalar mult

C[a,b] = all continuous functions f:[a,b]→R, under function addition

{A ∈ R^n,n : A^T = A} = all symmetric matrices, is a vector space

Note: {f ∈ C[a,b] : f(a) = 1} is not a vector space (e.g., it has no 0: the zero function has f(a) = 0 ≠ 1)

Basic facts:

0v = 0, c0 = 0, (-c)v = -(cv); cv = 0 implies c = 0 or v = 0

A vector space (=VS) has only one 0; a vector has only one additive inverse

Linear operators:

T:V→W is a linear operator if T(cu+dv) = cT(u)+dT(v) for all c,d ∈ R, u,v ∈ V

Example: T_A:Rⁿ→R^m, T_A(v) = Av, is linear

T:C[a,b]→R, T(f) = f(b), is linear

T:R²→R, T(x,y) = x-xy+3y is not linear!
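To see concretely why the last example fails, it is enough to find one scalar and one vector for which T(cu) ≠ cT(u). A quick check, with the particular choice u = (1,1), c = 2 being just one convenient counterexample:

```python
def t(x, y):
    """The map T(x,y) = x - xy + 3y from the example."""
    return x - x * y + 3 * y


u = (1, 1)
c = 2

lhs = t(c * u[0], c * u[1])  # T(2u) = T(2,2) = 2 - 4 + 6 = 4
rhs = c * t(*u)              # 2T(u) = 2(1 - 1 + 3) = 6

print(lhs, rhs)  # 4 6: T(cu) and cT(u) differ, so T is not linear
```

The xy term is the culprit: scaling (x,y) by c scales xy by c², not c.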

§ 2:: Subspaces

Basic idea: V = vector space, W ⊆ V, then to check if W is a vector space, using the same addition and scalar multiplication as V, we need only check two things:

whenever c ∈ R and u,v ∈ W, we always have cu, u+v ∈ W

All other properties come for free, since they are true for V !

If V is a VS, W ⊆ V and W is a VS using the same operations as V, we say that W is a (vector) subspace of V.

Examples: {(x,y,z) ∈ R³ : z = 0} is a subspace of R³

{(x,y,z) ∈ R³ : z = 1} is not a subspace of R³

{A ∈ R^n,n : A^T = A} is a subspace of R^n,n

Basic construction: v₁,…,v_n ∈ V

W = {a₁v₁+…+a_nv_n : a₁,…,a_n ∈ R} = all linear combinations of v₁,…,v_n = span{v₁,…,v_n} = the span of v₁,…,v_n , is a subspace of V

Basic fact: if w₁,…,w_k ∈ span{v₁,…,v_n}, then span{w₁,…,w_k} ⊆ span{v₁,…,v_n}

§ 3:: Subspaces from matrices

column space of A = \cal C(A) = span{the columns of A}

row space of A = \cal R(A) = span{(transposes of the) rows of A}

nullspace of A = \cal N(A) = {x ∈ Rⁿ : Ax = 0}

(Check: \cal N(A) is a subspace!)

Alternative view: Ax = lin comb of columns of A, so is in \cal C(A); in fact, \cal C(A) = {Ax : x ∈ Rⁿ}

Subspaces from linear operators: T:V→W

image of T = im(T) = {Tv : v ∈ V}

kernel of T = ker(T) = {x : T(x) = 0}

When T = T_A, im(T) = \cal C(A), and ker(T) = \cal N(A)

T is called one-to-one if Tu = Tv implies u = v

Basic fact: T is one-to-one iff ker(T) = {0}

§ 4:: Norm and inner product

Norm means length! In Rⁿ this is computed as ||x|| = ||(x₁,…,x_n)|| = (x₁²+…+x_n²)^(1/2)

Basic facts: ||x|| ≥ 0, and ||x|| = 0 iff x = 0,

||cu|| = |c|·||u||, and ||u+v|| ≤ ||u||+||v|| (triangle inequality)

unit vector: the norm of u/||u|| is 1; u/||u|| is the unit vector in the direction of u.

convergence: u_n→u if ||u_n-u||→0

Inner product:

idea: assign a number to a pair of vectors (think: angle between them?)

In Rⁿ, we use the dot product: v = (v₁,…,v_n), w = (w₁,…,w_n)

v·w = ⟨v,w⟩ = v₁w₁+…+v_nw_n = v^Tw

Basic facts:

⟨v,v⟩ = ||v||² (so ⟨v,v⟩ ≥ 0, and equals 0 iff v = 0)

⟨v,w⟩ = ⟨w,v⟩; ⟨cv,w⟩ = ⟨v,cw⟩ = c⟨v,w⟩

§ 5:: Applications of norms and inner products

Cauchy-Schwarz inequality: for all v,w, |⟨v,w⟩| ≤ ||v||·||w||

(this implies the triangle inequality)

So (for v,w ≠ 0): -1 ≤ ⟨v,w⟩/(||v||·||w||) ≤ 1

Define: the angle θ between v and w = the angle (between 0 and π) with cos(θ) = ⟨v,w⟩/(||v||·||w||)

Ex: v = w : then cos(θ) = 1, so θ = 0

Two vectors are orthogonal if their angle is π/2, i.e., ⟨v,w⟩=0. Notation: v ⊥ w

Pythagorean theorem: if v ⊥ w, then ||v+w||² = ||v||²+||w||²

Orthogonal projection: Given v,w ∈ Rⁿ (w ≠ 0), then we can write v = cw+u, with u ⊥ w

c = [(⟨v,w⟩)/(⟨w,w⟩)];

cw = proj_wv = [(⟨v,w⟩)/(⟨w,w⟩)]w = [(⟨v,w⟩)/(||w||)][w/(||w||)] = (orthogonal) projection of v onto w

u = v-cw !
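The angle and projection formulas above are short enough to try directly. A sketch in floating point, with vectors as plain lists; the function names are made up, and the cosine is clamped to [-1, 1] only to guard against rounding error.

```python
import math


def dot(v, w):
    """Inner product <v,w> = v1*w1 + ... + vn*wn."""
    return sum(a * b for a, b in zip(v, w))


def norm(v):
    """Length ||v|| = <v,v>^(1/2)."""
    return math.sqrt(dot(v, v))


def angle(v, w):
    """Angle between non-zero v and w, between 0 and pi."""
    cos_theta = dot(v, w) / (norm(v) * norm(w))
    return math.acos(max(-1.0, min(1.0, cos_theta)))


def project(v, w):
    """Write v = cw + u with u orthogonal to w (w non-zero); return (cw, u)."""
    c = dot(v, w) / dot(w, w)
    p = [c * x for x in w]
    u = [a - b for a, b in zip(v, p)]
    return p, u


p, u = project([3, 4], [1, 0])
print(p, u)                   # [3.0, 0.0] [0.0, 4.0]
print(dot(u, [1, 0]))         # 0.0, so u is orthogonal to w
print(angle([1, 0], [0, 1]))  # pi/2, about 1.5708
```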

Least squares:

Idea: Find the closest thing to a solution to Ax = b, when it has no solution.

Overdetermined system: more equations than unknowns. Typically, the system will have no solution.

Instead, find the vector for which the system does have a solution (i.e., of the form Ax) that is closest to b.

Need: Ax-b perpendicular to the subspace \cal C(A)

I.e., need: Ax-b ⊥ each column of A, i.e., need ⟨(column of A),Ax-b⟩ = 0

I.e., need A^T(Ax-b) = 0, i.e., need (A^TA)x = (A^Tb)

Fact: such a system of equations is always consistent!

Ax will be the closest vector in \cal C(A) to b

If A^TA is invertible (need: r(A) = number of columns of A), then we can write x = (A^TA)^-1(A^Tb); Ax = A(A^TA)^-1(A^Tb)
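A sketch of the normal equations (A^TA)x = A^Tb in exact arithmetic, assuming A^TA is invertible. All function names here are hypothetical; solve is a plain Gauss-Jordan elimination written out for illustration. The example fits a line y = x₁ + x₂t through the points (0,1), (1,2), (2,4).

```python
from fractions import Fraction


def transpose(a):
    return [list(col) for col in zip(*a)]


def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def solve(m, rhs):
    """Solve Mx = rhs for an invertible square M by Gauss-Jordan elimination."""
    n = len(m)
    aug = [[Fraction(x) for x in row] + [Fraction(r)] for row, r in zip(m, rhs)]
    for col in range(n):
        # M is assumed invertible, so a non-zero pivot always exists.
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [x / p for x in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[col])]
    return [row[n] for row in aug]


def least_squares(a, b):
    """Return x with (A^T A)x = A^T b, assuming A^T A is invertible."""
    at = transpose(a)
    ata = matmul(at, a)
    atb = [sum(x * y for x, y in zip(row, b)) for row in at]
    return solve(ata, atb)


a = [[1, 0], [1, 1], [1, 2]]
b = [1, 2, 4]
x = least_squares(a, b)
print(x)  # intercept 5/6 and slope 3/2
```

Here Ax - b = (-1/6, 1/3, -1/6), which is perpendicular to both columns of A, as required.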

§ 6:: Bases and dimension

Idea: putting free and bound variables on a more solid theoretical footing

We've seen: every solution to Ax = b can be expressed in terms of the free variables (x = v+x_{i_1}v₁+…+x_{i_k}v_k)

Could a different method of solution give us a different number of free variables? (Ans: No! B/c that number is the `dimension' of a certain subspace...)

Linear independence/dependence:

v₁,…,v_n ∈ V are linearly independent if the only way to express 0 as a linear combination of the v_i's is with all coefficients equal to 0; i.e.,

whenever c₁v₁+…+c_nv_n = 0, we have c₁ = … = c_n = 0

Otherwise, we say the vectors are linearly dependent. I.e, some non-trivial linear combination equals 0. Any vector v_i in such a linear combination having a non-zero coefficient is called redundant; the expression (lin comb = 0) can be rewritten to say that v_i = lin comb of the remaining vectors, i.e., v_i is in the span of the remaining vectors. This means:

Any redundant vector can be removed from our list of vectors without changing the span of the vectors.

A basis for a vector space V is a set of vectors v₁,…,v_n so that (a) they are linearly independent, and (b) V=span{v₁,…,v_n} .

Example: The vectors e₁ = (1,0,…,0), e₂ = (0,1,0,…,0),…,e_n = (0,…,0,1) are a basis for Rⁿ, the standard basis.

To find a basis: start with a collection of vectors that span, and repeatedly throw out redundant vectors (so you don't change the span) until the ones that are left are linearly independent. Note: each time you throw one out, you need to ask: are the remaining ones lin indep?
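One way to sketch this thinning procedure for vectors in Rⁿ: a vector is redundant exactly when removing it does not lower the dimension of the span, which can be tested by comparing ranks. The names rank and thin_to_basis are made up for illustration, and the rank is computed by a simple row reduction.

```python
from fractions import Fraction


def rank(vectors):
    """Number of pivots after row reducing the matrix whose rows are the vectors."""
    m = [[Fraction(x) for x in v] for v in vectors]
    if not m:
        return 0
    r = 0
    for c in range(len(m[0])):
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            f = m[i][c] / m[r][c]
            m[i] = [x - f * y for x, y in zip(m[i], m[r])]
        r += 1
        if r == len(m):
            break
    return r


def thin_to_basis(vectors):
    """Throw out redundant vectors one at a time until none are left."""
    kept = list(vectors)
    i = 0
    while i < len(kept):
        rest = kept[:i] + kept[i + 1:]
        if rank(rest) == rank(kept):
            kept = rest  # kept[i] was redundant: the span did not change
        else:
            i += 1       # kept[i] is needed, move on to the next vector
    return kept


print(thin_to_basis([(1, 0), (2, 0), (0, 1), (1, 1)]))  # [(0, 1), (1, 1)]
```

Which basis you end up with depends on the order in which vectors are tested; only the number of vectors left is forced.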

Basic fact: If v₁,…,v_n is a basis for V, then every v ∈ V can be expressed as a linear combination of the v_i's in exactly one way. If v = a₁v₁+…+a_nv_n, we call the a_i the coordinates of v with respect to the basis v₁,…,v_n .

The Dimension Theorem: Any two bases of the same vector space contain the same number of vectors. (This common number is called the dimension of V, denoted dim(V) .)

Reason: if v₁,…,v_n is a basis for V and w₁,…,w_k ∈ V are linearly independent, then k ≤ n

As part of that proof, we also learned:

If v₁,…,v_n is a basis for V and w₁,…,w_k are linearly independent, then the spanning set v₁,…,v_n,w₁,…,w_k for V can be thinned down to a basis for V by throwing away v_i's .

In reverse: we can take any linearly independent set of vectors in V, and add to it from any basis for V, to produce a new basis for V.

Some consequences:

If dim(V)=n, and W ⊆ V is a subspace of V, then dim(W) ≤ n

If dim(V)=n and v₁,…,v_n ∈ V are linearly independent, then they also span V

If dim(V)=n and v₁,…,v_n ∈ V span V, then they are also linearly independent.

§ 7:: Linear systems revisited

Using our new-found terminology, we have:

A system of equations Ax = b has a solution iff b ∈ \cal C(A) .

If Ax₀ = b, then every other solution to Ax = b is x = x₀+z, where z ∈ \cal N(A) .

To finish our description of (a) the vectors b that have solutions, and (b) the set of solutions to Ax = b, we need to find (useful) bases for \cal C(A) and \cal N(A).

So of course we start with:

Finding a basis for the row space.

Basic idea: if B is obtained from A by elementary row operations, then \cal R(A) = \cal R(B).

So if R is the reduced row echelon form of A, \cal R(R) = \cal R(A)

But a basis for \cal R(R) is easy to find; take all of the non-zero rows of R ! (The zero rows are clearly redundant.) These rows are linearly independent, since each has a `special coordinate' where, among the rows, only it is non-zero. That coordinate is the pivot in that row. So in any linear combination of rows, only that vector can contribute something non-zero to that coordinate. Consequently, in any linear combination, that coordinate is the coefficient of our vector! So, if the lin comb is 0, the coefficient of our vector (i.e., each vector!) is 0.

Put bluntly, to find a basis for \cal R(A), row reduce A, to R; the (transposes of) the non-zero rows of R form a basis for \cal R(A).
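The row space recipe can be sketched as follows. The rref helper is a straightforward Gauss-Jordan reduction written for illustration (exact arithmetic via Fraction); both function names are assumptions, not anything from the course.

```python
from fractions import Fraction


def rref(a):
    """Return (R, pivots): the reduced row echelon form and the pivot columns."""
    m = [[Fraction(x) for x in row] for row in a]
    rows, cols = len(m), len(m[0])
    pivots = []
    r = 0
    for c in range(cols):
        pivot = next((i for i in range(r, rows) if m[i][c] != 0), None)
        if pivot is None:
            continue  # no pivot in this column
        m[r], m[pivot] = m[pivot], m[r]
        p = m[r][c]
        m[r] = [x / p for x in m[r]]  # scale so the pivot is 1
        for i in range(rows):
            if i != r:
                f = m[i][c]
                m[i] = [x - f * y for x, y in zip(m[i], m[r])]
        pivots.append(c)
        r += 1
        if r == rows:
            break
    return m, pivots


def row_space_basis(a):
    """The non-zero rows of the RREF of A."""
    r, _ = rref(a)
    return [row for row in r if any(x != 0 for x in row)]


a = [[1, 2, 3], [2, 4, 6], [1, 0, 1]]
print(row_space_basis(a))  # the rows (1, 0, 1) and (0, 1, 1)
```

In the output, the first basis vector is the only one with a non-zero first coordinate and the second is the only one with a non-zero second coordinate; these are the special coordinates.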

This in turn gives a way to find a basis for \cal C(A), since \cal C(A) = \cal R(A^T) !

To find a basis for \cal C(A), take A^T, row reduce it to S; the (transposes of) the non-zero rows of S form a basis for \cal R(A^T) =\cal C(A) .

This is probably in fact the most useful basis for \cal C(A), since each basis vector has that special coordinate. This makes it very easy to decide if, for any given vector b, Ax = b has a solution. You need to decide if b can be written as a linear combination of your basis vectors; but each coefficient will be the coordinate of b lying at the special coordinate of each vector. Then just check to see if that linear combination of your basis vectors adds up to b !

There is another, perhaps less useful, but faster way to build a basis for \cal C(A); row reduce A to R, locate the pivots in R, and take the columns of A (Note: A, not R !) that correspond to the columns containing the pivots. These form a (different) basis for \cal C(A).

Why? Imagine building a matrix B out of just the bound columns. Then in row reduced form there is a pivot in every column. Solving Bv = 0 in the case that there are no free variables, we get v = 0, so the columns are linearly independent. If we now add a free column to B to get C, we get the same collection of pivots, so our added column represents a free variable. Then there are non-trivial solutions to Cv = 0, so the columns of C are not linearly independent. This means that the added column can be expressed as a linear combination of the bound columns. This is true for all free columns, so the bound columns span \cal C(A).
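The pivot-column recipe can be sketched like this. As before, rref is a hypothetical helper doing a plain Gauss-Jordan reduction in exact arithmetic, and column_space_basis is an illustrative name.

```python
from fractions import Fraction


def rref(a):
    """Return (R, pivots): the reduced row echelon form and the pivot columns."""
    m = [[Fraction(x) for x in row] for row in a]
    rows, cols = len(m), len(m[0])
    pivots = []
    r = 0
    for c in range(cols):
        pivot = next((i for i in range(r, rows) if m[i][c] != 0), None)
        if pivot is None:
            continue  # free column: no pivot here
        m[r], m[pivot] = m[pivot], m[r]
        p = m[r][c]
        m[r] = [x / p for x in m[r]]
        for i in range(rows):
            if i != r:
                f = m[i][c]
                m[i] = [x - f * y for x, y in zip(m[i], m[r])]
        pivots.append(c)
        r += 1
        if r == rows:
            break
    return m, pivots


def column_space_basis(a):
    """Columns of A (not of R!) sitting in the pivot columns of the RREF."""
    _, pivots = rref(a)
    return [[row[c] for row in a] for c in pivots]


a = [[1, 2, 3], [2, 4, 6], [1, 0, 1]]
print(column_space_basis(a))  # the columns (1, 2, 1) and (2, 4, 0) of A
```

Here the third column is free, and indeed (3, 6, 1) = (1, 2, 1) + (2, 4, 0).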

Finally, there is the nullspace \cal N(A). To find a basis for \cal N(A):

Row reduce A to R, and use each row of R to solve Rx = 0 by expressing each bound variable in terms of the frees. Collect the coefficients together and write x = x_{i_1}v₁+…+x_{i_k}v_k where the x_{i_j} are the free variables. Then the vectors v₁,…,v_k form a basis for \cal N(A).

Why? By construction they span \cal N(A); and just as with our row space procedure, each has a special coordinate where only it is non-zero (the coordinate corresponding to the free variable!).

Note: since the number of vectors in the bases for \cal R(A) and \cal C(A) is the same as the number of pivots ( = number of nonzero rows in the RREF) = rank of A, we have dim(\cal R(A))=dim(\cal C(A))=r(A).

And since the number of vectors in the basis for \cal N(A) is the same as the number of free variables for A ( = the number of columns without a pivot) = nullity of A (hence the name!), we have dim(\cal N(A)) = n(A) = n-r(A) (where n=number of columns of A).

So, dim(\cal C(A)) + dim(\cal N(A)) = the number of columns of A .
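The nullspace procedure and the dimension count above can be checked on a small example. This is a sketch under the same assumptions as before: rref is a hypothetical Gauss-Jordan helper in exact arithmetic, and nullspace_basis builds one vector per free variable.

```python
from fractions import Fraction


def rref(a):
    """Return (R, pivots): the reduced row echelon form and the pivot columns."""
    m = [[Fraction(x) for x in row] for row in a]
    rows, cols = len(m), len(m[0])
    pivots = []
    r = 0
    for c in range(cols):
        pivot = next((i for i in range(r, rows) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        p = m[r][c]
        m[r] = [x / p for x in m[r]]
        for i in range(rows):
            if i != r:
                f = m[i][c]
                m[i] = [x - f * y for x, y in zip(m[i], m[r])]
        pivots.append(c)
        r += 1
        if r == rows:
            break
    return m, pivots


def nullspace_basis(a):
    """One basis vector of N(A) per free variable, read off from the RREF."""
    r, pivots = rref(a)
    n = len(a[0])
    free = [c for c in range(n) if c not in pivots]
    basis = []
    for f in free:
        v = [Fraction(0)] * n
        v[f] = Fraction(1)  # special coordinate: this free variable is 1
        for row_idx, p in enumerate(pivots):
            # Row row_idx of Rx = 0 says x_p = -(entry in column f) * x_f.
            v[p] = -r[row_idx][f]
        basis.append(v)
    return basis


a = [[1, 2, 3], [2, 4, 6], [1, 0, 1]]
_, pivots = rref(a)
basis = nullspace_basis(a)
print(basis)                                   # the single vector (-1, -1, 1)
print(len(pivots) + len(basis) == len(a[0]))   # True: r(A) + n(A) = columns
```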
