Linear Algebra Solutions

Applied Linear Algebra Instructor’s Solutions Manual by Peter J. Olver and Chehrzad Shakiban

Table of Contents

Chapter 1. Linear Algebraic Systems
Chapter 2. Vector Spaces and Bases
Chapter 3. Inner Products and Norms
Chapter 4. Minimization and Least Squares Approximation
Chapter 5. Orthogonality
Chapter 6. Equilibrium
Chapter 7. Linearity
Chapter 8. Eigenvalues
Chapter 9. Linear Dynamical Systems
Chapter 10. Iteration of Linear Systems
Chapter 11. Boundary Value Problems in One Dimension

Solutions — Chapter 1

1.1.1.
(a) Reduce the system to x − y = 7, 3 y = −4; then use Back Substitution to solve for x = 17/3, y = −4/3.
(b) Reduce the system to 6 u + v = 5, −(5/2) v = 5/2; then use Back Substitution to solve for u = 1, v = −1.
(c) Reduce the system to p + q − r = 0, −3 q + 5 r = 3, − r = 6; then solve for p = 5, q = −11, r = −6.
(d) Reduce the system to 2 u − v + 2 w = 2, −(3/2) v + 4 w = 2, − w = 0; then solve for u = 1/3, v = −4/3, w = 0.
(e) Reduce the system to 5 x1 + 3 x2 − x3 = 9, (1/5) x2 − (2/5) x3 = −2/5, 2 x3 = −2; then solve for x1 = 4, x2 = −4, x3 = −1.
(f) Reduce the system to x + z − 2 w = −3, − y + 3 w = 1, −4 z − 16 w = −4, 6 w = 6; then solve for x = 2, y = 2, z = −3, w = 1.
(g) Reduce the system to 3 x1 + x2 = 1, (8/3) x2 + x3 = 2/3, (21/8) x3 + x4 = 3/4, (55/21) x4 = 5/7; then solve for x1 = 3/11, x2 = 2/11, x3 = 2/11, x4 = 3/11.

1.1.2. Plugging in the given values of x, y and z gives a + 2 b − c = 3, a − 2 − c = 1, 1 + 2 b + c = 2. Solving this system yields a = 4, b = 0, and c = 1.

♥ 1.1.3.
(a) With Forward Substitution, we just start with the top equation and work down. Thus 2 x = −6, so x = −3. Plugging this into the second equation gives 12 + 3 y = 3, and so y = −3. Plugging the values of x and y into the third equation yields −3 + 4(−3) − z = 7, and so z = −22.
(b) We will get a diagonal system with the same solution.
(c) Start with the last equation and, assuming the coefficient of the last variable is ≠ 0, use the operation to eliminate the last variable in all the preceding equations. Then, again assuming the coefficient of the next-to-last variable is non-zero, eliminate it from all but the last two equations, and so on.
(d) For the systems in Exercise 1.1.1, the method works in all cases except (c) and (f). Solving the reduced system by Forward Substitution reproduces the same solution (as it must):
  (a) The system reduces to (3/2) x = 17/2, x + 2 y = 3.
  (b) The reduced system is (15/2) u = 15/2, 3 u − 2 v = 5.
  (c) The method doesn’t work since r doesn’t appear in the last equation.
  (d) Reduce the system to (3/2) u = 1/2, (7/2) u − v = 5/2, 3 u − 2 w = 1.
  (e) Reduce the system to (2/3) x1 = 8/3, 4 x1 + 3 x2 = 4, x1 + x2 + x3 = −1.
  (f) Doesn’t work since, after the first reduction, z doesn’t occur in the next to last equation.
  (g) Reduce the system to (55/21) x1 = 5/7, x1 + (21/8) x2 = 3/4, x2 + (8/3) x3 = 2/3, x3 + 3 x4 = 1.
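The Forward Substitution of Exercise 1.1.3(a) can be sketched in a few lines of Python. This is only an illustration: the lower triangular coefficients below are inferred from the substitutions quoted in the solution (2 x = −6, 12 + 3 y = 3, −3 + 4(−3) − z = 7), and the function name is a hypothetical choice.

```python
from fractions import Fraction

def forward_substitution(L, b):
    """Solve L x = b for a lower triangular L with nonzero diagonal entries."""
    x = []
    for i in range(len(b)):
        # Contribution of the unknowns that have already been found.
        known = sum(Fraction(L[i][j]) * x[j] for j in range(i))
        x.append((Fraction(b[i]) - known) / Fraction(L[i][i]))
    return x

# Coefficients inferred from the steps quoted in 1.1.3(a) (an assumption).
L = [[2, 0, 0],
     [-4, 3, 0],
     [1, 4, -1]]
b = [-6, 3, 7]
solution = forward_substitution(L, b)
print(solution)
```

Exact rational arithmetic is used so that fractional answers such as those in Exercise 1.1.1 would not be blurred by rounding.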

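1.2.1. (a) 3 × 4, (b) 7, (c) 6, (d) ( −2 0 1 2 ), (e) $\begin{pmatrix} 0 \\ 2 \\ -6 \end{pmatrix}$.

1.2.2. (a) $\begin{pmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \end{pmatrix}$, (b) $\begin{pmatrix} 1 & 2 & 3 \\ 1 & 4 & 5 \end{pmatrix}$, (c) $\begin{pmatrix} 1 & 2 & 3 & 4 \\ 4 & 5 & 6 & 7 \\ 7 & 8 & 9 & 3 \end{pmatrix}$, (d) ( 1 2 3 4 ), (e) $\begin{pmatrix} 1 \\ 2 \\ 3 \end{pmatrix}$, (f) ( 1 ).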

1.2.3. x = − 31 , y = 34 , z = − 31 , w = 23 .

1.2.4. (a) (b) (c) (d) (e)

(f )

(g)

!

!

!

7 x −1 ; , b= , x= A= 3 y 2 ! ! ! 1 u 5 A= , x= , b= ; −2 v 5 0 1 0 1 0 1 1 1 −1 p 0 B C B C A=B 3C @ 2 −1 A, x = @ q A, b = @ 3 A; −1 −1 0 r 6 1 0 0 1 1 0 2 1 2 u 3 B C B C A=B 3 3C @ −1 A, x = @ v A, b = @ −2 A; w1 4 −3 1 0 71 0 0 0 9 x1 5 3 −1 B C C B C A=B @ 3 2 −1 A, x = @ x2 A, b = @ 5 A; x3 0 1 −1 0 1 1 2 0 1 1 x 1 0 1 −2 −3 B C B C B C 2 −1 2 −1 C B 3C B yC B C, x = B C, b = B C; A=B @ zA @ 0 −6 −4 @ 2A 2A 1 3 2 1 −1 1 w 0 1 1 0 0 1 3 1 0 0 x1 B C C B B 1 3 1 0C B1C Bx C C C, x = B 2 C, b = B C. B A=B @1A @0 1 3 1A @ x3 A 1 x4 0 0 1 3 1 1 6 3

1.2.5.
(a) x − y = −1, 2 x + 3 y = −3. The solution is x = −6/5, y = −1/5.
(b) u + w = −1, u + v = −1, v + w = 2. The solution is u = −2, v = 1, w = 1.
(c) 3 x1 − x3 = 1, −2 x1 − x2 = 0, x1 + x2 − 3 x3 = 1. The solution is x1 = 1/5, x2 = −2/5, x3 = −2/5.
(d) x + y − z − w = 0, −x + z + 2 w = 4, x − y + z = 1, 2 y − z + w = 5. The solution is x = 2, y = 1, z = 0, w = 3.

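1.2.6. (a) $I = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}$, $O = \begin{pmatrix} 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix}$. (b) I + O = I, I O = O I = O. No, it does not.

1.2.7. (a) undefined, (b) undefined, (c) $\begin{pmatrix} 3 & 6 & 0 \\ -1 & 4 & 2 \end{pmatrix}$, (d) undefined, (e) undefined, (f) $\begin{pmatrix} 1 & 11 & 9 \\ 3 & -12 & -12 \\ 7 & 8 & 8 \end{pmatrix}$, (g) undefined, (h) $\begin{pmatrix} 9 & -2 & 14 \\ -8 & 6 & -17 \\ 12 & -3 & 28 \end{pmatrix}$, (i) undefined.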

1.2.8. Only the third pair commute.

1.2.9. 1, 6, 11, 16.

1.2.10. (a) $\begin{pmatrix} 1 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & -1 \end{pmatrix}$, (b) $\begin{pmatrix} 2 & 0 & 0 & 0 \\ 0 & -2 & 0 & 0 \\ 0 & 0 & 3 & 0 \\ 0 & 0 & 0 & -3 \end{pmatrix}$.

1.2.11. (a) True, (b) true.

♥ 1.2.12.
(a) Let $A = \begin{pmatrix} x & y \\ z & w \end{pmatrix}$. Then $A D = \begin{pmatrix} a x & b y \\ a z & b w \end{pmatrix}$, while $D A = \begin{pmatrix} a x & a y \\ b z & b w \end{pmatrix}$, so if a ≠ b these are equal if and only if y = z = 0.
(b) Every 2 × 2 matrix commutes with $\begin{pmatrix} a & 0 \\ 0 & a \end{pmatrix} = a I$.
(c) Only 3 × 3 diagonal matrices.
(d) Any matrix of the form $A = \begin{pmatrix} x & 0 & 0 \\ 0 & y & z \\ 0 & u & v \end{pmatrix}$.
(e) Let D = diag (d1 , . . . , dn ). The (i, j) entry of A D is aij dj . The (i, j) entry of D A is di aij . If di ≠ dj , this requires aij = 0, and hence, if all the di ’s are different, then A is diagonal.

1.2.13. We need A of size m × n and B of size n × m for both products to be defined. Further, A B has size m × m while B A has size n × n, so the sizes agree if and only if m = n.

1.2.14. $B = \begin{pmatrix} x & y \\ 0 & x \end{pmatrix}$ where x, y are arbitrary.

1.2.15. (a) $(A + B)^2 = (A + B)(A + B) = AA + AB + BA + BB = A^2 + 2AB + B^2$, since AB = BA. (b) An example: $A = \begin{pmatrix} 1 & 2 \\ 0 & 1 \end{pmatrix}$, $B = \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix}$.

1.2.16. If A B is defined and A is an m × n matrix, then B is an n × p matrix and A B is an m × p matrix; on the other hand, if B A is defined we must have p = m, and B A is an n × n matrix. Now, since A B = B A, we must have p = m = n.

1.2.17. $A\, O_{n \times p} = O_{m \times p}$, $O_{l \times m}\, A = O_{l \times n}$.

1.2.18. The (i, j) entry of the matrix equation c A = O is c aij = 0. If any aij ≠ 0 then c = 0, so the only possible way that c ≠ 0 is if all aij = 0 and hence A = O.

1.2.19. False: for example, $\begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix} \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix} = \begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix}$.

1.2.20. False, unless they commute: A B = B A.

1.2.21. Let v be the column vector with 1 in its j th position and all other entries 0. Then A v is the same as the j th column of A. Thus, the hypothesis implies all columns of A are 0 and hence A = O.

1.2.22.
(a) A must be a square matrix.
(b) By associativity, $A A^2 = A A A = A^2 A = A^3$.
(c) The naive answer is n − 1. A more sophisticated answer is to note that you can compute $A^2 = A A$, $A^4 = A^2 A^2$, $A^8 = A^4 A^4$, and, by induction, $A^{2^r}$ with only r matrix multiplications. More generally, if the binary expansion of n has r + 1 digits, with s nonzero digits, then we need r + s − 1 multiplications. For example, $A^{13} = A^8 A^4 A$ since 13 is 1101 in binary, for a total of 5 multiplications: 3 to compute $A^2$, $A^4$ and $A^8$, and 2 more to multiply them together to obtain $A^{13}$.

1.2.23. $A = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}$.
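The multiplication count r + s − 1 in Exercise 1.2.22(c) can be checked with a small repeated-squaring routine. The sketch below is illustrative only; the helper names and the sample matrix are assumptions, not part of the exercise.

```python
def matmul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def power_with_count(A, n):
    """Return (A**n, number of matrix multiplications used) for n >= 1."""
    result = None      # product of the needed powers A^(2^k) found so far
    square = A         # current power A^(2^k)
    count = 0
    while n > 0:
        if n & 1:
            if result is None:
                result = square          # first factor: no multiplication
            else:
                result = matmul(result, square)
                count += 1
        n >>= 1
        if n > 0:                        # square only if a higher bit remains
            square = matmul(square, square)
            count += 1
    return result, count

A = [[1, 1], [0, 1]]                     # hypothetical test matrix
A13, multiplications = power_with_count(A, 13)
print(A13, multiplications)
```

For n = 13 the routine performs three squarings and two further products, matching the count given in the solution.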

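♦ 1.2.24.
(a) If the ith row of A has all zero entries, then the (i, j) entry of A B is $a_{i1} b_{1j} + \cdots + a_{in} b_{nj} = 0\, b_{1j} + \cdots + 0\, b_{nj} = 0$, which holds for all j, so the ith row of A B will have all 0’s.
(b) If $A = \begin{pmatrix} 1 & 1 \\ 0 & 0 \end{pmatrix}$, $B = \begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix}$, then $B A = \begin{pmatrix} 1 & 1 \\ 3 & 3 \end{pmatrix}$.

1.2.25. The same solution $X = \begin{pmatrix} -1 & 1 \\ 3 & -2 \end{pmatrix}$ in both cases.

1.2.26. (a) $\begin{pmatrix} 4 & 5 \\ 1 & 2 \end{pmatrix}$, (b) $\begin{pmatrix} 5 & -1 \\ -2 & 1 \end{pmatrix}$. They are not the same.

1.2.27. (a) X = O. (b) Yes, for instance, $A = \begin{pmatrix} 1 & 2 \\ 0 & 1 \end{pmatrix}$, $B = \begin{pmatrix} 3 & 2 \\ -2 & -1 \end{pmatrix}$, $X = \begin{pmatrix} 1 & 0 \\ 1 & 1 \end{pmatrix}$.

1.2.28. A = (1/c) I when c ≠ 0. If c = 0 there is no solution.

♦ 1.2.29.
(a) The ith entry of A z is $1\, a_{i1} + 1\, a_{i2} + \cdots + 1\, a_{in} = a_{i1} + \cdots + a_{in}$, which is the ith row sum.
(b) Each row of W has n − 1 entries equal to 1/n and one entry equal to (1 − n)/n, and so its row sums are (n − 1)(1/n) + (1 − n)/n = 0. Therefore, by part (a), W z = 0. Consequently, the row sums of B = A W are the entries of B z = A W z = A 0 = 0, and the result follows.
(c) $z = \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix}$, and so $A z = \begin{pmatrix} 1 & 2 & -1 \\ 2 & 1 & 3 \\ -4 & 5 & -1 \end{pmatrix} \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix} = \begin{pmatrix} 2 \\ 6 \\ 0 \end{pmatrix}$, while $B = A W = \begin{pmatrix} 1 & 2 & -1 \\ 2 & 1 & 3 \\ -4 & 5 & -1 \end{pmatrix} \begin{pmatrix} -\frac23 & \frac13 & \frac13 \\ \frac13 & -\frac23 & \frac13 \\ \frac13 & \frac13 & -\frac23 \end{pmatrix} = \begin{pmatrix} -\frac13 & -\frac43 & \frac53 \\ 0 & 1 & -1 \\ 4 & -5 & 1 \end{pmatrix}$, and so $B z = \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix}$.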

♦ 1.2.30. Assume A has size m × n, B has size n × p and C has size p × q. The (k, j) entry of B C is $\sum_{l=1}^{p} b_{kl} c_{lj}$, so the (i, j) entry of A (B C) is
$$\sum_{k=1}^{n} a_{ik} \Bigl( \sum_{l=1}^{p} b_{kl} c_{lj} \Bigr) = \sum_{k=1}^{n} \sum_{l=1}^{p} a_{ik} b_{kl} c_{lj}.$$
On the other hand, the (i, l) entry of A B is $\sum_{k=1}^{n} a_{ik} b_{kl}$, so the (i, j) entry of (A B) C is
$$\sum_{l=1}^{p} \Bigl( \sum_{k=1}^{n} a_{ik} b_{kl} \Bigr) c_{lj} = \sum_{k=1}^{n} \sum_{l=1}^{p} a_{ik} b_{kl} c_{lj}.$$
The two results agree, and so A (B C) = (A B) C.
Remark: A more sophisticated, simpler proof can be found in Exercise 7.1.44.

♥ 1.2.31.
(a) We need A B and B A to have the same size, and so this follows from Exercise 1.2.13.
(b) A B − B A = O if and only if A B = B A.
(c) (i) $\begin{pmatrix} -1 & 2 \\ 6 & 1 \end{pmatrix}$, (ii) $\begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix}$, (iii) $\begin{pmatrix} 0 & 1 & 1 \\ 1 & 0 & 1 \\ -1 & 1 & 0 \end{pmatrix}$;
(d) (i) [ c A + d B, C ] = (c A + d B) C − C (c A + d B) = c (A C − C A) + d (B C − C B) = c [ A, C ] + d [ B, C ],
[ A, c B + d C ] = A (c B + d C) − (c B + d C) A = c (A B − B A) + d (A C − C A) = c [ A, B ] + d [ A, C ].
(ii) [ A, B ] = A B − B A = − (B A − A B) = − [ B, A ].
(iii) [ [ A, B ], C ] = (A B − B A) C − C (A B − B A) = A B C − B A C − C A B + C B A,
[ [ C, A ], B ] = (C A − A C) B − B (C A − A C) = C A B − A C B − B C A + B A C,
[ [ B, C ], A ] = (B C − C B) A − A (B C − C B) = B C A − C B A − A B C + A C B.
Summing the three expressions produces O.
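The commutator identities of Exercise 1.2.31(d) are easy to spot-check numerically. The following sketch uses three arbitrarily chosen 2 × 2 integer matrices (hypothetical test data, not taken from the exercise) and verifies that the three double commutators of the Jacobi identity sum to the zero matrix.

```python
def matmul(X, Y):
    """Product of two square matrices stored as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(X, Y):
    return [[x + y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

def subtract(X, Y):
    return [[x - y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

def commutator(X, Y):
    """[X, Y] = X Y - Y X."""
    return subtract(matmul(X, Y), matmul(Y, X))

# Arbitrary sample matrices (assumptions made for the illustration).
A = [[1, 2], [3, 4]]
B = [[0, 1], [1, 0]]
C = [[2, 0], [1, 3]]

jacobi_sum = add(add(commutator(commutator(A, B), C),
                     commutator(commutator(C, A), B)),
                 commutator(commutator(B, C), A))
commutator_trace = sum(commutator(A, B)[i][i] for i in range(2))
print(jacobi_sum, commutator_trace)
```

The vanishing trace of [A, B] computed at the end is the statement proved in Exercise 1.2.32(d).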

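♦ 1.2.32.
(a) (i) 4, (ii) 0.
(b) $\operatorname{tr}(A + B) = \sum_{i=1}^{n} (a_{ii} + b_{ii}) = \sum_{i=1}^{n} a_{ii} + \sum_{i=1}^{n} b_{ii} = \operatorname{tr} A + \operatorname{tr} B$.
(c) The diagonal entries of A B are $\sum_{j=1}^{n} a_{ij} b_{ji}$, so $\operatorname{tr}(A B) = \sum_{i=1}^{n} \sum_{j=1}^{n} a_{ij} b_{ji}$; the diagonal entries of B A are $\sum_{i=1}^{n} b_{ji} a_{ij}$, so $\operatorname{tr}(B A) = \sum_{i=1}^{n} \sum_{j=1}^{n} b_{ji} a_{ij}$. These double summations are clearly equal.
(d) tr C = tr(A B − B A) = tr A B − tr B A = 0 by parts (b) and (c).
(e) Yes, by the same proof.

♦ 1.2.33. If b = A x, then bi = ai1 x1 + ai2 x2 + · · · + ain xn for each i. On the other hand, cj = (a1j , a2j , . . . , anj )T , and so the ith entry of the right hand side of (1.13) is x1 ai1 + x2 ai2 + · · · + xn ain , which agrees with the expression for bi .

♥ 1.2.34.
(a) This follows by direct computation.
(b) (i) $\begin{pmatrix} -2 & 1 \\ 3 & 2 \end{pmatrix} \begin{pmatrix} 1 & -2 \\ 1 & 0 \end{pmatrix} = \begin{pmatrix} -2 \\ 3 \end{pmatrix} ( 1 \;\; -2 ) + \begin{pmatrix} 1 \\ 2 \end{pmatrix} ( 1 \;\; 0 ) = \begin{pmatrix} -2 & 4 \\ 3 & -6 \end{pmatrix} + \begin{pmatrix} 1 & 0 \\ 2 & 0 \end{pmatrix} = \begin{pmatrix} -1 & 4 \\ 5 & -6 \end{pmatrix}$.
(ii) $\begin{pmatrix} 1 & -2 & 0 \\ -3 & -1 & 2 \end{pmatrix} \begin{pmatrix} 2 & 5 \\ -3 & 0 \\ 1 & -1 \end{pmatrix} = \begin{pmatrix} 1 \\ -3 \end{pmatrix} ( 2 \;\; 5 ) + \begin{pmatrix} -2 \\ -1 \end{pmatrix} ( -3 \;\; 0 ) + \begin{pmatrix} 0 \\ 2 \end{pmatrix} ( 1 \;\; -1 ) = \begin{pmatrix} 2 & 5 \\ -6 & -15 \end{pmatrix} + \begin{pmatrix} 6 & 0 \\ 3 & 0 \end{pmatrix} + \begin{pmatrix} 0 & 0 \\ 2 & -2 \end{pmatrix} = \begin{pmatrix} 8 & 5 \\ -1 & -17 \end{pmatrix}$.
(iii) $\begin{pmatrix} 3 & -1 & 1 \\ -1 & 2 & 1 \\ 1 & 1 & -5 \end{pmatrix} \begin{pmatrix} 2 & 3 & 0 \\ 3 & -1 & 4 \\ 0 & 4 & 1 \end{pmatrix} = \begin{pmatrix} 3 \\ -1 \\ 1 \end{pmatrix} ( 2 \;\; 3 \;\; 0 ) + \begin{pmatrix} -1 \\ 2 \\ 1 \end{pmatrix} ( 3 \;\; -1 \;\; 4 ) + \begin{pmatrix} 1 \\ 1 \\ -5 \end{pmatrix} ( 0 \;\; 4 \;\; 1 )$
$= \begin{pmatrix} 6 & 9 & 0 \\ -2 & -3 & 0 \\ 2 & 3 & 0 \end{pmatrix} + \begin{pmatrix} -3 & 1 & -4 \\ 6 & -2 & 8 \\ 3 & -1 & 4 \end{pmatrix} + \begin{pmatrix} 0 & 4 & 1 \\ 0 & 4 & 1 \\ 0 & -20 & -5 \end{pmatrix} = \begin{pmatrix} 3 & 14 & -3 \\ 4 & -1 & 9 \\ 5 & -18 & -1 \end{pmatrix}$.
(c) If we set B = x, where x is an n × 1 matrix, then we obtain (1.14).
(d) The (i, j) entry of A B is $\sum_{k=1}^{n} a_{ik} b_{kj}$. On the other hand, the (i, j) entry of ck rk equals the product of the ith entry of ck , namely aik , with the j th entry of rk , namely bkj . Summing these entries, aik bkj , over k yields the usual matrix product formula.

♥ 1.2.35.
(a) $p(A) = A^3 - 3A + 2 I$, $q(A) = 2A^2 + I$.
(b) $p(A) = \begin{pmatrix} -2 & -8 \\ 4 & 6 \end{pmatrix}$, $q(A) = \begin{pmatrix} -1 & 0 \\ 0 & -1 \end{pmatrix}$.
(c) $p(A)\, q(A) = (A^3 - 3A + 2 I)(2A^2 + I) = 2A^5 - 5A^3 + 4A^2 - 3A + 2 I$, while $p(x)\, q(x) = 2 x^5 - 5 x^3 + 4 x^2 - 3 x + 2$.
(d) True, since powers of A mutually commute. For the particular matrix from (b), $p(A)\, q(A) = q(A)\, p(A) = \begin{pmatrix} 2 & 8 \\ -4 & -6 \end{pmatrix}$.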
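The claims of Exercise 1.2.35 can be checked by evaluating the two polynomials on a concrete matrix. The exercise's matrix A is not reproduced in this solution, so the sketch below assumes A = [[1, 2], [−1, −1]], a choice made only because it reproduces the values of p(A) and q(A) stated in part (b); the helper names are likewise hypothetical.

```python
def matmul(X, Y):
    """Product of two square matrices stored as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def scale(c, X):
    return [[c * x for x in row] for row in X]

def add(*matrices):
    """Entry-wise sum of any number of equally sized matrices."""
    return [[sum(entries) for entries in zip(*rows)] for rows in zip(*matrices)]

I = [[1, 0], [0, 1]]
A = [[1, 2], [-1, -1]]       # assumed matrix, consistent with part (b)

A2 = matmul(A, A)
A3 = matmul(A2, A)
p_of_A = add(A3, scale(-3, A), scale(2, I))   # p(A) = A^3 - 3A + 2I
q_of_A = add(scale(2, A2), I)                 # q(A) = 2A^2 + I
pq = matmul(p_of_A, q_of_A)
qp = matmul(q_of_A, p_of_A)
print(p_of_A, q_of_A, pq)
```

Because both factors are built from powers of the same matrix, the two products agree, as asserted in part (d).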

♥ 1.2.36.
(a) Check that $S^2 = A$ by direct computation. Another example: $S = \begin{pmatrix} 2 & 0 \\ 0 & 2 \end{pmatrix}$. Or, more generally, 2 times any of the matrices in part (c).
(b) $S^2$ is only defined if S is square.
(c) Any of the matrices $\begin{pmatrix} \pm 1 & 0 \\ 0 & \pm 1 \end{pmatrix}$, $\begin{pmatrix} a & b \\ c & -a \end{pmatrix}$, where a is arbitrary and $b c = 1 - a^2$.
(d) Yes: for example $\begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}$.

♥ 1.2.37.
(a) M has size (i + j) × (k + l).
(b) $M = \begin{pmatrix} 1 & 1 & -1 \\ 3 & 0 & 1 \\ 1 & 1 & 3 \\ -2 & 2 & 0 \\ 1 & 1 & -1 \end{pmatrix}$.
(c) Since matrix addition is done entry-wise, adding the entries of each block is the same as adding the blocks.
(d) X has size k × m, Y has size k × n, Z has size l × m, and W has size l × n. Then A X + B Z will have size i × m. Its (p, q) entry is obtained by multiplying the pth row of M times the q th column of P , which is $a_{p1} x_{1q} + \cdots + a_{pk} x_{kq} + b_{p1} z_{1q} + \cdots + b_{pl} z_{lq}$ and equals the sum of the (p, q) entries of A X and B Z. A similar argument works for the remaining three blocks.
(e) For example, if X = ( 1 ), Y = ( 2 0 ), $Z = \begin{pmatrix} 0 \\ 1 \end{pmatrix}$, $W = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}$, then $P = \begin{pmatrix} 1 & 2 & 0 \\ 0 & 0 & -1 \\ 1 & 1 & 0 \end{pmatrix}$, and so $M P = \begin{pmatrix} 0 & 1 & -1 \\ 4 & 7 & 0 \\ 4 & 5 & -1 \\ -2 & -4 & -2 \\ 0 & 1 & -1 \end{pmatrix}$. The individual block products are

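$\begin{pmatrix} 0 \\ 4 \end{pmatrix} = \begin{pmatrix} 1 \\ 3 \end{pmatrix} ( 1 ) + \begin{pmatrix} 1 & -1 \\ 0 & 1 \end{pmatrix} \begin{pmatrix} 0 \\ 1 \end{pmatrix}$,
$\begin{pmatrix} 4 \\ -2 \\ 0 \end{pmatrix} = \begin{pmatrix} 1 \\ -2 \\ 1 \end{pmatrix} ( 1 ) + \begin{pmatrix} 1 & 3 \\ 2 & 0 \\ 1 & -1 \end{pmatrix} \begin{pmatrix} 0 \\ 1 \end{pmatrix}$,
$\begin{pmatrix} 1 & -1 \\ 7 & 0 \end{pmatrix} = \begin{pmatrix} 1 \\ 3 \end{pmatrix} ( 2 \;\; 0 ) + \begin{pmatrix} 1 & -1 \\ 0 & 1 \end{pmatrix} \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}$,
$\begin{pmatrix} 5 & -1 \\ -4 & -2 \\ 1 & -1 \end{pmatrix} = \begin{pmatrix} 1 \\ -2 \\ 1 \end{pmatrix} ( 2 \;\; 0 ) + \begin{pmatrix} 1 & 3 \\ 2 & 0 \\ 1 & -1 \end{pmatrix} \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}$.

1.3.1.
(a) $\left(\begin{array}{cc|c} 1 & 7 & 4 \\ -2 & -9 & 2 \end{array}\right) \xrightarrow{2R_1 + R_2} \left(\begin{array}{cc|c} 1 & 7 & 4 \\ 0 & 5 & 10 \end{array}\right)$. Back Substitution yields x2 = 2, x1 = −10.
(b) $\left(\begin{array}{cc|c} 3 & -5 & -1 \\ 2 & 1 & 8 \end{array}\right) \xrightarrow{-\frac23 R_1 + R_2} \left(\begin{array}{cc|c} 3 & -5 & -1 \\ 0 & \frac{13}{3} & \frac{26}{3} \end{array}\right)$. Back Substitution yields w = 2, z = 3.
(c) $\left(\begin{array}{ccc|c} 1 & -2 & 1 & 0 \\ 0 & 2 & -8 & 8 \\ -4 & 5 & 9 & -9 \end{array}\right) \xrightarrow{4R_1 + R_3} \left(\begin{array}{ccc|c} 1 & -2 & 1 & 0 \\ 0 & 2 & -8 & 8 \\ 0 & -3 & 13 & -9 \end{array}\right) \xrightarrow{\frac32 R_2 + R_3} \left(\begin{array}{ccc|c} 1 & -2 & 1 & 0 \\ 0 & 2 & -8 & 8 \\ 0 & 0 & 1 & 3 \end{array}\right)$. Back Substitution yields z = 3, y = 16, x = 29.
(d) $\left(\begin{array}{ccc|c} 1 & 4 & -2 & 1 \\ -2 & 0 & -3 & -7 \\ 3 & -2 & 2 & -1 \end{array}\right) \xrightarrow{2R_1 + R_2} \left(\begin{array}{ccc|c} 1 & 4 & -2 & 1 \\ 0 & 8 & -7 & -5 \\ 3 & -2 & 2 & -1 \end{array}\right) \xrightarrow{-3R_1 + R_3} \left(\begin{array}{ccc|c} 1 & 4 & -2 & 1 \\ 0 & 8 & -7 & -5 \\ 0 & -14 & 8 & -4 \end{array}\right) \xrightarrow{\frac74 R_2 + R_3} \left(\begin{array}{ccc|c} 1 & 4 & -2 & 1 \\ 0 & 8 & -7 & -5 \\ 0 & 0 & -\frac{17}{4} & -\frac{51}{4} \end{array}\right)$. Back Substitution yields r = 3, q = 2, p = −1.
(e) $\left(\begin{array}{cccc|c} 1 & 0 & -2 & 0 & -1 \\ 0 & 1 & 0 & -1 & 2 \\ 0 & -3 & 2 & 0 & 0 \\ -4 & 0 & 0 & 7 & -5 \end{array}\right)$ reduces to $\left(\begin{array}{cccc|c} 1 & 0 & -2 & 0 & -1 \\ 0 & 1 & 0 & -1 & 2 \\ 0 & 0 & 2 & -3 & 6 \\ 0 & 0 & 0 & -5 & 15 \end{array}\right)$. Solution: x4 = −3, x3 = −3/2, x2 = −1, x1 = −4.
(f) $\left(\begin{array}{cccc|c} -1 & 3 & -1 & 1 & -2 \\ 1 & -1 & 3 & -1 & 0 \\ 0 & 1 & -1 & 4 & 7 \\ 4 & -1 & 1 & 0 & 5 \end{array}\right)$ reduces to $\left(\begin{array}{cccc|c} -1 & 3 & -1 & 1 & -2 \\ 0 & 2 & 2 & 0 & -2 \\ 0 & 0 & -2 & 4 & 8 \\ 0 & 0 & 0 & -24 & -48 \end{array}\right)$. Solution: w = 2, z = 0, y = −1, x = 1.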
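The row reductions in Exercise 1.3.1 all follow the same pattern, which can be sketched as a short routine. This is a minimal illustration for regular matrices only (no row interchanges); the function name is hypothetical, and the sample data is the augmented matrix of part (c).

```python
from fractions import Fraction

def eliminate(augmented):
    """Reduce an augmented matrix to upper triangular form without row swaps."""
    rows = [[Fraction(v) for v in row] for row in augmented]
    n = len(rows)
    for j in range(n):
        if rows[j][j] == 0:
            raise ValueError("zero pivot: the coefficient matrix is not regular")
        for i in range(j + 1, n):
            multiplier = rows[i][j] / rows[j][j]
            # Subtract the multiple of the pivot row that clears entry (i, j).
            rows[i] = [a - multiplier * p for a, p in zip(rows[i], rows[j])]
    return rows

augmented = [[1, -2, 1, 0],
             [0, 2, -8, 8],
             [-4, 5, 9, -9]]
reduced = eliminate(augmented)
print(reduced)
```

Subtracting multiplier times the pivot row is the same operation as the "add c times row j to row i" steps written above the arrows, with c equal to minus the multiplier.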

1.3.2.
(a) 3 x + 2 y = 2, −4 x − 3 y = −1; solution: x = 4, y = −5,
(b) x + 2 y = −3, − x + 2 y + z = −6, −2 x − 3 z = 1; solution: x = 1, y = −2, z = −1,
(c) 3 x − y + 2 z = −3, −2 y − 5 z = −1, 6 x − 2 y + z = −3; solution: x = 2/3, y = 3, z = −1,
(d) 2 x − y = 0, − x + 2 y − z = 1, − y + 2 z − w = 1, − z + 2 w = 0; solution: x = 1, y = 2, z = 2, w = 1.

1.3.3. (a) x = 17 , y = − 34 ; (b) u = 1, v = −1; (c) u = 23 , v = − 13 , w = 16 ; (d) x1 = 3 10 2 2 19 5 1 4 2 11 3 , x2 = − 3 , x3 = − 3 ; (e) p = − 3 , q = 6 , r = 2 ; (f ) a = 3 , b = 0, c = 3 , d = − 3 ; (g) x = 31 , y = 67 , z = − 83 , w = 29 .

1.3.4. Solving 6 = a + b + c, 4 = 4 a + 2 b + c, 0 = 9 a + 3 b + c, yields a = −1, b = 1, c = 6, so $y = -x^2 + x + 6$.

1.3.5.

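(a) Regular: $\begin{pmatrix} 2 & 1 \\ 1 & 4 \end{pmatrix} \longrightarrow \begin{pmatrix} 2 & 1 \\ 0 & \frac72 \end{pmatrix}$.
(b) Not regular.
(c) Regular: $\begin{pmatrix} 3 & -2 & 1 \\ -1 & 4 & -3 \\ 3 & -2 & 5 \end{pmatrix} \longrightarrow \begin{pmatrix} 3 & -2 & 1 \\ 0 & \frac{10}{3} & -\frac83 \\ 0 & 0 & 4 \end{pmatrix}$.
(d) Not regular: $\begin{pmatrix} 1 & -2 & 3 \\ -2 & 4 & -1 \\ 3 & -1 & 2 \end{pmatrix} \longrightarrow \begin{pmatrix} 1 & -2 & 3 \\ 0 & 0 & 5 \\ 0 & 5 & -7 \end{pmatrix}$.
(e) Regular: $\begin{pmatrix} 1 & 3 & -3 & 0 \\ -1 & 0 & -1 & 2 \\ 3 & 3 & -6 & 1 \\ 2 & 3 & -3 & 5 \end{pmatrix} \longrightarrow \begin{pmatrix} 1 & 3 & -3 & 0 \\ 0 & 3 & -4 & 2 \\ 0 & -6 & 3 & 1 \\ 0 & -3 & 3 & 5 \end{pmatrix} \longrightarrow \begin{pmatrix} 1 & 3 & -3 & 0 \\ 0 & 3 & -4 & 2 \\ 0 & 0 & -5 & 5 \\ 0 & 0 & -1 & 7 \end{pmatrix} \longrightarrow \begin{pmatrix} 1 & 3 & -3 & 0 \\ 0 & 3 & -4 & 2 \\ 0 & 0 & -5 & 5 \\ 0 & 0 & 0 & 6 \end{pmatrix}$.

1.3.6.
(a) $\left(\begin{array}{cc|c} -i & 1+i & -1 \\ 1-i & 1 & -3i \end{array}\right) \longrightarrow \left(\begin{array}{cc|c} -i & 1+i & -1 \\ 0 & 1-2i & 1-2i \end{array}\right)$; use Back Substitution to obtain the solution y = 1, x = 1 − 2 i.
(b) $\left(\begin{array}{ccc|c} i & 0 & 1-i & 2i \\ 0 & 2i & 1+i & 2 \\ -1 & 2i & i & 1-2i \end{array}\right) \longrightarrow \left(\begin{array}{ccc|c} i & 0 & 1-i & 2i \\ 0 & 2i & 1+i & 2 \\ 0 & 0 & -2-i & 1-2i \end{array}\right)$; solution: z = i, y = −1/2 − (3/2) i, x = 1 + i.
(c) $\left(\begin{array}{cc|c} 1-i & 2 & i \\ -i & 1+i & -1 \end{array}\right) \longrightarrow \left(\begin{array}{cc|c} 1-i & 2 & i \\ 0 & 2i & -\frac32 - \frac12 i \end{array}\right)$; solution: y = −1/4 + (3/4) i, x = 1/2.
(d) $\left(\begin{array}{ccc|c} 1+i & i & 2+2i & 0 \\ 1-i & 2 & i & 0 \\ 3-3i & i & 3-11i & 6 \end{array}\right) \longrightarrow \left(\begin{array}{ccc|c} 1+i & i & 2+2i & 0 \\ 0 & 1 & -2+3i & 0 \\ 0 & 0 & -6+6i & 6 \end{array}\right)$; solution: z = −1/2 − (1/2) i, y = −5/2 + (1/2) i, x = 5/2 + 2 i.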

1.3.7. (a) 2 x = 3, − y = 4, 3 z = 1, u = 6, 8 v = −24. (b) x = 3/2, y = −4, z = 1/3, u = 6, v = −3. (c) You only have to divide by each coefficient to find the solution.

♦ 1.3.8. 0 is the (unique) solution since A 0 = 0.

♠ 1.3.9.
Back Substitution
start
  set $x_n = c_n / u_{nn}$
  for i = n − 1 to 1 with increment −1
    set $x_i = \dfrac{1}{u_{ii}} \Bigl( c_i - \sum_{j=i+1}^{n} u_{ij} x_j \Bigr)$
  next i
end
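The Back Substitution pseudocode of Exercise 1.3.9 translates almost line for line into Python. The sketch below is one possible rendering, not the manual's own program; the function name is hypothetical, indices are 0-based, and the sample data is the triangular system reached in Exercise 1.3.1(c).

```python
from fractions import Fraction

def back_substitution(U, c):
    """Solve U x = c for an upper triangular U with nonzero diagonal entries."""
    n = len(c)
    x = [Fraction(0)] * n
    for i in range(n - 1, -1, -1):          # last equation first
        # Sum of u_ij x_j over the unknowns j > i that are already known.
        tail = sum(Fraction(U[i][j]) * x[j] for j in range(i + 1, n))
        x[i] = (Fraction(c[i]) - tail) / Fraction(U[i][i])
    return x

U = [[1, -2, 1],
     [0, 2, -8],
     [0, 0, 1]]
c = [0, 8, 3]
x = back_substitution(U, c)
print(x)
```

The separate first step of the pseudocode (setting the last unknown) is absorbed into the loop here, since the sum over j > i is empty when i is the last index.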

1.3.10. Since
$\begin{pmatrix} a_{11} & a_{12} \\ 0 & a_{22} \end{pmatrix} \begin{pmatrix} b_{11} & b_{12} \\ 0 & b_{22} \end{pmatrix} = \begin{pmatrix} a_{11} b_{11} & a_{11} b_{12} + a_{12} b_{22} \\ 0 & a_{22} b_{22} \end{pmatrix}$,
$\begin{pmatrix} b_{11} & b_{12} \\ 0 & b_{22} \end{pmatrix} \begin{pmatrix} a_{11} & a_{12} \\ 0 & a_{22} \end{pmatrix} = \begin{pmatrix} a_{11} b_{11} & a_{22} b_{12} + a_{12} b_{11} \\ 0 & a_{22} b_{22} \end{pmatrix}$,
the matrices commute if and only if $a_{11} b_{12} + a_{12} b_{22} = a_{22} b_{12} + a_{12} b_{11}$, or $(a_{11} - a_{22})\, b_{12} = a_{12}\, (b_{11} - b_{22})$.

1.3.11. Clearly, any diagonal matrix is both lower and upper triangular. Conversely, A being lower triangular requires that aij = 0 for i < j; A upper triangular requires that aij = 0 for i > j. If A is both lower and upper triangular, aij = 0 for all i ≠ j, which implies A is a diagonal matrix.

♦ 1.3.12.
(a) Set $l_{ij} = \begin{cases} a_{ij}, & i > j, \\ 0, & i \le j, \end{cases}$ $u_{ij} = \begin{cases} a_{ij}, & i < j, \\ 0, & i \ge j, \end{cases}$ $d_{ij} = \begin{cases} a_{ij}, & i = j, \\ 0, & i \ne j. \end{cases}$
(b) $L = \begin{pmatrix} 0 & 0 & 0 \\ 1 & 0 & 0 \\ -2 & 0 & 0 \end{pmatrix}$, $D = \begin{pmatrix} 3 & 0 & 0 \\ 0 & -4 & 0 \\ 0 & 0 & 5 \end{pmatrix}$, $U = \begin{pmatrix} 0 & 1 & -1 \\ 0 & 0 & 2 \\ 0 & 0 & 0 \end{pmatrix}$.

♦ 1.3.13.
(a) By direct computation, $A^2 = \begin{pmatrix} 0 & 0 & 1 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}$, and so $A^3 = O$.
(b) Let A have size n × n. By assumption, aij = 0 whenever i > j − 1. By induction, one proves that the (i, j) entries of $A^k$ are all zero whenever i > j − k. Indeed, to compute the (i, j) entry of $A^{k+1} = A\, A^k$ you multiply the ith row of A, whose first i entries are 0, by the j th column of $A^k$, in which only the first j − k entries can be non-zero, all the rest being zero according to the induction hypothesis; therefore, if i > j − k − 1, every term in the sum producing this entry is 0, and the induction is complete. In particular, for k = n, every entry of $A^k$ is zero, and so $A^n = O$.
(c) The matrix $A = \begin{pmatrix} 1 & 1 \\ -1 & -1 \end{pmatrix}$ has $A^2 = O$.
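The induction in Exercise 1.3.13(b) can be illustrated numerically: each multiplication by a strictly upper triangular matrix pushes the nonzero entries one diagonal further from the main diagonal. The sketch below uses a 4 × 4 matrix with arbitrarily chosen entries above the diagonal (hypothetical test data).

```python
def matmul(X, Y):
    """Product of two square matrices stored as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

n = 4
# Strictly upper triangular: a_ij = 0 whenever i > j - 1 (sample entries j - i).
A = [[j - i if j > i else 0 for j in range(n)] for i in range(n)]

powers = [A]                      # powers[k - 1] holds A^k
for _ in range(n - 1):
    powers.append(matmul(powers[-1], A))

# For each k, check that the (i, j) entry of A^k vanishes whenever i > j - k.
pattern_holds = all(
    powers[k - 1][i][j] == 0
    for k in range(1, n + 1)
    for i in range(n)
    for j in range(n)
    if i > j - k
)
print(pattern_holds, powers[n - 1])
```

For k = n the condition i > j − n holds for every entry, so the last power printed is the zero matrix.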

1.3.14.
(a) Add −2 times the second row to the first row of a 2 × n matrix.
(b) Add 7 times the first row to the second row of a 2 × n matrix.
(c) Add −5 times the third row to the second row of a 3 × n matrix.
(d) Add 1/2 times the first row to the third row of a 3 × n matrix.
(e) Add −3 times the fourth row to the second row of a 4 × n matrix.

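1.3.15. (a) $\begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 1 & 1 \end{pmatrix}$, (b) $\begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & -1 \\ 0 & 0 & 0 & 1 \end{pmatrix}$, (c) $\begin{pmatrix} 1 & 0 & 0 & 3 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}$, (d) $\begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & -2 & 0 & 1 \end{pmatrix}$.

1.3.16. $L_3 L_2 L_1 = \begin{pmatrix} 1 & 0 & 0 \\ 2 & 1 & 0 \\ 0 & -\frac12 & 1 \end{pmatrix} \ne L_1 L_2 L_3$.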

1.3.17. $E_3 E_2 E_1 = \begin{pmatrix} 1 & 0 & 0 \\ -2 & 1 & 0 \\ -2 & \frac12 & 1 \end{pmatrix}$, $E_1 E_2 E_3 = \begin{pmatrix} 1 & 0 & 0 \\ -2 & 1 & 0 \\ -1 & \frac12 & 1 \end{pmatrix}$. The second is easier to predict since its entries are the same as the corresponding entries of the $E_i$.

1.3.18.
(a) Suppose that $E$ adds c ≠ 0 times row i to row j ≠ i, while $\widetilde{E}$ adds d ≠ 0 times row k to row l ≠ k. If r1 , . . . , rn are the rows, then the effect of $\widetilde{E} E$ is to replace
  (i) rj by rl + c ri + d rk for j = l;
  (ii) rj by rj + c ri and rl by rl + (c d) ri + d rj for j = k;
  (iii) rj by rj + c ri and rl by rl + d rk otherwise.
On the other hand, the effect of $E \widetilde{E}$ is to replace
  (i) rj by rl + c ri + d rk for j = l;
  (ii) rj by rj + c ri + (c d) rk and rl by rl + d rk for i = l;
  (iii) rj by rj + c ri and rl by rl + d rk otherwise.
Comparing results, we see that $E \widetilde{E} = \widetilde{E} E$ whenever i ≠ l and j ≠ k.
(b) E1 E2 = E2 E1 , E1 E3 ≠ E3 E1 , and E3 E2 = E2 E3 .
(c) See the answer to part (a).

1.3.19. (a) Upper triangular; (b) both special upper and special lower triangular; (c) lower triangular; (d) special lower triangular; (e) none of the above.

1.3.20. (a) aij = 0 for all i ≠ j; (b) aij = 0 for all i > j; (c) aij = 0 for all i > j and aii = 1 for all i; (d) aij = 0 for all i < j; (e) aij = 0 for all i < j and aii = 1 for all i.

♦ 1.3.21. (a) Consider the product L M of two lower triangular n × n matrices. The last n − i entries in the ith row of L are zero, while the first j − 1 entries in the j th column of M are zero. So if i < j each summand in the product of the ith row times the j th column is zero,

and so all entries above the diagonal in L M are zero. (b) The ith diagonal entry of L M is the product of the ith diagonal entry of L times the ith diagonal entry of M . (c) Special matrices have all 1’s on the diagonal, and so, by part (b), does their product. 1.3.22. (a) L = 0

1 (c) L = B @ −1 1 0 1 (e) L = B @ −2 −1 0 1 B B 0 (g) L = B B −1 @ 0

0

1 B 0 B U =B @0 0

!

!

!

!

1 0 1 3 1 0 1 3 , U= , (b) L = , U= , −1 1 0 3 3 1 0 −8 1 0 1 0 1 0 2 0 1 0 0 0 0 −1 1 −1 1 B B 1 0C 1 0C 0C (d) L = B A, U = @ 0 3 A, U = @ 0 2 A, @2 0 1 0 0 3 0 0 0 13 1 1 0 1 0 1 0 1 0 0 0 −1 0 0 1 0 0 B B C B 1 0C 1 0C A, U = @ 0 3 A, U = @ 0 −3 0 A, (f ) L = @ 2 0 0 −1 1 0 0 2 −3 13 1 0 1 1 0 1 0 −1 0 0 0 0 1 0 0 B 0 2 −1 B −1 C 1 0 0C B C C −1 1 0 B C, U = B C 3 (h) L = B B C 1 7 C, @ −2 1 1 2 1 0A @0 0 2 2A 1 3 −1 −2 3 1 − 0 0 0 −10 2

1 3 0 0

1

−2 1 −4 0

3 3C C C, 1A 1

(i) L =

0 B B B B B @

1

0 1

1 2 3 2 1 2

− 37 1 7

0 0 1

5 − 22

0

1

0 2 B C B C 0C B0 C, U = B C B0 0A @ 0 1

1

3

1

7 2

− 23 − 22 7 0

1 2 5 7 35 22

0 0

3

− 21

1

C A, 7 6 1

−1 4C A, − 13 3 1

0 0C C C, 0A 1 1

C C C C. C A

1.3.23. (a) Add 3 times first row to second row. (b) Add −2 times first row to third row. (c) Add 4 times second row to third row.

1.3.24.
(a) $\begin{pmatrix} 1 & 0 & 0 & 0 \\ 2 & 1 & 0 & 0 \\ 3 & 4 & 1 & 0 \\ 5 & 6 & 7 & 1 \end{pmatrix}$
(b) (1) Add −2 times first row to second row.
    (2) Add −3 times first row to third row.
    (3) Add −5 times first row to fourth row.
    (4) Add −4 times second row to third row.
    (5) Add −6 times second row to fourth row.
    (6) Add −7 times third row to fourth row.
(c) Use the order given in part (b).
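The six row operations listed in Exercise 1.3.24(b) can be applied mechanically to the matrix of part (a). The following sketch (with a hypothetical helper name and 0-based row indices) performs them in the stated order and confirms that they reduce the matrix to the identity.

```python
def add_multiple(M, c, src, dst):
    """Return a copy of M with c times row src added to row dst (0-based)."""
    out = [row[:] for row in M]
    out[dst] = [d + c * s for d, s in zip(M[dst], M[src])]
    return out

L = [[1, 0, 0, 0],
     [2, 1, 0, 0],
     [3, 4, 1, 0],
     [5, 6, 7, 1]]

# (multiplier, source row, destination row) for steps (1) to (6) of part (b).
steps = [(-2, 0, 1), (-3, 0, 2), (-5, 0, 3),
         (-4, 1, 2), (-6, 1, 3), (-7, 2, 3)]

M = L
for c, src, dst in steps:
    M = add_multiple(M, c, src, dst)
print(M)
```

Working column by column, as in part (c), means each multiplier is simply the negative of the corresponding entry of the matrix in part (a).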

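♦ 1.3.25. See equation (4.51) for the general case.
$\begin{pmatrix} 1 & 1 \\ t_1 & t_2 \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ t_1 & 1 \end{pmatrix} \begin{pmatrix} 1 & 1 \\ 0 & t_2 - t_1 \end{pmatrix}$,
$\begin{pmatrix} 1 & 1 & 1 \\ t_1 & t_2 & t_3 \\ t_1^2 & t_2^2 & t_3^2 \end{pmatrix} = \begin{pmatrix} 1 & 0 & 0 \\ t_1 & 1 & 0 \\ t_1^2 & t_1 + t_2 & 1 \end{pmatrix} \begin{pmatrix} 1 & 1 & 1 \\ 0 & t_2 - t_1 & t_3 - t_1 \\ 0 & 0 & (t_3 - t_1)(t_3 - t_2) \end{pmatrix}$,
$\begin{pmatrix} 1 & 1 & 1 & 1 \\ t_1 & t_2 & t_3 & t_4 \\ t_1^2 & t_2^2 & t_3^2 & t_4^2 \\ t_1^3 & t_2^3 & t_3^3 & t_4^3 \end{pmatrix} = \begin{pmatrix} 1 & 0 & 0 & 0 \\ t_1 & 1 & 0 & 0 \\ t_1^2 & t_1 + t_2 & 1 & 0 \\ t_1^3 & t_1^2 + t_1 t_2 + t_2^2 & t_1 + t_2 + t_3 & 1 \end{pmatrix} \begin{pmatrix} 1 & 1 & 1 & 1 \\ 0 & t_2 - t_1 & t_3 - t_1 & t_4 - t_1 \\ 0 & 0 & (t_3 - t_1)(t_3 - t_2) & (t_4 - t_1)(t_4 - t_2) \\ 0 & 0 & 0 & (t_4 - t_1)(t_4 - t_2)(t_4 - t_3) \end{pmatrix}$.

1.3.26. False. For instance $\begin{pmatrix} 1 & 1 \\ 1 & 0 \end{pmatrix}$ is regular. Only if the zero appears in the (1, 1) position does it automatically preclude regularity of the matrix.

1.3.27. (n − 1) + (n − 2) + · · · + 1 = n(n − 1)/2.

1.3.28. We solve the equation $\begin{pmatrix} 1 & 0 \\ l & 1 \end{pmatrix} \begin{pmatrix} u_1 & u_2 \\ 0 & u_3 \end{pmatrix} = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$ for u1 , u2 , u3 , l, where a ≠ 0 since $A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$ is regular. This matrix equation has a unique solution: u1 = a, u2 = b, u3 = d − b c / a, l = c / a.

♦ 1.3.29. The matrix factorization A = L U is $\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ a & 1 \end{pmatrix} \begin{pmatrix} x & y \\ 0 & z \end{pmatrix} = \begin{pmatrix} x & y \\ a x & a y + z \end{pmatrix}$. This implies x = 0 and a x = 1, which is impossible.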
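The formulas of Exercise 1.3.28, together with the obstruction noted in Exercise 1.3.29, fit in a few lines of Python. The function below is an illustrative sketch with a hypothetical name; it returns the multiplier l and the entries u1, u2, u3, and refuses matrices whose (1, 1) entry is zero.

```python
from fractions import Fraction

def lu_2x2(a, b, c, d):
    """Return (l, (u1, u2, u3)) with [[a, b], [c, d]] = [[1, 0], [l, 1]] [[u1, u2], [0, u3]]."""
    if a == 0:
        # No factorization without a row interchange, as in Exercise 1.3.29.
        raise ValueError("not regular: the (1, 1) entry is zero")
    a, b, c, d = (Fraction(v) for v in (a, b, c, d))
    l = c / a
    return l, (a, b, d - b * c / a)

l, (u1, u2, u3) = lu_2x2(2, 1, 1, 4)
print(l, u1, u2, u3)
```

The sample call uses the matrix of Exercise 1.3.5(a), whose second pivot 7/2 reappears here as u3.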

♦ 1.3.30.
(a) Let u11 , . . . , unn be the pivots of A, i.e., the diagonal entries of U . Let D be the diagonal matrix whose diagonal entries are dii = sign uii . Then B = A D is the matrix obtained by multiplying each column of A by the sign of its pivot. Moreover, $B = L U D = L \widetilde{U}$, where $\widetilde{U} = U D$, is the L U factorization of B. Each column of $\widetilde{U}$ is obtained by multiplying the corresponding column of U by the sign of its pivot. In particular, the diagonal entries of $\widetilde{U}$, which are the pivots of B, are uii sign uii = | uii | > 0.
(b) Using the same notation as in part (a), we note that C = D A is the matrix obtained by multiplying each row of A by the sign of its pivot. Moreover, C = D L U . However, D L is not special lower triangular, since its diagonal entries are the pivot signs. But $\widehat{L} = D L D$ is special lower triangular, and so $C = D L D\, D U = \widehat{L} \widehat{U}$, where $\widehat{U} = D U$, is the L U factorization of C. Each row of $\widehat{U}$ is obtained by multiplying the corresponding row of U by the sign of its pivot. In particular, the diagonal entries of $\widehat{U}$, which are the pivots of C, are uii sign uii = | uii | > 0.
(c) $\begin{pmatrix} -2 & 2 & 1 \\ 1 & 0 & 1 \\ 4 & 2 & 3 \end{pmatrix} = \begin{pmatrix} 1 & 0 & 0 \\ -\frac12 & 1 & 0 \\ -2 & 6 & 1 \end{pmatrix} \begin{pmatrix} -2 & 2 & 1 \\ 0 & 1 & \frac32 \\ 0 & 0 & -4 \end{pmatrix}$,
$\begin{pmatrix} 2 & 2 & -1 \\ -1 & 0 & -1 \\ -4 & 2 & -3 \end{pmatrix} = \begin{pmatrix} 1 & 0 & 0 \\ -\frac12 & 1 & 0 \\ -2 & 6 & 1 \end{pmatrix} \begin{pmatrix} 2 & 2 & -1 \\ 0 & 1 & -\frac32 \\ 0 & 0 & 4 \end{pmatrix}$,
$\begin{pmatrix} 2 & -2 & -1 \\ 1 & 0 & 1 \\ -4 & -2 & -3 \end{pmatrix} = \begin{pmatrix} 1 & 0 & 0 \\ \frac12 & 1 & 0 \\ -2 & -6 & 1 \end{pmatrix} \begin{pmatrix} 2 & -2 & -1 \\ 0 & 1 & \frac32 \\ 0 & 0 & 4 \end{pmatrix}$.

1.3.31. (a) x =

−1 2 3

!

, (b) x =

0

1

1 @ 4 A, 1 4

(c) x = 0

0

0

B C @ 1 A,

0

1 − 37 12 2 B B 0 B C B − 17 B C B1C 12 C, (h) x = B (f ) x = @ 1 A, (g) x = B B @1A B 1 −1 @ 4 0 2 0

1

0

11

1

1

C C C C, C C A

(d) x =

(i) x =

0 B B B @

0 B B B B B B @

4 7 2 7 5 17

−

3 35 6 35 1 7 8 35

C C C C. C C A

1

C C C, A

(e) x =

0

−1

1

B −1 C @ A, 5 2

1.3.32. 1 −3

0 1

1 (b) L = B @ −1 1

0 1 0

(a) L = 0

(c)

(d)

(e)

(f )

0

!

, U=

−1 0

!

3 ; 11

0

1

0 −1 B 0C A, U = @ 0 1 0 1

0

1 2 0

0

x1 = @ 1

−1 0C A; 3

1

−

1 5 11 A, 2 11 0

x2 = 1

!

0

0

−1 B − B C B − x1 = @ 0 A, x2 = B @ 0 0

1

1 9 11 A; 3 1 11

1 , x3 = @ 1

0

1 6 3 2 5 3

C C C; A

1

9 −2 −1 0 −2 1 B C B B C 1C B0 −1 C; 2 , U = x = −9 C , x = L = − 32 @ @ A A; 0C @ A 2 1 3 3A 2 1 5 3 −1 00 0 − 3 9 3 1 0 1 1 1 0 0 2.0 .3 .4 B L=B 1 0C .355 4.94 C @ .15 A, U = @ 0 A; .2 1.2394 1 0 0 −.2028 0 0 0 1 1 1 .6944 1.1111 −9.3056 B B B C C x1 = @ −1.3889 A, x2 = @ −82.2222 A, x3 = @ 68.6111 C A .0694 6.1111 −4.9306 0 0 51 0 1 1 1 0 1 0 −1 0 1 0 0 0 4 14 B C B B0 2 B− 5 B 0 B−1 C 3 −1 C B C 1 0 0C B 14 C B 4C C; B C C, U = B B B , x = x = L=B 3 7 7 B C 2 1 1 0 1 1 B C B A @ −1 0 0 − @ 2 2 2A @ 14 @ 4A 1 0 − 2 −1 1 1 1 0 0 0 4 4 2 0 1 1 0 1 0 0 0 1 −2 0 2 B B 1 0 0C 9 −1 −9 C B 4 C C B0 C, U = B C; L=B 1 @ −8 − 17 1 0 A A @0 0 0 9 9 −4 0 −1 1 0 1 0 1 0 0 0 1 1 0 10 1 1 C B B C B C 0C B 8C B1C C. B C, x2 = B C, x3 = B x1 = B @ 41 A @3A @4A 4 2 0 B B @

1

0 1

1

C C C C; C A

1.4.1. The nonsingular matrices are (a), (c), (d), (h).

1.4.2. (a) Regular and nonsingular, (b) singular, (c) nonsingular, (d) regular and nonsingular.

1.4.3. (a) x1 = − 35 , x2 = −10/3, x3 = 5; (b) x1 = 0, x2 = −1, x3 = 2; (c) x1 = −6, x2 = 2, x3 = −2; (d) x = −13/2, y = −9/2, z = −1, w = −3; (e) x1 = −11, x2 = −10/3, x3 = −5, x4 = −7.

1.4.4. Solve the equations −1 = 2 b + c, 3 = −2 a + 4 b + c, −3 = 2 a − b + c, for a = −4, b = −2, c = 3, giving the plane z = −4 x − 2 y + 3.

1.4.5. (a) Suppose A is nonsingular. If a ≠ 0 and c ≠ 0, then we subtract c/a times the first row from the second, producing the (2, 2) pivot entry (a d − b c)/a ≠ 0. If c = 0, then the pivot entry is d and so a d − b c = a d ≠ 0. If a = 0, then c ≠ 0 as otherwise the first column would not contain a pivot. Interchanging the two rows gives the pivots c and b, and so a d − b c = − b c ≠ 0. (b) Regularity requires a ≠ 0. Proceeding as in part (a), we conclude that a d − b c ≠ 0 also.

1.4.6. True. All regular matrices are nonsingular.

♦ 1.4.7. Since A is nonsingular, we can reduce it to the upper triangular form with nonzero diagonal entries (by applying the operations # 1 and # 2). The rest of the argument is the same as in Exercise 1.3.8.

1.4.8. By applying the operations # 1 and # 2 to the system A x = b we obtain an equivalent upper triangular system U x = c. Since A is nonsingular, uii ≠ 0 for all i, so by Back Substitution each solution component, namely $x_n = \dfrac{c_n}{u_{nn}}$ and $x_i = \dfrac{1}{u_{ii}} \Bigl( c_i - \sum_{k=i+1}^{n} u_{ik} x_k \Bigr)$, for i = n − 1, n − 2, . . . , 1, is uniquely defined.

1

0

1

1

0 0 0 0 0 0 B 0 0 1C B0 1 0 C C, (b) P2 = B 1.4.9. (a) P1 = @0 0 1 0 1 0A 1 0 0 0 1 0 0 (c) No, they do not commute. (d) P1 P2 arranges P2 P1 arranges them in the order 2, 4, 3, 1. B B0 B @0

1.4.10. (a)

0

0

B @0

1

1 0 0

0

1

0 0 B B0 C 1 A, (b) B @1 0 0

0 0 0 1

0 1 0 0

1

1 0C C C, (c) 0A 0

1 0C C C, 0A 0 the rows in the order 4, 1, 3, 2, while

0

0 B B1 B @0 0

1 0 0 0

1

0 0 0 1

0 0C C C, (d) 1A 0

0

0

B1 B B B0 B @0

0

0 0 0 1 0

0 0 1 0 0

1

1 0 0 0 0

0 0C C C. 0C C 0A 1

1.4.11. The (i, j) entry of the following Multiplication Table indicates the product P i Pj , where 0

1 P1 = B @0 0 0 0 P4 = B @1 0

0

1

0 1 0 1 0 0

0 P2 = B @0 1 0 0 P5 = B @0 1

0 0C A, 1 1 0 0C A, 1

1 0 0 0 1 0

0

1

0 P3 = B @1 0 0 1 P6 = B @0 0

0 1C A, 0 1 1 0C A, 0

0 0 1 0 0 1

The commutative pairs are P1 Pi = Pi P1 , i = 1, . . . , 6, and P2 P3 = P3 P2 .

1.4.12. (a)

0

1

B B0 B @0

0

P1

P2

P3

P4

P5

P6

P1

P1

P2

P3

P4

P5

P6

P2

P2

P3

P1

P6

P4

P5

P3

P3

P1

P2

P5

P6

P4

P4

P4

P5

P6

P1

P2

P3

P5

P5

P6

P4

P3

P1

P2

P6

P6

P4

P5

P2

P3

P1

0 1 0 0

0 0 1 0

1

0 0C C C, 0A 1

0

0

B B0 B @0

1

1 0 0 0

0 0 1 0

1

0 1C C C, 0A 0

0

0

B B1 B @0

0


0 0 0 1

0 0 1 0

1

1 0C C C, 0A 0

0

0

B B1 B @0

0

1 0 0 0

0 0 1 0

1

1 0C A, 0 1 0 1C A. 0

1

0 0C C C, 0A 1

0

0

B B0 B @0

1

0 1 0 0

0 0 1 0

1

1 0C C C, 0A 0

0

1

B B0 B @0

0 0 B B1 B @0 0 0

0 0 0 1 0 0 1 0

0 0 1 0 0 0 0 1

1

0 1C C C; 0A 01 1 0C C C, 0A 0

(b) 0

0 B B0 B @1 0

0

0

B B1 B @0

1 0 0 0

0 0 0 0 1

1 0 0 0 0 0 0 11 0 1C C C; 0A 0

1

0 0C C C, 1A 0

0

1

0 0 1 0 0 0 1 0 B 0 0 B (c) B @0 1 0 0 B B0 B @0

0 0 0 1 0 1 0 0

1

0 1C C C, 0A 01 0 0C C C, 0A 1

0

0

0 1 0 0 0 0 0 0 B B0 0 B @0 1 1 0 B B0 B @1

0 0 0 1 0 1 0 0

1

1 0C C C, 0A 01 1 0C C C. 0A 0

0

1

B B0 B @0

0

0 1 0 0

0 0 0 1

1

0 0C C C, 1A 0

1.4.13. (a) True, since interchanging the same pair of rows twice brings you back to where you started. (b) False; an example is the non-elementary permutation matrix $\begin{pmatrix} 0 & 0 & 1 \\ 1 & 0 & 0 \\ 0 & 1 & 0 \end{pmatrix}$. (c) False; for example $P = \begin{pmatrix} -1 & 0 \\ 0 & -1 \end{pmatrix}$ is not a permutation matrix. For a complete list of such matrices, see Exercise 1.2.36.

1.4.14. (a) Only when all the entries of v are different; (b) only when all the rows of A are different.

1

1 0 0 C 1.4.15. (a) B @ 0 0 1 A. (b) True. (c) False — A P permutes the columns of A according to 0 1 0 the inverse (or transpose) permutation matrix P −1 = P T . ♥ 1.4.16. (a) If P has a 1 in position (π(j), j), then it moves row j of A to row π(j) of P A, which is enough to establish the correspondence. 1 0 1 0 1 0 0 0 0 0 1 0 1 1 0 0 0 0 0 0 1 C B 0 1 0 B0 0 0 1 0C C B C B C B B0 0 1 0C B0 1 0 0C B C C, (iv ) B 0 0 1 0 0 C. C, (iii) B (b) (i) @ 1 0 0 A, (ii) B C B @0 0 0 1A @0 0 1 0A @0 1 0 0 0A 0 0 1 0 1 0 0 1 0 0 0 1 0 0 0 0 Cases (i) and ! (ii) are elementary matrices. ! ! ! 1 2 3 1 2 3 4 1 2 3 4 1 2 3 4 5 (c) (i) , (ii) , (iii) , (iv ) . 2 3 1 3 4 1 2 4 1 2 3 2 5 3 1 4 ♦ 1.4.17. The first row of an n×n permutation matrix can have the 1 in any of the n positions, so there are n possibilities for the first row. Once the first row is set, the second row can have its 1 anywhere except in the column under the 1 in the first row, and so there are n − 1 possibilities. The 1 in the third row can be in any of the n − 2 positions not under either of the previous two 1’s. And so on, leading to a total of n(n − 1)(n − 2) · · · 2 · 1 = n ! possible permutation matrices. 1.4.18. Let ri , rj denote the rows of the matrix in question. After the first elementary row operation, the rows are ri and rj + ri . After the second, they are ri − (rj + ri ) = − rj and rj + ri . After the third operation, we are left with − rj and rj + ri + (− rj ) = ri .
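The counting argument of Exercise 1.4.17 can be confirmed by brute force for small n. The sketch below (function name hypothetical) builds every n × n permutation matrix by choosing, row by row, a column for the single 1 that has not been used by an earlier row, and compares the count with n!.

```python
from itertools import permutations
from math import factorial

def permutation_matrices(n):
    """Return all n x n permutation matrices as lists of rows."""
    matrices = []
    for columns in permutations(range(n)):
        # Row i has its single 1 in column columns[i]; no column is reused.
        matrices.append([[1 if j == columns[i] else 0 for j in range(n)]
                         for i in range(n)])
    return matrices

counts = {n: len(permutation_matrices(n)) for n in range(1, 6)}
print(counts)
```

Each generated matrix has exactly one 1 in every row and every column, which is the property used in the solution to count the possibilities row by row.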

1.4.19. (a)

0 1

1 0

!

0 2

1 −1

!

=

1 0

0 1

!

2 0

!

−1 , 1


x=

1 5 @ 2 A; 0

3

0

0 (b) B @0 1 (c)

(d)

0

1

B B0 B @0

0 0 B 1 B (e) B @0 0 0 0 B0 B B0 (f ) B B @1 0 0

1.4.20.

0

0 B @1 0 0 0 1 0 0 0 1 0 0 1 0 0 0

0

1 (a) B @0 0

(b)

0

0

B B0 B @1

0

(c)

1 0 0

0

1

B B0 B @0

0

0 0 1 0 1 0 0 1 0 0 0 1 0 0 0 0 0 0 1 0 1 0 0 0 0 0 1

10

0 0 B 1C A@ 1 0 0 10

1

0

1

0

0 −4 1 B 2 3C A = @0 1 7 0

0 1 0

10

0 1 B 0C A@ 0 1 0 10

1 1 0 1 −3 1 0 0 CB B B 0C 3C A@ 0 2 A = @ 0 1 0 A@ 0 010 1 0 2 0 00 2 1 1 0 1 0 0 1 2 −1 0 B B 0C 0 6 2 −1 C CB 3 B1 1 C CB C=B 0 A@ 1 1 −7 2 A @ 3 0 1 1 10 1 −1 2 11 0 1 3 21 5 0 0 1 0 0 1 0 0 B B 0C 3 1 0C 1 0 CB 2 C B0 CB C=B 0 A@ 1 4 −1 2 A @ 2 −5 1 1 7 −1 2 3 7 −29 3 10 0 1 0 0 1 0 0 0 2 3 4 C B C 0 0 CB 0 1 −7 2 3 C B B0 1 B C B CB 1 4 C = B0 0 1 0C 1 1 1 CB B C 0 0 A@ 0 0 1 0 2 A @ 0 0 0 1 0 0 0 0 1 7 3 10

1

1

x

1 5 4C 3C C; = 4A 1 − 41 0 0 B B B @

0 2 −1 C 1 −3 C x=B @ 1 A; A, 0109 0 1 0 1 22 1 2 −1 0 0 C B B 2C 0C B −13 C CB 0 −1 −6 C C; CB C, x = B @ −5 A 0 A@ 0 0 5 −1 A 4 −22 110 0 0 0 − 51 0 1 1 4 −1 2 0 −1 B C B 0 0C 0C C B −1 C CB 0 1 C, x = B CB C; @ 1A 0 A@ 0 0 3 −4 A 0 0 0 1 1 3 10 1 1 0 1 4 1 1 1 1 0 0 0 C C B B C 0 0 0 CB 0 1 −7 2 3 C B 0C CB C B B0 0 C. 1 0 2C ,x=B 0C 1 0 0C CB B C C @ −1 A 0 3 0A 2 1 0 A@ 0 0 0 0 0 0 1 0 1 37 1 10

0

1

2 3 1 7C A, 0 −4

1

4 −4 2 1 0 0 2 CB C B 3 CB 0 −2 − 1 C; 1C A=B 1 0 − A@ @ 4 2A 5 −2 0 0 − 34 0 1 2 5 7 3 solution: x1 = 4 , x2 = 4 , x3 = 2 . 0 1 10 1 10 1 0 0 0 1 −1 1 −3 0 1 −1 1 1 0 B B B 0 0C 1 1 0C 1 1 0C 0 0C B0 1 C CB 0 C CB 0 C=B CB C; CB @ A @ @ A A 0 1 1 0 0 0 −2 1A 1 −1 1 −3 0 0 3 1 3 25 1 0 0 0 1 2 −1 1 0 1 2 solution: x = 4, y = 0, z = 1, w = 1. 10 10 1 1 0 0 0 1 −1 2 1 1 −1 2 1 1 0 0 0 CB CB C B 0 1 CB −1 3 −3 0C 1 −3 0C B 1 1 0 0 CB 0 C CB CB C; C=B 1 0 A@ 1 −1 0 2 −4 A 1 −3 A @ 1 0 1 0 A@ 0 1 0 0 0 0 0 1 1 2 −1 1 −1 0 − 2 1 5 19 solution: x = 3 , y = − 3 , z = −3, w = −2. 0 4 B 1C A@ −3 0 −3

−4 3 1

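♦ 1.4.21. (a) They are all of the form $P A = L U$, where $P$ is a permutation matrix. In the first case, we interchange rows 1 and 2; in the second case, we interchange rows 1 and 3; in the third case, we interchange rows 1 and 3 first and then interchange rows 2 and 3. (b) Same solution $x = 1$, $y = 1$, $z = -2$ in all cases. Each is done by a sequence of elementary row operations, which do not change the solution.

1.4.22. There are four in all:

$$\begin{pmatrix} 0&1&0\\ 1&0&0\\ 0&0&1 \end{pmatrix} \begin{pmatrix} 0&1&2\\ 1&0&-1\\ 1&1&3 \end{pmatrix} = \begin{pmatrix} 1&0&0\\ 0&1&0\\ 1&1&1 \end{pmatrix} \begin{pmatrix} 1&0&-1\\ 0&1&2\\ 0&0&2 \end{pmatrix},$$

$$\begin{pmatrix} 0&1&0\\ 0&0&1\\ 1&0&0 \end{pmatrix} \begin{pmatrix} 0&1&2\\ 1&0&-1\\ 1&1&3 \end{pmatrix} = \begin{pmatrix} 1&0&0\\ 1&1&0\\ 0&1&1 \end{pmatrix} \begin{pmatrix} 1&0&-1\\ 0&1&4\\ 0&0&-2 \end{pmatrix},$$

$$\begin{pmatrix} 0&0&1\\ 0&1&0\\ 1&0&0 \end{pmatrix} \begin{pmatrix} 0&1&2\\ 1&0&-1\\ 1&1&3 \end{pmatrix} = \begin{pmatrix} 1&0&0\\ 1&1&0\\ 0&-1&1 \end{pmatrix} \begin{pmatrix} 1&1&3\\ 0&-1&-4\\ 0&0&-2 \end{pmatrix},$$

$$\begin{pmatrix} 0&0&1\\ 1&0&0\\ 0&1&0 \end{pmatrix} \begin{pmatrix} 0&1&2\\ 1&0&-1\\ 1&1&3 \end{pmatrix} = \begin{pmatrix} 1&0&0\\ 0&1&0\\ 1&-1&1 \end{pmatrix} \begin{pmatrix} 1&1&3\\ 0&1&2\\ 0&0&-2 \end{pmatrix}.$$

The other two permutation matrices are not regular.

1.4.23. The maximum is 6 since there are 6 different 3 × 3 permutation matrices. For example,

$$\begin{pmatrix} 1&0&0\\ 1&1&0\\ -1&1&1 \end{pmatrix} = \begin{pmatrix} 1&0&0\\ 1&1&0\\ -1&1&1 \end{pmatrix} \begin{pmatrix} 1&0&0\\ 0&1&0\\ 0&0&1 \end{pmatrix},$$

$$\begin{pmatrix} 1&0&0\\ 0&0&1\\ 0&1&0 \end{pmatrix} \begin{pmatrix} 1&0&0\\ 1&1&0\\ -1&1&1 \end{pmatrix} = \begin{pmatrix} 1&0&0\\ -1&1&0\\ 1&1&1 \end{pmatrix} \begin{pmatrix} 1&0&0\\ 0&1&1\\ 0&0&-1 \end{pmatrix},$$

$$\begin{pmatrix} 0&1&0\\ 1&0&0\\ 0&0&1 \end{pmatrix} \begin{pmatrix} 1&0&0\\ 1&1&0\\ -1&1&1 \end{pmatrix} = \begin{pmatrix} 1&0&0\\ 1&1&0\\ -1&-2&1 \end{pmatrix} \begin{pmatrix} 1&1&0\\ 0&-1&0\\ 0&0&1 \end{pmatrix},$$

$$\begin{pmatrix} 0&1&0\\ 0&0&1\\ 1&0&0 \end{pmatrix} \begin{pmatrix} 1&0&0\\ 1&1&0\\ -1&1&1 \end{pmatrix} = \begin{pmatrix} 1&0&0\\ -1&1&0\\ 1&-\frac12&1 \end{pmatrix} \begin{pmatrix} 1&1&0\\ 0&2&1\\ 0&0&\frac12 \end{pmatrix},$$

$$\begin{pmatrix} 0&0&1\\ 0&1&0\\ 1&0&0 \end{pmatrix} \begin{pmatrix} 1&0&0\\ 1&1&0\\ -1&1&1 \end{pmatrix} = \begin{pmatrix} 1&0&0\\ -1&1&0\\ -1&\frac12&1 \end{pmatrix} \begin{pmatrix} -1&1&1\\ 0&2&1\\ 0&0&\frac12 \end{pmatrix},$$

$$\begin{pmatrix} 0&0&1\\ 1&0&0\\ 0&1&0 \end{pmatrix} \begin{pmatrix} 1&0&0\\ 1&1&0\\ -1&1&1 \end{pmatrix} = \begin{pmatrix} 1&0&0\\ -1&1&0\\ -1&2&1 \end{pmatrix} \begin{pmatrix} -1&1&1\\ 0&1&1\\ 0&0&-1 \end{pmatrix}.$$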
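The counts in Exercises 1.4.22 and 1.4.23 can be checked by brute force. The sketch below assumes the coefficient matrices are the ones read off from the solutions (`A_22` and `A_23` are hypothetical names), and it treats a matrix as regular when Gaussian Elimination runs to completion with nonzero pivots and no row interchanges.

```python
from fractions import Fraction
from itertools import permutations


def is_regular(M):
    """True if elimination on M needs no row interchanges (all pivots nonzero)."""
    U = [[Fraction(x) for x in row] for row in M]
    n = len(U)
    for j in range(n):
        if U[j][j] == 0:
            return False
        for i in range(j + 1, n):
            factor = U[i][j] / U[j][j]
            U[i] = [a - factor * b for a, b in zip(U[i], U[j])]
    return True


def count_regular_orderings(A):
    """Count the row orderings (permutation matrices P) for which P A is regular."""
    n = len(A)
    return sum(is_regular([A[k] for k in order]) for order in permutations(range(n)))


A_22 = [[0, 1, 2], [1, 0, -1], [1, 1, 3]]   # matrix assumed for Exercise 1.4.22
A_23 = [[1, 0, 0], [1, 1, 0], [-1, 1, 1]]   # matrix assumed for Exercise 1.4.23
```

Exact rational arithmetic is used so that a pivot is reported as zero only when it really is zero.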

1.4.24. False. Changing the permutation matrix typically changes the pivots.

♠ 1.4.25. Permuted L U factorization

start
    set P = I, L = I, U = A
    for j = 1 to n
        if u_kj = 0 for all k ≥ j, stop; print “A is singular”
        if u_jj = 0 but u_kj ≠ 0 for some k > j then
            interchange rows j and k of U
            interchange rows j and k of P
            for m = 1 to j − 1
                interchange l_jm and l_km
            next m
        for i = j + 1 to n
            set l_ij = u_ij / u_jj
            add − l_ij times row j to row i of U
        next i
    next j
end
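The pseudocode above translates fairly directly into Python. This is a minimal sketch, assuming exact rational arithmetic and list-of-lists matrices; the names `permuted_lu` and `matmul` are hypothetical. When a pivot is zero it interchanges with the first row below that has a nonzero entry, which is one possible choice of k.

```python
from fractions import Fraction


def matmul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]


def permuted_lu(A):
    """Return (P, L, U) with P A = L U, or raise ValueError if A is singular."""
    n = len(A)
    U = [[Fraction(x) for x in row] for row in A]
    L = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    P = [[int(i == j) for j in range(n)] for i in range(n)]
    for j in range(n):
        # first row at or below the diagonal with a nonzero entry in column j
        k = next((r for r in range(j, n) if U[r][j] != 0), None)
        if k is None:
            raise ValueError("A is singular")
        if k != j:
            U[j], U[k] = U[k], U[j]
            P[j], P[k] = P[k], P[j]
            for m in range(j):          # swap the multipliers already stored in L
                L[j][m], L[k][m] = L[k][m], L[j][m]
        for i in range(j + 1, n):
            L[i][j] = U[i][j] / U[j][j]
            U[i] = [a - L[i][j] * b for a, b in zip(U[i], U[j])]
    return P, L, U


A = [[0, 1, 2], [1, 0, -1], [1, 1, 3]]
P, L, U = permuted_lu(A)
```

A quick check is that `matmul(P, A)` and `matmul(L, U)` agree entry by entry.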


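1.5.1.
(a) $\begin{pmatrix} 2&3\\ -1&-1 \end{pmatrix} \begin{pmatrix} -1&-3\\ 1&2 \end{pmatrix} = \begin{pmatrix} 1&0\\ 0&1 \end{pmatrix} = \begin{pmatrix} -1&-3\\ 1&2 \end{pmatrix} \begin{pmatrix} 2&3\\ -1&-1 \end{pmatrix}$,
(b) $\begin{pmatrix} 2&1&1\\ 3&2&1\\ 2&1&2 \end{pmatrix} \begin{pmatrix} 3&-1&-1\\ -4&2&1\\ -1&0&1 \end{pmatrix} = \begin{pmatrix} 1&0&0\\ 0&1&0\\ 0&0&1 \end{pmatrix} = \begin{pmatrix} 3&-1&-1\\ -4&2&1\\ -1&0&1 \end{pmatrix} \begin{pmatrix} 2&1&1\\ 3&2&1\\ 2&1&2 \end{pmatrix}$,
(c) $\begin{pmatrix} -1&3&2\\ 2&2&-1\\ -2&1&3 \end{pmatrix} \begin{pmatrix} -1&1&1\\ \frac47&-\frac17&-\frac37\\ -\frac67&\frac57&\frac87 \end{pmatrix} = \begin{pmatrix} 1&0&0\\ 0&1&0\\ 0&0&1 \end{pmatrix} = \begin{pmatrix} -1&1&1\\ \frac47&-\frac17&-\frac37\\ -\frac67&\frac57&\frac87 \end{pmatrix} \begin{pmatrix} -1&3&2\\ 2&2&-1\\ -2&1&3 \end{pmatrix}$.

1.5.2. $X = \begin{pmatrix} -5&16&6\\ 3&-8&-3\\ -1&3&1 \end{pmatrix}$; $X A = \begin{pmatrix} -5&16&6\\ 3&-8&-3\\ -1&3&1 \end{pmatrix} \begin{pmatrix} 1&2&0\\ 0&1&3\\ 1&-1&-8 \end{pmatrix} = \begin{pmatrix} 1&0&0\\ 0&1&0\\ 0&0&1 \end{pmatrix}$.

1.5.3. (a)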

1 (d) B @0 0 1.5.4.

0

1

B @a

b

!

0 1

!

1 , 0

0 1 0 0 1 0

1 0 (b) , −5 1 0 1 1 0 0 B 1 B0 C 3 A, (e) B @ 0 −6 1 0 0 10

0 1 B 0C A @ −a 1 −b

0 1 0

1

0

(c) 0 0 1 0

0 0C C C, 0A 1

2 , 1

0

1

0 B B0 (f ) B @0 1

0 1 0 0

0 0 1 0

1

1 0C C C. 0A 0

10

0

1 1 0 0 0 CB B 0C A = @ −a 1 0 A @ a b − b 10 1 1 1 0 0 1 0C =B @ −a A. ac − b −c 1

0 1 B 0C A = @0 1 0 M −1

1

!

1 0

0 1 0 0

0 1 0

1

0 0C A; 1

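1.5.5. The ith row of the matrix multiplied by the ith column of the inverse should equal 1. This is not possible if all the entries of the ith row are zero; see Exercise 1.2.24.

1.5.6. (a) $A^{-1} = \begin{pmatrix} -1&1\\ 2&-1 \end{pmatrix}$, $B^{-1} = \begin{pmatrix} \frac23&\frac13\\ -\frac13&\frac13 \end{pmatrix}$. (b) $C = \begin{pmatrix} 2&1\\ 3&0 \end{pmatrix}$, $C^{-1} = B^{-1} A^{-1} = \begin{pmatrix} 0&\frac13\\ 1&-\frac23 \end{pmatrix}$.

1.5.7. (a) $R_\theta^{-1} = \begin{pmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{pmatrix}$. (b) $\begin{pmatrix} a\\ b \end{pmatrix} = R_\theta^{-1} \begin{pmatrix} x\\ y \end{pmatrix} = \begin{pmatrix} x\cos\theta + y\sin\theta \\ -x\sin\theta + y\cos\theta \end{pmatrix}$. (c) $\det(R_\theta - a I) = \det \begin{pmatrix} \cos\theta - a & -\sin\theta \\ \sin\theta & \cos\theta - a \end{pmatrix} = (\cos\theta - a)^2 + (\sin\theta)^2 > 0$ provided $\sin\theta \neq 0$, which is valid when $0 < \theta < \pi$.

1.5.8.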

1 (a) Setting P1 = B @0 0 0 0 P4 = B @1 0 −1 we find P1 = P1 ,

0 0 0 1 0C P2 = B A, @0 0 1 1 0 1 1 0 0 0 0C P5 = B @0 A, 0 1 1 −1 −1 P2 = P3 , P3 = P2 ,

(b) P1 , P4 , P5 , P6 are their own inverses. 17

1

0

1 0 0 0 1C P3 = B A, @1 0 0 0 0 1 0 1 1 1 0C P6 = B @0 A, 0 0 0 −1 −1 P4 = P4 , P5 = P5 ,

1

0 1 0 0C A, 1 0 1 0 0 0 1C A, 1 0 P6−1 = P6 .

1

2 −1 C A. 3

0

0

B B1 B @0

(c) Yes: P =

0

0

0 B B0 1.5.9. (a) B @0 1

0 0 1 0

0 1 0 0

1 0 0 0

0 0 0 1 1

1

0 0C C C interchanges two pairs of rows. 1A 0

1 0C C C, (b) 0A 0

0

0 B B1 B @0 0

0 0 1 0

0 0 0 1

1

1 0C C C, (c) 0A 0

0

1 B B0 B @0 0

0 0 0 1

0 1 0 0

1

0 0C C C, (d) 1A 0

0

1

B0 B B B0 B @0

0

0 0 1 0 0

0 0 0 0 1

0 1 0 0 0

1

0 0C C C. 0C C 1A 0

1.5.10. (a) If $i$ and $j = \pi(i)$ are the entries in the ith column of the 2 × n matrix corresponding to the permutation, then the entries in the jth column of the 2 × n matrix corresponding to the inverse permutation are $j$ and $i = \pi^{-1}(j)$. Equivalently, permute the columns so that the second row is in order 1, 2, . . . , n and then switch the two rows. (b) The permutations correspond to
(i) $\begin{pmatrix} 1&2&3&4\\ 4&3&2&1 \end{pmatrix}$, (ii) $\begin{pmatrix} 1&2&3&4\\ 4&1&2&3 \end{pmatrix}$, (iii) $\begin{pmatrix} 1&2&3&4\\ 1&3&4&2 \end{pmatrix}$, (iv) $\begin{pmatrix} 1&2&3&4&5\\ 1&4&2&5&3 \end{pmatrix}$.
The inverse permutations correspond to
(i) $\begin{pmatrix} 1&2&3&4\\ 4&3&2&1 \end{pmatrix}$, (ii) $\begin{pmatrix} 1&2&3&4\\ 2&3&4&1 \end{pmatrix}$, (iii) $\begin{pmatrix} 1&2&3&4\\ 1&4&2&3 \end{pmatrix}$, (iv) $\begin{pmatrix} 1&2&3&4&5\\ 1&3&5&2&4 \end{pmatrix}$.

1.5.11. If $a = 0$ the first row is all zeros, and so $A$ is singular. Otherwise, we make $d \to 0$ by an elementary row operation. If $e = 0$ then the resulting matrix has a row of all zeros. Otherwise, we make $h \to 0$ by another elementary row operation, and the result is a matrix with a row of all zeros.

1.5.12. This is true if and only if $A^2 = I$, and so, according to Exercise 1.2.36, $A$ is either of the form $\begin{pmatrix} \pm 1 & 0 \\ 0 & \pm 1 \end{pmatrix}$ or $\begin{pmatrix} a & b \\ c & -a \end{pmatrix}$, where $a$ is arbitrary and $b c = 1 - a^2$.
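The rule in Exercise 1.5.10(a) amounts to reading each column of the 2 × n array backwards. A minimal sketch, assuming a permutation is stored as the list of its second-row entries π(1), ..., π(n); the function name `inverse_permutation` is hypothetical.

```python
def inverse_permutation(pi):
    """Given [pi(1), ..., pi(n)], return [pi^{-1}(1), ..., pi^{-1}(n)]."""
    inv = [0] * len(pi)
    for i, j in enumerate(pi, start=1):
        inv[j - 1] = i          # pi(i) = j  means  pi^{-1}(j) = i
    return inv


examples = [[4, 3, 2, 1], [4, 1, 2, 3], [1, 3, 4, 2], [1, 4, 2, 5, 3]]
inverses = [inverse_permutation(p) for p in examples]
```

Applying the function twice returns the original list, which mirrors the fact that the inverse of the inverse is the permutation itself.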

1.5.13. $(3 I - A) A = 3 A - A^2 = I$, so $3 I - A$ is the inverse of $A$.

1.5.14. $\left( \frac{1}{c} A^{-1} \right) (c A) = \frac{1}{c}\, c\, A^{-1} A = I$.

1.5.15. Indeed, $(A^n)^{-1} = (A^{-1})^n$.

1.5.16. If all the diagonal entries are nonzero, then $D^{-1} D = I$. On the other hand, if one of the diagonal entries is zero, then all the entries in that row are zero, and so $D$ is not invertible.

1.5.17. Since $U^{-1}$ is also upper triangular, the only nonzero summand in the product of the ith row of $U$ and the ith column of $U^{-1}$ is the product of their diagonal entries, which must equal 1 since $U U^{-1} = I$.

♦ 1.5.18. (a) $A = I^{-1} A I$. (b) If $B = S^{-1} A S$, then $A = S B S^{-1} = T^{-1} B T$, where $T = S^{-1}$. (c) If $B = S^{-1} A S$ and $C = T^{-1} B T$, then $C = T^{-1} (S^{-1} A S) T = (S T)^{-1} A (S T)$.

♥ 1.5.19. (a) Suppose $D^{-1} = \begin{pmatrix} X & Y \\ Z & W \end{pmatrix}$. Then, in view of Exercise 1.2.37, the equation $D D^{-1} = I = \begin{pmatrix} I & O \\ O & I \end{pmatrix}$ requires $A X = I$, $A Y = O$, $B Z = O$, $B W = I$. Thus, $X = A^{-1}$, $W = B^{-1}$ and, since they are invertible, $Y = A^{-1} O = O$, $Z = B^{-1} O = O$.


(b) 1.5.20.

0 B B B @

−

1 3 2 3

0

1

2 3 − 13

0

0C C , 0C A

1 3

0

−1

1 1 0 0

B B −2 B @ 0

0

!

0

1

0 0 −5 2

0 0C C C. 3A −1

(a) $B A = \begin{pmatrix} 1&1&0\\ -1&-1&1 \end{pmatrix} \begin{pmatrix} 1&-1\\ 0&1\\ 1&1 \end{pmatrix} = \begin{pmatrix} 1&0\\ 0&1 \end{pmatrix}$.
(b) $A X = I$ does not have a solution. Indeed, the first column of this matrix equation is the linear system $\begin{pmatrix} 1&-1\\ 0&1\\ 1&1 \end{pmatrix} \begin{pmatrix} x\\ y \end{pmatrix} = \begin{pmatrix} 1\\ 0\\ 0 \end{pmatrix}$, which has no solutions since $x - y = 1$, $y = 0$, and $x + y = 0$ are incompatible.
(c) Yes: for instance, $B = \begin{pmatrix} 2&3&-1\\ -1&-1&1 \end{pmatrix}$. More generally, $B A = I$ if and only if $B = \begin{pmatrix} 1 - z & 1 - 2z & z \\ -w & 1 - 2w & w \end{pmatrix}$, where $z, w$ are arbitrary.

1.5.21. The general solution to $A X = I$ is $X = \begin{pmatrix} -2y & 1 - 2v \\ y & v \\ -1 & 1 \end{pmatrix}$, where $y, v$ are arbitrary. Any of these matrices serves as a right inverse. On the other hand, the linear system $Y A = I$ is incompatible and there is no solution.
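The two-parameter family of left inverses in Exercise 1.5.20(c) is easy to spot-check. The sketch below assumes the 3 × 2 matrix A read off from part (b) of that solution; `left_inverse` and `matmul` are hypothetical helper names.

```python
def matmul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]


A = [[1, -1], [0, 1], [1, 1]]        # matrix assumed for Exercise 1.5.20


def left_inverse(z, w):
    """Member of the family B(z, w) from part (c)."""
    return [[1 - z, 1 - 2 * z, z], [-w, 1 - 2 * w, w]]


identity_2 = [[1, 0], [0, 1]]
checks = [matmul(left_inverse(z, w), A) == identity_2
          for z in range(-3, 4) for w in range(-3, 4)]
```

Choosing z = -1 and w = 1 reproduces the specific example quoted in part (c), and z = 0, w = 1 gives the matrix used in part (a).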

1.5.22.
(a) No. The only solutions are complex, with $a = \left( -\frac12 \pm \frac{\sqrt{3}}{2}\, i \right) b$, where $b \neq 0$ is any nonzero complex number.
(b) Yes. A simple example is $A = \begin{pmatrix} -1&1\\ -1&0 \end{pmatrix}$, $B = \begin{pmatrix} 1&0\\ 0&1 \end{pmatrix}$. The general solution to the 2 × 2 matrix equation has the form $A = B M$, where $M = \begin{pmatrix} x&y\\ z&w \end{pmatrix}$ is any matrix with $\operatorname{tr} M = x + w = -1$ and $\det M = x w - y z = 1$. To see this, if we set $A = B M$, then $(I + M)^{-1} = I + M^{-1}$, which is equivalent to $I + M + M^{-1} = O$. Writing this out using the formula (1.38) for the inverse, we find that if $\det M = x w - y z = 1$ then $\operatorname{tr} M = x + w = -1$, while if $\det M \neq 1$, then $y = z = 0$ and $x + x^{-1} + 1 = 0 = w + w^{-1} + 1$, in which case, as in part (a), there are no real solutions.
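The example in part (b) can be confirmed with the explicit 2 × 2 inverse formula. This is a small sketch with hypothetical helper names `inv2` and `add2`, using exact fractions so that equality tests are meaningful.

```python
from fractions import Fraction


def inv2(M):
    """Inverse of a 2 x 2 matrix via the adjugate formula."""
    (a, b), (c, d) = M
    det = Fraction(a * d - b * c)
    if det == 0:
        raise ZeroDivisionError("matrix is singular")
    return [[d / det, -b / det], [-c / det, a / det]]


def add2(X, Y):
    """Entrywise sum of two 2 x 2 matrices."""
    return [[x + y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]


A = [[-1, 1], [-1, 0]]      # trace -1, determinant 1
B = [[1, 0], [0, 1]]

lhs = inv2(add2(A, B))              # (A + B)^{-1}
rhs = add2(inv2(A), inv2(B))        # A^{-1} + B^{-1}
```

Replacing A by any other matrix with trace -1 and determinant 1 should give the same agreement, in line with the general solution described above.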

0

1 B B0 1.5.23. E = B @0 0 1.5.24. (a) 0

3 (e) B @9 1

0 @

−1 1 −2 −7 −1

0 1 0 0

0 0 7 0 1

2 3 A, 1 3 1

−2 −6 C A, −1

1

0 0C C C, 0A 1 (b)

(f )

E −1 0 @

0 B B B @

0

1 B B0 =B @0 0

− − −

1 8 3 8

5 8 1 2 7 8

0 1 0 0

0 0

1 7

0

1

3 8 A, − 81 1 8 1 2 − 38

1

0 0C C C. 0A 1

(c) 5 8 − 12 1 8


1

C C C, A

0 @

−

3 5 4 5 0

1 4 5 A, 3 5

−5 B 2 (g) B @ 2 2

(d) no inverse, 3 2

−1 −1

1 1 2C −1 C A,

0

(h)

0

0

B B1 B @0

0

1.5.25.

1 3 1 3

(a) (b)

1

(c)

4 3

(d) not 0 1 (e) B @3 0

(f )

(g)

0

1

0

1

B @3

0

B @0

0

0

1

2 −6 −5 2 !

0 1 ! 0 1

!0

1 0 1 0

0 3

1

1 −3 C C C, −3 A 1 !

0 −8

!

1

0 @ 35 0 A 1 0 1 possible, 10 1 0 0 B 1 0C A@ 0 −2 0 1

0 1 0

0 0 1

0 0 (h) 1 0 0 0 1 B B0 B @0 0 0 1 0 B 0 1 B (i) B @0 0 00 0 1 B B0 B @0 0 B B0 B @0

1 −2 0 0

1 0

1 0

(i)

0

−51

8 2 −3 −1

B B −13 B @ 21

5

!

−2 = 1 ! 1 3 = 0 1 0 5 3

!0 @1

0

10

1 3 1 3

12 3 −5 −1

1

3 1C C C. −1 A 0

!

−2 , −3 ! 3 , 1

1

0

− 43 A @ = 1

10

3 5 4 5

− 54 3 5

1

A,

10

1

0 0 1 0 0 1 0 0 1 0 0 B B CB 1 0C 1 0C 0C A@0 A @ 0 −1 0 A @ 0 1 A 0 1 0 −1 1 0 0 1 0 0 −1 0 10 1 0 1 1 0 −2 1 0 0 1 0 −2 B B C B 0C 0C @0 1 A @ 0 1 −6 A = @ 3 −1 A, 0 0 1 0 0 1 −2 1 −3 10 10 10 10 1 0 1 0 0 1 0 0 1 0 0 1 0 0 CB CB CB CB 0 A @ 0 1 0 A @ 0 1 0 A @ 0 −1 0 A @ 0 1 0 C A 1 0 2 0 11 0 0 3 11 0 0 0 1 1 00 0 8 1 1 2 3 1 2 0 1 0 3 1 0 0 B C C B CB CB @ 0 1 0 A @ 0 1 4 A @ 0 1 0 A = @ 3 5 5 A, 0 0 1 1 00 0 1 1 00 0 1 1 0 2 1 2 1 10 0 1 0 0 2 0 0 1 0 0 1 0 0 B CB CB CB 1C 0C A @ 0 1 0 A @ 0 1 0 A @ 0 −1 0 A @ 0 1 A 0 0 2 0 11 0 0 0 1 1 0 0 0 1 1 00 0 −1 1 2 1 2 1 0 1 1 0 0 1 12 0 B C B CB CB 2 3C A, @ 0 1 0 A @ 0 1 −1 A @ 0 1 0A = @4 0 −1 1 0 0 1 0 0 1 0 0 1 10 1 10 10 10 0 0 1 0 0 0 1 0 0 0 2 0 0 0 1 0 0 0 1 CB C B1 B CB 1 0C 0 0C CB 2 1 0 0 CB 0 1 CB 0 1 0 0 CB 0 − 2 0 0 C CB CB C CB CB 0 1 0A 0 0 A@ 0 0 1 0 A@ 0 0 1 0 A@ 0 0 1 0 A@ 0 0 0 0 1 0 0 0 1 0 1 0 0 0 1 0 0 −2 1 1 10 10 0 1 10 2 1 0 1 1 0 0 0 1 0 0 0 1 12 0 0 0 0 12 CB CB B C CB 1 3C C B0 0 B 0 1 0 3 CB 0 1 0 0 CB 0 1 0 0C 1 0 0C C, CB CB C=B CB 0 −1 A 0 1 0 A@ 0 0 1 0 A@ 0 0 1 3 A@ 0 0 1 0 A @ 1 0 0 0 −2 −5 1 0 0 110 0 0 0 110 0 0 0 110 0 0 0 10 1 0 0 1 0 0 0 1 0 0 0 1 0 0 0 1 0 0 0 CB CB CB CB 0 0 CB 2 1 0 0 CB 0 1 0 0 CB 0 1 0 0 CB 0 1 0 0C C CB CB CB CB C 0 1 A@ 0 0 1 0 A@ 0 0 1 0 A@ 0 2 1 0 A@ 0 0 1 0A 1 0 0100 0 1 3 100 0 1 0 100 0 1 0 −1 0 1 1 10 1 0 0 0 1 0 0 0 1 0 0 1 1 0 0 0 0 0 0 CB CB CB CB 0C 0 CB 0 1 0 0 CB 0 1 0 −2 CB 0 1 0 1 0 0 CB 0 1 0 C C CB CB CB CB 0 A@ 0 0 1 −5 A 0 A@ 0 0 1 0 A@ 0 0 1 0 −1 0 A@ 0 0 1 0 10 0 1 0 00 0 1 0 0 0 −1 100 0 0 1 0 0 1 0 10 1 1 0 1 0 1 0 0 0 1 −2 0 0 1 −2 1 1 B CB CB B C 1 0 0C B 0 1 0 0 CB 0 1 1 0 CB 0 C B 2 −3 3 0 C B CB CB C=B C. @ 0 0 1 0 A@ 0 0 1 0 A@ 0 0 1 0 A @ 3 −7 2 4 A 0 0 0 1 0 0 0 1 0 0 0 1 0 2 1 1


1.5.26. Applying Gaussian Elimination: 0

E1 = @ E2 = E3 =

0 @

and hence A =

1.5.27. (a)

0 @

−

0

1 0

1

2

0

0

1

0

1

1 2 A, − 2i

i 2 1 2

i (c) B @1 − i −1

0 −i −1

A,

0 @

0

1

10

1 0 A@ 0 1

1 √1 3

1− i −1

0 √2 3

!

3 @ 2

C A, √2 3

0

0 @

1

− 12 A , 1 − √1

1

10 √

1

3 A,

0

1

1 0

0 1

10

3 A@ 2

,

− √1

0 A@ 1 1 0

0

!

1

3 A.

1

,

3+ i (d) B @ −4 + 4 i −1 + 2 i

−1 1 C A, −i

0

0√

1

E 4 E3 E2 E1 A = I =

−1 1+ i

(b)

− 12

3 B 2 @

E 3 E2 E1 A =

1

=

0√

E 2 E1 A =

1

1 √1 3 A,

E1−1 E2−1 E3−1 E4−1 1

E1 A =

0 √ A, 3

√2 @ 3

E4 = @

0 A, 1

√1 3

−

0

0

1

1

−1 − i 2− i 1− i

−i 2+ i 1

1

C A. “

”

1.5.28. No. If they have the same solution, then they both reduce to $( I \mid x )$ under elementary row operations. Thus, by applying the appropriate elementary row operations to reduce the augmented matrix of the first system to $( I \mid x )$, and then applying the inverse elementary row operations, we arrive at the augmented matrix for the second system. Thus, the first system can be changed into the second by the combined sequence of elementary row operations, proving equivalence. (See also Exercise 2.5.44 for the general case.)

♥ 1.5.29.
(a) If $\tilde{A} = E_N E_{N-1} \cdots E_2 E_1 A$ where $E_1, \ldots, E_N$ represent the row operations applied to $A$, then $\tilde{C} = \tilde{A} B = E_N E_{N-1} \cdots E_2 E_1 A B = E_N E_{N-1} \cdots E_2 E_1 C$, which represents the same sequence of row operations applied to $C$.
(b) $(E A) B = \begin{pmatrix} 1&2&-1\\ 2&-3&2\\ -2&-3&-2 \end{pmatrix} \begin{pmatrix} 1&-2\\ 3&0\\ -1&1 \end{pmatrix} = \begin{pmatrix} 8&-3\\ -9&-2\\ -9&2 \end{pmatrix} = \begin{pmatrix} 1&0&0\\ 0&1&0\\ -2&0&1 \end{pmatrix} \begin{pmatrix} 8&-3\\ -9&-2\\ 7&-4 \end{pmatrix} = E (A B)$.

1.5.30. (a)

0

1 @2 1 4 0

2

(c)

B B @1

0

0

−4 (e) B @ 2 3

1 1 2A − 41

− 52 −1 1 2

3 −1 −1

1 −2

!

10

3 2 CB B 0C A@ 1 −2 10

0

=@

1

−

1 1 2 A; 3 4 0

(b) 1

3 C B 14 C B − 2C 5C A=@ A; 2 −2 1

0

1

1 3 −4 B B C C 0C A @ 5 A = @ 1 A; 1 −7 −3

0 @

−

5 17 1 17 0

1 2 17 A 3 17

9 (d) B @ 6 −1 (f )

0

1

B B0 B @2

2


0 0 −1 −1

2 12

−15 −10 2 1 −1 −1 −1

!

=

!

2 ; 2

10

1

0

1

−8 3 2 B B C C −5 C A@ −1 A = @ 3 A; 1 5 0 10

1

0

1

1 4 3 B C B C −1 C CB 11 C B 1C CB C=B C; 0 A@ −7 A @ 4 A −1 6 −2

(g)

1.5.31. (a)

0

1

B B B B B @

− 52 −4 − 21

@

− −

0

0

1 1 3A 2 3 1

1 −2 −3 −1

1 −3

(a)

2 1

!

0 1 1 0 0 2 1 (c) B 4 @2 0 −2

(b)

0

1 (d) B @0 0

0

(e)

0

(f )

0

1

B B0 B @0

(g)

0

2

B @1

1

1

B B1 B @1

0 1 0 0

3 0 0 0 1

1.5.33. (a)

0 @

−

0 0 1

−3 −1 −1

!

,

=

3 7 A, 5 7

1.6.1. (a) ( 1

3 2

10 CB

B − 32 C CB CB B −3 C A@ 1 −2

2 1 2

1 −3

0

!

0 1

!

1 0

0 7

0

1

10

0 1 B 1C A@1 0 2

0

1 7 5 A, (c) @ − 15 0 1 − 12 C (g) B @ 0 A,

1

!

1 0

1 2 B1 1C A=@2 1 2 2

(b)

5 ),

0

5 1 B −2 C A = @2 3 1

0

!

2 ; 1

0 2 0 B 1 0C A@ 0 2 0 0 1 0 1 1 0 0 2 B 1 1 0 5C B C C=B −1 A @ 1 −1 1 3 − 34 1 6 0 1 0 2 −3 1 C −2 0 1C B B2 C=B −2 −2 −1 A @ 0 1 1 1 2 !

−8 , (c) 3

(b)

1 1

0 B B B @

−

!

0 , 2

1 6 2 3 2 3

(h)

1

C C C, A

4

1

C B B −10 C C, B @ −8 A

0 B B B @

(i)

3

1 − 27 0 1 1 10 0 1 1 12 CB 0 A@ 0 1 −1 C A; −1 0 0 1 10 10 0 1 0 0 1 B B 0C 0C A@ 0 −3 A@ 0 0 1 0 0 −7 10

1 1 0

1

− 32

1

−28 −7 C C C. 12 A 3

5

1

7C 3 A;

1

0 1 1 C B 0C A@ 0 1 0 A; 1 0 0 1 10 10 1 0 1 0 0 0 1 −1 1 2 CB CB 0 CB 0 −3 0 0 CB 0 1 0 −1 C C C; CB CB 0 A@ 0 0 −2 0 A@ 0 0 1 0A 1 0 0 0 1 0 0 0 4 10 10 1 0 2 1 0 0 0 0 0 0 B B 0 0C 1 0 0C CB 0 −2 CB 0 1 2 CB CB 0 −1 0 A@ 0 0 1 − 21 1 0 A@ 0 0 0 0 −5 1 0 1 0 0 0 0

1

1 C (d) B @ −2 A, (e) 0

(c)

0

!

!

0 1 0

10

0 1 1

(d) singular matrix,

1

!

1

1 1 −1

1

3 − 2C B C 1C B C 3C B 2C C=B C. B C 3C @ − 1A A 3 2 −2

0 4 1 0 −7 0 = −7 2 0 1 0 4 0 1 10 2 1 0 0 2 0 B B −1 C 1 0C A = @1 A@ 0 3 1 0 − 23 1 0 0

−1 1 −4 1 2 −1 1 1 10 0 1 B 0C CB 2 CB 1 A@ 1 0 0

1

1

1 1 (b) @ 41 A, 0 4 1 1 8C B C B − 1 C, (f ) B 2A @ 5 8

−1 C (e) B @ −4 A, −1

1.5.32.

0

1 2

!

2 , 1


0

1

−12 B C @ −3 A, 7

0

1 (d) B @ 2 −1

(f )

1

2 0C A, 2

0

7 1 3 B C B 2 C B C, B C @ 5 A 5 −3

(g)

0 B B B @

1

−3 − 27 C C . 11 C −2 A 1 1

0 1C C C. 0A −2

0

1

(f )

3 −1 1.6.2. AT = B @ −1

1 2C A, 1

(e)

B @

1 2C A, −3 0

T

1 2

3 4

1

T

!

5 , 6

!

−2 2

T

(A B) = B A =

B @

(g) −1 2

BT =

0

0 , 6

2 0

1 2 −1

0 3 2

!

1

1 1C A. 5

−3 , 4 T

T

(B A) = A B

T

=

0 B @

−1 5 3

1

6 −2 −2

−5 11 C A. 7

1.6.3. If $A$ has size m × n and $B$ has size n × p, then $(A B)^T$ has size p × m. Further, $A^T$ has size n × m and $B^T$ has size p × n, and so unless m = p the product $A^T B^T$ is not defined. If m = p, then $A^T B^T$ has size n × n, and so to equal $(A B)^T$, we must have m = n = p, so the matrices are square. Finally, taking the transpose of both sides, $A B = (A^T B^T)^T = (B^T)^T (A^T)^T = B A$, and so they must commute.

♦ 1.6.4. The (i, j) entry of $C = (A B)^T$ is the (j, i) entry of $A B$, so
$$c_{ij} = \sum_{k=1}^{n} a_{jk} b_{ki} = \sum_{k=1}^{n} \tilde{b}_{ik} \tilde{a}_{kj},$$
where $\tilde{a}_{ij} = a_{ji}$ and $\tilde{b}_{ij} = b_{ji}$ are the entries of $A^T$ and $B^T$ respectively. Thus, $c_{ij}$ equals the (i, j) entry of the product $B^T A^T$.

1.6.5. $(A B C)^T = C^T B^T A^T$.

1.6.6. False. For example, $\begin{pmatrix} 1&1\\ 0&1 \end{pmatrix}$ does not commute with its transpose.
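The transpose rule of Exercise 1.6.4 and the counterexample of Exercise 1.6.6 can both be checked in a few lines. A minimal sketch with hypothetical helpers `transpose` and `matmul`; the rectangular matrices `A` and `B` are arbitrary illustrative choices, not taken from the text.

```python
def transpose(M):
    """Transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*M)]


def matmul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]


# (A B)^T = B^T A^T for a 2 x 3 times 3 x 2 product (illustrative entries)
A = [[1, 2, 3], [4, 5, 6]]
B = [[1, 0], [-1, 2], [3, 1]]
rule_holds = transpose(matmul(A, B)) == matmul(transpose(B), transpose(A))

# the matrix of Exercise 1.6.6 does not commute with its transpose
N = [[1, 1], [0, 1]]
N_NT = matmul(N, transpose(N))
NT_N = matmul(transpose(N), N)
```

Note that the product in the other order, $A^T B^T$, is a 3 × 3 matrix here, so it cannot equal the 2 × 2 matrix $(A B)^T$, as Exercise 1.6.3 explains.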

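♦ 1.6.7. If $A = \begin{pmatrix} a&b\\ c&d \end{pmatrix}$, then $A^T A = A A^T$ if and only if $b^2 = c^2$ and $(a - d)(b - c) = 0$. So either $b = c$, or $c = -b \neq 0$ and $a = d$. Thus all normal 2 × 2 matrices are of the form $\begin{pmatrix} a&b\\ b&d \end{pmatrix}$ or $\begin{pmatrix} a&b\\ -b&a \end{pmatrix}$.

1.6.8.
(a) $(A B)^{-T} = ((A B)^T)^{-1} = (B^T A^T)^{-1} = (A^T)^{-1} (B^T)^{-1} = A^{-T} B^{-T}$.
(b) $A B = \begin{pmatrix} 1&0\\ 2&1 \end{pmatrix}$, so $(A B)^{-T} = \begin{pmatrix} 1&-2\\ 0&1 \end{pmatrix}$, while $A^{-T} = \begin{pmatrix} 0&-1\\ 1&1 \end{pmatrix}$, $B^{-T} = \begin{pmatrix} 1&-1\\ -1&2 \end{pmatrix}$, so $A^{-T} B^{-T} = \begin{pmatrix} 1&-2\\ 0&1 \end{pmatrix}$.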

1.6.9. If $A$ is invertible, then so is $A^T$ by Lemma 1.32; then by Lemma 1.21 $A A^T$ and $A^T A$ are invertible.

1.6.10. No; for example, $\begin{pmatrix} 1\\ 2 \end{pmatrix} ( 3 \;\; 4 ) = \begin{pmatrix} 3&4\\ 6&8 \end{pmatrix}$ while $\begin{pmatrix} 3\\ 4 \end{pmatrix} ( 1 \;\; 2 ) = \begin{pmatrix} 3&6\\ 4&8 \end{pmatrix}$.

1.6.11. No. In general, $B^T A$ is the transpose of $A^T B$.

♦ 1.6.12.
(a) The ith entry of $A e_j$ is the product of the ith row of $A$ with $e_j$. Since all the entries in $e_j$ are zero except the jth entry, the product will be equal to $a_{ij}$, i.e., the (i, j) entry of $A$.
(b) By part (a), $\hat{e}_i^T A e_j$ is the product of the row matrix $\hat{e}_i^T$ and the jth column of $A$. Since all the entries in $\hat{e}_i^T$ are zero except the ith entry, multiplication by the jth column of $A$ will produce $a_{ij}$.

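♦ 1.6.13.
(a) Using Exercise 1.6.12, $a_{ij} = \hat{e}_i^T A e_j = \hat{e}_i^T B e_j = b_{ij}$ for all i, j.
(b) Two examples: $A = \begin{pmatrix} 1&2\\ 0&1 \end{pmatrix}$, $B = \begin{pmatrix} 1&1\\ 1&1 \end{pmatrix}$; $A = \begin{pmatrix} 0&0\\ 0&0 \end{pmatrix}$, $B = \begin{pmatrix} 0&-1\\ 1&0 \end{pmatrix}$.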

♦ 1.6.14.
(a) If $p_{ij} = 1$, then $P A$ maps the jth row of $A$ to its ith row. Then $Q = P^T$ has $q_{ji} = 1$, and so it does the reverse, mapping the ith row of $A$ to its jth row. Since this holds for all such entries, the result follows.
(b) No. Any rotation matrix $\begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix}$ also has this property. See Section 5.3.

♦ 1.6.15.
(a) Note that $(A P^T)^T = P A^T$, which permutes the rows of $A^T$, which are the columns of $A$, according to the permutation $P$.
(b) The effect of multiplying $P A P^T$ is equivalent to simultaneously permuting rows and columns of $A$ according to the permutation $P$. Associativity of matrix multiplication implies that it doesn’t matter whether the rows or the columns are permuted first.

♥ 1.6.16.
(a) Note that $w^T v$ is a scalar, and so
$$A A^{-1} = ( I - v w^T )( I - c\, v w^T ) = I - (1 + c)\, v w^T + c\, v (w^T v) w^T = I - (1 + c - c\, w^T v)\, v w^T = I$$
provided $c = 1/(v^T w - 1)$, which works whenever $w^T v \neq 1$.
(b) $A = I - v w^T = \begin{pmatrix} 2&-2\\ 3&-5 \end{pmatrix}$ and $c = \dfrac{1}{v^T w - 1} = \frac14$, so $A^{-1} = I - \frac14 v w^T = \begin{pmatrix} \frac54 & -\frac12 \\ \frac34 & -\frac12 \end{pmatrix}$.
(c) If $v^T w = 1$ then $A$ is singular, since $A v = 0$ and $v \neq 0$, and so the homogeneous system does not have a unique solution.
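The rank-one inverse formula of Exercise 1.6.16 lends itself to a direct check. The sketch below assumes v = (1, 3) and w = (-1, 2), which is one choice consistent with the matrix A in part (b) (the vectors themselves are not shown in the solution); `rank_one_inverse` and `matmul` are hypothetical names.

```python
from fractions import Fraction


def matmul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]


def rank_one_inverse(v, w):
    """Inverse of I - v w^T, namely I - c v w^T with c = 1 / (v.w - 1)."""
    dot = sum(a * b for a, b in zip(v, w))
    if dot == 1:
        raise ValueError("I - v w^T is singular when w^T v = 1")
    c = Fraction(1, dot - 1)
    n = len(v)
    return [[int(i == j) - c * v[i] * w[j] for j in range(n)] for i in range(n)]


v, w = [1, 3], [-1, 2]                       # assumed vectors, see lead-in
A = [[int(i == j) - v[i] * w[j] for j in range(2)] for i in range(2)]
A_inv = rank_one_inverse(v, w)
```

The singular case of part (c) shows up in the code as the guard on `dot == 1`, where the constant c would be undefined.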

1.6.17. (a) a = 1; (b) a = −1, b = 2, c = 3; (c) a = −2, b = −1, c = −5.

1.6.18.
(a) $\begin{pmatrix} 1&0&0\\ 0&1&0\\ 0&0&1 \end{pmatrix}$, $\begin{pmatrix} 0&1&0\\ 1&0&0\\ 0&0&1 \end{pmatrix}$, $\begin{pmatrix} 0&0&1\\ 0&1&0\\ 1&0&0 \end{pmatrix}$, $\begin{pmatrix} 1&0&0\\ 0&0&1\\ 0&1&0 \end{pmatrix}$.
(b) $\begin{pmatrix} 1&0&0&0\\ 0&1&0&0\\ 0&0&1&0\\ 0&0&0&1 \end{pmatrix}$, $\begin{pmatrix} 0&1&0&0\\ 1&0&0&0\\ 0&0&1&0\\ 0&0&0&1 \end{pmatrix}$, $\begin{pmatrix} 0&0&1&0\\ 0&1&0&0\\ 1&0&0&0\\ 0&0&0&1 \end{pmatrix}$, $\begin{pmatrix} 0&0&0&1\\ 0&1&0&0\\ 0&0&1&0\\ 1&0&0&0 \end{pmatrix}$, $\begin{pmatrix} 1&0&0&0\\ 0&0&1&0\\ 0&1&0&0\\ 0&0&0&1 \end{pmatrix}$,
$\begin{pmatrix} 1&0&0&0\\ 0&0&0&1\\ 0&0&1&0\\ 0&1&0&0 \end{pmatrix}$, $\begin{pmatrix} 1&0&0&0\\ 0&1&0&0\\ 0&0&0&1\\ 0&0&1&0 \end{pmatrix}$, $\begin{pmatrix} 0&1&0&0\\ 1&0&0&0\\ 0&0&0&1\\ 0&0&1&0 \end{pmatrix}$, $\begin{pmatrix} 0&0&1&0\\ 0&0&0&1\\ 1&0&0&0\\ 0&1&0&0 \end{pmatrix}$, $\begin{pmatrix} 0&0&0&1\\ 0&0&1&0\\ 0&1&0&0\\ 1&0&0&0 \end{pmatrix}$.

1.6.19. True, since $(A^2)^T = (A A)^T = A^T A^T = A A = A^2$.

♦ 1.6.20. True. Invert both sides of the equation $A^T = A$, and use Lemma 1.32.

♦ 1.6.21. False. For example, $\begin{pmatrix} 0&1\\ 1&0 \end{pmatrix} \begin{pmatrix} 2&1\\ 1&3 \end{pmatrix} = \begin{pmatrix} 1&3\\ 2&1 \end{pmatrix}$.

1.6.22. (a) If $D$ is a diagonal matrix, then for all $i \neq j$ we have $a_{ij} = a_{ji} = 0$, so $D$ is symmetric. (b) If $L$ is lower triangular then $a_{ij} = 0$ for $i < j$; if it is symmetric then $a_{ji} = 0$ for $i < j$, so $L$ is diagonal. If $L$ is diagonal, then $a_{ij} = 0$ for $i < j$, so $L$ is lower triangular and it is symmetric.

1.6.23. (a) Since $A$ is symmetric we have $(A^n)^T = (A A \cdots A)^T = A^T A^T \cdots A^T = A A \cdots A = A^n$. (b) $(2 A^2 - 3 A + I)^T = 2 (A^2)^T - 3 A^T + I = 2 A^2 - 3 A + I$. (c) If $p(A) = c_n A^n + \cdots + c_1 A + c_0 I$, then $p(A)^T = ( c_n A^n + \cdots + c_1 A + c_0 I )^T = c_n (A^T)^n + \cdots + c_1 A^T + c_0 I = p(A^T)$. In particular, if $A = A^T$, then $p(A)^T = p(A^T) = p(A)$.

1.6.24. If $A$ has size m × n, then $A^T$ has size n × m and so both products are defined. Also, $K^T = (A^T A)^T = A^T (A^T)^T = A^T A = K$ and $L^T = (A A^T)^T = (A^T)^T A^T = A A^T = L$.

1.6.25. (a)

1 1

(b)

−2 3

(c)

0

1 4

1

B @ −1 0

−1

1 B B −1 (d) B @ 0 3 1.6.26.

!

=

3 −1

!

−1 3 2

0 1

!

1 0

1

=

− 23

1

0

0 3

0 1

!

−1 1 B 2C A = @ −1 −1 0

−1 2 2 0

M2 =

1 1

0 2 −1 0

1 1 2

0 1

1

0

!

1 0

M4 =

0 3 2 0

0 1

1 2

!0

1

B1 B B2 B B 0 @

0

♦ 1.6.27. The matrix is not regular, More0explicitly,1if 0 1 0 0 p B C L = @a 1 0A, D = B @0 b c 1 0

1 1

−2 0

3 1 B 0C C B −1 C=B 0A @ 0 1 3

!

1 0

@1

0

0 1 2 3

0

0

!

,

!

7 2 10

0 1 B 0C A@ 0 1 0

0 1 2 3

0 0 1

6 5 1

1 2 A,

1

!

− 23 , 1 10 1 0 0 B 2 0C A@ 0 0 0 − 32

1 0

10

1 0 B 0C CB 0 CB 0 A@ 0 1 0

3 4

0

0 0 −5 0

1

0 1

B1 2

M3 = B @

10

2 0 CB B C 0 CB 0 CB B 0C A@ 0 0 1

0 0 1

0 1 0 0

0 3 2

0 0

−1 1 0

2 3

0

−1

1

10

0 1 B 0C CB 0 CB 0 A@ 0 0 − 49 5 10

0 2 CB B0 0C A@ 0 1

10

1 0 CB B C 0 CB 0 CB B 0C A@ 0

0 0 4 3

5 4

0

1

1C 2 A,

0

1 2

1 0 0

0

3 2

0 0 2 3

1 0

−1 1 0 0

0 2 1 0

10 0 B1 CB 0C AB @0 4 0 3 1

0 C 0C C

1

3 3C C . 6C A 5

1

1 2

1 0

1

0C 2C C, 3A 1

C. 3C 4A

1

since after the first set of row operations the (2, 2) entry is 0. 1

0

0 0C A, r

T

p

1

ap bp a p+q abp + cq C then L D L = A. 2 2 bp abp + cq b p + c q + r Equating this to A, the (1, 1) entry requires p = 1, and so the (1, 2) entry requires a = 2, but the (2, 2) entry then implies q = 0, which is not an allowed diagonal entry for D. Even if we ignore this, the (1, 3) entry would set b = 1, but then the (2, 3) entry says a b p + c q = 2 6= −1, which is a contradiction. 0 q 0

B @ ap

2

♦ 1.6.28. Write $A = L D V$; then $A^T = V^T D L^T = \tilde{L} \tilde{U}$, where $\tilde{L} = V^T$ and $\tilde{U} = D L^T$. Thus, $A^T$ is regular since the diagonal entries of $\tilde{U}$, which are the pivots of $A^T$, are the same as those of $D$ and $U$, which are the pivots of $A$.

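♥ 1.6.29. (a) The diagonal entries satisfy $j_{ii} = - j_{ii}$ and so must be 0. (b) $\begin{pmatrix} 0&1\\ -1&0 \end{pmatrix}$. (c) No, because the (1, 1) entry is always 0. (d) Invert both sides of the equation $J^T = - J$ and use Lemma 1.32. (e) $(J^T)^T = J = - J^T$, $(J \pm K)^T = J^T \pm K^T = - J \mp K = - (J \pm K)$. $J K$ is not, in general, skew-symmetric; for instance $\begin{pmatrix} 0&1\\ -1&0 \end{pmatrix} \begin{pmatrix} 0&1\\ -1&0 \end{pmatrix} = \begin{pmatrix} -1&0\\ 0&-1 \end{pmatrix}$. (f) Since it is a scalar, $v^T J v = (v^T J v)^T = v^T J^T (v^T)^T = - v^T J v$ equals its own negative, and so is zero.

1.6.30.
(a) Let $S = \frac12 (A + A^T)$, $J = \frac12 (A - A^T)$. Then $S^T = S$, $J^T = - J$, and $A = S + J$.
(b) $\begin{pmatrix} 1&2\\ 3&4 \end{pmatrix} = \begin{pmatrix} 1&\frac52\\ \frac52&4 \end{pmatrix} + \begin{pmatrix} 0&-\frac12\\ \frac12&0 \end{pmatrix}$; $\begin{pmatrix} 1&2&3\\ 4&5&6\\ 7&8&9 \end{pmatrix} = \begin{pmatrix} 1&3&5\\ 3&5&7\\ 5&7&9 \end{pmatrix} + \begin{pmatrix} 0&-1&-2\\ 1&0&-1\\ 2&1&0 \end{pmatrix}$.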
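The splitting in Exercise 1.6.30(a) is a one-line computation per entry. A minimal sketch, assuming integer input matrices stored as lists of rows; the function name `symmetric_skew_parts` is hypothetical.

```python
from fractions import Fraction


def symmetric_skew_parts(A):
    """Return (S, J) with S = (A + A^T)/2 symmetric and J = (A - A^T)/2 skew-symmetric."""
    n = len(A)
    S = [[Fraction(A[i][j] + A[j][i], 2) for j in range(n)] for i in range(n)]
    J = [[Fraction(A[i][j] - A[j][i], 2) for j in range(n)] for i in range(n)]
    return S, J


S2, J2 = symmetric_skew_parts([[1, 2], [3, 4]])
S3, J3 = symmetric_skew_parts([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
```

The diagonal of each skew-symmetric part is zero, in agreement with Exercise 1.6.29(a).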

1.7.1. 19 (a) The solution is x = − 10 7 , y = − 7 . Gaussian Elimination and Back Substitution requires 2 multiplications and Gauss–Jordan also uses 2 multiplications and 3 0 3 additions; 1 additions; finding A−1 =

@

1 7 3 7

2 7A 1 7

by the Gauss–Jordan method1requires 2 0 additions 1 0 ! − 1 2 10 − 4 7 7 7 A = @ 19 A and 4 multiplications, while computing the solution x = @ −7 − 37 17 −7 takes another 4 multiplications and 2 additions. (b) The solution is x = −4, y = −5, z = −1. Gaussian Elimination and Back Substitution requires 17 multiplications and 110additions; Gauss–Jordan uses 20 multiplications 1 0 −1 −1 C and 11 additions; computing A−1 = B @ 2 −8 −5 A takes 27 multiplications and 12 3 2 −5 −3 additions, while multiplying A−1 b = x takes another 9 multiplications and 6 additions. (c) The solution is x = 2, y = 1, z = 25 . Gaussian Elimination and Back Substitution requires 6 multiplications and 5 additions; Gauss–Jordan is the same: 6 multiplications 0 1 3 3 1 2 2C B − 2 1 1C B − 1 C takes 11 multiplications and 3 and 5 additions; computing A−1 = B 2 2 2A @ 2 1 − 5 0 −5 −1 additions, while multiplying A b = x takes another 8 multiplications and 5 additions.

1.7.2.
(a) For a general matrix $A$, each entry of $A^2$ requires n multiplications and n − 1 additions, for a total of $n^3$ multiplications and $n^3 - n^2$ additions, and so, when compared with the efficient version of the Gauss–Jordan algorithm, takes exactly the same amount of computation.
(b) $A^3 = A^2 A$ requires a total of $2 n^3$ multiplications and $2 n^3 - 2 n^2$ additions, and so is about twice as slow.
(c) You can compute $A^4$ as $A^2 A^2$, and so only 2 matrix multiplications are required. In general, if $2^r \le k < 2^{r+1}$ has j ones in its binary representation, then you need r multiplications to compute $A^2, A^4, A^8, \ldots, A^{2^r}$ followed by j − 1 multiplications to form $A^k$ as a product of these particular powers, for a total of r + j − 1 matrix multiplications, and hence a total of $(r + j - 1) n^3$ multiplications and $(r + j - 1) n^2 (n - 1)$ additions. See Exercise 1.7.8 and [11] for more sophisticated ways to speed up the computation.

1.7.3. Back Substitution requires about one half the number of arithmetic operations as multiplying a matrix times a vector, and so is twice as fast.

♦ 1.7.4. We begin by proving (1.61). We must show that $1 + 2 + 3 + \cdots + (n - 1) = n(n - 1)/2$ for n = 2, 3, . . .. For n = 2 both sides equal 1. Assume that (1.61) is true for n = k. Then $1 + 2 + 3 + \cdots + (k - 1) + k = k(k - 1)/2 + k = k(k + 1)/2$, so (1.61) is true for n = k + 1. Now the first equation in (1.62) follows if we note that $1 + 2 + 3 + \cdots + (n - 1) + n = n(n + 1)/2$.
Next we prove the first equation in (1.60), namely $2 + 6 + 12 + \cdots + (n - 1) n = \frac13 n^3 - \frac13 n$ for n = 2, 3, . . .. For n = 2 both sides equal 2. Assume that the formula is true for n = k. Then $2 + 6 + 12 + \cdots + (k - 1) k + k (k + 1) = \frac13 k^3 - \frac13 k + k^2 + k = \frac13 (k + 1)^3 - \frac13 (k + 1)$, so the formula is true for n = k + 1, which completes the induction step. The proof of the second equation is similar, or, alternatively, one can use the first equation and (1.61) to show that
$$\sum_{j=1}^{n} (n - j)^2 = \sum_{j=1}^{n} (n - j)(n - j + 1) - \sum_{j=1}^{n} (n - j) = \frac{n^3 - n}{3} - \frac{n^2 - n}{2} = \frac{2 n^3 - 3 n^2 + n}{6}.$$
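The count r + j − 1 from Exercise 1.7.2(c) can be compared against an actual repeated-squaring loop. The following is a sketch under the assumption that k ≥ 1; the names `power_multiplication_count`, `matrix_power`, and `matmul` are hypothetical.

```python
def matmul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]


def power_multiplication_count(k):
    """r + j - 1, where 2**r <= k < 2**(r+1) and k has j ones in binary."""
    r = k.bit_length() - 1
    j = bin(k).count("1")
    return r + j - 1


def matrix_power(A, k):
    """Return (A**k, number of matrix multiplications used) for k >= 1."""
    count = 0
    result = None
    square = A                     # holds A, A^2, A^4, ... in turn
    while True:
        if k & 1:
            if result is None:
                result = square    # first factor: no multiplication needed
            else:
                result = matmul(result, square)
                count += 1
        k >>= 1
        if k == 0:
            return result, count
        square = matmul(square, square)
        count += 1


A = [[1, 1], [0, 1]]               # A**k = [[1, k], [0, 1]]
```

For k = 4 the loop uses 2 multiplications, matching the remark that $A^4 = A^2 A^2$.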

♥ 1.7.5. We may assume that the matrix is regular, so $P = I$, since row interchanges have no effect on the number of arithmetic operations.
(a) First, according to (1.60), it takes $\frac13 n^3 - \frac13 n$ multiplications and $\frac13 n^3 - \frac12 n^2 + \frac16 n$ additions to factor $A = L U$. To solve $L c_j = e_j$ by Forward Substitution, the first j − 1 entries of $c_j$ are automatically 0, the jth entry is 1, and then, for k = j + 1, . . . , n, we need k − j − 1 multiplications and the same number of additions to compute the kth entry, for a total of $\frac12 (n - j)(n - j - 1)$ multiplications and additions to find $c_j$. Similarly, to solve $U x_j = c_j$ for the jth column of $A^{-1}$ requires $\frac12 n^2 + \frac12 n$ multiplications and, since the first j − 1 entries of $c_j$ are 0, also $\frac12 n^2 - \frac12 n - j + 1$ additions. The grand total is $n^3$ multiplications and $n (n - 1)^2$ additions.
(b) Starting with the large augmented matrix $M = ( A \mid I )$, it takes $\frac12 n^2 (n - 1)$ multiplications and $\frac12 n (n - 1)^2$ additions to reduce it to triangular form $( U \mid C )$ with $U$ upper triangular and $C$ lower triangular, then $n^2$ multiplications to obtain the special upper triangular form $( V \mid B )$, and then $\frac12 n^2 (n - 1)$ multiplications and, since $B$ is lower triangular, $\frac12 n (n - 1)^2$ additions to produce the final matrix $( I \mid A^{-1} )$. The grand total is $n^3$ multiplications and $n (n - 1)^2$ additions. Thus, both methods take the same amount of work.
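The grand totals in part (a) come from summing the per-column counts over j = 1, ..., n. The sketch below simply adds up the expressions quoted in the solution with exact fractions and compares them with $n^3$ and $n(n-1)^2$; it checks the algebra only and does not count operations in a real elimination. The function name `lu_inverse_counts` is hypothetical.

```python
from fractions import Fraction


def lu_inverse_counts(n):
    """Sum the operation counts quoted in Exercise 1.7.5(a) for an n x n matrix."""
    # factoring A = L U
    mults = Fraction(n**3 - n, 3)
    adds = Fraction(n**3, 3) - Fraction(n**2, 2) + Fraction(n, 6)
    for j in range(1, n + 1):
        # Forward Substitution for L c_j = e_j
        forward = Fraction((n - j) * (n - j - 1), 2)
        mults += forward
        adds += forward
        # Back Substitution for U x_j = c_j
        mults += Fraction(n * n + n, 2)
        adds += Fraction(n * n - n, 2) - j + 1
    return mults, adds


totals = {n: lu_inverse_counts(n) for n in range(1, 31)}
```

Agreement for a range of n is of course not a proof, but it is a quick guard against slips in the summation.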

1.7.6. Combining (1.60–61), we see that it takes $\frac13 n^3 + \frac12 n^2 - \frac56 n$ multiplications and $\frac13 n^3 - \frac13 n$ additions to reduce the augmented matrix to upper triangular form $( U \mid c )$. Dividing the jth row by its pivot requires n − j + 1 multiplications, for a total of $\frac12 n^2 + \frac12 n$ multiplications to produce the special upper triangular form $( V \mid e )$. To produce the solved form $( I \mid d )$ requires an additional $\frac12 n^2 - \frac12 n$ multiplications and the same number of additions, for a grand total of $\frac13 n^3 + \frac32 n^2 - \frac56 n$ multiplications and $\frac13 n^3 + \frac12 n^2 - \frac56 n$ additions needed to solve the system.

1.7.7. Less efficient, by, roughly, a factor of $\frac32$. It takes $\frac12 n^3 + n^2 - \frac12 n$ multiplications and $\frac12 n^3 - \frac12 n$ additions.

♥ 1.7.8.
(a) $D_1 + D_3 - D_4 - D_6 = (A_1 + A_4)(B_1 + B_4) + (A_2 - A_4)(B_3 + B_4) - (A_1 + A_2) B_4 - A_4 (B_1 - B_3) = A_1 B_1 + A_2 B_3 = C_1$,
$D_4 + D_7 = (A_1 + A_2) B_4 + A_1 (B_2 - B_4) = A_1 B_2 + A_2 B_4 = C_2$,
$D_5 - D_6 = (A_3 + A_4) B_1 - A_4 (B_1 - B_3) = A_3 B_1 + A_4 B_3 = C_3$,
$D_1 - D_2 - D_5 + D_7 = (A_1 + A_4)(B_1 + B_4) - (A_1 - A_3)(B_1 + B_2) - (A_3 + A_4) B_1 + A_1 (B_2 - B_4) = A_3 B_2 + A_4 B_4 = C_4$.
(b) To compute $D_1, \ldots, D_7$ requires 7 multiplications and 10 additions; then to compute $C_1, C_2, C_3, C_4$ requires an additional 8 additions for a total of 7 multiplications and 18 additions. The traditional method for computing the product of two 2 × 2 matrices requires 8 multiplications and 4 additions.
(c) The method requires 7 multiplications and 18 additions of n × n matrices, for a total of $7 n^3$ multiplications and $7 n^2 (n - 1) + 18 n^2 \approx 7 n^3$ additions, versus $8 n^3$ multiplications and $8 n^2 (n - 1) \approx 8 n^3$ additions for the direct method, so there is a savings by a factor of $\frac87$.
(d) Let $\mu_r$ denote the number of multiplications and $\alpha_r$ the number of additions to compute the product of $2^r \times 2^r$ matrices using Strassen’s Algorithm. Then, $\mu_r = 7 \mu_{r-1}$, while $\alpha_r = 7 \alpha_{r-1} + 18 \cdot 2^{2r-2}$, where the first term comes from multiplying the blocks, and the second from adding them. Since $\mu_1 = 1$, $\alpha_1 = 0$. Clearly, $\mu_r = 7^r$, while an induction proves the formula for $\alpha_r = 6 (7^{r-1} - 4^{r-1})$, namely $\alpha_{r+1} = 7 \alpha_r + 18 \cdot 4^{r-1} = 6 (7^r - 7 \cdot 4^{r-1}) + 18 \cdot 4^{r-1} = 6 (7^r - 4^r)$. Combining the operations, Strassen’s Algorithm is faster by a factor of
$$\frac{2 n^3}{\mu_r + \alpha_r} = \frac{2^{3r+1}}{13 \cdot 7^{r-1} - 6 \cdot 4^{r-1}},$$
which, for r = 10, equals 4.1059, for r = 25, equals 30.3378, and, for r = 100, equals 678,234, which is a remarkable savings; but bear in mind that the matrices have size around $10^{30}$, which is astronomical!
(e) One way is to use block matrix multiplication, in the trivial form $\begin{pmatrix} A & O \\ O & I \end{pmatrix} \begin{pmatrix} B & O \\ O & I \end{pmatrix} = \begin{pmatrix} C & O \\ O & I \end{pmatrix}$, where $C = A B$. Thus, choosing $I$ to be an identity matrix of the appropriate size, the overall size of the block matrices can be arranged to be a power of 2, and then the reduction algorithm can proceed on the larger matrices. Another approach, trickier to program, is to break the matrix up into blocks of nearly equal size since the Strassen formulas do not, in fact, require the blocks to have the same size and even apply to rectangular matrices whose rectangular blocks are of compatible sizes.
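The seven products used in part (a) can be written out and tested on a small example. This sketch applies the formulas to a 2 × 2 product with scalar entries; the definitions of $D_1, \ldots, D_7$ are read off from the identities in part (a), and the function name `strassen_2x2` is hypothetical. With matrix blocks in place of scalars the same formulas apply, provided each product keeps the A factor on the left, as it does here.

```python
def strassen_2x2(A, B):
    """Multiply [[A1, A2], [A3, A4]] by [[B1, B2], [B3, B4]] using 7 products."""
    (A1, A2), (A3, A4) = A
    (B1, B2), (B3, B4) = B
    D1 = (A1 + A4) * (B1 + B4)
    D2 = (A1 - A3) * (B1 + B2)
    D3 = (A2 - A4) * (B3 + B4)
    D4 = (A1 + A2) * B4
    D5 = (A3 + A4) * B1
    D6 = A4 * (B1 - B3)
    D7 = A1 * (B2 - B4)
    C1 = D1 + D3 - D4 - D6     # = A1 B1 + A2 B3
    C2 = D4 + D7               # = A1 B2 + A2 B4
    C3 = D5 - D6               # = A3 B1 + A4 B3
    C4 = D1 - D2 - D5 + D7     # = A3 B2 + A4 B4
    return [[C1, C2], [C3, C4]]


def direct_2x2(A, B):
    """Traditional 2 x 2 product with 8 multiplications."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


product = strassen_2x2([[1, 2], [3, 4]], [[5, 6], [7, 8]])
```

The block of seven products contains exactly 10 additions or subtractions and the four combinations contain 8 more, matching the count in part (b).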

1.7.9.

0

1 (a) B @ −1 0 0 1 B −1 B (b) B @ 0 0

2 −1 −2 −1 2 −1 0

1

0

0 1 0 B 1C 1 A = @ −1 3 0 −2 0 1 1 0 0 C 1 0C B B −1 C=B 4 1A @ 0 0 −1 6

10

1

0 1 2 0 B C 0C A @ 0 1 1 A, 1 0 0 5 10 1 0 0 0 B 1 0 0C CB 0 CB −1 1 0 A@ 0 0 0 −1 1 28

0

1

−2 C x=B @ 3 A; 0 1 −1 0 0 1 1 0C C C, 0 5 1A 0 0 7

x=

0

1

1

B C B0C B C; @1A

2

0

1 B −1 B (c) B @ 0 0 1.7.10. 0 (a)

2

B @ −1

0 2 B −1 B B @ 0 0 0

0

2 B −1 B B B 0 B @ 0 0

(b)

“

0

0 1 1 0

10

0 0 1

0 1 B 0C CB 0 CB 0 A@ 0 1 0

2 −1 0 0

−4 B 2C C B C x=B B − 2 C. @ 5A

0 0C C C, −1 A 5 −4

−1 2 −1 −1 2 −1 0

2 −1 0 1 0 0 0 3 C CB 0 B−1 1 0 −1 −1 C = A, A@ @ 2 A 2 2 4 2 00 − 3 1 0 0 103 1 1 1 0 0 0 2 −1 0 0 0 0 3 B−1 B 1 0 0C 0C B 2 C CB 0 −1 0C C 2 −1 C , CB C=B 2 4 C CB 0 2 −1 A B 1 0 −1 0 − 0 @ A A@ 3 3 3 5 −1 2 00 0 − 4 1 0 0 100 4 1 2 −1 0 1 0 0 0 0 0 0 0 3 CB B 1 1 0 0 0 −1 0 − C C B B 2 −1 0 0C B 2 CB 2 4 B B 0 1 0 0C C = B 0 −3 CB 0 2 −1 0C 3 C CB B 3 B −1 2 −1 A B 1 0C 0 −4 0 0 A@ 0 @ 0 4 0 −1 2 0 0 0 −5 1 0 0 0

” 3 T 2

− 41

, ( 2, 3, 3, 2 )T ,

“

5 9 2 , 4, 2 , 4,

− 35

1

10

0

1

1

0

1

0 0 4 0

0 0 4 −1

−1 2 −1 0 0

3 2 , 2,

1

1 0 B 0C −1 C B C=B −1 A @ 0 0 −1

2 −3 −1 0

” 5 T 2

0 0 −1 5 4

0

.

1

0 0C C C 0C C; C −1 C A 6 5

(c) The subdiagonal entries in L are l_{i+1,i} = − i/(i + 1) → −1, while the diagonal entries in U are u_{ii} = (i + 1)/i → 1.

♠ 1.7.11.

2 (a) B @ −1 0 0 2 B B −1 B @ 0 0 0

2 B1 B B B0 B @0 0 “

1 2 −1 1 2 −1 0 1 2 1 0 0

0 1 2 1 0

” 1 1 2 T , , 3 3 3

1

10

0

1

2 1 0 1 0 0 0 5 B B 1 1 0C 1C 1C A@ 0 2 A , A = @−2 2 12 2 00 − 5 1 0 0 1 50 1 1 0 0 0 2 1 0 0 B−1 CB 0 5 C 1 0 0 CB 1 0C B 2 2 CB C=B 2 B 2 1A B 0 1 0C @ 0 −5 A@ 0 5 −1 2 0 0 − 12 1 0100 0 1 2 1 0 0 0 0 0 0 B 1 CB 1 0 0 0 0 − B B C C 0 0C B 2 CB CB 0 C B 0 −2 1 0 0 CB =B 1 0C 5 CB C B 5 B 2 1A B 1 0C 0 − 12 A@ 0 @ 0 12 1 2 0 0 0 − 29 1 0 “

” 8 13 11 20 T , , , 29 29 29 29

“

” 3 2 1 2 7 T , , , , 10 5 2√5 10

(b) , , (c) The subdiagonal√entries in L approach 1 − U approach 1 + 2 = 2.414214. 1.7.12. Both false. 10 1 1 1 0 0 CB B 1 1 1 0 CB 1 B CB B @ 0 1 1 1 A@ 0 0 0 0 1 1 0

♠ 1.7.13.

0

4

B B1 @

1

1 4 1

For 1 1 1 0 1

example, 0 1 2 0 0 C 1 0C B B2 C=B 1 1A @1 0 1 1 0

1C B 1 B1 1C A=@4 1 4 4

0 1 1 5

2 3 2 1 10

4 0 CB B0 0C A@ 1 0

1 2 3 2

0

1

15 4

0

29

0 0C C C , 1C A

12 5

0 1

29 12

0 0 0

12 5

0 1

5 2

0 0

0 0 1

29 12

0

1

0 0C C C 0C C; C 1C A

70 29

. 2 = −.414214, and the diagonal entries in

0 1C C C, 2A 2 1

1

0 1

1

B B1 B @0

0

1

1

3C C 4A 18 5

1 1 1 0

0 1 1 1

1

0

0 −1 1 B 0C 0 C B C =B @ −1 1A 1 1

0 0 1 −1

−1 1 0 0

1

1 −1 C C C. 0A 1

0

0

4 B B B1 B B0 @ 1

4 B B1 B B B0 B B @0 1

1 4 1 0 0

For 4 B B1 B B B0 B B B0 B B @0 1

the 1 4 1 0 0 0

0

1 4 1 0 0 1 4 1 0

1

0

0 1C B 1 C B1 1 0C B 4 C=B 4 C B 1A @ 0 15 1 1 4 − 15 4 0 1 1 0 0 1 B1 C B 1 0 0C B4 C B C 4 B =B0 1 0C 15 C B C 4 1A B 0 @ 0 1 1 1 4 4 − 15

0 1 4 1

10

4 1 0 CB 15 C B 0 CB 0 4 CB B 0 0C A@ 0 2 0 0 7 1 10 4 0 0 0 CB B C 0 0 0 CB 0 CB B 1 0 0C CB 0 CB 15 B 1 0C A@ 0 56 1 5 0 56 19 1 0 0 1

6 × 6 version we have 0 1 1 0 0 0 0 1 B1 C B 1 B4 1 0 0 0C C B C 4 B 0 4 1 0 0C 15 C=B B C B C 0 1 4 1 0C B 0 B C 0 1 4 1A B 0 @ 0 1 1 0 0 1 4 4 − 15

0 0 1 15 56

0 1 56

0 0 0 1 56 209 1 − 209

0 1

1

0 1 15 4

0 0 0 0 0 0 0 1 7 26

C C C C 16 C 15 A 24 7

− 41

56 15

1

0 1

56 15

0 0 10

0 4 CB C B 0 CB 0 CB B 0C CB 0 CB B 0C CB 0 CB B 0C A@ 0 0 1

0 0 1 209 56

0 1 15 4

0 0 0 0

1 − 41 1 15 55 56 66 19

0 1 56 15

0 0 0

1 C C C C C C C C A

0 0 1 209 56

0 0

0 0 0 1 780 209

0

1 − 14

1 15 1 − 56 210 209 45 13

1 C C C C C C C C C C C A

The pattern is that the only nonzero entries of L lie on its diagonal, its subdiagonal or its last row, while the only nonzero entries of U lie on its diagonal, its superdiagonal or its last column.

♥ 1.7.14.
(a) Assuming regularity, the only row operations required to reduce A to upper triangular form U are, for each j = 1, . . . , n − 1, to add multiples of the j-th row to the (j + 1)-st and the n-th rows. Thus, the only nonzero entries below the diagonal in L are at positions (j + 1, j) and (n, j). Moreover, these row operations only affect zero entries in the last column, leading to the final form of U.
(b)
$$\begin{pmatrix} 1 & -1 & -1 \\ -1 & 2 & -1 \\ -1 & -1 & 3 \end{pmatrix} = \begin{pmatrix} 1 & 0 & 0 \\ -1 & 1 & 0 \\ -1 & -2 & 1 \end{pmatrix} \begin{pmatrix} 1 & -1 & -1 \\ 0 & 1 & -2 \\ 0 & 0 & -2 \end{pmatrix},$$
$$\begin{pmatrix} 1 & -1 & 0 & 0 & -1 \\ -1 & 2 & -1 & 0 & 0 \\ 0 & -1 & 3 & -1 & 0 \\ 0 & 0 & -1 & 4 & -1 \\ -1 & 0 & 0 & -1 & 5 \end{pmatrix} = \begin{pmatrix} 1 & 0 & 0 & 0 & 0 \\ -1 & 1 & 0 & 0 & 0 \\ 0 & -1 & 1 & 0 & 0 \\ 0 & 0 & -\frac12 & 1 & 0 \\ -1 & -1 & -\frac12 & -\frac37 & 1 \end{pmatrix} \begin{pmatrix} 1 & -1 & 0 & 0 & -1 \\ 0 & 1 & -1 & 0 & -1 \\ 0 & 0 & 2 & -1 & -1 \\ 0 & 0 & 0 & \frac72 & -\frac32 \\ 0 & 0 & 0 & 0 & \frac{13}{7} \end{pmatrix},$$
$$\begin{pmatrix} 1 & -1 & 0 & 0 & 0 & -1 \\ -1 & 2 & -1 & 0 & 0 & 0 \\ 0 & -1 & 3 & -1 & 0 & 0 \\ 0 & 0 & -1 & 4 & -1 & 0 \\ 0 & 0 & 0 & -1 & 5 & -1 \\ -1 & 0 & 0 & 0 & -1 & 6 \end{pmatrix} = \begin{pmatrix} 1 & 0 & 0 & 0 & 0 & 0 \\ -1 & 1 & 0 & 0 & 0 & 0 \\ 0 & -1 & 1 & 0 & 0 & 0 \\ 0 & 0 & -\frac12 & 1 & 0 & 0 \\ 0 & 0 & 0 & -\frac27 & 1 & 0 \\ -1 & -1 & -\frac12 & -\frac17 & -\frac{8}{33} & 1 \end{pmatrix} \begin{pmatrix} 1 & -1 & 0 & 0 & 0 & -1 \\ 0 & 1 & -1 & 0 & 0 & -1 \\ 0 & 0 & 2 & -1 & 0 & -1 \\ 0 & 0 & 0 & \frac72 & -1 & -\frac12 \\ 0 & 0 & 0 & 0 & \frac{33}{7} & -\frac87 \\ 0 & 0 & 0 & 0 & 0 & \frac{104}{33} \end{pmatrix}.$$
The 4 × 4 case is a singular matrix.
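The sparsity pattern described in part (a) can be confirmed with exact rational arithmetic. The sketch below is illustrative only: the helper names are hypothetical, and the test matrix is assumed to have diagonal entries 1, 2, . . . , n, off-diagonal entries −1, and −1 in the two corners, as in the matrices displayed in part (b).

```python
from fractions import Fraction


def lu_no_pivot(A):
    """Return (L, U) with A = L U, using only row operations of type #1.

    Raises ZeroDivisionError if a zero pivot appears, i.e. A is not regular.
    """
    n = len(A)
    U = [[Fraction(x) for x in row] for row in A]
    L = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    for j in range(n):
        if U[j][j] == 0:
            raise ZeroDivisionError("zero pivot: matrix is not regular")
        for i in range(j + 1, n):
            m = U[i][j] / U[j][j]      # multiplier stored in L
            L[i][j] = m
            for k in range(j, n):
                U[i][k] -= m * U[j][k]
    return L, U


def corner_tridiagonal(n):
    """Hypothetical helper: diagonal 1..n, off-diagonals -1, corners -1."""
    A = [[0] * n for _ in range(n)]
    for i in range(n):
        A[i][i] = i + 1
        if i + 1 < n:
            A[i][i + 1] = A[i + 1][i] = -1
    A[0][n - 1] = A[n - 1][0] = -1
    return A
```

Calling `lu_no_pivot(corner_tridiagonal(6))` gives factors whose only nonzero subdiagonal entries of L sit directly below the diagonal or in the last row, in line with part (a); for n = 4 the elimination stops with a zero pivot in the last position.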


♥ 1.7.15.
(a) If the matrix A is tridiagonal, then the only nonzero elements in its i-th row are $a_{i,i-1}$, $a_{ii}$, $a_{i,i+1}$. So $a_{ij} = 0$ whenever | i − j | > 1.
(b) For example,
$$\begin{pmatrix} 2&1&1&0&0&0 \\ 1&2&1&1&0&0 \\ 1&1&2&1&1&0 \\ 0&1&1&2&1&1 \\ 0&0&1&1&2&1 \\ 0&0&0&1&1&2 \end{pmatrix} \text{ has band width 2;} \qquad \begin{pmatrix} 2&1&1&1&0&0 \\ 1&2&1&1&1&0 \\ 1&1&2&1&1&1 \\ 1&1&1&2&1&1 \\ 0&1&1&1&2&1 \\ 0&0&1&1&1&2 \end{pmatrix} \text{ has band width 3.}$$
(c) U is the matrix that results from applying row operations of type #1 to A, so all zero entries of A lying above the band produce corresponding zero entries in U. On the other hand, if A has band width k, then for each column of A we need to perform no more than k row replacements to obtain zeros below the diagonal. Thus L, which records these row replacements, has at most k nonzero entries below the diagonal in each column.
(d)
$$\begin{pmatrix} 2&1&1&0&0&0 \\ 1&2&1&1&0&0 \\ 1&1&2&1&1&0 \\ 0&1&1&2&1&1 \\ 0&0&1&1&2&1 \\ 0&0&0&1&1&2 \end{pmatrix} = \begin{pmatrix} 1&0&0&0&0&0 \\ \frac12&1&0&0&0&0 \\ \frac12&\frac13&1&0&0&0 \\ 0&\frac23&\frac12&1&0&0 \\ 0&0&\frac34&\frac12&1&0 \\ 0&0&0&1&\frac12&1 \end{pmatrix} \begin{pmatrix} 2&1&1&0&0&0 \\ 0&\frac32&\frac12&1&0&0 \\ 0&0&\frac43&\frac23&1&0 \\ 0&0&0&1&\frac12&1 \\ 0&0&0&0&1&\frac12 \\ 0&0&0&0&0&\frac34 \end{pmatrix},$$
$$\begin{pmatrix} 2&1&1&1&0&0 \\ 1&2&1&1&1&0 \\ 1&1&2&1&1&1 \\ 1&1&1&2&1&1 \\ 0&1&1&1&2&1 \\ 0&0&1&1&1&2 \end{pmatrix} = \begin{pmatrix} 1&0&0&0&0&0 \\ \frac12&1&0&0&0&0 \\ \frac12&\frac13&1&0&0&0 \\ \frac12&\frac13&\frac14&1&0&0 \\ 0&\frac23&\frac12&\frac25&1&0 \\ 0&0&\frac34&\frac35&\frac14&1 \end{pmatrix} \begin{pmatrix} 2&1&1&1&0&0 \\ 0&\frac32&\frac12&\frac12&1&0 \\ 0&0&\frac43&\frac13&\frac23&1 \\ 0&0&0&\frac54&\frac12&\frac34 \\ 0&0&0&0&\frac45&\frac15 \\ 0&0&0&0&0&\frac34 \end{pmatrix}.$$
(e) $\left( \frac13, \frac13, 0, 0, \frac13, \frac13 \right)^T$, $\left( \frac23, \frac13, -\frac13, -\frac13, \frac13, \frac23 \right)^T$.
(f) For A we still need to compute k multipliers at each stage and update at most $2 k^2$ entries, so we have less than $(n-1)(k + 2 k^2)$ multiplications and $(n-1)\, 2 k^2$ additions. For the right-hand side we have to update at most k entries at each stage, so we have less than (n − 1) k multiplications and (n − 1) k additions. So we can get by with less than a total of $(n-1)(2 k + 2 k^2)$ multiplications and $(n-1)(k + 2 k^2)$ additions.
(g) The inverse of a banded matrix is not necessarily banded. For example, the inverse of
$$\begin{pmatrix} 2&1&0 \\ 1&2&1 \\ 0&1&2 \end{pmatrix} \quad \text{is} \quad \begin{pmatrix} \frac34 & -\frac12 & \frac14 \\ -\frac12 & 1 & -\frac12 \\ \frac14 & -\frac12 & \frac34 \end{pmatrix}.$$
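The example in part (g) is easy to reproduce with exact fractions. The following is a small sketch under the assumption that a plain Gauss–Jordan reduction of the augmented matrix ( A | I ) is acceptable; the function name `inverse` is hypothetical and is not taken from the text.

```python
from fractions import Fraction


def inverse(A):
    """Invert a nonsingular square matrix exactly by Gauss-Jordan elimination."""
    n = len(A)
    # Augment A with the identity matrix.
    M = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
         for i, row in enumerate(A)]
    for j in range(n):
        pivot_row = next((i for i in range(j, n) if M[i][j] != 0), None)
        if pivot_row is None:
            raise ValueError("matrix is singular")
        M[j], M[pivot_row] = M[pivot_row], M[j]
        piv = M[j][j]
        M[j] = [x / piv for x in M[j]]           # scale the pivot row
        for i in range(n):
            if i != j and M[i][j] != 0:
                f = M[i][j]
                M[i] = [a - f * b for a, b in zip(M[i], M[j])]
    return [row[n:] for row in M]
```

Applied to the tridiagonal matrix of part (g), every entry of the result is nonzero, so the band structure of A is lost in its inverse even though it is preserved in the factors L and U.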

1.7.16. (a) ( −8, 4 )T , (b) ( −10, −4.1 )T , (c) ( −8.1, −4.1 )T . (d) Partial pivoting reduces the effect of round off errors and results in a significantly more accurate answer.

1.7.17. (a) x = 11/7 ≈ 1.57143, y = 1/7 ≈ .142857, z = −1/7 ≈ −.142857, (b) x = 3.357, y = .5, z = −.1429, (c) x = 1.572, y = .1429, z = −.1429.

1.7.18. (a) x = −2, y = 2, z = 3, (b) x = −7.3, y = 3.3, z = 2.9, (c) x = −1.9, y = 2., z = 2.9, (d) partial pivoting works markedly better, especially for the value of x.

1.7.19. (a) x = −220., y = 26, z = .91; (b) x = −190., y = 24, z = .84; (c) x = −210, y = 26, z = 1. (d) The exact solution is x = −213.658, y = 25.6537, z = .858586. Full pivoting is the most accurate. Interestingly, partial pivoting fares a little worse than regular elimination.

1.7.20. (a)

1.7.21. (a)

(c)

0

1

0

1

6 B 5 B 13 B− 5 @ − 95

C C C A

0

1

1.2 C =B @ −2.6 A, −1.8

1 @ − 13 A = 8 13 0 1 2 121 B C B 38 C C B 121 C= B B 59 C C B @ 242 A 56 − 121

(b)

!

−.0769 , .6154 1

0

.0165 C B B .3141 C C, B @ .2438 A −.4628

0

−1 B 4 B 5 B− B 4 B 1 B @ 8 1 4 0

1

C C C C, C C A

4 B −5 B 8 (b) B @ − 15 − 19 15 0

0

1

0 B C B1C C, (c) B @1A 0 1 C C C A

0

(d)

0

− 32 B 35 B 19 B− B 35 B 12 B− @ 35 − 76 35

1

1 C C C C C C A

0

−.8000 C =B @ −.5333 A, −1.2667 1

−.732 C (d) B @ −.002 A. .508

1.7.22. The results are the same.

♠ 1.7.23.

Gaussian Elimination With Full Pivoting

start
  for i = 1 to n
    set σ(i) = τ(i) = i
  next i
  set τ(n + 1) = n + 1
  for j = 1 to n
    if m_{σ(i),τ(k)} = 0 for all i ≥ j and j ≤ k ≤ n, stop; print “A is singular”
    choose i ≥ j and j ≤ k ≤ n such that | m_{σ(i),τ(k)} | is maximal
    interchange σ(i) ←→ σ(j)
    interchange τ(k) ←→ τ(j)
    for i = j + 1 to n
      set z = m_{σ(i),τ(j)} / m_{σ(j),τ(j)}
      set m_{σ(i),τ(j)} = 0
      for k = j + 1 to n + 1
        set m_{σ(i),τ(k)} = m_{σ(i),τ(k)} − z m_{σ(j),τ(k)}
      next k
    next i
  next j
end
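The pseudocode translates almost line for line into a short program. The version below is a hedged sketch rather than a reference implementation: it works in floating point, the name `solve_full_pivoting` is hypothetical, the row and column permutations are kept as index lists `sigma` and `tau`, and a back substitution step (not part of the pseudocode above) is added so that the result can be checked.

```python
def solve_full_pivoting(A, b):
    """Solve A x = b by Gaussian Elimination with full pivoting (floating point)."""
    n = len(A)
    # Augmented matrix M = (A | b); column n holds the right-hand side.
    M = [[float(x) for x in row] + [float(rhs)] for row, rhs in zip(A, b)]
    sigma = list(range(n))   # row permutation
    tau = list(range(n))     # column permutation (order of the unknowns)
    for j in range(n):
        # Find the entry of largest absolute value in the remaining block.
        best, bi, bk = 0.0, j, j
        for i in range(j, n):
            for k in range(j, n):
                v = abs(M[sigma[i]][tau[k]])
                if v > best:
                    best, bi, bk = v, i, k
        if best == 0.0:
            raise ValueError("A is singular")
        sigma[bi], sigma[j] = sigma[j], sigma[bi]
        tau[bk], tau[j] = tau[j], tau[bk]
        pr, pc = sigma[j], tau[j]
        for i in range(j + 1, n):
            r = sigma[i]
            z = M[r][pc] / M[pr][pc]
            M[r][pc] = 0.0
            for k in range(j + 1, n):
                M[r][tau[k]] -= z * M[pr][tau[k]]
            M[r][n] -= z * M[pr][n]          # update the right-hand side
    # Back substitution in the permuted ordering.
    x = [0.0] * n
    for j in range(n - 1, -1, -1):
        r = sigma[j]
        s = M[r][n] - sum(M[r][tau[k]] * x[tau[k]] for k in range(j + 1, n))
        x[tau[j]] = s / M[r][tau[j]]
    return x
```

Storing the permutations instead of physically swapping rows and columns mirrors the use of σ and τ in the pseudocode; the unknowns come back in their original order because each value is written to position `tau[j]`.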


1

−.9143 B C B −.5429 C C. =B @ −.3429 A −2.1714

♠ 1.7.24. We let x ∈ R^n be generated using a random number generator, compute b = H_n x and then solve H_n y = b for y. The error is e = x − y and we use e⋆ = max | e_i | as a measure of the overall error.

Using Matlab, running Gaussian Elimination with pivoting:

  n     10            20         50          100
  e⋆    .00097711     35.5111    318.3845    1771.1

Using Mathematica, running regular Gaussian Elimination:

  n     10            20         50          100
  e⋆    .000309257    19.8964    160.325     404.625

In Mathematica, using the built-in LinearSolve function, which is more accurate since it uses a more sophisticated solution method when confronted with an ill-conditioned linear system:

  n     10            20         50          100
  e⋆    .00035996     .620536    .65328      .516865

(Of course, the errors vary a bit each time the program is run due to the randomness of the choice of x.)

♠ 1.7.25.

(a)
$$H_3^{-1} = \begin{pmatrix} 9 & -36 & 30 \\ -36 & 192 & -180 \\ 30 & -180 & 180 \end{pmatrix}, \qquad H_4^{-1} = \begin{pmatrix} 16 & -120 & 240 & -140 \\ -120 & 1200 & -2700 & 1680 \\ 240 & -2700 & 6480 & -4200 \\ -140 & 1680 & -4200 & 2800 \end{pmatrix},$$
$$H_5^{-1} = \begin{pmatrix} 25 & -300 & 1050 & -1400 & 630 \\ -300 & 4800 & -18900 & 26880 & -12600 \\ 1050 & -18900 & 79380 & -117600 & 56700 \\ -1400 & 26880 & -117600 & 179200 & -88200 \\ 630 & -12600 & 56700 & -88200 & 44100 \end{pmatrix}.$$
(b) The same results are obtained when using floating point arithmetic in either Mathematica or Matlab.
(c) The product $\widetilde K_{10} H_{10}$, where $\widetilde K_{10}$ is the computed inverse, is fairly close to the 10 × 10 identity matrix; the largest error is .0000801892 in Mathematica or .000036472 in Matlab. As for $\widetilde K_{20} H_{20}$, it is nowhere close to the identity matrix: in Mathematica the diagonal entries range from −1.34937 to 3.03755, while the largest (in absolute value) off-diagonal entry is 4.3505; in Matlab the diagonal entries range from −.4918 to 3.9942, while the largest (in absolute value) off-diagonal entry is −5.1994.
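The experiments of Exercises 1.7.24 and 1.7.25 can be repeated in a few lines. The sketch below makes several assumptions that are not in the text: it uses Python instead of Matlab or Mathematica, the helper names are hypothetical, the exact inverse is computed with rational arithmetic, and the floating point solve uses regular Gaussian Elimination without pivoting. The error values it produces depend on the random vector and on the arithmetic used, so they will not reproduce the tabulated numbers.

```python
from fractions import Fraction
import random


def hilbert(n):
    """The n x n Hilbert matrix with exact rational entries 1/(i + j - 1)."""
    return [[Fraction(1, i + j + 1) for j in range(n)] for i in range(n)]


def exact_inverse(A):
    """Gauss-Jordan inverse in exact arithmetic (A is assumed nonsingular)."""
    n = len(A)
    M = [list(row) + [Fraction(int(i == j)) for j in range(n)]
         for i, row in enumerate(A)]
    for j in range(n):
        p = next(i for i in range(j, n) if M[i][j] != 0)
        M[j], M[p] = M[p], M[j]
        piv = M[j][j]
        M[j] = [x / piv for x in M[j]]
        for i in range(n):
            if i != j:
                f = M[i][j]
                M[i] = [a - f * b for a, b in zip(M[i], M[j])]
    return [row[n:] for row in M]


def float_error(n, seed=0):
    """Return max |x_i - y_i| after solving H_n y = H_n x in floating point.

    Uses regular Gaussian Elimination; for large n a pivot may underflow to
    zero, in which case Python raises ZeroDivisionError.
    """
    rng = random.Random(seed)
    H = [[1.0 / (i + j + 1) for j in range(n)] for i in range(n)]
    x = [rng.random() for _ in range(n)]
    b = [sum(H[i][j] * x[j] for j in range(n)) for i in range(n)]
    M = [row[:] + [bi] for row, bi in zip(H, b)]
    for j in range(n):
        for i in range(j + 1, n):
            z = M[i][j] / M[j][j]
            for k in range(j, n + 1):
                M[i][k] -= z * M[j][k]
    y = [0.0] * n
    for i in range(n - 1, -1, -1):
        y[i] = (M[i][n] - sum(M[i][k] * y[k] for k in range(i + 1, n))) / M[i][i]
    return max(abs(xi - yi) for xi, yi in zip(x, y))
```

Comparing `exact_inverse(hilbert(n))` with a floating point inverse for growing n shows the same qualitative behavior as part (c): good agreement for small n and a rapid loss of accuracy as n increases.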

1.8.1.
(a) Unique solution: ( −1/2, −3/4 )^T;
(b) infinitely many solutions: ( 1 − 2 z, −1 + z, z )^T, where z is arbitrary;
(c) no solutions;
(d) unique solution: ( 1, −2, 1 )^T;
(e) infinitely many solutions: ( 5 − 2 z, 1, z, 0 )^T, where z is arbitrary;
(f) infinitely many solutions: ( 1, 0, 1, w )^T, where w is arbitrary;
(g) unique solution: ( 2, 1, 3, 1 )^T.

1.8.2. (a) Incompatible; (b) incompatible; (c) ( 1, 0 )^T; (d) ( 1 + 3 x2 − 2 x3, x2, x3 )^T, where x2 and x3 are arbitrary; (e) ( −15/2, 23, −10 )^T; (f) ( −5 − 3 x4, 19 − 4 x4, −6 − 2 x4, x4 )^T, where x4 is arbitrary; (g) incompatible.

1.8.3. The planes intersect at (1, 0, 0).

1.8.4. (i) a ≠ b and b ≠ 0; (ii) b = 0, a ≠ −2; (iii) a = b ≠ 0, or a = −2 and b = 0.

1.8.5. (a) b = 2, c ≠ −1 or b = 1/2, c ≠ 2; (b) b ≠ 2, 1/2; (c) b = 2, c = −1, or b = 1/2, c = 2.

1.8.6.
(a) ( 1 + i − ½ (1 + i) y, y, − i )^T, where y is arbitrary;
(b) ( 4 i z + 3 + i, i z + 2 − i, z )^T, where z is arbitrary;
(c) ( 3 + 2 i, −1 + 2 i, 3 i )^T;
(d) ( − z − (3 + 4 i) w, − z − (1 + i) w, z, w )^T, where z and w are arbitrary.

1.8.7. (a) 2, (b) 1, (c) 2, (d) 3, (e) 1, (f) 1, (g) 2, (h) 2, (i) 3.

1.8.8. (a) (b) (c) (d) (e) (f )

1 1

1 −2

!

!

!

1 0 1 1 = , 1 1 0 −3 ! ! ! 2 1 3 1 0 2 1 3 = , −2 −1 −3 −1 1 0 0 0 0 0 1 10 1 1 −1 1 1 0 0 1 −1 1 B B C CB 0 1C @ 1 −1 2 A = @ 1 1 0 A @ 0 A, −1 1 0 −1 1 1 0 0 0 10 0 10 1 0 1 0 0 2 −1 2 −1 0 1 0 0 CB B CB B1 3 1 −1 C @0 0 1A@1 A = @ 2 1 0 A@ 0 2 1 0 1 0 0 0 1 1 0 0 2 −1 10 1 1 0 3 1 0 0 3 CB C C B B @ 0 A = @ 0 1 0 A@ 0 A, 2 0 −3 0 1 −2 ( 0 −1 2 5 ) = ( 1 )( 0 −1 2 5 ), 0

0 B B1 (g) B @0 0 0 1 B2 B B1 (h) B B @4 0 0 0 (i) B @0 1

10

x + y = 1, (b) y + z = 0, x − z = 1.

x = 1, 1.8.9. (a) y = 0, z = 0. 1.8.10. (a)

0

1

1 0 0 1 0 0 0 −3 B B 0 C 1 0 0 0 0C B CB 4 −1 C CB C=B 1 3 − 0 1 0 A@ 1 2A @ 4 4 1 1 0 0 1 −1 −5 − 4 − 47 0 1 0 10 −1 2 1 1 0 0 0 0 1 C B B0 1 −1 0C B2 1 0 0 0C CB B CB C = B 1 1 1 0 0 CB 0 2 −3 −1 C C B CB −1 3 2 A @ 4 1 0 1 0 A@ 0 3 1 −50 −2 0 1 01 0 01 0 1 0 0 0 0 3 1 1 0 B C B 0 1C A @ 1 2 −3 1 −2 A = @ 2 1 0 0 2 4 −2 1 −2 0 0

1 0

0 1

!

0 , (b) 0

0

1 B @0 0

0 1 0

x + y = 1, (c) y + z = 0, x − z = 0. 1

0 0C A, (c) 0

0

1 B @0 0

34

1

0 −1 C A, 1

10

1

0 1 2 B C 0C CB 0 7 C B C, 0C A@ 0 0 A 0 0 1 1 −1 2 1 3 −5 −2 C C C, 0 0 0C C 0 0 0A 0 0 0 10 0 1 2 −3 1 B 0C 4 −1 A@0 0 1 0 0 0 3

1

0 1C A, (d) 0

0

1 B @0 0

0 1 0

1

0 0C A. 1

1

−2 2C A. 1

1 2,

c = 2.

1.8.11. (a) x² + y² = 1, x² − y² = 2; (b) y = x², x − y + 2 = 0; solutions: x = 2, y = 4 and x = −1, y = 1; (c) y = x³, x − y = 0; solutions: x = y = 0, x = y = −1, x = y = 1; (d) y = sin x, y = 0; solutions: x = k π, y = 0, for k any integer.

1.8.12. That variable does not appear anywhere in the system, and is automatically free (although it doesn’t enter into any of the formulas, and so is, in a sense, irrelevant).

1.8.13. True. For example, take a matrix in row echelon form with r pivots, e.g., the matrix A with a_{ii} = 1 for i = 1, . . . , r, and all other entries equal to 0.

1.8.14. Both false. The zero matrix has no pivots, and hence has rank 0.

♥ 1.8.15.
(a) Each row of A = v w^T is a scalar multiple, namely v_i w, of the vector w. If necessary, we use a row interchange to ensure that the first row is non-zero. We then subtract the appropriate scalar multiple of the first row from all the others. This makes all rows below the first zero, and so the resulting matrix is in row echelon form with a single nonzero row, and hence a single pivot, proving that A has rank 1.
(b) (i) $\begin{pmatrix} -1 & 2 \\ -3 & 6 \end{pmatrix}$, (ii) $\begin{pmatrix} -8 & 4 \\ 0 & 0 \\ 4 & -2 \end{pmatrix}$, (iii) $\begin{pmatrix} 2 & 6 & -2 \\ -3 & -9 & 3 \end{pmatrix}$.
(c) The row echelon form of A must have a single nonzero row, say w^T. Reversing the elementary row operations that led to the row echelon form, at each step we either interchange rows or add multiples of one row to another. Every row of every matrix obtained in such a fashion must be some scalar multiple of w^T, and hence the original matrix is A = v w^T, where the entries v_i of the vector v are the indicated scalar multiples.

1.8.16. 1.

1.8.17. 2.

1.8.18. Example: $A = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}$, $B = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}$, so $A B = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}$ has rank 1, but $B A = \begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix}$ has rank 0.

♦ 1.8.19.
(a) Under elementary row operations, the reduced form of C will be $( U \;\; Z )$, where U is the row echelon form of A. Thus, C has at least r pivots, namely the pivots in A. Examples:
$$\operatorname{rank} \begin{pmatrix} 1 & 2 & 1 \\ 2 & 4 & 2 \end{pmatrix} = 1 = \operatorname{rank} \begin{pmatrix} 1 & 2 \\ 2 & 4 \end{pmatrix}, \quad \text{while} \quad \operatorname{rank} \begin{pmatrix} 1 & 2 & 1 \\ 2 & 4 & 3 \end{pmatrix} = 2 > 1 = \operatorname{rank} \begin{pmatrix} 1 & 2 \\ 2 & 4 \end{pmatrix}.$$
(b) Applying elementary row operations, we can reduce E to $\begin{pmatrix} U \\ W \end{pmatrix}$, where U is the row echelon form of A. If we can then use elementary row operations of type #1 to eliminate all entries of W, then the row echelon form of E has the same number of pivots as A and so rank E = rank A. Otherwise, at least one new pivot appears in the rows below U, and rank E > rank A. Examples:
$$\operatorname{rank} \begin{pmatrix} 1 & 2 \\ 2 & 4 \\ 3 & 6 \end{pmatrix} = 1 = \operatorname{rank} \begin{pmatrix} 1 & 2 \\ 2 & 4 \end{pmatrix}, \quad \text{while} \quad \operatorname{rank} \begin{pmatrix} 1 & 2 \\ 2 & 4 \\ 3 & 5 \end{pmatrix} = 2 > 1 = \operatorname{rank} \begin{pmatrix} 1 & 2 \\ 2 & 4 \end{pmatrix}.$$
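The rank comparisons in Exercises 1.8.18 and 1.8.19 amount to counting pivots after row reduction. A minimal sketch of that count, assuming exact rational arithmetic and a hypothetical function name `rank`, might look as follows.

```python
from fractions import Fraction


def rank(A):
    """Count the pivots of A by reducing it to row echelon form."""
    M = [[Fraction(x) for x in row] for row in A]
    rows, cols = len(M), len(M[0])
    r = 0                                   # number of pivots found so far
    for c in range(cols):
        # Look for a nonzero entry in column c at or below row r.
        p = next((i for i in range(r, rows) if M[i][c] != 0), None)
        if p is None:
            continue                        # no pivot in this column
        M[r], M[p] = M[p], M[r]             # row interchange (type #2)
        for i in range(r + 1, rows):
            f = M[i][c] / M[r][c]
            M[i] = [a - f * b for a, b in zip(M[i], M[r])]
        r += 1
        if r == rows:
            break
    return r
```

Appending a column or a row to a matrix and calling `rank` again shows directly whether a new pivot has appeared, which is the distinction drawn in parts (a) and (b) of Exercise 1.8.19.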

♦ 1.8.20. By Proposition 1.39, A can be reduced to row echelon form U by a sequence of elementary row operations. Therefore, as in the proof of the L U decomposition, A = E_1 E_2 · · · E_N U, where E_1^{-1}, . . . , E_N^{-1} are the elementary matrices representing the row operations. If A is singular, then U = Z must have at least one all zero row.

♦ 1.8.21. After row operations, the augmented matrix becomes N = ( U | c ), where the r = rank A nonzero rows of U contain the pivots of A. If the system is compatible, then the last m − r entries of c are all zero, and hence N is itself a row echelon matrix with r nonzero rows and hence rank M = rank N = r. If the system is incompatible, then one or more of the last m − r entries of c are nonzero, and hence, by one more set of row operations, N is placed in row echelon form with a final pivot in row r + 1 of the last column. In this case, then, rank M = rank N = r + 1.

1.8.22. (a) x = z, y = z, where z is arbitrary; (b) x = − 23 z, y = 97 z, where z is arbitrary; (c) x = y = z = 0; (d) x = 13 z − 23 w, y = 65 z − 16 w, where z and w are arbitrary; (e) x = 13 z, y = 5 z, w = 0, where z is arbitrary; (f ) x = 32 w, y = 12 w, z = 21 w, where w is arbitrary. ”T “ ”T 1 , where y is arbitrary; (b) − 65 z, 58 z, z , where z is arbitrary; 3 y, y ”T “ 3 2 6 − 11 , where z and w are arbitrary; (d) ( z, − 2 z, z )T , 5 z + 5 w, 5 z − 5 w, z, w T T

1.8.23. (a) (c)

“

where

z is arbitrary; (e) ( −4 z, 2 z, z ) , where z is arbitrary; (f ) ( 0, 0, 0 ) ; (g) ( 3 z, 3 z, z, 0 )T , where z is arbitrary; (h) ( y − 3 w, y, w, w )T , where y and w are arbitrary. 1.8.24. If U has only nonzero entries on the diagonal, it must be nonsingular, and so the only solution is x = 0. On the other hand, if there is a diagonal zero entry, then U cannot have n pivots, and so must be singular, and the system will admit nontrivial solutions. 1.8.25. For the homogeneous case x1 = x3 , x2 = 0, where x3 is arbitrary. For the inhomogeneous case x1 = x3 + 14 (a + b), x2 = 21 (a − b), where x3 is arbitrary. The solution to the homogeneous version is a line going through the origin, while the inhomogeneous solution “

is a parallel line going through the point ( ¼ (a + b), ½ (a − b), 0 )^T. The dependence on the free variable x3 is the same as in the homogeneous case.

1.8.26. For the homogeneous case x1 = − 61 x3 − 16 x4 , x2 = − 23 x3 + 43 x4 , where x3 and x4 are arbitrary. For the inhomogeneous case x1 = − 16 x3 − 61 x4 + 13 a + 61 b, x2 = − 32 x3 + 4 1 1 3 x4 + 3 a + 6 b, where x3 and x4 are arbitrary. The dependence on the free variable x3 is the same as in the homogeneous case. 1.8.27. (a) k = 2 or k = −2;

(b) k = 0 or k =

1 2;

(c) k = 1.

1.9.1.
(a) Regular matrix, reduces to upper triangular form $U = \begin{pmatrix} 2 & -1 \\ 0 & 1 \end{pmatrix}$, so determinant is 2;
(b) Singular matrix, row echelon form $U = \begin{pmatrix} -1 & 0 & 3 \\ 0 & 1 & -2 \\ 0 & 0 & 0 \end{pmatrix}$, so determinant is 0;
(c) Regular matrix, reduces to upper triangular form $U = \begin{pmatrix} 1 & 2 & 3 \\ 0 & 1 & 2 \\ 0 & 0 & -3 \end{pmatrix}$, so determinant is −3;
(d) Nonsingular matrix, reduces to upper triangular form $U = \begin{pmatrix} -2 & 1 & 3 \\ 0 & 1 & -1 \\ 0 & 0 & 3 \end{pmatrix}$ after one row interchange, so determinant is 6;
(e) Upper triangular matrix, so the determinant is a product of diagonal entries: −180;
(f) Nonsingular matrix, reduces to upper triangular form $U = \begin{pmatrix} 1 & -2 & 1 & 4 \\ 0 & 2 & -1 & -7 \\ 0 & 0 & -2 & -8 \\ 0 & 0 & 0 & 10 \end{pmatrix}$ after one row interchange, so determinant is 40;
(g) Nonsingular matrix, reduces to upper triangular form $U = \begin{pmatrix} 1 & -2 & 1 & 4 & -5 \\ 0 & 3 & -3 & -1 & 2 \\ 0 & 0 & 4 & -12 & 24 \\ 0 & 0 & 0 & -5 & 10 \\ 0 & 0 & 0 & 0 & 1 \end{pmatrix}$ after one row interchange, so determinant is 60.

1.9.2. det A = −2, det B = −11 and $\det A B = \det \begin{pmatrix} 5 & 4 & 4 \\ 1 & 5 & 1 \\ -2 & 10 & 0 \end{pmatrix} = 22$.
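The recipe used throughout Exercise 1.9.1 (reduce to upper triangular form, multiply the diagonal entries, and change the sign once per row interchange) can be written out as a short routine. This is a sketch under the assumption of exact rational arithmetic; the name `det` is chosen here for illustration only.

```python
from fractions import Fraction


def det(A):
    """Determinant via Gaussian Elimination with row interchanges."""
    M = [[Fraction(x) for x in row] for row in A]
    n = len(M)
    sign = 1
    for j in range(n):
        p = next((i for i in range(j, n) if M[i][j] != 0), None)
        if p is None:
            return Fraction(0)              # no pivot: the matrix is singular
        if p != j:
            M[j], M[p] = M[p], M[j]
            sign = -sign                    # each interchange flips the sign
        for i in range(j + 1, n):
            f = M[i][j] / M[j][j]
            M[i] = [a - f * b for a, b in zip(M[i], M[j])]
    result = Fraction(sign)
    for j in range(n):
        result *= M[j][j]                   # product of the pivots
    return result
```

For instance, applying it to the product matrix displayed in Exercise 1.9.2 returns 22, in agreement with det A det B = (−2)(−11).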

1.9.3. (a) $A = \begin{pmatrix} 2 & 3 \\ -1 & -2 \end{pmatrix}$; (b) By formula (1.82), 1 = det I = det(A²) = det(A A) = det A det A = (det A)², so det A = ±1.

1.9.4. det A² = (det A)² = det A, and hence det A = 0 or 1.

1.9.5.
(a) True. By Theorem 1.52, A is nonsingular, so, by Theorem 1.18, A^{-1} exists.
(b) False. For $A = \begin{pmatrix} 2 & 3 \\ -1 & -2 \end{pmatrix}$, we have 2 det A = −2 and det 2 A = −4. In general, det(2 A) = 2^n det A.
(c) False. For $A = \begin{pmatrix} 2 & 3 \\ -1 & -2 \end{pmatrix}$ and $B = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}$, we have $\det(A + B) = \det \begin{pmatrix} 2 & 4 \\ -1 & -2 \end{pmatrix} = 0 \ne -1 = \det A + \det B$.
(d) True. det A^{-T} = det (A^{-1})^T = det A^{-1} = 1/ det A, where the second equality follows from Proposition 1.56, and the third equality follows from Proposition 1.55.
(e) True. det(A B^{-1}) = det A det B^{-1} = det A/ det B, where the first equality follows from formula (1.82) and the second equality follows from Proposition 1.55.
(f) False. If $A = \begin{pmatrix} 2 & 3 \\ -1 & -2 \end{pmatrix}$ and $B = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}$, then $\det (A + B)(A - B) = \det \begin{pmatrix} 0 & -4 \\ 0 & 2 \end{pmatrix} = 0 \ne \det(A^2 - B^2) = \det \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} = 1$. However, if A B = B A, then det(A + B)(A − B) = det(A² − A B + B A − B²) = det(A² − B²).
(g) True. Proposition 1.42 says rank A = n if and only if A is nonsingular, while Theorem 1.52 implies that det A ≠ 0.
(h) True. Since det A = 1 ≠ 0, Theorem 1.52 implies that A is nonsingular, and so B = A^{-1} O = O.

1.9.6. Never: its determinant is always zero.

1.9.7. By (1.82, 83) and commutativity of numeric multiplication,
$$\det B = \det(S^{-1} A S) = \det S^{-1} \det A \det S = \frac{1}{\det S} \det A \det S = \det A.$$

1.9.8. Multiplying one row of A by c multiplies its determinant by c. To obtain c A, we must multiply all n rows by c, and hence the determinant is multiplied by c a total of n times.

1.9.9. By Proposition 1.56, det L^T = det L. If L is a lower triangular matrix, then L^T is an upper triangular matrix. By Theorem 1.50, det L^T is the product of its diagonal entries which are the same as the diagonal entries of L.

1.9.10. (a) See Exercise 1.9.8. (b) If n is odd, det(− A) = − det A. On the other hand, if A^T = − A, then det A = det A^T = − det A, and hence det A = 0. (c) $A = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}$.

♦ 1.9.11. We have
$$\det \begin{pmatrix} a & b \\ c + k a & d + k b \end{pmatrix} = a d + a k b - b c - b k a = a d - b c = \det \begin{pmatrix} a & b \\ c & d \end{pmatrix},$$
$$\det \begin{pmatrix} c & d \\ a & b \end{pmatrix} = c b - a d = -(a d - b c) = - \det \begin{pmatrix} a & b \\ c & d \end{pmatrix},$$
$$\det \begin{pmatrix} k a & k b \\ c & d \end{pmatrix} = k a d - k b c = k\,(a d - b c) = k \det \begin{pmatrix} a & b \\ c & d \end{pmatrix},$$
$$\det \begin{pmatrix} a & b \\ 0 & d \end{pmatrix} = a d - b \cdot 0 = a d.$$

♦ 1.9.12. (a) The product formula holds if A is an elementary matrix; this is a consequence of the determinant axioms coupled with the fact that elementary matrices are obtained by applying the corresponding row operation to the identity matrix, with det I = 1. (b) By induction, if A = E1 E2 · · · EN is a product of elementary matrices, then (1.82) also holds. Proposition 1.25 then implies that the product formula is valid whenever A is nonsingular. (c) The first result is in Exercise 1.2.24(a), and so the formula follows by applying Lemma 1.51 to Z and Z B. (d) According to Exercise 1.8.20, every singular matrix can be written as A = E1 E2 · · · EN Z, where the Ei are elementary matrices, while Z, its row echelon form, is a matrix with a row of zeros. But then Z B = W also has a row of zeros, and so A B = E1 E2 · · · EN W is also singular. Thus, both sides of (1.82) are zero in this case. 1.9.13. Indeed, by (1.82), det A det A−1 = det(A A−1 ) = det I = 1. ♦ 1.9.14. Exercise 1.6.28 implies that, if A is regular, so is AT , and they both have the same pivots. Since the determinant of a regular matrix is the product of the pivots, this implies det A = det AT . If A is nonsingular, then we use the permuted L U decomposition to write A = P T L U where P T = P −1 by Exercise 1.6.14. Thus, det A = det P T det U = ± det U , while det AT = det(U T LT P ) = det U det P = ± det U where det P −1 = det P = ±1. Finally, if A is singular, then the same computation holds, with U denoting the row echelon form of A, and so det A = det U = 0 = ± det AT . 1.9.15. 0

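$$\det \begin{pmatrix} a_{11} & a_{12} & a_{13} & a_{14} \\ a_{21} & a_{22} & a_{23} & a_{24} \\ a_{31} & a_{32} & a_{33} & a_{34} \\ a_{41} & a_{42} & a_{43} & a_{44} \end{pmatrix} =$$
$a_{11} a_{22} a_{33} a_{44} - a_{11} a_{22} a_{34} a_{43} - a_{11} a_{23} a_{32} a_{44} + a_{11} a_{23} a_{34} a_{42} - a_{11} a_{24} a_{33} a_{42} + a_{11} a_{24} a_{32} a_{43}$
$- a_{12} a_{21} a_{33} a_{44} + a_{12} a_{21} a_{34} a_{43} + a_{12} a_{23} a_{31} a_{44} - a_{12} a_{23} a_{34} a_{41} + a_{12} a_{24} a_{33} a_{41} - a_{12} a_{24} a_{31} a_{43}$
$+ a_{13} a_{21} a_{32} a_{44} - a_{13} a_{21} a_{34} a_{42} - a_{13} a_{22} a_{31} a_{44} + a_{13} a_{22} a_{34} a_{41} - a_{13} a_{24} a_{32} a_{41} + a_{13} a_{24} a_{31} a_{42}$
$- a_{14} a_{21} a_{32} a_{43} + a_{14} a_{21} a_{33} a_{42} + a_{14} a_{22} a_{31} a_{43} - a_{14} a_{22} a_{33} a_{41} + a_{14} a_{23} a_{32} a_{41} - a_{14} a_{23} a_{31} a_{42}$.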
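The 24 terms above are the sum over all permutations of the column indices, each term weighted by the sign of its permutation. As a hedged illustration (the function names are hypothetical, and the sign is computed by counting inversions, which is one of several equivalent choices), the same expansion can be generated for any size:

```python
from itertools import permutations


def sign(p):
    """Sign of a permutation given as a tuple: +1 if even, -1 if odd."""
    inversions = sum(1 for i in range(len(p))
                     for j in range(i + 1, len(p)) if p[i] > p[j])
    return -1 if inversions % 2 else 1


def det_by_permutations(A):
    """Determinant as the signed sum of products a[0][p0] a[1][p1] ... over all p."""
    n = len(A)
    total = 0
    for p in permutations(range(n)):
        term = sign(p)
        for i in range(n):
            term *= A[i][p[i]]
        total += term
    return total
```

The number of terms grows like n!, so this is useful only for checking small cases; for anything larger, Gaussian Elimination is the practical method.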


♦ 1.9.16.
(i) Suppose B is obtained from A by adding c times row k to row l, so
$$b_{ij} = \begin{cases} a_{lj} + c\, a_{kj}, & i = l, \\ a_{ij}, & i \ne l. \end{cases}$$
Thus, each summand in the determinantal formula for det B splits into two terms, and we find that det B = det A + c det C, where C is the matrix obtained from A by replacing row l by row k. But rows k and l of C are identical, and so, by axiom (ii), if we interchange the two rows det C = − det C = 0. Thus, det B = det A.
(ii) Let B be obtained from A by interchanging rows k and l. Then each summand in the formula for det B equals minus the corresponding summand in the formula for det A, since the permutation has changed sign, and so det B = − det A.
(iii) Let B be obtained from A by multiplying row k by c. Then each summand in the formula for det B contains one entry from row k, and so equals c times the corresponding term in det A, hence det B = c det A.
(iv) The only term in det U that does not contain at least one zero entry lying below the diagonal is for the identity permutation π(i) = i, and so det U is the product of its diagonal entries.

♦ 1.9.17. If U is nonsingular, then, by Gauss–Jordan elimination, it can be reduced to the identity matrix by elementary row operations of types #1 and #3. Each operation of type #1 doesn’t change the determinant, while operations of type #3 multiply the determinant by the diagonal entry. Thus, det U = u_{11} u_{22} · · · u_{nn} det I. On the other hand, U is singular if and only if one or more of its diagonal entries are zero, and so det U = 0 = u_{11} u_{22} · · · u_{nn}.

♦ 1.9.18. The determinant of an elementary matrix of type #2 is −1, whereas all elementary matrices of type #1 have determinant +1, and hence so does any product thereof.

♥ 1.9.19.
(a) Since A is regular, a ≠ 0 and a d − b c ≠ 0. Subtracting c/a times the first row from the second row reduces A to the upper triangular matrix $\begin{pmatrix} a & b \\ 0 & d + b\,(-c/a) \end{pmatrix}$, and its pivots are a and $d - \dfrac{b c}{a} = \dfrac{a d - b c}{a} = \dfrac{\det A}{a}$.
(b) As in part (a) we reduce A to an upper triangular form. First, we subtract c/a times the first row from the second row, and g/a times the first row from the third row, resulting in the matrix
$$\begin{pmatrix} a & b & e \\ 0 & \dfrac{a d - b c}{a} & \dfrac{a f - c e}{a} \\ 0 & \dfrac{a h - b g}{a} & \dfrac{a j - e g}{a} \end{pmatrix}.$$
Performing the final row operation reduces the matrix to the upper triangular form
$$U = \begin{pmatrix} a & b & e \\ 0 & \dfrac{a d - b c}{a} & \dfrac{a f - c e}{a} \\ 0 & 0 & P \end{pmatrix},$$
whose pivots are a, $\dfrac{a d - b c}{a}$, and
$$P = \frac{a j - e g}{a} - \frac{(a f - c e)(a h - b g)}{a\,(a d - b c)} = \frac{a d j + b f g + e c h - a f h - b c j - e d g}{a d - b c} = \frac{\det A}{a d - b c}.$$
(c) If A is a regular n × n matrix, then its first pivot is a_{11}, and its k-th pivot, for k = 2, . . . , n, is det A_k / det A_{k−1}, where A_k is the k × k upper left submatrix of A with entries a_{ij} for i, j = 1, . . . , k. A formal proof is done by induction.

♥ 1.9.20.
(a–c) Applying an elementary column operation to a matrix A is the same as applying the elementary row operation to its transpose A^T and then taking the transpose of the result. Moreover, Proposition 1.56 implies that taking the transpose does not affect the determinant, and so any elementary column operation has exactly the same effect as the corresponding elementary row operation.
(d) Apply the transposed version of the elementary row operations required to reduce A^T to upper triangular form. Thus, if the (1, 1) entry is zero, use a column interchange to place a nonzero pivot in the upper left position. Then apply elementary column operations of type #1 to make all entries to the right of the pivot zero. Next, make sure a nonzero pivot is in the (2, 2) position by a column interchange if necessary, and then apply elementary column operations of type #1 to make all entries to the right of the pivot zero. Continuing in this fashion, if the matrix is nonsingular, the result is a lower triangular matrix.
(e) We first interchange the first and second columns, and then use elementary column operations of type #1 to reduce the matrix to lower triangular form:
$$\det \begin{pmatrix} 0 & 1 & 2 \\ -1 & 3 & 5 \\ 2 & -3 & 1 \end{pmatrix} = - \det \begin{pmatrix} 1 & 0 & 2 \\ 3 & -1 & 5 \\ -3 & 2 & 1 \end{pmatrix} = - \det \begin{pmatrix} 1 & 0 & 0 \\ 3 & -1 & -1 \\ -3 & 2 & 7 \end{pmatrix} = - \det \begin{pmatrix} 1 & 0 & 0 \\ 3 & -1 & 0 \\ -3 & 2 & 5 \end{pmatrix} = 5.$$

♦ 1.9.21. Using the L U factorizations established in Exercise 1.3.25:
(a) $\det \begin{pmatrix} 1 & 1 \\ t_1 & t_2 \end{pmatrix} = t_2 - t_1$,
(b) $\det \begin{pmatrix} 1 & 1 & 1 \\ t_1 & t_2 & t_3 \\ t_1^2 & t_2^2 & t_3^2 \end{pmatrix} = (t_2 - t_1)(t_3 - t_1)(t_3 - t_2)$,
(c) $\det \begin{pmatrix} 1 & 1 & 1 & 1 \\ t_1 & t_2 & t_3 & t_4 \\ t_1^2 & t_2^2 & t_3^2 & t_4^2 \\ t_1^3 & t_2^3 & t_3^3 & t_4^3 \end{pmatrix} = (t_2 - t_1)(t_3 - t_1)(t_3 - t_2)(t_4 - t_1)(t_4 - t_2)(t_4 - t_3)$.
The general formula is found in Exercise 4.4.29.

♥ 1.9.22.
(a) By direct substitution:
$$a x + b y = a\, \frac{p d - b q}{a d - b c} + b\, \frac{a q - p c}{a d - b c} = p, \qquad c x + d y = c\, \frac{p d - b q}{a d - b c} + d\, \frac{a q - p c}{a d - b c} = q.$$
(b) (i) $x = -\dfrac{1}{10} \det \begin{pmatrix} 13 & 3 \\ 0 & 2 \end{pmatrix} = -2.6$, $y = -\dfrac{1}{10} \det \begin{pmatrix} 1 & 13 \\ 4 & 0 \end{pmatrix} = 5.2$;
(ii) $x = \dfrac{1}{12} \det \begin{pmatrix} 4 & -2 \\ -2 & 6 \end{pmatrix} = \dfrac53$, $y = \dfrac{1}{12} \det \begin{pmatrix} 1 & 4 \\ 3 & -2 \end{pmatrix} = -\dfrac76$.
(c) Proof by direct substitution, expanding all the determinants.
(d) (i) $x = \dfrac19 \det \begin{pmatrix} 3 & 4 & 0 \\ 2 & 2 & 1 \\ 0 & 1 & -1 \end{pmatrix} = -\dfrac19$, $y = \dfrac19 \det \begin{pmatrix} 1 & 3 & 0 \\ 4 & 2 & 1 \\ -1 & 0 & -1 \end{pmatrix} = \dfrac79$, $z = \dfrac19 \det \begin{pmatrix} 1 & 4 & 3 \\ 4 & 2 & 2 \\ -1 & 1 & 0 \end{pmatrix} = \dfrac89$;
(ii) $x = -\dfrac12 \det \begin{pmatrix} 1 & 2 & -1 \\ 2 & -3 & 2 \\ 3 & -1 & 1 \end{pmatrix} = 0$, $y = -\dfrac12 \det \begin{pmatrix} 3 & 1 & -1 \\ 1 & 2 & 2 \\ 2 & 3 & 1 \end{pmatrix} = 4$, $z = -\dfrac12 \det \begin{pmatrix} 3 & 2 & 1 \\ 1 & -3 & 2 \\ 2 & -1 & 3 \end{pmatrix} = 7$.
(e) Assuming A is nonsingular, the solution to A x = b is x_i = det A_i / det A, where A_i is obtained by replacing the i-th column of A by the right hand side b. See [ 60 ] for a complete justification.
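Cramer's rule as stated in part (e) of Exercise 1.9.22 is short enough to code directly. The sketch below is illustrative and not efficient: the determinant is computed by cofactor expansion along the first row, the names `det` and `cramer` are hypothetical, and the coefficient matrices used in the checks are inferred from the determinants displayed in parts (b) and (d), not quoted from the exercise statement.

```python
from fractions import Fraction


def det(A):
    """Determinant of a square integer matrix by cofactor expansion along row 0."""
    if len(A) == 1:
        return A[0][0]
    total = 0
    for j, a in enumerate(A[0]):
        minor = [row[:j] + row[j + 1:] for row in A[1:]]
        total += (-1) ** j * a * det(minor)
    return total


def cramer(A, b):
    """Solve A x = b via x_i = det(A_i) / det(A); entries are assumed integers."""
    d = det(A)
    if d == 0:
        raise ValueError("matrix is singular; Cramer's rule does not apply")
    n = len(A)
    xs = []
    for i in range(n):
        # A_i: replace the i-th column of A by the right-hand side b.
        Ai = [row[:i] + [b[r]] + row[i + 1:] for r, row in enumerate(A)]
        xs.append(Fraction(det(Ai), d))
    return xs
```

For systems of more than a few unknowns the cofactor expansion becomes very expensive, so this is best regarded as a way to check small hand computations.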


♦ 1.9.23.
(a) We can individually reduce A and B to upper triangular forms U_1 and U_2 with the determinants equal to the products of their respective diagonal entries. Applying the analogous elementary row operations to D will reduce it to the upper triangular form $\begin{pmatrix} U_1 & O \\ O & U_2 \end{pmatrix}$, and its determinant is equal to the product of its diagonal entries, which are the diagonal entries of both U_1 and U_2, so det D = det U_1 det U_2 = det A det B.
(b) The same argument as in part (a) proves the result. The row operations applied to A are also applied to C, but this doesn’t affect the final upper triangular form.
(c)
(i) $\det \begin{pmatrix} 3 & 2 & -2 \\ 0 & 4 & -5 \\ 0 & 3 & 7 \end{pmatrix} = \det(3)\, \det \begin{pmatrix} 4 & -5 \\ 3 & 7 \end{pmatrix} = 3 \cdot 43 = 129$,
(ii) $\det \begin{pmatrix} 1 & 2 & -2 & 5 \\ -3 & 1 & 0 & -5 \\ 0 & 0 & 1 & 3 \\ 0 & 0 & 2 & -2 \end{pmatrix} = \det \begin{pmatrix} 1 & 2 \\ -3 & 1 \end{pmatrix} \det \begin{pmatrix} 1 & 3 \\ 2 & -2 \end{pmatrix} = 7 \cdot (-8) = -56$,
(iii) $\det \begin{pmatrix} 1 & 2 & 0 & 4 \\ -3 & 1 & 4 & -1 \\ 0 & 3 & 1 & 8 \\ 0 & 0 & 0 & -3 \end{pmatrix} = \det \begin{pmatrix} 1 & 2 & 0 \\ -3 & 1 & 4 \\ 0 & 3 & 1 \end{pmatrix} \det(-3) = (-5) \cdot (-3) = 15$,
(iv) $\det \begin{pmatrix} 5 & -1 & 0 & 0 \\ 2 & 5 & 0 & 0 \\ 2 & 4 & 4 & -2 \\ 3 & -2 & 9 & -5 \end{pmatrix} = \det \begin{pmatrix} 5 & -1 \\ 2 & 5 \end{pmatrix} \det \begin{pmatrix} 4 & -2 \\ 9 & -5 \end{pmatrix} = 27 \cdot (-2) = -54$.


2.1.1.
Commutativity of Addition: (x + i y) + (u + i v) = (x + u) + i (y + v) = (u + i v) + (x + i y).
Associativity of Addition: (x + i y) + [ (u + i v) + (p + i q) ] = (x + i y) + [ (u + p) + i (v + q) ] = (x + u + p) + i (y + v + q) = [ (x + u) + i (y + v) ] + (p + i q) = [ (x + i y) + (u + i v) ] + (p + i q).
Additive Identity: 0 = 0 + i 0 and (x + i y) + 0 = x + i y = 0 + (x + i y).
Additive Inverse: − (x + i y) = (− x) + i (− y) and (x + i y) + [ (− x) + i (− y) ] = 0 = [ (− x) + i (− y) ] + (x + i y).
Distributivity: (c + d) (x + i y) = (c + d) x + i (c + d) y = (c x + d x) + i (c y + d y) = c (x + i y) + d (x + i y), c [ (x + i y) + (u + i v) ] = c (x + u) + i c (y + v) = (c x + c u) + i (c y + c v) = c (x + i y) + c (u + i v).
Associativity of Scalar Multiplication: c [ d (x + i y) ] = c [ (d x) + i (d y) ] = (c d x) + i (c d y) = (c d) (x + i y).
Unit for Scalar Multiplication: 1 (x + i y) = (1 x) + i (1 y) = x + i y.
Note: Identifying the complex number x + i y with the vector ( x, y )^T ∈ R² respects the operations of vector addition and scalar multiplication, and so we are in effect reproving that R² is a vector space.

2.1.2.
Commutativity of Addition: (x1, y1) + (x2, y2) = (x1 x2, y1 y2) = (x2, y2) + (x1, y1).
Associativity of Addition: (x1, y1) + [ (x2, y2) + (x3, y3) ] = (x1 x2 x3, y1 y2 y3) = [ (x1, y1) + (x2, y2) ] + (x3, y3).
Additive Identity: 0 = (1, 1), and (x, y) + (1, 1) = (x, y) = (1, 1) + (x, y).
Additive Inverse: − (x, y) = ( 1/x, 1/y ), and (x, y) + [ − (x, y) ] = (1, 1) = [ − (x, y) ] + (x, y).
Distributivity: (c + d) (x, y) = (x^{c+d}, y^{c+d}) = (x^c x^d, y^c y^d) = (x^c, y^c) + (x^d, y^d) = c (x, y) + d (x, y), c [ (x1, y1) + (x2, y2) ] = ((x1 x2)^c, (y1 y2)^c) = (x1^c x2^c, y1^c y2^c) = (x1^c, y1^c) + (x2^c, y2^c) = c (x1, y1) + c (x2, y2).
Associativity of Scalar Multiplication: c (d (x, y)) = c (x^d, y^d) = (x^{c d}, y^{c d}) = (c d) (x, y).
Unit for Scalar Multiplication: 1 (x, y) = (x, y).

Note: We can uniquely identify a point (x, y) ∈ Q with the vector ( log x, log y )^T ∈ R^2. Then the indicated operations agree with standard vector addition and scalar multiplication in R^2, and so Q is just a disguised version of R^2.

♦ 2.1.3. We denote a typical function in F(S) by f(x) for x ∈ S.
Commutativity of Addition: (f + g)(x) = f(x) + g(x) = g(x) + f(x) = (g + f)(x).
Associativity of Addition: [f + (g + h)](x) = f(x) + (g + h)(x) = f(x) + g(x) + h(x) = (f + g)(x) + h(x) = [(f + g) + h](x).
Additive Identity: 0(x) = 0 for all x, and (f + 0)(x) = f(x) = (0 + f)(x).
Additive Inverse: (− f)(x) = − f(x) and [f + (− f)](x) = f(x) + (− f)(x) = 0 = (− f)(x) + f(x) = [(− f) + f](x).
Distributivity: [(c + d) f](x) = (c + d) f(x) = c f(x) + d f(x) = (c f)(x) + (d f)(x), and [c (f + g)](x) = c f(x) + c g(x) = (c f)(x) + (c g)(x).
Associativity of Scalar Multiplication: [c (d f)](x) = c d f(x) = [(c d) f](x).
Unit for Scalar Multiplication: (1 f)(x) = f(x).

2.1.4. (a) ( 1, 1, 1, 1 )^T, ( 1, −1, 1, −1 )^T, ( 1, 1, 1, 1 )^T, ( 1, −1, 1, −1 )^T. (b) Obviously not.

2.1.5. One example is f(x) ≡ 0 and g(x) = x^3 − x.

2.1.6. (a) f(x) = − 4 x + 3; (b) f(x) = − 2 x^2 − x + 1.

2.1.7. (a) ( x − y, x y )^T, ( e^x, cos y )^T, and ( 1, 3 )^T, which is a constant function. (b) Their sum is ( x − y + e^x + 1, x y + cos y + 3 )^T. Multiplied by − 5 it is ( − 5 x + 5 y − 5 e^x − 5, − 5 x y − 5 cos y − 15 )^T. (c) The zero element is the constant function 0 = ( 0, 0 )^T.

♦ 2.1.8. This is the same as the space of functions F(R^2, R^2). Explicitly:
Commutativity of Addition: ( v1(x, y), v2(x, y) )^T + ( w1(x, y), w2(x, y) )^T = ( v1(x, y) + w1(x, y), v2(x, y) + w2(x, y) )^T = ( w1(x, y), w2(x, y) )^T + ( v1(x, y), v2(x, y) )^T.
Associativity of Addition: ( u1(x, y), u2(x, y) )^T + [ ( v1(x, y), v2(x, y) )^T + ( w1(x, y), w2(x, y) )^T ] = ( u1(x, y) + v1(x, y) + w1(x, y), u2(x, y) + v2(x, y) + w2(x, y) )^T = [ ( u1(x, y), u2(x, y) )^T + ( v1(x, y), v2(x, y) )^T ] + ( w1(x, y), w2(x, y) )^T.
Additive Identity: 0 = ( 0, 0 )^T for all x, y, and ( v1(x, y), v2(x, y) )^T + 0 = ( v1(x, y), v2(x, y) )^T = 0 + ( v1(x, y), v2(x, y) )^T.
Additive Inverse: − ( v1(x, y), v2(x, y) )^T = ( − v1(x, y), − v2(x, y) )^T, and ( v1(x, y), v2(x, y) )^T + ( − v1(x, y), − v2(x, y) )^T = 0 = ( − v1(x, y), − v2(x, y) )^T + ( v1(x, y), v2(x, y) )^T.
Distributivity: (c + d) ( v1(x, y), v2(x, y) )^T = ( (c + d) v1(x, y), (c + d) v2(x, y) )^T = c ( v1(x, y), v2(x, y) )^T + d ( v1(x, y), v2(x, y) )^T, and c [ ( v1(x, y), v2(x, y) )^T + ( w1(x, y), w2(x, y) )^T ] = ( c v1(x, y) + c w1(x, y), c v2(x, y) + c w2(x, y) )^T = c ( v1(x, y), v2(x, y) )^T + c ( w1(x, y), w2(x, y) )^T.
Associativity of Scalar Multiplication: c [ d ( v1(x, y), v2(x, y) )^T ] = ( c d v1(x, y), c d v2(x, y) )^T = (c d) ( v1(x, y), v2(x, y) )^T.
Unit for Scalar Multiplication: 1 ( v1(x, y), v2(x, y) )^T = ( v1(x, y), v2(x, y) )^T.

♥ 2.1.9. We identify each sample value with the matrix entry m_ij = f(i h, j k). In this way, every sampled function corresponds to a uniquely determined m × n matrix and conversely. Addition of sample functions, (f + g)(i h, j k) = f(i h, j k) + g(i h, j k), corresponds to matrix addition, m_ij + n_ij, while scalar multiplication of sample functions, c f(i h, j k), corresponds to scalar multiplication of matrices, c m_ij.

2.1.10. a + b = (a1 + b1, a2 + b2, a3 + b3, . . . ), c a = (c a1, c a2, c a3, . . . ). Explicit verification of the vector space properties is straightforward. An alternative, smarter strategy is to identify R^∞ as the space of functions f : N → R, where N = { 1, 2, 3, . . . } is the set of natural numbers, and we identify the function f with its sample vector f = (f(1), f(2), . . . ).

2.1.11. (i) v + (−1) v = 1 v + (−1) v = ( 1 + (−1) ) v = 0 v = 0. (j) Let z = c 0. Then z + z = c (0 + 0) = c 0 = z, and so, as in the proof of (h), z = 0. (k) Suppose c ≠ 0. Then v = 1 v = ( (1/c) · c ) v = (1/c) (c v) = (1/c) 0 = 0.

♦ 2.1.12. If 0 and 0̃ both satisfy axiom (c), then 0 = 0̃ + 0 = 0 + 0̃ = 0̃.

♦ 2.1.13.
Commutativity of Addition: (v, w) + (v̂, ŵ) = (v + v̂, w + ŵ) = (v̂, ŵ) + (v, w).
Associativity of Addition: (v, w) + [ (v̂, ŵ) + (ṽ, w̃) ] = (v + v̂ + ṽ, w + ŵ + w̃) = [ (v, w) + (v̂, ŵ) ] + (ṽ, w̃).
Additive Identity: the zero element is (0, 0), and (v, w) + (0, 0) = (v, w) = (0, 0) + (v, w).
Additive Inverse: − (v, w) = (− v, − w) and (v, w) + (− v, − w) = (0, 0) = (− v, − w) + (v, w).
Distributivity: (c + d) (v, w) = ((c + d) v, (c + d) w) = c (v, w) + d (v, w), and c [ (v, w) + (v̂, ŵ) ] = (c v + c v̂, c w + c ŵ) = c (v, w) + c (v̂, ŵ).
Associativity of Scalar Multiplication: c (d (v, w)) = (c d v, c d w) = (c d) (v, w).
Unit for Scalar Multiplication: 1 (v, w) = (1 v, 1 w) = (v, w).

2.1.14. Here V = C^0 while W = R, and so the indicated pairs belong to the Cartesian product vector space C^0 × R. The zero element is the pair 0 = (0, 0), where the first 0 denotes the identically zero function, while the second 0 denotes the real number zero. The laws of vector addition and scalar multiplication are (f(x), a) + (g(x), b) = (f(x) + g(x), a + b), c (f(x), a) = (c f(x), c a).

2.2.1. (a) If v = ( x, y, z )^T satisfies x − y + 4 z = 0 and ṽ = ( x̃, ỹ, z̃ )^T also satisfies x̃ − ỹ + 4 z̃ = 0, so does v + ṽ = ( x + x̃, y + ỹ, z + z̃ )^T since (x + x̃) − (y + ỹ) + 4 (z + z̃) = (x − y + 4 z) + (x̃ − ỹ + 4 z̃) = 0, as does c v = ( c x, c y, c z )^T since (c x) − (c y) + 4 (c z) = c (x − y + 4 z) = 0. (b) For instance, the zero vector 0 = ( 0, 0, 0 )^T does not satisfy the equation.

2.2.2. (b,c,d,g,i) are subspaces; the rest are not. Case (j) consists of the 3 coordinate axes and the line x = y = z.

2.2.3. (a) Subspace. (b) Not a subspace. (c) Subspace. (d) Not a subspace. (e) Not a subspace. (f) Even though the cylinders are not subspaces, their intersection is the z axis, which is a subspace.

2.2.4. Any vector of the form a ( 1, 2, −1 )^T + b ( 2, 0, 1 )^T + c ( 0, −1, 3 )^T = ( a + 2 b, 2 a − c, − a + b + 3 c )^T = ( x, y, z )^T will belong to W. The coefficient matrix $\begin{pmatrix} 1 & 2 & 0 \\ 2 & 0 & -1 \\ -1 & 1 & 3 \end{pmatrix}$ is nonsingular, and so for any x = ( x, y, z )^T ∈ R^3 we can arrange suitable values of a, b, c by solving the linear system. Thus, every vector in R^3 belongs to W and so W = R^3.

2.2.5. False, with two exceptions: [ 0, 0 ] = {0} and ( −∞, ∞ ) = R.

2.2.6. (a) Yes. For instance, the set S = { (x, 0) } ∪ { (0, y) } consisting of the coordinate axes has the required property, but is not a subspace. More generally, any (finite) collection of 2 or more lines going through the origin satisfies the property, but is not a subspace. (b) For example, S = { (x, y) | x, y ≥ 0 }, the positive quadrant.

2.2.7. (a,c,d) are subspaces; (b,e) are not.

2.2.8. Since x = 0 must belong to the subspace, this implies b = A 0 = 0. For a homogeneous system, if x, y are solutions, so A x = 0 = A y, so are x + y since A(x + y) = A x + A y = 0, as is c x since A(c x) = c A x = 0.

2.2.9. L and M are strictly lower triangular if l_ij = 0 = m_ij whenever i ≤ j. Then N = L + M is strictly lower triangular since n_ij = l_ij + m_ij = 0 whenever i ≤ j, as is K = c L since k_ij = c l_ij = 0 whenever i ≤ j.

♦ 2.2.10. Note tr(A + B) = Σ_{i=1}^n (a_ii + b_ii) = tr A + tr B and tr(c A) = Σ_{i=1}^n c a_ii = c Σ_{i=1}^n a_ii = c tr A. Thus, if tr A = tr B = 0, then tr(A + B) = 0 = tr(c A), proving closure.
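The closure argument of Exercise 2.2.10 can be checked on a concrete pair of matrices. The following is a minimal sketch only: the matrices `A`, `B` and the scalar `c` are hypothetical examples chosen for illustration, and the helper names are not taken from the text.

```python
from fractions import Fraction

def trace(m):
    """Sum of the diagonal entries of a square matrix stored as a list of rows."""
    return sum(m[i][i] for i in range(len(m)))

def add(a, b):
    """Entrywise sum of two matrices of the same size."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def scale(c, a):
    """Scalar multiple c * a."""
    return [[c * x for x in row] for row in a]

# Hypothetical trace-zero matrices, used only as an illustration.
A = [[1, 2], [3, -1]]
B = [[-4, 0], [5, 4]]
c = Fraction(7, 3)

# tr(A + B) = tr A + tr B and tr(c A) = c tr A, so both stay at zero.
print(trace(add(A, B)), trace(scale(c, A)))
```

A single numerical check does not prove closure; it only illustrates the two identities on which the proof rests.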

2.2.11. (a) No. The zero matrix is not an element. (b) No if n ≥ 2. For example, A = $\begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}$, B = $\begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix}$ satisfy det A = 0 = det B, but det(A + B) = det $\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$ = 1, so A + B does not belong to the set.

2.2.12. (d,f,g,h) are subspaces; the rest are not.

2.2.13. (a) Vector space; (b) not a vector space: (0, 0) does not belong; (c) vector space; (d) vector space; (e) not a vector space: if f is non-negative, then −1 f = − f is not (unless f ≡ 0); (f) vector space; (g) vector space; (h) vector space.

2.2.14. If f(1) = 0 = g(1), then (f + g)(1) = 0 and (c f)(1) = 0, so both f + g and c f belong to the subspace. The zero function does not satisfy f(0) = 1. For a subspace, a can be anything, while b = 0.

2.2.15. All cases except (e,g) are subspaces. In (g), | x | is not in C^1.

2.2.16. (a) Subspace; (b) subspace; (c) not a subspace: the zero function does not satisfy the condition; (d) not a subspace: if f(0) = 0, f(1) = 1, and g(0) = 1, g(1) = 0, then f and g are in the set, but f + g is not; (e) subspace; (f) not a subspace: the zero function does not satisfy the condition; (g) subspace; (h) subspace; (i) not a subspace: the zero function does not satisfy the condition.

2.2.17. If u′′ = x u, v′′ = x v are solutions, and c, d constants, then (c u + d v)′′ = c u′′ + d v′′ = c x u + d x v = x (c u + d v), and hence c u + d v is also a solution.

2.2.18. For instance, the zero function u(x) ≡ 0 is not a solution.

2.2.19. (a) It is a subspace of the space of all functions f : [ a, b ] → R^2, which is a particular instance of Example 2.7. Note that f(t) = ( f1(t), f2(t) )^T is continuously differentiable if and only if its component functions f1(t) and f2(t) are. Thus, if f(t) = ( f1(t), f2(t) )^T and g(t) = ( g1(t), g2(t) )^T are continuously differentiable, so are (f + g)(t) = ( f1(t) + g1(t), f2(t) + g2(t) )^T and (c f)(t) = ( c f1(t), c f2(t) )^T. (b) Yes: if f(0) = 0 = g(0), then (c f + d g)(0) = 0 for any c, d ∈ R.

2.2.20. ∇ · (c v + d w) = c ∇ · v + d ∇ · w = 0 whenever ∇ · v = ∇ · w = 0 and c, d ∈ R.

2.2.21. Yes. The sum of two convergent sequences is convergent, as is any constant multiple of a convergent sequence.

2.2.22. (a) If v, w ∈ W ∩ Z, then v, w ∈ W, so c v + d w ∈ W because W is a subspace, and v, w ∈ Z, so c v + d w ∈ Z because Z is a subspace, hence c v + d w ∈ W ∩ Z. (b) If w + z, w̃ + z̃ ∈ W + Z then c (w + z) + d (w̃ + z̃) = (c w + d w̃) + (c z + d z̃) ∈ W + Z, since it is the sum of an element of W and an element of Z. (c) Given any w ∈ W and z ∈ Z, then w, z ∈ W ∪ Z. Thus, if W ∪ Z is a subspace, the sum w + z ∈ W ∪ Z. Thus, either w + z = w̃ ∈ W or w + z = z̃ ∈ Z. In the first case z = w̃ − w ∈ W, while in the second w = z̃ − z ∈ Z. We conclude that for any w ∈ W and z ∈ Z, either w ∈ Z or z ∈ W. Suppose W ⊄ Z. Then we can find w ∈ W \ Z, and so for any z ∈ Z, we must have z ∈ W, which proves Z ⊂ W.

♦ 2.2.23. If v, w ∈ ∩ W_i, then v, w ∈ W_i for each i and so c v + d w ∈ W_i for any c, d ∈ R because W_i is a subspace. Since this holds for all i, we conclude that c v + d w ∈ ∩ W_i.

♥ 2.2.24. (a) They clearly only intersect at the origin. Moreover, every v = ( x, y )^T = ( x, 0 )^T + ( 0, y )^T can be written as a sum of vectors on the two axes. (b) Since the only common solution to x = y and x = 3 y is x = y = 0, the lines only intersect at the origin. Moreover, every v = ( x, y )^T = ( a, a )^T + ( 3 b, b )^T, where a = − 1/2 x + 3/2 y, b = 1/2 x − 1/2 y, can be written as a sum of vectors on each line. (c) A vector v = ( a, 2 a, 3 a )^T in the line belongs to the plane if and only if a + 2 (2 a) + 3 (3 a) = 14 a = 0, so a = 0 and the only common element is v = 0. Moreover, every v = ( x, y, z )^T = 1/14 ( x + 2 y + 3 z, 2 (x + 2 y + 3 z), 3 (x + 2 y + 3 z) )^T + 1/14 ( 13 x − 2 y − 3 z, − 2 x + 10 y − 6 z, − 3 x − 6 y + 5 z )^T can be written as a sum of a vector in the line and a vector in the plane. (d) If w + z = w̃ + z̃, then w − w̃ = z̃ − z. The left hand side belongs to W, while the right hand side belongs to Z, and so, by the first assumption, they must both be equal to 0. Therefore, w = w̃, z = z̃.

2.2.25. (a) (v, w) ∈ V0 ∩ W0 if and only if (v, w) = (v, 0) and (v, w) = (0, w), which means v = 0, w = 0, and hence (v, w) = (0, 0) is the only element of the intersection. Moreover, we can write any element (v, w) = (v, 0) + (0, w). (b) (v, w) ∈ D ∩ A if and only if v = w and v = − w, hence (v, w) = (0, 0). Moreover, we can write (v, w) = ( 1/2 v + 1/2 w, 1/2 v + 1/2 w ) + ( 1/2 v − 1/2 w, − 1/2 v + 1/2 w ) as the sum of an element of D and an element of A.

2.2.26. (a) If f(− x) = f(x), f̃(− x) = f̃(x), then (c f + d f̃)(− x) = c f(− x) + d f̃(− x) = c f(x) + d f̃(x) = (c f + d f̃)(x) for any c, d ∈ R, and hence it is a subspace. (b) If g(− x) = − g(x), g̃(− x) = − g̃(x), then (c g + d g̃)(− x) = c g(− x) + d g̃(− x) = − c g(x) − d g̃(x) = − (c g + d g̃)(x), proving it is a subspace. If f(x) is both even and odd, then f(x) = f(− x) = − f(x) and so f(x) ≡ 0 for all x. Moreover, we can write any function h(x) = f(x) + g(x) as a sum of an even function f(x) = 1/2 [ h(x) + h(− x) ] and an odd function g(x) = 1/2 [ h(x) − h(− x) ]. (c) This follows from part (b), and the uniqueness follows from Exercise 2.2.24(d).
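The even/odd splitting used in Exercise 2.2.26 is easy to try out numerically. This is an illustrative sketch only: the choice h(x) = e^x and the helper names `even_part` and `odd_part` are assumptions made for the example, not part of the exercise.

```python
import math

def even_part(h):
    """Return f(x) = (h(x) + h(-x)) / 2, the even part of h."""
    return lambda x: 0.5 * (h(x) + h(-x))

def odd_part(h):
    """Return g(x) = (h(x) - h(-x)) / 2, the odd part of h."""
    return lambda x: 0.5 * (h(x) - h(-x))

# Illustrative choice: for h(x) = e^x the two parts are cosh x and sinh x.
h = math.exp
f = even_part(h)
g = odd_part(h)

for x in (0.0, 0.3, 1.7):
    assert math.isclose(f(x), f(-x))                   # f is even
    assert math.isclose(g(x), -g(-x), abs_tol=1e-12)   # g is odd
    assert math.isclose(f(x) + g(x), h(x))             # h = f + g
```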

2.2.27. If A is both symmetric and skew-symmetric, so A = A^T and A = − A^T, then A = O. Given any square matrix, write A = S + J where S = 1/2 ( A + A^T ) is symmetric and J = 1/2 ( A − A^T ) is skew-symmetric. This verifies the two conditions for complementary subspaces. Uniqueness of the decomposition A = S + J follows from Exercise 2.2.24(d).
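As a small illustration of the decomposition A = S + J in Exercise 2.2.27, the sketch below splits one matrix into its symmetric and skew-symmetric parts using exact rational arithmetic. The sample matrix and the function names are hypothetical, chosen only for the example.

```python
from fractions import Fraction

def transpose(a):
    """Transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*a)]

def sym_skew_split(a):
    """Return (S, J) with S = (A + A^T)/2 and J = (A - A^T)/2."""
    at = transpose(a)
    n = len(a)
    half = Fraction(1, 2)
    s = [[half * (a[i][j] + at[i][j]) for j in range(n)] for i in range(n)]
    j = [[half * (a[i][k] - at[i][k]) for k in range(n)] for i in range(n)]
    return s, j

# Hypothetical sample matrix, used only as an illustration.
A = [[1, 2], [4, 3]]
S, J = sym_skew_split(A)

print(S)  # symmetric part
print(J)  # skew-symmetric part
```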

♦ 2.2.28. (a) By induction, we can show that f^(n)(x) = P_n(1/x) e^{−1/x} = Q_n(x) e^{−1/x} / x^n, where P_n(y) and Q_n(x) = x^n P_n(1/x) are certain polynomials of degree n. Thus, lim_{x→0} f^(n)(x) = lim_{x→0} Q_n(x) e^{−1/x} / x^n = Q_n(0) lim_{y→∞} y^n e^{−y} = 0, because the exponential e^{−y} goes to zero faster than any power of y goes to ∞. (b) The Taylor series at a = 0 is 0 + 0 x + 0 x^2 + · · · ≡ 0, which converges to the zero function, not to e^{−1/x}.

2.2.29. (a) The Taylor series is the geometric series 1/(1 + x^2) = 1 − x^2 + x^4 − x^6 + · · · . (b) The ratio test can be used to prove that the series converges precisely when | x | < 1. (c) Convergence of the Taylor series to f(x) for x near 0 suffices to prove analyticity of the function at x = 0.

♥ 2.2.30. (a) If v + a, w + a ∈ A, then (v + a) + (w + a) = (v + w + a) + a ∈ A requires v + w + a = u ∈ V, and hence a = u − v − w ∈ V.

(b) (i), (ii), (iii): plots of the three affine subspaces, each on axes running from −3 to 3.

(c) Every subspace V ⊂ R^2 is either a point (the origin), or a line through the origin, or all of R^2. Thus, the corresponding affine subspaces are the point { a }, a line through a, or all of R^2, since in this case a ∈ V = R^2. (d) Every vector in the plane can be written as ( x, y, z )^T = ( x̃, ỹ, z̃ )^T + ( 1, 0, 0 )^T where ( x̃, ỹ, z̃ )^T is an arbitrary vector in the subspace defined by x̃ − 2 ỹ + 3 z̃ = 0. (e) Every such polynomial can be written as p(x) = q(x) + 1 where q(x) is any element of the subspace of polynomials that satisfy q(1) = 0.

0

2.3.1. B @ 0

B B 2.3.2. B @

2.3.3.

1

0

1

1

0

1

0

1

1

0

5 −1 2 C B C B 2C A = 2@ −1 A − @ −4 A. 3 1 2 0

1

1

0

−2 1 −3 −2 C B C B C B −3 7C 6 C B 4C C B C B C+B C + 2B C = 3B C. @ 3A @ −2 A @ 6A 6A 1 −7 4 0 0

1

0

1

1 1 0 B C B C C (a) Yes, since B @ −2 A = @ 1 A − 3@ 1 A; −3 1 00 1 10 0 1 1 0 1 1 0 1 C C 7 B 4 B C 3 B C (b) Yes, since B @ −2 A = 10 @ 2 A + 10 @ −2 A − 10 @ 3 A; 2 0 0 14 0 −1 0 1 1 1 0 2 0 1 3 B B C B C B C C 0C B −1 C B 0C B2C C does not have a C + c3 B C = c1 B C + c2 B B (c) No, since the vector equation B @ 1A @ 3A @0A @ −1 A −1 1 0 −2 solution.

2.3.4. Cases (b), (c), (e) span R^2.

2.3.5. (a) The line ( 3 t, 0, t )^T. (b) The plane z = − 3/5 x − 6/5 y. (c) The plane z = − x − y.

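2.3.6. They are the same. Indeed, since v1 = u1 + 2 u2, v2 = u1 + u2, every vector v ∈ V can be written as a linear combination v = c1 v1 + c2 v2 = (c1 + c2) u1 + (2 c1 + c2) u2 and hence belongs to U. Conversely, since u1 = − v1 + 2 v2, u2 = v1 − v2, every vector u ∈ U can be written as a linear combination u = c1 u1 + c2 u2 = (− c1 + c2) v1 + (2 c1 − c2) v2, and hence belongs to V.

2.3.7. (a) Every symmetric matrix has the form
$\begin{pmatrix} a & b \\ b & c \end{pmatrix} = a \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix} + c \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix} + b \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}.$
(b) $\begin{pmatrix} 1 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}, \begin{pmatrix} 0 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \end{pmatrix}, \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 1 \end{pmatrix}, \begin{pmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}, \begin{pmatrix} 0 & 0 & 1 \\ 0 & 0 & 0 \\ 1 & 0 & 0 \end{pmatrix}, \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{pmatrix}.$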

2.3.8. (a) They span P^(2) since a x^2 + b x + c = 1/2 (a − 2 b + c)(x^2 + 1) + 1/2 (a − c)(x^2 − 1) + b (x^2 + x + 1). (b) They span P^(3) since a x^3 + b x^2 + c x + d = a (x^3 − 1) + b (x^2 + 1) + c (x − 1) + (a − b + c + d) 1. (c) They do not span P^(3) since a x^3 + b x^2 + c x + d = c1 x^3 + c2 (x^2 + 1) + c3 (x^2 − x) + c4 (x + 1) cannot be solved when b + c − d ≠ 0.

2.3.9. (a) Yes. (b) No. (c) No. (d) Yes: cos^2 x = 1 − sin^2 x. (e) No. (f) No.

2.3.10. (a) sin 3 x = cos( 3 x − 1/2 π ); (b) cos x − sin x = √2 cos( x + 1/4 π ); (c) 3 cos 2 x + 4 sin 2 x = 5 cos( 2 x − tan^{−1} 4/3 ); (d) cos x sin x = 1/2 sin 2 x = 1/2 cos( 2 x − 1/2 π ).

2.3.11. (a) If u1 and u2 are solutions, so is u = c1 u1 + c2 u2 since u′′ − 4 u′ + 3 u = c1 (u1′′ − 4 u1′ + 3 u1) + c2 (u2′′ − 4 u2′ + 3 u2) = 0. (b) span { e^x, e^{3x} }; (c) 2.

2.3.12. Each is a solution, and the general solution u(x) = c1 + c2 cos x + c3 sin x is a linear combination of the three independent solutions.

2.3.13. (a) e^{2x}; (b) cos 2 x, sin 2 x; (c) e^{3x}, 1; (d) e^{−x}, e^{−3x}; (e) e^{−x/2} cos( √3 x/2 ), e^{−x/2} sin( √3 x/2 ); (f) e^{5x}, 1, x; (g) e^{x/√2} cos( x/√2 ), e^{x/√2} sin( x/√2 ), e^{−x/√2} cos( x/√2 ), e^{−x/√2} sin( x/√2 ).

2.3.14. (a) If u1 and u2 are solutions, so is u = c1 u1 + c2 u2 since u′′ + 4 u = c1 (u1′′ + 4 u1) + c2 (u2′′ + 4 u2) = 0, u(0) = c1 u1(0) + c2 u2(0) = 0, u(π) = c1 u1(π) + c2 u2(π) = 0. (b) span { sin 2 x }.

2.3.15. (a) ( 2, 1 )^T = 2 f1(x) + f2(x) − f3(x); (b) not in the span; (c) ( 1 − 2 x, − 1 − x )^T = f1(x) − f2(x) − f3(x); (d) not in the span; (e) ( 2 − x, 0 )^T = 2 f1(x) − f3(x).

2.3.16. True, since 0 = 0 v1 + · · · + 0 vn.

2.3.17. False. For example, if z = ( 1, 1, 0 )^T, u = ( 1, 0, 0 )^T, v = ( 0, 1, 0 )^T, w = ( 0, 0, 1 )^T, then z = u + v, but the equation w = c1 u + c2 v + c3 z = ( c1 + c3, c2 + c3, 0 )^T has no solution.

♦ 2.3.18. By the assumption, any v ∈ V can be written as a linear combination v = c1 v1 + · · · + cm vm = c1 v1 + · · · + cm vm + 0 vm+1 + · · · + 0 vn of the combined collection.

♦ 2.3.19. (a) If v = Σ_{j=1}^m c_j v_j and v_j = Σ_{i=1}^n a_ij w_i, then v = Σ_{i=1}^n b_i w_i where b_i = Σ_{j=1}^m a_ij c_j, or, in vector language, b = A c. (b) Every v ∈ V can be written as a linear combination of v1, . . . , vn, and hence, by part (a), a linear combination of w1, . . . , wm, which shows that w1, . . . , wm also span V.

♦ 2.3.20. (a) If v = Σ_{i=1}^m a_i v_i, w = Σ_{i=1}^n b_i v_i, are two finite linear combinations, so is c v + d w = Σ_{i=1}^{max{m,n}} (c a_i + d b_i) v_i, where we set a_i = 0 if i > m and b_i = 0 if i > n. (b) The space P^(∞) of all polynomials, since every polynomial is a finite linear combination of monomials and vice versa.

2.3.21. (a) Linearly independent; (b) linearly dependent; (c) linearly dependent; (d) linearly independent; (e) linearly dependent; (f ) linearly dependent; (g) linearly dependent; (h) linearly independent; (i) linearly independent. 2.3.22. (a) The only solution to the homogeneous linear system 1 0 1 0 0 1 2 −2 1 C B C B B C 0C B −2 C B 3C C=0 C + c3 B B C + c2 B is c1 = c2 = c3 = 0. c1 B @ 1A @ −1 A @2A −1 1 1 (b) All but the second lie in the span. (c) a − c + d = 0. 2.3.23. (a) The only solution to the homogeneous linear system 1 0 1 0 1 0 1 0 1 1 1 1 B B C B C C B C 1C B −1 C B 1C B −1 C B C + c2 B C + c3 B C=0 C + c4 B A c = c1 B @ −1 A @1A @ 0A @ 0A 0 −1 1 0 0 1 1 1 1 1 B 1 1 −1 −1 C C B C is c = 0. with nonsingular coefficient matrix A = B @ 1 −1 0 1A 0 0 1 −1 (b) Since A is nonsingular, the inhomogeneous linear system 1 0 1 0 1 0 0 1 1 1 1 1 C B C B C B B C 1C B −1 C B −1 C B 1C C C + c4 B C + c3 B B C + c2 B v = A c = c1 B @ 0A @ 0A @ −1 A @1A −1 1 0 0 has a solution c = 0 A−11v for 0 any1v ∈ R04 . 1 1 1 1 B C B C B C B0C 3B1C 1B 1C B C = 8B C + 8B C+ (c) @0A @1A @ −1 A 1 0 0

0

1

1 B C 3 B −1 C C− 4B @ 0A 1

0

1

1 B C 1 B −1 C C 4B @ 0A −1

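2.3.24. (a) Linearly dependent; (b) linearly dependent; (c) linearly independent; (d) linearly dependent; (e) linearly dependent; (f) linearly independent.

2.3.25. False:
$\begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} - \begin{pmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{pmatrix} - \begin{pmatrix} 0 & 0 & 1 \\ 0 & 1 & 0 \\ 1 & 0 & 0 \end{pmatrix} - \begin{pmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{pmatrix} + \begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 1 & 0 & 0 \end{pmatrix} + \begin{pmatrix} 0 & 0 & 1 \\ 1 & 0 & 0 \\ 0 & 1 & 0 \end{pmatrix} = O.$

2.3.26. False: the zero vector always belongs to the span.

2.3.27. Yes, when it is the zero vector.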
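The dependency in Exercise 2.3.25 says that the even 3 × 3 permutation matrices and the odd ones have the same sum, so the signed sum of all six vanishes. The sketch below checks this directly; the helper names `perm_matrix` and `sign` are assumptions made for the illustration.

```python
from itertools import permutations

def perm_matrix(p):
    """Permutation matrix with a 1 in row i, column p[i]."""
    n = len(p)
    return [[1 if p[i] == j else 0 for j in range(n)] for i in range(n)]

def sign(p):
    """+1 for an even permutation, -1 for an odd one (parity of the inversion count)."""
    inversions = sum(
        1 for i in range(len(p)) for j in range(i + 1, len(p)) if p[i] > p[j]
    )
    return -1 if inversions % 2 else 1

# Signed sum of all six 3 x 3 permutation matrices.
total = [[0] * 3 for _ in range(3)]
for p in permutations(range(3)):
    m = perm_matrix(p)
    s = sign(p)
    for i in range(3):
        for j in range(3):
            total[i][j] += s * m[i][j]

print(total)  # a nontrivial combination giving the zero matrix means linear dependence
```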

2.3.28. Because x, y are linearly independent, 0 = c1 u + c2 v = (a c1 + c c2) x + (b c1 + d c2) y if and only if a c1 + c c2 = 0, b c1 + d c2 = 0. The latter linear system has a nonzero solution (c1, c2) ≠ 0, and so u, v are linearly dependent, if and only if the determinant of the coefficient matrix is zero: det $\begin{pmatrix} a & c \\ b & d \end{pmatrix}$ = a d − b c = 0, proving the result. The full collection x, y, u, v is linearly dependent since, for example, a x + b y − u + 0 v = 0 is a nontrivial linear combination.

2.3.29. The statement is false. For example, any set containing the zero element that does not span V is linearly dependent.

♦ 2.3.30. (b) If the only solution to A c = 0 is the trivial one c = 0, then the only linear combination which adds up to zero is the trivial one with c1 = · · · = ck = 0, proving linear independence. (c) The vector b lies in the span if and only if b = c1 v1 + · · · + ck vk = A c for some c, which implies that the linear system A c = b has a solution.

♦ 2.3.31. (a) Since v1, . . . , vn are linearly independent, 0 = c1 v1 + · · · + ck vk = c1 v1 + · · · + ck vk + 0 vk+1 + · · · + 0 vn if and only if c1 = · · · = ck = 0. (b) This is false. For example, v1 = ( 1, 1 )^T, v2 = ( 2, 2 )^T are linearly dependent, but the subset consisting of just v1 is linearly independent.

2.3.32. (a) They are linearly dependent since (x^2 − 3) + 2 (2 − x) − (x − 1)^2 ≡ 0. (b) They do not span P^(2).

2.3.33. (a) Linearly dependent; (b) linearly independent; (c) linearly dependent; (d) linearly independent; (e) linearly dependent; (f) linearly dependent; (g) linearly independent; (h) linearly independent; (i) linearly independent.

2.3.34. When x > 0, we have f(x) − g(x) ≡ 0, proving linear dependence. On the other hand, if c1 f(x) + c2 g(x) ≡ 0 for all x, then at, say, x = 1, we have c1 + c2 = 0, while at x = −1, we must have − c1 + c2 = 0, and so c1 = c2 = 0, proving linear independence.

♥ 2.3.35. (a) 0 = Σ_{i=1}^k c_i p_i(x) = Σ_{j=0}^n Σ_{i=1}^k c_i a_ij x^j if and only if Σ_{i=1}^k c_i a_ij = 0, j = 0, . . . , n, or, in matrix notation, A^T c = 0. Thus, the polynomials are linearly independent if and only if the linear system A^T c = 0 has only the trivial solution c = 0, if and only if its (n + 1) × k coefficient matrix has rank A^T = rank A = k. (b) q(x) = Σ_{j=0}^n b_j x^j = Σ_{i=1}^k c_i p_i(x) if and only if A^T c = b. (c) A = $\begin{pmatrix} -1 & 0 & 0 & 1 & 0 \\ 4 & -2 & 0 & 1 & 0 \\ 0 & -4 & 0 & 0 & 1 \\ 1 & 0 & 1 & 0 & 0 \\ 1 & 2 & 0 & 4 & -1 \end{pmatrix}$ has rank 4 and so they are linearly dependent. (d) q(x) is not in the span.

♦ 2.3.36. Suppose the linear combination p(x) = c0 + c1 x + c2 x^2 + · · · + cn x^n ≡ 0 for all x. Thus, every real x is a root of p(x), but the Fundamental Theorem of Algebra says this is only possible if p(x) is the zero polynomial with coefficients c0 = c1 = · · · = cn = 0.
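The rank claim in Exercise 2.3.35(c) can be checked by row reduction in exact arithmetic. The sketch below assumes that the rows of the coefficient matrix A read (−1, 0, 0, 1, 0), (4, −2, 0, 1, 0), (0, −4, 0, 0, 1), (1, 0, 1, 0, 0), (1, 2, 0, 4, −1); that reading is an assumption of this example, and the function name `rank` is likewise illustrative.

```python
from fractions import Fraction

def rank(rows):
    """Rank of a matrix via Gaussian elimination over the rationals."""
    m = [[Fraction(x) for x in r] for r in rows]
    r = 0  # number of pivots found so far
    for col in range(len(m[0])):
        pivot = next((i for i in range(r, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue  # no pivot in this column
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            factor = m[i][col] / m[r][col]
            m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
        if r == len(m):
            break
    return r

# Assumed reading of the 5 x 5 coefficient matrix in Exercise 2.3.35(c).
A = [
    [-1, 0, 0, 1, 0],
    [4, -2, 0, 1, 0],
    [0, -4, 0, 0, 1],
    [1, 0, 1, 0, 0],
    [1, 2, 0, 4, -1],
]

print(rank(A))  # a rank below 5 means the five polynomials are linearly dependent
```

With this reading, the third and fifth rows add up to three times the first row plus the second, which is the dependency behind the rank deficiency.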

♥ 2.3.37. (a) If c1 f1(x) + · · · + cn fn(x) ≡ 0, then c1 f1(xi) + · · · + cn fn(xi) = 0 at all sample points, and so c1 f1 + · · · + cn fn = 0. Thus, linear dependence of the functions implies linear dependence of their sample vectors. (b) Sampling f1(x) = 1 and f2(x) = x^2 at −1, 1 produces the linearly dependent sample vectors f1 = f2 = ( 1, 1 )^T. (c) Sampling at 0, 1/4 π, 1/2 π, 3/4 π, π leads to the linearly independent sample vectors ( 1, 1, 1, 1, 1 )^T, ( 1, √2/2, 0, − √2/2, −1 )^T, ( 0, √2/2, 1, √2/2, 0 )^T, ( 1, 0, −1, 0, 1 )^T, ( 0, 1, 0, −1, 0 )^T.

2.3.38. (a) Suppose c1 f1(t) + · · · + cn fn(t) ≡ 0 for all t. Then c1 f1(t0) + · · · + cn fn(t0) = 0, and hence, by linear independence of the sample vectors, c1 = · · · = cn = 0, which proves linear independence of the functions. (b) c1 f1(t) + c2 f2(t) = ( 2 c2 t + (c1 − c2), 2 c2 t^2 + (c1 − c2) t )^T ≡ 0 if and only if c2 = 0, c1 − c2 = 0, and so c1 = c2 = 0, proving linear independence. However, at any t0, the vectors f2(t0) = (2 t0 − 1) f1(t0) are scalar multiples of each other, and hence linearly dependent.

♥ 2.3.39. (a) Suppose c1 f(x) + c2 g(x) ≡ 0 for all x for some c = ( c1, c2 )^T ≠ 0. Differentiating, we find c1 f′(x) + c2 g′(x) ≡ 0 also, and hence $\begin{pmatrix} f(x) & g(x) \\ f'(x) & g'(x) \end{pmatrix} \begin{pmatrix} c_1 \\ c_2 \end{pmatrix} = 0$ for all x. The homogeneous system has a nonzero solution if and only if the coefficient matrix is singular, which requires its determinant W[ f(x), g(x) ] = 0. (b) This is the contrapositive of part (a), since if f, g were not linearly independent, then their Wronskian would vanish everywhere. (c) Suppose c1 f(x) + c2 g(x) = c1 x^3 + c2 | x |^3 ≡ 0. Then, at x = 1, c1 + c2 = 0, whereas at x = −1, − c1 + c2 = 0. Therefore, c1 = c2 = 0, proving linear independence. On the other hand, W[ x^3, | x |^3 ] = x^3 (3 x^2 sign x) − (3 x^2) | x |^3 ≡ 0.
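Part (c) of Exercise 2.3.39 shows that a vanishing Wronskian does not force linear dependence. The sketch below checks both halves on a grid of rational sample points; the grid and the function names are choices made for this illustration, and a finite check is of course not a proof.

```python
from fractions import Fraction

def f(x):
    return x**3

def df(x):
    return 3 * x**2

def g(x):
    return abs(x)**3

def dg(x):
    # Derivative of |x|^3, written as 3 x |x|, which equals 3 x^2 sign(x).
    return 3 * x * abs(x)

def wronskian(x):
    """W[f, g](x) = f g' - f' g."""
    return f(x) * dg(x) - df(x) * g(x)

# The Wronskian vanishes at every sample point from -2 to 2 in steps of 1/4.
samples = [Fraction(k, 4) for k in range(-8, 9)]
assert all(wronskian(x) == 0 for x in samples)

# Yet c1 f + c2 g = 0 at x = 1 and x = -1 forces c1 = c2 = 0, because the
# 2 x 2 matrix of sample values has nonzero determinant.
sample_det = f(1) * g(-1) - g(1) * f(-1)
print(sample_det)
```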

2.4.1. Only (a) and (c) are bases. 2.4.2. Only (b) is a basis. 0

1 0

1

1 0 C B C 2.4.3. (a) B @ 0 A , @ 1 A; 0 2

(b)

1 0 1 1 3 B4C B4C B C , B C; @ 1A @ 0A 0

0

(c)

1

0 B B B @

1 0

1 0

1

1 −1 −2 C B C B 1C C B 0C B0C C, B C. C, B 0A @ 1A @0A 1 0 0

2.4.4. (a) They do not span R^3 because the linear system A c = b with coefficient matrix A = $\begin{pmatrix} 1 & 3 & 2 & 4 \\ 0 & -1 & -1 & -1 \\ 2 & 1 & -1 & 3 \end{pmatrix}$ does not have a solution for all b since rank A = 2. (b) 4 vectors in R^3 are automatically linearly dependent. (c) No, because if v1, v2, v3, v4 don't span R^3, no subset of them will span it either. (d) 2, because v1 and v2 are linearly independent and span the subspace, and hence form a basis.

2.4.5. (a) They span R^3 because the linear system A c = b with coefficient matrix A = $\begin{pmatrix} 1 & 2 & 0 & 1 \\ -1 & -2 & -2 & 3 \\ 2 & 5 & 1 & -1 \end{pmatrix}$ has a solution for all b since rank A = 3. (b) 4 vectors in R^3 are automatically linearly dependent. (c) Yes, because v1, v2, v3 also span R^3 and so form a basis. (d) 3, because they span all of R^3.

2.4.6. (a) Solving the defining equation, the general vector in the plane is x = ( 2 y + 4 z, y, z )^T where y, z are arbitrary. We can write x = y ( 2, 1, 0 )^T + z ( 4, 0, 1 )^T = (y + 2 z) ( 2, −1, 1 )^T + (y + z) ( 0, 2, −1 )^T and hence both pairs of vectors span the plane. Both pairs are linearly independent since they are not parallel, and hence both form a basis. (b) ( 2, −1, 1 )^T = (− 1) ( 2, 1, 0 )^T + ( 4, 0, 1 )^T, ( 0, 2, −1 )^T = 2 ( 2, 1, 0 )^T − ( 4, 0, 1 )^T. (c) Any two linearly independent solutions, e.g., ( 10, 1, 2 )^T, ( 6, 1, 1 )^T, will form a basis.

♥ 2.4.7. (a) (i) Left handed basis; (ii) right handed basis; (iii) not a basis; (iv ) right handed basis. (b) Switching two columns or multiplying a column by −1 changes the sign of the determinant. (c) If det A = 0, its columns are linearly dependent and hence can’t form a basis.

2.4.8. “ ”T “ ”T , 13 , − 32 , 0, 1 ; dim = 2. (a) − 23 , 65 , 1, 0 (b) The condition p(1) = 0 says a + b + c = 0, so p(x) = (− b − c) x2 + b x + c = b (− x2 + x) + c(− x2 + 1). Therefore − x2 + x, − x2 + 1 is a basis, and so dim = 2. (c) ex , cos 2 x, sin 2 x, is a basis, so dim = 3. 0

1

3 C 2.4.9. (a) B @ 1 A, dim = 1; (b) −1

0

1 0

1

2 0 B C B C @ 0 A, @ −1 A, dim = 2; (c) 1 3

0

1

1 0

0

1 0

1

1

B C B C B C B 0 C B 1 C B −2 C B C, B C, B C, @ −1 A @ 1 A @ 1 A

2

3

dim = 3.

1

2.4.10. (a) We have a + b t + c t^2 = c1 (1 + t^2) + c2 (t + t^2) + c3 (1 + 2 t + t^2) provided a = c1 + c3, b = c2 + 2 c3, c = c1 + c2 + c3. The coefficient matrix of this linear system, $\begin{pmatrix} 1 & 0 & 1 \\ 0 & 1 & 2 \\ 1 & 1 & 1 \end{pmatrix}$, is nonsingular, and hence there is a solution for any a, b, c, proving that they span the space of quadratic polynomials. Also, they are linearly independent since the linear combination is zero if and only if c1, c2, c3 satisfy the corresponding homogeneous linear system c1 + c3 = 0, c2 + 2 c3 = 0, c1 + c2 + c3 = 0, and hence c1 = c2 = c3 = 0. (Or, you can use the fact that dim P^(2) = 3 and the spanning property to conclude that they form a basis.)

(b) 1 + 4 t + 7 t^2 = 2 (1 + t^2) + 6 (t + t^2) − (1 + 2 t + t^2).
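The coordinates in Exercise 2.4.10(b) come from solving the 3 × 3 system a = c1 + c3, b = c2 + 2 c3, c = c1 + c2 + c3 with (a, b, c) = (1, 4, 7). A minimal sketch of that computation by Gauss-Jordan elimination in exact arithmetic follows; the function name `solve` is illustrative, and the routine assumes the coefficient matrix is nonsingular.

```python
from fractions import Fraction

def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination; a must be square and nonsingular."""
    n = len(a)
    # Augmented matrix [a | b] with exact rational entries.
    m = [[Fraction(x) for x in row] + [Fraction(y)] for row, y in zip(a, b)]
    for col in range(n):
        # Find a row with a nonzero entry in this column (fails if a is singular).
        pivot = next(i for i in range(col, n) if m[i][col] != 0)
        m[col], m[pivot] = m[pivot], m[col]
        p = m[col][col]
        m[col] = [x / p for x in m[col]]
        for i in range(n):
            if i != col:
                factor = m[i][col]
                m[i] = [x - factor * y for x, y in zip(m[i], m[col])]
    return [row[n] for row in m]

# Coefficient matrix from part (a): rows encode a, b, c in terms of c1, c2, c3.
A = [[1, 0, 1], [0, 1, 2], [1, 1, 1]]
coords = solve(A, [1, 4, 7])  # coordinates of 1 + 4 t + 7 t^2
print(coords)
```

The resulting coordinates reproduce the combination 2 (1 + t^2) + 6 (t + t^2) − (1 + 2 t + t^2) stated in part (b).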

2.4.11. (a) a + b t + c t^2 + d t^3 = c1 + c2 (1 − t) + c3 (1 − t)^2 + c4 (1 − t)^3 provided a = c1 + c2 + c3 + c4, b = − c2 − 2 c3 − 3 c4, c = c3 + 3 c4, d = − c4. The coefficient matrix $\begin{pmatrix} 1 & 1 & 1 & 1 \\ 0 & -1 & -2 & -3 \\ 0 & 0 & 1 & 3 \\ 0 & 0 & 0 & -1 \end{pmatrix}$ is nonsingular, and hence they span P^(3). Also, they are linearly independent since the linear combination is zero if and only if c1, c2, c3, c4 satisfy the corresponding homogeneous linear system, whose only solution is c1 = c2 = c3 = c4 = 0. (Or, you can use the fact that dim P^(3) = 4 and the spanning property to conclude that they form a basis.) (b) 1 + t^3 = 2 − 3 (1 − t) + 3 (1 − t)^2 − (1 − t)^3.

2.4.12. (a) They are linearly dependent because 2 p1 − p2 + p3 ≡ 0. (b) The dimension is 2, since p1, p2 are linearly independent and span the subspace, and hence form a basis.

2.4.13. (a) The sample vectors ( 1, 1, 1, 1 )^T, ( 1, √2/2, 0, − √2/2 )^T, ( 1, 0, −1, 0 )^T, ( 1, − √2/2, 0, √2/2 )^T are linearly independent and hence form a basis for R^4, the space of sample functions. (b) Sampling x produces ( 0, 1/4, 1/2, 3/4 )^T = 1/2 ( 1, 1, 1, 1 )^T − (2 + √2)/8 ( 1, √2/2, 0, − √2/2 )^T − (2 − √2)/8 ( 1, − √2/2, 0, √2/2 )^T.

2.4.14. (a) E11 = $\begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}$, E12 = $\begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}$, E21 = $\begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix}$, E22 = $\begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix}$ is a basis since we can uniquely write any $\begin{pmatrix} a & b \\ c & d \end{pmatrix}$ = a E11 + b E12 + c E21 + d E22. (b) Similarly, the matrices Eij with a 1 in position (i, j) and all other entries 0, for i = 1, . . . , m, j = 1, . . . , n, form a basis for M_{m×n}, which therefore has dimension m n.

2.4.15. k ≠ −1, 2.

2.4.16. A basis is given by the matrices Eii, i = 1, . . . , n, which have a 1 in the ith diagonal position and all other entries 0.

2.4.17. (a) E11 = $\begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}$, E12 = $\begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}$, E22 = $\begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix}$; dimension = 3. (b) A basis is given by the matrices Eij with a 1 in position (i, j) and all other entries 0 for 1 ≤ i ≤ j ≤ n, so the dimension is 1/2 n(n + 1).

2.4.18. (a) Symmetric: dim = 3; skew-symmetric: dim = 1; (b) symmetric: dim = 6; skew-symmetric: dim = 3; (c) symmetric: dim = 1/2 n(n + 1); skew-symmetric: dim = 1/2 n(n − 1).

♥ 2.4.19. (a) If a row (column) of A adds up to a and the corresponding row (column) of B adds up to b, then the corresponding row (column) of C = A + B adds up to c = a + b. Thus, if all row and column sums of A and B are the same, the same is true for C. Similarly, the row (column) sums of c A are c times the row (column) sums of A, and hence all the same if A is a semi-magic square.

0

a

1

b c e fC (b) A matrix A = A is a semi-magic square if and only if g h j a + b + c = d + e + f = g + h + j = a + d + e = b + e + h = c + f + j. The general solution to this system is 0 1 0 0 1 0 1 0 1 1 −1 0 0 0 1 1 0 −1 0 1 −1 1 1 B B C B B C A = eB 1 0C 1C @ −1 A + f @ −1 0 A + g@ 1 0 0A + h@1 0 0A + j @1 0 0 0 0 0 0 0 1 0 0 0 1 0 0 0 0 0 0 1 1 1 1 0 0 0 1 0 0 0 1 B B C C C = (e − g) B @ 0 1 0 A + (g + j − e) @ 1 0 0 A + g @ 0 1 0 A + 0 0 1 0 0 1 1 0 0 0 0 1 1 1 0 0 0 0 1 C B C +fB @ 0 0 1 A + (h − f ) @ 1 0 0 A , 0 1 0 0 1 0 which is a linear combination of permutation matrices. (c) The dimension is 5, with any 5 of the 6 permutation matrices forming a basis. (d) Yes, by the same reasoning as 1 in part (a). Its dimension is 3, with basis 0 1 0 0 1 2 2 −1 2 −1 2 −1 2 2 B B B C 4C 1 1C @ −2 1 A, @1 A , @ 4 1 −2 A. 3 0 0 0 01 3 0 0 0 10 30 1 2 2 −1 2 −1 2 −1 2 2 B C B C B 1 1 A + c3 @ 4 1 −2 C (e) A = c1 @ −2 1 4 A + c2 @ 1 A for any c1 , c2 , c3 . 0 3 0 3 0 0 0 0 3 B @d

!

!

!

1

0 0C A 1
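The decomposition in Exercise 2.4.19(b) can be spot-checked numerically. The following is a minimal sketch, not part of the original solution: it builds the general 3 × 3 semi-magic square from the five free entries e, f, g, h, j and rebuilds it from five permutation matrices. The function names are illustrative choices, and the sample values are arbitrary.

```python
def perm_matrix(p):
    """3x3 permutation matrix with a 1 in row i, column p[i]."""
    return [[1 if p[i] == j else 0 for j in range(3)] for i in range(3)]


def semi_magic(e, f, g, h, j):
    """General 3x3 semi-magic square in terms of the free entries e, f, g, h, j."""
    return [[e + f - g, g + j - e, g + h - f],
            [g + h + j - e - f, e, f],
            [g, h, j]]


def is_semi_magic(A):
    """True when all three row sums and all three column sums agree."""
    sums = [sum(row) for row in A] + [sum(col) for col in zip(*A)]
    return len(set(sums)) == 1


def from_permutations(e, f, g, h, j):
    """Rebuild the same square as a combination of five permutation matrices."""
    terms = [
        (e - g, (0, 1, 2)),      # identity
        (g + j - e, (1, 0, 2)),  # identity with rows 1 and 2 swapped
        (g, (2, 1, 0)),          # anti-diagonal
        (f, (0, 2, 1)),          # identity with rows 2 and 3 swapped
        (h - f, (2, 0, 1)),      # cyclic permutation
    ]
    total = [[0] * 3 for _ in range(3)]
    for coeff, p in terms:
        P = perm_matrix(p)
        for r in range(3):
            for c in range(3):
                total[r][c] += coeff * P[r][c]
    return total


if __name__ == "__main__":
    sample = (3, 1, 4, 1, 5)  # arbitrary values for e, f, g, h, j
    A = semi_magic(*sample)
    print(A, is_semi_magic(A), from_permutations(*sample) == A)
```

Because both constructions are linear in e, f, g, h, j, agreement on a handful of sample values is only a sanity check; the algebraic identity above is what proves the claim.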

♦ 2.4.20. For instance, take v1 = ( 1, 0 )^T, v2 = ( 0, 1 )^T, v3 = ( 1, 1 )^T. Then ( 2, 1 )^T = 2 v1 + v2 = v1 + v3. In fact, there are infinitely many different ways of writing this vector as a linear combination of v1, v2, v3.

♦ 2.4.21.
(a) By Theorem 2.31, we only need prove linear independence. If 0 = c1 A v1 + · · · + cn A vn = A(c1 v1 + · · · + cn vn), then, since A is nonsingular, c1 v1 + · · · + cn vn = 0, and hence c1 = · · · = cn = 0.
(b) A ei is the ith column of A, and so a basis consists of the column vectors of the matrix.

♦ 2.4.22. Since V ≠ {0}, at least one vi ≠ 0. Let v_{i1} ≠ 0 be the first nonzero vector in the list v1, . . . , vn. Then, for each k = i1, . . . , n − 1, suppose we have selected linearly independent vectors v_{i1}, . . . , v_{ij} from among v1, . . . , vk. If v_{i1}, . . . , v_{ij}, v_{k+1} form a linearly independent set, we set v_{i(j+1)} = v_{k+1}; otherwise, v_{k+1} is a linear combination of v_{i1}, . . . , v_{ij}, and is not needed in the basis. The resulting collection v_{i1}, . . . , v_{im} forms a basis for V since they are linearly independent by design, and span V since each vi either appears in the basis, or is a linear combination of the basis elements that were selected before it. We have dim V = n if and only if v1, . . . , vn are linearly independent and so form a basis for V.

♦ 2.4.23. This is a special case of Exercise 2.3.31(a).

♦ 2.4.24.
(a) m ≤ n as otherwise v1, . . . , vm would be linearly dependent. If m = n then v1, . . . , vn are linearly independent and hence, by Theorem 2.31, span all of R^n. Since every vector in their span also belongs to V, we must have V = R^n.
(b) Starting with the basis v1, . . . , vm of V with m < n, we choose any v_{m+1} ∈ R^n \ V. Since v_{m+1} does not lie in the span of v1, . . . , vm, the vectors v1, . . . , v_{m+1} are linearly independent and span an (m + 1)-dimensional subspace of R^n. Unless m + 1 = n we can then choose another vector v_{m+2} not in the span of v1, . . . , v_{m+1}, and so v1, . . . , v_{m+2} are also linearly independent. We continue on in this fashion until we arrive at n linearly independent vectors v1, . . . , vn, which necessarily form a basis of R^n.
(c) (i) ( 1, 1, 1/2 )^T, ( 1, 0, 0 )^T, ( 0, 1, 0 )^T; (ii) ( 1, 0, −1 )^T, ( 0, 1, −2 )^T, ( 1, 0, 0 )^T.
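The selection procedure described in Exercise 2.4.22 (walk through a spanning list and keep each vector that is independent of those already kept) can be sketched in code. This is an illustrative sketch only: independence is tested here by comparing ranks computed with exact rational elimination, which is one possible choice, and the names `rank` and `select_basis` are hypothetical.

```python
from fractions import Fraction


def rank(vectors):
    """Rank of a list of vectors, computed by Gaussian elimination over the rationals."""
    rows = [[Fraction(x) for x in v] for v in vectors]
    ncols = len(rows[0]) if rows else 0
    r = 0  # number of pivots found so far
    for col in range(ncols):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][col] / rows[r][col]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r


def select_basis(vectors):
    """Keep each vector that is not a combination of the vectors kept before it."""
    basis = []
    for v in vectors:
        if rank(basis + [v]) > len(basis):
            basis.append(v)
    return basis


if __name__ == "__main__":
    spanning = [(0, 0, 0), (1, 2, 3), (2, 4, 6), (0, 1, 1), (1, 3, 4), (0, 0, 1)]
    print(select_basis(spanning))
```

The zero vector and any vector that depends on earlier selections are skipped, so the result is linearly independent by construction and spans the same subspace as the input list.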

♦ 2.4.25.
(a) If dim V = ∞, then the inequality is trivial. Also, if dim W = ∞, then one can find infinitely many linearly independent elements in W, but these are also linearly independent as elements of V and so dim V = ∞ also. Otherwise, let w1, . . . , wn form a basis for W. Since they are linearly independent, Theorem 2.31 implies n ≤ dim V.
(b) Since w1, . . . , wn are linearly independent, if n = dim V, then by Theorem 2.31, they form a basis for V. Thus every v ∈ V can be written as a linear combination of w1, . . . , wn, and hence, since W is a subspace, v ∈ W too. Therefore, W = V.
(c) Example: V = C^0[ a, b ] and W = P^(∞).

♦ 2.4.26.
(a) Every v ∈ V can be uniquely decomposed as v = w + z where w ∈ W, z ∈ Z. Write w = c1 w1 + · · · + cj wj and z = d1 z1 + · · · + dk zk. Then v = c1 w1 + · · · + cj wj + d1 z1 + · · · + dk zk, proving that w1, . . . , wj, z1, . . . , zk span V. Moreover, by uniqueness, v = 0 if and only if w = 0 and z = 0, and so the only linear combination that sums up to 0 ∈ V is the trivial one c1 = · · · = cj = d1 = · · · = dk = 0, which proves linear independence of the full collection.
(b) This follows immediately from part (a): dim V = j + k = dim W + dim Z.

♦ 2.4.27. Suppose the functions are linearly independent. This means that for every 0 ≠ c = ( c1, c2, . . . , cn )^T ∈ R^n, there is a point x_c ∈ R such that $\sum_{i=1}^{n} c_i f_i(x_c) \neq 0$. The assumption says that {0} ≠ V_{x1,...,xm} for all choices of sample points. Recursively define the following sample points. Choose x1 so that f1(x1) ≠ 0. (This is possible since if f1(x) ≡ 0, then the functions are linearly dependent.) Thus V_{x1} ⊊ R^n since e1 ∉ V_{x1}. Then, for each m = 1, 2, . . . , given x1, . . . , xm, choose 0 ≠ c0 ∈ V_{x1,...,xm}, and set x_{m+1} = x_{c0}. Then c0 ∉ V_{x1,...,x(m+1)} ⊊ V_{x1,...,xm} and hence, by induction, dim V_{x1,...,xm} ≤ n − m. In particular, dim V_{x1,...,xn} = 0, so V_{x1,...,xn} = {0}, which contradicts our assumption and proves the result. Note that the proof implies we only need check linear dependence at all possible collections of n sample points to conclude that the functions are linearly dependent.

2.5.1.
(a) Range: all b = ( b1, b2 )^T such that 3/4 b1 + b2 = 0; kernel spanned by ( 1/2, 1 )^T.
(b) Range: all b = ( b1, b2 )^T such that 2 b1 + b2 = 0; kernel spanned by ( 1, 1, 0 )^T, ( −2, 0, 1 )^T.
(c) Range: all b = ( b1, b2, b3 )^T such that −2 b1 + b2 + b3 = 0; kernel spanned by ( −5/4, −7/8, 1 )^T.
(d) Range: all b = ( b1, b2, b3, b4 )^T such that −2 b1 − b2 + b3 = 2 b1 + 3 b2 + b4 = 0; kernel spanned by ( 1, 1, 1, 0 )^T, ( −1, 0, 0, 1 )^T.

1 0

0

1

1 −5 B 2C B2C C B B 2.5.2. (a) @ 0 A, @ 1 C A: plane; 1 0 0

0

1

−1 C (d) B @ −2 A: line; 1

(b)

0

1 1 B4C B3C B C: @8A

0 C (e) B @ 0 A: point; 0

2.5.3.

(a) Kernel spanned by

0

3

1

B C B1C B C; @0A

1 1 B3C B 5 C: @3A 0

(f )

0

1

−1 C 2.5.4. (a) b = B @ 2 A; −1

range spanned by

0

1

line.

1

0 (b) compatibility: − a + 41 b + c = 0. 1 2

1 0

2 −3 C B C (c) B @ 0 A, @ 1 A: plane; 1 0

line;

1

1

0

0

1

1 0

2

1 0

B C B C B @ 2 A, @ 0 A, @

0

1

1

0 2C A; −3

1

1+t C (b) x = B @ 2 + t A where t is arbitrary. 3+t

2.5.5. In each case, the solution is x = x⋆ + z, where x⋆ is the particular solution and z belongs to the kernel: 0 1 1 0 1 0 0 1 0 1 2 − 1 −3 1 1 B 7C C C B B C C 1C (b) x⋆ = B (a) x⋆ = B @ −1 A, z = z B @ 0 A, z = y @ 1 A + z @ 0 A; @ 7 A; 0 1 0 0 1 (c) x⋆ =

(f ) x⋆ =

0

7 B−9

1

C B 2 C C, B @ 9 A 10 9 0 11 1 2 B 1 C C B B 2 C, C B @ 0 A

0

0

1

2 C z = zB @ 2 A; 1 0

− 13 2 B B −3 2 z = rB B @ 1 0

(d) x⋆ =

1

0

− 32 C B 1 C B C + sB − 2 B C @ 0 A 1

0 B B @

1

51 6C 1C A, 2 −3

C C C; C A

z = 0; (e) x⋆ = 0

1

!

!

2 −1 ; , z=v 1 0 0

1

0

1

3 6 −4 B C B C B C B2C B2C B −1 C ⋆ C, z = z B C + w B C. (g) x = B @0A @1A @ 0A 0 0 1

2.5.6. The ith entry of A ( 1, 1, . . . , 1 )^T is a_{i1} + · · · + a_{in}, which is n times the average of the entries in the ith row. Thus, A ( 1, 1, . . . , 1 )^T = 0 if and only if each row of A has average 0.

2.5.7. The kernel has dimension n − 1, with basis −r^{k−1} e1 + ek = ( −r^{k−1}, 0, . . . , 0, 1, 0, . . . , 0 )^T for k = 2, . . . , n. The range has dimension 1, with basis ( 1, r^n, r^{2n}, . . . , r^{(n−1)n} )^T.

♦ 2.5.8.
(a) If w = P w, then w ∈ rng P. On the other hand, if w ∈ rng P, then w = P v for some v. But then P w = P² v = P v = w.
(b) Given v, set w = P v. Then v = w + z where z = v − w ∈ ker P since P z = P v − P w = P v − P² v = P v − P v = 0. Moreover, if w ∈ ker P ∩ rng P, then 0 = P w = w, and so ker P ∩ rng P = {0}, proving complementarity.

2.5.9. False. For example, if $A = \begin{pmatrix} 1 & -1 \\ 1 & -1 \end{pmatrix}$ then ( 1, 1 )^T is in both ker A and rng A.
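The splitting v = w + z used in Exercise 2.5.8(b) is easy to check on a concrete idempotent matrix. The sketch below is illustrative only; the matrix `P_EXAMPLE` is a hypothetical choice satisfying P² = P and is not taken from the exercise.

```python
def mat_vec(P, v):
    """Matrix-vector product for matrices stored as lists of rows."""
    return [sum(p * x for p, x in zip(row, v)) for row in P]


def mat_mul(A, B):
    """Matrix-matrix product for matrices stored as lists of rows."""
    cols = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in A]


def split(P, v):
    """Return (w, z) with w = P v in rng P and z = v - w in ker P."""
    w = mat_vec(P, v)
    z = [x - y for x, y in zip(v, w)]
    return w, z


# Hypothetical idempotent matrix (P * P == P); any such P would do.
P_EXAMPLE = [[1, 1],
             [0, 0]]

if __name__ == "__main__":
    w, z = split(P_EXAMPLE, [3, 4])
    print("w =", w, "z =", z)
    print("P w =", mat_vec(P_EXAMPLE, w), "P z =", mat_vec(P_EXAMPLE, z))
```

By contrast, the matrix in Exercise 2.5.9 satisfies A² = 0 rather than A² = A, which is why its kernel and range can share a nonzero vector.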

♦ 2.5.10. Let r1, . . . , r_{m+k} be the rows of C, so r1, . . . , rm are the rows of A. For v ∈ ker C, the ith entry of C v = 0 is ri v = 0, but then this implies A v = 0 and so v ∈ ker A. As an example, A = ( 1 0 ) has kernel spanned by ( 0, 1 )^T, while $C = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$ has ker C = {0}.

♦ 2.5.11. If b = A x ∈ rng A, then b = C z where $z = \begin{pmatrix} x \\ 0 \end{pmatrix}$, and so b ∈ rng C. As an example, $A = \begin{pmatrix} 0 \\ 0 \end{pmatrix}$ has rng A = {0}, while the range of $C = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}$ is the x axis.

2.5.12. x⋆1 = ( −2, 3/2 )^T, x⋆2 = ( −1, 1/2 )^T; x = x⋆1 + 4 x⋆2 = ( −6, 7/2 )^T.

2.5.13. x⋆ = 2 x⋆1 + x⋆2 = ( −1, 3, 3 )^T.

2.5.14.
(a) By direct matrix multiplication: A x⋆1 = A x⋆2 = ( 1, −3, 5 )^T.
(b) The general solution is x = x⋆1 + t (x⋆2 − x⋆1) = (1 − t) x⋆1 + t x⋆2 = ( 1, 1, 0 )^T + t ( −4, 2, −2 )^T.

2.5.15. 5 meters.

2.5.16. The mass will move 6 units in the horizontal direction and −6 units in the vertical direction.

2.5.17. x = c1 x⋆1 + c2 x⋆2 where c1 = 1 − c2.
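Exercises 2.5.14 and 2.5.17 both rest on the fact that (1 − t) x⋆1 + t x⋆2 solves A x = b whenever x⋆1 and x⋆2 do. The sketch below illustrates this with the particular solutions ( 1, 1, 0 )^T and ( −3, 3, −2 )^T and right-hand side ( 1, −3, 5 )^T read off from 2.5.14. The coefficient matrix of the exercise is not restated in this manual, so `A_STAND_IN` is a hypothetical matrix chosen only so that both vectors solve the system.

```python
from fractions import Fraction

# Hypothetical stand-in for the exercise's coefficient matrix (an assumption):
# it was chosen so that A x1 = A x2 = b for the vectors below.
A_STAND_IN = [[1, 0, -2],
              [-1, -2, 0],
              [2, 3, -1]]
X1 = [1, 1, 0]
X2 = [-3, 3, -2]
B = [1, -3, 5]


def mat_vec(A, v):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]


def affine_combination(t):
    """The vector (1 - t) x1 + t x2, i.e. x1 + t (x2 - x1)."""
    return [(1 - t) * a + t * b for a, b in zip(X1, X2)]


if __name__ == "__main__":
    for t in (Fraction(0), Fraction(1), Fraction(1, 2), Fraction(-3)):
        x = affine_combination(t)
        print(t, x, mat_vec(A_STAND_IN, x) == B)
```

A combination c1 x⋆1 + c2 x⋆2 with c1 + c2 ≠ 1 generally fails, since it produces (c1 + c2) b on the right-hand side.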

2.5.18. False: in general, (A + B) x⋆ = (A + B) x⋆1 + (A + B) x⋆2 = c + d + B x⋆1 + A x⋆2 , and the third and fourth terms don’t necessarily add up to 0. ♦ 2.5.19. rng A = R n , and so A must be a nonsingular matrix. ♦ 2.5.20. (a) If A xi = ei , then xi = A−1 ei which, by (2.13), is the ith column of the matrix A−1 . 0 0 1 1 1 0 1 1 − 12 2 2 B B C C C B B B C C (b) The solutions to A xi = ei in this case are x1 = B 2 C A, x2 = @ − 1 A, x3 = @ − 1 A, @ which are the columns of A

2.5.21.

!

−1

=

0 B B @

1 2

− 21 −1

2 −

1 2

1 2

!

!

1 12

−

−1

1 2C −1 C A. 1 2

!

1 2

(a) range: ( 1, 2 )^T; corange: ( 1, −3 )^T; kernel: ( 3, 1 )^T; cokernel: ( −2, 1 )^T.
(b) range: ( 0, 1, 2 )^T, ( −8, −1, 6 )^T; corange: ( 1, 2, −1 )^T, ( 0, 0, −8 )^T; kernel: ( −2, 1, 0 )^T; cokernel: ( 1, −2, 1 )^T.
(c) range: ( 1, 1, 2 )^T, ( 1, 0, 3 )^T; corange: ( 1, 1, 2, 1 )^T, ( 0, −1, −3, 2 )^T; kernel: ( 1, −3, 1, 0 )^T, ( −3, 2, 0, 1 )^T; cokernel: ( −3, 1, 1 )^T.
(d) range: ( 1, 0, 2, 3, 1 )^T, ( −3, 3, −3, −3, 0 )^T, ( 1, −2, 0, 3, 3 )^T; corange: ( 1, −3, 2, 2, 1 )^T, ( 0, 3, −6, 0, −2 )^T, ( 0, 0, 0, 0, 4 )^T; kernel: ( 4, 2, 1, 0, 0 )^T, ( −2, 0, 0, 1, 0 )^T; cokernel: ( −2, −1, 1, 0, 0 )^T, ( 2, 1, 0, −1, 1 )^T.

2.5.22. ( −1, 2, −3 )^T, ( 0, 1, 2 )^T, ( −3, 1, 0 )^T, which are its first, third and fourth columns.
Second column: ( 2, −4, 6 )^T = −2 ( −1, 2, −3 )^T; fifth column: ( 5, −4, 8 )^T = −2 ( −1, 2, −3 )^T + ( 0, 1, 2 )^T − ( −3, 1, 0 )^T.

2.5.23. range: ( 1, 2, −3 )^T, ( 0, 4, 1 )^T; corange: ( 1, −3, 0 )^T, ( 0, 0, 4 )^T; second column: ( −3, −6, 9 )^T = −3 ( 1, 2, −3 )^T; second and third rows: ( 2, −6, 4 )^T = 2 ( 1, −3, 0 )^T + ( 0, 0, 4 )^T, ( −3, 9, 1 )^T = −3 ( 1, −3, 0 )^T + 1/4 ( 0, 0, 4 )^T.
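The bases quoted in Exercise 2.5.23 can be reproduced by row reduction. The sketch below is a minimal illustration using exact rational arithmetic. The matrix `A_EXAMPLE` is an assumption: it is assembled from the rows and columns listed in the answer, not copied from the exercise statement, and the function names are hypothetical.

```python
from fractions import Fraction

# Assumed coefficient matrix, pieced together from the rows and columns in the answer.
A_EXAMPLE = [[1, -3, 0],
             [2, -6, 4],
             [-3, 9, 1]]


def rref(A):
    """Reduced row echelon form and the list of pivot column indices."""
    M = [[Fraction(x) for x in row] for row in A]
    pivots = []
    r = 0  # index of the next pivot row
    for c in range(len(M[0])):
        p = next((i for i in range(r, len(M)) if M[i][c] != 0), None)
        if p is None:
            continue
        M[r], M[p] = M[p], M[r]
        M[r] = [x / M[r][c] for x in M[r]]
        for i in range(len(M)):
            if i != r and M[i][c] != 0:
                M[i] = [a - M[i][c] * b for a, b in zip(M[i], M[r])]
        pivots.append(c)
        r += 1
        if r == len(M):
            break
    return M, pivots


def kernel_basis(A):
    """One kernel basis vector per free (non-pivot) column."""
    M, pivots = rref(A)
    n = len(A[0])
    basis = []
    for free in (c for c in range(n) if c not in pivots):
        v = [Fraction(0)] * n
        v[free] = Fraction(1)
        for row_index, p in enumerate(pivots):
            v[p] = -M[row_index][free]
        basis.append(v)
    return basis


def range_basis(A):
    """The pivot columns of the original matrix."""
    _, pivots = rref(A)
    return [[row[c] for row in A] for c in pivots]


if __name__ == "__main__":
    print("pivot columns:", rref(A_EXAMPLE)[1])
    print("range basis:", range_basis(A_EXAMPLE))
    print("kernel basis:", kernel_basis(A_EXAMPLE))
```

The cokernel and corange can be obtained the same way by applying `kernel_basis` and `range_basis` to the transpose, although the corange basis found that way consists of rows of A rather than rows of its echelon form.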

2.5.24.
(i) rank = 1; dim rng A = dim corng A = 1, dim ker A = dim coker A = 1; kernel basis: ( −2, 1 )^T; cokernel basis: ( 2, 1 )^T; compatibility conditions: 2 b1 + b2 = 0; example: b = ( 1, −2 )^T, with solution x = ( 1, 0 )^T + z ( −2, 1 )^T.
(ii) rank = 1; dim rng A = dim corng A = 1, dim ker A = 2, dim coker A = 1; kernel basis: ( 1/3, 1, 0 )^T, ( 2/3, 0, 1 )^T; cokernel basis: ( 2, 1 )^T; compatibility conditions: 2 b1 + b2 = 0; example: b = ( 3, −6 )^T, with solution x = ( 1, 0, 0 )^T + y ( 1/3, 1, 0 )^T + z ( 2/3, 0, 1 )^T.
(iii) rank = 2; dim rng A = dim corng A = 2, dim ker A = 0, dim coker A = 1; kernel: {0}; cokernel basis: ( −20/13, 3/13, 1 )^T; compatibility conditions: −20/13 b1 + 3/13 b2 + b3 = 0; example: b = ( 1, −2, 2 )^T, with solution x = ( 1, 0 )^T.
(iv) rank = 2; dim rng A = dim corng A = 2, dim ker A = dim coker A = 1; kernel basis: ( −2, −1, 1 )^T; cokernel basis: ( −2, 1, 1 )^T; compatibility conditions: −2 b1 + b2 + b3 = 0; example: b = ( 2, 1, 3 )^T, with solution x = ( 1, 0, 0 )^T + z ( −2, −1, 1 )^T.
(v) rank = 2; dim rng A = dim corng A = 2, dim ker A = 1, dim coker A = 2; kernel basis: ( −1, −1, 1 )^T; cokernel basis: ( −9/4, 1/4, 1, 0 )^T, ( 1/4, −1/4, 0, 1 )^T; compatibility: −9/4 b1 + 1/4 b2 + b3 = 0, 1/4 b1 − 1/4 b2 + b4 = 0; example: b = ( 2, 6, 3, 1 )^T, with solution x = ( 1, 0, 0 )^T + z ( −1, −1, 1 )^T.
(vi) rank = 3; dim rng A = dim corng A = 3, dim ker A = dim coker A = 1; kernel basis: ( 13/4, 13/8, −7/2, 1 )^T; cokernel basis: ( −1, −1, 1, 1 )^T; compatibility conditions: −b1 − b2 + b3 + b4 = 0; example: b = ( 1, 3, 1, 3 )^T, with solution x = ( 1, 0, 0, 0 )^T + w ( 13/4, 13/8, −7/2, 1 )^T.
(vii) rank = 4; dim rng A = dim corng A = 4, dim ker A = 1, dim coker A = 0; kernel basis: ( −2, 1, 0, 0, 0 )^T; cokernel is {0}; no conditions; example: b = ( 2, 1, 3, −3 )^T, with x = ( 1, 0, 0, 0, 0 )^T + y ( −2, 1, 0, 0, 0 )^T.

0

1 0

0

1

1

1 2 1 C B C C 2.5.25. (a) dim = 2; basis: B (b) dim = 1; basis: B @ 2 A, @ 2 A; @ 1 A; −1 0 −1 1 0 0 1 0 1 0 1 1 0 1 0 1 1 1 2 0 1 B C B C B C B C C B C B 0C B0C B2C B 0 C B 1 C B −3 C B C, B C, B C; C; C, B C, B B (c) dim = 3; basis: B (d) dim = 3; basis: @1A @0A @1A @ −3 A @ 2 A @ −8 A 0 1 0 7 −3 2 0 1 0 1 0 1 1 2 1 B 1 C B −1 C B 3 C B C B C B C C B C B C B −1 C, B 2 C, B −1 C. (e) dim = 3; basis: B B C B C B C @ 1A @ 2A @ 2A 1 1 1 0

1 0

1 0

1

1 0

0 0 −3 1 C C B C B B C B B1C B 0C B2C B 4C C; the dimension is 3. C, B C, B C, B 2.5.26. It’s the span of B @ 0 A @ 1 A @ 3 A @ −1 A −1 1 0 0 0

1 0

1

0 2 C B C B B 0 C B −1 C C; C, B 2.5.27. (a) B @1A @ 0A 1 0

0

1 0

1

0 1 C B C B B 1 C B −1 C C; C, B (b) B @1A @ 0A 1 0


0

1

−1 C B B 3C C. (c) B @ 0A 1

2.5.28. First method: ( 1, 0, 2, 1 )^T, ( 2, 3, −4, 5 )^T; second method: ( 1, 0, 2, 1 )^T, ( 0, 3, −8, 3 )^T. The first vectors are the same, while
( 2, 3, −4, 5 )^T = 2 ( 1, 0, 2, 1 )^T + ( 0, 3, −8, 3 )^T;
( 0, 3, −8, 3 )^T = −2 ( 1, 0, 2, 1 )^T + ( 2, 3, −4, 5 )^T.

2.5.29. Both sets are linearly independent and hence span a three-dimensional subspace of R^4. Moreover, w1 = v1 + v3, w2 = v1 + v2 + 2 v3, w3 = v1 + v2 + v3 all lie in the span of v1, v2, v3 and hence, by Theorem 2.31(d), also form a basis for the subspace.

2.5.30.
(a) If A = A^T, then ker A = { A x = 0 } = { A^T x = 0 } = coker A, and rng A = { A x } = { A^T x } = corng A.
(b) ker A = coker A has basis ( 2, −1, 1 )^T; rng A = corng A has basis ( 1, 2, 0 )^T, ( 2, 6, 2 )^T.
(c) No. For instance, if A is any nonsingular matrix, then ker A = coker A = {0} and rng A = corng A = R^3.

2.5.31.
(a) Yes. This is our method of constructing the basis for the range, and the proof is outlined in the text.
(b) No. For example, if $A = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \end{pmatrix}$, then $U = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix}$ and the first three rows of U form a basis for the three-dimensional corng U = corng A, but the first three rows of A only span a two-dimensional subspace.
(c) Yes, since ker U = ker A.
(d) No, since coker U ≠ coker A in general. For the example in part (b), coker A has basis ( −1, 1, 0, 0 )^T while coker U has basis ( 0, 0, 0, 1 )^T.

2.5.32. (a) Example: $\begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix}$. (b) No, since then the first r rows of U are linear combinations of the first r rows of A. Hence these rows span corng A, which, by Theorem 2.31(c), implies that they form a basis for the corange.

2.5.33. Examples: any symmetric matrix; any permutation matrix 0 0 the identity. Yet another example is the complex matrix B @1 0

since 1 the row echelon form is 0 1 i iC A. i i

♦ 2.5.34. The rows r1, . . . , rm of A span the corange. Reordering the rows, in particular interchanging two of them, will not change the span. Also, multiplying any of the rows by nonzero scalars, r̃i = ai ri for ai ≠ 0, will also span the same space, since
$$v = \sum_{i=1}^{m} c_i\, r_i = \sum_{i=1}^{m} \frac{c_i}{a_i}\, \tilde r_i .$$

2.5.35. We know rng A ⊂ R^m is a subspace of dimension r = rank A. In particular, rng A = R^m if and only if it has dimension m = rank A.

2.5.36. This is false. If $A = \begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix}$ then rng A is spanned by ( 1, 1 )^T, whereas the range of its row echelon form $U = \begin{pmatrix} 1 & 1 \\ 0 & 0 \end{pmatrix}$ is spanned by ( 1, 0 )^T.

♦ 2.5.37.
(a) Method 1: choose the nonzero rows in the row echelon form of A. Method 2: choose the columns of A^T that correspond to pivot columns of its row echelon form.
(b) Method 1: ( 1, 2, 4 )^T, ( 3, −1, 5 )^T, ( 2, −4, 2 )^T. Method 2: ( 1, 2, 4 )^T, ( 0, −7, −7 )^T, ( 0, 0, 2 )^T. Not the same.

♦ 2.5.38. If v ∈ ker A then A v = 0 and so B A v = B 0 = 0, so v ∈ ker(B A). The first statement follows from setting B = A.
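To see how an echelon basis arises in Exercise 2.5.37(b), one can row-reduce the three vectors ( 1, 2, 4 ), ( 3, −1, 5 ), ( 2, −4, 2 ) stacked as the rows of a matrix. The sketch below is illustrative: the three vectors are a reading of the answer above, and `row_echelon` is a hypothetical helper that performs elimination without rescaling the pivot rows.

```python
from fractions import Fraction


def row_echelon(rows):
    """Row echelon form via Gaussian elimination, without normalizing pivots."""
    M = [[Fraction(x) for x in row] for row in rows]
    r = 0  # index of the next pivot row
    for c in range(len(M[0])):
        p = next((i for i in range(r, len(M)) if M[i][c] != 0), None)
        if p is None:
            continue
        M[r], M[p] = M[p], M[r]
        for i in range(r + 1, len(M)):
            factor = M[i][c] / M[r][c]
            M[i] = [a - factor * b for a, b in zip(M[i], M[r])]
        r += 1
        if r == len(M):
            break
    return M


def nonzero_rows(M):
    """The nonzero rows of an echelon matrix, which form a basis for its row space."""
    return [row for row in M if any(x != 0 for x in row)]


if __name__ == "__main__":
    vectors = [[1, 2, 4], [3, -1, 5], [2, -4, 2]]
    for row in nonzero_rows(row_echelon(vectors)):
        print([str(x) for x in row])
```

Both bases span the same three-dimensional space; they differ because elimination replaces the second and third vectors by combinations with earlier ones.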

♦ 2.5.39. If v ∈ rng A B then v = A B x for some vector x. But then v = A y where y = B x, and so v ∈ rng A. The first statement follows from setting B = A.

2.5.40. First note that B A and A C also have size m × n. To show rank A = rank B A, we prove that ker A = ker B A, and so rank A = n − dim ker A = n − dim ker B A = rank B A. Indeed, if v ∈ ker A, then A v = 0 and hence B A v = 0, so v ∈ ker B A. Conversely, if v ∈ ker B A then B A v = 0. Since B is nonsingular, this implies A v = 0 and hence v ∈ ker A, proving the first result. To show rank A = rank A C, we prove that rng A = rng A C, and so rank A = dim rng A = dim rng A C = rank A C. Indeed, if b ∈ rng A C, then b = A C x for some x and so b = A y where y = C x, and so b ∈ rng A. Conversely, if b ∈ rng A then b = A y for some y and so b = A C x where x = C^{−1} y, so b ∈ rng A C, proving the second result. The final equality is a consequence of the first two: rank A = rank B A = rank (B A) C.

♦ 2.5.41.
(a) Since they are spanned by the columns, the range of ( A B ) contains the range of A. But since A is nonsingular, rng A = R^n, and so rng ( A B ) = R^n also, which proves rank ( A B ) = n.
(b) Same argument, using the fact that the corange is spanned by the rows.

2.5.42. True if the matrices have the same size, but false in general.

♦ 2.5.43. Since we know dim rng A = r, it suffices to prove that w1, . . . , wr are linearly independent. Given
0 = c1 w1 + · · · + cr wr = c1 A v1 + · · · + cr A vr = A(c1 v1 + · · · + cr vr),
we deduce that c1 v1 + · · · + cr vr ∈ ker A, and hence can be written as a linear combination of the kernel basis vectors: c1 v1 + · · · + cr vr = c_{r+1} v_{r+1} + · · · + cn vn. But v1, . . . , vn are linearly independent, and so c1 = · · · = cr = c_{r+1} = · · · = cn = 0, which proves linear independence of w1, . . . , wr.

♦ 2.5.44.
(a) Since they have the same kernel, their ranks are the same. Choose a basis v1, . . . , vn of R^n such that v_{r+1}, . . . , vn form a basis for ker A = ker B. Then w1 = A v1, . . . , wr = A vr form a basis for rng A, while y1 = B v1, . . . , yr = B vr form a basis for rng B. Let M be any nonsingular m × m matrix such that M wj = yj, j = 1, . . . , r, which exists since both sets of vectors are linearly independent. We claim M A = B. Indeed, M A vj = B vj, j = 1, . . . , r, by design, while M A vj = 0 = B vj, j = r + 1, . . . , n, since these vectors lie in the kernel. Thus, the matrices agree on a basis of R^n, which is enough to conclude that M A = B.
(b) If the systems have the same solutions x⋆ + z where z ∈ ker A = ker B, then B x = M A x = M b = c. Since M can be written as a product of elementary matrices, we conclude that one can get from the augmented matrix ( A | b ) to the augmented matrix ( B | c ) by applying the elementary row operations that make up M.

♦ 2.5.45.
(a) First, W ⊂ rng A since every w ∈ W can be written as w = A v for some v ∈ V ⊂ R^n, and so w ∈ rng A. Second, if w1 = A v1 and w2 = A v2 are elements of W, then so is c w1 + d w2 = A (c v1 + d v2) for any scalars c, d because c v1 + d v2 ∈ V, proving that W is a subspace.
(b) First, using Exercise 2.4.25, dim W ≤ r = dim rng A since it is a subspace of the range. Suppose v1, . . . , vk form a basis for V, so dim V = k. Let w = A v ∈ W. We can write v = c1 v1 + · · · + ck vk, and so, by linearity, w = c1 A v1 + · · · + ck A vk. Therefore, the k vectors w1 = A v1, . . . , wk = A vk span W, and therefore, by Proposition 2.33, dim W ≤ k.

♦ 2.5.46.
(a) To have a left inverse requires an n × m matrix B such that B A = I. Suppose dim rng A = rank A < n. Then, according to Exercise 2.5.45, the subspace W = { B v | v ∈ rng A } has dim W ≤ dim rng A < n. On the other hand, w ∈ W if and only if w = B v where v ∈ rng A, and so v = A x for some x ∈ R^n. But then w = B v = B A x = x, and therefore W = R^n since every vector x ∈ R^n lies in it; thus, dim W = n, contradicting the preceding result. We conclude that having a left inverse implies rank A = n. (The rank can't be larger than n.)
(b) To have a right inverse requires an n × m matrix C such that A C = I. Suppose dim rng A = rank A < m and hence rng A ⊊ R^m. Choose y ∈ R^m \ rng A. Then y = A C y = A x, where x = C y. Therefore, y ∈ rng A, which is a contradiction. We conclude that having a right inverse implies rank A = m.
(c) By parts (a–b), having both inverses requires m = rank A = n and A must be square and nonsingular.

2.6.1. (a)–(e) [Digraph drawings not reproduced; one of them is drawn in two equivalent forms.]

2.6.2. (a) [Digraph drawing not reproduced.] (b) ( 1, 1, 1, 1, 1, 1, 1 )^T is a basis for the kernel. The cokernel is trivial, containing only the zero vector, and so has no basis. (c) Zero.

2.6.3. (a)

0 B B B @

−1 0 0 0

0 −1 1 0 0

(d)

1 1 0 1

1

B1 B B B0 B B0 B B B0 B @0

0

1

0 0C C C; −1 A −1

−1 0 −1 −1 0 0 0

0 −1 0 0 1 −1 0

0

−1 B −1 B B 1 (b) B B @ 0 0

1 0 0 1 0

1

0 0 −1 0 −1

1

0 1C C C; 0C C −1 A 1

−1 B −1 B B 0 B (c) B B 0 B @ 0 0

0 0 0 −1 0 0 0 0C C B 1 0 0 C B 1 0C B C 0 1 −1 B (e) B 0 1C C; B 0 −1 0 B −1 0C C @ 0 C 0 1 0 1A 0 0 0 1 −1 1 0 1 −1 0 0 0 0 C B 1 0 −1 0 0 0 C B B C 1 0 −1 0 0 B 0 C B C B −1 C 0 0 1 0 0 C. (f ) B B 0 C 0 1 0 0 −1 B C B C B 0 0 −1 0 1 0C B C @ 0 0 0 −1 0 1A 0 0 0 0 −1 1 0

1 0

0

1 0 0 0 0 −1

1

0 1 −1 −1 0 0

1 0 0 0 1 0

0 −1 0 0 0 1

01 0C C 0C C C; 1C C −1 A 0

−1 0 0 B 1 C B −1 C C B B C B C −1 C B C B C C; (b) 2 circuits: B 0 C, B −1 C; (c) 2 circuits: B 2.6.4. (a) 1 circuit: B C B B C @ −1 A @ 1A @ 0A 1 0 1 0

1

1 0

0

1 0

1

1 −1 0 B 1 C B −1 C B 0 C C B C B C B C B C B C B B 1 C B 0 C B −1 C C B C B C B B C C B B (d) 3 circuits: B 0 C, B −1 C, B 1 C C; (e) 2 circuits: C B C B B C B 1C B 0C B 0C C B B C B C @ 0A @ 1A @ 0A 0 0 1 1 0 1 0 1 0 −1 1 0 B0C B 1C B0C C B C B C B C B C B C B B 1 C B −1 C B 0 C C B C B C B B1C B 0C B0C C B C C, B (f ) 3 circuits: B B 0 C B 1 C, B 1 C. C B C B C B C B C B C B B0C B 0C B1C B C B C B C @0A @ 1A @0A 0 0 1 0

1 B1 B B1 ♥ 2.6.5. (a) B B @0 0

−1 0 0 1 1

0 −1 0 −1 0

1

01 011 B0C B1C B C B C B C B C B1C B0C B C, B C; B1C B0C B C B C @1A @0A 1 0 0

0 B B B B B B B @

0 0 1 0 −1 1

−1 1 0 0 1 C B 1C C B 0C C B 1 C B −1 C C C, B C; B 1C 0C C B C @ A 1 0A 0 1

0 0C C C; (b) rank = 3; (c) dim rng A = dim corng A = 3, −1 C C 0A −1 1 1 0 0 0 1 1 1 1 B −1 C B 0 C C C B B B C C C B B B1C B 0 C, B −1 C; C; cokernel: dim ker A = 1, dim coker A = 2; (d) kernel: B C C B B @1A @ 1A @ 0A 1 1 0 65

01 0C C 0C C C; 1C C 0A −1

0

1

1 B1C B C C B 1 C; (e) b1 − b2 + b4 = 0, b1 − b3 + b5 = 0; (f ) example: b = B B C @0A 0

♦ 2.6.6. (a)

0

1

B1 B B B1 B B0 B B B0 B B0 B B0 B B B0 B B0 B B B0 B @0

0

−1 0 0 1 1 0 0 0 0 0 0 0 1

0

0 −1 0 0 0 1 1 0 0 0 0 0

−1 B 1C C B C B B 0C C B B −1 C C B C B B 0C C B B 1C C Cokernel basis: v1 = B B 0 C , v2 = C B C B B 0C C B B 0C C B C B B 0C C B @ 0A 0

0 0 −1 0 0 0 0 1 1 0 0 0

0

0 0 0 −1 0 −1 0 0 0 1 0 0

1

−1 B 0C B C B C B 1C B C B 0C B C B C B −1 C B C B 0C B C B 0 C, v3 = B C B C B 1C B C B 0C B C B C B 0C B C @ 0A 0

0 0 0 0 −1 0 0 −1 0 0 1 0

0 0 0 0 0 0 −1 0 −1 0 0 1

0

1

0

1

1+t B C B t C C. x=B @ t A t 1

0 0C C C 0C C 0C C 0C C C 0C C 0C C C 0C C 0C C −1 C C C −1 A −1

0 B −1 C B C B C B 1C B C B 0C B C B C B 0C B C B 0C B C B −1 C, v4 = B C C B B 0C C B B 1C C B C B B 0C C B @ 0A 0

1

0

0 B 0C C B C B B 0C C B B −1 C C B C B B 1C C B B 0C C B B 0 C , v5 = B C B C B 0C B C B 0C B C B C B −1 C B C @ 1A 0

0

1

0 B 0C C B C B B 0C C B B 0C C B C B B 0C B C B −1 C B C B 1 C. B C B C B 0C B C B 0C B C B C B −1 C B C @ 0A 1

These vectors represent the circuits around 5 of the cube’s faces. 0

1

0 B 0C B C B C B 0C B C B 0C B C B C B 0C B C B 0C B C=v −v +v −v +v , (b) Examples: B C 1 2 3 4 5 B 0C C B B −1 C C B B 1C C B C B B 0C C B @ −1 A 1

♥ 2.6.7.

(a) Tetrahedron:

0

1 B1 B B B1 B B0 B @0 0

−1 0 0 1 1 0

0 −1 0 −1 0 1

01 0C C −1 C C C 0C C −1 A −1


0

1

0 B 1C C B C B B −1 C C B B −1 C C B C B B 1C C B B 1C C B B 0 C = v1 − v2 , C B C B B −1 C C B B 0C C B C B B 0C C B @ 0A 0

0

1

0 B −1 C C B C B B 1C C B B 1C C B C B B −1 C B C B 0C B C B −1 C = v3 − v4 . B C B C B 0C B C B 1C B C B C B 1C B C @ −1 A 0

number of circuits = dim coker A = 3, number of faces = 4; (b) Octahedron:

0

1 B1 B B B1 B B1 B B B0 B B0 B B0 B B B0 B B0 B B B0 B @0 0

−1 0 0 0 1 1 1 0 0 0 0 0

0 −1 0 0 −1 0 0 1 1 0 0 0

0 0 −1 0 0 0 0 −1 0 1 1 0

0 0 0 −1 0 −1 0 0 0 −1 0 1

1

0 0C C C 0C C 0C C 0C C C 0C C −1 C C C 0C C −1 C C 0C C C −1 A −1

number of circuits = dim coker A = 7, number of faces = 8. (c) Dodecahedron: 1 −1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 B 1 0 0 0 0 −1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 C B C B 0 1 −1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 C B C B C B 0 C 1 0 0 0 0 −1 0 0 0 0 0 0 0 0 0 0 0 0 0 B C B 0 C 0 1 −1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 B C B 0 1 0 0 0 0 −1 0 0 0 0 0 0 0 0 0 0 0 0 C B 0 C B C B 0 C 0 0 1 −1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 B C B 0 C 0 0 1 0 0 0 0 −1 0 0 0 0 0 0 0 0 0 0 0 B C B B −1 C 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0C B C B 0 C 0 0 0 1 0 0 0 0 −1 0 0 0 0 0 0 0 0 0 0 B C B 0 0 0 0 0 1 0 0 0 0 −1 0 0 0 0 0 0 0 0 0 C B C B C B 0 C 0 0 0 0 1 0 0 0 0 0 −1 0 0 0 0 0 0 0 0 B C B 0 C 0 0 0 0 0 1 0 0 0 0 −1 0 0 0 0 0 0 0 0 B C B 0 0 0 0 0 1 0 0 0 0 0 −1 0 0 0 0 0 0 0 C B 0 C B C B 0 C 0 0 0 0 0 0 1 0 0 0 0 −1 0 0 0 0 0 0 0 B C B 0 0 0 0 0 0 0 1 0 0 0 0 0 −1 0 0 0 0 0 0 C B C B C B 0 C 0 0 0 0 0 0 0 1 0 0 0 0 −1 0 0 0 0 0 0 B C B 0 C 0 0 0 0 0 0 0 1 0 0 0 0 0 −1 0 0 0 0 0 B C B 0 0 0 0 0 0 0 0 1 0 0 0 0 −1 0 0 0 0 0 C B 0 C B C B 0 C 0 0 0 0 0 0 0 0 1 −1 0 0 0 0 0 0 0 0 0 B C B 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 −1 0 0 0 0 C B C B C B 0 C 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 −1 0 0 0 B C B 0 C 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 −1 0 0 B C B 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 −1 0 C B C B C B 0 C 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 −1 B C B 0 C 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 −1 0 0 0 B C B 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 −1 0 0 C B 0 C B C B 0 C 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 −1 0 B C @ 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 −1 A 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 −1 0 0 0 1 0

number of circuits = dim coker A = 11, number of faces = 12.


(d) Icosahedron: 0

1 B1 B B1 B B B1 B B1 B B B0 B B0 B B0 B B B0 B B0 B B0 B B B0 B B0 B B B0 B B0 B B0 B B B0 B B0 B B B0 B B0 B B0 B B B0 B B0 B B0 B B B0 B B0 B B B0 B B0 B @0 0

−1 0 0 0 0 1 1 1 0 0 0 0 0 0 0 0 0 −1 0 0 0 0 0 0 0 0 0 0 0 0

0 −1 0 0 0 −1 0 0 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

0 0 −1 0 0 0 0 0 −1 0 0 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

0 0 0 −1 0 0 0 0 0 0 0 −1 0 0 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0

0 0 0 0 −1 0 0 0 0 0 0 0 0 0 −1 0 0 1 1 1 0 0 0 0 0 0 0 0 0 0

0 0 0 0 0 0 −1 0 0 −1 0 0 0 0 0 0 0 0 0 0 1 1 0 0 0 0 0 0 −1 0

0 0 0 0 0 0 0 0 0 0 −1 0 −1 0 0 0 0 0 0 0 −1 0 1 1 0 0 0 0 0 0

0 0 0 0 0 0 0 0 0 0 0 0 0 −1 0 −1 0 0 0 0 0 0 −1 0 1 1 0 0 0 0

0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 −1 0 −1 0 0 0 0 0 −1 0 1 1 0 0

0 0 0 0 0 0 0 −1 0 0 0 0 0 0 0 0 0 0 0 −1 0 0 0 0 0 0 −1 0 1 1

01 0C C 0C C C 0C C 0C C 0C C C 0C C 0C C C 0C C 0C C 0C C C 0C C 0C C 0C C C 0C C 0C C C 0C C 0C C 0C C C 0C C 0C C C −1 C C 0C C −1 C C C 0C C −1 C C 0C C C −1 C C 0A −1

number of circuits = dim coker A = 19, number of faces = 20. ♥ 2.6.8.

0

−1 (a) (i) B @ 0 0 0

−1 B 0 B B 0 (iii) B B @ 0 0

1 1 1

1

0 −1 0 1 1 0 1 1

0 −1 1 0 0

0 0C A, −1 0 0 −1 0 0

0

−1 B B 0 (ii) B @ 0 0 1 0 0 0 0C C C, 0 0C C −1 0A 0 −1

1

1 −1 0 1

0 0 0 1 0 0C C C, −1 1 0A 0 0 −1 0 −1 1 0 B 0 −1 1 B B 0 (iv ) B 0 −1 B @ 0 1 0 0 0 1

0 0 1 0 0

1

0 0 0 −1 0

0 0C C C. 0C C 0A −1

−1 0 0 0

1 1 1 1

(b)

0 B B B @

−1 0 0 0

1 −1 0 0

0 1 −1 0

0 0 1 −1

1

0 0C C C, 0A 1

0 B B B @

−1 0 0 0

1 −1 0 1 68

0 1 −1 0

0 0 1 0

1

0 0C C C, 0A −1

0 B B B @

0 −1 0 0

0 0 −1 0

1

0 0C C C. 0A −1

(c) Let m denote the number of edges. Since the graph is connected, its incidence matrix A has rank n − 1. There are no circuits if and only if coker A = {0}, which implies 0 = dim coker A = m − (n − 1), and so m = n − 1. ♥ 2.6.9.

(a)

0

1 B @1 0

(b)

(c)

1 2

−1 0 1

n(n − 1);

0

1 B1 B B B1 B B0 B @0 0

1

0 −1 C A, −1

(d)

1 2

−1 0 0 1 1 0

0 −1 0 −1 0 1

01 0C C −1 C C C, 0C C −1 A −1

−1 0 0 0 −1 0 0 0

0 −1 0 0 0 −1 0 0

0 0 −1 0 0 0 −1 0

0

1 B1 B B B1 B B1 B B B0 B B0 B B0 B B B0 B @0 0

(n − 1)(n − 2).

1

−1 0 0 0 1 1 1 0 0 0

0 −1 0 0 −1 0 0 1 1 0

0 0 −1 0 0 −1 0 −1 0 1

0 0C C C 0C C −1 C C 0C C C. 0C C −1 C C 0C C C −1 A −1

0 0 0 0 0 0 1 1 1

−1 0 0 −1 0 0 −1 0 0

0 −1 0 0 −1 0 0 −1 0

01 0C C −1 C C C 0C C 0C C. −1 C C C 0C C 0A −1

♥ 2.6.10.

(a)

0

1 B1 B B 1 B (b) B B0 B @0 0 (c) m n;

0 0 0 1 1 1

−1 0 0 −1 0 0

0 −1 0 0 −1 0

01 0C C −1 C C C, 0C C 0A −1

(d) (m − 1)(n − 1).

0

1 B1 B B B1 B B1 B B0 B B B0 B @0 0

0 0 0 0 1 1 1 1


1

0 0C C 0C C C −1 C C, 0C C C 0C C 0A −1

0

1 B1 B B1 B B B0 B B0 B B B0 B B0 B @0 0

0 0 0 1 1 1 0 0 0

♥ 2.6.11.
(a) $$A = \begin{pmatrix} 1 & -1 & 0 & 0 & 0 & 0 & 0 & 0 \\ -1 & 0 & 0 & 1 & 0 & 0 & 0 & 0 \\ 0 & -1 & 0 & 1 & 0 & 0 & 0 & 0 \\ 0 & 0 & -1 & 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 & -1 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & -1 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 & 0 & 0 & 1 & -1 \end{pmatrix}.$$
(b) The vectors v1 = ( 1, 1, 0, 1, 0, 0, 0, 0 )^T, v2 = ( 0, 0, 1, 0, 0, 1, 0, 0 )^T, v3 = ( 0, 0, 0, 0, 1, 0, 1, 1 )^T form a basis for ker A.
(c) The entries of each vi are indexed by the vertices. Thus the nonzero entries in v1 correspond to the vertices 1, 2, 4 in the first connected component, v2 to the vertices 3, 6 in the second connected component, and v3 to the vertices 5, 7, 8 in the third connected component.
(d) Let the graph have k connected components. A basis for ker A consists of the vectors v1, . . . , vk where vi has entries equal to 1 if the vertex lies in the ith connected component of the graph and 0 if it doesn't. To prove this, suppose A v = 0. If edge #ℓ connects vertex a to vertex b, then the ℓth component of the linear system is va − vb = 0. Thus, va = vb whenever the vertices are connected by an edge. If two vertices are in the same connected component, then they can be connected by a path, and the values va = vb = · · · at each vertex on the path must be equal. Thus, the values of va on all vertices in the connected component are equal, and hence v = c1 v1 + · · · + ck vk can be written as a linear combination of the basis vectors, with ci being the common value of the entries va corresponding to vertices in the ith connected component. Thus, v1, . . . , vk span the kernel. Moreover, since the coefficients ci coincide with certain entries va of v, the only linear combination giving the zero vector is when all ci are zero, proving their linear independence.
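The argument in Exercise 2.6.11(d) can be illustrated by computing the connected components of a digraph and checking that their indicator vectors are annihilated by the incidence matrix. The sketch below is illustrative only. The edge list `EDGES` is an assumed reading of the incidence matrix in part (a), with +1 at the first listed vertex of each edge and −1 at the second, and the function names are hypothetical.

```python
# Assumed edge list (first vertex gets +1, second gets -1), read from the matrix in part (a).
EDGES = [(1, 2), (4, 1), (4, 2), (6, 3), (7, 5), (8, 5), (7, 8)]
N_VERTICES = 8


def incidence_matrix(n_vertices, edges):
    """One row per edge, with +1 and -1 in the columns of its two endpoints."""
    rows = []
    for first, second in edges:
        row = [0] * n_vertices
        row[first - 1] = 1
        row[second - 1] = -1
        rows.append(row)
    return rows


def connected_components(n_vertices, edges):
    """Connected components of the underlying undirected graph, via depth-first search."""
    neighbours = {v: set() for v in range(1, n_vertices + 1)}
    for a, b in edges:
        neighbours[a].add(b)
        neighbours[b].add(a)
    seen, components = set(), []
    for start in range(1, n_vertices + 1):
        if start in seen:
            continue
        seen.add(start)
        stack, component = [start], []
        while stack:
            v = stack.pop()
            component.append(v)
            for w in neighbours[v]:
                if w not in seen:
                    seen.add(w)
                    stack.append(w)
        components.append(sorted(component))
    return components


def kernel_basis(n_vertices, edges):
    """Indicator vector of each connected component."""
    return [[1 if v in comp else 0 for v in range(1, n_vertices + 1)]
            for comp in connected_components(n_vertices, edges)]


def mat_vec(A, v):
    return [sum(a * x for a, x in zip(row, v)) for row in A]


if __name__ == "__main__":
    A = incidence_matrix(N_VERTICES, EDGES)
    for v in kernel_basis(N_VERTICES, EDGES):
        print(v, mat_vec(A, v))
```

Each row of the incidence matrix picks out va − vb for one edge, so an indicator vector that is constant on every component is sent to zero.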

♦ 2.6.12. If the incidence matrix has rank r, then # circuits = dim coker A = n − r = dim ker A ≥ 1, since ker A always contains the vector ( 1, 1, . . . , 1 )^T.

2.6.13. Changing the direction of an edge is the same as multiplying the corresponding row of the incidence matrix by −1. The dimension of the cokernel, being the number of independent circuits, does not change. Each entry of a cokernel vector that corresponds to an edge that has been reversed is multiplied by −1. This can be realized by left multiplying the incidence matrix by a diagonal matrix whose diagonal entries are −1 if the corresponding edge has been reversed, and +1 if it is unchanged.

♥ 2.6.14.
(a) Note that P permutes the rows of A, and corresponds to a relabeling of the edges of the digraph, while Q permutes its columns, and so corresponds to a relabeling of the vertices.
(b) (i), (ii), (v) represent equivalent digraphs; none of the others are equivalent.
(c) v = ( v1, . . . , vm ) ∈ coker A if and only if v̂ = P v = ( v_{π(1)}, . . . , v_{π(m)} ) ∈ coker B. Indeed, v̂^T B = (P v)^T P A Q = v^T A Q = 0 since, according to Exercise 1.6.14, P^T = P^{−1} is the inverse of the permutation matrix P.
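The effect of reversing an edge, as described in Exercise 2.6.13, can be checked on a small digraph. The sketch below is illustrative: the incidence matrix `A_EXAMPLE` (five edges on four vertices) and the circuit vector `Y_EXAMPLE` are example data chosen for this sketch, and the helper names are hypothetical.

```python
# Example digraph with edges 1->2, 1->3, 1->4, 2->3, 2->4 (an illustrative choice).
A_EXAMPLE = [[1, -1, 0, 0],
             [1, 0, -1, 0],
             [1, 0, 0, -1],
             [0, 1, -1, 0],
             [0, 1, 0, -1]]
Y_EXAMPLE = [1, -1, 0, 1, 0]  # circuit through edges 1, 2 and 4


def left_mul(y, A):
    """Row vector times matrix: the entries of y^T A."""
    return [sum(y[i] * A[i][j] for i in range(len(A))) for j in range(len(A[0]))]


def reverse_edges(A, reversed_rows):
    """Multiply the listed rows by -1, i.e. left-multiply by a diagonal matrix of +1 and -1."""
    return [[-x for x in row] if i in reversed_rows else list(row)
            for i, row in enumerate(A)]


def flip_entries(y, reversed_rows):
    """Apply the same diagonal sign matrix to a cokernel vector."""
    return [-x if i in reversed_rows else x for i, x in enumerate(y)]


if __name__ == "__main__":
    flipped = {1}  # reverse the second edge (row index 1)
    B = reverse_edges(A_EXAMPLE, flipped)
    y_hat = flip_entries(Y_EXAMPLE, flipped)
    print(left_mul(Y_EXAMPLE, A_EXAMPLE), left_mul(y_hat, B))
```

Writing D for the diagonal sign matrix, the check amounts to (D y)^T (D A) = y^T D² A = y^T A = 0, because D² = I.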


2.6.15. False. For example, any two inequivalent trees, cf. Exercise 2.6.8, with the same number of nodes have incidence matrices of the same size, with trivial cokernels: coker A = coker B = {0}. As another example, the incidence matrices
$$A = \begin{pmatrix} 1 & -1 & 0 & 0 & 0 \\ 0 & 1 & -1 & 0 & 0 \\ -1 & 0 & 1 & 0 & 0 \\ 1 & 0 & 0 & -1 & 0 \\ 0 & 1 & 0 & 0 & -1 \end{pmatrix} \quad \text{and} \quad B = \begin{pmatrix} 1 & -1 & 0 & 0 & 0 \\ 0 & 1 & -1 & 0 & 0 \\ -1 & 0 & 1 & 0 & 0 \\ 1 & 0 & 0 & -1 & 0 \\ 1 & 0 & 0 & 0 & -1 \end{pmatrix}$$
both have cokernel basis ( 1, 1, 1, 0, 0 )^T, but do not represent equivalent digraphs.

2.6.16. (a) If the first k vertices belong to one component and the last n − k to the other, then there is no edge between the two sets of vertices and so the entries $a_{ij} = 0$ whenever i = 1, . . . , k, j = k + 1, . . . , n, or when i = k + 1, . . . , n, j = 1, . . . , k, which proves that A has the indicated block form. (b) The graph consists of two disconnected triangles. If we use 1, 2, 3 to label the vertices in one triangle and 4, 5, 6 for those in the second, the resulting incidence matrix has the indicated block form
$$\begin{pmatrix} 1 & -1 & 0 & 0 & 0 & 0 \\ 0 & 1 & -1 & 0 & 0 & 0 \\ -1 & 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 & -1 & 0 \\ 0 & 0 & 0 & 0 & 1 & -1 \\ 0 & 0 & 0 & -1 & 0 & 1 \end{pmatrix},$$
with each block a 3 × 3 submatrix.



3.1.1. Bilinearity:
$$\begin{aligned}
\langle c\,u + d\,v, w \rangle &= (c u_1 + d v_1) w_1 - (c u_1 + d v_1) w_2 - (c u_2 + d v_2) w_1 + b\,(c u_2 + d v_2) w_2 \\
&= c\,(u_1 w_1 - u_1 w_2 - u_2 w_1 + b\,u_2 w_2) + d\,(v_1 w_1 - v_1 w_2 - v_2 w_1 + b\,v_2 w_2) \\
&= c\,\langle u, w \rangle + d\,\langle v, w \rangle, \\
\langle u, c\,v + d\,w \rangle &= u_1 (c v_1 + d w_1) - u_1 (c v_2 + d w_2) - u_2 (c v_1 + d w_1) + b\,u_2 (c v_2 + d w_2) \\
&= c\,(u_1 v_1 - u_1 v_2 - u_2 v_1 + b\,u_2 v_2) + d\,(u_1 w_1 - u_1 w_2 - u_2 w_1 + b\,u_2 w_2) \\
&= c\,\langle u, v \rangle + d\,\langle u, w \rangle.
\end{aligned}$$
Symmetry:
$$\langle v, w \rangle = v_1 w_1 - v_1 w_2 - v_2 w_1 + b\,v_2 w_2 = w_1 v_1 - w_1 v_2 - w_2 v_1 + b\,w_2 v_2 = \langle w, v \rangle.$$
To prove positive definiteness, note
$$\langle v, v \rangle = v_1^2 - 2 v_1 v_2 + b\,v_2^2 = (v_1 - v_2)^2 + (b - 1)\,v_2^2 > 0 \quad\text{for all}\quad v = (v_1, v_2)^T \neq 0$$
if and only if b > 1. (If b = 1, the formula is only positive semi-definite, since $v = (1, 1)^T$ gives $\langle v, v \rangle = 0$, for instance.)

3.1.2. (a), (f) and (g) define inner products; the others don't.

3.1.3. It is not positive definite, since if $v = (1, -1)^T$, say, $\langle v, v \rangle = 0$.

3.1.4. (a) Bilinearity:
$$\begin{aligned}
\langle c\,u + d\,v, w \rangle &= (c u_1 + d v_1) w_1 + 2 (c u_2 + d v_2) w_2 + 3 (c u_3 + d v_3) w_3 \\
&= c\,(u_1 w_1 + 2 u_2 w_2 + 3 u_3 w_3) + d\,(v_1 w_1 + 2 v_2 w_2 + 3 v_3 w_3) = c\,\langle u, w \rangle + d\,\langle v, w \rangle, \\
\langle u, c\,v + d\,w \rangle &= u_1 (c v_1 + d w_1) + 2 u_2 (c v_2 + d w_2) + 3 u_3 (c v_3 + d w_3) \\
&= c\,(u_1 v_1 + 2 u_2 v_2 + 3 u_3 v_3) + d\,(u_1 w_1 + 2 u_2 w_2 + 3 u_3 w_3) = c\,\langle u, v \rangle + d\,\langle u, w \rangle.
\end{aligned}$$
Symmetry:
$$\langle v, w \rangle = v_1 w_1 + 2 v_2 w_2 + 3 v_3 w_3 = w_1 v_1 + 2 w_2 v_2 + 3 w_3 v_3 = \langle w, v \rangle.$$
Positivity:
$$\langle v, v \rangle = v_1^2 + 2 v_2^2 + 3 v_3^2 > 0 \quad\text{for all}\quad v = (v_1, v_2, v_3)^T \neq 0,$$
because it is a sum of non-negative terms, at least one of which is strictly positive.

(b) Bilinearity:
$$\begin{aligned}
\langle c\,u + d\,v, w \rangle &= 4 (c u_1 + d v_1) w_1 + 2 (c u_1 + d v_1) w_2 + 2 (c u_2 + d v_2) w_1 + 4 (c u_2 + d v_2) w_2 + (c u_3 + d v_3) w_3 \\
&= c\,(4 u_1 w_1 + 2 u_1 w_2 + 2 u_2 w_1 + 4 u_2 w_2 + u_3 w_3) + d\,(4 v_1 w_1 + 2 v_1 w_2 + 2 v_2 w_1 + 4 v_2 w_2 + v_3 w_3) \\
&= c\,\langle u, w \rangle + d\,\langle v, w \rangle, \\
\langle u, c\,v + d\,w \rangle &= 4 u_1 (c v_1 + d w_1) + 2 u_1 (c v_2 + d w_2) + 2 u_2 (c v_1 + d w_1) + 4 u_2 (c v_2 + d w_2) + u_3 (c v_3 + d w_3) \\
&= c\,(4 u_1 v_1 + 2 u_1 v_2 + 2 u_2 v_1 + 4 u_2 v_2 + u_3 v_3) + d\,(4 u_1 w_1 + 2 u_1 w_2 + 2 u_2 w_1 + 4 u_2 w_2 + u_3 w_3) \\
&= c\,\langle u, v \rangle + d\,\langle u, w \rangle.
\end{aligned}$$
Symmetry:
$$\langle v, w \rangle = 4 v_1 w_1 + 2 v_1 w_2 + 2 v_2 w_1 + 4 v_2 w_2 + v_3 w_3 = 4 w_1 v_1 + 2 w_1 v_2 + 2 w_2 v_1 + 4 w_2 v_2 + w_3 v_3 = \langle w, v \rangle.$$
Positivity:
$$\langle v, v \rangle = 4 v_1^2 + 4 v_1 v_2 + 4 v_2^2 + v_3^2 = (2 v_1 + v_2)^2 + 3 v_2^2 + v_3^2 > 0 \quad\text{for all}\quad v = (v_1, v_2, v_3)^T \neq 0,$$
because it is a sum of non-negative terms, at least one of which is strictly positive.
(c) Bilinearity:
$$\begin{aligned}
\langle c\,u + d\,v, w \rangle &= 2 (c u_1 + d v_1) w_1 - 2 (c u_1 + d v_1) w_2 - 2 (c u_2 + d v_2) w_1 + 3 (c u_2 + d v_2) w_2 \\
&\qquad - (c u_2 + d v_2) w_3 - (c u_3 + d v_3) w_2 + 2 (c u_3 + d v_3) w_3 \\
&= c\,(2 u_1 w_1 - 2 u_1 w_2 - 2 u_2 w_1 + 3 u_2 w_2 - u_2 w_3 - u_3 w_2 + 2 u_3 w_3) \\
&\qquad + d\,(2 v_1 w_1 - 2 v_1 w_2 - 2 v_2 w_1 + 3 v_2 w_2 - v_2 w_3 - v_3 w_2 + 2 v_3 w_3) \\
&= c\,\langle u, w \rangle + d\,\langle v, w \rangle, \\
\langle u, c\,v + d\,w \rangle &= 2 u_1 (c v_1 + d w_1) - 2 u_1 (c v_2 + d w_2) - 2 u_2 (c v_1 + d w_1) + 3 u_2 (c v_2 + d w_2) \\
&\qquad - u_2 (c v_3 + d w_3) - u_3 (c v_2 + d w_2) + 2 u_3 (c v_3 + d w_3) \\
&= c\,(2 u_1 v_1 - 2 u_1 v_2 - 2 u_2 v_1 + 3 u_2 v_2 - u_2 v_3 - u_3 v_2 + 2 u_3 v_3) \\
&\qquad + d\,(2 u_1 w_1 - 2 u_1 w_2 - 2 u_2 w_1 + 3 u_2 w_2 - u_2 w_3 - u_3 w_2 + 2 u_3 w_3) \\
&= c\,\langle u, v \rangle + d\,\langle u, w \rangle.
\end{aligned}$$
Symmetry:
$$\langle v, w \rangle = 2 v_1 w_1 - 2 v_1 w_2 - 2 v_2 w_1 + 3 v_2 w_2 - v_2 w_3 - v_3 w_2 + 2 v_3 w_3 = 2 w_1 v_1 - 2 w_1 v_2 - 2 w_2 v_1 + 3 w_2 v_2 - w_2 v_3 - w_3 v_2 + 2 w_3 v_3 = \langle w, v \rangle.$$
Positivity:
$$\langle v, v \rangle = 2 v_1^2 - 4 v_1 v_2 + 3 v_2^2 - 2 v_2 v_3 + 2 v_3^2 = 2 (v_1 - v_2)^2 + (v_2 - v_3)^2 + v_3^2 > 0 \quad\text{for all}\quad v = (v_1, v_2, v_3)^T \neq 0,$$
because it is a sum of non-negative terms, at least one of which is strictly positive.
3.1.5. (a) $(\cos t, \sin t)^T$, $\left( \dfrac{\cos t}{\sqrt 2}, \dfrac{\sin t}{\sqrt 5} \right)^T$, $\left( \cos t + \dfrac{\sin t}{\sqrt 3}, \dfrac{\sin t}{\sqrt 3} \right)^T$.
[Figure: plots of the three unit circles, with axes running from −1 to 1.]

(b) Note: By elementary analytical geometry, any quadratic equation of the form $a x^2 + b x y + c y^2 = 1$ defines an ellipse provided a > 0 and $b^2 - 4 a c < 0$.
Case (b): The equation $2 v_1^2 + 5 v_2^2 = 1$ defines an ellipse with semi-axes $\frac{1}{\sqrt 2}, \frac{1}{\sqrt 5}$.
Case (c): The equation $v_1^2 - 2 v_1 v_2 + 4 v_2^2 = 1$ also defines an ellipse by the preceding remark.

♦ 3.1.6. (a) The vector $v = (x, y)^T$ can be viewed as the hypotenuse of a right triangle with side lengths x, y, and so by Pythagoras, $\| v \|^2 = x^2 + y^2$. (b) First, the projection $p = (x, y, 0)^T$ of $v = (x, y, z)^T$ onto the xy plane has length $\| p \| = \sqrt{x^2 + y^2}$ by Pythagoras, as in part (a). Second, the right triangle formed by 0, p and v has side lengths $\| p \|$ and z, and, again by Pythagoras, $\| v \|^2 = \| p \|^2 + z^2 = x^2 + y^2 + z^2$.

♦ 3.1.7. $\| c\,v \| = \sqrt{\langle c\,v, c\,v \rangle} = \sqrt{c^2 \langle v, v \rangle} = | c |\, \| v \|$.

3.1.8. By bilinearity and symmetry, h av + bw , cv + dw i = ah v , cv + dw i + bh w , cv + dw i = ach v , v i + adh v , w i + bch w , v i + bdh w , w i = a c k v k2 + (a d + b c)h v , w i + b d k w k2 .

♦ 3.1.9. If we know the first bilinearity property and symmetry, then the second follows:
$$\langle u, c\,v + d\,w \rangle = \langle c\,v + d\,w, u \rangle = c\,\langle v, u \rangle + d\,\langle w, u \rangle = c\,\langle u, v \rangle + d\,\langle u, w \rangle.$$

♦ 3.1.10. (a) Choosing v = x, we have $0 = \langle x, x \rangle = \| x \|^2$, and hence x = 0. (b) Rewrite the condition as $0 = \langle x, v \rangle - \langle y, v \rangle = \langle x - y, v \rangle$ for all v ∈ V. Now use part (a) to conclude that x − y = 0 and so x = y. (c) If v is any element of V, then we can write $v = c_1 v_1 + \cdots + c_n v_n$ as a linear combination of the basis elements, and so, by bilinearity, $\langle x, v \rangle = c_1 \langle x, v_1 \rangle + \cdots + c_n \langle x, v_n \rangle = 0$. Since this holds for all v ∈ V, the result in part (a) implies x = 0.

♦ 3.1.11. (a)
$$\begin{aligned}
\| u + v \|^2 - \| u - v \|^2 &= \langle u + v, u + v \rangle - \langle u - v, u - v \rangle \\
&= \big( \langle u, u \rangle + 2 \langle u, v \rangle + \langle v, v \rangle \big) - \big( \langle u, u \rangle - 2 \langle u, v \rangle + \langle v, v \rangle \big) = 4\,\langle u, v \rangle.
\end{aligned}$$
(b)
$$\begin{aligned}
\langle v, w \rangle &= \tfrac14 \big[ (v_1 + w_1)^2 - 3 (v_1 + w_1)(v_2 + w_2) + 5 (v_2 + w_2)^2 \big] - \tfrac14 \big[ (v_1 - w_1)^2 - 3 (v_1 - w_1)(v_2 - w_2) + 5 (v_2 - w_2)^2 \big] \\
&= v_1 w_1 - \tfrac32\, v_1 w_2 - \tfrac32\, v_2 w_1 + 5\, v_2 w_2.
\end{aligned}$$
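The polarization computation in 3.1.11 (b) can be spot-checked with exact rational arithmetic. This is a small sketch, assuming the squared norm $v_1^2 - 3 v_1 v_2 + 5 v_2^2$ that appears inside the brackets above; the function names are illustrative only.

```python
from fractions import Fraction

def norm_sq(v):
    """Squared norm v1^2 - 3 v1 v2 + 5 v2^2 read off from the brackets in part (b)."""
    v1, v2 = v
    return v1 * v1 - 3 * v1 * v2 + 5 * v2 * v2

def polarized(v, w):
    """Inner product recovered from the norm: (||v + w||^2 - ||v - w||^2) / 4."""
    plus = (v[0] + w[0], v[1] + w[1])
    minus = (v[0] - w[0], v[1] - w[1])
    return Fraction(norm_sq(plus) - norm_sq(minus), 4)

def closed_form(v, w):
    """The explicit formula v1 w1 - 3/2 v1 w2 - 3/2 v2 w1 + 5 v2 w2."""
    half3 = Fraction(3, 2)
    return v[0] * w[0] - half3 * v[0] * w[1] - half3 * v[1] * w[0] + 5 * v[1] * w[1]

v, w = (1, 2), (3, -1)
print(polarized(v, w), closed_form(v, w))  # both equal -29/2
```

Agreement on a handful of integer vectors is only a sanity check, not a proof; the algebraic expansion above is what establishes the identity.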

3.1.12. (a)
$$\| x + y \|^2 + \| x - y \|^2 = \big( \| x \|^2 + 2 \langle x, y \rangle + \| y \|^2 \big) + \big( \| x \|^2 - 2 \langle x, y \rangle + \| y \|^2 \big) = 2\,\| x \|^2 + 2\,\| y \|^2.$$
(b) The sum of the squared lengths of the diagonals in a parallelogram equals the sum of the squared lengths of all four sides.
[Figure: parallelogram with sides of length $\| x \|$, $\| y \|$ and diagonals of length $\| x + y \|$, $\| x - y \|$.]

3.1.13. By Exercise 3.1.12, $\| v \|^2 = \tfrac12 \big( \| u + v \|^2 + \| u - v \|^2 \big) - \| u \|^2 = 17$, so $\| v \| = \sqrt{17}$. The answer is the same in all norms coming from inner products.

3.1.14. Using (3.2), v · (A w) = vT A w = (AT v)T w = (AT v) · w. ♦ 3.1.15. First, if A is symmetric, then

(A v) · w = (A v)T w = vT AT w = vT A w = v · (A w). To prove the converse, note that A ej gives the j th column of A, and so aij = ei · (A ej ) = (A ei ) · ej = aji for all i, j. Hence A = AT .
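The converse direction of 3.1.15 uses only the standard basis vectors, which makes it easy to test on small matrices. The sketch below assumes plain nested lists for matrices; the helper names and the two sample matrices are hypothetical.

```python
def matvec(A, v):
    return [sum(a * x for a, x in zip(row, v)) for row in A]

def dot(v, w):
    return sum(x * y for x, y in zip(v, w))

def unit(n, j):
    """Standard basis vector e_j in R^n (0-based index)."""
    return [1 if i == j else 0 for i in range(n)]

def is_self_adjoint(A):
    """Check e_i . (A e_j) == (A e_i) . e_j for all i, j, i.e. a_ij == a_ji."""
    n = len(A)
    for i in range(n):
        for j in range(n):
            e_i, e_j = unit(n, i), unit(n, j)
            if dot(e_i, matvec(A, e_j)) != dot(matvec(A, e_i), e_j):
                return False
    return True

SYMMETRIC = [[2, 1], [1, 3]]
NONSYMMETRIC = [[1, 2], [0, 1]]
print(is_self_adjoint(SYMMETRIC), is_self_adjoint(NONSYMMETRIC))
```

Since $e_i \cdot (A e_j)$ picks out the entry $a_{ij}$, the double loop is just an entrywise comparison of A with its transpose, which is exactly the argument in the solution.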

3.1.16. The inner product axioms continue to hold when restricted to vectors in W since they hold for all vectors in V , including those in W . 3.1.17. Bilinearity: hhh c u + d v , w iii = h c u + d v , w i + hh c u + d v , w ii = c h u , w i + d h v , w i + c hh u , w ii + d hh v , w ii = c hhh u , w iii + d hhh v , w iii, hhh u , c v + d w iii = h u , c v + d w i + hh u , c v + d w ii = c h u , v i + d h u , w i + c hh u , v ii + d hh u , w ii = c hhh u , v iii + d hhh u , w iii. Symmetry: hhh v , w iii = h v , w i + hh v , w ii = h w , v i + hh w , v ii = hhh w , v iii. Positivity: hhh v , v iii = h v , v i + hh v , v ii > 0 for all v 6= 0 since both terms are positive. ♦ 3.1.18. Bilinearity: e , w) e , (v b , w) b iii = hhh (c v + d v e , c w + d w) e , (v b , w) b iii hhh c (v, w) + d (v e ,v b i + hh c w + d w e ,w b ii = hcv + dv b i + dhv e ,v b i + chw,w b i + dhw e ,w b i = chv,v b , w) b iii + d hhh (v e , w) e , (v b , w) b iii, = c hhh (v, w) , (v e , w) e + d (v b , w) b iii = hhh (v, w) , (c v e + dv b, c w e + d w) b iii hhh (v, w) , c (v e + dv b i + hh w , c w e + dw b ii = hv,cv e i + dhv,v b i + chw,w e i + dhw,w b i = chv,v e , w) e iii + d hhh (v, w) , (v b , w) b iii. = c hhh (v, w) , (v Symmetry: e , w) e iii = h v , v e i + hh w , w e ii = h v e , v i + hh w e , w ii = hhh (v e , w) e , (v, w) iii. hhh (v, w) , (v Positivity: hhh (v, w) , (v, w) iii = h v , v i + hh w , w ii > 0


for all (v, w) 6= (0, 0), since both terms are non-negative and at least one is positive because either v 6= 0 or w 6= 0.

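3.1.19. (a) $\langle 1, x \rangle = \tfrac12$, $\| 1 \| = 1$, $\| x \| = \tfrac{1}{\sqrt 3}$;
(b) $\langle \cos 2\pi x, \sin 2\pi x \rangle = 0$, $\| \cos 2\pi x \| = \tfrac{1}{\sqrt 2}$, $\| \sin 2\pi x \| = \tfrac{1}{\sqrt 2}$;
(c) $\langle x, e^x \rangle = 1$, $\| x \| = \tfrac{1}{\sqrt 3}$, $\| e^x \| = \sqrt{\tfrac12 (e^2 - 1)}$;
(d) $\left\langle (x + 1)^2, \dfrac{1}{x + 1} \right\rangle = \tfrac32$, $\| (x + 1)^2 \| = \sqrt{\tfrac{31}{5}}$, $\left\| \dfrac{1}{x + 1} \right\| = \tfrac{1}{\sqrt 2}$.

3.1.20. (a) $\langle f, g \rangle = \tfrac34$, $\| f \| = \tfrac{1}{\sqrt 3}$, $\| g \| = \sqrt{\tfrac{28}{15}}$; (b) $\langle f, g \rangle = 0$, $\| f \| = \sqrt{\tfrac23}$, $\| g \| = \sqrt{\tfrac{56}{15}}$; (c) $\langle f, g \rangle = \tfrac{8}{15}$, $\| f \| = \tfrac12$, $\| g \| = \sqrt{\tfrac76}$.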

3.1.21. All but (b) are inner products. (b) is not because it fails positivity: for instance,
$$\int_{-1}^{1} (1 - x)^2\, x \, dx = -\tfrac43.$$
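The value of the integral quoted in 3.1.21 can be confirmed exactly by integrating the polynomial $(1 - x)^2 x = x - 2 x^2 + x^3$ term by term. The following is a minimal sketch using coefficient lists (lowest degree first); the helper names are illustrative.

```python
from fractions import Fraction

def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists, lowest degree first."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def integrate(p, lo, hi):
    """Exact definite integral of the polynomial p over [lo, hi]."""
    total = Fraction(0)
    for k, c in enumerate(p):
        total += Fraction(c, k + 1) * (Fraction(hi) ** (k + 1) - Fraction(lo) ** (k + 1))
    return total

ONE_MINUS_X = [1, -1]   # 1 - x
X = [0, 1]              # x
INTEGRAND = poly_mul(poly_mul(ONE_MINUS_X, ONE_MINUS_X), X)  # x - 2x^2 + x^3
VALUE = integrate(INTEGRAND, -1, 1)
print(INTEGRAND, VALUE)
```

The odd powers integrate to zero over the symmetric interval, so only the $-2 x^2$ term contributes, giving the negative value that rules out positivity.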

3.1.22. If f(x) is any nonzero function that satisfies f(x) = 0 for all 0 ≤ x ≤ 1 then $\langle f, f \rangle = 0$. An example is the function
$$f(x) = \begin{cases} x, & -1 \le x \le 0, \\ 0, & 0 \le x \le 1. \end{cases}$$
However, if the function $f \in C^0[0, 1]$ is only considered on [0, 1], its values outside the interval are irrelevant, and so the positivity is unaffected. The formula does define an inner product on the subspace of polynomial functions because $\langle p, p \rangle = 0$ if and only if p(x) = 0 for all 0 ≤ x ≤ 1, which implies p(x) ≡ 0 for all x since only the zero polynomial can vanish on an interval.

3.1.23. (a) No. Positivity doesn't hold since if f(0) = f(1) = 0 then $\langle f, f \rangle = 0$ even if f(x) ≠ 0 for any 0 < x < 1; (b) Yes. Bilinearity and symmetry are readily established. As for positivity,
$$\langle f, f \rangle = f(0)^2 + f(1)^2 + \int_0^1 f(x)^2 \, dx \ge 0$$
is a sum of three non-negative quantities, and is equal to 0 if and only if all three terms vanish, so f(0) = f(1) = 0 and $\int_0^1 f(x)^2 \, dx = 0$ which, by continuity, implies f(x) ≡ 0 for all 0 ≤ x ≤ 1.

3.1.24. No. For example, on [−1, 1], $\| 1 \| = \sqrt 2$, but $\| 1 \|^2 = 2 \neq \| 1^2 \| = \sqrt 2$.

♦ 3.1.25. Bilinearity:
$$\begin{aligned}
\langle c f + d g, h \rangle &= \int_a^b \Big[ \{ c f(x) + d g(x) \}\, h(x) + \{ c f(x) + d g(x) \}'\, h'(x) \Big] dx \\
&= \int_a^b \Big[ c f(x) h(x) + d g(x) h(x) + c f'(x) h'(x) + d g'(x) h'(x) \Big] dx \\
&= c \int_a^b \Big[ f(x) h(x) + f'(x) h'(x) \Big] dx + d \int_a^b \Big[ g(x) h(x) + g'(x) h'(x) \Big] dx = c\,\langle f, h \rangle + d\,\langle g, h \rangle, \\
\langle f, c g + d h \rangle &= \int_a^b \Big[ f(x) \{ c g(x) + d h(x) \} + f'(x) \{ c g(x) + d h(x) \}' \Big] dx \\
&= \int_a^b \Big[ c f(x) g(x) + d f(x) h(x) + c f'(x) g'(x) + d f'(x) h'(x) \Big] dx \\
&= c \int_a^b \Big[ f(x) g(x) + f'(x) g'(x) \Big] dx + d \int_a^b \Big[ f(x) h(x) + f'(x) h'(x) \Big] dx = c\,\langle f, g \rangle + d\,\langle f, h \rangle.
\end{aligned}$$
Symmetry:
$$\langle f, g \rangle = \int_a^b \Big[ f(x) g(x) + f'(x) g'(x) \Big] dx = \int_a^b \Big[ g(x) f(x) + g'(x) f'(x) \Big] dx = \langle g, f \rangle.$$
Positivity:
$$\langle f, f \rangle = \int_a^b \Big[ f(x)^2 + f'(x)^2 \Big] dx > 0 \quad\text{for all}\quad f \not\equiv 0,$$
since the integrand is non-negative, and, by continuity, the integral is zero if and only if both f(x) ≡ 0 and f′(x) ≡ 0 for all a ≤ x ≤ b.

3.1.26. (a) No, because if f(x) is any constant function, then $\langle f, f \rangle = 0$, and so positive definiteness does not hold. (b) Yes. To prove the first bilinearity condition:
$$\langle c f + d g, h \rangle = \int_{-1}^{1} \Big[ c f'(x) + d g'(x) \Big] h'(x) \, dx = c \int_{-1}^{1} f'(x) h'(x) \, dx + d \int_{-1}^{1} g'(x) h'(x) \, dx = c\,\langle f, h \rangle + d\,\langle g, h \rangle.$$
The second has a similar proof, or follows from symmetry, cf. Exercise 3.1.9. To prove symmetry:
$$\langle f, g \rangle = \int_{-1}^{1} f'(x) g'(x) \, dx = \int_{-1}^{1} g'(x) f'(x) \, dx = \langle g, f \rangle.$$
As for positivity, $\langle f, f \rangle = \int_{-1}^{1} f'(x)^2 \, dx \ge 0$. Moreover, since f′ is continuous, $\langle f, f \rangle = 0$ if and only if f′(x) ≡ 0 for all x, and so f(x) ≡ c is constant. But the only constant function in W is the zero function, and so $\langle f, f \rangle > 0$ for all 0 ≠ f ∈ W.

♦ 3.1.27. Suppose $h(x_0) = k > 0$ for some $a < x_0 < b$. Then, by continuity, $h(x) \ge \tfrac12 k$ for $a < x_0 - \delta < x < x_0 + \delta < b$ for some δ > 0. But then, since h(x) ≥ 0 everywhere,
$$\int_a^b h(x) \, dx \ge \int_{x_0 - \delta}^{x_0 + \delta} h(x) \, dx \ge k\,\delta > 0,$$
which is a contradiction. A similar contradiction can be shown when $h(x_0) = k < 0$ for some $a < x_0 < b$. Thus h(x) = 0 for all a < x < b, which, by continuity, also implies h(a) = h(b) = 0. The function in (3.14) gives a discontinuous counterexample.

♦ 3.1.28. (a) To prove the first bilinearity condition:
$$\langle c f + d g, h \rangle = \int_a^b \Big[ c f(x) + d g(x) \Big] h(x)\, w(x) \, dx = c \int_a^b f(x) h(x) w(x) \, dx + d \int_a^b g(x) h(x) w(x) \, dx = c\,\langle f, h \rangle + d\,\langle g, h \rangle.$$
The second has a similar proof, or follows from symmetry, cf. Exercise 3.1.9. To prove symmetry:
$$\langle f, g \rangle = \int_a^b f(x) g(x) w(x) \, dx = \int_a^b g(x) f(x) w(x) \, dx = \langle g, f \rangle.$$
As for positivity, $\langle f, f \rangle = \int_a^b f(x)^2 w(x) \, dx \ge 0$. Moreover, since w(x) > 0 and the integrand is continuous, Exercise 3.1.27 implies that $\langle f, f \rangle = 0$ if and only if $f(x)^2 w(x) \equiv 0$ for all x, and so f(x) ≡ 0.
(b) If $w(x_0) < 0$, then, by continuity, w(x) < 0 for $x_0 - \delta \le x \le x_0 + \delta$ for some δ > 0. Now choose f(x) ≢ 0 so that f(x) = 0 whenever $| x - x_0 | > \delta$. Then
$$\langle f, f \rangle = \int_a^b f(x)^2 w(x) \, dx = \int_{x_0 - \delta}^{x_0 + \delta} f(x)^2 w(x) \, dx < 0,$$
violating positivity.

(c) Bilinearity and symmetry continue to hold. The positivity argument says that h f , f i = 0 implies that f (x) = 0 whenever w(x) > 0. By continuity, f (x) ≡ 0, provided w(x) 6≡ 0 on any open subinterval a ≤ c < x < d ≤ b, and so under this assumption it remains an inner product. However, if w(x) ≡ 0 on a subinterval, then positivity is violated. ♥ 3.1.29. (a) If f (x0 , y0 ) = k > 0 then, Zby continuity, f (x, y) ≥ Z ZZ for some δ > 0. But then

(b) Bilinearity:

h cf + dg , h i =

ZZ

=c

Ω ZZ

h

Ω

f (x, y) dx dy ≥

k for (x, y) ∈ D = { k x − x0 k ≤ δ }

f (x, y) dx dy ≥

D

i

{ c f (x, y) + d g(x, y) } h(x, y) dx dy f (x, y) h(x, y) dx dy + d

Ω

1 2

ZZ

h

Ω

1 2

π k δ 2 > 0.

i

g(x, y) h(x, y) dx dy = c h f , h i + d h g , h i.

The second bilinearity conditions follows fromZthe first and symmetry: Z ZZ hf ,gi =

Ω

f (x, y) g(x, y) dx dy =

Positivity; using part (a),

hf ,f i =

ZZ

Ω

2 3,

g(x, y) f (x, y) dx dy = h g , f i.

[ f (x, y) ]2 dx dy > 0

The formula for the norm is k f k = 3.1.30. (a) h f , g i =

Ω

k f k = 1, k g k =

sZ Z

q

28 45

Ω

;

for all

f 6≡ 0.

[ f (x, y) ]2 dx dy . (b) h f , g i =

1 2

π, k f k =

♥ 3.1.31. (a) To prove the first bilinearity condition: hh c f + d g , h ii =

Z 1h

=c

q √ π , k g k = π3 .

i

(c f1 (x) + d g1 (x)) h1 (x) + (c f2 (x) + d g2 (x)) h2 (x) dx

0 Z 1h 0

i

f1 (x) h1 (x) + f2 (x) h2 (x) dx + d

Z 1h 0

i

g1 (x) h1 (x) + g2 (x) h2 (x) dx

= c hh f , h ii + d hh g , h ii. The second has a similar proof, or follows from symmetry, cf. Exercise 3.1.9. To prove symmetry: hh f , g ii =

Z 1h 0

i

f1 (x) g1 (x) + f2 (x) g2 (x) dx =

As for positivity, hh f , f ii =

Z 1h 0

Z 1h 0

i

2

i

g1 (x) f1 (x) + g2 (x) f2 (x) dx = hh g , f ii.

f1 (x) + f2 (x)2 dx ≥ 0, since the integrand is a non-

negative function. Moreover, since f1 (x) and f2 (x) are continuous, so is f1 (x)2 + f2 (x)2 , and hence hh f , f ii = 0 if and only if f1 (x)2 + f2 (x)2 = 0 for all x, and so f (x) = ( f1 (x), f2 (x) )T ≡ 0. (b) First bilinearity: hh c f + d g , h ii =

Z 1

=c

h c f (x) + d g(x) , h(x) i dx

0 Z 1 0

h f (x) , h(x) i dx + d 78

Z 1 0

h g(x) , h(x) i dx = c hh f , h ii + d hh g , h ii.

Symmetry: hh f , g ii = Positivity: hh f , f ii =

Z 1

Z 1 0

0

h f (x) , g(x) i dx =

Z 1 0

h g(x) , f (x) i dx = hh g , f ii.

k f (x) k2 dx ≥ 0 since the integrand is non-negative. Moreover,

h f , f i = 0 if and only if k f (x) k2 = 0 for all x, and so, in view of the continuity of k f (x) k2 we conclude that f (x) ≡ 0. ! 1 −1 T >0 (c) This follows because h v , w i = v1 w1 −v1 w2 −v2 w1 +3 v2 w2 v K w for K = −1 3 2 defines an inner product on R .

3.2.1.
(a) $| v_1 \cdot v_2 | = 3 \le 5 = \sqrt 5\, \sqrt 5 = \| v_1 \| \, \| v_2 \|$; angle: $\cos^{-1} \tfrac35 \approx .9273$;
(b) $| v_1 \cdot v_2 | = 1 \le 2 = \sqrt 2\, \sqrt 2 = \| v_1 \| \, \| v_2 \|$; angle: $\tfrac23 \pi \approx 2.0944$;
(c) $| v_1 \cdot v_2 | = 0 \le 2 \sqrt 6 = \sqrt 2\, \sqrt{12} = \| v_1 \| \, \| v_2 \|$; angle: $\tfrac12 \pi \approx 1.5708$;
(d) $| v_1 \cdot v_2 | = 3 \le 3 \sqrt 2 = \sqrt 3\, \sqrt 6 = \| v_1 \| \, \| v_2 \|$; angle: $\tfrac34 \pi \approx 2.3562$;
(e) $| v_1 \cdot v_2 | = 4 \le 2 \sqrt{15} = \sqrt{10}\, \sqrt 6 = \| v_1 \| \, \| v_2 \|$; angle: $\cos^{-1} \left( - \tfrac{2}{\sqrt{15}} \right) \approx 2.1134$.

3.2.2. (a) $\tfrac13 \pi$; (b) 0, $\tfrac13 \pi$, $\tfrac12 \pi$, $\tfrac23 \pi$, or π, depending upon whether −1 appears 0, 1, 2, 3 or 4 times in the second vector.

3.2.3. The side lengths are all equal to
$$\| (1, 1, 0) - (0, 0, 0) \| = \| (1, 1, 0) - (1, 0, 1) \| = \| (1, 1, 0) - (0, 1, 1) \| = \cdots = \sqrt 2.$$
The edge angle is $\tfrac13 \pi = 60^\circ$. The center angle is $\cos \theta = - \tfrac13$, so $\theta = 1.9106 = 109.4712^\circ$.
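The numbers in 3.2.3 can be reproduced directly from the vertex coordinates that appear in the solution. The sketch below assumes the four vertices (0,0,0), (1,1,0), (1,0,1), (0,1,1) and takes the centre to be their average; all helper names are illustrative.

```python
import math
from itertools import combinations

VERTICES = [(0, 0, 0), (1, 1, 0), (1, 0, 1), (0, 1, 1)]

def sub(p, q):
    return tuple(a - b for a, b in zip(p, q))

def dot(p, q):
    return sum(a * b for a, b in zip(p, q))

def norm(p):
    return math.sqrt(dot(p, p))

def angle(p, q):
    """Angle between two vectors from the dot product formula."""
    return math.acos(dot(p, q) / (norm(p) * norm(q)))

# All six edge lengths.
EDGE_LENGTHS = [norm(sub(p, q)) for p, q in combinations(VERTICES, 2)]

# Angle between two edges meeting at the first vertex.
EDGE_ANGLE = angle(sub(VERTICES[1], VERTICES[0]), sub(VERTICES[2], VERTICES[0]))

# Angle subtended at the centre by two vertices.
CENTER = tuple(sum(c) / 4 for c in zip(*VERTICES))
CENTER_ANGLE = angle(sub(VERTICES[0], CENTER), sub(VERTICES[1], CENTER))

print(EDGE_LENGTHS)
print(math.degrees(EDGE_ANGLE), CENTER_ANGLE, math.degrees(CENTER_ANGLE))
```

The centre works out to (1/2, 1/2, 1/2), and the dot product of two centre-to-vertex vectors is −1/4 against squared lengths of 3/4, which gives the cosine −1/3 quoted above.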

3.2.4.
(a) $| v \cdot w | = 5 \le 7.0711 = \sqrt 5\, \sqrt{10} = \| v \| \, \| w \|$.
(b) $| v \cdot w | = 11 \le 13.0767 = 3 \sqrt{19} = \| v \| \, \| w \|$.
(c) $| v \cdot w | = 22 \le 23.6432 = \sqrt{13}\, \sqrt{43} = \| v \| \, \| w \|$.

3.2.5.
(a) $| v \cdot w | = 6 \le 6.4807 = \sqrt{14}\, \sqrt 3 = \| v \| \, \| w \|$.
(b) $| \langle v, w \rangle | = 11 \le 11.7473 = \sqrt{23}\, \sqrt 6 = \| v \| \, \| w \|$.
(c) $| \langle v, w \rangle | = 19 \le 19.4936 = \sqrt{38}\, \sqrt{10} = \| v \| \, \| w \|$.

3.2.6. Set $v = (a, b)^T$, $w = (\cos \theta, \sin \theta)^T$, so that Cauchy–Schwarz gives
$$| v \cdot w | = | a \cos \theta + b \sin \theta | \le \sqrt{a^2 + b^2} = \| v \| \, \| w \|.$$

3.2.7. Set $v = (a_1, \ldots, a_n)^T$, $w = (1, 1, \ldots, 1)^T$, so that Cauchy–Schwarz gives
$$| v \cdot w | = | a_1 + a_2 + \cdots + a_n | \le \sqrt n\, \sqrt{a_1^2 + a_2^2 + \cdots + a_n^2} = \| v \| \, \| w \|.$$
Equality holds if and only if v = a w, i.e., $a_1 = a_2 = \cdots = a_n$.
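The inequality of 3.2.7, together with its equality case, is easy to test numerically. This is a minimal sketch in which `sum_bound` is a hypothetical helper returning both sides of the inequality for a list of real numbers.

```python
import math

def sum_bound(a):
    """Return (|a_1 + ... + a_n|, sqrt(n) * sqrt(a_1^2 + ... + a_n^2))."""
    lhs = abs(sum(a))
    rhs = math.sqrt(len(a)) * math.sqrt(sum(x * x for x in a))
    return lhs, rhs

# Strict inequality for unequal entries, equality when all entries coincide.
for sample in ([1, 2, 3], [1, -1, 4, 0], [2, 2, 2, 2]):
    lhs, rhs = sum_bound(sample)
    print(sample, lhs, rhs)
```

For the sample [1, 2, 3] the right-hand side is $\sqrt 3\, \sqrt{14} \approx 6.4807$, while for the constant sample [2, 2, 2, 2] both sides equal 8, matching the equality condition $a_1 = \cdots = a_n$.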

♦ 3.2.8. Using (3.20), k v − w k2 = h v − w , v − w i = k v k2 − 2 h v , w i + k w k2 = k v k2 + k w k2 − 2 k v k k w k cos θ. ♦ 3.2.9. Since a ≤ | a | for any real number a, so h v , w i ≤ | h v , w i | ≤ k v k k w k.

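♦ 3.2.10. Expanding $\| v + w \|^2 - \| v - w \|^2 = 4\,\langle v, w \rangle = 4\,\| v \| \, \| w \| \cos \theta$.
[Figure: parallelogram with sides of length $\| v \|$, $\| w \|$ meeting at angle θ, and diagonals of length $\| v + w \|$, $\| v - w \|$.]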

♥ 3.2.11. (a) It is not an inner product. Bilinearity holds, but symmetry and positivity do not. (b) Assuming v, w 6= 0, we compute sin2 θ = 1 − cos2 θ =

(v × w)2 k v k2 k w k2 − h v , w i = . k v k2 k w k2 k v k2 k w k2

The result follows by taking square roots of both sides, where the sign is fixed by the orientation of the angle. (c) By (b), v × w = 0 if and only if sin θ = 0 or v = 0 or w = 0,!which implies that they v1 w1 = 0 if and only if the are parallel vectors. Alternative proof: v × w = det v2 w2 columns v, w of the matrix are linearly dependent, and hence parallel vectors. (d) The parallelogram has side length k v k and height k w k | sin θ |, so its area is k v k k w k | sin θ | = | v × w |. 3.2.12. q (a) | h f , g i | = 1 ≤ 1.03191 = 13 (b) | h f , g i | = (c) | h f , g i | =

3.2.13. (a)

1 2

π,

q

1 1 2 2 eq −q 2 = k f k k g k; 2/e = .7358 ≤ 1.555 = 23 12 (e2 − e−2 ) = k f k k g k; √ √ 1 −1 e − 1 = k f k k g k. 2 = .5 ≤ .5253 = 2 − 5 e

(b) cos

3.2.14. (a) | h f , g i | =

3.2.15. (a) a = − 43 ;

2 3

−1

≤

√ 2 2 √ = .450301, π

q

28 45

= k f k k g k;

(c)

1 2

π.

(b) | h f , g i | =

π 2

≤

π √ 3

= k f k k g k.

(b) no.

3.2.16. All scalar multiples of 3.2.17. 3.2.15: a = 0.

“

”T 1 7 , − , 1 . 2 4 0

“

”T 1 21 . 6 , − 24 , 1 1 1 0

3.2.16: all scalar multiples of

0

1

0

1

2 1 2 1 C B C B C C B B B −3 C B −2 C B −2 C B −3 C C. C + bB C, so v = aB C, B 3.2.18. All vectors in the subspace spanned by B @ 0A @ 1A @ 1A @ 0A 1 0 1 0 3.2.19. ( −3, 0, 0, 1 )T ,

“

0, − 32 , 0, 1

”T

, ( 0, 0, 3, 1 )T . 80

3.2.20. For example, u = ( 1, 0, 0 )T , v = ( 0, 1, 0 )T , w = ( 0, 1, 1 )T are linearly independent, whereas u = ( 1, 0, 0 )T , v = w = ( 0, 1, 0 )T are linearly dependent. 3.2.21. (a) All solutions to a + b = 1; (b) all solutions to a + 3 b = 2. 3.2.22. Only the zero vector satisfies h 0 , 0 i = k 0 k2 = 0.

♦ 3.2.23. Choose v = w; then 0 = h w , w i = k w k2 , and hence w = 0.

3.2.24. h v + w , v − w i = k v k2 − k w k2 = 0 provided k v k = k w k. They can’t both be unit vectors: k v + w k2 = k v k2 + 2 h v , w i + k w k2 = 2 + 2 h v , w i = 1 if and only if h v , w i = − 12 , while k v − w k2 = 2 − 2 h v , w i = 1 if and only if h v , w i = 12 , and so θ = 60◦ .

♦ 3.2.25. If h v , x i = 0 = h v , y i, then h v , c x + d y i = c h v , x i + d h v , y i = 0 for c, d, ∈ R, proving closure. 3.2.26. (a) h p1 , p2 i =

Z 1“

h p2 , p3 i =

(b) For n 6= m,

h sin n π x , sin m π x i =

x−

0Z 1“ 0

Z 1 0

1 2

x−

”

1 2

dx = 0, ”“

h p 1 , p3 i =

x2 − x +

1 6

”

sin n π x sin m π x dx =

“

”

“

3.2.28. p(x) = a (e − 1) x − 1 + b x2 − (e − 2) x

−x+

dx = 0.

Z 1

3.2.27. Any nonzero constant multiple of x2 − 31 .

Z 1“ x2 0

1 0 2

”

h

1 6

”

dx = 0,

i

cos(n + m) π x − cos(n − m) π x dx = 0.

for any a, b ∈ R.

3.2.29. 1 is orthogonal to x, cos πx, sin πx; x is orthogonal to 1, cos πx; cos πx is orthogonal to 1, x, sin πx; sin πx is orthogonal to 1, cos πx; ex is not orthogonal to any of the others. 3.2.30. Example: 1 and x − 23 . √ 5 3.2.31. (a) θ = cos−1 √ ≈ 0.99376 radians; (b) v · w = 5 < 9.165 ≈ 84 = k v k k w k, 84 “ ”T √ √ √ k v + w k = 30 ≈ 5.477 < 6.191 ≈ 14 + 6 = k v k + k w k; (c) − 73 t, − 31 t, t . 3.2.32. √ (a) k v1 + v2 k = 4 √≤ 2 5 √= k v1 k + k v2 k; (b) k v1 + v2 k = √2 ≤ 2√2 = √ k v1 k + k v2 k; (c) k v1 + v2 k = √14 ≤√ 2 +√ 12 = k v1 k + k v2 k; (d) k v1 + v2 k = √3 ≤ √3 + √ 6 = k v1 k + k v2 k; (e) k v1 + v2 k = 8 ≤ 10 + 6 = k v1 k + k v2 k.

3.2.33. √ √ √ (a) k v1 + v2 k = √5 ≤ 5 + √ 10 = k v1 k + k v2 k; (b) k v1 + v2 k = √6 ≤ 3√ + 19√ = k v1 k + k v2 k; (c) k v1 + v2 k = 12 ≤ 13 + 43 = k v1 k + k v2 k. 3.2.34. q (a) k f + g k = 11 6 + (b) k f + g k =

q

2 3

+

q q 1 1 2 1 2 e ≈ 2.35114 ≤ 2.36467 ≈ + 2 3 2e 1 2 1 −2 −1 − 2 e ≈ 2.40105 ≤ 2 e + 4e

81

− 1 = k f k + k g k;

q

q

≤ 2.72093 ≈ 23 + 12 (e2 − e−2 ) = k f k + k g k; √ √ √ (c) k f + g k = 2 + e − 5 e−1 ≈ 1.69673 ≤ 1.71159 ≈ 2 − 5 e−1 + e − 1 = k f k + k g k. 3.2.35. q q ≈ 1.71917 ≤ 2.71917 ≈ 1 + 28 (a) k f + g k = 133 45 45 = k f k + k g k; q q √ 7 (b) k f + g k = 3 π ≈ 2.70747 ≤ 2.79578 ≈ π + 13 π = k f k + k g k.

3.2.36. (a) h 1 , x i = 0, so θ = 12 π. Yes, they are orthogonal. q q √ (b) Note k 1 k = 2, k x k = 23 , so h 1 , x i = 0 < 43 = k 1 k k x k, and q q √ k 1 + x k = 83 ≈ 1.63299 < 2.23071 ≈ 2 + 23 = k 1 k + k x k. “ (c) h 1 , p i = 2 a + 32 c = 0 and h x , p i = 23 b = 0 if and only if p(x) = c x2 − 3.2.37. ˛ (a)

s s ˛ Z 1 Z 1 ˛ ˛Z 1 ˛ ˛ x 2 x ˛≤ ˛ f (x) g(x) e dx f (x) e dx g(x)2 ex dx ˛ ˛ 0 0 0 s s s Z 1h Z 1 Z 1 i2 2 x x

1 3

”

.

,

f (x) e dx + g(x)2 ex dx ; 0 q √ (b) h f , g i = 21 (e2 − 1) = 3.1945 ≤ 3.3063 = e − 1 13 (e3 − 1) = k f k k g k, q q √ k f + g k = 13 e3 + e2 + e − 73 = 3.8038 ≤ 3.8331 = e − 1 + 13 (e3 − 1) = k f k + k g k; √ 3 e2 − 1 q (c) cos θ = = .9662, so θ = .2607. 2 (e − 1)(e3 − 1) 0

f (x) + g(x)

3.2.38. ˛ (a) s

e dx ≤

0

˛ ˛Z 1 h ˛ i ˛ ˛ ′ ′ ˛ ˛ f (x) g(x) + f (x) g (x) dx ˛ 0 ˛

Z 1h

≤

s Z 1h 0

i

2

′

f (x) + f (x)

s Z 1h

2

i

dx

s

Z 1h

i

0

i

g(x)2 + g ′ (x)2 dx; rh

i

[f (x) + g(x)]2 + [f ′ (x) + g ′ (x)]2 dx ≤ f (x)2 + f ′ (x)2 dx + g(x)2 + g ′ (x)2 . 0 0 √ (b) h f , g i = e − 1 ≈ 1.7183 ≤ 2.5277 ≈ 1 · e2 − 1 = k f k k g k; √ √ k f + g k = e2 + 2 e − 2 ≈ 3.2903 ≤ 3.5277 ≈ 1 + e2 − 1 = k f k + k g k. (c) cos θ =

s

e−1 ≈ .6798, so θ ≈ .8233. e+1

3.2.39. Using the triangle inequality, k v k = k (v − w) + w k ≤ k v − w k + k w k. Therefore, k v − w k ≥ k v k − k w k. Switching v and w proves v k v − w k ≥ k w k − k v k, and the result is the combination of both inequalities. In the figure, the inequality states that the length of the side v−w of the triangle opposite the origin is at least as large as the difference between the other two lengths.

v−w w

3.2.40. True. By the triangle inequality, k w k = k (− v) + (w + v) k ≤ k − v k + k v + w k = k v k + k v + w k. ♥ 3.2.41. (a) This follows immediately by identifying R ∞ with the space of all functions f : N → R where N = { 1, 2, 3, . . . } are the natural numbers. Or, one can tediously verify all the vector space axioms.


(b) If x, y ∈ ℓ2 , then, by the triangle inequality on R n , n X

k=1

(xk + yk )2 ≤

0v u u n Bu X @t k=1

x2k +

v u n uX u t k=1

1

yk2 C A

2

≤

0v u u ∞ Bu X @t k=1

x2k +

v u ∞ uX u t k=1

1

2

yk2 C A < ∞,

and hence, in the limit as n → ∞, the series of non-negative terms is also bounded: ∞ X

k=1

(xk + yk )2 < ∞, proving that x + y ∈ ℓ2 .

(c) (1, 0, 0, 0, . . . ), (1, 12 , 41 , 81 , . . . ) and (1, 12 , 31 , 14 , . . . ) are in ℓ2 , while (1, 1, 1, 1, . . . ) and (1, √1 , √1 , √1 , . . . ) are not. (d) True. convergence of

∞ X

k=1

2

3

4

x2k requires x2k → 0 as k → ∞ and hence xk → 0.

(e) False – see last example in part (b). (f )

∞ X

k=1

x2k =

∞ X

α2 k is a geometric series which converges if and only if | α | < 1.

k=1

(g) Using the integral test,

∞ X

k=1

x2k =

∞ X

k=1

k2 α converges if and only if 2 α < −1, so α < − 21 .

(h) First, we need to prove that it is well-defined, which we do by proving that the series is absolutely convergent. If x, y ∈ ℓ2 , then, by the Cauchy–Schwarz inequality on R n for the vectors ( | x1 |, . . . , | xn | )T , ( | y1 |, . . . , | yn | )T , n X

k=1

| xk yk | ≤

v u n uX u t k=1

v u u n 2 uX xk t k=1

and hence, letting n → ∞, we conclude that

v u ∞ uX u t k=1

yk2 ≤ n X

k=1

v u u ∞ 2 uX xk t k=1

yk2 < ∞,

| xk yk | < ∞. Bilinearity, symmetry

and positivity are now straightforward to verify. (i)

∞ X

k=1

| xk yk | ≤

v u ∞ uX u t k=1

v u u ∞ 2 uX xk t k=1

yk2

,

v u ∞ uX u t (xk k=1

3.3.1. k v + w k1 = √ 2 ≤ 2 = 1 + 1 = k v k 1 + k w k1 ; k v + w k2 = √2 ≤ 2 = 1 + 1 = k v k2 + k w k2 ; k v + w k 3 = 3 2 ≤ 2 = 1 + 1 = k v k 3 + k w k3 ; k v + w k ∞ = 1 ≤ 2 = 1 + 1 = k v k ∞ + k w k∞ .

+ yk )

3.3.2. (a) k v + w k1 = 6√ ≤6=3 1 + k w k1 ; √+ 3 = √ k v k√ k v + w k2 = 3√ 2 ≤ 2 √5 = √5 + √5 = k v k2 + k w k2 ; k v + w k3 = 3 54 ≤ 2 3 9 = 3 9 + 3 9 = k v k3 + k w k3 ; k v + w k ∞ = 3 ≤ 4 = 2 + 2 = k v k ∞ + k w k∞ . (b) k v + w k1 = 2 = k v√ k 1 + k w k1 ; √≤ 4 =√2 + 2√ k v + w k2 = √2 ≤ 2 √2 = √2 + √2 = k v k2 + k w k2 ; k v + w k 3 = 3 2 ≤ 2 3 2 = 3 2 + 3 2 = k v k 3 + k w k3 ; k v + w k ∞ = 1 ≤ 2 = 1 + 1 = k v k ∞ + k w k∞ . (c) k v + w k1 = 10 √ ≤ 10√= 4 +√6 = k v k1 + k w k1 ; k v + w k2 = 34 ≤ 6 + 14 = k v k2 + k w k2 ; 83

2

≤

v u ∞ uX u t k=1

x2k

+

v u ∞ uX u t k=1

yk2 .

√ √ √ k v + w k3 = 3 118 ≤ 3 10 + 3 36 = k v k3 + k w k3 ; k v + w k ∞ = 4 ≤ 5 = 2 + 3 = k v k ∞ + k w k∞ .

3.3.3. (a) k u − v k1 = 5, k v − w k1 = 7, so u, √ k u − w k1 = 6, √ √v are closest. (b) k u − v k2 = 13, k u − w k2 = 12, k v − w k2 = 21, so u, w are closest. (c) k u − v k∞ = 3, k u − w k∞ = 2, k v − w k∞ = 4, so u, w are closest.

3.3.4. (a) k f k∞ = 23 , k g k∞ = 41 ; (b) k f + g k∞ = 3.3.5. (a) k f k1 =

5 18 ,

3.3.6. (a) k f − g k1 =

k g k1 = 61 ; (b) k f + g k1 =

1 2

5 6

4 √ 9 3

≤ ≤

2 3

+

5 18

1 4

+

= k f k ∞ + k g k∞ .

1 6

= k f k 1 + k g k1 .

= .5, k f − h k1 = 1 − π2 = .36338, k g − h k1 =

are closest. (b) k f − g k2 = q

2 3

2 π

q

1 3

= .57735, k f − h k2 =

q

3 2

−

4 π

1 2

− π1 = .18169, so g, h

= .47619, k g − h k2 =

− = .44352, so g, h are closest. (c) k f − g k∞ = 1, k f − h k∞ = 1, k g − h k∞ = 1, so they are equidistant.

3.3.7. (a) k f + g k1 = (b) k f

(c) k f (d) k f

3 5 k f k 1 + k g k1 ; 4 = .75 ≤ 1.3125 ≈ 1 + 16 = q q 31 7 = k f k 2 + k g k2 ; + g k2 = 48 ≈ .8036 ≤ 1.3819 ≈ 1 + 48 √ √ 3 3 39 41 + g k3 = 4 ≈ .8478 ≤ 1.4310 ≈ 1 + 8 = k f k3 + k g k3 ; + g k∞ = 54 = 1.25 ≤ 1.75 = 1 + 43 = k f k∞ + k g k∞ .

3.3.8. (a) k f + g k1 = eq− e−1 ≈ 2.3504 ≤ 2.3504 ≈ (e − 1) + (1 − e−1 ) = k f k1 + k g k1 ; (b) k f + g k2 = 12 e2 + 2 − 21 e−2 ≈ 2.3721 (c) k f (d) k f

q 1 1 −2 1 + = k f k 2 + k g k2 ; 2 2 − 2e q 3 2 3 2 −3 −1 + g k3 = 3 e + 3 e − 3 e − 3 e ≈ 2.3945 q q ≤ 2.5346 ≈ 3 13 e3 − 13 + 3 13 − 31 e−3 = k f k3 + k g k3 ; + g k∞ = e + e−1 ≈ 3.08616 ≤ 3.71828 ≈ e + 1 = k f k∞ + k g k∞ .

≤ 2.4448 ≈

q

1 2 2e

−

3.3.9. Positivity: since both summands are non-negative, k x k ≥ 0. Moreover, k x k = 0 if and only if x = 0 = x − y, and so x = ( x, y )T = 0. “ ” Homogeneity: k c x k = | c x | + 2 | c x − c y | = | c | | x | + 2 | x − y | = | c | k v k. Triangle inequality: k x + v k = |“x + v | + 2 ” | x + v“ − y − w | ” ≤ | x | + | v | + 2 | x − y | + | v − w | = k x k + k v k.

3.3.10. (a) Comes from weighted inner product h v , w i = 2 v1 w1 + 3 v2 w2 . (b) Comes from inner product h v , w i = 2 v1 w1 − 21 v1 w2 − 12 v2 w1 + 2 v2 w2 ; positivity 2 follows because h v , v i = 2 (v1 − 41 v2 )2 + 15 8 v2 . “ ” (c) Clearly positive; k c v k = 2 | c v1 | + | c v2 | = | c | 2 | v1 | + | v2 | = | c | k v k; k v + w k = 2 | v1 + w1 | + | v2 +nw2 | ≤ 2 | v1 | +o| v2 | + 2 | w1 n | + | w2 | = k v k + k w k. o (d) Clearly positive; k c v k = max 2 | c v1 |, | c v2 | = | c | max 2 | v1 |, | v2 | = | c | k v k; n

k v + w k = max 2 | v1 + w1 |, | v2 + w2 | n

o

n

o

n

≤ max 2 | v1 | + 2 | w1 |, | v2 | + | w2 | o

o

≤ max 2 | v1 |, | v2 | + max 2 | w1 |, | w2 | = k v k + k w k. (e) Clearly non-negative and equals zero if and only if v1n− v2 = 0 = v1 + v2o, so v = 0; n o k c v k = max | c v1 − c v2 |, | c v1 + c v2 | = | c | max | v1 − v2 |, | v1 + v2 | = | c | k v k; 84

n

k v + w k = max | v1 + w1 − v2 − w2 |, | v1 + w1 + v2 + w2 | n

o

≤ max | v1 − v2 | + | w1 − w2 |, | v1 + v2 | + | w1 + w2 | n

o

n

o

o

≤ max | v1 − v2 |, | v1 + v2 | + max | w1 − w2 |, | w1 + w2 | = k v k + k w k. (f ) Clearly non-negative and equals zero if and only if v1 − v2 = 0 = v1 + v2 , so v = 0; “ ” k c v k = | c v1 − c v2 | + | c v1 + c v2 | = | c | | v1 − v2 | + | v1 + v2 | = | c | k v k; k v + w k = | v 1 + w1 − v 2 − w2 | + | v 1 + w1 + v 2 + w2 | ≤ | v1 − v2 | + | v1 + v2 | + | w1 − w2 | + | w1 + w2 | = k v k + k w k. 3.3.11. Parts (a), (c) and (e) define norms. (b) doesn’t since, for instance, k ( 1, −1, 0 ) T k = 0. (d) doesn’t since, for instance, k ( 1, −1, 1 )T k = 0. 3.3.12. Clearly if v = 0, then w = 0 since only the zero vector has norm 0. If v 6= 0, then w = c v, and k w k = | c | k v k = k v k if and only if | c | = 1. 3.3.13. True for an inner product norm, but false in general. For example, k e1 + e2 k1 = 2 = k e1 k1 + k e2 k1 . 3.3.14. If x = ( 1, 0 )T , ”y = ( 0, 1 )T , say, then k x + y k2∞ + k x − y k2∞ = 1 + 1 = 2 6= 4 = “ 2 k x k2∞ + k y k2∞ , which contradicts the identity in Exercise 3.1.12.

3.3.15. No: neither result satisfies the bilinearity property. For example, if $v = (1, 0)^T$, $w = (1, 1)^T$, then
$$\begin{aligned}
\langle 2 v, w \rangle &= \tfrac14 \big( \| 2 v + w \|_1^2 - \| 2 v - w \|_1^2 \big) = 3 \neq 2\,\langle v, w \rangle = \tfrac12 \big( \| v + w \|_1^2 - \| v - w \|_1^2 \big) = 4, \\
\langle 2 v, w \rangle &= \tfrac14 \big( \| 2 v + w \|_\infty^2 - \| 2 v - w \|_\infty^2 \big) = 2 \neq 2\,\langle v, w \rangle = \tfrac12 \big( \| v + w \|_\infty^2 - \| v - w \|_\infty^2 \big) = \tfrac32.
\end{aligned}$$

♦ 3.3.16. Let $m = \| v \|_\infty = \max \{ | v_1 |, \ldots, | v_n | \}$. Then
$$\| v \|_p = m \left( \sum_{i=1}^n \left( \frac{| v_i |}{m} \right)^p \right)^{1/p}.$$
Now if $| v_i | < m$ then $\left( \dfrac{| v_i |}{m} \right)^p \to 0$ as p → ∞. Therefore, $\| v \|_p \sim m\, k^{1/p} \to m$ as p → ∞, where 1 ≤ k is the number of entries in v with $| v_i | = m$.

(a) k f + g k1 =

Z b a

| f (x) + g(x) | dx ≤ =

Z bh

a Z b a

i

| f (x) | + | g(x) | dx

| f (x) | dx +

Z b a

| g(x) | dx = k f k1 + k g k1 .

(b) k f + g k∞ = max | f (x) + g(x) | ≤ max | f (x) | + | g(x) |

≤ max | f (x) | + max | g(x) | = k f k∞ + k g k∞ .

♦ 3.3.18. (a) Positivity follows since the integrand is non-negative; further,

$\|c f\|_{1,w} = \int_a^b |c f(x)| \, w(x) \, dx = |c| \int_a^b |f(x)| \, w(x) \, dx = |c| \, \|f\|_{1,w}$;

$\|f + g\|_{1,w} = \int_a^b |f(x) + g(x)| \, w(x) \, dx \le \int_a^b |f(x)| \, w(x) \, dx + \int_a^b |g(x)| \, w(x) \, dx = \|f\|_{1,w} + \|g\|_{1,w}$.

(b) Positivity is immediate; further,

$\|c f\|_{\infty,w} = \max_{a \le x \le b} \{ |c f(x)| \, w(x) \} = |c| \max_{a \le x \le b} \{ |f(x)| \, w(x) \} = |c| \, \|f\|_{\infty,w}$;

$\|f + g\|_{\infty,w} = \max_{a \le x \le b} \{ |f(x) + g(x)| \, w(x) \} \le \max_{a \le x \le b} \{ |f(x)| \, w(x) \} + \max_{a \le x \le b} \{ |g(x)| \, w(x) \} = \|f\|_{\infty,w} + \|g\|_{\infty,w}$.

3.3.19. (a) Clearly positive; $\|c v\| = \max\{ \|c v\|_1, \|c v\|_2 \} = |c| \max\{ \|v\|_1, \|v\|_2 \} = |c| \, \|v\|$;

$\|v + w\| = \max\{ \|v + w\|_1, \|v + w\|_2 \} \le \max\{ \|v\|_1 + \|w\|_1, \|v\|_2 + \|w\|_2 \} \le \max\{ \|v\|_1, \|v\|_2 \} + \max\{ \|w\|_1, \|w\|_2 \} = \|v\| + \|w\|$.

(b) No. The triangle inequality is not necessarily valid. For example, in $R^2$ set $\|v\|_1 = |x| + |y|$, $\|v\|_2 = \tfrac32 \max\{ |x|, |y| \}$. Then if $v = (1, .4)^T$, $w = (1, .6)^T$, then $\|v\| = \min\{ \|v\|_1, \|v\|_2 \} = 1.4$, $\|w\| = \min\{ \|w\|_1, \|w\|_2 \} = 1.5$, but $\|v + w\| = \min\{ \|v + w\|_1, \|v + w\|_2 \} = 3 > 2.9 = \|v\| + \|w\|$.

(c) Yes.

(d) No. The triangle inequality is not necessarily valid. For example, if $v = (1, 1)^T$, $w = (1, 0)^T$, so $v + w = (2, 1)^T$, then $\sqrt{ \|v + w\|_1 \|v + w\|_\infty } = \sqrt6 > \sqrt2 + 1 = \sqrt{ \|v\|_1 \|v\|_\infty } + \sqrt{ \|w\|_1 \|w\|_\infty }$.

3.3.20. (a) $\bigl( \tfrac{1}{\sqrt{14}}, \tfrac{2}{\sqrt{14}}, -\tfrac{3}{\sqrt{14}} \bigr)^T$; (b) $\bigl( \tfrac13, \tfrac23, -1 \bigr)^T$; (c) $\bigl( \tfrac16, \tfrac13, -\tfrac12 \bigr)^T$; (d) $\bigl( \tfrac13, \tfrac23, -1 \bigr)^T$; (e) $\bigl( \tfrac16, \tfrac13, -\tfrac12 \bigr)^T$.
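The two counterexamples in Exercise 3.3.19(b) and (d) are easy to check by machine. The following is a small sketch; the helper names (`min_norm`, `geo_norm`, and so on) are made up for this illustration.

```python
import math


def norm1(v):
    return sum(abs(x) for x in v)


def norm_inf(v):
    return max(abs(x) for x in v)


def min_norm(v):
    # Candidate from part (b): minimum of |x| + |y| and (3/2) max{|x|, |y|}.
    return min(norm1(v), 1.5 * norm_inf(v))


def geo_norm(v):
    # Candidate from part (d): geometric mean of the 1-norm and the max-norm.
    return math.sqrt(norm1(v) * norm_inf(v))


def add(v, w):
    return [a + b for a, b in zip(v, w)]


# Part (b): the triangle inequality fails for v = (1, .4), w = (1, .6).
v, w = [1.0, 0.4], [1.0, 0.6]
print(min_norm(add(v, w)), min_norm(v) + min_norm(w))

# Part (d): it also fails for v = (1, 1), w = (1, 0).
v2, w2 = [1.0, 1.0], [1.0, 0.0]
print(geo_norm(add(v2, w2)), geo_norm(v2) + geo_norm(w2))
```

In both cases the first printed number exceeds the second, which is exactly the violation claimed in the solution.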

3.3.21. (a) $\|v\|^2 = \cos^2\theta \cos^2\phi + \cos^2\theta \sin^2\phi + \sin^2\theta = \cos^2\theta + \sin^2\theta = 1$; (b) $\|v\|^2 = \tfrac12 (\cos^2\theta + \sin^2\theta + \cos^2\phi + \sin^2\phi) = 1$; (c) $\|v\|^2 = \cos^2\theta \cos^2\phi \cos^2\psi + \cos^2\theta \cos^2\phi \sin^2\psi + \cos^2\theta \sin^2\phi + \sin^2\theta = \cos^2\theta \cos^2\phi + \cos^2\theta \sin^2\phi + \sin^2\theta = \cos^2\theta + \sin^2\theta = 1$.

3.3.22. 2 vectors, namely $u = v / \|v\|$ and $-u = -v / \|v\|$.

3.3.23. (a), (b), (c): [plots not reproduced]

3.3.24. (a), (b), (c), (d), (e), (f): [plots not reproduced]

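3.3.25.

(a) Unit octahedron: [figure not reproduced]

(b) Unit cube: [figure not reproduced]

(c) Ellipsoid with semi-axes $\tfrac{1}{\sqrt2}, 1, \tfrac{1}{\sqrt3}$: [figure not reproduced]

(d) [figure not reproduced]

In the last case, the corners of the “top” face of the parallelepiped are at $v_1 = \bigl( -\tfrac12, -\tfrac12, \tfrac32 \bigr)^T$, $v_2 = \bigl( -\tfrac32, \tfrac12, \tfrac12 \bigr)^T$, $v_3 = \bigl( -\tfrac12, \tfrac32, -\tfrac12 \bigr)^T$, $v_4 = \bigl( \tfrac12, \tfrac12, \tfrac12 \bigr)^T$, while the corners of the “bottom” (hidden) face are $-v_1, -v_2, -v_3, -v_4$.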

3.3.26. Define $|||\, x \,||| = \|x\| / \|v\|$ for any $x \in V$.

3.3.27. True. Having the same unit sphere means that $\|u\|_1 = 1$ whenever $\|u\|_2 = 1$. If $v \ne 0$ is any other nonzero vector space element, then $u = v / \|v\|_1$ satisfies $1 = \|u\|_1 = \|u\|_2$, and so $\|v\|_2 = \| \, \|v\|_1 u \, \|_2 = \|v\|_1 \|u\|_2 = \|v\|_1$. Finally $\|0\|_1 = 0 = \|0\|_2$, and so the norms agree on all vectors in $V$.

3.3.28. (a) $\tfrac{18}{5} x - \tfrac65$; (b) $3x - 1$; (c) $\tfrac32 x - \tfrac12$; (d) $\tfrac{9}{10} x - \tfrac{3}{10}$; (e) $\tfrac{3}{2\sqrt2} x - \tfrac{1}{2\sqrt2}$; (f) $\tfrac34 x - \tfrac14$.

3.3.29. (a), (b), (c), (f), (i). In cases (g), (h), the norm of $f$ is not finite.

♦ 3.3.30. If $\|x\|, \|y\| \le 1$ and $0 \le t \le 1$, then, by the triangle inequality, $\|t x + (1 - t) y\| \le t \|x\| + (1 - t) \|y\| \le 1$. The unit sphere is not convex since, for instance, $\tfrac12 x + \tfrac12 (-x) = 0 \notin S_1$ when $x, -x \in S_1$.

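3.3.31.
(a) $\|v\|_2 = \sqrt2$, $\|v\|_\infty = 1$, and $\tfrac{1}{\sqrt2} \sqrt2 \le 1 \le \sqrt2$;
(b) $\|v\|_2 = \sqrt{14}$, $\|v\|_\infty = 3$, and $\tfrac{1}{\sqrt3} \sqrt{14} \le 3 \le \sqrt{14}$;
(c) $\|v\|_2 = 2$, $\|v\|_\infty = 1$, and $\tfrac12 \cdot 2 \le 1 \le 2$;
(d) $\|v\|_2 = 2\sqrt2$, $\|v\|_\infty = 2$, and $\tfrac{1}{\sqrt5} \, 2\sqrt2 \le 2 \le 2\sqrt2$.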

3.3.32. (a) $v = (a, 0)^T$ or $(0, a)^T$; (b) $v = (a, 0)^T$ or $(0, a)^T$; (c) $v = (a, 0)^T$ or $(0, a)^T$; (d) $v = (a, a)^T$ or $(a, -a)^T$.

3.3.33. Let $0 < \varepsilon \ll 1$ be small. First, if the entries satisfy $|v_j| < \varepsilon$ for all $j$, then $\|v\|_\infty = \max\{ |v_j| \} < \varepsilon$; conversely, if $\|v\|_\infty < \varepsilon$, then $|v_j| \le \|v\|_\infty < \varepsilon$ for any $j$. Thus, the entries of $v$ are small if and only if its $\infty$ norm is small. Furthermore, by the equivalence of norms, any other norm satisfies $c \|v\| \le \|v\|_\infty \le C \|v\|$ where $C, c > 0$ are fixed. Thus, if $\|v\| < \varepsilon$ is small, then its entries, $|v_j| \le \|v\|_\infty \le C \varepsilon$, are also proportionately small, while if the entries are all bounded by $|v_j| < \varepsilon$, then $\|v\| \le \tfrac1c \|v\|_\infty \le \tfrac1c \varepsilon$ is also proportionately small.

3.3.34. If $|v_i| = \|v\|_\infty$ is the maximal entry, so $|v_j| \le |v_i|$ for all $j$, then

$\|v\|_\infty^2 = v_i^2 \le \|v\|_2^2 = v_1^2 + \cdots + v_n^2 \le n v_i^2 = n \|v\|_\infty^2$.

3.3.35.

(i) $\|v\|_1^2 = \Bigl( \sum_{i=1}^n |v_i| \Bigr)^2 = \sum_{i=1}^n |v_i|^2 + 2 \sum_{i<j} |v_i| \, |v_j| \ge \sum_{i=1}^n |v_i|^2 = \|v\|_2^2$.

On the other hand, since $2xy \le x^2 + y^2$, $\|v\|_1^2 = \sum_{i=1}^n |v_i|^2 + 2 \sum_{i<j} |v_i| \, |v_j| \le n \sum_{i=1}^n |v_i|^2 = n \|v\|_2^2$.

(ii) (a) $\|v\|_2 = \sqrt2$, $\|v\|_1 = 2$, and $\sqrt2 \le 2 \le \sqrt2 \cdot \sqrt2$;
(b) $\|v\|_2 = \sqrt{14}$, $\|v\|_1 = 6$, and $\sqrt{14} \le 6 \le \sqrt3 \cdot \sqrt{14}$;
(c) $\|v\|_2 = 2$, $\|v\|_1 = 4$, and $2 \le 4 \le 2 \cdot 2$;
(d) $\|v\|_2 = 2\sqrt2$, $\|v\|_1 = 6$, and $2\sqrt2 \le 6 \le \sqrt5 \cdot 2\sqrt2$.

(iii) (a) $v = c \, e_j$ for some $j = 1, \dots, n$; (b) $|v_1| = |v_2| = \cdots = |v_n|$.
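The equivalence bounds of Exercises 3.3.34 and 3.3.35, $\|v\|_\infty \le \|v\|_2 \le \sqrt n \, \|v\|_\infty$ and $\|v\|_2 \le \|v\|_1 \le \sqrt n \, \|v\|_2$, can be spot-checked on random vectors. This is a sketch only; the function names, the tolerance, and the random test vectors are assumptions made for the illustration.

```python
import math
import random


def norm1(v):
    return sum(abs(x) for x in v)


def norm2(v):
    return math.sqrt(sum(x * x for x in v))


def norm_inf(v):
    return max(abs(x) for x in v)


def check_bounds(v, tol=1e-9):
    """True if v satisfies both chains of inequalities, up to rounding."""
    n = len(v)
    a, b, c = norm_inf(v), norm2(v), norm1(v)
    return (a <= b + tol and b <= math.sqrt(n) * a + tol
            and b <= c + tol and c <= math.sqrt(n) * b + tol)


random.seed(0)
samples = [[random.uniform(-5, 5) for _ in range(random.randint(1, 6))]
           for _ in range(1000)]
print(all(check_bounds(v) for v in samples))

# Equality cases from part (iii): a multiple of a basis vector, and a
# vector whose entries all have the same absolute value.
e = [0.0, 0.0, 5.0, 0.0]
u = [2.0, -2.0, 2.0, -2.0]
print(norm1(e), norm2(e))                      # lower bound attained
print(norm1(u), math.sqrt(len(u)) * norm2(u))  # upper bound attained
```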

3.3.36. (i) $\|v\|_\infty \le \|v\|_1 \le n \|v\|_\infty$. (ii) (a) $\|v\|_\infty = 1$, $\|v\|_1 = 2$, and $1 \le 2 \le 2 \cdot 1$; (b) $\|v\|_\infty = 3$, $\|v\|_1 = 6$, and $3 \le 6 \le 3 \cdot 3$; (c) $\|v\|_\infty = 1$, $\|v\|_1 = 4$, and $1 \le 4 \le 4 \cdot 1$; (d) $\|v\|_\infty = 2$, $\|v\|_1 = 6$, and $2 \le 6 \le 5 \cdot 2$. (iii) $\|v\|_\infty = \|v\|_1$ if and only if $v = c \, e_j$ for some $j = 1, \dots, n$; $\|v\|_1 = n \|v\|_\infty$ if and only if $|v_1| = |v_2| = \cdots = |v_n|$.

3.3.37. In each case, we minimize and maximize $\| (\cos\theta, \sin\theta)^T \|$ for $0 \le \theta \le 2\pi$: (a) $c^\star = \sqrt2$, $C^\star = \sqrt3$; (b) $c^\star = 1$, $C^\star = \sqrt2$.

3.3.38. First, $|v_i| \le \|v\|_\infty$. Furthermore, Theorem 3.17 implies $\|v\|_\infty \le C \|v\|$, which proves the result.

♦ 3.3.39. Equality implies that $\|u\|_2 = c^\star$ for all vectors $u$ with $\|u\|_1 = 1$. But then if $v \ne 0$ is any other vector, setting $u = v / \|v\|_1$, we find $\|v\|_2 = \|v\|_1 \|u\|_2 = c^\star \|v\|_1$, and hence the norms are merely constant multiples of each other.

3.3.40. If $C = \|f\|_\infty$, then $|f(x)| \le C$ for all $a \le x \le b$. Therefore,

$\|f\|_2^2 = \int_a^b f(x)^2 \, dx \le \int_a^b C^2 \, dx = (b - a) C^2 = (b - a) \|f\|_\infty^2$.

♥ 3.3.41. (a) The maximum (absolute) value of $f_n(x)$ is $1 = \|f_n\|_\infty$. On the other hand, $\|f_n\|_2 = \sqrt{ \int_{-\infty}^{\infty} |f_n(x)|^2 \, dx } = \sqrt{ \int_{-n}^{n} dx } = \sqrt{2n} \to \infty$.

(b) Suppose there exists a constant $C$ such that $\|f\|_2 \le C \|f\|_\infty$ for all functions. Then, in particular, $\sqrt{2n} = \|f_n\|_2 \le C \|f_n\|_\infty = C$ for all $n$, which is impossible.

(c) First, $\|f_n\|_2 = \sqrt{ \int_{-\infty}^{\infty} |f_n(x)|^2 \, dx } = \sqrt{ \int_{-1/n}^{1/n} \tfrac{n}{2} \, dx } = 1$. On the other hand, the maximum (absolute) value of $f_n(x)$ is $\|f_n\|_\infty = \sqrt{n/2} \to \infty$. Arguing as in part (b), we conclude that there is no constant $C$ such that $\|f\|_\infty \le C \|f\|_2$.

(d) (i) $f_n(x) = \tfrac{n}{2}$ for $-\tfrac1n \le x \le \tfrac1n$, and $0$ otherwise, has $\|f_n\|_1 = 1$, $\|f_n\|_\infty = \tfrac{n}{2} \to \infty$;
(ii) $f_n(x) = \tfrac{1}{\sqrt{2n}}$ for $-n \le x \le n$, and $0$ otherwise, has $\|f_n\|_2 = 1$, $\|f_n\|_1 = \sqrt{2n} \to \infty$;
(iii) $f_n(x) = \tfrac{n}{2}$ for $-\tfrac1n \le x \le \tfrac1n$, and $0$ otherwise, has $\|f_n\|_1 = 1$, $\|f_n\|_2 = \sqrt{n/2} \to \infty$.

♥ 3.3.42. (a) We can't use the functions in Exercise 3.3.41 directly since they are not continuous. Instead, consider the continuous functions $f_n(x) = \sqrt{n} \, (1 - n |x|)$ for $-\tfrac1n \le x \le \tfrac1n$, and $f_n(x) = 0$ otherwise. Then $\|f_n\|_\infty = \sqrt{n}$, while $\|f_n\|_2 = \sqrt{ \int_{-1/n}^{1/n} n (1 - n |x|)^2 \, dx } = \sqrt{2/3}$. Thus, there is no constant $C$ such that $\|f\|_\infty \le C \|f\|_2$, as otherwise $\sqrt{n} = \|f_n\|_\infty \le C \|f_n\|_2 = \sqrt{2/3} \; C$ for all $n$, which is impossible.

(b) Yes: since, by the definition of the $L^\infty$ norm, $|f(x)| \le \|f\|_\infty$ for all $-1 \le x \le 1$, $\|f\|_2 = \sqrt{ \int_{-1}^{1} |f(x)|^2 \, dx } \le \sqrt{ \int_{-1}^{1} \|f\|_\infty^2 \, dx } = \sqrt2 \, \|f\|_\infty$.

(c) They are not equivalent. The functions $f_n(x) = n (1 - n |x|)$ for $-\tfrac1n \le x \le \tfrac1n$, and $f_n(x) = 0$ otherwise, are continuous and satisfy $\|f_n\|_\infty = n$, while $\|f_n\|_1 = 1$. Thus, there is no constant $C$ such that $\|f\|_\infty \le C \|f\|_1$ for all $f \in C^0[-1, 1]$.

3.3.43. First, since $\langle v, w \rangle$ is easily shown to be bilinear and symmetric, the only issue is positivity: Is $0 < \langle v, v \rangle = \alpha \|v\|_1^2 + \beta \|v\|_2^2$ for all $0 \ne v \in V$? Let† $\mu = \min \|v\|_2 / \|v\|_1$ over all $0 \ne v \in V$. Then $\langle v, v \rangle = \alpha \|v\|_1^2 + \beta \|v\|_2^2 \ge (\alpha + \beta \mu^2) \|v\|_1^2 > 0$ provided $\alpha + \beta \mu^2 > 0$. Conversely, if $\alpha + \beta \mu^2 \le 0$ and $0 \ne v$ achieves the minimum value, so $\|v\|_2 = \mu \|v\|_1$, then $\langle v, v \rangle \le 0$. (If there is no $v$ that actually achieves the minimum value, then one can also allow $\alpha + \beta \mu^2 = 0$.)

† In infinite-dimensional situations, one should replace the minimum by the infimum, since the minimum value may not be achieved.

3.4.1. (a) Positive definite: $\langle v, w \rangle = v_1 w_1 + 2 v_2 w_2$; (b) not positive definite; (c) not positive definite; (d) not positive definite; (e) positive definite: $\langle v, w \rangle = v_1 w_1 - v_1 w_2 - v_2 w_1 + 3 v_2 w_2$; (f) not positive definite.

3.4.2. For instance, $q(1, 0) = 1$, while $q(2, -1) = -1$.

♦ 3.4.3. (a) The associated quadratic form $q(x) = x^T D x = c_1 x_1^2 + c_2 x_2^2 + \cdots + c_n x_n^2$ is a sum of squares. If all $c_i > 0$, then $q(x) > 0$ for $x \ne 0$, since $q(x)$ is a sum of non-negative terms, at least one of which is strictly positive. If all $c_i \ge 0$, then, by the same reasoning, $D$ is positive semi-definite. If all $c_i < 0$, then $D$ is negative definite. If $D$ has both positive and negative diagonal entries, then it is indefinite. (b) $\langle v, w \rangle = v^T D w = c_1 v_1 w_1 + c_2 v_2 w_2 + \cdots + c_n v_n w_n$, which is the weighted inner product (3.10).

♦ 3.4.4. (a) $k_{ii} = e_i^T K e_i > 0$. (b) For example, $K = \begin{pmatrix} 1 & 2 \\ 2 & 1 \end{pmatrix}$ is not positive definite or even semi-definite. (c) For example, $K = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}$.

3.4.5.

$| 4 x_1 y_1 - 2 x_1 y_2 - 2 x_2 y_1 + 3 x_2 y_2 | \le \sqrt{ 4 x_1^2 - 4 x_1 x_2 + 3 x_2^2 } \; \sqrt{ 4 y_1^2 - 4 y_1 y_2 + 3 y_2^2 }$,

$\sqrt{ 4 (x_1 + y_1)^2 - 4 (x_1 + y_1)(x_2 + y_2) + 3 (x_2 + y_2)^2 } \le \sqrt{ 4 x_1^2 - 4 x_1 x_2 + 3 x_2^2 } + \sqrt{ 4 y_1^2 - 4 y_1 y_2 + 3 y_2^2 }$.

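3.4.6. First, $(c K)^T = c K^T = c K$ is symmetric. Second, $x^T (c K) x = c \, x^T K x > 0$ for any $x \ne 0$, since $c > 0$ and $K > 0$.

♦ 3.4.7. (a) $x^T (K + L) x = x^T K x + x^T L x > 0$ for all $x \ne 0$, since both summands are strictly positive. (b) For example, $K = \begin{pmatrix} 2 & 0 \\ 0 & -1 \end{pmatrix}$, $L = \begin{pmatrix} -1 & 0 \\ 0 & 2 \end{pmatrix}$, with $K + L = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} > 0$.

3.4.8. $\begin{pmatrix} 3 & 1 \\ 1 & 1 \end{pmatrix} \begin{pmatrix} 1 & 1 \\ 1 & 4 \end{pmatrix} = \begin{pmatrix} 4 & 7 \\ 2 & 5 \end{pmatrix}$ is not even symmetric. Even the associated quadratic form $( x \;\; y ) \begin{pmatrix} 4 & 7 \\ 2 & 5 \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} = 4 x^2 + 9 x y + 5 y^2$ is not positive definite.

3.4.9. Example: $\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$.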
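The computation in Exercise 3.4.8 can be reproduced in a few lines. The sketch below uses hand-written helpers (`matmul`, `quad`, `is_pd_2x2`, all hypothetical names) and picks one particular point, $(9, -8)$, at which the quadratic form $4x^2 + 9xy + 5y^2$ is negative; that point is a choice made here, not taken from the solution.

```python
def matmul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def quad(M, x):
    """Quadratic form x^T M x."""
    n = len(x)
    return sum(x[i] * M[i][j] * x[j] for i in range(n) for j in range(n))


def is_pd_2x2(M):
    """2 x 2 test: symmetric, positive corner entry, positive determinant."""
    return (M[0][1] == M[1][0] and M[0][0] > 0
            and M[0][0] * M[1][1] - M[0][1] * M[1][0] > 0)


K = [[3, 1], [1, 1]]
L = [[1, 1], [1, 4]]
P = matmul(K, L)

print(is_pd_2x2(K), is_pd_2x2(L))  # both factors are positive definite
print(P)                           # the product is not symmetric
print(quad(P, [9, -8]))            # 4*81 - 9*72 + 5*64 is negative
```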

♦ 3.4.10. (a) Since $K^{-1}$ is also symmetric, $x^T K^{-1} x = x^T K^{-1} K K^{-1} x = (K^{-1} x)^T K (K^{-1} x) = y^T K y$. (b) If $K > 0$, then $y^T K y > 0$ for all $y = K^{-1} x \ne 0$, and hence $x^T K^{-1} x > 0$ for all $x \ne 0$.

♦ 3.4.11. It suffices to note that $K > 0$ if and only if $\cos\theta = \dfrac{v \cdot K v}{\|v\| \, \|K v\|} = \dfrac{v^T K v}{\|v\| \, \|K v\|} > 0$ for all $v \ne 0$, which holds if and only if $|\theta| < \tfrac12 \pi$.

♦ 3.4.12. If $q(x) = x^T K x$ with $K^T = K$, then $q(x + y) - q(x) - q(y) = (x + y)^T K (x + y) - x^T K x - y^T K y = 2 x^T K y = 2 \langle x, y \rangle$.

3.4.13. (a) No, by continuity. Or, equivalently, $q(c \, x_+) = c^2 q(x_+) > 0$ for any $c \ne 0$, so $q$ is positive at any nonzero scalar multiple of $x_+$. (b) In view of the preceding calculation, this holds if and only if $q(x)$ is either positive or negative definite and $x_0 = 0$.

3.4.14. (a) The quadratic form for $K = -N$ is $x^T K x = -x^T N x > 0$ for all $x \ne 0$. (b) $a < 0$ and $\det N = a c - b^2 > 0$. (c) The matrix $\begin{pmatrix} -1 & 1 \\ 1 & -2 \end{pmatrix}$ is negative definite. The others are not.

3.4.15. $x^T K x = ( 1 \;\; 1 ) \begin{pmatrix} 1 & -2 \\ -2 & 3 \end{pmatrix} \begin{pmatrix} 1 \\ 1 \end{pmatrix} = 0$, but $K x = \begin{pmatrix} 1 & -2 \\ -2 & 3 \end{pmatrix} \begin{pmatrix} 1 \\ 1 \end{pmatrix} = \begin{pmatrix} -1 \\ 1 \end{pmatrix} \ne 0$.

3.4.16. If $q(x) > 0$ and $q(y) < 0$, then the scalar function $f(t) = q(t x + (1 - t) y)$ satisfies $f(0) = q(y) < 0$ and $f(1) = q(x) > 0$, so, by continuity, there is a point $0 < t^\star < 1$ such that $f(t^\star) = 0$ and hence, setting $z = t^\star x + (1 - t^\star) y$ gives $q(z) = 0$. Moreover, $z \ne 0$, as otherwise $x = c y$, $c = 1 - 1/t^\star$, would be parallel vectors, but then $q(x) = c^2 q(y)$ would have the same sign.

3.4.17. (a) False. For example, the nonsingular matrix $K = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}$ has null directions, e.g., $\begin{pmatrix} 1 \\ 1 \end{pmatrix}$ and $\begin{pmatrix} -1 \\ 1 \end{pmatrix}$. (b) True; see Exercise 3.4.16.

♦ 3.4.18.

(a) $x^2 - y^2 = (x - y)(x + y) = 0$: [plot not reproduced]

(b) $x^2 + 4 x y + 3 y^2 = (x + y)(x + 3 y) = 0$: [plot not reproduced]

(c) $x^2 - y^2 - z^2 = 0$: [plot not reproduced]

♦ 3.4.19. (a) First, $k_{ii} = e_i^T K e_i = e_i^T L e_i = l_{ii}$, so their diagonal entries are equal. Further, $k_{ii} + 2 k_{ij} + k_{jj} = (e_i + e_j)^T K (e_i + e_j) = (e_i + e_j)^T L (e_i + e_j) = l_{ii} + 2 l_{ij} + l_{jj}$, and hence $k_{ij} = k_{ji} = l_{ij} = l_{ji}$, and so $K = L$. (b) Example: If $K = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$ and $L = \begin{pmatrix} 0 & 2 \\ 0 & 0 \end{pmatrix}$ then $x^T K x = x^T L x = 2 x_1 x_2$.

♦ 3.4.20. Since $q(x)$ is a scalar, $q(x) = x^T A x = (x^T A x)^T = x^T A^T x$, and hence $q(x) = \tfrac12 (x^T A x + x^T A^T x) = x^T K x$.

♦ 3.4.21. (a) $\ell(c x) = a \cdot c x = c \, (a \cdot x) = c \, \ell(x)$; (b) $q(c x) = (c x)^T K (c x) = c^2 x^T K x = c^2 q(x)$; (c) Example: $q(x) = \|x\|^2$ where $\|x\|$ is any norm that does not come from an inner product.
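The symmetrization argument of Exercise 3.4.20 says that a non-symmetric $A$ and the symmetric matrix $K = \tfrac12 (A + A^T)$ define the same quadratic form. A quick sketch, with a made-up matrix `A` and hypothetical helper names:

```python
def quad(M, x):
    """Quadratic form x^T M x for a matrix stored as a list of rows."""
    n = len(x)
    return sum(x[i] * M[i][j] * x[j] for i in range(n) for j in range(n))


def symmetrize(A):
    """Return K = (A + A^T) / 2."""
    n = len(A)
    return [[(A[i][j] + A[j][i]) / 2 for j in range(n)] for i in range(n)]


# Hypothetical non-symmetric coefficient matrix.
A = [[1.0, 4.0],
     [0.0, 3.0]]
K = symmetrize(A)
print(K)

# The two quadratic forms agree at every test point.
for x in ([1.0, 0.0], [2.0, -1.0], [-3.0, 5.0]):
    print(x, quad(A, x), quad(K, x))
```

Only the symmetric part of $A$ contributes to $q$, which is why positive definiteness is always tested on $K$.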

3.4.22. (i)

10 6

!

6 ; positive definite. (ii) 4 0

1

0 B @

5 4 −3

4 13 −1

1

−3 −1 C A; positive semi-definite; null 2

! 5 6 −8 B C ; positive definite. vectors: all scalar multiples of @ −1 A. (iii) −8 13 7 0 1 0 1 2 1 1 9 6 3 C B C (iv ) B @ 1 2 1 A; positive definite. (v ) @ 6 6 0 A; positive semi-definite; null vectors: 3 0 3 1 1 2 0 0 1 1 ! −1 30 0 −6 2 −1 C all scalar multiples of B ; positive definite. (vii) B 3C @ 1 A. (vi) @ 0 30 A; −1 3 1 −6 3 15 1 0 2 −2 −1 0 B −2 5 2 2C C C; positive definite. B positive definite. (viii) B @ −1 2 2 3A 0 2 3 13 !

0

0

1

1

3 1 2 21 12 9 B B C 9 3C 3.4.23. (iii) @ 1 4 3 A, (v ) @ 12 A. Positive definiteness doesn’t 2 3 5 9 3 6 change, since it only depends upon the linear independence of the vectors. 9 −12

3.4.24. (vi)

4 3

−1

−12 , (iv ) 21

−1 7 4

!

,

(vii)

0

0 B B @

10 −2 −1

1 e−1 1 2 3.4.25. K = e−1 2 (e − 1) 1 2 1 3 (e − 1) 2 3 (e − 1) early independent functions. B B B @

0

1 − 1/e 1 3.4.26. K = 1 e−1 1 2 e−1 2 (e − 1) (still) linearly independent. B B @

3.4.27. K =

3.4.28. K =

0

2 0

B B B B2 B @3 0 B B B B B @

0 2 3

0

0

2 5

2

2 3 2 3 2 5 2 5

2 3 2 3 2 5

2 3

0 2 5

0 2 3 2 5 2 5 2 7

0

1

C 2C 5C C 0C A 2 7 21 5C 2C 5C C 2C 7A 2 9

−2

145 12 10 3 1 2 1 3 1 4

−1

1

10 C C, 3 A 41 6 1

(viii)

0

5 4

B B −2 B B @ −1

0

−2 9 2

−1 2

2 1

4 3

1

1

0 C 1C C. C 1A 5

(e2 − 1) C C is positive definite since 1, ex , e2 x are lin(e3 − 1) C A (e4 − 1) 1

e−1 C x 2x 1 2 are (e − 1) C A is also positive definite since 1, e , e 2 1 3 3 (e − 1)

is positive definite since 1, x, x2 , x3 are linearly independent.

is positive definite since 1, x, x2 , x3 are (still) linearly independent.

♦ 3.4.29. Let $\langle x, y \rangle = x^T K y$ be the corresponding inner product. Then $k_{ij} = \langle e_i, e_j \rangle$, and hence $K$ is the Gram matrix associated with the standard basis vectors $e_1, \dots, e_n$.

♦ 3.4.30. (a) is a special case of (b) since positive definite matrices are symmetric. (b) By Theorem 3.28 if $S$ is any symmetric matrix, then $S^T S = S^2$ is always positive semi-definite, and positive definite if and only if $\ker S = \{0\}$, i.e., $S$ is nonsingular. In particular, if $S = K > 0$, then $\ker K = \{0\}$ and so $K^2 > 0$.

♦ 3.4.31. (a) $\operatorname{coker} K = \ker K$ since $K$ is symmetric, and so part (a) follows from Proposition 3.36. (b) By Exercise 2.5.39, $\operatorname{rng} K \subset \operatorname{rng} A^T = \operatorname{corng} A$. Moreover, by part (a) and Theorem 2.49, both have the same dimension, and hence they must be equal.

3.4.32. $0 = z^T K z = z^T A^T C A z = y^T C y$, where $y = A z$. Since $C > 0$, this implies $y = 0$, and hence $z \in \ker A = \ker K$.

3.4.33. (a) $L = (A^T)^T A^T$ is the Gram matrix corresponding to the columns of $A^T$, i.e., the rows of $A$. (b) From Exercise 3.4.31 and Theorem 2.49, $\operatorname{rank} K = \operatorname{rank} A = \operatorname{rank} A^T = \operatorname{rank} L$. (c) This is true if and only if both $\ker A$ and $\operatorname{coker} A$ are $\{0\}$, which, by Theorem 2.49, requires that $A$ be square and nonsingular.

3.4.34. A Gram matrix is positive definite if and only if the vector space elements used to construct it are linearly independent. Linear independence doesn't depend upon the inner product being used, and so if the Gram matrix for one inner product is positive definite, so is the Gram matrix for any other inner product on the vector space.

♦ 3.4.35. (a) As in Exercise 3.4.7, the sum of positive definite matrices is positive definite. (b) Example: $A_1 = ( 1 \;\; 0 )$, $A_2 = ( 0 \;\; 1 )$, $C_1 = C_2 = I$, $K = I$. (c) In block form, set $A = \begin{pmatrix} A_1 \\ A_2 \end{pmatrix}$ and $C = \begin{pmatrix} C_1 & O \\ O & C_2 \end{pmatrix}$. Then $A^T C A = A_1^T C_1 A_1 + A_2^T C_2 A_2 = K$.

3.5.1. Only (a), (e) are positive definite. 3.5.2.

1 2

(a)

5 −1 0 3 (c) B @ −1 3

(b)

(d)

0

(e)

0

B @

−2 1 −1

B @

2 1 −2

2 3

!

1 2

0 1

!

!

!

1 0 1 2 = ; not positive definite. 0 −1 0 1 ! ! ! ! 1 5 0 1 0 −1 1 − 5 ; positive definite. = 0 14 − 15 1 3 0 1 5 1 0 1 10 10 1 1 0 0 3 0 0 1 1 − −1 3 3 C B 14 B 1 B 1 0C 0C 5 1C A = @−3 AB A@ 0 1 37 C 3 A; positive definite. @0 1 5 1 73 1 0 0 78 0 00 1 1 0 1 1 10 1 1 1 0 0 −2 0 0 1 − 1 −1 2 2C B 3 B 1 B 1 0C 0C −2 1C A = @−2 AB A@ 0 − 2 1 − 13 C A; @0 1 1 4 1 −2 − 1 0 0 − 0 0 1 2 3 3 not positive definite. 1 1 0 10 10 1 −2 1 0 0 2 0 0 1 12 −1 C B 1 CB B 1 1 −3 C 1 0C A=@ 2 A@ 0 2 0 A@ 0 1 −4 A; positive definite. −3 11 −1 −4 1 0 0 1 0 0 1


0

0

1

1

1 1 0 1 0 0 B 2 0 1C 1 0 C B1 C=B (f ) 0 1 1 A @ 1 −1 1 0 1 1 2 0 1 −2 not positive definite. 0 0 1 1 0 0 3 2 1 0 B2 C B 1 0 2 3 0 1C B 3 B C=B (g) B 2 1 B @1 0 3 2A @ 3 −5 1 3 0 1 2 4 0 5 1 positive definite. 0 1 0 1 2 1 −2 0 B 1 C B 1 3 −3 2C B 2 C=B B (h) B @ −2 −3 4 −1 A B −1 @ 0 2 −1 7 0 positive definite. B B1 B @1

10

0 1 B 0C CB 0 CB 0 A@ 0 1 0 1

0 03 0C CB 0 CB B 0C A@ 0 0 1 0 1 4 5

0 0

10

0 1 B 0C CB 0 CB 0 A@ 0 5 0

1 1 0 0

10

2 3

0 1 B 0C CB CB 0 0 AB @0 1 0

12 5

0

10

0 2 B0 0C CB CB B 0C A@ 0 0 1

0

0 0

5 2

2 5

0 0

0

1 0 0

1

1 −1 1 0

0 1C C C; −2 A 5

1 3 2 −5

0

10

0 1 B 0C CB 0 CB B 0C A@ 0 9 0 2

C 3C 5C C; 1A

1 0

1

1 2

1 0 0

1

3.5.4. K =

0 B B B @

1 −

1 2 1 2

1 c−1 0

1

−1 − 45 1 0

0 4 5 3 2

1

1

C C C; C A

0 C; 1 C c−2 A c−1 − 2 are all positive if and only if c > 2. the pivots 1, c − 1, cc − 1 10 0 0 1 10 1 1 1 0 1 0 0 1 0 0 1 1 0 B C B B 1 0C (b) B 0C 1C @1 3 1A = @1 A@ 0 2 A@ 0 2 A; 1 1 1 0 1 1 0 2 1 0 0 2 0 0 2 (c) q(x, y, z) = (x + y)2 + 2 (y + 21 z)2 + 12 z 2 . (d) The coefficients (pivots) 1, 2, 12 are positive, so the quadratic form is positive definite.

3.5.3. (a) Gaussian Elimination leads to U =

1 B B0 @ 0

0

0 0

3 2

0

0 0 −1 0

5 3

0 0 1

− 45

0 1 0 0

1

− 12 C C ; yes, it is positive definite. 0C A 3

1 2

2 0

3.5.5. (a) (x + 4 y)2 − 15 y 2 ; not positive definite. (b) (x − 2 y)2 + 3 y 2 ; positive definite.

(c) (x − y)2 − 2 y 2 ; not positive definite.

(d) (x + 3 y)2 − 9 y 2 ; not positive definite. 3.5.6. (a) (x + 2 z)2 + 3 y 2 + z 2 , “

(c) 2 x1 + 3.5.7.

0

1 (a) ( x y z )T B @0 2 0

0 2 4

3 (b) ( x y z )T B @ −4 0

1 2

1 (c) ( x y z )T B @ 1 −2

1 4

(b)

x2 −

10

1 2

1

“

x3

x+ ”2

3 2

+

y−z

15 8

“

”2

(y + 2 z)2 + 4 z 2 ,

+

3 4

2 5

x3

x2 −

”2

2 x B C 4C A@ y A; not positive definite; 12 1z0 1 x −4 21 CB C −2 0 A@ y A; not positive definite; 0 1 10z 1 x 1 −2 B C 2 −3 C A@ y A; positive definite; z −3 6 94

+

6 5

x23 .

(d) ( x1 x2

0

B B x3 )T B @

(e) ( x1 x2 x3

3 2 − 27 0

2 −1

1 B 2 B x4 )T B @ −1 0

9 2

10 1 CB x1 C 9C C; CB 2 A@ x2 A

− 27

5

2 5 0 −1

−1 0 6 − 21

x3

not positive definite;

10

1

0 x1 C B −1 C CB x C 1 CB 2 C; positive definite. − 2 A@ x3 A x4 4

3.5.8. When a2 < 4 and a2 + b2 + c2 − a b c < 4.

3.5.9. True. Indeed, if $a \ne 0$, then $q(x, y) = a \Bigl( x + \dfrac{b}{a} \, y \Bigr)^2 + \dfrac{c a - b^2}{a} \, y^2$; if $c \ne 0$, then $q(x, y) = c \Bigl( y + \dfrac{b}{c} \, x \Bigr)^2 + \dfrac{c a - b^2}{c} \, x^2$; while if $a = c = 0$, then $q(x, y) = 2 b x y = \tfrac12 b (x + y)^2 - \tfrac12 b (x - y)^2$.

3.5.10. (a) According to Theorem 1.52, $\det K$ is equal to the product of its pivots, which are all positive by Theorem 3.37. (b) $\operatorname{tr} K = \sum_{i=1}^n k_{ii} > 0$ since, according to Exercise 3.4.4, every diagonal entry of $K$ is positive. (c) For $K = \begin{pmatrix} a & b \\ b & c \end{pmatrix}$, if $\operatorname{tr} K = a + c > 0$, and $a \le 0$, then $c > 0$, but then $\det K = a c - b^2 \le 0$, which contradicts the assumptions. Thus, both $a > 0$ and $a c - b^2 > 0$, which, by (3.62), implies $K > 0$. (d) Example: $K = \begin{pmatrix} 3 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & -1 \end{pmatrix}$.

3.5.11. Writing $x = \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} \in R^{2n}$ where $x_1, x_2 \in R^n$, we have $x^T K x = x_1^T K_1 x_1 + x_2^T K_2 x_2 > 0$ for all $x \ne 0$ by positive definiteness of $K_1, K_2$. The converse is also true, because $x_1^T K_1 x_1 = x^T K x > 0$ when $x = \begin{pmatrix} x_1 \\ 0 \end{pmatrix}$ with $x_1 \ne 0$.

3.5.12. (a) If $x \ne 0$ then $u = x / \|x\|$ is a unit vector, and so $q(x) = x^T K x = \|x\|^2 \, u^T K u > 0$. (b) Using the Euclidean norm, let $m = \min\{ u^T S u \mid \|u\| = 1 \} > -\infty$, which is finite since $q(u)$ is continuous and the unit sphere in $R^n$ is closed and bounded. Then $u^T K u = u^T S u + c \|u\|^2 \ge m + c > 0$ for $\|u\| = 1$ provided $c > -m$.

3.5.13. Write $S = (S + c I) + (-c I) = K + N$, where $N = -c I$ is negative definite for any $c > 0$, while $K = S + c I$ is positive definite provided $c \gg 0$ is sufficiently large.

♦ 3.5.14. (a) The $i$th column of $D L^T$ is $d_i l_i$. Hence writing $K = L (D L^T)$ and using formula (1.14) results in (3.69).

(b) $\begin{pmatrix} 4 & -1 \\ -1 & 1 \end{pmatrix} = 4 \begin{pmatrix} 1 \\ -\tfrac14 \end{pmatrix} \bigl( 1 \;\; -\tfrac14 \bigr) + \tfrac34 \begin{pmatrix} 0 \\ 1 \end{pmatrix} ( 0 \;\; 1 ) = \begin{pmatrix} 4 & -1 \\ -1 & \tfrac14 \end{pmatrix} + \begin{pmatrix} 0 & 0 \\ 0 & \tfrac34 \end{pmatrix}$,

$\begin{pmatrix} 1 & 2 & 1 \\ 2 & 6 & 1 \\ 1 & 1 & 4 \end{pmatrix} = \begin{pmatrix} 1 \\ 2 \\ 1 \end{pmatrix} ( 1 \;\; 2 \;\; 1 ) + 2 \begin{pmatrix} 0 \\ 1 \\ -\tfrac12 \end{pmatrix} \bigl( 0 \;\; 1 \;\; -\tfrac12 \bigr) + \tfrac52 \begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix} ( 0 \;\; 0 \;\; 1 ) = \begin{pmatrix} 1 & 2 & 1 \\ 2 & 4 & 2 \\ 1 & 2 & 1 \end{pmatrix} + \begin{pmatrix} 0 & 0 & 0 \\ 0 & 2 & -1 \\ 0 & -1 & \tfrac12 \end{pmatrix} + \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & \tfrac52 \end{pmatrix}$.

♥ 3.5.15. According to Exercise 1.9.19, the pivots of a regular matrix are the ratios of these successive subdeterminants. For the $3 \times 3$ case, the pivots are $a$, $\dfrac{a d - b^2}{a}$, and $\dfrac{\det K}{a d - b^2}$. Therefore, in general the pivots are positive if and only if the subdeterminants are.

♦ 3.5.16. If a negative diagonal entry appears, it is either a pivot, or a diagonal entry of the remaining lower right symmetric $(m - i) \times (m - i)$ submatrix, which, by Exercise 3.4.4, must all be positive in order that the matrix be positive definite.

♦ 3.5.17. Use the fact that $K = -N$ is positive definite. A $2 \times 2$ symmetric matrix $N = \begin{pmatrix} a & b \\ b & c \end{pmatrix}$ is negative definite if and only if $a < 0$ and $\det N = a c - b^2 > 0$. Similarly, a $3 \times 3$ matrix $N = \begin{pmatrix} a & b & c \\ b & d & e \\ c & e & f \end{pmatrix} < 0$ if and only if $a < 0$, $a d - b^2 > 0$, $\det N < 0$. In general, $N$ is negative definite if and only if its upper left entry is negative and the sequence of square upper left $i \times i$ subdeterminants, $i = 1, \dots, n$, have alternating signs: $-, +, -, +, \dots$.

3.5.18. False: if $N$ has size $n \times n$ then $\operatorname{tr} N < 0$ but $\det N > 0$ if $n$ is even and $< 0$ if $n$ is odd.

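3.5.19.

(a) $\begin{pmatrix} 3 & -2 \\ -2 & 2 \end{pmatrix} = \begin{pmatrix} \sqrt3 & 0 \\ -\tfrac{2}{\sqrt3} & \tfrac{\sqrt2}{\sqrt3} \end{pmatrix} \begin{pmatrix} \sqrt3 & -\tfrac{2}{\sqrt3} \\ 0 & \tfrac{\sqrt2}{\sqrt3} \end{pmatrix}$,

(b) $\begin{pmatrix} 4 & -12 \\ -12 & 45 \end{pmatrix} = \begin{pmatrix} 2 & 0 \\ -6 & 3 \end{pmatrix} \begin{pmatrix} 2 & -6 \\ 0 & 3 \end{pmatrix}$,

(c) $\begin{pmatrix} 1 & 1 & 1 \\ 1 & 2 & -2 \\ 1 & -2 & 14 \end{pmatrix} = \begin{pmatrix} 1 & 0 & 0 \\ 1 & 1 & 0 \\ 1 & -3 & 2 \end{pmatrix} \begin{pmatrix} 1 & 1 & 1 \\ 0 & 1 & -3 \\ 0 & 0 & 2 \end{pmatrix}$,

(d) $\begin{pmatrix} 2 & 1 & 1 \\ 1 & 2 & 1 \\ 1 & 1 & 2 \end{pmatrix} = \begin{pmatrix} \sqrt2 & 0 & 0 \\ \tfrac{1}{\sqrt2} & \tfrac{\sqrt3}{\sqrt2} & 0 \\ \tfrac{1}{\sqrt2} & \tfrac{1}{\sqrt6} & \tfrac{2}{\sqrt3} \end{pmatrix} \begin{pmatrix} \sqrt2 & \tfrac{1}{\sqrt2} & \tfrac{1}{\sqrt2} \\ 0 & \tfrac{\sqrt3}{\sqrt2} & \tfrac{1}{\sqrt6} \\ 0 & 0 & \tfrac{2}{\sqrt3} \end{pmatrix}$,

(e) $\begin{pmatrix} 2 & 1 & 0 & 0 \\ 1 & 2 & 1 & 0 \\ 0 & 1 & 2 & 1 \\ 0 & 0 & 1 & 2 \end{pmatrix} = \begin{pmatrix} \sqrt2 & 0 & 0 & 0 \\ \tfrac{1}{\sqrt2} & \tfrac{\sqrt3}{\sqrt2} & 0 & 0 \\ 0 & \tfrac{\sqrt2}{\sqrt3} & \tfrac{2}{\sqrt3} & 0 \\ 0 & 0 & \tfrac{\sqrt3}{2} & \tfrac{\sqrt5}{2} \end{pmatrix} \begin{pmatrix} \sqrt2 & \tfrac{1}{\sqrt2} & 0 & 0 \\ 0 & \tfrac{\sqrt3}{\sqrt2} & \tfrac{\sqrt2}{\sqrt3} & 0 \\ 0 & 0 & \tfrac{2}{\sqrt3} & \tfrac{\sqrt3}{2} \\ 0 & 0 & 0 & \tfrac{\sqrt5}{2} \end{pmatrix}$.

3.5.20.

(a) $\begin{pmatrix} 4 & -2 \\ -2 & 4 \end{pmatrix} = \begin{pmatrix} 2 & 0 \\ -1 & \sqrt3 \end{pmatrix} \begin{pmatrix} 2 & -1 \\ 0 & \sqrt3 \end{pmatrix}$,

(e) $\begin{pmatrix} 2 & 1 & 1 & 1 \\ 1 & 2 & 1 & 1 \\ 1 & 1 & 2 & 1 \\ 1 & 1 & 1 & 2 \end{pmatrix} = \begin{pmatrix} \sqrt2 & 0 & 0 & 0 \\ \tfrac{1}{\sqrt2} & \tfrac{\sqrt3}{\sqrt2} & 0 & 0 \\ \tfrac{1}{\sqrt2} & \tfrac{1}{\sqrt6} & \tfrac{2}{\sqrt3} & 0 \\ \tfrac{1}{\sqrt2} & \tfrac{1}{\sqrt6} & \tfrac{1}{2\sqrt3} & \tfrac{\sqrt5}{2} \end{pmatrix} \begin{pmatrix} \sqrt2 & \tfrac{1}{\sqrt2} & \tfrac{1}{\sqrt2} & \tfrac{1}{\sqrt2} \\ 0 & \tfrac{\sqrt3}{\sqrt2} & \tfrac{1}{\sqrt6} & \tfrac{1}{\sqrt6} \\ 0 & 0 & \tfrac{2}{\sqrt3} & \tfrac{1}{2\sqrt3} \\ 0 & 0 & 0 & \tfrac{\sqrt5}{2} \end{pmatrix}$.

3.5.21.
(a) $z_1^2 + z_2^2$, where $z_1 = 4 x_1$, $z_2 = 5 x_2$;
(b) $z_1^2 + z_2^2$, where $z_1 = x_1 - x_2$, $z_2 = \sqrt3 \, x_2$;
(c) $z_1^2 + z_2^2$, where $z_1 = \sqrt5 \, x_1 - \tfrac{2}{\sqrt5} x_2$, $z_2 = \sqrt{\tfrac{11}{5}} \, x_2$;
(d) $z_1^2 + z_2^2 + z_3^2$, where $z_1 = \sqrt3 \, x_1 - \tfrac{1}{\sqrt3} x_2 - \tfrac{1}{\sqrt3} x_3$, $z_2 = \sqrt{\tfrac53} \, x_2 - \tfrac{1}{\sqrt{15}} x_3$, $z_3 = \sqrt{\tfrac{28}{5}} \, x_3$;
(e) $z_1^2 + z_2^2 + z_3^2$, where $z_1 = x_1 + \tfrac12 x_2$, $z_2 = \tfrac{\sqrt3}{2} x_2 + \tfrac{1}{\sqrt3} x_3$, $z_3 = \sqrt{\tfrac23} \, x_3$;
(f) $z_1^2 + z_2^2 + z_3^2$, where $z_1 = 2 x_1 - \tfrac12 x_2 - x_3$, $z_2 = \tfrac12 x_2 - 2 x_3$, $z_3 = x_3$;
(g) $z_1^2 + z_2^2 + z_3^2 + z_4^2$, where $z_1 = \sqrt3 \, x_1 + \tfrac{1}{\sqrt3} x_2$, $z_2 = \sqrt{\tfrac83} \, x_2 + \sqrt{\tfrac38} \, x_3$, $z_3 = \sqrt{\tfrac{21}{8}} \, x_3 + \sqrt{\tfrac{8}{21}} \, x_4$, $z_4 = \sqrt{\tfrac{55}{21}} \, x_4$.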
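The Cholesky factorizations $K = L L^T$ in Exercises 3.5.19 and 3.5.20 can be recomputed mechanically. Below is a minimal sketch of the standard column-by-column Cholesky algorithm in plain Python; the function name `cholesky` and the error handling are choices made for this illustration, not the text's own procedure.

```python
import math


def cholesky(K):
    """Return lower triangular L with K = L L^T, or raise ValueError
    if a non-positive pivot shows that K is not positive definite."""
    n = len(K)
    L = [[0.0] * n for _ in range(n)]
    for j in range(n):
        # Square of the j-th diagonal entry of L (the j-th pivot).
        d = K[j][j] - sum(L[j][k] ** 2 for k in range(j))
        if d <= 0:
            raise ValueError("matrix is not positive definite")
        L[j][j] = math.sqrt(d)
        for i in range(j + 1, n):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = (K[i][j] - s) / L[j][j]
    return L


print(cholesky([[4, -12], [-12, 45]]))
print(cholesky([[1, 1, 1], [1, 2, -2], [1, -2, 14]]))
print(cholesky([[2, 1, 1], [1, 2, 1], [1, 1, 2]]))
```

The first two calls give factors with integer entries, so they can be compared directly with parts (b) and (c) of Exercise 3.5.19; the third produces decimal approximations of the square roots appearing in part (d).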

3.6.1. The equation is $e^{\pi i} + 1 = 0$, since $e^{\pi i} = \cos\pi + i \sin\pi = -1$.

3.6.2. $e^{k \pi i} = \cos k\pi + i \sin k\pi = (-1)^k = \begin{cases} 1, & k \text{ even}, \\ -1, & k \text{ odd}. \end{cases}$

3.6.3. Not necessarily. Since 1 = e2 k π i for any integer k, we could equally well compute “

”

1z = e2 k π i z = e− 2 k π y+ i (2 k π x) = e− 2 k π y cos 2 k π x + i sin 2 k π x . If z = n is an integer, this always reduces to 1n = 1, no matter what k is. If z = m/n is a rational number (in lowest terms) then 1m/n has n different possible values. In all other cases, 1z has an infinite number of possible values. 3.6.4. e2 a π i = cos 2 a π + i sin 2 a π = 1 if and only if a is an integer. The problem is that, as in Exercise 3.6.3, the quantity 1a is not necessarily equal to 1 when a is not an integer. √ i = eπ i /4 = √1 + √i and e5 π i /4 = − √1 − √i . 3.6.5. (a) i = eπ i /2 ; (b) 2 √ 2 2 2 √ 4 (c) 3 i = eπ i /6 , e5 π i /6 , e3 π i /2 = − i ; i = eπ i /8 , e5 π i /8 , e9 π i /8 , e13 π i /8 . 3.6.6. Along the line through z at the reciprocal radius 1/r = 1/| z |. 3.6.7. (a) 1/z moves in a clockwise direction around a circle of radius 1/r; (b) z moves in a clockwise direction around a circle of radius r; 1 (c) Suppose the circle has radius r and is centered at a. If r < | a |, then z moves in a | a |2 a counterclockwise direction around a circle of radius centered at ; if | a |2 − r 2 | a |2 − r 2 | a |2 1 cenr > | a |, then z moves in a clockwise direction around a circle of radius 2 r − | a |2 a 1 ; if r = | a |, then z moves along a straight line. On the other hand, z tered at 2 2 |a| − r moves in a clockwise direction around a circle of radius r centered at a. ♦ 3.6.8. Set z = x + i y. We find | Re z | = | x | = inequality. Similarly, | Im z | = | y | =

q

y2 ≤

q

x2 ≤

q

q

x2 + y 2 = | z |, which proves the first

x2 + y 2 = | z |.

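♦ 3.6.9. Write $z = r e^{i\theta}$ so $\theta = \operatorname{ph} z$. Then $\operatorname{Re} (e^{i\varphi} z) = \operatorname{Re} (r e^{i(\varphi + \theta)}) = r \cos(\varphi + \theta) \le r = |z|$, with equality if and only if $\varphi + \theta$ is an integer multiple of $2\pi$.

3.6.10. Set $z = r e^{i\theta}$, $w = s e^{i\varphi}$, then $z w = r s \, e^{i(\theta + \varphi)}$ has modulus $|z w| = r s = |z| \, |w|$ and phase $\operatorname{ph}(z w) = \theta + \varphi = \operatorname{ph} z + \operatorname{ph} w$. Further, $\bar z = r e^{-i\theta}$ has modulus $|\bar z| = r = |z|$ and phase $\operatorname{ph} \bar z = -\theta = -\operatorname{ph} z$.

3.6.11. If $z = r e^{i\theta}$, $w = s e^{i\varphi} \ne 0$, then $z / w = (r/s) \, e^{i(\theta - \varphi)}$ has phase $\operatorname{ph}(z / w) = \theta - \varphi = \operatorname{ph} z - \operatorname{ph} w$, while $z \bar w = r s \, e^{i(\theta - \varphi)}$ also has phase $\operatorname{ph}(z \bar w) = \theta - \varphi = \operatorname{ph} z - \operatorname{ph} w$.

3.6.12. Since $\tan(t + \pi) = \tan t$, the inverse $\tan^{-1} t$ is only defined up to multiples of $\pi$, whereas $\operatorname{ph} z$ is uniquely defined up to multiples of $2\pi$.

3.6.13. Set $z = x + i y$, $w = u + i v$, then $z \bar w = (x + i y) \cdot (u - i v) = (x u + y v) + i \, (y u - x v)$ has real part $\operatorname{Re}(z \bar w) = x u + y v$, which is the dot product between $(x, y)^T$ and $(u, v)^T$.

3.6.14. (a) By Exercise 3.6.13, for $z = x + i y$, $w = u + i v$, the quantity $\operatorname{Re}(z \bar w) = x u + y v$ is equal to the dot product between the vectors $(x, y)^T$, $(u, v)^T$, and hence equals 0 if and only if they are orthogonal. (b) $z \, \overline{i z} = -i \, z \bar z = -i \, |z|^2$ is purely imaginary, with zero real part, and so orthogonality follows from part (a). Alternatively, note that $z = x + i y$ corresponds to $(x, y)^T$ while $i z = -y + i x$ corresponds to the orthogonal vector $(-y, x)^T$.

3.6.15. $e^{(x + i y) + (u + i v)} = e^{(x + u) + i (y + v)} = e^{x + u} \bigl[ \cos(y + v) + i \sin(y + v) \bigr] = e^{x + u} \bigl[ (\cos y \cos v - \sin y \sin v) + i \, (\cos y \sin v + \sin y \cos v) \bigr] = \bigl[ e^x (\cos y + i \sin y) \bigr] \bigl[ e^u (\cos v + i \sin v) \bigr] = e^z e^w$.

Use induction: $e^{(m+1) z} = e^{m z + z} = e^{m z} e^z = (e^z)^m e^z = (e^z)^{m+1}$.

3.6.16. (a) $e^{2 i \theta} = \cos 2\theta + i \sin 2\theta$ while $(e^{i\theta})^2 = (\cos\theta + i \sin\theta)^2 = (\cos^2\theta - \sin^2\theta) + 2 i \cos\theta \sin\theta$, and hence $\cos 2\theta = \cos^2\theta - \sin^2\theta$, $\sin 2\theta = 2 \cos\theta \sin\theta$.
(b) $\cos 3\theta = \cos^3\theta - 3 \cos\theta \sin^2\theta$, $\sin 3\theta = 3 \cos^2\theta \sin\theta - \sin^3\theta$.
(c) $\cos m\theta = \sum_{0 \le j = 2k \le m} (-1)^k \binom{m}{j} \cos^{m-j}\theta \, \sin^j\theta$, $\quad \sin m\theta = \sum_{0 < j = 2k+1 \le m} (-1)^k \binom{m}{j} \cos^{m-j}\theta \, \sin^j\theta$.

♦ 3.6.17. $\cos\dfrac{\theta - \varphi}{2} \, \cos\dfrac{\theta + \varphi}{2} = \tfrac14 \bigl[ e^{i(\varphi - \theta)/2} + e^{-i(\varphi - \theta)/2} \bigr] \bigl[ e^{i(\varphi + \theta)/2} + e^{-i(\varphi + \theta)/2} \bigr] = \tfrac14 e^{i\varphi} + \tfrac14 e^{-i\varphi} + \tfrac14 e^{i\theta} + \tfrac14 e^{-i\theta} = \tfrac12 \cos\theta + \tfrac12 \cos\varphi$.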
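The general multiple-angle formulas of Exercise 3.6.16(c) come from expanding $(\cos\theta + i \sin\theta)^m$ with the binomial theorem and separating even and odd powers of $i \sin\theta$. The sketch below evaluates both sums and compares them with the library values; the function name `cos_sin_multiple` and the test angle are assumptions for the illustration (`math.comb` needs Python 3.8 or later).

```python
import math


def cos_sin_multiple(m, theta):
    """cos(m theta) and sin(m theta) from the binomial expansion of
    (cos theta + i sin theta)^m, split into even and odd powers j."""
    c, s = math.cos(theta), math.sin(theta)
    cos_m = sum((-1) ** (j // 2) * math.comb(m, j) * c ** (m - j) * s ** j
                for j in range(0, m + 1, 2))        # j = 2k
    sin_m = sum((-1) ** ((j - 1) // 2) * math.comb(m, j) * c ** (m - j) * s ** j
                for j in range(1, m + 1, 2))        # j = 2k + 1
    return cos_m, sin_m


theta = 0.7  # arbitrary test angle
for m in range(1, 7):
    cm, sm = cos_sin_multiple(m, theta)
    print(m, cm - math.cos(m * theta), sm - math.sin(m * theta))
```

Both printed differences should be at the level of rounding error for every $m$.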

3.6.18. $e^z = e^x \cos y + i \, e^x \sin y = r \cos\theta + i \, r \sin\theta$ implies $r = |e^z| = e^x$ and $\theta = \operatorname{ph} e^z = y$.

3.6.19. $\cos(x + i y) = \cos x \cosh y - i \sin x \sinh y$, $\sin(x + i y) = \sin x \cosh y + i \cos x \sinh y$, where $\cosh y = \dfrac{e^y + e^{-y}}{2}$, $\sinh y = \dfrac{e^y - e^{-y}}{2}$. In particular, when $y = 0$, $\cosh y = 1$ and $\sinh y = 0$, and so these reduce to the usual real trigonometric functions. If $x = 0$ we obtain $\cos i y = \cosh y$, $\sin i y = i \sinh y$.

3.6.20. (a) $\cosh(x + i y) = \cosh x \cos y + i \sinh x \sin y$, $\sinh(x + i y) = \sinh x \cos y + i \cosh x \sin y$; (b) Using Exercise 3.6.19, $\cos i z = \dfrac{e^{-z} + e^{z}}{2} = \cosh z$, $\sin i z = \dfrac{e^{-z} - e^{z}}{2 i} = i \, \dfrac{e^{z} - e^{-z}}{2} = i \sinh z$.
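The formulas for $\cos(x + i y)$ and $\sin(x + i y)$ in Exercise 3.6.19 can be compared against Python's `cmath` module. This is only a numerical spot check at one arbitrarily chosen point; the helper names are hypothetical.

```python
import cmath
import math


def cos_formula(x, y):
    # cos(x + i y) = cos x cosh y - i sin x sinh y
    return complex(math.cos(x) * math.cosh(y), -math.sin(x) * math.sinh(y))


def sin_formula(x, y):
    # sin(x + i y) = sin x cosh y + i cos x sinh y
    return complex(math.sin(x) * math.cosh(y), math.cos(x) * math.sinh(y))


x, y = 0.3, 1.2  # arbitrary test point
z = complex(x, y)
print(cmath.cos(z), cos_formula(x, y))
print(cmath.sin(z), sin_formula(x, y))

# Special case x = 0: cos(i y) = cosh y and sin(i y) = i sinh y.
print(cmath.cos(1j * y), math.cosh(y))
print(cmath.sin(1j * y), 1j * math.sinh(y))
```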

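♥ 3.6.21. (a) If $j + k = n$, then $(\cos\theta)^j (\sin\theta)^k = \dfrac{1}{2^n i^k} \, (e^{i\theta} + e^{-i\theta})^j (e^{i\theta} - e^{-i\theta})^k$. When multiplied out, each term has $0 \le l \le n$ factors of $e^{i\theta}$ and $n - l$ factors of $e^{-i\theta}$, which equals $e^{i(2l - n)\theta}$ with $-n \le 2l - n \le n$, and hence the product is a linear combination of the indicated exponentials.
(b) This follows from part (a), writing each $e^{i k \theta} = \cos k\theta + i \sin k\theta$ and $e^{-i k \theta} = \cos k\theta - i \sin k\theta$ for $k \ge 0$.
(c) (i) $\cos^2\theta = \tfrac14 e^{2 i \theta} + \tfrac12 + \tfrac14 e^{-2 i \theta} = \tfrac12 + \tfrac12 \cos 2\theta$,
(ii) $\cos\theta \sin\theta = -\tfrac14 i \, e^{2 i \theta} + \tfrac14 i \, e^{-2 i \theta} = \tfrac12 \sin 2\theta$,
(iii) $\cos^3\theta = \tfrac18 e^{3 i \theta} + \tfrac38 e^{i\theta} + \tfrac38 e^{-i\theta} + \tfrac18 e^{-3 i \theta} = \tfrac34 \cos\theta + \tfrac14 \cos 3\theta$,
(iv) $\sin^4\theta = \tfrac{1}{16} e^{4 i \theta} - \tfrac14 e^{2 i \theta} + \tfrac38 - \tfrac14 e^{-2 i \theta} + \tfrac{1}{16} e^{-4 i \theta} = \tfrac38 - \tfrac12 \cos 2\theta + \tfrac18 \cos 4\theta$,
(v) $\cos^2\theta \sin^2\theta = -\tfrac{1}{16} e^{4 i \theta} + \tfrac18 - \tfrac{1}{16} e^{-4 i \theta} = \tfrac18 - \tfrac18 \cos 4\theta$.

♦ 3.6.22. $x^{a + i b} = x^a e^{i b \log x} = x^a \cos(b \log x) + i \, x^a \sin(b \log x)$.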

♦ 3.6.23. First, using the power series for $e^x$, we have the complex power series $e^{i x} = \sum_{n=0}^{\infty} \dfrac{(i x)^n}{n!}$. Since

$i^n = \begin{cases} 1, & n = 4k, \\ i, & n = 4k + 1, \\ -1, & n = 4k + 2, \\ -i, & n = 4k + 3, \end{cases}$

we can rewrite the preceding series as

$e^{i x} = \sum_{n=0}^{\infty} \dfrac{(i x)^n}{n!} = \sum_{k=0}^{\infty} (-1)^k \dfrac{x^{2k}}{(2k)!} + i \sum_{k=0}^{\infty} (-1)^k \dfrac{x^{2k+1}}{(2k+1)!} = \cos x + i \sin x$.

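♦ 3.6.24.

(a) $\dfrac{d}{dx} e^{\lambda x} = \dfrac{d}{dx} \bigl( e^{\mu x} \cos \nu x + i \, e^{\mu x} \sin \nu x \bigr) = (\mu e^{\mu x} \cos \nu x - \nu e^{\mu x} \sin \nu x) + i \, (\mu e^{\mu x} \sin \nu x + \nu e^{\mu x} \cos \nu x) = (\mu + i \nu) \bigl( e^{\mu x} \cos \nu x + i \, e^{\mu x} \sin \nu x \bigr) = \lambda e^{\lambda x}$.

(b) This follows from the Fundamental Theorem of Calculus. Alternatively, one can calculate the integrals of the real and imaginary parts directly:

$\int_a^b e^{\mu x} \cos \nu x \, dx = \dfrac{1}{\mu^2 + \nu^2} \bigl( \mu e^{\mu b} \cos \nu b + \nu e^{\mu b} \sin \nu b - \mu e^{\mu a} \cos \nu a - \nu e^{\mu a} \sin \nu a \bigr)$,

$\int_a^b e^{\mu x} \sin \nu x \, dx = \dfrac{1}{\mu^2 + \nu^2} \bigl( \mu e^{\mu b} \sin \nu b - \nu e^{\mu b} \cos \nu b - \mu e^{\mu a} \sin \nu a + \nu e^{\mu a} \cos \nu a \bigr)$.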

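3.6.25. (a) $\tfrac12 x + \tfrac14 \sin 2x$, (b) $\tfrac12 x - \tfrac14 \sin 2x$, (c) $-\tfrac14 \cos 2x$, (d) $-\tfrac14 \cos 2x - \tfrac{1}{16} \cos 8x$, (e) $\tfrac38 x + \tfrac14 \sin 2x + \tfrac{1}{32} \sin 4x$, (f) $\tfrac38 x - \tfrac14 \sin 2x + \tfrac{1}{32} \sin 4x$, (g) $\tfrac18 x - \tfrac{1}{32} \sin 4x$, (h) $-\tfrac14 \cos x + \tfrac{1}{20} \cos 5x - \tfrac{1}{36} \cos 9x - \tfrac{1}{60} \cos 15x$.

♠ 3.6.26.

$\operatorname{Re} z^2$, $\operatorname{Im} z^2$: [plots not reproduced] Both have saddle points at the origin.

$\operatorname{Re} \dfrac1z$, $\operatorname{Im} \dfrac1z$: [plots not reproduced] Both have singularities (“poles”) at the origin.

♠ 3.6.27.

$\operatorname{ph} z$, $|z|$: [plots not reproduced]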

3.6.28. (a) Linearly independent; (b) linearly dependent; (c) linearly independent; (d) linearly dependent; (e) linearly independent; (f) linearly independent; (g) linearly dependent.

3.6.29. (a) Linearly independent; (b) yes, they are a basis; (c) $\|v_1\| = \sqrt2$, $\|v_2\| = \sqrt6$, $\|v_3\| = \sqrt5$; (d) $v_1 \cdot v_2 = 1 + i$, $v_2 \cdot v_1 = 1 - i$, $v_1 \cdot v_3 = 0$, $v_2 \cdot v_3 = 0$, so $v_1, v_3$ and $v_2, v_3$ are orthogonal, but not $v_1, v_2$. (e) No, since $v_1$ and $v_2$ are not orthogonal.

3.6.30.
(a) Dimension = 1; basis: $(1, i, 1 - i)^T$.
(b) Dimension = 2; basis: $(i - 1, 0, 1)^T$, $(-i, 1, 0)^T$.
(c) Dimension = 2; basis: $(1, i + 2)^T$, $(i, 1 + 3i)^T$.
(d) Dimension = 1; basis: $\bigl( -\tfrac{14}{5} - \tfrac85 i, \; \tfrac{13}{5} - \tfrac45 i, \; 1 \bigr)^T$.
(e) Dimension = 2; basis: $(1 + i, 1, 0)^T$, $(i, 0, 1)^T$.

3.6.31. False — it is not closed under scalar multiplication. For instance, i not in the subspace since i z = − i z. 3.6.32.

!

i ; corange: −1

(a) Range: (b) Range:

(c) Range:

0

!

!

i ; kernel: 2

2i ; cokernel: 1 0 1 0 ! ! 2 0 2 −1 + i C B , ; corange: B −1 + i , 1 + i @ A @ −4 3 1 − 2i 3 − 3i cokernel: {0}. i

1 0

B C B @ −1 + 2 i A, @

i

2− i 3 1+ i

1

C A;

corange:

100

0 B @

i −1 2− i

1 0 C B A, @

1

−i 1

!

z z

!

iz iz

=

. 0

1 + 25 i C B ; kernel: A @ 3i 1

1

!

0 0 C A; kernel: −2

0 B @

−i 1 0

1

C A;

1

C A;

is

cokernel:

0

1 − 32 B B 1 @−2 +

i i

1

1

C C. A

3.6.33. Here $v = x + i\,y$ and $w = x - i\,y$. If $c\,v + d\,w = 0$ where $c = a + i\,b$, $d = e + i\,f$, then, taking real and imaginary parts, $(a + e)x + (-b + f)y = 0 = (b + f)x + (a - e)y$. If c, d are not both zero, so v, w are linearly dependent, then a ± e, b ± f cannot all be zero, and so x, y are linearly dependent. Conversely, if $a\,x + b\,y = 0$ with a, b not both zero, then $(a - i\,b)v + (a + i\,b)w = 2(a\,x + b\,y) = 0$ and hence v, w are linearly dependent.

3.6.34. This can be proved directly, or by noting that it can be identified with the vector space $\mathbb{C}^{mn}$. The dimension is m n, with a basis provided by the m n matrices with a single entry of 1 and all other entries equal to 0.

3.6.35. Only (b) is a subspace.

3.6.36. False.

3.6.37. (a) Belongs: $\sin x = -\tfrac12 i\,e^{ix} + \tfrac12 i\,e^{-ix}$; (b) belongs: $\cos x - 2i\sin x = -\tfrac12 e^{ix} + \tfrac32 e^{-ix}$; (c) doesn't belong; (d) belongs: $\sin^2 \tfrac12 x = \tfrac12 - \tfrac14 e^{ix} - \tfrac14 e^{-ix}$; (e) doesn't belong.

3.6.38.
(a) Sesquilinearity:
$\langle c\,u + d\,v, w\rangle = (c\,u_1 + d\,v_1)\overline{w_1} + 2(c\,u_2 + d\,v_2)\overline{w_2} = c\,(u_1\overline{w_1} + 2u_2\overline{w_2}) + d\,(v_1\overline{w_1} + 2v_2\overline{w_2}) = c\,\langle u, w\rangle + d\,\langle v, w\rangle$,
$\langle u, c\,v + d\,w\rangle = u_1\overline{(c\,v_1 + d\,w_1)} + 2u_2\overline{(c\,v_2 + d\,w_2)} = \bar c\,(u_1\overline{v_1} + 2u_2\overline{v_2}) + \bar d\,(u_1\overline{w_1} + 2u_2\overline{w_2}) = \bar c\,\langle u, v\rangle + \bar d\,\langle u, w\rangle$.
Conjugate symmetry: $\langle v, w\rangle = v_1\overline{w_1} + 2v_2\overline{w_2} = \overline{w_1\overline{v_1} + 2w_2\overline{v_2}} = \overline{\langle w, v\rangle}$.
Positive definite: $\langle v, v\rangle = |v_1|^2 + 2|v_2|^2 > 0$ whenever $v = (v_1, v_2)^T \ne 0$.

(b) Sesquilinearity:
$\langle c\,u + d\,v, w\rangle = (c\,u_1 + d\,v_1)\overline{w_1} + i\,(c\,u_1 + d\,v_1)\overline{w_2} - i\,(c\,u_2 + d\,v_2)\overline{w_1} + 2(c\,u_2 + d\,v_2)\overline{w_2}$
$= c\,(u_1\overline{w_1} + i\,u_1\overline{w_2} - i\,u_2\overline{w_1} + 2u_2\overline{w_2}) + d\,(v_1\overline{w_1} + i\,v_1\overline{w_2} - i\,v_2\overline{w_1} + 2v_2\overline{w_2}) = c\,\langle u, w\rangle + d\,\langle v, w\rangle$,
$\langle u, c\,v + d\,w\rangle = u_1\overline{(c\,v_1 + d\,w_1)} + i\,u_1\overline{(c\,v_2 + d\,w_2)} - i\,u_2\overline{(c\,v_1 + d\,w_1)} + 2u_2\overline{(c\,v_2 + d\,w_2)}$
$= \bar c\,(u_1\overline{v_1} + i\,u_1\overline{v_2} - i\,u_2\overline{v_1} + 2u_2\overline{v_2}) + \bar d\,(u_1\overline{w_1} + i\,u_1\overline{w_2} - i\,u_2\overline{w_1} + 2u_2\overline{w_2}) = \bar c\,\langle u, v\rangle + \bar d\,\langle u, w\rangle$.
Conjugate symmetry:
$\langle v, w\rangle = v_1\overline{w_1} + i\,v_1\overline{w_2} - i\,v_2\overline{w_1} + 2v_2\overline{w_2} = \overline{w_1\overline{v_1} + i\,w_1\overline{v_2} - i\,w_2\overline{v_1} + 2w_2\overline{v_2}} = \overline{\langle w, v\rangle}$.
Positive definite: Let $v = (v_1, v_2)^T = (x_1 + i\,y_1, x_2 + i\,y_2)^T$:
$\langle v, v\rangle = |v_1|^2 + i\,v_1\overline{v_2} - i\,v_2\overline{v_1} + 2|v_2|^2 = x_1^2 + y_1^2 + 2x_1y_2 - 2x_2y_1 + 2x_2^2 + 2y_2^2 = (x_1 + y_2)^2 + (y_1 - x_2)^2 + x_2^2 + y_2^2 > 0$ provided $v \ne 0$.

3.6.39. Only (d), (e) define Hermitian inner products.

♦ 3.6.40. Since A is real and symmetric, $(A v)\cdot w = (A v)^T\,\overline{w} = v^T A^T\,\overline{w} = v^T A\,\overline{w} = v^T\,\overline{A w} = v\cdot(A w)$.

3.6.41. (a) $\|z\|^2 = \sum_{j=1}^n |z_j|^2 = \sum_{j=1}^n \bigl(|x_j|^2 + |y_j|^2\bigr) = \sum_{j=1}^n |x_j|^2 + \sum_{j=1}^n |y_j|^2 = \|x\|^2 + \|y\|^2$.

(b) No; for instance, the formula is not valid for the inner product in Exercise 3.6.38(b).

♦ 3.6.42.
(a) $\|z + w\|^2 = \langle z + w, z + w\rangle = \|z\|^2 + \langle z, w\rangle + \langle w, z\rangle + \|w\|^2 = \|z\|^2 + \langle z, w\rangle + \overline{\langle z, w\rangle} + \|w\|^2 = \|z\|^2 + 2\,\mathrm{Re}\,\langle z, w\rangle + \|w\|^2$.
(b) Using (a),
$\|z + w\|^2 - \|z - w\|^2 + i\,\|z + i\,w\|^2 - i\,\|z - i\,w\|^2 = 4\,\mathrm{Re}\,\langle z, w\rangle + 4i\,\mathrm{Re}\,(-i\,\langle z, w\rangle) = 4\,\mathrm{Re}\,\langle z, w\rangle + 4i\,\mathrm{Im}\,\langle z, w\rangle = 4\,\langle z, w\rangle$.

♦ 3.6.43.

(a) The angle θ between v, w is defined by $\cos\theta = \dfrac{|\langle v, w\rangle|}{\|v\|\,\|w\|}$. Note the modulus on the inner product term, which is needed in order to keep the angle real.
(b) $\|v\| = \sqrt{11}$, $\|w\| = 2\sqrt2$, $\langle v, w\rangle = 2i$, so $\theta = \cos^{-1}\bigl(1/\sqrt{22}\bigr) = 1.3559$ radians.
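The angle formula in Exercise 3.6.43(a) can be sketched in a few lines of Python. This is only an illustration: the function names and the sample vectors are hypothetical (they are not the vectors from the exercise), and the Hermitian dot product is assumed to be $v \cdot w = \sum_j v_j \overline{w_j}$.

```python
import math

def hermitian_dot(v, w):
    """Hermitian dot product: sum of v_j * conj(w_j)."""
    return sum(complex(a) * complex(b).conjugate() for a, b in zip(v, w))

def complex_angle(v, w):
    """Angle theta in [0, pi/2] with cos(theta) = |<v, w>| / (||v|| ||w||)."""
    norm_v = math.sqrt(hermitian_dot(v, v).real)
    norm_w = math.sqrt(hermitian_dot(w, w).real)
    ratio = abs(hermitian_dot(v, w)) / (norm_v * norm_w)
    # Clamp so that round-off slightly above 1 does not break acos.
    return math.acos(min(1.0, ratio))

# Hypothetical sample vectors (assumed for illustration only).
v = [1, 1j]
w = [1j, 1]
print(complex_angle(v, w))
```

Because of the modulus, multiplying either vector by a unit complex number $e^{i\theta}$ leaves the computed angle unchanged.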

3.6.44. $\|v\| = \|c\,v\|$ if and only if $|c| = 1$, and so $c = e^{i\theta}$ for some $0 \le \theta < 2\pi$.

♦ 3.6.45. Assume $w \ne 0$. Then, by Exercise 3.6.42(a), for $t \in \mathbb{C}$,
$0 \le \|v + t\,w\|^2 = \|v\|^2 + 2\,\mathrm{Re}\,\bigl(\bar t\,\langle v, w\rangle\bigr) + |t|^2\,\|w\|^2$.
With $t = -\dfrac{\langle v, w\rangle}{\|w\|^2}$, we find
$0 \le \|v\|^2 - 2\,\dfrac{|\langle v, w\rangle|^2}{\|w\|^2} + \dfrac{|\langle v, w\rangle|^2}{\|w\|^2} = \|v\|^2 - \dfrac{|\langle v, w\rangle|^2}{\|w\|^2}$,
which implies $|\langle v, w\rangle|^2 \le \|v\|^2\,\|w\|^2$, proving Cauchy–Schwarz. To establish the triangle inequality,
$\|v + w\|^2 = \langle v + w, v + w\rangle = \|v\|^2 + 2\,\mathrm{Re}\,\langle v, w\rangle + \|w\|^2 \le \|v\|^2 + 2\,\|v\|\,\|w\| + \|w\|^2 = \bigl(\|v\| + \|w\|\bigr)^2$,
since, according to Exercise 3.6.8, $\mathrm{Re}\,\langle v, w\rangle \le |\langle v, w\rangle| \le \|v\|\,\|w\|$.

3.6.46. (a) A norm on the complex vector space V assigns a real number $\|v\|$ to each vector $v \in V$, subject to the following axioms, for all $v, w \in V$ and $c \in \mathbb{C}$:
(i) Positivity: $\|v\| \ge 0$, with $\|v\| = 0$ if and only if $v = 0$.
(ii) Homogeneity: $\|c\,v\| = |c|\,\|v\|$.
(iii) Triangle inequality: $\|v + w\| \le \|v\| + \|w\|$.
(b) $\|v\|_1 = |v_1| + \cdots + |v_n|$; $\|v\|_2 = \sqrt{|v_1|^2 + \cdots + |v_n|^2}$; $\|v\|_\infty = \max\{\,|v_1|, \ldots, |v_n|\,\}$.

3.6.47. (e) Infinitely many, namely $u = e^{i\theta}\,v/\|v\|$ for any $0 \le \theta < 2\pi$.

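♦ 3.6.48.
(a) $(A^\dagger)^\dagger = \overline{\bigl(\overline{A^T}\bigr)^T} = (A^T)^T = A$,
(b) $(z A + w B)^\dagger = \overline{(z A + w B)^T} = \overline{z A^T + w B^T} = \bar z\,\overline{A^T} + \bar w\,\overline{B^T} = \bar z\,A^\dagger + \bar w\,B^\dagger$,
(c) $(AB)^\dagger = \overline{(AB)^T} = \overline{B^T A^T} = \overline{B^T}\;\overline{A^T} = B^\dagger A^\dagger$.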

♦ 3.6.49.
(a) The entries of H satisfy $h_{ji} = \overline{h_{ij}}$; in particular, $h_{ii} = \overline{h_{ii}}$, and so $h_{ii}$ is real.
(b) $(H z)\cdot w = (H z)^T\,\overline{w} = z^T H^T\,\overline{w} = z^T\,\overline{H w} = z\cdot(H w)$.
(c) Let $z = \sum_{i=1}^n z_i e_i$, $w = \sum_{i=1}^n w_i e_i$ be vectors in $\mathbb{C}^n$. Then, by sesquilinearity, $\langle z, w\rangle = \sum_{i,j=1}^n h_{ij}\,z_i\,\overline{w_j} = z^T H\,\overline{w}$, where H has entries $h_{ij} = \langle e_i, e_j\rangle = \overline{\langle e_j, e_i\rangle} = \overline{h_{ji}}$, proving that it is a Hermitian matrix. Positive definiteness requires $\|z\|^2 = z^T H\,\overline{z} > 0$ for all $z \ne 0$.
(d) First check that the matrix is Hermitian: $h_{ij} = \overline{h_{ji}}$. Then apply Regular Gaussian Elimination, checking that all pivots are real and positive.

♦ 3.6.50.
(a) The (i, j) entry of the Gram matrix K is $k_{ij} = \langle v_i, v_j\rangle = \overline{\langle v_j, v_i\rangle} = \overline{k_{ji}}$, and so $K^\dagger = K$ is Hermitian.

(b) $x^T K\,\overline{x} = \sum_{i,j=1}^n k_{ij}\,x_i\,\overline{x_j} = \sum_{i,j=1}^n x_i\,\overline{x_j}\,\langle v_i, v_j\rangle = \|v\|^2 \ge 0$ where $v = \sum_{i=1}^n x_i v_i$.
(c) Equality holds if and only if v = 0. If $v_1, \ldots, v_n$ are linearly independent, then $v = \sum_{i=1}^n x_i v_i = 0$ requires x = 0, proving positive definiteness.

3.6.51.
(a) (i) $\langle 1, e^{i\pi x}\rangle = -\tfrac{2}{\pi}\,i$, $\|1\| = 1$, $\|e^{i\pi x}\| = 1$;
(ii) $|\langle 1, e^{i\pi x}\rangle| = \tfrac{2}{\pi} \le 1 = \|1\|\,\|e^{i\pi x}\|$, $\|1 + e^{i\pi x}\| = \sqrt2 \le 2 = \|1\| + \|e^{i\pi x}\|$.
(b) (i) $\langle x + i, x - i\rangle = -\tfrac23 + i$, $\|x + i\| = \|x - i\| = \tfrac{2}{\sqrt3}$;
(ii) $|\langle x + i, x - i\rangle| = \tfrac{\sqrt{13}}{3} \le \tfrac43 = \|x + i\|\,\|x - i\|$, $\|(x + i) + (x - i)\| = \|2x\| = \tfrac{2}{\sqrt3} \le \tfrac{4}{\sqrt3} = \|x + i\| + \|x - i\|$.
(c) (i) $\langle i\,x^2, (1 - 2i)x + 3i\rangle = \tfrac12 + \tfrac14\,i$, $\|i\,x^2\| = \tfrac{1}{\sqrt5}$, $\|(1 - 2i)x + 3i\| = \sqrt{\tfrac{14}{3}}$;
(ii) $|\langle i\,x^2, (1 - 2i)x + 3i\rangle| = \tfrac{\sqrt5}{4} \le \sqrt{\tfrac{14}{15}} = \|i\,x^2\|\,\|(1 - 2i)x + 3i\|$, $\|i\,x^2 + (1 - 2i)x + 3i\| = \sqrt{\tfrac{88}{15}} \approx 2.4221 \le 2.6075 \approx \tfrac{1}{\sqrt5} + \sqrt{\tfrac{14}{3}} = \|i\,x^2\| + \|(1 - 2i)x + 3i\|$.

3.6.52. The weight function must be real and positive: w(x) > 0. Less restrictively, one needs only require that w(x) ≥ 0 as long as w(x) ≢ 0 on any open subinterval a ≤ c < x < d ≤ b; see Exercise 3.1.28 for details.


4.1.1. We need to minimize $(3x - 1)^2 + (2x + 1)^2 = 13x^2 - 2x + 2$. The minimum value of $\tfrac{25}{13}$ occurs when $x = \tfrac{1}{13}$.

4.1.2. Note that f(x, y) ≥ 0; the minimum value $f(x_\star, y_\star) = 0$ is achieved when $x_\star = -\tfrac57$, $y_\star = -\tfrac47$.

4.1.3. (a) $(-1, 0)^T$, (b) $(0, 2)^T$, (c) $\bigl(\tfrac12, \tfrac12\bigr)^T$, (d) $\bigl(-\tfrac32, \tfrac32\bigr)^T$, (e) $(-1, 2)^T$.

4.1.4. Note: To minimize the distance from the point $(a, b)^T$ to the line y = m x + c:
(i) in the ∞ norm we must minimize the scalar function $f(x) = \max\{\,|x - a|,\ |m x + c - b|\,\}$, while
(ii) in the 1 norm we must minimize the scalar function $f(x) = |x - a| + |m x + c - b|$.
(i) (a) all points on the line segment $(x, 0)^T$ for −3 ≤ x ≤ 1; (b) all points on the line segment $(0, y)^T$ for 1 ≤ y ≤ 3; (c) $\bigl(\tfrac12, \tfrac12\bigr)^T$; (d) $\bigl(-\tfrac32, \tfrac32\bigr)^T$; (e) $(-1, 2)^T$.

(ii) (a) $(-1, 0)^T$; (b) $(0, 2)^T$; (c) all points on the line segment $(t, t)^T$ for −1 ≤ t ≤ 2; (d) all points on the line segment $(t, -t)^T$ for −2 ≤ t ≤ −1; (e) $(-1, 2)^T$.

4.1.5.
(a) Uniqueness is assured in the Euclidean norm. (See the following exercise.)
(b) Not unique. For instance, in the ∞ norm, every point on the x-axis of the form (x, 0) for −1 ≤ x ≤ 1 is at a minimum distance 1 from the point $(0, 1)^T$.
(c) Not unique. For instance, in the 1 norm, every point on the line x = y of the form (x, x) for −1 ≤ x ≤ 1 is at a minimum distance 1 from the point $(-1, 1)^T$.

♥ 4.1.6.
(a) The closest point v is found by dropping a perpendicular from the point to the line.
[Figure omitted: the point b, the closest point v on the line, and the angle θ.]
(b) Any other point w on the line lies at a larger distance since $\|w - b\|$ is the hypotenuse of the right triangle with corners b, v, w and hence is longer than the side length $\|v - b\|$.
(c) Using the properties of the cross product, the distance is $\|b\|\,|\sin\theta| = |a \times b|/\|a\|$, where θ is the angle between the line through a and the vector b. To prove the other formula, we note that
$\|a\|^2\,\|b\|^2 - (a\cdot b)^2 = (a_1^2 + a_2^2)(b_1^2 + b_2^2) - (a_1 b_1 + a_2 b_2)^2 = (a_1 b_2 - a_2 b_1)^2 = (a \times b)^2$.

4.1.7. This holds because the two triangles in the figure are congruent. According to Exercise 4.1.6(c), when $\|a\| = \|b\| = 1$, the distance is $|\sin\theta|$ where θ is the angle between a, b.
[Figure omitted: unit vectors a and b separated by the angle θ.]

4.1.8. (a) Note that $\mathbf{a} = (b, -a)^T$ lies in the line. By Exercise 4.1.6(a), the distance is $\dfrac{|\mathbf{a} \times \mathbf{x}_0|}{\|\mathbf{a}\|} = \dfrac{|a\,x_0 + b\,y_0|}{\sqrt{a^2 + b^2}}$. (b) A similar geometric construction yields the distance $\dfrac{|a\,x_0 + b\,y_0 + c|}{\sqrt{a^2 + b^2}}$.

♥ 4.1.9. (a) The distance is given by $\dfrac{|a\,x_0 + b\,y_0 + c\,z_0 + d|}{\sqrt{a^2 + b^2 + c^2}}$. (b) $\dfrac{1}{\sqrt{14}}$.

4.1.10. (a) Let $v_\star$ be the minimizer. Since $x^2$ is a monotone, strictly increasing function for x ≥ 0, we have 0 ≤ x < y if and only if $x^2 < y^2$. Thus, for any other v, we have $x = \|v_\star - b\| < y = \|v - b\|$ if and only if $x^2 = \|v_\star - b\|^2 < y^2 = \|v - b\|^2$, proving that $v_\star$ must minimize both quantities. (b) F(x) must be strictly increasing: F(x) < F(y) whenever x < y.
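The distance formulas of Exercises 4.1.8(b) and 4.1.9(a) share one pattern, which the following Python sketch illustrates. The function name and the sample line and plane are hypothetical choices made for illustration; they are not taken from the exercises.

```python
import math

def distance_to_hyperplane(normal, offset, point):
    """Distance from `point` to the set {x : normal . x + offset = 0}.

    In the plane this is |a x0 + b y0 + c| / sqrt(a^2 + b^2); in space it is
    |a x0 + b y0 + c z0 + d| / sqrt(a^2 + b^2 + c^2).
    """
    numerator = abs(sum(n * p for n, p in zip(normal, point)) + offset)
    return numerator / math.sqrt(sum(n * n for n in normal))

# Assumed example: the line 3x + 4y - 10 = 0 and the origin.
print(distance_to_hyperplane((3, 4), -10, (0, 0)))
# Assumed example: the plane x + 2y + 3z - 1 = 0 and the origin.
print(distance_to_hyperplane((1, 2, 3), -1, (0, 0, 0)))
```

The same function covers both cases because only the length of the normal vector and the value of the defining linear expression at the point enter the formula.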

4.1.11.
(a) Assume V ≠ {0}, as otherwise the minimum and maximum distance is $\|b\|$. Given any 0 ≠ v ∈ V, by the triangle inequality, $\|b - t\,v\| \ge |t|\,\|v\| - \|b\| \to \infty$ as t → ∞, and hence there is no maximum distance.
(b) Maximize distance from a point to a closed, bounded (compact) subset of $\mathbb{R}^n$, e.g., the unit sphere $\{\,\|v\| = 1\,\}$. For example, the maximal distance between the point $(1, 1)^T$ and the unit circle $x^2 + y^2 = 1$ is $1 + \sqrt2$, with $x_\star = \bigl(-\tfrac{1}{\sqrt2}, -\tfrac{1}{\sqrt2}\bigr)^T$ being the farthest point on the circle.

4.2.1. $x = \tfrac12$, $y = \tfrac12$, $z = -2$, with $f(x, y, z) = -\tfrac32$. This is the global minimum because the coefficient matrix $\begin{pmatrix} 1 & 1 & 0 \\ 1 & 3 & 1 \\ 0 & 1 & 1 \end{pmatrix}$ is positive definite.

4.2.2. At the point $x_\star = -\tfrac{1}{11}$, $y_\star = \tfrac{5}{22}$.

4.2.3. (a) Minimizer: x = − 23 , y = 61 ; minimum value: − 43 . (b) Minimizer: x = 29 , y = 92 ; 1 minimum value: 32 9 . (c) No minimum. (d) Minimizer: x = − 2 , y = −1, z = 1; minimum value: − 45 . (e) No minimum. (f ) No minimum. (g) Minimizer: x = 57 , y = − 54 , z = 15 , w = 52 ; minimum value: − 85 . !

!

!

!

1 b 0 4.2.4. (a) | b | < 2, (b) A = ; (c) If | b | < 2, the = 0 1 4 − b2 1 b 1 minimum value is − , achieved at x⋆ = − , y⋆ = . When b ≥ ±2, the 4 − b2 4 − b2 4 − b2 minimum is − ∞. 1 b

b 4

1 b

0 1


1 0

4.2.5.
(a) $p(x) = 4x^2 - 24xy + 45y^2 + x - 4y + 3$; minimizer: $x_\star = \bigl(\tfrac{1}{24}, \tfrac{1}{18}\bigr)^T \approx (.0417, .0556)^T$; minimum value: $p(x_\star) = \tfrac{419}{144} \approx 2.9097$.
(b) $p(x) = 3x^2 + 4xy + y^2 - 8x - 2y$; no minimizer since K is not positive definite.
(c) $p(x) = 3x^2 - 2xy + 2xz + 2y^2 - 2yz + 3z^2 - 2x + 4z - 3$; minimizer: $x_\star = \bigl(\tfrac{7}{12}, -\tfrac16, -\tfrac{11}{12}\bigr)^T \approx (.5833, -.1667, -.9167)^T$; minimum value: $p(x_\star) = -\tfrac{65}{12} = -5.4167$.
(d) $p(x) = x^2 + 2xy + 2xz + 2y^2 - 2yz + z^2 + 6x + 2y - 4z + 1$; no minimizer since K is not positive definite.
(e) $p(x) = x^2 + 2xy + 2y^2 + 2yz + 3z^2 + 2zw + 4w^2 + 2x - 4y + 6z - 8w$; minimizer: $x_\star = (-8, 7, -4, 2)^T$; minimum: $p(x_\star) = -42$.

4.2.6.
n = 2: minimizer $x_\star = \bigl(-\tfrac16, -\tfrac16\bigr)^T$; minimum value $-\tfrac16$.
n = 3: minimizer $x_\star = \bigl(-\tfrac{5}{28}, -\tfrac{3}{14}, -\tfrac{5}{28}\bigr)^T$; minimum value $-\tfrac27$.
n = 4: minimizer $x_\star = \bigl(-\tfrac{2}{11}, -\tfrac{5}{22}, -\tfrac{5}{22}, -\tfrac{2}{11}\bigr)^T$; minimum value $-\tfrac{9}{22}$.

4.2.7. (a) maximizer: $x_\star = \bigl(\tfrac{10}{11}, \tfrac{3}{11}\bigr)^T$; maximum value: $p(x_\star) = \tfrac{16}{11}$. (b) There is no maximum, since the coefficient matrix is not negative definite.

4.2.8. False. Even in the scalar case, p1 (x) = x2 has minimum at x⋆1 = 0, while p2 (x) = x2 − 2 x has minimum at x⋆2 = 1, but the minimum of p1 (x) + p2 (x) = 2 x2 − 2 x is at x⋆ = 21 6= x⋆1 + x⋆2 . ♦ 4.2.9. Let x⋆ = K −1 f be the minimizer. When c = 0, according to the third expression in (4.12), p(x⋆ ) = − (x⋆ )T K x⋆ ≤ 0 because K is positive definite. The minimum value is 0 if and only if x⋆ = 0, which occurs if and only if f = 0. 4.2.10. First, using Exercise 3.4.20, we rewrite q(x) = xT K x where K = K T is symmetric. If K is positive definite or positive semi-definite, then the minimum value is 0, attained when x⋆ = 0 (and other points in the semi-definite cases). Otherwise, there is at least one vector v for which q(v) = vT K v = a < 0. Then q(t v) = t2 a can be made arbitrarily large negative for t ≫ 0; in this case, there is no minimum value. 4.2.11. If and only if f = 0 and the function is constant, in which case every x is a minimizer. ♦ 4.2.12. p(x) has a maximum if and only if − p(x) has a minimum. Thus, we require either K is negative definite, or negative semi-definite with f ∈ rng K. The maximizer x⋆ is obtained by solving K x⋆ = f , and the maximum value p(x⋆ ) is given as before by any of the expressions in (4.12). 4.2.13. The complex numbers do not have an ordering, i.e., an inequality z < w doesn’t make any sense. Thus, there is no “minimum” of a set of complex numbers. (One can, of course, minimize the modulus of a set of complex numbers, but this places us back in the situation of minimizing a real-valued function.)
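The recipe used throughout Exercises 4.2.5 to 4.2.12 (check that K is positive definite, solve K x⋆ = f, then evaluate the minimum) can be sketched as follows. This is an illustrative sketch under stated assumptions: the quadratic is written as $p(x) = x^T K x - 2\,x^T f + c$, the function name `minimize_quadratic` is hypothetical, and the semi-definite case with f ∈ rng K is deliberately not handled (the function simply reports no minimizer).

```python
from fractions import Fraction

def minimize_quadratic(K, f, c=0):
    """Minimize p(x) = x^T K x - 2 x^T f + c for symmetric K.

    Runs regular Gaussian elimination on [K | f] in exact arithmetic.
    Returns (minimizer, minimum value), or None when some pivot is not
    positive, i.e. K is not positive definite.
    """
    n = len(f)
    M = [[Fraction(K[i][j]) for j in range(n)] + [Fraction(f[i])]
         for i in range(n)]
    for j in range(n):
        if M[j][j] <= 0:
            return None
        for i in range(j + 1, n):
            factor = M[i][j] / M[j][j]
            for k in range(j, n + 1):
                M[i][k] -= factor * M[j][k]
    # Back substitution for x_star.
    x = [Fraction(0)] * n
    for i in reversed(range(n)):
        s = M[i][n] - sum(M[i][k] * x[k] for k in range(i + 1, n))
        x[i] = s / M[i][i]
    # Minimum value p(x_star) = c - f . x_star.
    value = Fraction(c) - sum(Fraction(fi) * xi for fi, xi in zip(f, x))
    return x, value

# Exercise 4.2.5(a): p = 4x^2 - 24xy + 45y^2 + x - 4y + 3.
print(minimize_quadratic([[4, -12], [-12, 45]], [Fraction(-1, 2), 2], 3))
# Exercise 4.2.5(b): K is not positive definite, so the result is None.
print(minimize_quadratic([[3, 2], [2, 1]], [4, 1], 0))
```

Exact rational arithmetic is used here only so that the output can be compared directly with the fractions quoted in the solutions; floating point would serve equally well for numerical work.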

4.3.1. Closest point: $\bigl(\tfrac67, \tfrac{38}{35}, \tfrac{36}{35}\bigr)^T \approx (.85714, 1.08571, 1.02857)^T$; distance: $\tfrac{1}{\sqrt{35}} \approx .16903$.

4.3.2. (a) Closest point: $(.8343, 1.0497, 1.0221)^T$; distance: .2575. (b) Closest point: $(.8571, 1.0714, 1.0714)^T$; distance: .2673.

4.3.3. $\bigl(\tfrac79, \tfrac89, \tfrac{11}{9}\bigr)^T \approx (.7778, .8889, 1.2222)^T$.

4.3.4. (a) Closest point: (b) Closest point: (c) Closest point: (d) Closest point:

q ” 7 7 7 7 T 11 ; distance: , , , 4 4 4 4 4 . q ”T “ 5 3 3 ; distance: 2, 2, 2 , 2 2. T ( 3, 1, 2, 0 ) ; distance: 1. ” “ 3 1 3 T 5 ; distance: 27 . 4,−4, 4,−4 “

4.3.5. Since the vectors are linearly dependent, one must first reduce to a basis consisting of the first two. The closest point is 4.3.6. (i) 4.3.4: (a) Closest point:

4.3.5: (ii) 4.3.4:

distance:

4.3.8. (a)

“

q

1 1 1 1 2, 2, 2, 2

”T

and the distance is 3.

q ” 3 3 3 3 T 7 ; distance: 2, 2, 2, 2 4. q ” “ 5 5 5 4 4 T ; distance: (b) Closest point: 3 , 3 , 3 , 3 3. T (c) Closest point: ( 3, 1, 2, 0 ) ; distance: 1. “ ”T (d) Closest point: 32 , − 16 , − 13 , − 13 ; distance: √7 . 6 ”T “ 4 1 3 13 , 22 , − 11 , − 22 Closest point: 11 ≈ ( .2727, .5909, −.3636, −.0455 )T ; q 155 distance: 22 ≈ 2.6543. ” “ 25 25 25 25 T , 14 , 14 , 14 ≈ ( 1.7857, 1.7857, 1.7857, 1.7857 )T ; (a) Closest point: 14 q 215 distance: 14 ≈ 3.9188. ” “ 66 66 59 59 T ≈ ( 1.8857, 1.8857, 1.6857, 1.6857 )T ; (b) Closest point: 35 , 35 , 35 , 35 q 534 distance: 35 ≈ 3.9060. “ ”T 28 11 16 (c) Closest point: , 9 , 9 ,0 ≈ ( 3.1111, 1.2222, 1.7778, 0 )T ; 9q 32 distance: 9 ≈ 1.8856. ”T “ √ ; distance: 42. (d) Closest point: 23 , −1, 0, − 21

6 3 3 3 5, 5, 2, 2 8 3;

(b)

”T

“

“

26 292 159 , − 107 259 , − 259 , 259 q 259 8 143 259 ≈ 5.9444.

4.3.5: Closest point:

4.3.7. v =

“

”T

≈ ( .1004, −.4131, 1.1274, .6139 )T ;

= ( 1.2, .6, 1.5, 1.5 )T .

√7 . 6

♥ 4.3.9. T (a) P 2 = A(AT A)−1 AT A(AT A)−1 AT = A(A A)−1 AT = P . 0 1 1 1 0 1 ! 1 1 6 3 −6 B − 1 0 B 1 2 1 2 2 A, (ii) , (iii) B (b) (i) @ 1 3 3 −3 @ 0 1 − 12 1 2 − 61 − 13 6

1

C C C, A

(iv )

0 B B B @

− −

5 6 1 6 1 3

− 16

5 6 − 13

− 13 − 13 1 3

1

C C C. A

(c) $P^T = (A(A^TA)^{-1}A^T)^T = A((A^TA)^T)^{-1}A^T = A(A^TA)^{-1}A^T = P$.
(d) Given v = P b ∈ rng P, then v = A x where $x = (A^TA)^{-1}A^T b$ and so v ∈ rng A. Conversely, if v = A x ∈ rng A, then we can write v = P b ∈ rng P where b solves the linear system $A^T b = (A^TA)x \in \mathbb{R}^n$, which has a solution since rank $A^T$ = n and so rng $A^T = \mathbb{R}^n$.
(e) According to the formulas in the section, the closest point is w = A x where $x = K^{-1}A^T b = (A^TA)^{-1}A^T b$ and so $w = A(A^TA)^{-1}A^T b = P b$.
(f) If A is nonsingular, then $(A^TA)^{-1} = A^{-1}A^{-T}$, hence $P = A(A^TA)^{-1}A^T = A\,A^{-1}A^{-T}A^T = I$. In this case, the columns of A span $\mathbb{R}^n$ and the closest point to any $b \in \mathbb{R}^n$ is b = P b itself.

♦ 4.3.10.
(a) The quadratic function to be minimized is
$p(x) = \sum_{i=1}^n \|x - a_i\|^2 = n\,\|x\|^2 - 2\,x\cdot\Bigl(\sum_{i=1}^n a_i\Bigr) + \sum_{i=1}^n \|a_i\|^2$,
which has the form (4.22) with $K = n\,I$, $f = \sum_{i=1}^n a_i$, $c = \sum_{i=1}^n \|a_i\|^2$. Therefore, the minimizing point is $x = K^{-1}f = \dfrac1n \sum_{i=1}^n a_i$, which is the center of mass of the points.
(b) (i) $\bigl(-\tfrac12, 4\bigr)$, (ii) $\bigl(\tfrac13, \tfrac13\bigr)$, (iii) $\bigl(-\tfrac14, \tfrac34\bigr)$.

4.3.11. In general, for the norm based on the positive definite matrix C, the quadratic function to be minimized is
$p(x) = \sum_{i=1}^n \|x - a_i\|^2 = n\,\|x\|^2 - 2n\,\langle x, b\rangle + n\,c = n\,\bigl(x^T C x - 2\,x^T C b + c\bigr)$,
where $b = \dfrac1n \sum_{i=1}^n a_i$ is the center of mass and $c = \dfrac1n \sum_{i=1}^n \|a_i\|^2$. Therefore, the minimizing point is still the center of mass: $x = C^{-1}C b = b$. Thus, all answers are the same as in Exercise 4.3.10.

4.3.12. The Cauchy–Schwarz inequality guarantees this.

♦ 4.3.13.
$\|v - b\|^2 = \|A x - b\|^2 = (A x - b)^T C\,(A x - b) = (x^T A^T - b^T)\,C\,(A x - b)$
$= x^T A^T C A x - x^T A^T C b - b^T C A x + b^T C b = x^T A^T C A x - 2\,x^T A^T C b + b^T C b = x^T K x - 2\,x^T f + c$,
with $K = A^T C A$, $f = A^T C b$, $c = b^T C b$, where the scalar quantity $x^T A^T C b = (x^T A^T C b)^T = b^T C A x$ since $C^T = C$.

4.3.14. (a) x =

1 15 , y

(d) x 4.3.15. (a)

1 2,

(b)

0

41 45 ; = 13 , y

=

1

8 @ 5 A 28 65

=

1 8 (b) x = − 25 , y = − 21 ;

= 2, z = 34 ; !

(c) u = 23 , v = 53 , w = 1;

(e) x1 = 13 , x2 = 2, x3 = − 13 , x4 = − 34 .

1.6 , (c) .4308

0 B B B @

2 5 1 5

0

1

C C C, A

(d)

1 227 @ 941 A 304 941 0

=

!

0

1

.0414 .2412 C , (e) B @ −.0680 A. .3231 −.0990

4.3.16. The solution is $(1, 0, 3)^T$. If A is nonsingular, then the least squares solution to the system A x = b is given by $x_\star = K^{-1}f = (A^TA)^{-1}A^T b = A^{-1}A^{-T}A^T b = A^{-1}b$, which coincides with the ordinary solution.

4.3.17. The solution is x⋆ = ( −1, 2, 3 )T . The least squares error is 0 because b ∈ rng A and so x⋆ is an exact solution. ♦ 4.3.18. (a) This follows from Exercise 3.4.31 since f ∈ corng A = rng K. (b) This follows from Theorem 4.4. (c) Because if z ∈ ker A = ker K, then x + z is also a solution to the normal equations, and has the same minimum value for the least squares error.

4.4.1. (a) $y = \tfrac{12}{7} + \tfrac{12}{7}\,t = 1.7143\,(1 + t)$; (b) y = 1.9 − 1.1 t; (c) y = −1.4 + 1.9 t.

4.4.2. (a) y = 30.6504 + 2.9675 t; (b) [Plot omitted: the data points and the least squares line.] (c) profit: $179,024; (d) profit: $327,398.

4.4.3. (a) y = 3.9227 t − 7717.7; (b) $147,359 and $166,973.

♥ 4.4.4. (a) y = 2.2774 t − 4375; (b) 179.75 and 191.14; (c) $y = e^{.0183\,t - 31.3571}$, with estimates 189.79 and 207.98. (d) The linear model has a smaller least squares error between its predictions and the data, 6.4422 versus 10.4470 for the exponential model, and also a smaller maximal error, namely 4.3679 versus 6.1943.

4.4.5. Assuming a linear increase in temperature, the least squares fit is y = 71.6 + .405 t, which equals 165 at t = 230.62 minutes, so you need to wait another 170.62 minutes, just under three hours.

4.4.6. (a) The least squares exponential is $y = e^{4.6051 - .1903\,t}$ and, at t = 10, y = 14.9059. (b) Solving $e^{4.6051 - .1903\,t} = .01$, we find t = 48.3897 ≈ 49 days.
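The exponential fits quoted above are of the form $y = e^{a + b\,t}$. One common way to obtain such a fit, assumed here for illustration, is to apply the least squares line to the data (t, log y). The following Python sketch uses hypothetical data generated from an exact exponential, not the data of the exercises, and the function names are illustrative.

```python
import math

def fit_line(ts, ys):
    """Least squares line y = a + b t; returns (a, b)."""
    m = len(ts)
    t_bar = sum(ts) / m
    y_bar = sum(ys) / m
    slope = (sum((t - t_bar) * (y - y_bar) for t, y in zip(ts, ys))
             / sum((t - t_bar) ** 2 for t in ts))
    return y_bar - slope * t_bar, slope

def fit_exponential(ts, ys):
    """Fit y = exp(a + b t) by least squares on log y; returns (a, b)."""
    return fit_line(ts, [math.log(y) for y in ys])

def half_life(b):
    """Half-life log 2 / |b| of a decaying exponential exp(a + b t), b < 0."""
    return math.log(2) / -b

# Hypothetical data sampled from y = exp(2 - 0.5 t).
ts = [0, 1, 2, 3, 4]
ys = [math.exp(2 - 0.5 * t) for t in ts]
a, b = fit_exponential(ts, ys)
print(a, b, half_life(b))
```

Note that fitting the logarithms minimizes the squared error in log y rather than in y itself, which is one reason a linear and an exponential model can rank differently under the two error measures.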

4.4.7. (a) The least squares exponential is $y = e^{2.2773 - .0265\,t}$. The half-life is log 2/.0265 = 26.1376 days. (b) Solving $e^{2.2773 - .0265\,t} = .01$, we find t = 259.5292 ≈ 260 days.

4.4.8. (a) The least squares exponential is $y = e^{.0132\,t - 20.6443}$, giving the population values (in millions) y(2000) = 296, y(2010) = 337, y(2050) = 571. (b) The revised least squares exponential is $y = e^{.0127\,t - 19.6763}$, giving a smaller predicted value of y(2050) = 536.

4.4.9.
(a) The sample matrix for the functions 1, x, y is $A = \begin{pmatrix} 1 & 1 & 1 \\ 1 & 1 & 2 \\ 1 & 2 & 1 \\ 1 & 2 & 2 \\ 1 & 3 & 2 \\ 1 & 3 & 4 \end{pmatrix}$, while $z = (3, 6, 11, -2, 0, 3)^T$ is the data vector. The least squares solution to the normal equations $A^TA\,x = A^T z$ for $x = (a, b, c)^T$ gives the plane z = 6.9667 − .8 x − .9333 y.
(b) Every plane going through $(2, 2, 0)^T$ has an equation z = a(x − 2) + b(y − 2), i.e., a linear combination of the functions x − 2, y − 2. The sample matrix is $A = \begin{pmatrix} -1 & -1 \\ -1 & 0 \\ 0 & -1 \\ 0 & 0 \\ 1 & 0 \\ 1 & 2 \end{pmatrix}$, and the least squares solution gives the plane $z = -\tfrac45\,(x - 2) - \tfrac{14}{15}\,(y - 2) = 3.4667 - .8\,x - .9333\,y$.

♦ 4.4.10. For two data points $t_1 = a$, $t_2 = b$, we have
$\overline{t^2} = \tfrac12\,(a^2 + b^2)$, while $(\,\overline{t}\,)^2 = \tfrac14\,(a^2 + 2ab + b^2)$,
which are equal if and only if a = b. (For more data points, there is a single quadratic condition for equality.) Similarly, if $y_1 = p$, $y_2 = q$, then
$\overline{t\,y} = \tfrac12\,(a\,p + b\,q)$, while $\overline{t}\;\overline{y} = \tfrac14\,(a + b)(p + q)$.
Fixing a ≠ b, these are equal if and only if p = q.

♦ 4.4.11.
$\dfrac1m \sum_{i=1}^m (t_i - \overline{t}\,)^2 = \dfrac1m \sum_{i=1}^m t_i^2 - \dfrac{2\,\overline{t}}{m} \sum_{i=1}^m t_i + \dfrac{(\,\overline{t}\,)^2}{m} \sum_{i=1}^m 1 = \overline{t^2} - 2\,(\,\overline{t}\,)^2 + (\,\overline{t}\,)^2 = \overline{t^2} - (\,\overline{t}\,)^2$.
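Returning to the plane fit of Exercise 4.4.9(a), the normal equations $A^TA\,x = A^T z$ can be formed and solved in a few lines. The sketch below is illustrative only: the sample points and data vector are the ones read off from the sample matrix in that exercise, and the helper names `solve` and `least_squares` are hypothetical.

```python
from fractions import Fraction

def solve(M, rhs):
    """Gauss-Jordan elimination in exact arithmetic; assumes M is nonsingular."""
    n = len(rhs)
    aug = [[Fraction(v) for v in row] + [Fraction(b)] for row, b in zip(M, rhs)]
    for j in range(n):
        # Swap in a row with a nonzero entry in column j.
        p = next(i for i in range(j, n) if aug[i][j] != 0)
        aug[j], aug[p] = aug[p], aug[j]
        pivot = aug[j][j]
        aug[j] = [v / pivot for v in aug[j]]
        for i in range(n):
            if i != j:
                factor = aug[i][j]
                aug[i] = [vi - factor * vj for vi, vj in zip(aug[i], aug[j])]
    return [row[n] for row in aug]

def least_squares(A, z):
    """Solve the normal equations A^T A x = A^T z."""
    n = len(A[0])
    AtA = [[sum(row[i] * row[j] for row in A) for j in range(n)] for i in range(n)]
    Atz = [sum(row[i] * zi for row, zi in zip(A, z)) for i in range(n)]
    return solve(AtA, Atz)

points = [(1, 1), (1, 2), (2, 1), (2, 2), (3, 2), (3, 4)]
z = [3, 6, 11, -2, 0, 3]
A = [[1, x, y] for x, y in points]   # columns sample the functions 1, x, y
a, b, c = least_squares(A, z)
print(a, b, c)   # 209/30 -4/5 -14/15
```

The exact values 209/30 ≈ 6.9667, −4/5 and −14/15 ≈ −.9333 agree with the decimal coefficients of the plane quoted in the solution.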

4.4.12.
(a) $p(t) = -\tfrac15\,(t - 2) + (t + 3) = \tfrac{17}{5} + \tfrac45\,t$,
(b) $p(t) = \tfrac13\,(t - 1)(t - 3) - \tfrac14\,t\,(t - 3) + \tfrac{1}{24}\,t\,(t - 1) = 1 - \tfrac58\,t + \tfrac18\,t^2$,
(c) $p(t) = \tfrac12\,t\,(t - 1) - 2\,(t - 1)(t + 1) - \tfrac12\,(t + 1)\,t = 2 - t - 2\,t^2$,
(d) $p(t) = \tfrac12\,t\,(t - 2)(t - 3) - 2\,t\,(t - 1)(t - 3) + \tfrac32\,t\,(t - 1)(t - 2) = t^2$,
(e) $p(t) = -\tfrac{1}{24}\,(t + 1)\,t\,(t - 1)(t - 2) + \tfrac13\,(t + 2)\,t\,(t - 1)(t - 2) + \tfrac12\,(t + 2)(t + 1)(t - 1)(t - 2) - \tfrac16\,(t + 2)(t + 1)\,t\,(t - 2) + \tfrac18\,(t + 2)(t + 1)\,t\,(t - 1) = 2 + \tfrac53\,t - \tfrac{13}{4}\,t^2 - \tfrac16\,t^3 + \tfrac34\,t^4$.

4.4.13.
(a) y = 2 t − 7
(b) $y = t^2 + 3\,t + 6$
(c) y = −2 t − 1
(d) $y = -t^3 + 2\,t^2 - 1$
(e) $y = t^4 - t^3 + 2\,t - 3$
[Plots omitted: graph of each interpolating polynomial through its data points.]
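The Lagrange form used in Exercise 4.4.12 can be evaluated directly from the data. The following Python sketch is illustrative: the function name is hypothetical, and the data (−1, 1), (0, 2), (1, −1) are inferred from the coefficients appearing in part (c) rather than quoted from the exercise statement.

```python
from fractions import Fraction

def lagrange_eval(nodes, values, t):
    """Evaluate the Lagrange interpolating polynomial at t.

    p(t) = sum_k y_k * prod_{j != k} (t - t_j) / (t_k - t_j).
    Inputs must be integers or Fractions so that the arithmetic stays exact.
    """
    total = Fraction(0)
    for k, (tk, yk) in enumerate(zip(nodes, values)):
        term = Fraction(yk)
        for j, tj in enumerate(nodes):
            if j != k:
                term *= Fraction(t - tj, tk - tj)
        total += term
    return total

nodes = [-1, 0, 1]
values = [1, 2, -1]          # inferred data for part (c)
for t in [-2, Fraction(1, 2), 3]:
    # Compare with the expanded form 2 - t - 2 t^2 from the solution.
    print(t, lagrange_eval(nodes, values, t), 2 - t - 2 * t * t)
```

Evaluating the product form directly avoids expanding the polynomial, at the cost of O(n²) work per evaluation point.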

4.4.14. (a) $y = -\tfrac43 + 4\,t$. (b) $y = 2 + t^2$. The error is zero because the parabola interpolates the points exactly.

4.4.15.
(a) y = 2.0384 + .6055 t
(b) $y = 2.14127 + .547248\,t + .005374\,t^2$
(c) $y = 2.63492 + .916799\,t - .796131\,t^2 + .277116\,t^3 - .034102\,t^4 + .001397\,t^5$
[Plots omitted: the data points with each of the three fitted polynomials.]

(d) The linear and quadratic models are practically identical, with almost the same least squares errors: .729045 and .721432, respectively. The fifth order interpolating polynomial, of course, has 0 least squares error since it goes exactly through the data points. On the other hand, it has to twist so much to do this that it is highly unlikely to be the correct theoretical model. Thus, one strongly suspects that this experimental data comes from a linear model.

4.4.16. The quadratic least squares polynomial is $y = 4480.5 + 6.05\,t - 1.825\,t^2$, and y = 1500 at 42.1038 seconds.

4.4.17. The quadratic least squares polynomial is $y = 175.5357 + 56.3625\,t - .7241\,t^2$, and y = 0 at 80.8361 seconds.

4.4.18. (a) $p_2(t) = 1 + t + \tfrac12\,t^2$, $p_4(t) = 1 + t + \tfrac12\,t^2 + \tfrac16\,t^3 + \tfrac{1}{24}\,t^4$; (b) The maximal error for $p_2(t)$ over the interval [0, 1] is .218282, while for $p_4(t)$ it is .0099485. The Taylor polynomials do a much better job near t = 0, but become significantly worse at larger values of t; the least squares approximants are better over the entire interval.

♥ 4.4.19. Note: In this solution t is measured in degrees! (Alternatively, one can set up and solve the problem in radians.) The error is the $L^\infty$ norm of the difference sin t − p(t) on the interval 0 ≤ t ≤ 60.
(a) p(t) = .0146352 t + .0243439; maximum error ≈ .0373.
(b) $p(t) = -.000534346 + .0191133\,t - .0000773989\,t^2$; maximum error ≈ .00996.
(c) $p(t) = \tfrac{\pi}{180}\,t$; maximum error ≈ .181.
(d) $p(t) = .0175934\,t - 9.1214 \times 10^{-6}\,t^2 - 7.25655 \times 10^{-7}\,t^3$; maximum error ≈ .000649.
(e) $p(t) = \tfrac{\pi}{180}\,t - \tfrac16\,\bigl(\tfrac{\pi}{180}\,t\bigr)^3$; maximum error ≈ .0102.
(f) The Taylor polynomials do a much better job at the beginning of the interval, but the least squares approximants are better over the entire range.

4.4.20. (a) For equally spaced data points, the least squares line is y = .1617 + .9263 t with a maximal error of .1617 on the interval 0 ≤ t ≤ 1. (b) The least squares quadratic polynomial is $y = .0444 + 1.8057\,t - .8794\,t^2$ with a slightly better maximal error of .1002. Interestingly, the line is closer to $\sqrt t$ over a larger fraction of the interval than the quadratic polynomial, and only does significantly worse near t = 0.
Alternative solution: (a) The data points $0, \tfrac{1}{25}, \tfrac{1}{16}, \tfrac19, \tfrac14, 1$ have exact square roots $0, \tfrac15, \tfrac14, \tfrac13, \tfrac12, 1$. For these, we obtain the least squares line y = .1685 + .8691 t, with a maximal error of .1685. (b) The least squares quadratic polynomial is $y = .0773 + 2.1518\,t - 1.2308\,t^2$ with, strangely, a worse maximal error of .1509, although it does do better over a larger fraction of the interval.

4.4.21. $p(t) = .9409\,t + .4566\,t^2 - .7732\,t^3 + .9330\,t^4$. The graphs are very close over the interval 0 ≤ t ≤ 1; the maximum error is .005144 at t = .91916.
The functions rapidly diverge above 1, with tan t → ∞ as $t \to \tfrac12\pi$, whereas $p(\tfrac12\pi) = 5.2882$. The first graph is on the interval [0, 1] and the second on $[0, \tfrac12\pi]$:
[Graphs omitted: tan t and p(t) on each of the two intervals.]

4.4.22. The exact value is log 10 e ≈ .434294. (a) p2 (t) = − .4259 + .48835 t − .06245 t2 and p2 (e) = .440126; (b) p3 (t) = − .4997 + .62365 t − .13625 t2 + .0123 t3 and p2 (e) = .43585. ♦ 4.4.23. (a) q(t) = α + β t + γ t2 where y t t (t − t1 ) + y1 t2 t0 (t0 − t2 ) + y2 t0 t1 (t1 − t0 ) α=− 0 1 2 2 , (t1 − t0 )(t2 − t1 )(t0 − t2 )

y0 (t22 − t21 ) + y1 (t20 − t22 ) + y2 (t21 − t20 ) y (t − t1 ) + y1 (t0 − t2 ) + y2 (t1 − t0 ) , γ=− 0 2 . (t1 − t0 )(t2 − t1 )(t0 − t2 ) (t1 − t0 )(t2 − t1 )(t0 − t2 ) β (b) The minimum is at t⋆ = − , and m0 s1 − m1 s0 = − 12 β(t2 − t0 ), s1 − s0 = γ(t2 − t0 ). 2γ y02 (t2 − t1 )4 + y12 (t0 − t2 )4 + y22 (t1 − t0 )4 β2 h i. =− (c) q(t⋆ ) = α− 4γ 4 (t1 − t0 )(t2 − t1 )(t0 − t2 ) y0 (t2 − t1 ) + y1 (t0 − t2 ) + y2 (t1 − t0 )

β=


♠ 4.4.24. When a < 2, the approximations are very good. At a = 2, a small amount of oscillation is noticed at the two ends of the intervals. When a > 2, the approximations are worthless for | x | > 2. The graphs are for n + 1 = 21 interpolation points, with a = 1.5, 2, 2.5, 3:
[Graphs omitted: the four interpolants for a = 1.5, 2, 2.5, 3.]

Note: Choosing a large number of sample points, say n = 50, leads to an ill-conditioned matrix, and even the small values of a exhibit poor approximation properties near the ends of the intervals due to round-off errors when solving the linear system.

♠ 4.4.25. The conclusions are similar to those in Exercise 4.4.24, but here the critical value of a is around 2.4. The graphs are for n + 1 = 21 interpolation points, with a = 2, 2.5, 3, 4:
[Graphs omitted: the four interpolants for a = 2, 2.5, 3, 4.]

4.4.26. x ∈ ker A if and only if p(t) vanishes at all the sample points: $p(t_i) = 0$, i = 1, . . . , m.

4.4.27. (a) For example, the interpolating polynomial for the data (0, 0), (1, 1), (2, 2) is the straight line y = t. (b) The Lagrange interpolating polynomials are zero at n of the sample points. But the only polynomial of degree < n that vanishes at n points is the zero polynomial, which does not interpolate the final nonzero data value.

♦ 4.4.28.
(a) If $p(x_k) = a_0 + a_1 x_k + a_2 x_k^2 + \cdots + a_n x_k^n = 0$ for k = 1, . . . , n + 1, then V a = 0 where V is the (n + 1) × (n + 1) Vandermonde matrix with entries $v_{ij} = x_i^{\,j-1}$ for i, j = 1, . . . , n + 1. According to Lemma 4.12, if the sample points are distinct, then V is a nonsingular matrix, and hence the only solution to the homogeneous linear system is a = 0, which implies p(x) ≡ 0.
(b) This is a special case of Exercise 2.3.37.
(c) This follows from part (b); linear independence of $1, x, x^2, \ldots, x^n$ means that $p(x) = a_0 + a_1 x + a_2 x^2 + \cdots + a_n x^n \equiv 0$ if and only if $a_0 = \cdots = a_n = 0$.

♦ 4.4.29. This follows immediately from (4.51), since the determinant of a regular matrix is the product of the pivots, i.e., the diagonal entries of U. Every factor $t_i - t_j$ appears once among the pivot entries.

4.4.30. Note that $k_{ij} = 1 + x_i x_j + (x_i x_j)^2 + \cdots + (x_i x_j)^{n-1}$ is the dot product of the i-th and j-th columns of the n × n Vandermonde matrix $V = V(x_1, \ldots, x_n)$, and so $K = V^T V$ is a Gram matrix. Moreover, V is nonsingular when the $x_i$'s are distinct, which proves positive definiteness.
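The Gram matrix argument of Exercise 4.4.30 can be checked on small examples. In the sketch below (illustrative only, with hypothetical function names), V is stored so that its i-th column is $(1, x_i, x_i^2, \ldots, x_i^{n-1})^T$, an assumed layout chosen so that $K = V^T V$ has the stated entries; positive definiteness is tested by checking that regular Gaussian elimination produces n positive pivots.

```python
from fractions import Fraction

def vandermonde_gram(xs):
    """Return (V, K) with V[k][i] = xs[i]**k and K = V^T V."""
    n = len(xs)
    V = [[Fraction(x) ** k for x in xs] for k in range(n)]
    K = [[sum(V[k][i] * V[k][j] for k in range(n)) for j in range(n)]
         for i in range(n)]
    return V, K

def pivots(K):
    """Pivots of regular Gaussian elimination; stops early at a zero pivot."""
    n = len(K)
    M = [row[:] for row in K]
    found = []
    for j in range(n):
        if M[j][j] == 0:
            return found          # regular elimination breaks down
        found.append(M[j][j])
        for i in range(j + 1, n):
            factor = M[i][j] / M[j][j]
            M[i] = [a - factor * b for a, b in zip(M[i], M[j])]
    return found

V, K = vandermonde_gram([0, 1, 2])          # distinct sample points
print(K, pivots(K))
V_rep, K_rep = vandermonde_gram([1, 1, 2])  # repeated point: K is singular
print(pivots(K_rep))
```

For the distinct points 0, 1, 2 all three pivots are positive, while the repeated point makes two columns of V coincide, so elimination stops before producing a full set of pivots.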

♥ 4.4.31.
(a) $f'(x) \approx \dfrac{f(x + h) - f(x - h)}{2h}$;
(b) $f''(x) \approx \dfrac{f(x + h) - 2f(x) + f(x - h)}{h^2}$;
(c) $f'(x) \approx \dfrac{-f(x + 2h) + 4f(x + h) - 3f(x)}{2h}$;
(d) $f'(x) \approx \dfrac{-f(x + 2h) + 8f(x + h) - 8f(x - h) + f(x - 2h)}{12h}$,
$f''(x) \approx \dfrac{-f(x + 2h) + 16f(x + h) - 30f(x) + 16f(x - h) - f(x - 2h)}{12h^2}$,
$f'''(x) \approx \dfrac{f(x + 2h) - 2f(x + h) + 2f(x - h) - f(x - 2h)}{2h^3}$,
$f^{(iv)}(x) \approx \dfrac{f(x + 2h) - 4f(x + h) + 6f(x) - 4f(x - h) + f(x - 2h)}{h^4}$.

(e) For $f(x) = e^x$ at x = 0, using single precision arithmetic, we obtain the following approximations, listed in the order of the formulas (a), (b), (c), (d):
For h = .1:
(a) f′(x) ≈ 1.00166750019844; (b) f″(x) ≈ 1.00083361116072; (c) f′(x) ≈ .99640457071210;
(d) f′(x) ≈ .99999666269610, f″(x) ≈ .99999888789639, f‴(x) ≈ 1.00250250140590, f(iv)(x) ≈ 1.00166791722567.
For h = .01:
(a) f′(x) ≈ 1.00001666675000; (b) f″(x) ≈ 1.00000833336050; (c) f′(x) ≈ .99996641549580;
(d) f′(x) ≈ .99999999966665, f″(x) ≈ .99999999988923, f‴(x) ≈ 1.00002500040157, f(iv)(x) ≈ 1.00001665913362.
For h = .001:
(a) f′(x) ≈ 1.00000016666670; (b) f″(x) ≈ 1.00000008336730; (c) f′(x) ≈ .99999966641660;
(d) f′(x) ≈ .99999999999997, f″(x) ≈ 1.00000000002574, f‴(x) ≈ 1.00000021522190, f(iv)(x) ≈ .99969229415631.
For h = .0001:
(a) f′(x) ≈ 1.00000000166730; (b) f″(x) ≈ 1.00000001191154; (c) f′(x) ≈ 0.99999999666713;
(d) f′(x) ≈ 0.99999999999977, f″(x) ≈ 0.99999999171286, f‴(x) ≈ 0.99998010138450, f(iv)(x) ≈ −3.43719068489236.

When f(x) = tan x at x = 0, using single precision arithmetic:
For h = .1:
(a) f′(x) ≈ 1.00334672085451; (b) f″(x) ≈ 3.505153914964787 × 10^−16; (c) f′(x) ≈ .99314326416565;
(d) f′(x) ≈ .99994556862489, f″(x) ≈ 4.199852359823657 × 10^−15, f‴(x) ≈ 2.04069133777138, f(iv)(x) ≈ −1.144995345400589 × 10^−12.
For h = .01:
(a) f′(x) ≈ 1.00003333466672; (b) f″(x) ≈ −2.470246229790973 × 10^−15; (c) f′(x) ≈ .99993331466332;
(d) f′(x) ≈ .99999999466559, f″(x) ≈ −2.023971870815236 × 10^−14, f‴(x) ≈ 2.00040006801198, f(iv)(x) ≈ 7.917000388601991 × 10^−10.
For h = .001:
(a) f′(x) ≈ 1.00000033333347; (b) f″(x) ≈ −3.497202527569243 × 10^−15; (c) f′(x) ≈ .99999933333147;
(d) f′(x) ≈ .99999999999947, f″(x) ≈ 5.042978979811304 × 10^−15, f‴(x) ≈ 2.00000400014065, f(iv)(x) ≈ 1.010435775508413 × 10^−7.
For h = .0001:
(a) f′(x) ≈ 1.00000000333333; (b) f″(x) ≈ 4.271860643001446 × 10^−13; (c) f′(x) ≈ .99999999333333;
(d) f′(x) ≈ 1.00000000000000, f″(x) ≈ 9.625811347808891 × 10^−13, f‴(x) ≈ 2.00000003874818, f(iv)(x) ≈ −8.362027156624034 × 10^−4.

In most cases, the accuracy improves as the step size gets smaller, but not always. In particular, at the smallest step size, the approximation to the fourth derivative gets worse, indicating the increasing role played by round-off error.
(f) No: if the step size is too small, round-off error caused by dividing two very small quantities ruins the approximation.
(g) If n < k, the k-th derivative of the degree n interpolating polynomial is identically 0.

♥ 4.4.32.
(a) Trapezoid Rule: $\int_a^b f(x)\,dx \approx \tfrac12\,(b - a)\,\bigl[\,f(x_0) + f(x_1)\,\bigr]$.
(b) Simpson's Rule: $\int_a^b f(x)\,dx \approx \tfrac16\,(b - a)\,\bigl[\,f(x_0) + 4f(x_1) + f(x_2)\,\bigr]$.
(c) Simpson's 3/8 Rule: $\int_a^b f(x)\,dx \approx \tfrac18\,(b - a)\,\bigl[\,f(x_0) + 3f(x_1) + 3f(x_2) + f(x_3)\,\bigr]$.
(d) Midpoint Rule: $\int_a^b f(x)\,dx \approx (b - a)\,f(x_0)$.
(e) Open Rule: $\int_a^b f(x)\,dx \approx \tfrac12\,(b - a)\,\bigl[\,f(x_0) + f(x_1)\,\bigr]$.

(f) (i) Exact: 1.71828; Trapezoid Rule: 1.85914; Simpson's Rule: 1.71886; Simpson's 3/8 Rule: 1.71854; Midpoint Rule: 1.64872; Open Rule: 1.67167.
(ii) Exact: 2.0000; Trapezoid Rule: 0.; Simpson's Rule: 2.0944; Simpson's 3/8 Rule: 2.04052; Midpoint Rule: 3.14159; Open Rule: 2.7207.
(iii) Exact: 1.0000; Trapezoid Rule: .859141; Simpson's Rule: .996735; Simpson's 3/8 Rule: .93804; Midpoint Rule: 1.06553; Open Rule: .964339.
(iv) Exact: 1.11145; Trapezoid Rule: 1.20711; Simpson's Rule: 1.10948; Simpson's 3/8 Rule: 1.11061; Midpoint Rule: 1.06066; Open Rule: 1.07845.
Note: For more details on numerical differentiation and integration, you are encouraged to consult a basic numerical analysis text, e.g., [ 10 ].
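The five rules of Exercise 4.4.32 can be sketched in Python as follows. The node placement is an assumption made for illustration: equally spaced nodes including the endpoints for the Trapezoid, Simpson and Simpson 3/8 rules, the midpoint of [a, b] for the Midpoint Rule, and the two interior third-points for the Open Rule. With $f(x) = e^x$ on [0, 1] and f(x) = sin x on [0, π], this choice reproduces the values listed in (f)(i) and (f)(ii).

```python
import math

def trapezoid(f, a, b):
    return (b - a) / 2 * (f(a) + f(b))

def simpson(f, a, b):
    return (b - a) / 6 * (f(a) + 4 * f((a + b) / 2) + f(b))

def simpson_38(f, a, b):
    h = (b - a) / 3
    return (b - a) / 8 * (f(a) + 3 * f(a + h) + 3 * f(a + 2 * h) + f(b))

def midpoint(f, a, b):
    return (b - a) * f((a + b) / 2)

def open_rule(f, a, b):
    # Assumed nodes: the two interior points dividing [a, b] into thirds.
    h = (b - a) / 3
    return (b - a) / 2 * (f(a + h) + f(a + 2 * h))

rules = [trapezoid, simpson, simpson_38, midpoint, open_rule]
for rule in rules:
    print(rule.__name__, rule(math.exp, 0.0, 1.0), rule(math.sin, 0.0, math.pi))
```

Each rule is the exact integral of the polynomial interpolating f at its nodes, which is why the rules with more nodes are generally, though not always, more accurate.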

4.4.33. The sample matrix is $A = \begin{pmatrix} 1 & 0 \\ 0 & 1 \\ -1 & 0 \end{pmatrix}$; the least squares solution to $A x = y = (1, .5, .25)^T$ gives $g(t) = \tfrac38 \cos \pi t + \tfrac12 \sin \pi t$.

4.4.34. g(t) = .9827 cosh t − 1.0923 sinh t.

4.4.35. (a) $g(t) = .538642\,e^{t} - .004497\,e^{2t}$, (b) .735894. (c) The maximal error is .745159, which occurs at t = 3.66351. (d) Now the least squares approximant is $0.58165\,e^{t} - .0051466\,e^{2t} - .431624$; the least squares error has decreased to .486091, although the maximal error over the interval [0, 4] has increased to 1.00743, which occurs at t = 3.63383!

4.4.36.
(a) 5 points: g(t) = −4.4530 cos t + 3.4146 sin t = 5.6115 cos(t − 2.4874);
9 points: g(t) = −4.2284 cos t + 3.6560 sin t = 5.5898 cos(t − 2.4287).
(b) 5 points: g(t) = −4.9348 cos t + 5.5780 sin t + 4.3267 cos 2t + 1.0220 sin 2t = 7.4475 cos(t − 2.2952) + 4.4458 cos(2t − .2320);
9 points: g(t) = −4.8834 cos t + 5.2873 sin t + 3.6962 cos 2t + 1.0039 sin 2t = 7.1974 cos(t − 2.3165) + 3.8301 cos(2t − .2652).
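The amplitude and phase forms in Exercise 4.4.36 follow from the identity $a \cos t + b \sin t = r \cos(t - \delta)$ with $r = \sqrt{a^2 + b^2}$ and $\delta = \operatorname{atan2}(b, a)$. A minimal Python sketch (the function name is hypothetical) that applies it to the 5-point coefficients of part (a):

```python
import math

def amplitude_phase(a, b):
    """Return (r, delta) such that a cos t + b sin t = r cos(t - delta)."""
    return math.hypot(a, b), math.atan2(b, a)

# Coefficients of the 5-point fit in part (a).
r, delta = amplitude_phase(-4.4530, 3.4146)
print(round(r, 4), round(delta, 4))

# Spot check of the identity at an arbitrary value of t.
t = 0.7
print(-4.4530 * math.cos(t) + 3.4146 * math.sin(t), r * math.cos(t - delta))
```

Using `atan2` rather than a plain arctangent of b/a places δ in the correct quadrant when a is negative, as it is for these coefficients.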

♥ 4.4.37.
(a) n = 1, k = 4: p(t) = .4172 + .4540 cos t; maximal error: .1722;
(b) n = 2, k = 8: p(t) = .4014 + .3917 cos t + .1288 cos 2 t; maximal error: .0781;
(c) n = 2, k = 16: p(t) = .4017 + .389329 cos t + .1278 cos 2 t; maximal error: .0812;
(d) n = 3, k = 16: p(t) = .4017 + .3893 cos t + .1278 cos 2 t + .0537 cos 3 t; maximal error: .0275;
[Plots of the four approximants on − 3 ≤ t ≤ 3 omitted.]

(e) Because then, due to periodicity of the trigonometric functions, the columns of the sample matrix would be linearly dependent.

♥ 4.4.38. Since $S_k(x_j) = \begin{cases} 1, & j = k, \\ 0, & \text{otherwise,} \end{cases}$ the same as the Lagrange polynomials, the coefficients are $c_j = f(x_j)$. For each function and step size, we plot the sinc interpolant S(x) and a comparison with the graph of the function.

$f(x) = x^2$, h = .25, max error: .19078;
$f(x) = x^2$, h = .1, max error: .160495;
$f(x) = x^2$, h = .025, max error: .14591;
$f(x) = \tfrac12 - \bigl|\, x - \tfrac12 \,\bigr|$, h = .25, max error: .05066;
$f(x) = \tfrac12 - \bigl|\, x - \tfrac12 \,\bigr|$, h = .1, max error: .01877;
$f(x) = \tfrac12 - \bigl|\, x - \tfrac12 \,\bigr|$, h = .025, max error: .004754.
[Plots of each sinc interpolant and its comparison with f on 0 ≤ x ≤ 1 omitted.]

In the first case, the error is reasonably small except near the right end of the interval. In the second case, the approximation is much better; the larger errors occur near the corner of the function.

4.4.39. (a) y = − .9231 + 3.7692 t, (b) The same interpolating parabola $y = 2 + t^2$. Note: When interpolating, the error is zero irrespective of the weights.

4.4.40. (a) .29 + .36 t, (b) 1.7565 + 1.6957 t, (c) − 1.2308 + 1.9444 t, (d) 2.32 + .4143 t.

4.4.41. The weights are .3015, .1562, .0891, .2887, .2774, .1715. The weighted least squares plane has the equation z = 4.8680 − 1.6462 x + .2858 y.

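♦ 4.4.42. When $\mathbf{x} = (A^T C A)^{-1} A^T C\, \mathbf{y}$ is the least squares solution,
$\text{Error} = \| \mathbf{y} - A \mathbf{x} \| = \sqrt{\mathbf{y}^T C\, \mathbf{y} - \mathbf{y}^T C A\, (A^T C A)^{-1} A^T C\, \mathbf{y}}$.

4.4.43. (a) $\tfrac37 + \tfrac{9}{14}\, t$; (b) $\tfrac{9}{28} + \tfrac97\, t - \tfrac{9}{14}\, t^2$; (c) $\tfrac{24}{91} + \tfrac{180}{91}\, t - \tfrac{216}{91}\, t^2 + \tfrac{15}{13}\, t^3$.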

4.4.44. $\tfrac18 + \tfrac{27}{20}\, t - \tfrac32\, t^2$; the maximal error is $\tfrac25$ at t = ± 1. [Plot on − 1 ≤ t ≤ 1 omitted.]

4.4.45. $p_1(t) = .11477 + .66444\, t$, $p_2(t) = -.024325 + 1.19575\, t - .33824\, t^2$.

4.4.46. $1.00005 + 0.99845\, t + .51058\, t^2 + 0.13966\, t^3 + .069481\, t^4$.

♥ 4.4.47. (a) $1.875\, x^2 - .875\, x$, (b) $1.9420\, x^2 - 1.0474\, x + .0494$, (c) $1.7857\, x^2 - 1.0714\, x + .1071$. (d) The interpolating polynomial is the easiest to compute; it exactly coincides with the function at the interpolation points; the maximal error over the interval [ 0, 1 ] is .1728 at t = .8115. The least squares polynomial has a smaller maximal error of .1266 at t = .8018. The $L^2$ approximant does a better job on average across the interval, but its maximal error of .1786 at t = 1 is comparable to the quadratic interpolant.

4.4.48. g(x) = 2 sin x.

♦ 4.4.49. Form the n × n Gram matrix with entries $k_{ij} = \langle g_i , g_j \rangle = \int_a^b g_i(x)\, g_j(x)\, w(x)\, dx$ and the vector $\mathbf{f}$ with entries $f_i = \langle f , g_i \rangle = \int_a^b f(x)\, g_i(x)\, w(x)\, dx$. The solution to the linear system $K \mathbf{c} = \mathbf{f}$ then gives the required coefficients $\mathbf{c} = ( c_1 , c_2 , \ldots , c_n )^T$.

4.4.50.
(i) $\tfrac{3}{28} - \tfrac{15}{14}\, t + \tfrac{25}{14}\, t^2 \approx .10714 - 1.07143\, t + 1.78571\, t^2$; maximal error: $\tfrac{5}{28} = .178571$ at t = 1;
(ii) $\tfrac27 - \tfrac{25}{14}\, t + \tfrac{50}{21}\, t^2 \approx .28571 - 1.78571\, t + 2.38095\, t^2$; maximal error: $\tfrac27 = .285714$ at t = 0;
(iii) $.0809 - .90361\, t + 1.61216\, t^2$; maximal error: .210524 at t = 1.
Case (i) is the best.

♥ 4.4.51.

(a) $\| f_a \|_2^2 = \int_{-\infty}^{\infty} \dfrac{a\, dx}{1 + a^4 x^2} = \dfrac1a \tan^{-1}(a^2 x)\, \Big|_{x = -\infty}^{\infty} = \dfrac{\pi}{a}$.
(b) The maximum value of $f_a(x)$ occurs at x = 0, where $f_a(0) = \sqrt{a} = \| f_a \|_\infty$.
(c) The least squares error between $f_a(x)$ and the zero function is $\| f_a \|_2 = \sqrt{\pi / a}$, which is small when a is large. But $f_a(x)$ has a large maximum value and so is very far from zero near x = 0. Note that $f_a(x) \to 0$ for all $x \ne 0$ as $a \to \infty$, but $f_a(0) \to \infty$.

4.4.52. (a) $z = x + y - \tfrac13$, (b) $z = \tfrac{9}{10}\,(x - y)$, (c) $z = 4/\pi^2$, a constant function.

4.4.53. p(x, y) ≡ 0.


5.1.1. (a) Orthogonal basis; (b) orthonormal basis; (c) not a basis; (d) basis; (e) orthogonal basis; (f) orthonormal basis.

5.1.2. (a) Basis; (b) orthonormal basis; (c) not a basis.

5.1.3. (a) Basis; (b) basis; (c) not a basis; (d) orthogonal basis; (e) orthonormal basis; (f) basis.

5.1.4. $\langle e_1 , e_2 \rangle = \langle e_1 , e_3 \rangle = \langle e_2 , e_3 \rangle = 0$. $\{\, e_1 ,\ \tfrac{1}{\sqrt2}\, e_2 ,\ \tfrac{1}{\sqrt3}\, e_3 \,\}$ is an orthonormal basis.

5.1.5. (a) a = ± 1. (b) $a = \pm \sqrt{\tfrac23}$, (c) $a = \pm \sqrt{\tfrac32}$.

5.1.6. a = 2 b > 0.

5.1.7. (a) $a = \tfrac32\, b > 0$; (b) no possible values, because they cannot be negative!

5.1.8. $\tfrac12\, x_1 y_1 + \tfrac18\, x_2 y_2$.

5.1.9. False. Consider the basis $v_1 = ( 0, 1, 1 )^T$, $v_2 = ( 0, 1, 0 )^T$, $v_3 = ( 1, 0, 0 )^T$. Under the weighted inner product, $\langle v_1 , v_2 \rangle = b > 0$, since the coefficients a, b, c appearing in the inner product must be strictly positive.

♥ 5.1.10. (a) By direct computation: $u \cdot v = u \cdot w = 0$. (b) First, if w = c v, then we compute v × w = 0. Conversely, suppose $v \ne 0$ (otherwise the result is trivial) and, in particular, $v_1 \ne 0$. Then v × w = 0 implies $w_i = c\, v_i$, i = 1, 2, 3, where $c = w_1 / v_1$. The other cases when $v_2 \ne 0$ and $v_3 \ne 0$ are handled in a similar fashion. (c) If v, w are orthogonal and nonzero, then by Proposition 5.4, they are linearly independent, and so, by part (b), u = v × w is nonzero, and, by part (a), orthogonal to both. Thus, the three vectors are nonzero, mutually orthogonal, and so form an orthogonal basis of $\mathbb{R}^3$. (d) Yes. In general, $\| v \times w \| = \| v \|\, \| w \|\, | \sin \theta |$ where θ is the angle between them, and so when v, w are orthogonal unit vectors, $\| v \times w \| = \| v \|\, \| w \| = 1$. This can also be shown by direct computation of $\| v \times w \|$ using orthogonality.

♦ 5.1.11. See Example 5.20.

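♦ 5.1.12. We repeatedly use the identity $\sin^2 \alpha + \cos^2 \alpha = 1$ to simplify
$\langle u_1 , u_2 \rangle = - \cos\varphi \sin\varphi \sin^2\theta + (- \cos\theta \cos\psi \sin\varphi - \cos\varphi \sin\psi)(\cos\theta \cos\psi \cos\varphi - \sin\varphi \sin\psi) + (\cos\psi \sin\varphi + \cos\theta \cos\varphi \sin\psi)(\cos\varphi \cos\psi - \cos\theta \sin\varphi \sin\psi) = 0.$
By similar computations, $\langle u_1 , u_3 \rangle = \langle u_2 , u_3 \rangle = 0$, $\langle u_1 , u_1 \rangle = \langle u_2 , u_2 \rangle = \langle u_3 , u_3 \rangle = 1$.

♥ 5.1.13. (a) The (i, j) entry of $A^T K A$ is $v_i^T K v_j = \langle v_i , v_j \rangle$. Thus, $A^T K A = I$ if and only if $\langle v_i , v_j \rangle = \begin{cases} 1, & i = j, \\ 0, & i \ne j, \end{cases}$ and so the vectors form an orthonormal basis.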

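(b) According to part (a), orthonormality requires $A^T K A = I$, and so $K = A^{-T} A^{-1} = (A A^T)^{-1}$ is the Gram matrix for $A^{-1}$, and K > 0 since $A^{-1}$ is nonsingular. This also proves the uniqueness of the inner product.
(c) $A = \begin{pmatrix} 1 & 2 \\ 1 & 3 \end{pmatrix}$, $K = \begin{pmatrix} 10 & -7 \\ -7 & 5 \end{pmatrix}$, with inner product $\langle v , w \rangle = v^T K w = 10\, v_1 w_1 - 7\, v_1 w_2 - 7\, v_2 w_1 + 5\, v_2 w_2$;
(d) $A = \begin{pmatrix} 1 & 1 & 1 \\ 1 & 1 & 2 \\ 1 & 2 & 3 \end{pmatrix}$, $K = \begin{pmatrix} 3 & -2 & 0 \\ -2 & 6 & -3 \\ 0 & -3 & 2 \end{pmatrix}$, with inner product $\langle v , w \rangle = v^T K w = 3\, v_1 w_1 - 2\, v_1 w_2 - 2\, v_2 w_1 + 6\, v_2 w_2 - 3\, v_2 w_3 - 3\, v_3 w_2 + 2\, v_3 w_3$.

5.1.14. One way to solve this is by direct computation. A more sophisticated approach is to apply the Cholesky factorization (3.70) to the inner product matrix: $K = M M^T$. Then, $\langle v , w \rangle = v^T K w = \hat v^T \hat w$ where $\hat v = M^T v$, $\hat w = M^T w$. Therefore, $v_1 , v_2$ form an orthonormal basis relative to $\langle v , w \rangle = v^T K w$ if and only if $\hat v_1 = M^T v_1$, $\hat v_2 = M^T v_2$ form an orthonormal basis for the dot product, and hence are of the form determined in Exercise 5.1.11. Using this we find:
(a) $M = \begin{pmatrix} 1 & 0 \\ 0 & \sqrt2 \end{pmatrix}$, so $v_1 = \begin{pmatrix} \cos\theta \\ \tfrac{1}{\sqrt2} \sin\theta \end{pmatrix}$, $v_2 = \pm \begin{pmatrix} - \sin\theta \\ \tfrac{1}{\sqrt2} \cos\theta \end{pmatrix}$, for any 0 ≤ θ < 2 π.
(b) $M = \begin{pmatrix} 1 & 0 \\ -1 & 1 \end{pmatrix}$, so $v_1 = \begin{pmatrix} \cos\theta + \sin\theta \\ \sin\theta \end{pmatrix}$, $v_2 = \pm \begin{pmatrix} - \sin\theta + \cos\theta \\ \cos\theta \end{pmatrix}$, for any 0 ≤ θ < 2 π.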

5.1.15. $\| v + w \|^2 = \langle v + w , v + w \rangle = \langle v , v \rangle + 2 \langle v , w \rangle + \langle w , w \rangle = \| v \|^2 + \| w \|^2$ if and only if $\langle v , w \rangle = 0$. The vector v + w is the hypotenuse of the right triangle with sides v, w.

5.1.16. $\langle v_1 + v_2 , v_1 - v_2 \rangle = \| v_1 \|^2 - \| v_2 \|^2 = 0$ by assumption. Moreover, since $v_1 , v_2$ are linearly independent, neither $v_1 - v_2$ nor $v_1 + v_2$ is zero, and hence Theorem 5.5 implies that they form an orthogonal basis for the two-dimensional vector space V.

5.1.17. By orthogonality, the Gram matrix is a k × k diagonal matrix whose diagonal entries are $\| v_1 \|^2 , \ldots , \| v_k \|^2$. Since these are all nonzero, the Gram matrix is nonsingular. An alternative proof combines Propositions 3.30 and 5.4.

5.1.18. (a) Bilinearity: for a, b constant,
$\langle a\, p + b\, \tilde p , q \rangle = \int_0^1 t\, \bigl( a\, p(t) + b\, \tilde p(t) \bigr)\, q(t)\, dt = a \int_0^1 t\, p(t)\, q(t)\, dt + b \int_0^1 t\, \tilde p(t)\, q(t)\, dt = a\, \langle p , q \rangle + b\, \langle \tilde p , q \rangle.$
The second bilinearity condition $\langle p , a\, q + b\, \tilde q \rangle = a\, \langle p , q \rangle + b\, \langle p , \tilde q \rangle$ follows similarly, or is a consequence of symmetry, as in Exercise 3.1.9.
Symmetry: $\langle q , p \rangle = \int_0^1 t\, q(t)\, p(t)\, dt = \int_0^1 t\, p(t)\, q(t)\, dt = \langle p , q \rangle$.
Positivity: $\langle p , p \rangle = \int_0^1 t\, p(t)^2\, dt \ge 0$, since t ≥ 0 and $p(t)^2 \ge 0$ for all 0 ≤ t ≤ 1. Moreover, since p(t) is continuous, so is $t\, p(t)^2$. Therefore, the integral can equal 0 if and only if $t\, p(t)^2 \equiv 0$ for all 0 ≤ t ≤ 1, and hence p(t) ≡ 0.
(b) $p(t) = c\, \bigl( 1 - \tfrac32\, t \bigr)$ for any c.
(c) $p_1(t) = \sqrt2$, $p_2(t) = 4 - 6\, t$;
(d) $p_1(t) = \sqrt2$, $p_2(t) = 4 - 6\, t$, $p_3(t) = \sqrt6\, ( 3 - 12\, t + 10\, t^2 )$.
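The orthogonality claims in 5.1.18 can be verified exactly, since every integral $\int_0^1 t\, p(t)\, q(t)\, dt$ of polynomials is rational. The sketch below works with the unnormalized polynomials 1, 2 − 3t and 3 − 12t + 10t² (an illustrative choice that avoids square roots); their squared norms 1/2, 1/4 and 1/6 correspond to normalizing factors √2, 2 and √6. The helper names are hypothetical.

```python
from fractions import Fraction as F

def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists [c0, c1, ...]."""
    out = [F(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def inner(p, q):
    """<p, q> = integral over [0, 1] of t * p(t) * q(t) dt, computed exactly."""
    prod = poly_mul(p, q)
    # The weight t raises each power t^k to t^(k+1), whose integral over [0, 1] is 1/(k+2).
    return sum(c / (k + 2) for k, c in enumerate(prod))

q1 = [F(1)]                   # 1
q2 = [F(2), F(-3)]            # 2 - 3t
q3 = [F(3), F(-12), F(10)]    # 3 - 12t + 10t^2

gram = [[inner(p, q) for q in (q1, q2, q3)] for p in (q1, q2, q3)]
for row in gram:
    print(row)
```

A diagonal Gram matrix confirms orthogonality, and the reciprocal square roots of its diagonal entries give the normalizing constants.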

5.1.19. Since $\int_{-\pi}^{\pi} \sin x \cos x\, dx = 0$, the functions cos x and sin x are orthogonal under the $L^2$ inner product on [ − π, π ]. Moreover, they span the solution space of the differential equation, and hence, by Theorem 5.5, form an orthogonal basis.

5.1.20. They form a basis, but not an orthogonal basis since $\langle e^{x/2} , e^{-x/2} \rangle = \int_0^1 e^{x/2}\, e^{-x/2}\, dx = 1$. An orthogonal basis is $e^{x/2}$, $e^{-x/2} - \dfrac{e^{x/2}}{e - 1}$.

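5.1.21. (a) We compute $\langle v_1 , v_2 \rangle = \langle v_1 , v_3 \rangle = \langle v_2 , v_3 \rangle = 0$ and $\| v_1 \| = \| v_2 \| = \| v_3 \| = 1$.
(b) $\langle v , v_1 \rangle = \tfrac75$, $\langle v , v_2 \rangle = \tfrac{11}{13}$, and $\langle v , v_3 \rangle = - \tfrac{37}{65}$, and so $( 1, 1, 1 )^T = \tfrac75\, v_1 + \tfrac{11}{13}\, v_2 - \tfrac{37}{65}\, v_3$.
(c) $\bigl( \tfrac75 \bigr)^2 + \bigl( \tfrac{11}{13} \bigr)^2 + \bigl( - \tfrac{37}{65} \bigr)^2 = 3 = \| v \|^2$.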

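♥ 5.1.22. (a) By direct computation: $v_1 \cdot v_2 = 0$, $v_1 \cdot v_3 = 0$, $v_2 \cdot v_3 = 0$.
(b) $v = 2\, v_1 - \tfrac12\, v_2 + \tfrac12\, v_3$, since $\dfrac{v_1 \cdot v}{\| v_1 \|^2} = \dfrac63 = 2$, $\dfrac{v_2 \cdot v}{\| v_2 \|^2} = \dfrac{-3}{6} = - \dfrac12$, $\dfrac{v_3 \cdot v}{\| v_3 \|^2} = \dfrac12$.
(c) $\left( \dfrac{v_1 \cdot v}{\| v_1 \|} \right)^2 + \left( \dfrac{v_2 \cdot v}{\| v_2 \|} \right)^2 + \left( \dfrac{v_3 \cdot v}{\| v_3 \|} \right)^2 = \left( \dfrac{6}{\sqrt3} \right)^2 + \left( \dfrac{-3}{\sqrt6} \right)^2 + \left( \dfrac{1}{\sqrt2} \right)^2 = 14 = \| v \|^2$.
(d) The orthonormal basis is $u_1 = \bigl( \tfrac{1}{\sqrt3}, \tfrac{1}{\sqrt3}, \tfrac{1}{\sqrt3} \bigr)^T$, $u_2 = \bigl( \tfrac{1}{\sqrt6}, \tfrac{1}{\sqrt6}, - \tfrac{2}{\sqrt6} \bigr)^T$, $u_3 = \bigl( - \tfrac{1}{\sqrt2}, \tfrac{1}{\sqrt2}, 0 \bigr)^T$.
(e) $v = 2 \sqrt3\, u_1 - \tfrac{3}{\sqrt6}\, u_2 + \tfrac{1}{\sqrt2}\, u_3$ and $\bigl( 2 \sqrt3 \bigr)^2 + \bigl( - \tfrac{3}{\sqrt6} \bigr)^2 + \bigl( \tfrac{1}{\sqrt2} \bigr)^2 = 14 = \| v \|^2$.

5.1.23. (a) Because $\langle v_1 , v_2 \rangle = v_1^T K v_2 = 0$.
(b) $v = \dfrac{\langle v , v_1 \rangle}{\| v_1 \|^2}\, v_1 + \dfrac{\langle v , v_2 \rangle}{\| v_2 \|^2}\, v_2 = \tfrac73\, v_1 - \tfrac13\, v_2$.
(c) $\left( \dfrac{\langle v , v_1 \rangle}{\| v_1 \|} \right)^2 + \left( \dfrac{\langle v , v_2 \rangle}{\| v_2 \|} \right)^2 = \left( \dfrac{7}{\sqrt3} \right)^2 + \left( - \dfrac{5}{\sqrt{15}} \right)^2 = 18 = \| v \|^2$.
(d) $u_1 = \bigl( \tfrac{1}{\sqrt3}, \tfrac{1}{\sqrt3} \bigr)^T$, $u_2 = \bigl( - \tfrac{2}{\sqrt{15}}, \tfrac{1}{\sqrt{15}} \bigr)^T$.
(e) $v = \langle v , u_1 \rangle\, u_1 + \langle v , u_2 \rangle\, u_2 = \tfrac{7}{\sqrt3}\, u_1 - \tfrac{\sqrt{15}}{3}\, u_2$; $\| v \|^2 = 18 = \bigl( \tfrac{7}{\sqrt3} \bigr)^2 + \bigl( - \tfrac{\sqrt{15}}{3} \bigr)^2$.

5.1.24. Consider the non-orthogonal basis $v_1 = \begin{pmatrix} 1 \\ 1 \end{pmatrix}$, $v_2 = \begin{pmatrix} 1 \\ 0 \end{pmatrix}$. We have $v = \begin{pmatrix} 0 \\ 1 \end{pmatrix} = v_1 - v_2$, but $\| v \|^2 = 1 \ne 1^2 + (-1)^2$.

5.1.25.
$\dfrac{\langle 1 , p_1 \rangle}{\| p_1 \|^2} = 1$, $\dfrac{\langle 1 , p_2 \rangle}{\| p_2 \|^2} = 0$, $\dfrac{\langle 1 , p_3 \rangle}{\| p_3 \|^2} = 0$, so $1 = p_1(x) + 0\, p_2(x) + 0\, p_3(x) = p_1(x)$.
$\dfrac{\langle x , p_1 \rangle}{\| p_1 \|^2} = \dfrac12$, $\dfrac{\langle x , p_2 \rangle}{\| p_2 \|^2} = 1$, $\dfrac{\langle x , p_3 \rangle}{\| p_3 \|^2} = 0$, so $x = \tfrac12\, p_1(x) + p_2(x)$.
$\dfrac{\langle x^2 , p_1 \rangle}{\| p_1 \|^2} = \dfrac13$, $\dfrac{\langle x^2 , p_2 \rangle}{\| p_2 \|^2} = 1$, $\dfrac{\langle x^2 , p_3 \rangle}{\| p_3 \|^2} = 1$, so $x^2 = \tfrac13\, p_1(x) + p_2(x) + p_3(x)$.

5.1.26.
(a) $\langle P_0 , P_1 \rangle = \int_{-1}^1 t\, dt = 0$, $\langle P_0 , P_2 \rangle = \int_{-1}^1 \bigl( t^2 - \tfrac13 \bigr)\, dt = 0$, $\langle P_0 , P_3 \rangle = \int_{-1}^1 \bigl( t^3 - \tfrac35\, t \bigr)\, dt = 0$, $\langle P_1 , P_2 \rangle = \int_{-1}^1 t\, \bigl( t^2 - \tfrac13 \bigr)\, dt = 0$, $\langle P_1 , P_3 \rangle = \int_{-1}^1 t\, \bigl( t^3 - \tfrac35\, t \bigr)\, dt = 0$, $\langle P_2 , P_3 \rangle = \int_{-1}^1 \bigl( t^2 - \tfrac13 \bigr) \bigl( t^3 - \tfrac35\, t \bigr)\, dt = 0$.
(b) $\dfrac{1}{\sqrt2}$, $\sqrt{\tfrac32}\; t$, $\sqrt{\tfrac52}\, \bigl( \tfrac32\, t^2 - \tfrac12 \bigr)$, $\sqrt{\tfrac72}\, \bigl( \tfrac52\, t^3 - \tfrac32\, t \bigr)$.
(c) $\dfrac{\langle t^3 , P_0 \rangle}{\| P_0 \|^2} = 0$, $\dfrac{\langle t^3 , P_1 \rangle}{\| P_1 \|^2} = \dfrac35$, $\dfrac{\langle t^3 , P_2 \rangle}{\| P_2 \|^2} = 0$, $\dfrac{\langle t^3 , P_3 \rangle}{\| P_3 \|^2} = 1$, so $t^3 = \tfrac35\, P_1(t) + P_3(t)$.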

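5.1.27.
(a) $\langle P_0 , P_1 \rangle = \int_0^1 \bigl( t - \tfrac23 \bigr)\, t\, dt = 0$, $\langle P_0 , P_2 \rangle = \int_0^1 \bigl( t^2 - \tfrac65\, t + \tfrac{3}{10} \bigr)\, t\, dt = 0$, $\langle P_1 , P_2 \rangle = \int_0^1 \bigl( t - \tfrac23 \bigr) \bigl( t^2 - \tfrac65\, t + \tfrac{3}{10} \bigr)\, t\, dt = 0$.
(b) $\sqrt2$, $6\, t - 4$, $\sqrt6\, \bigl( 3 - 12\, t + 10\, t^2 \bigr)$.
(c) $\dfrac{\langle t^2 , P_0 \rangle}{\| P_0 \|^2} = \dfrac12$, $\dfrac{\langle t^2 , P_1 \rangle}{\| P_1 \|^2} = \dfrac65$, $\dfrac{\langle t^2 , P_2 \rangle}{\| P_2 \|^2} = 1$, so $t^2 = \tfrac12\, P_0(t) + \tfrac65\, P_1(t) + P_2(t)$.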

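5.1.28. (a) $\cos^2 x = \tfrac12 + \tfrac12 \cos 2x$, (b) $\cos x \sin x = \tfrac12 \sin 2x$, (c) $\sin^3 x = \tfrac34 \sin x - \tfrac14 \sin 3x$, (d) $\cos^2 x \sin^3 x = \tfrac18 \sin x + \tfrac{1}{16} \sin 3x - \tfrac{1}{16} \sin 5x$, (e) $\cos^4 x = \tfrac38 + \tfrac12 \cos 2x + \tfrac18 \cos 4x$.

5.1.29. $\dfrac{1}{\sqrt{2\pi}}$, $\dfrac{\cos x}{\sqrt\pi}$, $\dfrac{\sin x}{\sqrt\pi}$, . . . , $\dfrac{\cos n x}{\sqrt\pi}$, $\dfrac{\sin n x}{\sqrt\pi}$.

♦ 5.1.30. $\langle e^{ikx} , e^{ilx} \rangle = \dfrac{1}{2\pi} \int_{-\pi}^{\pi} e^{ikx}\, \overline{e^{ilx}}\, dx = \dfrac{1}{2\pi} \int_{-\pi}^{\pi} e^{i(k-l)x}\, dx = \begin{cases} 1, & k = l, \\ 0, & k \ne l. \end{cases}$

♦ 5.1.31.
$\int_{-\pi}^{\pi} \cos kx \cos lx\, dx = \int_{-\pi}^{\pi} \tfrac12 \bigl( \cos (k-l)x + \cos (k+l)x \bigr)\, dx = \begin{cases} 0, & k \ne l, \\ 2\pi, & k = l = 0, \\ \pi, & k = l \ne 0, \end{cases}$
$\int_{-\pi}^{\pi} \sin kx \sin lx\, dx = \int_{-\pi}^{\pi} \tfrac12 \bigl( \cos (k-l)x - \cos (k+l)x \bigr)\, dx = \begin{cases} 0, & k \ne l, \\ \pi, & k = l \ne 0, \end{cases}$
$\int_{-\pi}^{\pi} \cos kx \sin lx\, dx = \int_{-\pi}^{\pi} \tfrac12 \bigl( \sin (k+l)x - \sin (k-l)x \bigr)\, dx = 0.$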

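♦ 5.1.32. Given $v = a_1 v_1 + \cdots + a_n v_n$, we have
$\langle v , v_i \rangle = \langle a_1 v_1 + \cdots + a_n v_n , v_i \rangle = a_i\, \| v_i \|^2,$
since, by orthogonality, $\langle v_j , v_i \rangle = 0$ for all $j \ne i$. This proves (5.7). Then, to prove (5.8),
$\| v \|^2 = \langle a_1 v_1 + \cdots + a_n v_n , a_1 v_1 + \cdots + a_n v_n \rangle = \sum_{i,j=1}^n a_i a_j\, \langle v_i , v_j \rangle = \sum_{i=1}^n a_i^2\, \| v_i \|^2 = \sum_{i=1}^n \left( \dfrac{\langle v , v_i \rangle}{\| v_i \|^2} \right)^2 \| v_i \|^2 = \sum_{i=1}^n \left( \dfrac{\langle v , v_i \rangle}{\| v_i \|} \right)^2.$

5.2.1. (a) $\tfrac{1}{\sqrt2}\, ( 1, 0, 1 )^T$, $( 0, 1, 0 )^T$, $\tfrac{1}{\sqrt2}\, ( -1, 0, 1 )^T$;
(b) $\tfrac{1}{\sqrt2}\, ( 1, 1, 0 )^T$, $\tfrac{1}{\sqrt6}\, ( -1, 1, -2 )^T$, $\tfrac{1}{\sqrt3}\, ( 1, -1, -1 )^T$;
(c) $\tfrac{1}{\sqrt{14}}\, ( 1, 2, 3 )^T$, $\tfrac{1}{\sqrt3}\, ( 1, 1, -1 )^T$, $\tfrac{1}{\sqrt{42}}\, ( -5, 4, -1 )^T$.

5.2.2. (a) $\bigl( \tfrac{1}{\sqrt2}, 0, \tfrac{1}{\sqrt2}, 0 \bigr)^T$, $\bigl( 0, \tfrac{1}{\sqrt2}, 0, - \tfrac{1}{\sqrt2} \bigr)^T$, $\bigl( \tfrac12, \tfrac12, - \tfrac12, \tfrac12 \bigr)^T$, $\bigl( - \tfrac12, \tfrac12, \tfrac12, \tfrac12 \bigr)^T$;
(b) $\bigl( \tfrac{1}{\sqrt2}, 0, 0, \tfrac{1}{\sqrt2} \bigr)^T$, $\bigl( \tfrac23, \tfrac13, 0, - \tfrac23 \bigr)^T$, $( 0, 0, 1, 0 )^T$, $\bigl( - \tfrac{1}{3\sqrt2}, \tfrac{4}{3\sqrt2}, 0, \tfrac{1}{3\sqrt2} \bigr)^T$.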

5.2.3. The first two Gram–Schmidt vectors are legitimate, $v_1 = ( 1, -1, 0, 1 )^T$, $v_2 = ( -1, 0, 1, 1 )^T$, but then $v_3 = 0$, and the algorithm breaks down. The reason is that the given vectors are linearly dependent, and do not, in fact, form a basis.

5.2.4.
(a) $\bigl( 0, \tfrac{2}{\sqrt5}, \tfrac{1}{\sqrt5} \bigr)^T$, $( 1, 0, 0 )^T$.
(b) Starting with the basis $\bigl( \tfrac12, 1, 0 \bigr)^T$, $( -1, 0, 1 )^T$, the Gram–Schmidt process produces the orthonormal basis $\bigl( \tfrac{1}{\sqrt5}, \tfrac{2}{\sqrt5}, 0 \bigr)^T$, $\bigl( - \tfrac{4}{3\sqrt5}, \tfrac{2}{3\sqrt5}, \tfrac{5}{3\sqrt5} \bigr)^T$.
(c) Starting with the basis $( 1, 1, 0 )^T$, $( 3, 0, 1 )^T$, the Gram–Schmidt process produces the orthonormal basis $\bigl( \tfrac{1}{\sqrt2}, \tfrac{1}{\sqrt2}, 0 \bigr)^T$, $\bigl( \tfrac{3}{\sqrt{22}}, - \tfrac{3}{\sqrt{22}}, \tfrac{2}{\sqrt{22}} \bigr)^T$.
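The computations in 5.2.4 (b) and (c) follow the classical Gram–Schmidt recipe: subtract from each new vector its projections onto the orthonormal vectors already found, then normalize. A minimal floating-point sketch for the dot product (the function names are hypothetical, and the dependence tolerance 1e-12 is an arbitrary choice):

```python
import math

def dot(v, w):
    return sum(a * b for a, b in zip(v, w))

def gram_schmidt(vectors):
    """Classical Gram-Schmidt for the dot product; returns a list of orthonormal vectors."""
    basis = []
    for w in vectors:
        v = list(w)
        for u in basis:
            c = dot(w, u)                       # component of w along u
            v = [vi - c * ui for vi, ui in zip(v, u)]
        norm = math.sqrt(dot(v, v))
        if norm < 1e-12:                        # a zero remainder signals dependence, as in 5.2.3
            raise ValueError("vectors are linearly dependent")
        basis.append([vi / norm for vi in v])
    return basis

u1, u2 = gram_schmidt([[0.5, 1.0, 0.0], [-1.0, 0.0, 1.0]])   # basis from 5.2.4 (b)
w1, w2 = gram_schmidt([[1.0, 1.0, 0.0], [3.0, 0.0, 1.0]])    # basis from 5.2.4 (c)
print(u1, u2)
print(w1, w2)
```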

5.2.5. $( 1, -1, -1, 1, 1 )^T$, $( 1, 0, 1, -1, 1 )^T$, $( 1, 0, 1, 1, -1 )^T$.

5.2.6. (a) $\tfrac{1}{\sqrt3}\, ( 1, 1, -1, 0 )^T$, $\tfrac{1}{\sqrt{15}}\, ( -1, 2, 1, 3 )^T$, $\tfrac{1}{\sqrt{15}}\, ( 3, -1, 2, 1 )^T$.
(b) Solving the homogeneous system we obtain the kernel basis $( -1, 2, 1, 0 )^T$, $( 1, -1, 0, 1 )^T$. The Gram–Schmidt process gives the orthonormal basis $\tfrac{1}{\sqrt6}\, ( -1, 2, 1, 0 )^T$, $\tfrac{1}{\sqrt6}\, ( 1, 0, 1, 2 )^T$.
(c) Applying Gram–Schmidt to the corange basis $( 2, 1, 0, -1 )^T$, $\bigl( 0, \tfrac12, -1, \tfrac12 \bigr)^T$ gives the orthonormal basis $\tfrac{1}{\sqrt6}\, ( 2, 1, 0, -1 )^T$, $\tfrac{1}{\sqrt6}\, ( 0, 1, -2, 1 )^T$.
(d) Applying Gram–Schmidt to the range basis $( 1, 2, 0, -2 )^T$, $( 2, 1, -1, 5 )^T$ gives the orthonormal basis $\tfrac13\, ( 1, 2, 0, -2 )^T$, $\tfrac{1}{9\sqrt3}\, ( 8, 7, -3, 11 )^T$.
(e) Applying Gram–Schmidt to the cokernel basis $\bigl( \tfrac23, - \tfrac13, 1, 0 \bigr)^T$, $( -4, 3, 0, 1 )^T$ gives the orthonormal basis $\tfrac{1}{\sqrt{14}}\, ( 2, -1, 3, 0 )^T$, $\tfrac{1}{9\sqrt{42}}\, ( -34, 31, 33, 14 )^T$.
(f) Applying Gram–Schmidt to the basis $( -1, 1, 0, 0 )^T$, $( 1, 0, 1, 0 )^T$, $( 1, 0, 0, 1 )^T$ gives the orthonormal basis $\tfrac{1}{\sqrt2}\, ( -1, 1, 0, 0 )^T$, $\tfrac{1}{\sqrt6}\, ( 1, 1, 2, 0 )^T$, $\tfrac{1}{2\sqrt3}\, ( 1, 1, -1, 3 )^T$.

5.2.7.

1 1 1 1 1 1 (a) Range: √ ; kernel: √ ; corange: √ ; −3 1 −1 100 0 1 2 0 1 1 2 −1 1 B1C 1 B 2C 1 C (b) Range: √ B @ 1 A, √ @ 1 A; kernel: √ @ −1 A; 2 6 0 1 6 20 1 10 0 1 −1 2 −1 1 1 B C 1 B C C corange: √ B @ 0 A, √ @ 5 A; cokernel: √ @ −1 A. 5 30 1 3 2 1

124

1 cokernel: √ 10

!

3 . 1

0

1

1 1 1 C (c) Range: √ B @ 1 A, √ 3 −1 42

0

1

1 1 B C @ 4 A, √ 14 5

0

1

3 C B @ −2 A; 1

kernel:

1 2

0

−1

1

B C B −1 C C; B @ 1A

1 0 1 0 1 1 1 0 −1 C C B 1 B 1 B0C C B1C 1 B 1C B C, √ B C, C; the cokernel is {0}, so there is no basis. B corange: √ B 2 @1A 2 @0A 2 @ 1A 01 1 1 −1 0 0 0 1 1 1 −4 B C C 1 B 1 1 0 −1 B C B C C C, √ B C; kernel: √ B (d) Range: √ B @ 1 A; 3 @ −1 A 3 @ 0A 21 2 1 −1 0 1 0 1 0 1 0 1 1 0 1 1 B C B 1 B 1 B 1 C 1 B −1 C 1 C C C C. corange: √ B @ 2 A, √ @ −2 A; cokernel: √ B C, √ B @1A @ 1A 6 1 14 3 3 3 0 1 0

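5.2.8.
(i) (a) $\tfrac12\, ( 1, 0, 1 )^T$, $\tfrac{1}{\sqrt2}\, ( 0, 1, 0 )^T$, $\tfrac{1}{2\sqrt3}\, ( -1, 0, 3 )^T$;
(b) $\tfrac{1}{\sqrt5}\, ( 1, 1, 0 )^T$, $\tfrac{1}{\sqrt{55}}\, ( -2, 3, -5 )^T$, $\tfrac{1}{\sqrt{66}}\, ( 2, -3, -6 )^T$;
(c) $\tfrac{1}{2\sqrt5}\, ( 1, 2, 3 )^T$, $\tfrac{1}{\sqrt{130}}\, ( 4, 3, -8 )^T$, $\tfrac{1}{2\sqrt{39}}\, ( -5, 6, -3 )^T$.
(ii) (a) $\bigl( \tfrac12, 0, \tfrac12 \bigr)^T$, $\bigl( \tfrac12, 1, \tfrac12 \bigr)^T$, $\bigl( - \tfrac12, 0, \tfrac12 \bigr)^T$;
(b) $\bigl( \tfrac{1}{\sqrt2}, \tfrac{1}{\sqrt2}, 0 \bigr)^T$, $\bigl( - \tfrac12, 0, - \tfrac12 \bigr)^T$, $\bigl( 0, - \tfrac{1}{\sqrt2}, - \tfrac{1}{\sqrt2} \bigr)^T$;
(c) $\tfrac{1}{2\sqrt3}\, ( 1, 2, 3 )^T$, $\tfrac{1}{\sqrt{42}}\, ( 4, 5, 0 )^T$, $\tfrac{1}{\sqrt{14}}\, ( -2, 1, 0 )^T$.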
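Answers for a weighted inner product can be checked by forming the Gram matrix of the unnormalized vectors: off-diagonal zeros confirm orthogonality, and the diagonal entries are the squared norms. The sketch below assumes that case (i) uses the weights 3, 2, 1, i.e. $\langle v , w \rangle = 3 v_1 w_1 + 2 v_2 w_2 + v_3 w_3$; that choice is inferred from the normalizing factors 2, √2, 2√3 in (i)(a) and is not quoted from the exercise statement.

```python
def weighted_inner(v, w, weights):
    """Weighted inner product: sum of weights[k] * v[k] * w[k]."""
    return sum(c * a * b for c, a, b in zip(weights, v, w))

weights = (3, 2, 1)                              # assumed weights for case (i)
vectors = [(1, 0, 1), (0, 1, 0), (-1, 0, 3)]     # unnormalized answer to (i)(a)

gram = [[weighted_inner(v, w, weights) for w in vectors] for v in vectors]
print(gram)  # [[4, 0, 0], [0, 2, 0], [0, 0, 12]]
```

The diagonal 4, 2, 12 matches the normalizing factors 1/2, 1/√2 and 1/(2√3).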

5.2.9. Applying the Gram–Schmidt process to the standard basis vectors e1 , e2 gives 1 1 0 1 0 0 0 1 0 1 1 0 1 √ 1 − √1 √1 √1 0 B B C 10 (a) @ 3 A, @ √1 A; (b) @ 2 A, @ 2 2 3 A; (c) @ 2 A, @ √2 C A. √ √ 0 0 0 5 3 5

5.2.10. Applying the Gram–Schmidt process to the standard basis vectors e1 , e2 , e3 gives 0 1 (a)

1 1 2 B C B C, @0A 0

0

0

1 √ B2 2 B B √1 B 2 @

0

1

C C C, C A

1 √ B2 6 B B √1 B B √6 @ √2 3

C C C C; C A

(b)

0

1

1

0

1

1

0

C 1 B C 1 B √1 B @ 0 A, √ @ 3 A, √ @ 3 33 4 22

0

0

1

−2 5C A. 11

5.2.11. (a) 2, namely ±1; (b) infinitely many; (c) no.

♦ 5.2.12. The key is to make sure the inner products are in the correct order, as otherwise complex conjugates appear on the scalars. By induction, assume that we already know $\langle v_i , v_j \rangle = 0$ for $i \ne j \le k - 1$. Then, given $v_k = w_k - \sum_{j=1}^{k-1} \dfrac{\langle w_k , v_j \rangle}{\| v_j \|^2}\, v_j$, for any i < k,
$\langle v_k , v_i \rangle = \Bigl\langle w_k - \sum_{j=1}^{k-1} \dfrac{\langle w_k , v_j \rangle}{\| v_j \|^2}\, v_j \,,\, v_i \Bigr\rangle = \langle w_k , v_i \rangle - \sum_{j=1}^{k-1} \dfrac{\langle w_k , v_j \rangle}{\| v_j \|^2}\, \langle v_j , v_i \rangle = \langle w_k , v_i \rangle - \dfrac{\langle w_k , v_i \rangle}{\| v_i \|^2}\, \langle v_i , v_i \rangle = 0,$
completing the induction step.

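5.2.13. (a) $\left( \dfrac{1 + i}{2}, \dfrac{1 - i}{2} \right)^T$, $\left( \dfrac{3 - i}{2\sqrt5}, \dfrac{1 + 3 i}{2\sqrt5} \right)^T$;
(b) $\left( \dfrac{1 + i}{3}, \dfrac{1 - i}{3}, \dfrac{2 - i}{3} \right)^T$, $\left( \dfrac{3 - i}{5}, \dfrac{1 - 2 i}{5}, \dfrac{-1 + 3 i}{5} \right)^T$, $\left( \dfrac{-2 + 9 i}{15}, \dfrac{-9 - 7 i}{15}, \dfrac{-1 + 3 i}{15} \right)^T$.

5.2.14. (a) $\left( \dfrac{1 - i}{\sqrt3}, \dfrac{1}{\sqrt3}, 0 \right)^T$, $\left( \dfrac{-1 + 2 i}{2\sqrt6}, \dfrac{3 - i}{2\sqrt6}, \dfrac{3 i}{2\sqrt6} \right)^T$;
(b) $\tfrac13\, ( -1 - 2 i, 2, 0 )^T$, $\tfrac{1}{3\sqrt{19}}\, ( 6 + 2 i, 5 - 5 i, 9 )^T$;
(c) $\bigl( - \tfrac12 i, \tfrac12, - \tfrac12, \tfrac12 i \bigr)^T$, $\bigl( - \tfrac12, \tfrac12 i, \tfrac12, \tfrac12 i \bigr)^T$, $\bigl( - \tfrac12 i, \tfrac12 i, 0, \tfrac12 - \tfrac12 i \bigr)^T$.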
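For complex vectors, orthonormality is tested with the Hermitian dot product, which conjugates the second factor. The sketch below checks three vectors of the kind appearing in 5.2.14 (c); the entries are half-integers, so the floating-point arithmetic is exact. The function name `hermitian_dot` is hypothetical.

```python
def hermitian_dot(v, w):
    """Hermitian dot product: sum of v_k times the complex conjugate of w_k."""
    return sum(a * b.conjugate() for a, b in zip(v, w))

u1 = [-0.5j, 0.5, -0.5, 0.5j]
u2 = [-0.5, 0.5j, 0.5, 0.5j]
u3 = [-0.5j, 0.5j, 0.0, 0.5 - 0.5j]

vectors = [u1, u2, u3]
gram = [[hermitian_dot(v, w) for w in vectors] for v in vectors]
for row in gram:
    print(row)
```

An identity Gram matrix means the three vectors are orthonormal in $\mathbb{C}^4$.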

5.2.15. False. Any example that starts with a non-orthogonal basis will confirm this.

♦ 5.2.16. According to Exercise 2.4.24, we can find a basis of $\mathbb{R}^n$ of the form $u_1 , \ldots , u_m , v_{m+1} , \ldots , v_n$. When we apply the Gram–Schmidt process to this basis in the indicated order, it will not alter the orthonormal vectors $u_1 , \ldots , u_m$, and so the result is the desired orthonormal basis. Note also that none of the orthonormal basis vectors $u_{m+1} , \ldots , u_n$ belongs to V, as otherwise it would be in the span of $u_1 , \ldots , u_m$, and so the collection would not be linearly independent.

5.2.17. (a) $( 0., .7071, .7071 )^T$, $( .8165, -.4082, .4082 )^T$, $( .57735, .57735, -.57735 )^T$;
(b) $( .57735, .57735, -.57735, 0. )^T$, $( -.2582, .5164, .2582, .7746 )^T$, $( .7746, -.2582, .5164, .2582 )^T$;
(c) $( .5164, .2582, .7746, .2582, 0. )^T$, $( -.2189, -.5200, .4926, -.5200, .4105 )^T$, $( .2529, .5454, -.2380, -.3372, .6843 )^T$;
(d) $( 0., .7071, 0., .7071, 0. )^T$, $( .6325, -.3162, .6325, .3162, 0. )^T$, $( .1291, -.3873, -.5164, .3873, -.6455 )^T$, $( .57735, 0., -.57735, 0., .57735 )^T$.

5.2.18. Same solutions.

5.2.19. See previous solutions.

♦ 5.2.20. Clearly, each $u_j = w_j^{(j)} / \| w_j^{(j)} \|$ is a unit vector. We show by induction on k and then on j that, for each 2 ≤ j ≤ k, the vector $w_k^{(j)}$ is orthogonal to $u_1 , \ldots , u_{j-1}$, which will imply that $u_k = w_k^{(k)} / \| w_k^{(k)} \|$ is also orthogonal to $u_1 , \ldots , u_{k-1}$; this will establish the result. Indeed, by the formulas,
$\langle w_k^{(2)} , u_1 \rangle = \langle w_k , u_1 \rangle - \langle w_k , u_1 \rangle\, \langle u_1 , u_1 \rangle = 0.$
Further, for i < j < k,
$\langle w_k^{(j+1)} , u_i \rangle = \langle w_k^{(j)} , u_i \rangle - \langle w_k^{(j)} , u_j \rangle\, \langle u_j , u_i \rangle = 0,$
since, by the induction hypothesis, both $\langle w_k^{(j)} , u_i \rangle = 0$ and $\langle u_j , u_i \rangle = \dfrac{\langle w_j^{(j)} , u_i \rangle}{\| w_j^{(j)} \|} = 0$. Finally,
$\langle w_k^{(j+1)} , u_j \rangle = \langle w_k^{(j)} , u_j \rangle - \langle w_k^{(j)} , u_j \rangle\, \langle u_j , u_j \rangle = 0,$
since $u_j$ is a unit vector. This completes the induction step, and the result follows.

5.2.21. Since $u_1 , \ldots , u_n$ form an orthonormal basis, if $i \ne j$,
$\langle w_k^{(j+1)} , u_i \rangle = \langle w_k^{(j)} , u_i \rangle,$
and hence, by induction, $r_{ik} = \langle w_k , u_i \rangle = \langle w_k^{(i)} , u_i \rangle$. Furthermore,
$\| w_k^{(j+1)} \|^2 = \| w_k^{(j)} \|^2 - \langle w_k^{(j)} , u_j \rangle^2 = \| w_k^{(j)} \|^2 - r_{jk}^2,$
and so, by (5.5), $\| w_i^{(i)} \|^2 = \| w_i \|^2 - r_{1i}^2 - \cdots - r_{i-1,i}^2 = r_{ii}^2$.
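The recursion analysed in 5.2.20 and 5.2.21 can be sketched in code: once $u_j$ is known, its component is removed from every later vector immediately, so $w_k^{(j+1)} = w_k^{(j)} - \langle w_k^{(j)} , u_j \rangle\, u_j$. The version below is an illustration for the dot product only; the function name and the test vectors are hypothetical, and it assumes the input vectors are linearly independent.

```python
import math

def dot(v, w):
    return sum(a * b for a, b in zip(v, w))

def stable_gram_schmidt(vectors):
    """Return (u, r): orthonormal vectors u[j] and coefficients r[j][k] = <w_k^(j), u_j>."""
    w = [[float(x) for x in v] for v in vectors]    # w[k] holds the current w_k^(j)
    n = len(w)
    r = [[0.0] * n for _ in range(n)]
    u = []
    for j in range(n):
        r[j][j] = math.sqrt(dot(w[j], w[j]))        # r_jj = || w_j^(j) ||
        u_j = [x / r[j][j] for x in w[j]]
        u.append(u_j)
        for k in range(j + 1, n):                   # update all later vectors right away
            r[j][k] = dot(w[k], u_j)
            w[k] = [x - r[j][k] * y for x, y in zip(w[k], u_j)]
    return u, r

u, r = stable_gram_schmidt([[1, 1, 0], [1, 0, 1], [0, 1, 1]])
print(u)
print(r)
```

The diagonal entries satisfy $r_{ii}^2 = \| w_i \|^2 - r_{1i}^2 - \cdots - r_{i-1,i}^2$, which is the identity derived in 5.2.21.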

5.3.1. (a) Neither; (b) proper orthogonal; (c) orthogonal; (d) proper orthogonal; (e) neither; (f) proper orthogonal; (g) orthogonal.

5.3.2. (a) By direct computation $R^T R = I$, $Q^T Q = I$;
(b) Both $R\, Q = \begin{pmatrix} \cos\theta & \sin\theta & 0 \\ 0 & 0 & 1 \\ - \sin\theta & \cos\theta & 0 \end{pmatrix}$ and $Q\, R = \begin{pmatrix} \cos\theta & 0 & \sin\theta \\ - \sin\theta & 0 & \cos\theta \\ 0 & 1 & 0 \end{pmatrix}$ satisfy $(R Q)^T (R Q) = I = (Q R)^T (Q R)$;
(c) Q is proper orthogonal, while R, R Q and Q R are all improper.

5.3.3. (a) True: Using the formula (5.31) for an improper 2 × 2 orthogonal matrix, $\begin{pmatrix} \cos\theta & \sin\theta \\ \sin\theta & - \cos\theta \end{pmatrix}^2 = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$.
(b) False: For example, $\begin{pmatrix} \cos\theta & - \sin\theta & 0 \\ \sin\theta & \cos\theta & 0 \\ 0 & 0 & -1 \end{pmatrix}^2 \ne I$ for θ ≠ 0, π.

♥ 5.3.4. (a) By direct computation using $\sin^2 \alpha + \cos^2 \alpha = 1$, we find $Q^T Q = I$ and det Q = +1.
(b) $Q^{-1} = Q^T = \begin{pmatrix} \cos\varphi \cos\psi - \cos\theta \sin\varphi \sin\psi & - \cos\varphi \sin\psi - \cos\theta \sin\varphi \cos\psi & \sin\theta \sin\varphi \\ \sin\varphi \cos\psi + \cos\theta \cos\varphi \sin\psi & - \sin\varphi \sin\psi + \cos\theta \cos\varphi \cos\psi & - \sin\theta \cos\varphi \\ \sin\theta \sin\psi & \sin\theta \cos\psi & \cos\theta \end{pmatrix}$.

♥ 5.3.5. (a) By a long direct computation, we find $Q^T Q = (y_1^2 + y_2^2 + y_3^2 + y_4^2)^2\, I$ and $\det Q = (y_1^2 + y_2^2 + y_3^2 + y_4^2)^3 = 1$.
(b) $Q^{-1} = Q^T = \begin{pmatrix} y_1^2 + y_2^2 - y_3^2 - y_4^2 & 2\,(y_2 y_3 - y_1 y_4) & 2\,(y_2 y_4 + y_1 y_3) \\ 2\,(y_2 y_3 + y_1 y_4) & y_1^2 - y_2^2 + y_3^2 - y_4^2 & 2\,(y_3 y_4 - y_1 y_2) \\ 2\,(y_2 y_4 - y_1 y_3) & 2\,(y_3 y_4 + y_1 y_2) & y_1^2 - y_2^2 - y_3^2 + y_4^2 \end{pmatrix}$;
(c) These follow by direct computation using standard trigonometric identities, e.g., the (1, 1) entry is
$y_1^2 + y_2^2 - y_3^2 - y_4^2 = \cos^2 \tfrac{\varphi + \psi}{2} \cos^2 \tfrac{\theta}{2} + \cos^2 \tfrac{\varphi - \psi}{2} \sin^2 \tfrac{\theta}{2} - \sin^2 \tfrac{\varphi - \psi}{2} \sin^2 \tfrac{\theta}{2} - \sin^2 \tfrac{\varphi + \psi}{2} \cos^2 \tfrac{\theta}{2}$
$= \cos(\varphi + \psi) \cos^2 \tfrac{\theta}{2} + \cos(\varphi - \psi) \sin^2 \tfrac{\theta}{2}$
$= \cos\varphi \cos\psi\, \bigl( \cos^2 \tfrac{\theta}{2} + \sin^2 \tfrac{\theta}{2} \bigr) - \sin\varphi \sin\psi\, \bigl( \cos^2 \tfrac{\theta}{2} - \sin^2 \tfrac{\theta}{2} \bigr)$
$= \cos\varphi \cos\psi - \cos\theta \sin\varphi \sin\psi.$

5.3.6. Since the rows of Q are orthonormal (see Exercise 5.3.8), so are the rows of R and hence R is also an orthogonal matrix. Moreover, interchanging two rows changes the sign of the determinant, and so if det Q = +1, then det R = −1.

5.3.7. In general, $\det(Q_1 Q_2) = \det Q_1 \det Q_2$. If both determinants are +1, so is their product. Improper times proper is improper, while improper times improper is proper.

♦ 5.3.8. (a) Use (5.30) to show $(Q^T)^{-1} = Q = (Q^T)^T$. (b) The rows of Q are the columns of $Q^T$, and hence since $Q^T$ is an orthogonal matrix, the rows of Q must form an orthonormal basis.

5.3.9. $(Q^{-1})^T = (Q^T)^T = Q = (Q^{-1})^{-1}$, proving orthogonality.

5.3.10. (a) False: they must be an orthonormal basis. (b) True, since then $Q^T$ has orthonormal basis columns, and so is orthogonal. Exercise 5.3.8 then implies that $Q = (Q^T)^T$ is also orthogonal. (c) False. For example, $\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$ is symmetric and orthogonal.

5.3.11. All diagonal matrices whose diagonal entries are ± 1.

5.3.12. Let $U = ( u_1\ u_2\ \ldots\ u_n )$, where the last n − j entries of the j-th column $u_j$ are zero. Since $\| u_1 \| = 1$, $u_1 = ( \pm 1, 0, \ldots , 0 )^T$. Next, $0 = u_1 \cdot u_j = \pm\, u_{1,j}$ for $j \ne 1$, and so all non-diagonal entries in the first row of U are zero; in particular, since $\| u_2 \| = 1$, $u_2 = ( 0, \pm 1, 0, \ldots , 0 )^T$. Then, $0 = u_2 \cdot u_j = \pm\, u_{2,j}$, $j \ne 2$, and so all non-diagonal entries in the second row of U are zero; in particular, since $\| u_3 \| = 1$, $u_3 = ( 0, 0, \pm 1, 0, \ldots , 0 )^T$. The process continues in this manner, eventually proving that U is a diagonal matrix whose diagonal entries are ± 1.

5.3.13. (a) Note that $P^2 = I$ and $P = P^T$, proving orthogonality. Moreover, det P = −1 since P can be obtained from the identity matrix I by interchanging two rows. (b) Only the matrices corresponding to multiplying a row by −1.

5.3.14. False. This is true only for row interchanges or multiplication of a row by −1.

5.3.15. (a) The columns of P are the standard basis vectors $e_1 , \ldots , e_n$, rewritten in a different order, which doesn’t affect their orthonormality. (b) Exactly half are proper, so there are $\tfrac12\, n!$ proper permutation matrices.

♦ 5.3.16. (a) $\| Q x \|^2 = (Q x)^T Q x = x^T Q^T Q x = x^T I x = x^T x = \| x \|^2$. (b) According to Exercise 3.4.19, since both $Q^T Q$ and I are symmetric matrices, the equation in part (a) holds for all x if and only if $Q^T Q = I$.

♥ 5.3.17. (a) $Q^T Q = ( I - 2\, u u^T )^T ( I - 2\, u u^T ) = I - 4\, u u^T + 4\, u u^T u u^T = I$, since $\| u \|^2 = u^T u = 1$ by assumption.
(b) (i) $\begin{pmatrix} -1 & 0 \\ 0 & 1 \end{pmatrix}$, (ii) $\begin{pmatrix} \tfrac{7}{25} & - \tfrac{24}{25} \\ - \tfrac{24}{25} & - \tfrac{7}{25} \end{pmatrix}$, (iii) $\begin{pmatrix} 1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & 1 \end{pmatrix}$, (iv) $\begin{pmatrix} 0 & 0 & -1 \\ 0 & 1 & 0 \\ -1 & 0 & 0 \end{pmatrix}$.
(c) (i) $v = c \begin{pmatrix} 0 \\ 1 \end{pmatrix}$, (ii) $v = c \begin{pmatrix} -4 \\ 3 \end{pmatrix}$, (iii) $v = c \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix} + d \begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix}$, (iv) $v = c \begin{pmatrix} -1 \\ 0 \\ 1 \end{pmatrix} + d \begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix}$.
In general, Q v = v if and only if v is orthogonal to u.

♦ 5.3.18. $Q^T = ( I + A )^T ( I - A )^{-T} = ( I + A^T )( I - A^T )^{-1} = ( I - A )( I + A )^{-1} = Q^{-1}$. To prove that I − A is invertible, suppose ( I − A ) v = 0, so A v = v. Multiplying by $v^T$ and using Exercise 1.6.29(f) gives $0 = v^T A v = \| v \|^2$, proving v = 0 and hence ker( I − A ) = {0}.

5.3.19. (a) If $S = ( v_1\ v_2\ \ldots\ v_n )$, then $S^{-1} = D\, S^T$, where $D = \operatorname{diag}\,( 1/\| v_1 \|^2 , \ldots , 1/\| v_n \|^2 )$.
(b) $\begin{pmatrix} 1 & 1 & 1 & 0 \\ 1 & 1 & -1 & 0 \\ 1 & -1 & 0 & 1 \\ 1 & -1 & 0 & -1 \end{pmatrix}^{-1} = \begin{pmatrix} \tfrac14 & 0 & 0 & 0 \\ 0 & \tfrac14 & 0 & 0 \\ 0 & 0 & \tfrac12 & 0 \\ 0 & 0 & 0 & \tfrac12 \end{pmatrix} \begin{pmatrix} 1 & 1 & 1 & 1 \\ 1 & 1 & -1 & -1 \\ 1 & -1 & 0 & 0 \\ 0 & 0 & 1 & -1 \end{pmatrix} = \begin{pmatrix} \tfrac14 & \tfrac14 & \tfrac14 & \tfrac14 \\ \tfrac14 & \tfrac14 & - \tfrac14 & - \tfrac14 \\ \tfrac12 & - \tfrac12 & 0 & 0 \\ 0 & 0 & \tfrac12 & - \tfrac12 \end{pmatrix}$.
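The elementary reflection matrices of Exercise 5.3.17 have the form $Q = I - 2\, u u^T$ for a unit vector u. The sketch below builds one exactly with rational arithmetic and checks the three properties used above: Q is its own inverse, Q u = −u, and Q fixes vectors orthogonal to u. The unit vector u = (3/5, 4/5) is an assumption, chosen because it reproduces the matrix listed in (b)(ii); the helper names are hypothetical.

```python
from fractions import Fraction as F

def householder(u):
    """Return Q = I - 2 u u^T as a list of rows; u is assumed to be a unit vector."""
    n = len(u)
    return [[(1 if i == j else 0) - 2 * u[i] * u[j] for j in range(n)] for i in range(n)]

def apply(Q, v):
    """Matrix-vector product Q v."""
    return [sum(q * x for q, x in zip(row, v)) for row in Q]

u = [F(3, 5), F(4, 5)]        # assumed unit vector
Q = householder(u)

print(Q)
print(apply(Q, [-4, 3]))      # a vector orthogonal to u is left unchanged
print(apply(Q, u))            # u itself is reflected to -u
```

Because Q is symmetric, checking that applying it twice returns the original vector is the same as checking $Q^T Q = I$.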

♦ 5.3.20. Set $A = ( v_1\ v_2\ \ldots\ v_n )$, $B = ( w_1\ w_2\ \ldots\ w_n )$. The dot products are the same if and only if the two Gram matrices are the same: $A^T A = B^T B$. Therefore, $Q = B A^{-1} = B^{-T} A^T$ satisfies $Q^T = A^{-T} B^T = Q^{-1}$, and hence Q is an orthogonal matrix. The resulting matrix equation B = Q A is the same as the vector equations $w_i = Q v_i$ for i = 1, . . . , n.

5.3.21. (a) The (i, j) entry of $Q^T Q$ is the product of the i-th row of $Q^T$ times the j-th column of Q, namely $u_i^T u_j = u_i \cdot u_j = \begin{cases} 1, & i = j, \\ 0, & i \ne j, \end{cases}$ and hence $Q^T Q = I$. (b) No. For instance, if $Q = u = \begin{pmatrix} \tfrac{1}{\sqrt2} \\ \tfrac{1}{\sqrt2} \end{pmatrix}$, then $Q Q^T = \begin{pmatrix} \tfrac12 & \tfrac12 \\ \tfrac12 & \tfrac12 \end{pmatrix}$ is not a 2 × 2 identity matrix.

5.3.22. (a) Assuming the columns are nonzero, Proposition 5.4 implies they are linearly independent. But there can be at most m linearly independent vectors in $\mathbb{R}^m$, so n ≤ m. (b) The (i, j) entry of $A^T A$ is the dot product $v_i^T v_j = v_i \cdot v_j$ of the i-th and j-th columns of A, and so by orthogonality this is zero if $i \ne j$. The i-th diagonal entry is the squared Euclidean norm of the i-th column, $\| v_i \|^2$.
(c) Not necessarily. If $A = \begin{pmatrix} 1 & 1 \\ 1 & -1 \\ 0 & 2 \end{pmatrix}$, then $A^T A = \begin{pmatrix} 2 & 0 \\ 0 & 6 \end{pmatrix}$, but $A A^T = \begin{pmatrix} 2 & 0 & 2 \\ 0 & 2 & -2 \\ 2 & -2 & 4 \end{pmatrix}$.

♦ 5.3.23. If $S = ( v_1\ v_2\ \ldots\ v_n )$, then the (i, j) entry of $S^T K S$ is $v_i^T K v_j = \langle v_i , v_j \rangle$, so $S^T K S = I$ if and only if $\langle v_i , v_j \rangle = 0$ for $i \ne j$, while $\langle v_i , v_i \rangle = \| v_i \|^2 = 1$.

♥ 5.3.24. (a) Given any A ∈ G, we have $A^{-1} \in G$, and hence the product $A A^{-1} = I \in G$ also.
(b) (i) If A, B are nonsingular, so are A B and $A^{-1}$, with $(A B)^{-1} = B^{-1} A^{-1}$, $(A^{-1})^{-1} = A$.
(ii) The product of two upper triangular matrices is upper triangular, as is the inverse of any nonsingular upper triangular matrix.
(iii) If det A = 1 = det B, then A, B are nonsingular; det(A B) = det A det B = 1 and $\det(A^{-1}) = 1/\det A = 1$.
(iv) If P, Q are orthogonal matrices, so $P^{-1} = P^T$, $Q^{-1} = Q^T$, then $(P Q)^{-1} = Q^{-1} P^{-1} = Q^T P^T = (P Q)^T$, and $(Q^{-1})^{-1} = Q = (Q^T)^T = (Q^{-1})^T$, so both P Q and $Q^{-1}$ are orthogonal matrices.
(v) According to part (iv), the product and inverse of orthogonal matrices are also orthogonal. Moreover, by part (iii), the product and inverse of matrices with determinant 1 also have determinant 1. Therefore, the product and inverse of proper orthogonal matrices are proper orthogonal.
(vi) The inverse of a permutation matrix is a permutation matrix, as is the product of two permutation matrices.
(vii) If $A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$, $B = \begin{pmatrix} x & y \\ z & w \end{pmatrix}$ have integer entries with det A = a d − b c = 1, det B = x w − y z = 1, then the product $A B = \begin{pmatrix} a x + b z & a y + b w \\ c x + d z & c y + d w \end{pmatrix}$ also has integer entries and determinant det(A B) = det A det B = 1. Moreover, $A^{-1} = \begin{pmatrix} d & -b \\ -c & a \end{pmatrix}$ also has integer entries and determinant $\det(A^{-1}) = 1/\det A = 1$.
(c) Because the inverse of a matrix with integer entries does not necessarily have integer entries. For instance, $\begin{pmatrix} 1 & 1 \\ -1 & 1 \end{pmatrix}^{-1} = \begin{pmatrix} \tfrac12 & - \tfrac12 \\ \tfrac12 & \tfrac12 \end{pmatrix}$.
(d) No, because the product of two positive definite matrices is not necessarily symmetric, let alone positive definite.

♥ 5.3.25. (a) The defining equation $U^\dagger U = I$ implies $U^{-1} = U^\dagger$.
(b) (i) $U^{-1} = U^\dagger = \begin{pmatrix} \tfrac{1}{\sqrt2} & - \tfrac{i}{\sqrt2} \\ - \tfrac{i}{\sqrt2} & \tfrac{1}{\sqrt2} \end{pmatrix}$,
(ii) $U^{-1} = U^\dagger = \begin{pmatrix} \tfrac{1}{\sqrt3} & \tfrac{1}{\sqrt3} & \tfrac{1}{\sqrt3} \\ \tfrac{1}{\sqrt3} & - \tfrac{1}{2\sqrt3} - \tfrac{i}{2} & - \tfrac{1}{2\sqrt3} + \tfrac{i}{2} \\ \tfrac{1}{\sqrt3} & - \tfrac{1}{2\sqrt3} + \tfrac{i}{2} & - \tfrac{1}{2\sqrt3} - \tfrac{i}{2} \end{pmatrix}$,
(iii) $U^{-1} = U^\dagger = \begin{pmatrix} \tfrac12 & \tfrac12 & \tfrac12 & \tfrac12 \\ \tfrac12 & - \tfrac{i}{2} & - \tfrac12 & \tfrac{i}{2} \\ \tfrac12 & - \tfrac12 & \tfrac12 & - \tfrac12 \\ \tfrac12 & \tfrac{i}{2} & - \tfrac12 & - \tfrac{i}{2} \end{pmatrix}$.
(c) (i) No, (ii) yes, (iii) yes.
(d) Let $u_1 , \ldots , u_n$ denote the columns of U. The i-th row of $U^\dagger$ is the complex conjugate of the i-th column of U, and so the (i, j) entry of $U^\dagger U$ is $\overline{u_i}^{\,T} u_j = u_j \cdot u_i$, i.e., the Hermitian dot product of the column vectors. Thus, $U^\dagger U = I$ if and only if $u_1 , \ldots , u_n$ form an orthonormal basis of $\mathbb{C}^n$.
(e) Note first that $(U V)^\dagger = V^\dagger U^\dagger$. If $U^{-1} = U^\dagger$, $V^{-1} = V^\dagger$, then $(U V)^{-1} = V^{-1} U^{-1} = V^\dagger U^\dagger = (U V)^\dagger$, so U V is also unitary. Also $(U^{-1})^{-1} = U = (U^\dagger)^\dagger = (U^{-1})^\dagger$, and so $U^{-1}$ is also unitary.

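5.3.26. $\begin{pmatrix} 2 & 0 & 1 \\ 2 & 4 & 2 \\ -1 & -1 & -3 \end{pmatrix} = \begin{pmatrix} \tfrac23 & - \tfrac{1}{\sqrt2} & - \tfrac{1}{3\sqrt2} \\ \tfrac23 & \tfrac{1}{\sqrt2} & - \tfrac{1}{3\sqrt2} \\ - \tfrac13 & 0 & - \tfrac{2\sqrt2}{3} \end{pmatrix} \begin{pmatrix} 3 & 3 & 3 \\ 0 & 2\sqrt2 & \tfrac{1}{\sqrt2} \\ 0 & 0 & \tfrac{3}{\sqrt2} \end{pmatrix}$.

5.3.27. (a) $\begin{pmatrix} 1 & -3 \\ 2 & 1 \end{pmatrix} = \begin{pmatrix} \tfrac{1}{\sqrt5} & - \tfrac{2}{\sqrt5} \\ \tfrac{2}{\sqrt5} & \tfrac{1}{\sqrt5} \end{pmatrix} \begin{pmatrix} \sqrt5 & - \tfrac{1}{\sqrt5} \\ 0 & \tfrac{7}{\sqrt5} \end{pmatrix}$;
(b) $\begin{pmatrix} 4 & 3 \\ 3 & 2 \end{pmatrix} = \begin{pmatrix} \tfrac45 & \tfrac35 \\ \tfrac35 & - \tfrac45 \end{pmatrix} \begin{pmatrix} 5 & \tfrac{18}{5} \\ 0 & \tfrac15 \end{pmatrix}$;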
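A Q R factorization of this kind can be obtained by applying Gram–Schmidt to the columns of A: the orthonormal vectors become the columns of Q, and the projection coefficients fill the upper triangular matrix R. The following is a minimal floating-point sketch for square matrices with independent columns (the function name `qr_gram_schmidt` is hypothetical), checked against 5.3.27 (a) and (b).

```python
import math

def qr_gram_schmidt(A):
    """Q R factorization of a square matrix A (list of rows) with independent columns."""
    n = len(A)
    cols = [[float(A[i][j]) for i in range(n)] for j in range(n)]
    q_cols = []
    R = [[0.0] * n for _ in range(n)]
    for j, a in enumerate(cols):
        v = list(a)
        for i, q in enumerate(q_cols):
            R[i][j] = sum(x * y for x, y in zip(a, q))      # r_ij = a_j . q_i
            v = [x - R[i][j] * y for x, y in zip(v, q)]
        R[j][j] = math.sqrt(sum(x * x for x in v))          # length of what remains
        q_cols.append([x / R[j][j] for x in v])
    Q = [[q_cols[j][i] for j in range(n)] for i in range(n)]
    return Q, R

Qa, Ra = qr_gram_schmidt([[1, -3], [2, 1]])
Qb, Rb = qr_gram_schmidt([[4, 3], [3, 2]])
print(Qa, Ra)
print(Qb, Rb)
```

The diagonal of R is positive by construction here; other methods, such as Householder reflections, can produce factors that differ from these by signs.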

(c)

0 B B @

2 0 −1

1 1 −1

B B @

0 −1 −1

1 1 1

0 0 −1

0 4 0

1

1 1 2 1

(d)

0

(e)

0

(f )

B B @

0

1 2 1 0

B B B1 B B1 @

1

(ii) (a)

0

1 −1 2

2

(iii) (a)

♠ 5.3.29.

0

4 B @1 0 0 4 1 B B1 4 B @0 1 0 0 0 4 1 0 B1 4 1 B B B0 1 4 B @0 0 1 0 0 0

1

1 4 1 0 1 4 1 0 0 1 4 1

1

B B B B B @

√2 5

0

0 2C B B √1 − B C 1A = B 2 @ 3 − √1

1 0

1

0

2C B 0 B 1C 0 A=@ 1 −1 0 1

1C B B B B 0C C C=B B B 1C A B @ 1

2 3

0

2

0

1 2 1 2 1 2 1 2

1 0√ √1 5 6 C q 30 C B B C 1 5 √ C B 0 B 6 @ q6 C q 2 2A − 15 0 3 1 0√ √ 0 C 2 − 2 B − √1 C C B 0 1 C @ 2 A √1 0 0 2 1 0 1

− √1

√1 5

1 0 0 −1

B B− 1 @

0

−

1 0 −1

B B1 @

0

−1 C 3C A= 1

1

5.3.28. (i) (a)

1

0 1C B1 B 1 0C A @0 0 0 0 1 0 − √

√1 2

2 3 1 √ 2 3 √ 3 2 1 − √ 2 3

−

0 − √1

2

0 −1 C 4 1C A; 0 q 21 0 2 3 1 −√ 6

0 − √1

6

C C C C C C C C A

2 B B B0 B

B B B0 B @

0

√3 q5 6 5

− √3

1

q 5C C 2 C; 7 15 C q A 2 2 3 1

0 √ −2 2 C 2 C; √ A 2

2 √ 2 0 0

5 2

0

√

3 2

0

3 2 − √1 2 1 √ 2 3 √1 6

1

C C C C C. C C C A

1 0√ 1 0 1 ! √1 √1 √1 2 − − 75 A x B 2 2C B 2C @ (b) = ; =@ A @ A; 1 y √1 √5 − √1 0 5 2 1 2 02 1 2 0 0 1 1 1 0 1 √ √1 − 3 3 0 2 1 −1 C B 2 3 2C x √ C B B C B √ √ C C 2 2 C B C B1 ; (b) B @yA = B 2 −2 2 C 2C 0 C @0 @ −1 A; A=B3 3 √ A A @ z 1 2 2 −1 0 0 3 √ √1 3 − 2 −3 2 √ 1 1 0 0 1 0 1 √1 √1 √1 0 1 − √1 2 √1 1 − 2 6 3C B 2 2 0C B C x q B 2C C B C B C C B 3 B √1 √1 √1 C B 0 √1 C (b) B @ y A = B 1 C. 1C C B A = B− 2 2 − 6 C; 6 3 @ 2A q A @ A @ z 1 1 √2 0 0 0 − 23 √1 2 3 3

!

0

0

0 .9701 B 1C A = @ .2425 4 0 1 0 0 .9701 B 0C C B .2425 C=B 1A @ 0 4 0 0 1 0 .9701 B .2425 0C B C B C=B 0 0C C B 1A @ 0 4 0

−.2339 .9354 .2650 −.2339 .9354 .2650 0 −.2339 .9354 .2650 0 0

10

1

.0643 4.1231 1.9403 .2425 B −.2571 C 0 3.773 1.9956 C A@ A, .9642 0 0 3.5998 10 4.1231 1.9403 .2425 .0619 −.0172 B −.2477 .0688 C 0 3.773 1.9956 CB CB .9291 −.2581 A@ 0 0 3.7361 0 0 0 .2677 .9635 1 .0619 −.0166 .0046 −.2477 .0663 −.0184 C C C .9291 −.2486 .0691 C C .2677 .9283 −.2581 A 0 .2679 .9634 0 1 4.1231 1.9403 .2425 0 0 B 0 3.773 1.9956 .2650 0 C B C B C B C. 0 0 3.7361 1.9997 .2677 B C @ 0 0 0 3.7324 2.0000 A 0 0 0 0 3.5956


1

0 .2650 C C C, 1.9997 A 3.596

5.3.30. :

5.3.27 (a)

!

−1.2361 , 2.0000

b = v 1

.4472 .8944

Q=

(b)

(c)

(d)

(e)

(f )

b = v 1

!

−1 , 3 0

H1 =

.8 .6

1

H1 = !

.8944 , −.4472 !

.6 , −.8

Q=

0

.8944 , −.4472

!

2.2361 0

R= .8 .6

!

.4472 .8944

!

.6 , −.8

−.4472 ; −3.1305 R=

0

0

1

1

1

−1.3416 2.556 C A; 1.633

−1.4142 0 −.7071 −.7071 B C b = −1 −.7071 .5 −.5 C v , H = @ A A, 1 1 −1 −.7071 −.5 .5 0 1 0 1 0 1 0 0 B C B b = @ −1.7071 A, v H2 = @ 0 −.7071 −.7071 C A 2 −.7071 0 −.7071 .7071 1 0 0 1 0 1 0 1.4142 −1.4142 −2.8284 B B C Q = @ −.7071 0 −.7071 A, R = @ 0 1 2 C A; −.7071 0 .7071 0 0 1.4142 B @

0

1

0

1

0

1

0

1

0 0 0 −1 −1 C b =B b = 0C 0C v H1 = B v @ 0 A, @ 0 1 A, A, 2 1 0 −1 00 0 −1 0 1 1 1 0 −1 0 0 −1 1C 0C R=B Q=B @0 4 @ 0 1 A; A, 0 0 −2 −1 0 0 B @

−1 B 1C C B b =B C, v 1 @ 1A 1 1 0 0 B −.4142 C C b =B C, B v 2 @ 0 A −1 1 0 0 B 0 C C b =B B C, v 3 @ −.3660 A .7071 0 .5 0 B .7071 B .5 Q=B @ .5 0 .5 −.7071

!

3.6 ; .2

1

−.2361 .8944 0 −.4472 C B b = v 0 = , H 0 1 0 C A @ A, 1 1 −1 −.4472 0 −.8944 0 1 0 1 0 1 0 0 B C B b = @ −.0954 A, v H2 = @ 0 .9129 .4082 C A, 2 .4472 0 .4082 −.9129 0 1 0 .8944 −.1826 .4082 2.2361 1.3416 Q=B 0 .9129 .4082 C R=B 0 1.0954 @ A, @ −.4472 −.3651 .8165 0 0 B @

5 0

H2 =

0

1

B @0

0

0 1 0

1

0 0C A, 1

1

0

.5 .5 .5 .5 B .5 −.5 −.5 C C B .5 C, H1 = B @ .5 −.5 .5 −.5 A .5 −.5 −.5 .5 1 0 1 0 0 0 B 0 .7071 0 −.7071 C C C, B H2 = B @0 0 1 0 A 0 −.7071 0 −.7071 0 1 1 0 0 0 B 0 1 0 0 C C B C, H3 = B @ 0 0 .5774 .8165 A 0 0 1.8165 −.5774 0 2 2 −.2887 .8165 C B −.2887 −.4082 C B 0 1.4142 C, R=B @0 0 .866 0 A 0 0 −.2887 −.4082


2.5 0 .866 0

1

1.5 −.7071 C C C; .2887 A .4082

5.3.29:

3 × 3 case:

.9701 Q=B @ .2425 0

−.2339 .9354 .2650

.9701 B B .2425 Q=B @ 0 0

0

1

−.1231 1 C C b = C, v 1 0 A 0 1 0 0 B −7.411 C C b =B C, B v 2 @ 1 A 0 1 0 0 B 0 C C b =B C, B v 3 @ −.1363 A 1 −.2339 .9354 .2650 0

5 × 5 case:

b v 1

b v 2

b v 3

b v 4

0

1

.0619 −.2477 .9291 .2677

0

1

4.1231 0 R=B @ 0

H1 =

H2 =

0 0

1

0 1 B B0 H3 = B @0 0 0

1

H1

H2

H3

H4

.2425 0 0 −.9701 0 0 C C C, 0 1 0A 0 0 0 1 1 0 0 0 −.9642 .2650 0 C C C, .2650 .9642 0 A 0 0 1 1 0 0 0 1 0 0 C C C, 0 .9635 .2677 A 0 .2677 −.9635 0

4.1231 B 0 B R=B @ 0 0 0

.2425 1.9956 C A; 3.5998

1

.9701

B B0 B @0

1

1.9403 3.773 0

B B .2425 B @ 0

.0172 −.0688 C C C, .2581 A −.9635

−.1231 B 1 C B C B C C, =B 0 B C @ 0 A 0 1 0 0 B −7.411 C C B C C, B 1 =B C B @ 0 A 0 1 0 0 B 0 C C B C B −.1363 C, =B C B @ 1 A 0 1 0 0 C B 0 C B C B C, 0 =B C B @ −7.3284 A 1

1

.9701 .2425 0 C H1 = B @ .2425 −.9701 0 A, 0 0 1 1 0 1 0 0 C H2 = B @ 0 −.9642 .2650 A 0 .2650 .9642

.0643 −.2571 C A, .9642

B B B @

0

0

1

−.1231 b =B 1 C v @ A, 1 0 0 1 0 C b =B v @ −7.411 A, 2 1

0

4 × 4 case:

0

1.9403 3.773 0 0

.2425 1.9956 3.7361 0 1

.9701 .2425 0 0 0 B .2425 −.9701 0 0 0 C B C C B 0 C, =B 0 1 0 0 B C @ 0 0 0 1 0A 0 0 0 0 1 1 0 1 0 0 0 0 B 0 −.9642 .2650 0 0 C C B C C, B0 .2650 .9642 0 0 =B C B @0 0 0 1 0A 0 0 0 0 1 1 0 1 0 0 0 0 B0 1 0 0 0C C B C C, B 0 0 .9635 .2677 0 =B C B @ 0 0 .2677 −.9635 0 A 0 0 0 0 1 1 0 1 0 0 0 0 B0 1 0 0 0 C C B B C, 0 0 1 0 0 C =B C B @ 0 0 0 −.9634 .2679 A 0 0 0 .2679 .9634


1

0 .2650 C C C; 1.9997 A −3.596

$$Q = \begin{pmatrix} .9701 & -.2339 & .0619 & -.0166 & .0046 \\ .2425 & .9354 & -.2477 & .0663 & -.0184 \\ 0 & .2650 & .9291 & -.2486 & .0691 \\ 0 & 0 & .2677 & .9283 & -.2581 \\ 0 & 0 & 0 & .2679 & .9634 \end{pmatrix}, \qquad R = \begin{pmatrix} 4.1231 & 1.9403 & .2425 & 0 & 0 \\ 0 & 3.773 & 1.9956 & .2650 & 0 \\ 0 & 0 & 3.7361 & 1.9997 & .2677 \\ 0 & 0 & 0 & 3.7324 & 2.0000 \\ 0 & 0 & 0 & 0 & 3.5956 \end{pmatrix}.$$

♥ 5.3.31.
(a) Q R factorization requires $n^3 + n^2$ multiplication/divisions, $n$ square roots, and $n^3 - \frac12 n^2 - \frac12 n$ addition/subtractions.
(b) Multiplication of $Q^T b$ requires an additional $n^2$ multiplication/divisions and $n^2 - n$ addition/subtractions. Solving $R x = Q^T b$ by Back Substitution requires $\frac12 n^2 + \frac12 n$ multiplication/divisions and $\frac12 n^2 - \frac12 n$ addition/subtractions.
(c) The Q R method requires approximately 3 times as much computational effort as Gaussian Elimination.

♦ 5.3.32. If $Q R = \widetilde Q \widetilde R$, then $\widetilde Q^{-1} Q = \widetilde R R^{-1}$. The left hand side is orthogonal, while the right hand side is upper triangular. Thus, by Exercise 5.3.12, both sides must be diagonal with $\pm 1$ on the diagonal. Positivity of the diagonal entries of $R$ and $\widetilde R$ implies positivity of those of $\widetilde R R^{-1}$, and hence $\widetilde Q^{-1} Q = \widetilde R R^{-1} = I$, which implies $Q = \widetilde Q$ and $R = \widetilde R$.

♥ 5.3.33.
(a) If rank $A = n$, then the columns $w_1, \dots, w_n$ of $A$ are linearly independent, and so form a basis for its range. Applying the Gram–Schmidt process converts the column basis $w_1, \dots, w_n$ to an orthonormal basis $u_1, \dots, u_n$ of rng $A$.
(b) In this case, for the same reason as in (5.23), we can write
$$\begin{aligned} w_1 &= r_{11} u_1, \\ w_2 &= r_{12} u_1 + r_{22} u_2, \\ w_3 &= r_{13} u_1 + r_{23} u_2 + r_{33} u_3, \\ &\ \ \vdots \\ w_n &= r_{1n} u_1 + r_{2n} u_2 + \cdots + r_{nn} u_n. \end{aligned}$$
The result is equivalent to the factorization $A = Q R$ where $Q = (u_1, \dots, u_n)$, and $R = (r_{ij})$ is nonsingular since its diagonal entries are non-zero: $r_{ii} \neq 0$.
(c)

1 B (i) B @2 0

1

0

√1

−1 C B B 5 2 C = 3A B B√ @ 5 2 0 (iii)

0 B B B B B @

− 32

−1 1 −1 1

1

C C 1C 3C A 2 3

√ 5 0 1

√

!

5 ; 3

0

1C B − B −2 C B C C=B B 2C @ − A −1

1 2 1 2 1 2 1 2

(ii) − 21 − 21

0 B B @

1

C C C C 1C 2A 1 2


−3 0 4 2 0

1

0

1C B B B 1C A=B @ 2 !

−3 ; 1

−

3 5

0 4 5

8 √ 5 5 √1 5 6 √ 5 5

1 C C C C A

5 0

!

1 √ ; 5

(iv )

0 B B B B B @

0 −2 −1 2

1 1 0 1

1

0

−1 C B B B − B 3C C C=B B B − −2 C A B @ −2

0 2 3 1 3 2 3

√1 3 √1 3

0 √1 3

3 √ 7 2 11 √ 21√2 − 1321 2 √ − 212

−

1 C C C C C C C C A

0

B3 B B0 @ 0

0 √ 3 0

− 38 0 √

7 2 3

1

C C C. A

(d) The columns of A are linearly dependent, and so the algorithm breaks down, as in Exercise 5.2.3. ♥ 5.3.34. (a) If A = ( w1 w2 . . . wn ), then U = ( u1 u2 . . . un ) has orthonormal columns and hence is a unitary matrix. The Gram-Schmidt process takes the same form: w1 = r11 u1 , w2 = r12 u1 + r22 u2 , w3 = r13 u1 + r23 u2 + r33 u3 , .. .. .. .. . . . . wn = r1n u1 + r2n u2 + · · · + rnn un , which is equivalent 0 to the factorization 1 0A √ = U R. ! 3i 1 i √ −√ C B 2 −√ i 1 2 2 2 =B (b) (i) @ A @ i 1 1 √ √ √ − 1 2i − 0 (ii)

(iii)

0 B B @

0

1+ i 1− i i 1 0

1 i 1

i B (iv ) B 1 − i @ −1

1

C A;

2 2 1 0 2 ! i 1 i 1 + − 2 1 − 2i 2− i 2 2 2 2 A @ ; = 1 0 1 −i − 2i 21 + 2i 2 1 0√ 0 i 1 1 √ √1 − √i 0 √1 2 3 6 0C B C B 2 2 C B 1 C √ C B√ √i √1 C B B = ; 1C B 2 C 0 3 q0C A 3 A q6 A @ @ 3 1 2 i √ 0 0 0 i 2 3 0 3 1 i 2−3 i 1 √i 2 6 1 −i C B 3√ 2 C C B i i 1 2 C B1 = 0 1+ i C − + C B A 2 2 2 6 3 A @ 1 2 i 1 1 2 + 3i 1 −2 − √ 2 + 3 3 2 !

0

B2 B B0 @

−1 − 2 i 3

−1 + i 1 − 3i √

1

C C C. A

2 2 0 0 3 (c) Each diagonal entry of R can be multiplied by any complex number of modulus 1. Thus, requiring them all to be real and positive will imply uniqueness of the U R factorization. The proof of uniqueness is modeled on the real version in Exercise 5.3.32.

5.3.35. Householder’s Method

start
  set R = A
  for j = 1 to n − 1
    for i = 1 to j − 1
      set w_i = 0
    next i
    for i = j to n
      set w_i = r_ij
    next i
    set v = w − ‖w‖ e_j
    if v ≠ 0
      set u_j = v/‖v‖,  r_jj = ‖w‖
      for i = j + 1 to n
        set r_ij = 0
      next i
      for k = j + 1 to n
        for i = j to n
          set r_ik = r_ik − 2 u_i Σ_{l=1}^{n} u_l r_lk
        next i
      next k
    else
      set u_j = 0
    endif
  next j
end
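The pseudocode above can be sketched in Python as follows. This is only an illustrative sketch, not the authors' implementation: the function name `householder_qr` is hypothetical, the matrix is assumed square and stored as a list of rows, and, unlike the pseudocode, the sketch also accumulates the orthogonal factor Q and computes each dot product before overwriting the column it is taken from.

```python
import math

def householder_qr(A):
    """Return (Q, R) with A = Q R, using reflections H = I - 2 u u^T."""
    n = len(A)
    R = [row[:] for row in A]
    Q = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for j in range(n - 1):
        # w = column j of R with the entries above the diagonal set to zero
        w = [0.0] * j + [R[i][j] for i in range(j, n)]
        norm_w = math.sqrt(sum(x * x for x in w))
        v = w[:]
        v[j] -= norm_w                      # v = w - ||w|| e_j
        norm_v = math.sqrt(sum(x * x for x in v))
        if norm_v < 1e-14:                  # column already has the desired form
            continue
        u = [x / norm_v for x in v]
        # R <- H R: form the dot product u . (column k) first, then update
        for k in range(n):
            dot = sum(u[l] * R[l][k] for l in range(n))
            for i in range(n):
                R[i][k] -= 2.0 * u[i] * dot
        # Q <- Q H, so that Q accumulates H_1 H_2 ... H_{n-1}
        for i in range(n):
            dot = sum(Q[i][l] * u[l] for l in range(n))
            for l in range(n):
                Q[i][l] -= 2.0 * dot * u[l]
    return Q, R
```

Applied to the 3 × 3 tridiagonal matrix with 4 on the diagonal and 1 on the off-diagonals, the first step uses w = (4, 1, 0), so the first diagonal entry of R is √17 ≈ 4.1231.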

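5.4.1.
(a) $t^3 = q_3(t) + \frac35\, q_1(t)$, where
$1 = \dfrac{\langle t^3, q_3\rangle}{\|q_3\|^2} = \dfrac{175}{8}\displaystyle\int_{-1}^1 t^3 \left(t^3 - \tfrac35 t\right) dt$,
$0 = \dfrac{\langle t^3, q_2\rangle}{\|q_2\|^2} = \dfrac{45}{8}\displaystyle\int_{-1}^1 t^3 \left(t^2 - \tfrac13\right) dt$,
$\dfrac35 = \dfrac{\langle t^3, q_1\rangle}{\|q_1\|^2} = \dfrac32\displaystyle\int_{-1}^1 t^3\, t\, dt$,
$0 = \dfrac{\langle t^3, q_0\rangle}{\|q_0\|^2} = \dfrac12\displaystyle\int_{-1}^1 t^3\, dt$;

(b) $t^4 + t^2 = q_4(t) + \frac{13}{7}\, q_2(t) + \frac{8}{15}\, q_0(t)$, where
$1 = \dfrac{\langle t^4 + t^2, q_4\rangle}{\|q_4\|^2} = \dfrac{11025}{128}\displaystyle\int_{-1}^1 (t^4 + t^2)\left(t^4 - \tfrac67 t^2 + \tfrac{3}{35}\right) dt$,
$0 = \dfrac{\langle t^4 + t^2, q_3\rangle}{\|q_3\|^2} = \dfrac{175}{8}\displaystyle\int_{-1}^1 (t^4 + t^2)\left(t^3 - \tfrac35 t\right) dt$,
$\dfrac{13}{7} = \dfrac{\langle t^4 + t^2, q_2\rangle}{\|q_2\|^2} = \dfrac{45}{8}\displaystyle\int_{-1}^1 (t^4 + t^2)\left(t^2 - \tfrac13\right) dt$,
$0 = \dfrac{\langle t^4 + t^2, q_1\rangle}{\|q_1\|^2} = \dfrac32\displaystyle\int_{-1}^1 (t^4 + t^2)\, t\, dt$,
$\dfrac{8}{15} = \dfrac{\langle t^4 + t^2, q_0\rangle}{\|q_0\|^2} = \dfrac12\displaystyle\int_{-1}^1 (t^4 + t^2)\, dt$;

(c) $7 t^4 + 2 t^3 - t = 7\, q_4(t) + 2\, q_3(t) + 6\, q_2(t) + \frac15\, q_1(t) + \frac75\, q_0(t)$, where
$7 = \dfrac{\langle 7 t^4 + 2 t^3 - t, q_4\rangle}{\|q_4\|^2} = \dfrac{11025}{128}\displaystyle\int_{-1}^1 (7 t^4 + 2 t^3 - t)\left(t^4 - \tfrac67 t^2 + \tfrac{3}{35}\right) dt$,
$2 = \dfrac{\langle 7 t^4 + 2 t^3 - t, q_3\rangle}{\|q_3\|^2} = \dfrac{175}{8}\displaystyle\int_{-1}^1 (7 t^4 + 2 t^3 - t)\left(t^3 - \tfrac35 t\right) dt$,
$6 = \dfrac{\langle 7 t^4 + 2 t^3 - t, q_2\rangle}{\|q_2\|^2} = \dfrac{45}{8}\displaystyle\int_{-1}^1 (7 t^4 + 2 t^3 - t)\left(t^2 - \tfrac13\right) dt$,
$\dfrac15 = \dfrac{\langle 7 t^4 + 2 t^3 - t, q_1\rangle}{\|q_1\|^2} = \dfrac32\displaystyle\int_{-1}^1 (7 t^4 + 2 t^3 - t)\, t\, dt$,
$\dfrac75 = \dfrac{\langle 7 t^4 + 2 t^3 - t, q_0\rangle}{\|q_0\|^2} = \dfrac12\displaystyle\int_{-1}^1 (7 t^4 + 2 t^3 - t)\, dt$.

5.4.2.
(a) $q_5(t) = t^5 - \frac{10}{9} t^3 + \frac{5}{21} t = \dfrac{5!}{10!} \dfrac{d^5}{dt^5} (t^2 - 1)^5$,
(b) $t^5 = q_5(t) + \frac{10}{9}\, q_3(t) + \frac37\, q_1(t)$,
(c) $q_6(t) = t^6 - \frac{15}{11} t^4 + \frac{5}{11} t^2 - \frac{5}{231} = \dfrac{6!}{12!} \dfrac{d^6}{dt^6} (t^2 - 1)^6$, $\quad t^6 = q_6(t) + \frac{15}{11}\, q_4(t) + \frac57\, q_2(t) + \frac17\, q_0(t)$.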

♦ 5.4.3.
(a) We characterized $q_n(t)$ as the unique monic polynomial of degree $n$ that is orthogonal to $q_0(t), \dots, q_{n-1}(t)$. Since these Legendre polynomials form a basis of $P^{(n-1)}$, this implies that $q_n(t)$ is orthogonal to all polynomials of degree $\le n - 1$; in particular $\langle q_n, t^j \rangle = 0$ for $j = 0, \dots, n - 1$. Conversely, if the latter condition holds, then $q_n(t)$ is orthogonal to every polynomial of degree $\le n - 1$, and, in particular, $\langle q_n, q_j \rangle = 0$, $j = 0, \dots, n - 1$.
(b) Set $q_5(t) = t^5 + c_4 t^4 + c_3 t^3 + c_2 t^2 + c_1 t + c_0$. Then we require
$$\begin{aligned} 0 &= \langle q_5, 1 \rangle = 2 c_0 + \tfrac23 c_2 + \tfrac25 c_4, \\ 0 &= \langle q_5, t \rangle = \tfrac23 c_1 + \tfrac25 c_3 + \tfrac27, \\ 0 &= \langle q_5, t^2 \rangle = \tfrac23 c_0 + \tfrac25 c_2 + \tfrac27 c_4, \\ 0 &= \langle q_5, t^3 \rangle = \tfrac25 c_1 + \tfrac27 c_3 + \tfrac29, \\ 0 &= \langle q_5, t^4 \rangle = \tfrac25 c_0 + \tfrac27 c_2 + \tfrac29 c_4. \end{aligned}$$
The unique solution to this linear system is $c_0 = 0$, $c_1 = \frac{5}{21}$, $c_2 = 0$, $c_3 = -\frac{10}{9}$, $c_4 = 0$, and so $q_5(t) = t^5 - \frac{10}{9} t^3 + \frac{5}{21} t$ is the monic Legendre polynomial of degree 5.
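The linear system in part (b) can be solved exactly with rational arithmetic. The following is a minimal sketch under stated assumptions: the helper names `moment` and `monic_orthogonal_coeffs` are hypothetical, and the inner product is taken to be the L² inner product on [−1, 1], for which ⟨tⁱ, tʲ⟩ = 2/(i + j + 1) when i + j is even and 0 otherwise.

```python
from fractions import Fraction

def moment(k):
    """<t^i, t^j> on [-1, 1] depends only on k = i + j: 2/(k+1) if k is even, else 0."""
    return Fraction(2, k + 1) if k % 2 == 0 else Fraction(0)

def monic_orthogonal_coeffs(n):
    """Solve <q_n, t^j> = 0 (j = 0..n-1) for c_0..c_{n-1} in q_n = t^n + sum c_i t^i."""
    # Augmented matrix: row j encodes  sum_i moment(i + j) c_i = -moment(n + j)
    M = [[moment(i + j) for i in range(n)] + [-moment(n + j)] for j in range(n)]
    for col in range(n):                     # Gauss-Jordan elimination, exact arithmetic
        piv = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[piv] = M[piv], M[col]
        M[col] = [x / M[col][col] for x in M[col]]
        for r in range(n):
            if r != col and M[r][col] != 0:
                factor = M[r][col]
                M[r] = [a - factor * b for a, b in zip(M[r], M[col])]
    return [M[j][n] for j in range(n)]

coeffs = monic_orthogonal_coeffs(5)          # [c_0, c_1, c_2, c_3, c_4]
```

Exact fractions are used so that the result can be compared directly with the values 5/21 and −10/9 quoted in the solution.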

5.4.4. Since even and odd powers of $t$ are orthogonal with respect to the $L^2$ inner product on $[-1, 1]$, when the Gram–Schmidt process is run, only even powers of $t$ will contribute to the even order polynomials, whereas only odd powers of $t$ will contribute to the odd order cases. Alternatively, one can prove this directly from the Rodrigues formula, noting that $(t^2 - 1)^k$ is even, and the derivative of an even (odd) function is odd (even).

5.4.5. Use Exercise 5.4.4 and the fact that $\langle f, g \rangle = \int_{-1}^1 f(t)\, g(t)\, dt = 0$ if $f$ is odd and $g$ is even, since their product is odd.

5.4.6. $q_k(t) = \dfrac{k!}{(2k)!} \dfrac{d^k}{dt^k} (t^2 - 1)^k$, $\qquad \|q_k\| = \dfrac{2^k (k!)^2}{(2k)!} \sqrt{\dfrac{2}{2k + 1}}$.

5.4.7. $Q_k(t) = \dfrac{q_k(t)}{\|q_k\|} = \dfrac{(2k)!}{2^k (k!)^2} \sqrt{\dfrac{2k + 1}{2}}\; q_k(t) = \dfrac{1}{2^k\, k!} \sqrt{\dfrac{2k + 1}{2}}\; \dfrac{d^k}{dt^k} (t^2 - 1)^k$.

♦ 5.4.8. Write $P_k(t) = \dfrac{1}{2^k\, k!} \dfrac{d^k}{dt^k} \left[ (t - 1)^k (t + 1)^k \right]$. Differentiating using Leibniz’ Rule, we conclude the only term that does not contain a factor of $t - 1$ is when all $k$ derivatives are applied to $(t - 1)^k$. Thus, $P_k(t) = \dfrac{1}{2^k} (t + 1)^k + (t - 1)\, S_k(t)$ for some polynomial $S_k(t)$, and so $P_k(1) = 1$.

♥ 5.4.9.
(a) Integrating by parts $k$ times, and noting that the boundary terms are zero by (5.50):
$$\|R_{k,k}\|^2 = \int_{-1}^1 \frac{d^k}{dt^k}(t^2 - 1)^k\, \frac{d^k}{dt^k}(t^2 - 1)^k\, dt = (-1)^k \int_{-1}^1 (t^2 - 1)^k\, \frac{d^{2k}}{dt^{2k}}(t^2 - 1)^k\, dt = (-1)^k (2k)! \int_{-1}^1 (t^2 - 1)^k\, dt.$$
(b) Since $t = \cos\theta$ satisfies $dt = -\sin\theta\, d\theta$, and takes $\theta = 0$ to $t = 1$ and $\theta = \pi$ to $t = -1$, we find
$$\begin{aligned} \int_{-1}^1 (t^2 - 1)^k\, dt &= (-1)^k \int_0^\pi \sin^{2k+1}\theta\, d\theta = (-1)^k\, \frac{2k}{2k + 1} \int_0^\pi \sin^{2k-1}\theta\, d\theta \\ &= (-1)^k\, \frac{2k}{2k + 1}\, \frac{2k - 2}{2k - 1} \int_0^\pi \sin^{2k-3}\theta\, d\theta = \cdots \\ &= (-1)^k\, \frac{(2k)(2k - 2) \cdots 4 \cdot 2}{(2k + 1)(2k - 1) \cdots 5 \cdot 3} \int_0^\pi \sin\theta\, d\theta = (-1)^k\, \frac{2^{2k+1} (k!)^2}{(2k + 1)!}, \end{aligned}$$
where we integrated by parts $k$ times after making the trigonometric change of variables. Combining parts (a,b),
$$\|R_{k,k}\|^2 = \frac{(2k)!\; 2^{2k+1} (k!)^2}{(2k + 1)!} = \frac{2^{2k+1} (k!)^2}{2k + 1}.$$
Thus, by the Rodrigues formula,
$$\|P_k\| = \frac{1}{2^k\, k!}\, \|R_{k,k}\| = \frac{1}{2^k\, k!} \cdot \frac{2^{k + 1/2}\, k!}{\sqrt{2k + 1}} = \sqrt{\frac{2}{2k + 1}}.$$

♥ 5.4.10.
(a) The roots of $P_2(t)$ are $\pm \frac{1}{\sqrt3}$; the roots of $P_3(t)$ are $0, \pm\sqrt{\frac35}$; the roots of $P_4(t)$ are $\pm\sqrt{\dfrac{15 \pm 2\sqrt{30}}{35}}$.
(b) We use induction on $R_{j+1,k} = \dfrac{d}{dt} R_{j,k}$. Differentiation reduces the order of a root by 1. Moreover, Rolle’s Theorem says that at least one root of $f'(t)$ lies strictly between any two roots of $f(t)$. Thus, starting with $R_{0,k}(t) = (1 - t)^k (1 + t)^k$, which has roots of order $k$ at $\pm 1$, we deduce that, for each $j < k$, $R_{j,k}(t)$ has $j$ roots lying between $-1$ and $1$ along with roots of order $k - j$ at $\pm 1$. Since the degree of $R_{j,k}(t)$ is $2k - j$, and the roots at $\pm 1$ have orders $k - j$, the $j$ other roots between $-1$ and $1$ must all be simple. Setting $k = j$, we conclude that $P_k(t)$, which is a constant multiple of $R_{k,k}(t) = \dfrac{d}{dt} R_{k-1,k}(t)$, has $k$ simple roots strictly between $-1$ and $1$.

5.4.11.
(a) $P_0(t) = 1$, $P_1(t) = t - \frac32$, $P_2(t) = t^2 - 3t + \frac{13}{6}$, $P_3(t) = t^3 - \frac92 t^2 + \frac{33}{5} t - \frac{63}{20}$;
(b) $P_0(t) = 1$, $P_1(t) = t - \frac23$, $P_2(t) = t^2 - \frac65 t + \frac{3}{10}$, $P_3(t) = t^3 - \frac{12}{7} t^2 + \frac67 t - \frac{4}{35}$;
(c) $P_0(t) = 1$, $P_1(t) = t$, $P_2(t) = t^2 - \frac35$, $P_3(t) = t^3 - \frac57 t$;
(d) $P_0(t) = 1$, $P_1(t) = t$, $P_2(t) = t^2 - 2$, $P_3(t) = t^3 - 12 t$.

5.4.12. $1$, $\ t - \frac34$, $\ t^2 - \frac43 t + \frac25$, $\ t^3 - \frac{15}{8} t^2 + \frac{15}{14} t - \frac{5}{28}$, $\ t^4 - \frac{12}{5} t^3 + 2 t^2 - \frac23 t + \frac{1}{14}$.

5.4.13. These are the rescaled Legendre polynomials:
$1$, $\ \frac12 t$, $\ \frac38 t^2 - \frac12$, $\ \frac{5}{16} t^3 - \frac34 t$, $\ \frac{35}{128} t^4 - \frac{15}{16} t^2 + \frac38$, $\ \frac{63}{256} t^5 - \frac{35}{32} t^3 + \frac{15}{16} t$.

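5.4.14. Setting $\langle f, g \rangle = \int_0^1 f(t)\, g(t)\, dt$, $\ \|f\|^2 = \int_0^1 f(t)^2\, dt$,
$$\begin{aligned} q_0(t) &= 1 = \widetilde P_0(t), \\ q_1(t) &= t - \frac{\langle t, q_0 \rangle}{\|q_0\|^2}\, q_0(t) = t - \tfrac12 = \tfrac12\, \widetilde P_1(t), \\ q_2(t) &= t^2 - \frac{\langle t^2, q_0 \rangle}{\|q_0\|^2}\, q_0(t) - \frac{\langle t^2, q_1 \rangle}{\|q_1\|^2}\, q_1(t) = t^2 - \frac{1/3}{1} - \frac{1/12}{1/12} \left(t - \tfrac12\right) = t^2 - t + \tfrac16 = \tfrac16\, \widetilde P_2(t), \\ q_3(t) &= t^3 - \frac{1/4}{1} - \frac{3/40}{1/12} \left(t - \tfrac12\right) - \frac{1/120}{1/180} \left(t^2 - t + \tfrac16\right) = t^3 - \tfrac32 t^2 + \tfrac35 t - \tfrac{1}{20} = \tfrac{1}{20}\, \widetilde P_3(t), \\ q_4(t) &= t^4 - \frac{1/5}{1} - \frac{1/15}{1/12} \left(t - \tfrac12\right) - \frac{1/105}{1/180} \left(t^2 - t + \tfrac16\right) - \frac{1/1400}{1/2800} \left(t^3 - \tfrac32 t^2 + \tfrac35 t - \tfrac{1}{20}\right) \\ &= t^4 - 2 t^3 + \tfrac97 t^2 - \tfrac27 t + \tfrac{1}{70} = \tfrac{1}{70}\, \widetilde P_4(t). \end{aligned}$$

5.4.15. $p_0(t) = 1$, $p_1(t) = t - \frac12$, $p_2(t) = t^2 - t + \frac16$, $p_3(t) = t^3 - \frac32 t^2 + \frac{33}{65} t - \frac{1}{260}$.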
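The Gram–Schmidt computation of Exercise 5.4.14 can be reproduced exactly with rational arithmetic. The sketch below is illustrative only: the function names `inner` and `monic_gram_schmidt` are hypothetical, polynomials are stored as coefficient lists with the constant term first, and the inner product is assumed to be the unweighted L² inner product on [0, 1], so that ⟨tⁱ, tʲ⟩ = 1/(i + j + 1).

```python
from fractions import Fraction

def inner(p, q):
    """L2 inner product on [0, 1] of two polynomials (coefficient lists, constant first)."""
    return sum(a * b * Fraction(1, i + j + 1)
               for i, a in enumerate(p) for j, b in enumerate(q))

def monic_gram_schmidt(n):
    """Orthogonalize 1, t, ..., t^n; returns the monic orthogonal polynomials q_0..q_n."""
    qs = []
    for k in range(n + 1):
        mono = [Fraction(0)] * k + [Fraction(1)]       # the monomial t^k
        p = mono[:]
        for q in qs:
            c = inner(mono, q) / inner(q, q)           # <t^k, q_j> / ||q_j||^2
            for i, a in enumerate(q):
                p[i] -= c * a
        qs.append(p)
    return qs

qs = monic_gram_schmidt(4)
```

Here `qs[4]` holds the coefficients of q₄, and `inner(qs[3], qs[3])` gives the squared norm 1/2800 that appears in the last denominator of the q₄ formula.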

♦ 5.4.16. The formula for the norm follows from combining equations (5.48) and (5.59).

5.4.17. $L_4(t) = t^4 - 16 t^3 + 72 t^2 - 96 t + 24$, $\ \|L_4\| = 24$,
$L_5(t) = t^5 - 25 t^4 + 200 t^3 - 600 t^2 + 600 t - 120$, $\ \|L_5\| = 120$.

♦ 5.4.18. This is done by induction on $k$. For $k = 0$, we have $\int_0^\infty e^{-t}\, dt = -e^{-t}\, \Big|_{t=0}^{\infty} = 1$. Integration by parts implies
$$\int_0^\infty t^k e^{-t}\, dt = -t^k e^{-t}\, \Big|_{t=0}^{\infty} + \int_0^\infty k\, t^{k-1} e^{-t}\, dt = k \int_0^\infty t^{k-1} e^{-t}\, dt = k \cdot (k - 1)! = k!$$

♦ 5.4.19. $p_0(t) = 1$, $p_1(t) = t$, $p_2(t) = t^2 - \frac12$, $p_3(t) = t^3 - \frac32 t$, $p_4(t) = t^4 - 3 t^2 + \frac34$.

♥ 5.4.20.
(a) To prove orthogonality, use the change of variables $t = \cos\theta$ in the inner product integral, noting that $dt = -\sin\theta\, d\theta$, and so $d\theta = -\dfrac{dt}{\sqrt{1 - t^2}}$:
$$\langle T_m, T_n \rangle = \int_{-1}^1 \frac{\cos(m \arccos t)\, \cos(n \arccos t)}{\sqrt{1 - t^2}}\, dt = \int_0^\pi \cos m\theta\, \cos n\theta\, d\theta = \begin{cases} \pi, & m = n = 0, \\ \frac12 \pi, & m = n > 0, \\ 0, & m \neq n. \end{cases}$$
(b) $\|T_0\| = \sqrt{\pi}$, $\ \|T_n\| = \sqrt{\frac{\pi}{2}}$ for $n > 0$.
(c) $T_0(t) = 1$, $T_1(t) = t$, $T_2(t) = 2 t^2 - 1$, $T_3(t) = 4 t^3 - 3 t$, $T_4(t) = 8 t^4 - 8 t^2 + 1$, $T_5(t) = 16 t^5 - 20 t^3 + 5 t$, $T_6(t) = 32 t^6 - 48 t^4 + 18 t^2 - 1$.
[Graphs of the Chebyshev polynomials on $-1 \le t \le 1$.]
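The list in part (c) can be regenerated from the standard three-term recurrence T₍ₖ₊₁₎(t) = 2t Tₖ(t) − T₍ₖ₋₁₎(t), which follows from the identity cos((k+1)θ) + cos((k−1)θ) = 2 cos θ cos kθ. The recurrence is not part of the solution above, and the function name `chebyshev_coeffs` is hypothetical; this is only a sketch for checking the coefficients.

```python
def chebyshev_coeffs(n):
    """Coefficient lists (constant term first) of T_0, ..., T_n via T_{k+1} = 2 t T_k - T_{k-1}."""
    polys = [[1], [0, 1]]
    for k in range(1, n):
        shifted = [0] + [2 * c for c in polys[k]]                       # 2 t T_k
        prev = polys[k - 1] + [0] * (len(shifted) - len(polys[k - 1]))  # T_{k-1}, zero-padded
        polys.append([a - b for a, b in zip(shifted, prev)])
    return polys[: n + 1]

T = chebyshev_coeffs(6)
```

Each entry of `T` is a list of integer coefficients, so `T[6]` can be compared term by term with the expression for T₆(t) in part (c).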

5.4.21. The Gram–Schmidt process will lead to the monic Chebyshev polynomials $q_n(t)$, obtained by dividing each $T_n(t)$ by its leading coefficient: $q_0(t) = 1$, $q_1(t) = t$, $q_2(t) = t^2 - \frac12$, $q_3(t) = t^3 - \frac34 t$, $q_4(t) = t^4 - t^2 + \frac18$, etc. This follows from the characterization of each $q_n(t)$ as the unique monic polynomial of degree $n$ that is orthogonal to all polynomials of degree $< n$ under the weighted inner product, cf. (5.43); any other degree $n$ polynomial with the same property must be a scalar multiple of $q_n(t)$, or, equivalently, of $T_n(t)$.

5.4.22. A basis for the solution set is given by $e^x$ and $e^{2x}$. The Gram–Schmidt process yields the orthogonal basis $e^x$ and $e^{2x} - \dfrac{2\,(e^3 - 1)}{3\,(e^2 - 1)}\, e^x$.

5.4.23. $\cos x$, $\sin x$, $e^x$ form a basis for the solution space. Applying the Gram–Schmidt process, we find the orthogonal basis $\cos x$, $\ \sin x$, $\ e^x + \dfrac{\sinh \pi}{\pi} \cos x - \dfrac{\sinh \pi}{\pi} \sin x$.

5.4.24. Starting with a system of linearly independent functions $f_j^{(1)}(t) = f_j(t)$, $j = 1, \dots, n$, in an inner product space, we recursively compute the orthonormal system $u_1(t), \dots, u_n(t)$ by setting $u_j = f_j^{(j)} / \|f_j^{(j)}\|$, $\ f_k^{(j+1)} = f_k^{(j)} - \langle f_k^{(j)}, u_j \rangle\, u_j$, for $j = 1, \dots, n$, $k = j + 1, \dots, n$. The algorithm leads to the same orthogonal polynomials.

♥ 5.4.25.
(a) First, when $s = 0$, $t = -1$, while when $s = 1$, $t = 1$. Since $\dfrac{dt}{ds} > 0$ for $s > 0$, the function is monotone increasing, with inverse $s = +\sqrt{\frac12 (t + 1)}$.
(b) If $p(t)$ is any polynomial, so is $q(s) = p(2 s^2 - 1)$. The formulas are $q_0(s) = 1$, $q_1(s) = 2 s^2 - 1$, $q_2(s) = 4 s^4 - 4 s^2 + \frac23$, $q_3(s) = 8 s^6 - 12 s^4 + \frac{24}{5} s^2 - \frac25$.
(c) No. For example, $\langle q_0, q_1 \rangle = \int_0^1 (2 s^2 - 1)\, ds = -\frac13$. They are orthogonal with respect to the weighted inner product $\langle F, G \rangle = \int_0^1 F(s)\, G(s)\, s\, ds = \frac14 \int_{-1}^1 f(t)\, g(t)\, dt$ provided $F(s) = f(2 s^2 - 1)$, $G(s) = g(2 s^2 - 1)$.

5.4.26.
(a) By the change of variables formula for integrals, since $ds = -e^{-t}\, dt$, then
$$\langle f, g \rangle = \int_0^\infty f(t)\, g(t)\, e^{-t}\, dt = \int_0^1 F(s)\, G(s)\, ds \qquad \text{when} \qquad f(t) = F(e^{-t}),\ g(t) = G(e^{-t}).$$
The change of variables does not map polynomials to polynomials.
(b) The resulting exponential functions $E_k(t) = \widetilde P_k(e^{-t})$ are orthogonal with respect to the $L^2$ inner product on $[0, \infty)$.
(c) The resulting logarithmic polynomials $Q_k(s) = q_k(-\log s)$ are orthogonal with respect to the $L^2$ inner product on $[0, 1]$. Note that $\langle Q_j, Q_k \rangle = \int_0^1 q_j(-\log s)\, q_k(-\log s)\, ds$ is finite since the logarithmic singularity at $s = 0$ is integrable.

5.5.1. (a) v2 , v4 , (b) v3 , (c) v2 , (d) v2 , v3 , (e) v1 , (f ) v1 , v3 , v4 . 5.5.2. (a)

0

1 B−3 B B @

1

C 1C C, 3A 1 3 0

(b) 1

0

4 B 7 B 2 B− @ 7 6 7 0

1 C C C A

≈

1

0

B C @ −.2857 A, 0

5 3C 2B 2C B B 5.5.3. B @ 2 A − @ −2 A = B @ 7 1 3 −2 5.5.4. Orthogonal basis:

0 B B @

.5714

1

.8571

17 21 58 21 43 21 1

1 C C C A

0

(c)

0

B B B @

7 9 11 9 1 9 1

1 C C C A

≈

0

B C @ 1.2222 A,

1

1

.1111

(d)

0

1007 B − 4225

1

C B 301 C C B @ 4225 A 60 169

≈

0 B @

0

1

0

8

C C C A

≈

0 B @

1

.5333 2.9333 C A. −1.3333

021 C C C B B B3C C C B 6C B B7C C C B 5C B B3C C, C, C, B B B C. (b) (c) 5.5.5. (a) (d) B 3C B C B C C @ 5A @ −1 A A @ 0A 5 5 6 3 3 5 1 1 1 0 0 0 0 1 1 10 1 15 − .5263 .88235 5 19 17 C C C B B B C C C B B C C B − 5 C ≈ @ −.2632 A, B 19 C ≈ @ 1.11765 A, B 1 C, (b) B (c) B 5.5.6. (i) (a) B @ 19 A @ 17 A @ 5A .7895 .05882 1 15 1 19 15 1 0 017 0 0 1 1 0 1 191 5 .0788 .2632 0 B − 2425 C B 19 C C C B B B C B C B 463 C ≈ @ .1909 A. 5 C ≈ @ −.1316 C (d) B (ii) (a) @ 0 A, (b) B A, @ 2425 A @ − 38 A −.1237 .3947 0 12 15 − 97 38 0 1 1 0 0 0 1 1 553 13 .5909 .0387 B 14282 C B 22 C B B C C B 1602 C 9 C B− C ≈ @ .4091 A, C ≈ @ −.2243 A. B (d) (c) B @ 7141 A @ 22 A −.0455 .2487 1 48 − 22 193 0

11 B 21 B 10 B 21 B 2 B @ −7 10 − 21 0

0

− 35

0

2 3 7 3

1

1

5.5.7. ( 1.3, .5, .2, −.1 )T .

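♥ 5.5.8.
(a) The entries of $c = A^T v$ are $c_i = u_i^T v = u_i \cdot v$, and hence, by (1.11), $w = P v = A c = c_1 u_1 + \cdots + c_k u_k$, reproducing the projection formula (5.63).
(b) (i) $\begin{pmatrix} \frac12 & \frac12 \\ \frac12 & \frac12 \end{pmatrix}$, (ii) $\begin{pmatrix} \frac49 & -\frac49 & \frac29 \\ -\frac49 & \frac49 & -\frac29 \\ \frac29 & -\frac29 & \frac19 \end{pmatrix}$, (iii) $\begin{pmatrix} \frac12 & 0 & \frac12 \\ 0 & 1 & 0 \\ \frac12 & 0 & \frac12 \end{pmatrix}$,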


1

−.23833 .07123 C A. .35503

1

3 −1 C B 15 B 2C 4 2B B 44 C B C orthogonal projection: B @ 2A − @ 2A = B @ 15 3 5 1 − 52 − 43 1

.7778

1

.8095 C ≈B @ 2.7619 A. 2.0476

−1 C B 32 C C B 2C A, @ 2 A; 1 − 25 0

0

(iv) $\begin{pmatrix} \frac19 & -\frac29 & \frac29 & 0 \\ -\frac29 & \frac89 & 0 & -\frac29 \\ \frac29 & 0 & \frac89 & -\frac29 \\ 0 & -\frac29 & -\frac29 & \frac19 \end{pmatrix}$, (v) $\begin{pmatrix} \frac34 & \frac14 & \frac14 & \frac14 \\ \frac14 & \frac34 & -\frac14 & -\frac14 \\ \frac14 & -\frac14 & \frac34 & -\frac14 \\ \frac14 & -\frac14 & -\frac14 & \frac34 \end{pmatrix}$.
(c) $P^T = (A A^T)^T = A A^T = P$.
(d) The entries of $A^T A$ are the inner products $u_i \cdot u_j$, and hence, by orthonormality, $A^T A = I$. Thus, $P^2 = (A A^T)(A A^T) = A\, I\, A^T = A A^T = P$. Geometrically, $w = P v$ is the orthogonal projection of $v$ onto the subspace $W$, i.e., the closest point. In particular, if $w \in W$ already, then $P w = w$. Thus, $P^2 v = P w = w = P v$ for all $v \in \mathbb{R}^n$, and hence $P^2 = P$.
(e) Note that $P$ is the Gram matrix for $A^T$, and so, by Proposition 3.36, rank $P$ = rank $A^T$ = rank $A$.
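Parts (c) and (d) can be checked on a small example. The sketch below is illustrative: the helper names `projection_matrix` and `matmul` are hypothetical, and the unit vector u = (2/3, −2/3, 1/3) is chosen here for illustration rather than taken from the exercise statement. It forms P = A Aᵀ for a matrix A with orthonormal columns and lets one confirm symmetry and P² = P in exact arithmetic.

```python
from fractions import Fraction

def projection_matrix(columns):
    """P = A A^T, where the given vectors are the orthonormal columns of A."""
    n = len(columns[0])
    return [[sum(u[i] * u[j] for u in columns) for j in range(n)] for i in range(n)]

def matmul(X, Y):
    """Plain matrix product of two matrices stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y))) for j in range(len(Y[0]))]
            for i in range(len(X))]

# A single unit vector (assumed for illustration): ||u||^2 = 4/9 + 4/9 + 1/9 = 1
u = [Fraction(2, 3), Fraction(-2, 3), Fraction(1, 3)]
P = projection_matrix([u])
P_squared = matmul(P, P)
```

With more than one orthonormal column the same function applies unchanged, since P is just the sum of the rank-one matrices uᵢ uᵢᵀ.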

5.5.9. (a) If x, y are orthogonal to V , then h x , v i = 0 = h y , v i for every v ∈ V . Thus, h c x + d y , v i = c h x , v i + d h y , v i = 0 and hence c x + d y is orthogonal to every v ∈ V . (b)

5.5.10.

“

“

− 23 , 43 , 1, 0

“

,

“

− 21 , 43 , 0, 1

”T 1 1 , − , 2 . 2 2

5.5.11. (a) 5.5.12.

”T

“

− 17 , 0

”T

4 2 25 17 7 , 7 , 14 , 14

,

”T

(b)

“

9 4 14 , 31

”T

.

”T

,

(c)

“

− 74 ,

1 2 30 , 7

”T

.

.

5.5.13. orthogonal basis: ( 1, 0, 2, 1 )T , ( 1, 1, 0, −1 )T ,

“

1 1 2 , −1, 0, − 2

closest point = orthogonal projection = 5.5.14. orthogonal basis: ( 1, 0, 2, 1 )T ,

“

5 1 3 4 , 1, 2 , − 4

”T

closest point = orthogonal

“

;

− 32 , 2, 23 ,

4 3

”T

.

” 15 21 3 9 T ; , − , , − 22 22 11 22 ” “ 8 16 T 8 . projection = − 7 , 2, 7 , 7

,

“

”T

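5.5.15.
(a) $p_1(t) = 14 + \frac72 t$, $\ p_2(t) = p_3(t) = 14 + \frac72 t + \frac{1}{14} (t^2 - 2)$;
(b) $p_1(t) = .285714 + 1.01429\, t$, $\ p_2(t) = .285714 + 1.01429\, t - .0190476\, (t^2 - 4)$, $\ p_3(t) = .285714 + 1.01429\, t - .0190476\, (t^2 - 4) - .008333\, (t^3 - 7 t)$;
(c) $p_1(t) = 100 + \frac{80}{7} t$, $\ p_2(t) = p_3(t) = 100 + \frac{80}{7} t - \frac{20}{21} (t^2 - 4)$.

♦ 5.5.16.
(a) The key point is that, since the sample points are symmetric, $\overline{t^k} = 0$ whenever $k$ is odd. Thus,
$$\begin{aligned} \langle q_0, q_1 \rangle &= \frac1n \sum_{i=1}^n t_i = \overline{t} = 0, \\ \langle q_0, q_2 \rangle &= \frac1n \sum_{i=1}^n \left(t_i^2 - \overline{t^2}\right) = \overline{t^2} - \overline{t^2} = 0, \\ \langle q_0, q_3 \rangle &= \frac1n \sum_{i=1}^n \left(t_i^3 - \frac{\overline{t^4}}{\overline{t^2}}\, t_i\right) = \overline{t^3} - \frac{\overline{t^4}}{\overline{t^2}}\, \overline{t} = 0, \\ \langle q_1, q_2 \rangle &= \frac1n \sum_{i=1}^n t_i \left(t_i^2 - \overline{t^2}\right) = \overline{t^3} - \overline{t}\; \overline{t^2} = 0, \\ \langle q_1, q_3 \rangle &= \frac1n \sum_{i=1}^n t_i \left(t_i^3 - \frac{\overline{t^4}}{\overline{t^2}}\, t_i\right) = \overline{t^4} - \frac{\overline{t^4}}{\overline{t^2}}\, \overline{t^2} = 0, \\ \langle q_2, q_3 \rangle &= \frac1n \sum_{i=1}^n \left(t_i^2 - \overline{t^2}\right) \left(t_i^3 - \frac{\overline{t^4}}{\overline{t^2}}\, t_i\right) = \overline{t^5} - \overline{t^2}\; \overline{t^3} - \frac{\overline{t^4}}{\overline{t^2}}\, \overline{t^3} + \overline{t^4}\; \overline{t} = 0. \end{aligned}$$
(b) $q_4(t) = t^4 - \dfrac{\overline{t^6} - \overline{t^2}\; \overline{t^4}}{\overline{t^4} - \left(\overline{t^2}\right)^2} \left(t^2 - \overline{t^2}\right) - \overline{t^4}$, $\qquad \|q_4\|^2 = \overline{t^8} - \left(\overline{t^4}\right)^2 - \dfrac{\left(\overline{t^6} - \overline{t^2}\; \overline{t^4}\right)^2}{\overline{t^4} - \left(\overline{t^2}\right)^2}$.
(c) $p_4(t) = .3429 + .7357\, t + .07381\, (t^2 - 4) - .008333\, (t^3 - 7 t) + .007197 \left(t^4 - \frac{67}{7} t^2 + \frac{72}{7}\right)$.

5.5.17.
(a) $p_4(t) = 14 + \frac72 t + \frac{1}{14} (t^2 - 2) - \frac{5}{12} \left(t^4 - \frac{31}{7} t^2 + \frac{72}{35}\right)$;
(b) $p_4(t) = .2857 + 1.0143\, t - .019048\, (t^2 - 4) - .008333\, (t^3 - 7 t) + .011742 \left(t^4 - \frac{67}{7} t^2 + \frac{72}{7}\right)$;
(c) $p_4(t) = 100 + \frac{80}{7} t - \frac{20}{21} (t^2 - 4) - \frac{5}{66} \left(t^4 - \frac{67}{7} t^2 + \frac{72}{7}\right)$.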

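5.5.18. Because, according to (5.65), the $k$th Gram–Schmidt vector belongs to the subspace spanned by the first $k$ of the original basis vectors.

♥ 5.5.19.
(a) Since $t_i = t_0 + i\, h$, we have $\overline{t} = t_0 + \frac12 n h$ and so $s_i = \left(i - \frac12 n\right) h$. In particular, $s_{n-i} = \left(\frac12 n - i\right) h = -s_i$, proving symmetry of the points.
(b) Since $p(t_i) = q(t_i - \overline{t}\,) = q(s_i)$, the least squares errors coincide: $\sum_{i=1}^n \left[ p(t_i) - y_i \right]^2 = \sum_{i=1}^n \left[ q(s_i) - y_i \right]^2$, and hence $q(s)$ minimizes the former if and only if $p(t) = q(t - \overline{t}\,)$ minimizes the latter.
(c) $p_1(t) = -2 + \frac{92}{35} \left(t - \frac72\right) = -\frac{56}{5} + \frac{92}{35} t = -11.2 + 2.6286\, t$,
$p_2(t) = -2 + \frac{92}{35} \left(t - \frac72\right) + \frac{9}{56} \left[ \left(t - \frac72\right)^2 - \frac{35}{12} \right] = -\frac{97}{10} + \frac{421}{280} t + \frac{9}{56} t^2 = -9.7 + 1.503\, t + .1607\, t^2$.

♦ 5.5.20.
$$q_0(t) = 1, \qquad q_1(t) = t - \overline{t}, \qquad q_2(t) = t^2 - \frac{\overline{t^3} - \overline{t}\; \overline{t^2}}{\overline{t^2} - \overline{t}^{\,2}} \left(t - \overline{t}\,\right) - \overline{t^2},$$
$$\mathbf{q}_0 = \mathbf{t}_0, \qquad \mathbf{q}_1 = \mathbf{t}_1 - \overline{t}\; \mathbf{t}_0, \qquad \mathbf{q}_2 = \mathbf{t}_2 - \frac{\overline{t^3} - \overline{t}\; \overline{t^2}}{\overline{t^2} - \overline{t}^{\,2}} \left(\mathbf{t}_1 - \overline{t}\; \mathbf{t}_0\right) - \overline{t^2}\; \mathbf{t}_0,$$
$$\|q_0\|^2 = 1, \qquad \|q_1\|^2 = \overline{t^2} - \overline{t}^{\,2}, \qquad \|q_2\|^2 = \overline{t^4} - \left(\overline{t^2}\right)^2 - \frac{\left(\overline{t^3} - \overline{t}\; \overline{t^2}\right)^2}{\overline{t^2} - \overline{t}^{\,2}}.$$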

5.5.21. For simplicity, we assume ker $A = \{0\}$. According to Exercise 5.3.33, orthogonalizing the basis vectors for rng $A$ is the same as factorizing $A = Q R$ where the columns of $Q$ are the orthonormal basis vectors, while $R$ is a nonsingular upper triangular matrix. The formula for the coefficients $c = (c_1, c_2, \dots, c_n)^T$ of $v = b$ in (5.63) is equivalent to the matrix formula $c = Q^T b$. But this is not the least squares solution
$$x = (A^T A)^{-1} A^T b = (R^T Q^T Q R)^{-1} R^T Q^T b = (R^T R)^{-1} R^T Q^T b = R^{-1} Q^T b = R^{-1} c.$$
Thus, to obtain the least squares solution, John needs to multiply his result by $R^{-1}$.

♦ 5.5.22. Note that $Q^T Q = I$, while $R$ is a nonsingular square matrix. Therefore, the least squares solution is
$$x = (A^T A)^{-1} A^T b = (R^T Q^T Q R)^{-1} R^T Q^T b = (R^T R)^{-1} R^T Q^T b = R^{-1} Q^T b.$$

5.5.23. The solutions are, of course, the same:
(a) $Q = \begin{pmatrix} .30151 & .79455 \\ .90453 & -.06356 \\ -.30151 & .60386 \end{pmatrix}$, $R = \begin{pmatrix} 3.31662 & -.90453 \\ 0 & 2.86039 \end{pmatrix}$, $x = \begin{pmatrix} .06667 \\ .91111 \end{pmatrix}$;
(b) $Q = \begin{pmatrix} .8 & -.43644 \\ .4 & .65465 \\ .2 & -.43644 \\ .4 & .43644 \end{pmatrix}$, $R = \begin{pmatrix} 5 & 0 \\ 0 & 4.58258 \end{pmatrix}$, $x = \begin{pmatrix} -.04000 \\ -.38095 \end{pmatrix}$;
(c) $Q = \begin{pmatrix} .53452 & .61721 & .57735 \\ .80178 & -.15430 & -.57735 \\ .26726 & -.77152 & .57735 \end{pmatrix}$, $R = \begin{pmatrix} 3.74166 & .26726 & -1.87083 \\ 0 & 1.38873 & -3.24037 \\ 0 & 0 & 1.73205 \end{pmatrix}$, $x = \begin{pmatrix} .66667 \\ 1.66667 \\ 1.00000 \end{pmatrix}$;
(d) $Q = \begin{pmatrix} .18257 & .36515 & .12910 \\ .36515 & -.18257 & .90370 \\ 0 & .91287 & .12910 \\ -.91287 & 0 & .38730 \end{pmatrix}$, $R = \begin{pmatrix} 5.47723 & -2.19089 & 0 \\ 0 & 1.09545 & -3.65148 \\ 0 & 0 & 2.58199 \end{pmatrix}$, $x = (.33333, 2.00000, .75000)^T$;
(e) $Q = \begin{pmatrix} .57735 & .51640 & -.15811 & -.20412 \\ 0 & .77460 & .15811 & .20412 \\ .57735 & -.25820 & .47434 & .61237 \\ 0 & 0 & .79057 & -.61237 \\ .57735 & -.25820 & -.31623 & -.40825 \end{pmatrix}$, $R = \begin{pmatrix} 1.73205 & .57735 & .57735 & -.57735 \\ 0 & 1.29099 & -.25820 & 1.03280 \\ 0 & 0 & 1.26491 & -.31623 \\ 0 & 0 & 0 & 1.22474 \end{pmatrix}$, $x = (.33333, 2.00000, -.33333, -1.33333)^T$.

♦ 5.5.24. The second method is more efficient! Suppose the system is $A x = b$ where $A$ is an $m \times n$ matrix. Constructing the normal equations requires $m n^2$ multiplications and $(m - 1)\, n^2 \approx m n^2$ additions to compute $A^T A$ and an additional $n m$ multiplications and $n (m - 1)$ additions to compute $A^T b$. To solve the normal equations $A^T A x = A^T b$ by Gaussian Elimination requires $\frac13 n^3 + n^2 - \frac13 n \approx \frac13 n^3$ multiplications and $\frac13 n^3 + \frac12 n^2 - \frac56 n \approx \frac13 n^3$ additions. On the other hand, to compute the $A = Q R$ decomposition by Gram–Schmidt requires $(m + 1)\, n^2 \approx m n^2$ multiplications and $\frac12 (2 m + 1)\, n\, (n - 1) \approx m n^2$ additions. To compute $c = Q^T b$ requires $m n$ multiplications and $m (n - 1)$ additions, while solving $R x = c$ by Back Substitution requires $\frac12 n^2 + \frac12 n$ multiplications and $\frac12 n^2 - \frac12 n$ additions. Thus, the first step requires about the same amount of work as forming the normal equations, and the second two steps are considerably more efficient than Gaussian Elimination.

♦ 5.5.25.
(a) If $A = Q$ has orthonormal columns, then
$$\|Q x^\star - b\|^2 = \|b\|^2 - \|Q^T b\|^2 = \sum_{i=1}^m b_i^2 - \sum_{i=1}^n (u_i \cdot b)^2.$$
(b) If the columns $v_1, \dots, v_n$ of $A$ are orthogonal, then $A^T A$ is a diagonal matrix with the square norms $\|v_i\|^2$ along its diagonal, and so
$$\|A x^\star - b\|^2 = \|b\|^2 - b^T A (A^T A)^{-1} A^T b = \sum_{i=1}^m b_i^2 - \sum_{i=1}^n \frac{(v_i \cdot b)^2}{\|v_i\|^2}.$$
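The Q R route to the least squares solution described in Exercises 5.5.22 and 5.5.24 (factor A = Q R by Gram–Schmidt, form c = Qᵀb, then solve R x = c by Back Substitution) can be sketched as follows. This is an illustrative sketch, not the text's code: the function names `gram_schmidt_qr` and `least_squares` are hypothetical, A is assumed to have linearly independent columns, and classical Gram–Schmidt is used without any safeguards against loss of orthogonality in floating point.

```python
import math

def gram_schmidt_qr(A):
    """A (m x n, independent columns) -> (Q, R): Q has orthonormal columns, R is upper triangular."""
    m, n = len(A), len(A[0])
    Q = [[0.0] * n for _ in range(m)]
    R = [[0.0] * n for _ in range(n)]
    for j in range(n):
        v = [A[i][j] for i in range(m)]
        for k in range(j):
            R[k][j] = sum(Q[i][k] * A[i][j] for i in range(m))    # r_kj = u_k . w_j
            v = [v[i] - R[k][j] * Q[i][k] for i in range(m)]
        R[j][j] = math.sqrt(sum(x * x for x in v))
        for i in range(m):
            Q[i][j] = v[i] / R[j][j]
    return Q, R

def least_squares(A, b):
    """Least squares solution x = R^{-1} Q^T b of A x = b."""
    Q, R = gram_schmidt_qr(A)
    m, n = len(A), len(A[0])
    c = [sum(Q[i][j] * b[i] for i in range(m)) for j in range(n)]  # c = Q^T b
    x = [0.0] * n
    for i in reversed(range(n)):                                   # Back Substitution
        x[i] = (c[i] - sum(R[i][k] * x[k] for k in range(i + 1, n))) / R[i][i]
    return x
```

For example, fitting a line a + b t to the data (0, 1), (1, 2), (2, 4) amounts to calling `least_squares([[1, 0], [1, 1], [1, 2]], [1, 2, 4])`, whose result agrees with the solution a = 5/6, b = 3/2 of the corresponding normal equations.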

♥ 5.5.26.
(a) The orthogonal projection is $w = A x$ where $x = (A^T A)^{-1} A^T b$ is the least squares solution to $A x = b$, and so $w = A (A^T A)^{-1} A^T b = P b$.
(b) If the columns of $A$ are orthonormal, then $A^T A = I$, and so $P = A A^T$.
(c) Since $Q$ has orthonormal columns, $Q^T Q = I$ while $R$ is invertible, so
$$P = A (A^T A)^{-1} A^T = Q R\, (R^T Q^T Q R)^{-1} R^T Q^T = Q R\, (R^T R)^{-1} R^T Q^T = Q Q^T.$$
Note: in the rectangular case, the rows of $Q$ are not necessarily orthonormal vectors, and so $Q Q^T$ is not necessarily the identity matrix.

5.5.27.
(a) $P = \begin{pmatrix} .25 & -.25 & -.35 & .05 \\ -.25 & .25 & .35 & -.05 \\ -.35 & .35 & .49 & -.07 \\ .05 & -.05 & -.07 & .01 \end{pmatrix}$, $\quad P v = \begin{pmatrix} .25 \\ -.25 \\ -.35 \\ .05 \end{pmatrix}$;
(b) $P = \begin{pmatrix} \frac13 & -\frac13 & 0 & \frac13 \\ -\frac13 & \frac79 & -\frac29 & \frac19 \\ 0 & -\frac29 & \frac19 & -\frac29 \\ \frac13 & \frac19 & -\frac29 & \frac79 \end{pmatrix}$, $\quad P v = \begin{pmatrix} \frac13 \\ -\frac13 \\ 0 \\ \frac13 \end{pmatrix}$;
(c) $P = \begin{pmatrix} .28 & -.4 & .2 & .04 \\ -.4 & .6 & -.2 & -.2 \\ .2 & -.2 & .4 & -.4 \\ .04 & -.2 & -.4 & .72 \end{pmatrix}$, $\quad P v = \begin{pmatrix} .28 \\ -.4 \\ .2 \\ .04 \end{pmatrix}$;
(d) $P = \begin{pmatrix} \frac{7}{15} & -\frac25 & \frac{4}{15} & \frac{2}{15} \\ -\frac25 & \frac{7}{10} & \frac15 & \frac{1}{10} \\ \frac{4}{15} & \frac15 & \frac{13}{15} & -\frac{1}{15} \\ \frac{2}{15} & \frac{1}{10} & -\frac{1}{15} & \frac{29}{30} \end{pmatrix}$, $\quad P v = \begin{pmatrix} \frac{7}{15} \\ -\frac25 \\ \frac{4}{15} \\ \frac{2}{15} \end{pmatrix}$.

5.5.28. Both are the same quadratic polynomial: $\frac15 + \frac47 \left(-\frac12 + \frac32 t^2\right) = -\frac{3}{35} + \frac67 t^2$.

5.5.29.
Quadratic: $\frac15 + \frac25 (2t - 1) + \frac27 (6 t^2 - 6 t + 1) = \frac{3}{35} - \frac{32}{35} t + \frac{12}{7} t^2 = .08571 - .91429\, t + 1.71429\, t^2$;
Cubic: $\frac15 + \frac25 (2t - 1) + \frac27 (6 t^2 - 6 t + 1) + \frac{1}{10} (20 t^3 - 30 t^2 + 12 t - 1) = -\frac{1}{70} + \frac27 t - \frac97 t^2 + 2 t^3 = -.01429 + .2857\, t - 1.2857\, t^2 + 2 t^3$.

5.5.30. $1.718282 + .845155\, (2t - 1) + .139864\, (6 t^2 - 6 t + 1) + .013931\, (20 t^3 - 30 t^2 + 12 t - 1) = .99906 + 1.0183\, t + .421246\, t^2 + .278625\, t^3$.

5.5.31.
Linear: $\frac14 + \frac{9}{20} (2t - 1) = -\frac15 + \frac{9}{10} t$; minimum value: $\frac{9}{700} = .01286$.
Quadratic: $\frac14 + \frac{9}{20} (2t - 1) + \frac14 (6 t^2 - 6 t + 1) = \frac{1}{20} - \frac35 t + \frac32 t^2$; minimum value: $\frac{1}{2800} = .0003571$.
Cubic: $\frac14 + \frac{9}{20} (2t - 1) + \frac14 (6 t^2 - 6 t + 1) + \frac{1}{20} (20 t^3 - 30 t^2 + 12 t - 1) = t^3$; minimum value: 0.

5.5.32.
(a) They are both the same quadratic polynomial:
$$\frac{2}{\pi} + \frac{10 \pi^2 - 120}{\pi^3} \left(1 - \frac{6}{\pi} t + \frac{6}{\pi^2} t^2\right) = -\frac{120}{\pi^3} + \frac{12}{\pi} + \frac{720 - 60 \pi^2}{\pi^4}\, t - \frac{720 - 60 \pi^2}{\pi^5}\, t^2 = -.050465 + 1.312236\, t - .417698\, t^2.$$
(b) [Graph.] The maximum error is .0504655 at the ends $t = 0, \pi$.

♠ 5.5.33. $.459698 + .427919\, (2t - 1) - .0392436\, (6 t^2 - 6 t + 1) - .00721219\, (20 t^3 - 30 t^2 + 12 t - 1) = -.000252739 + 1.00475\, t - .0190961\, t^2 - .144244\, t^3$.

♠ 5.5.34.
$$\begin{aligned} p(t) &= 1.175201 + 1.103638\, t + .357814 \left(\tfrac32 t^2 - \tfrac12\right) + .070456 \left(\tfrac52 t^3 - \tfrac32 t\right) + .009965 \left(\tfrac{35}{8} t^4 - \tfrac{15}{4} t^2 + \tfrac38\right) \\ &\quad + .00109959 \left(\tfrac{63}{8} t^5 - \tfrac{35}{4} t^3 + \tfrac{15}{8} t\right) + .00009945 \left(\tfrac{231}{16} t^6 - \tfrac{315}{16} t^4 + \tfrac{105}{16} t^2 - \tfrac{5}{16}\right) \\ &= 1. + 1.00002\, t + .500005\, t^2 + .166518\, t^3 + .0416394\, t^4 + .00865924\, t^5 + .00143587\, t^6. \end{aligned}$$

5.5.35.
(a) $\frac32 - \frac{10}{3} \left(t - \frac34\right) + \frac{35}{4} \left(t^2 - \frac43 t + \frac25\right) = \frac{15}{2} - 15\, t + \frac{35}{4} t^2$; it gives the smallest value to $\|p(t) - \frac1t\|^2 = \int_0^1 \left(p(t) - \frac1t\right)^2 t^2\, dt$ among all quadratic polynomials $p(t)$.
(b) $\frac32 - \frac{10}{3} \left(t - \frac34\right) + \frac{35}{4} \left(t^2 - \frac43 t + \frac25\right) - \frac{126}{5} \left(t^3 - \frac{15}{8} t^2 + \frac{15}{14} t - \frac{5}{28}\right) = 12 - 42\, t + 56\, t^2 - \frac{126}{5} t^3$.
(c) [Graph.]
(d) Both do a reasonable job approximating from $t = .2$ to 1, but can’t keep close near the singularity at 0, owing to the small value of the weight function $w(t) = t^2$ there. The cubic does a marginally better job near the singularity.

♠ 5.5.36.
Quadratic: $.6215 + .3434\, (t - 1) - .07705\, (t^2 - 4 t + 2) = .1240 + .6516\, t - .07705\, t^2$;
Cubic: $.6215 + .3434\, (t - 1) - .07705\, (t^2 - 4 t + 2) + .01238\, (t^3 - 9 t^2 + 18 t - 6) = .0497 + .8744\, t - .1884\, t^2 + .01238\, t^3$.
The accuracy is reasonable up until $t = 2$ for the quadratic and $t = 4$ for the cubic polynomial.
[Graph.]

5.6.1. (a) W ⊥ has basis

011 0 11 − B3C B 3C @ 1 A, @ 0 A,

0

dim W ⊥ = 2; (b) W ⊥ has basis

1

0

1 B−2 B 5 B− @ 4

1


1

C C C, A

6

8

dim W

⊥

= 1; (c) W

⊥

dim W ⊥ = 1; (e) W ⊥ 0

1

3 C 5.6.2. (a) B @ 4 A; −5 0

(b)

1

0 B

B z=B @

2 3 1 3 1 3

(b)

011 031 B2C B2C @ 1 A, @ 0 A;

0

1 3 @ 10 A, 1 − 10

1

C C C; A

z=

(d) w =

5.6.5. (a) Span of 0

1

011 2 B C B1C B 4 C, B C @ 1A

0

0

(c)

0 B B B @

1

C C C, A

1

1

−1

1

2C 7C C, 1A

0

0

1

−1 C (d) B @ 1 A, 0

0

0

1

B4C B7C B C; @ 0A

0

1

0

1

1 B C @ 0 A. 1

1 B C B0C C, (d) B @1A 0

1

0 B B B @

−4 −9 C B C ⊥ (c) Span of B = 2. @ 1 A, @ 0 A; dim W 0 1 (e) W ⊥ = {0}, and dim W ⊥ = 0.

− 51

1

(d) Span of

0

0

2

0

6

1

1

1

B C @ 1 A,

1

11 2 C B B−7 C B 4 C. C B @ 0A 0

1

0 1 6 B −5 C B 3 C B C 8 C B−1 C, z = B 42 C; (c) w = B (b) w = 25 A @ 25 A @ 3 6 − 31 − − 13 25 0 25 0 1 1 4 7 0 1 5 B 11 C B 11 C 1 C B− 1 C B7C B B 11 C C C B B B 3 C; C, z = B 11 C. z=B (e) w = 1 C 1 C B B @7A @ 11 A @ − 11 A 1 13 2 7 − 11 11 0 1 12 B− 8 C C ⊥ ⊥ B − 15 C; dim W dim W = 2. (b) Span of B = 1. 8 A @

1 7 @ 10 A; 21 10

2 B 7 B 3 B− @ 7 − 17

1

−1 C (c) B @ −1 A; 1

−1 C B B −1 C B C; @ 0A 1

0

0

0

1

1

0

021 0 1 −1 B3C B C @ 1 A, @ 0 A;

0

1

1 0

−2 −3 C B ⊥ = 2; (d) W ⊥ has basis 1C has basis A, @ 0 A, dim W 0 1 = {0}, dim W ⊥ = 0. B @

0

−1 B C B 3C C; 5.6.3. (a) B @ 2A 1 5.6.4. (a) w =

0

1

C C C, A

1

B3C @ 2 A;

dim W ⊥ = 1.

1

5.6.6. For the weighted inner product, the orthogonal complement $W^\perp$ is the set of all vectors $v = (x, y, z, w)^T$ that satisfy the linear system
$\langle v, w_1 \rangle = x + 2y + 4w = 0, \qquad \langle v, w_2 \rangle = x + 2y + 3z - 8w = 0.$
A non-orthogonal basis for $W^\perp$ is $z_1 = (-2, 1, 0, 0)^T$, $z_2 = (-4, 0, 4, 1)^T$. Applying Gram–Schmidt, the corresponding orthogonal basis is $y_1 = (-2, 1, 0, 0)^T$, $y_2 = \left(-\tfrac{4}{3}, -\tfrac{4}{3}, 4, 1\right)^T$. We decompose $v = w + z$, where $w = \left(\tfrac{13}{43}, \tfrac{13}{43}, \tfrac{4}{43}, \tfrac{1}{43}\right)^T \in W$, $z = \left(\tfrac{30}{43}, -\tfrac{13}{43}, -\tfrac{4}{43}, -\tfrac{1}{43}\right)^T \in W^\perp$. Here, since $w_1, w_2$ are no longer orthogonal, it is easier to compute $z$ first and then subtract $w = v - z$.

5.6.7. (a) $\langle p, q \rangle = \int_{-1}^{1} p(x)\,q(x)\,dx = 0$ for all $q(x) = a + b\,x + c\,x^2$, or, equivalently,
$\int_{-1}^{1} p(x)\,dx = \int_{-1}^{1} x\,p(x)\,dx = \int_{-1}^{1} x^2\,p(x)\,dx = 0.$
Writing $p(x) = a + b\,x + c\,x^2 + d\,x^3 + e\,x^4$, the orthogonality conditions require
$2a + \tfrac{2}{3}c + \tfrac{2}{5}e = 0, \qquad \tfrac{2}{3}b + \tfrac{2}{5}d = 0, \qquad \tfrac{2}{3}a + \tfrac{2}{5}c + \tfrac{2}{7}e = 0.$
(b) Basis: $t^3 - \tfrac{3}{5}t$, $\ t^4 - \tfrac{6}{7}t^2 + \tfrac{3}{35}$; $\dim W^\perp = 2$; (c) the preceding basis is orthogonal.

♦ 5.6.8. If $u, v \in W^\perp$, so $\langle u, w \rangle = \langle v, w \rangle = 0$ for all $w \in W$, then $\langle c\,u + d\,v, w \rangle = c\,\langle u, w \rangle + d\,\langle v, w \rangle = 0$ also, and so $c\,u + d\,v \in W^\perp$.
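The decomposition in Exercise 5.6.6 can be checked with a few lines of exact arithmetic. This is only a sketch under stated assumptions: the weights $(1, 2, 3, 4)$ and the vector $v = (1, 0, 0, 0)^T$ are inferred from the linear system and from $w + z$ in the solution, not quoted from the exercise, and the function names are illustrative.

```python
from fractions import Fraction as F

# Assumed weights: <a, b> = a1 b1 + 2 a2 b2 + 3 a3 b3 + 4 a4 b4.
WEIGHTS = [F(1), F(2), F(3), F(4)]

def inner(a, b):
    """Weighted inner product of two 4-vectors."""
    return sum(c * x * y for c, x, y in zip(WEIGHTS, a, b))

def gram_schmidt(vectors):
    """Orthogonalize (without normalizing) with respect to the weighted inner product."""
    basis = []
    for v in vectors:
        u = [F(x) for x in v]
        for b in basis:
            coeff = inner(u, b) / inner(b, b)
            u = [ui - coeff * bi for ui, bi in zip(u, b)]
        basis.append(u)
    return basis

def project(v, orthogonal_basis):
    """Orthogonal projection of v onto the span of an orthogonal basis."""
    result = [F(0)] * len(v)
    for b in orthogonal_basis:
        coeff = inner(v, b) / inner(b, b)
        result = [ri + coeff * bi for ri, bi in zip(result, b)]
    return result

z1, z2 = [-2, 1, 0, 0], [-4, 0, 4, 1]   # non-orthogonal basis of W-perp from the solution
y1, y2 = gram_schmidt([z1, z2])
v = [F(1), F(0), F(0), F(0)]            # assumed v, equal to w + z in the solution
z = project(v, [y1, y2])                # component in W-perp, computed first
w = [vi - zi for vi, zi in zip(v, z)]   # component in W, obtained by subtraction
print(y2, z, w)
```

Computing $z$ first and subtracting mirrors the remark in the solution: the orthogonal basis of $W^\perp$ is already available, whereas $w_1, w_2$ are not orthogonal in the weighted inner product.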

5.6.9. (a) If $w \in W \cap W^\perp$ then $w \in W^\perp$ must be orthogonal to every vector in $W$ and so $w \in W$ is orthogonal to itself, which implies $w = 0$. (b) If $w \in W$ then $w$ is orthogonal to every $z \in W^\perp$ and so $w \in (W^\perp)^\perp$.

5.6.10. (a) The only element orthogonal to all $v \in V$ is $0$, and hence $V^\perp$ contains only the zero vector. (b) Every $v \in V$ is orthogonal to $0$, and so belongs to $\{0\}^\perp$.

5.6.11. If $z \in W_2^\perp$ then $\langle z, w \rangle = 0$ for every $w \in W_2$. In particular, this holds for every $w \in W_1 \subset W_2$, and hence $z$ is orthogonal to every vector $w \in W_1$. Thus, $z \in W_1^\perp$, proving $W_2^\perp \subset W_1^\perp$.

5.6.12. (a) We are given that $\dim W + \dim Z = n$ and $W \cap Z = \{0\}$. Now, $\dim W^\perp = n - \dim W$, $\dim Z^\perp = n - \dim Z$ and hence $\dim W^\perp + \dim Z^\perp = n$. Furthermore, if $v \in W^\perp \cap Z^\perp$ then $v$ is orthogonal to all vectors in both $W$ and $Z$, and hence also orthogonal to any vector of the form $w + z$ for $w \in W$ and $z \in Z$. But since $W, Z$ are complementary, every vector in $\mathbb{R}^n$ can be written as $w + z$ and hence $v$ is orthogonal to all vectors in $\mathbb{R}^n$, which implies $v = 0$.
(b) [Figure not reproduced: sketch of the subspaces $W$, $Z$ and their orthogonal complements $W^\perp$, $Z^\perp$.]

♦ 5.6.13. Suppose $v \in (W^\perp)^\perp$. Then we write $v = w + z$ where $w \in W$, $z \in W^\perp$. By assumption, for every $y \in W^\perp$, we must have $0 = \langle v, y \rangle = \langle w, y \rangle + \langle z, y \rangle = \langle z, y \rangle$. In particular, when $y = z$, this implies $\| z \|^2 = 0$ and hence $z = 0$, which proves $v = w \in W$.

5.6.14. Every $w \in W$ can be written as $w = \sum_{i=1}^{k} a_i w_i$; every $z \in Z$ can be written as $z = \sum_{j=1}^{l} b_j z_j$. Then, using bilinearity, $\langle w, z \rangle = \sum_{i=1}^{k} \sum_{j=1}^{l} a_i b_j \langle w_i, z_j \rangle = 0$, and hence $W$ and $Z$ are orthogonal subspaces.

♦ 5.6.15. (a) We are given that $\langle w_i, w_j \rangle = 0$ for all $i \ne j$ between $1$ and $m$ and between $m+1$ and $n$. It is also $0$ if $1 \le i \le m$ and $m+1 \le j \le n$ since every vector in $W$ is orthogonal to every vector in $W^\perp$. Thus, the vectors $w_1, \dots, w_n$ are non-zero and mutually orthogonal, and so form an orthogonal basis.
(b) This is clear: $w \in W$ since it is a linear combination of the basis vectors $w_1, \dots, w_m$; similarly $z \in W^\perp$ since it is a linear combination of the basis vectors $w_{m+1}, \dots, w_n$.

♥ 5.6.16. (a) Let $V = \{ \alpha + \beta x \} = P^{(1)}$ be the two-dimensional subspace of linear polynomials. Every $u(x) \in C^0[a, b]$ can be written as $u(x) = \alpha + \beta x + w(x)$ where $\beta = \dfrac{u(b) - u(a)}{b - a}$, $\alpha = u(a) - \beta a$, while $w(a) = w(b) = 0$, so $w \in W$. Moreover, a linear polynomial $\alpha + \beta x$ vanishes at $a$ and $b$ if and only if it is identically zero, and so $V$ satisfies the conditions for it to be a complementary subspace to $W$.
(b) The only continuous function which is orthogonal to all functions in $W$ is the zero function. Indeed, suppose $\langle v, w \rangle = \int_a^b v(x)\,w(x)\,dx = 0$ for all $w \in W$, and $v(c) > 0$ for some $a < c < b$. Then, by continuity, $v(x) > 0$ for $| x - c | < \delta$ for some $\delta > 0$. Choose $w(x) \in W$ so that $w(x) \ge 0$, $w(c) > 0$, but $w(x) \equiv 0$ for $| x - c | \ge \delta$. Then $v(x)\,w(x) \ge 0$, with $v(c)\,w(c) > 0$, and so $\int_a^b v(x)\,w(x)\,dx > 0$, which is a contradiction. The same proof works for $v(c) < 0$; only the inequalities are reversed. Therefore $v(x) = 0$ for all $a < x < b$, and, by continuity, $v(x) \equiv 0$ for all $x$. Thus, $W^\perp = \{0\}$, and there is no orthogonal complementary subspace.

5.6.17. Note: To show orthogonality of two subspaces, it suffices to check orthogonality of their respective basis vectors. ! ! ! ! 1 −2 1 2 (a) (i) Range: ; cokernel: ; corange: ; kernel: ; 2 1 −2 1 ! ! ! ! 2 1 −2 1 · = 0; (iii) · (ii) = 0. 1 −2 1 2 0 0 1 0 1 1 1 ! ! 0 5 0 5 B 5C B C B C ; kernel: {0}; , (b) (i) Range: @ 1 A, @ 2 A; cokernel: @ −1 A; corange: 2 0 2 0 1 0 1 0 1 1 0 1 0 1 1 ! ! 5 0 5 0 C B 5C B C B 5C (ii) B ·0= · 0 = 0. @ 1 A · @ −1 A = @ 2 A · @ −1 A = 0; (iii) 0 2 0 2 11 0 1 1 0 1 0 0 1 0 1 1 0 −3 0 −1 −3 1 0 B B B C B C C B C B C (c) (i) Range: @ −1 A, @ 0 A; cokernel: @ −2 A; corange: @ 0 A, @ 1 A; kernel: @ −2 C A; 1 2 −3 1 3 −2 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 −3 1 −3 −1 −3 0 −3 B C B C B C B C B C B C B C B (ii) @ −1 A · @ −2 A = @ 0 A · @ −2 A = 0; (iii) @ 0 A · @ −2 A = @ 1 A · @ −2 C A = 0. −2 1 3 1 −3 0 1 01 1 2 1 0 1 0 1 0 1 1 0 1 2 −1 B C B C B C B C B C B2C B3C C, B C; (d) (i) Range: @ −1 A, @ 1 A; cokernel: @ −1 A; corange: B @0A @3A 0 3 1 1 2 1 0 1 0 0 1 0 1 1 0 1 0 2 B 13 C 1 −1 −1 2 B −1 C B C B C C B C B C B − 23 C C; B C, B (ii) kernel: B @ −1 A · @ −1 A = @ 1 A · @ −1 A = 0; C B @ 1A @ 0A 3 0 1 1 0 1 1 0 0 1 0 0 1 0 1 1 0 1 0 11 0 1 1 0 1 2 2 0 1 3 3C C B C B B C B C C B C B B C B 2 2C C B 3 2 −1 −1 2 B C B B C B C C B3C B B C B− C − B C·B C=B C·B 3 C C=B C·B 3 C=B C·B = 0. (iii) B @3A @ 1A @0A @ 1A @3A @ 0C @0A @ 0A A 2 1 0 0 1 2 1 1 1 0 1 0 0 3 1 0 0 1 0 1 B1C B 1C 1 −3 3 C B C B C C B C B B C B C B 4 C, B 1 C; kernel: (e) (i) Range: @ 1 A, @ 1 A; cokernel: @ −1 A; corange: B C B C B @ A @ −1 A 2 2 2 5 1 7


0

1 0

1 0

1

0

1 0

1

−1 −1 −2 3 −1 0 1 0 0 1 0 1 1 B −1 C B 1 C B −1 C B 1 C B −1 C 3 −3 1 −3 B C B C B C B C B C B C B C B C B C B B C B C C B C B C B 1 C, B 0 C, B 0 C; (ii) @ 1 A · @ −1 A = @ 1 A · @ −1 A = 0; (iii) B 4 C · B 1 C C B B C B C B C B C @ 0A @ 1A @ 0A @2A @ 0A 5 2 2 2 0 0 1 7 0 0 0 1 0 0 1 0 1 1 1 0 0 1 1 0 1 0 0 1 1 −2 −1 0 3 3 −1 0 0 −1 −2 B 1 C B −1 C B 1 C B −1 C B1C B 1C B 1C B 1C B 1 C B −1 C B B C B C C B C B C B B C C B C B B C C C C B B C B C B C B C B B C B C B C C B 4 C · B 0 C = B 4 C · B 0 C = B 1 C · B 1 C = B 1 C · B 0 C = B 1 C · B 0 C = 0. =B C C B B C B B C B C C B B B C C C B C B @2A @ 0A @ −1 A @ 0 A @2A @ 1A @ −1 A @ 1 A @ −1 A @ 0 A 1 0 0 1 0 01 0 1 1 7 7 1 1 00 1 1 0 1 0 1 1 0 1 1 −1 3 C C B B B B C B C B C −2 C B 3C B 7C C B 1C B −2 C B 1 C C; C, B B C, B C, B C; corange: B C; cokernel: B (f ) (i) Range: B @ 0A @ 2A @ −3 A @ 5 A @ 1A @0A −1 −2 1 1 0 −4 1 0 1 0 kernel: 0

6 B 7C B B−2 C B B 7 C, B C B B @ 1A @

1

0 1 0

11 7 1 7

−1

0 1 1

C C C; C A

0

1

0

1

0

1

0

1

0

1

0

1

3 −1 3 1 B C B C C B 1C 1 −2 C B C B1C C B C · B C · B C = 0; C = B (ii) · = = · @ 5A @0A 5A @ 1A 0 −4 0 01 1 1 0 1 1 −4 0 11 0 0 0 1 1 1 0 1 0 61 11 6 11 1 0 0 1 7C 7 C 7C 7 C B B B B B B C B 1 C C B 2C C B B C B 2C B 3C B− C B 3C B C B 7C B− C B 7C B 1 C C·B 7 C=B C·B 7 C=B C·B 7 C B C·B 7 C=B = 0. (iii) B @ 0A @ 0 A @ 2A @ 1A @ 2A @ 0 C @ 0A @ 1A A −2 −1 −1 −2 0 1 0 1 1 1 0 1 0 0 0 1 1 0 −20 −1 −11 2 −1 B −4 C B −1 C B −9 C B 2 C B −5 C C C B C B B C B C B C C B C B B C C B C; corange: C, B 0 C, B B −3 C, B 2 C; cokernel: B 0 1 (g) (i) Range: B C B C B C B C C B B @ @ 1 A @ −3 A 0A @ 1A @ 0A 1 0 0 −5 −2 1 0 1 0 1 0 0 1 0 0 1 0 1 0 1 1 −1 −1 −1 −11 −1 2 0 −1 B 2 C B −4 C B 2 C B −1 C B C C B B C B C B C C B C B B C C B 2C B0C B1C B 0C B C B B C B C C; (ii) B −3 C · B C, B C; kernel: B C, B B C = B −3 C · B 0 C = 1 C C B B B C C B @0A @ 0A @ 2A @1A @ 1A @ 1A @ 1A @ 0A 1 0 0 −1 0 −2 −2 0 1 1 0 0 1 1 0 0 1 1 0 0 1 1 0 0 −20 2 −1 2 −11 2 −20 −1 B −5 C B −9 C B −5 C B −1 C B −5 C B −4 C B 2 C B −9 C C C B B C C B B C C B B C C B B C C B B C C B B C C B B C C B B C = 0; C = B 2C · B 0C = B 2C · B C = B 2C · B B −3 C · B 0 1 0 C C B B C C B B C C B B C C B B @ 1A @ 0 A @ −3 A @ 0 A @ −3 A @ 1 A @ −3 A @ 0 A 1 0 1 0−51 0 10 0 1−50 −5 1 −2 0 1 0 1 0 1 0 1 −1 2 −1 −1 0 2 0 −1 B B C B C B C B C B C B C B C 2C C B1C B 2C B 0C B0C B1C B0C B 0C B C·B C=B C·B C=B C·B C=B C·B C = 0. (iii) B @ 2A @0A @ 2A @ 0A @1A @0A @1A @ 0A −1 0 −1 1 0 0 0 1 B C B −2 C C B @ −3 A

C B B −2 C B C @ 1A

1

B C B −2 C C B @ −3 A

5.6.18. (a) The compatibility condition is $\tfrac{2}{3} b_1 + b_2 = 0$ and so the cokernel basis is $\left( \tfrac{2}{3}, 1 \right)^T$.
(b) The compatibility condition is $-3 b_1 + b_2 = 0$ and so the cokernel basis is $( -3, 1 )^T$.
(c) There are no compatibility conditions, and so the cokernel is $\{0\}$.
(d) The compatibility conditions are $-2 b_1 - b_2 + b_3 = 2 b_1 - 2 b_2 + b_4 = 0$ and so the cokernel basis is $( -2, -1, 1, 0 )^T$, $( 2, -2, 0, 1 )^T$.

5.6.19. (a)

(c)

0

0

1

1

−1

1

B C @ −2 A 0

1 0

0

1

B C B C B −2 C B 0 C C, B C; B @ 2A @1A

−1

1

3

0

1 0

1

10 −1 C B C (b) B @ −21 A, @ 12 A; −12 21

0

1

0

1

0

0

1

0

1

0

1 B −1 C 10 B 10 C = @ −21 A + @ 12 A, 99 −12 99 21

B @

2 7 B 10 C 4 B −1 C C B @ −21 A + @ 12 A, @ −3 A = 33 −12 33 21 0

1

0

1

0

1

1

0

1

0

1

−2 20 B 10 C 2 B −1 C 4C @ −21 A − @ 12 A, A=− 99 −12 99 21 2

−1 7 B 10 C 29 B −1 C B C @ 5A = − @ −21 A + @ 12 A. 99 −12 99 21 7

5.6.20. (a) Cokernel basis: ( 1, −1, 1 )T ; compatibility condition: 2 a − b + c = 0; (b) cokernel basis: ( −1, 1, 1 )T ; compatibility condition: − a + b + c = 0; (c) cokernel basis: ( −3, 1, 1, 0 )T , ( 2, −5, 0, 1 )T ; compatibility conditions: − 3 b1 + b2 + b3 = 2 b1 − 5 b2 + b4 = 0; (d) cokernel basis: ( −1, −1, 1, 0 )T , ( 2, −1, 0, 1 )T ; compatibility conditions: − a − b + c = 2 a − b + d = 0. 5.6.21. (a) z =

(b) z =

(c) z =

(d) z =

0

1 1 2 B C B C @ 0 A, − 12 1 0 1 B 3C B 1C C, B @ 3A − 13 0 14 1 B 17 C B− 1 C B 17 C B C, 4 C B @ − 17 A 5 − 17 0 11 B 2 C B 1C B 3C C B B − 1 C, B 6C C B B 1C @−3 A

0

0 1 0 1 1 1 1 2 2 C B C B C 3B B C B C C w=B @ 0 A = − @ −2 A + @ −3 A; 2 1 1 2 2 0 1 0 1 0 1 2 1 −1 B 3C B C 1B C C C B C B−1 C = − B w=B @ 1 A − @ 0 A; @ 3A 3 1 2 −1 3 0 3 1 1 0 0 1 1C 2 B 17 C B B C C B 1 C B B 4 B1C 1 B −1 C C C B 17 C = C+ B B C; w=B C B 4 C B B3C 51 51 0 @ 17 A A @ @ A 5 3 3 17 0 11 0 1 −3 B 2 C B C B−1 C B 2C B 3C B C C B 1 B C 1C C. w=B −1 B 6C= B B C 6B C B C B 1C @ −2 A @ 3A 0

0

0

5.6.22. (a) (i) Fredholm requires that the cokernel basis $\left( \tfrac{1}{2}, 1 \right)^T$ be orthogonal to the right hand side $( -6, 3 )^T$; (ii) the general solution is $x = -3 + 2y$ with $y$ free; (iii) the minimum norm solution is $x = -\tfrac{3}{5}$, $y = \tfrac{6}{5}$.
(b) (i) Fredholm requires that the cokernel basis $( 27, -13, 5 )^T$ be orthogonal to the right hand side $( -1, 1, 8 )^T$; (ii) there is a unique solution: $x = -2$, $y = 1$; (iii) by uniqueness, the minimum norm solution is the same: $x = -2$, $y = 1$.
(c) (i) Fredholm requires that the cokernel basis $( -1, 3 )^T$ be orthogonal to the right hand side $( 12, 4 )^T$; (ii) the general solution is $x = 2 + \tfrac{1}{2} y - \tfrac{3}{2} z$ with $y, z$ free; (iii) the minimum norm solution is $x = \tfrac{4}{7}$, $y = -\tfrac{2}{7}$, $z = \tfrac{6}{7}$.
(d) (i) Fredholm requires that the cokernel basis $( -11, 3, 7 )^T$ be orthogonal to the right hand side $( 3, 11, 0 )^T$; (ii) the general solution is $x = -3 + z$, $y = 2 - 2z$ with $z$ free; (iii) the minimum norm solution is $x = -\tfrac{11}{6}$, $y = -\tfrac{1}{3}$, $z = \tfrac{7}{6}$.
(e) (i) Fredholm requires that the cokernel basis $( -10, -9, 7, 0 )^T$, $( 6, 4, 0, 7 )^T$ be orthogonal to the right hand side $( -8, 5, -5, 4 )^T$; (ii) the general solution is $x_1 = 1 - t$, $x_2 = 3 + 2t$, $x_3 = t$ with $t$ free; (iii) the minimum norm solution is $x_1 = \tfrac{11}{6}$, $x_2 = \tfrac{4}{3}$, $x_3 = -\tfrac{5}{6}$.
(f) (i) Fredholm requires that the cokernel basis $( -13, 5, 1 )^T$ be orthogonal to the right hand side $( 5, 13, 0 )^T$; (ii) the general solution is $x = 1 + y + w$, $z = 2 - 2w$ with $y, w$ free; (iii) the minimum norm solution is $x = \tfrac{9}{11}$, $y = -\tfrac{9}{11}$, $z = \tfrac{8}{11}$, $w = \tfrac{7}{11}$.
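The minimum norm solutions in Exercise 5.6.22 can be reproduced from the general solutions alone: take any particular solution and remove its orthogonal projection onto the kernel. The sketch below is one way to do this under the Euclidean dot product; the particular solutions and kernel vectors are read off from parts (d) and (f)(ii), and the function names are illustrative rather than taken from the text.

```python
from fractions import Fraction as F

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def orthogonalize(vectors):
    """Gram-Schmidt without normalization; dependent vectors are dropped."""
    basis = []
    for v in vectors:
        u = [F(x) for x in v]
        for b in basis:
            coeff = dot(u, b) / dot(b, b)
            u = [ui - coeff * bi for ui, bi in zip(u, b)]
        if any(u):
            basis.append(u)
    return basis

def minimum_norm_solution(particular, kernel_basis):
    """Subtract from a particular solution its projection onto the kernel."""
    x = [F(c) for c in particular]
    for k in orthogonalize(kernel_basis):
        coeff = dot(x, k) / dot(k, k)
        x = [xi - coeff * ki for xi, ki in zip(x, k)]
    return x

# Part (d): (x, y, z) = (-3, 2, 0) + z (1, -2, 1)
solution_d = minimum_norm_solution([-3, 2, 0], [[1, -2, 1]])
# Part (f): (x, y, z, w) = (1, 0, 2, 0) + y (1, 1, 0, 0) + w (1, 0, -2, 1)
solution_f = minimum_norm_solution([1, 0, 2, 0], [[1, 1, 0, 0], [1, 0, -2, 1]])
print(solution_d, solution_f)
```

The result is orthogonal to the kernel, hence lies in the corange, which is the characterization of the minimum norm solution used in the text.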

1 0

1

1 B − 13 C B 1C −1 C C B B C, B 3 C 5.6.23. (a) B C; @ 0A B @ 1A 1 2

(b)

3

0

1

1 0

−1

1

B C B C B1C B 1C B C, B C; @ 0 A @ −1 A

0

1

(c) yes because of the orthogonality of the corange and kernel; see Exercise 5.6.15.

5.6.24. If $A$ is symmetric, $\ker A = \ker A^T = \operatorname{coker} A$, and so this is an immediate consequence of Theorem 5.55.

♦ 5.6.25. Since $\operatorname{rng} A = \operatorname{span}\{ v_1, \dots, v_n \} = V$, the vector $w$ is orthogonal to $V$ if and only if $w \in V^\perp = (\operatorname{rng} A)^\perp = \operatorname{coker} A$.

♦ 5.6.26. Since $\ker A$ and $\operatorname{corng} A$ are complementary subspaces of $\mathbb{R}^n$, we can write any $x \in \mathbb{R}^n$ as a combination $x = v + z$ with $v = c_1 v_1 + \cdots + c_r v_r \in \operatorname{corng} A$ and $z \in \ker A$. Then $y = A x = A v = c_1 A v_1 + \cdots + c_r A v_r$ and hence every $y \in \operatorname{rng} A$ can be written as a linear combination of $A v_1, \dots, A v_r$, which proves that they span $\operatorname{rng} A$. To prove linear independence, suppose $0 = c_1 A v_1 + \cdots + c_r A v_r = A(c_1 v_1 + \cdots + c_r v_r)$. This implies that $c_1 v_1 + \cdots + c_r v_r \in \ker A$. However, since they are orthogonal complements, the only vector in both $\operatorname{corng} A$ and $\ker A$ is the zero vector, and so $c_1 v_1 + \cdots + c_r v_r = 0$, which, since $v_1, \dots, v_r$ are a basis, implies $c_1 = \cdots = c_r = 0$.

5.6.27. False. The resulting basis is almost never orthogonal.

5.6.28. False. See Example 5.60 for a counterexample.

♦ 5.6.29. If $f \notin \operatorname{rng} K$, then there exists $x \in \ker K = \operatorname{coker} K$ such that $x^T f = x \cdot f = b \ne 0$. But then $p(s\,x) = -2 s\,x^T f + c = -2 b s + c$ can be made arbitrarily large negative by choosing $s = t\,b$ with $t \gg 0$. Thus, $p(x)$ has no minimum value.

♦ 5.6.30. The result is not true if one defines the cokernel and corange in terms of the transposed matrix $A^T$. Rather, one needs to replace $A^T$ by its Hermitian transpose $A^\dagger = \overline{A}^T$, cf. Exercise 5.3.25, and define $\operatorname{corng} A = \operatorname{rng} A^\dagger$, $\operatorname{coker} A = \ker A^\dagger$. (These are the complex conjugates of the spaces defined using the transpose.) With these modifications, both Theorem 5.54 and the Fredholm Alternative Theorem 5.55 are true for complex matrices.

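5.7.1. (a) (i) $c_0 = 0$, $c_1 = -\tfrac{1}{2} i$, $c_2 = c_{-2} = 0$, $c_3 = c_{-1} = \tfrac{1}{2} i$; (ii) $\tfrac{1}{2} i\,e^{-ix} - \tfrac{1}{2} i\,e^{ix} = \sin x$;
(b) (i) $c_0 = \tfrac{1}{2}\pi$, $c_1 = \tfrac{2}{9}\pi$, $c_2 = 0$, $c_3 = c_{-3} = \tfrac{1}{18}\pi$, $c_4 = c_{-2} = 0$, $c_5 = c_{-1} = \tfrac{2}{9}\pi$; (ii) $\tfrac{1}{18}\pi\,e^{-3ix} + \tfrac{2}{9}\pi\,e^{-ix} + \tfrac{1}{2}\pi + \tfrac{2}{9}\pi\,e^{ix} = \tfrac{1}{2}\pi + \tfrac{4}{9}\pi \cos x + \tfrac{1}{18}\pi \cos 3x - \tfrac{1}{18}\pi\,i \sin 3x$;
(c) (i) $c_0 = \tfrac{1}{3}$, $c_1 = \tfrac{3 - \sqrt{3}\,i}{12}$, $c_2 = \tfrac{1 - \sqrt{3}\,i}{12}$, $c_3 = c_{-3} = 0$, $c_4 = c_{-2} = \tfrac{1 + \sqrt{3}\,i}{12}$, $c_5 = c_{-1} = \tfrac{3 + \sqrt{3}\,i}{12}$; (ii) $\tfrac{1 + \sqrt{3}\,i}{12}\,e^{-2ix} + \tfrac{3 + \sqrt{3}\,i}{12}\,e^{-ix} + \tfrac{1}{3} + \tfrac{3 - \sqrt{3}\,i}{12}\,e^{ix} + \tfrac{1 - \sqrt{3}\,i}{12}\,e^{2ix} = \tfrac{1}{3} + \tfrac{1}{2} \cos x + \tfrac{1}{2\sqrt{3}} \sin x + \tfrac{1}{6} \cos 2x + \tfrac{1}{2\sqrt{3}} \sin 2x$;
(d) (i) $c_0 = -\tfrac{1}{8}$, $c_1 = -\tfrac{1}{8} + \tfrac{1 + \sqrt{2}}{4}\,i$, $c_2 = -\tfrac{1}{8}$, $c_3 = -\tfrac{1}{8} - \tfrac{1 - \sqrt{2}}{4}\,i$, $c_4 = c_{-4} = -\tfrac{1}{8}$, $c_5 = c_{-3} = -\tfrac{1}{8} + \tfrac{1 - \sqrt{2}}{4}\,i$, $c_6 = c_{-2} = -\tfrac{1}{8}$, $c_7 = c_{-1} = -\tfrac{1}{8} - \tfrac{1 + \sqrt{2}}{4}\,i$;
(ii) $-\tfrac{1}{8} e^{-4ix} + \left( -\tfrac{1}{8} + \tfrac{1 - \sqrt{2}}{4}\,i \right) e^{-3ix} - \tfrac{1}{8} e^{-2ix} + \left( -\tfrac{1}{8} - \tfrac{1 + \sqrt{2}}{4}\,i \right) e^{-ix} - \tfrac{1}{8} + \left( -\tfrac{1}{8} + \tfrac{1 + \sqrt{2}}{4}\,i \right) e^{ix} - \tfrac{1}{8} e^{2ix} + \left( -\tfrac{1}{8} - \tfrac{1 - \sqrt{2}}{4}\,i \right) e^{3ix}$
$= -\tfrac{1}{8} - \tfrac{1}{4} \cos x - \tfrac{\sqrt{2} + 1}{2} \sin x - \tfrac{1}{4} \cos 2x - \tfrac{1}{4} \cos 3x - \tfrac{\sqrt{2} - 1}{2} \sin 3x - \tfrac{1}{8} \cos 4x + \tfrac{1}{8}\,i \sin 4x.$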
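The coefficients in Exercise 5.7.1 can be checked numerically from the definition $c_k = \frac{1}{n} \sum_{j=0}^{n-1} f(x_j)\,e^{-i k x_j}$ with sample points $x_j = 2\pi j/n$. The sketch below is a direct $O(n^2)$ evaluation, not the Fast Fourier Transform; the sampled function $\sin x$ for part (a) is inferred from the interpolant in (a)(ii), and the function names are illustrative.

```python
import cmath
import math

def sample(func, n):
    """Sample func at the n equally spaced points x_j = 2 pi j / n."""
    return [func(2 * math.pi * j / n) for j in range(n)]

def dft_coefficients(samples):
    """Discrete Fourier coefficients c_0, ..., c_{n-1} of the sample vector."""
    n = len(samples)
    return [
        sum(f * cmath.exp(-2j * math.pi * j * k / n) for j, f in enumerate(samples)) / n
        for k in range(n)
    ]

# Part (a): four samples of sin x (assumed from the interpolant in (a)(ii)).
coefficients = dft_coefficients(sample(math.sin, 4))
for k, c in enumerate(coefficients):
    print(k, complex(round(c.real, 12), round(c.imag, 12)))
```

Because the sampled exponentials are periodic in $k$, the coefficient $c_3$ computed here is the same number as $c_{-1}$ in the solution.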

5.7.2. (a) (i) $f_0 = f_3 = 2$, $f_1 = -1$, $f_2 = -1$; (ii) $e^{-ix} + e^{ix} = 2 \cos x$;
(b) (i) $f_0 = f_5 = 1$, $f_1 = 1 - \sqrt{5}$, $f_2 = 1 + \sqrt{5}$, $f_3 = 1 + \sqrt{5}$, $f_4 = 1 - \sqrt{5}$; (ii) $e^{-2ix} - e^{-ix} + 1 - e^{ix} + e^{2ix} = 1 - 2 \cos x + 2 \cos 2x$;
(c) (i) $f_0 = f_5 = 6$, $f_1 = 2 + 2 e^{2\pi i/5} + 2 e^{-4\pi i/5} = 1 + .7265\,i$, $f_2 = 2 + 2 e^{2\pi i/5} + 2 e^{4\pi i/5} = 1 + 3.0777\,i$, $f_3 = 2 + 2 e^{-2\pi i/5} + 2 e^{-4\pi i/5} = 1 - 3.0777\,i$, $f_4 = 2 + 2 e^{-2\pi i/5} + 2 e^{4\pi i/5} = 1 - .7265\,i$; (ii) $2 e^{-2ix} + 2 + 2 e^{ix} = 2 + 2 \cos x + 2 i \sin x + 2 \cos 2x - 2 i \sin 2x$;
(d) (i) $f_0 = f_1 = f_2 = f_4 = f_5 = 0$, $f_3 = 6$; (ii) $1 - e^{ix} + e^{2ix} - e^{3ix} + e^{4ix} - e^{5ix} = 1 - \cos x + \cos 2x - \cos 3x + \cos 4x - \cos 5x + i\,( -\sin x + \sin 2x - \sin 3x + \sin 4x - \sin 5x )$.

♠ 5.7.3. [Figure not reproduced: plots over $0 \le x \le 2\pi$.]
The interpolants are accurate along most of the interval, but there is a noticeable problem near the endpoints $x = 0, 2\pi$. (In Fourier theory, [ 16, 47 ], this is known as the Gibbs phenomenon.)

♠ 5.7.4. (a)–(f) [Figures not reproduced: plots over $0 \le x \le 2\pi$ for each part.]

5.7.5. (a) [Figure not reproduced: the sixth roots of unity $\zeta_6^0 = 1$, $\zeta_6$, $\zeta_6^2$, $\zeta_6^3 = -1$, $\zeta_6^4$, $\zeta_6^5$ at the vertices of a regular hexagon.]
(b) $\zeta_6 = \tfrac{1}{2} + \tfrac{\sqrt{3}}{2}\,i$;
(c) $\zeta_6^2 = -\tfrac{1}{2} + \tfrac{\sqrt{3}}{2}\,i$, $\zeta_6^3 = -1$, $\zeta_6^4 = -\tfrac{1}{2} - \tfrac{\sqrt{3}}{2}\,i$, $\zeta_6^5 = \tfrac{1}{2} - \tfrac{\sqrt{3}}{2}\,i$, so $1 + \zeta_6 + \zeta_6^2 + \zeta_6^3 + \zeta_6^4 + \zeta_6^5 = 0$.
(d) We are adding three pairs of unit vectors pointing in opposite directions, so the sum cancels out. Equivalently, $\tfrac{1}{6}$ times the sum is the center (of mass) of the regular hexagon, which is at the origin.

♦ 5.7.6. (a) The roots all have modulus $| \zeta^k | = 1$ and phase $\operatorname{ph} \zeta^k = 2\pi k/n$. The angle between successive roots is $2\pi/n$. The sides meet at an angle of $\pi - 2\pi/n$.
(b) Every root has modulus $\sqrt[n]{| z |}$ and the phases are $\tfrac{1}{n}\left( \operatorname{ph} z + 2\pi k \right)$, so the angle between successive roots is $2\pi/n$, and the sides continue to meet at an angle of $\pi - 2\pi/n$. The $n$-gon has radius $\rho = \sqrt[n]{| z |}$. Its first vertex makes an angle of $\varphi = \tfrac{1}{n} \operatorname{ph} z$ with the horizontal. In the figure, the roots $\sqrt[6]{z}$ are at the vertices of a regular hexagon, whose first vertex makes an angle with the horizontal that is $\tfrac{1}{6}$ the angle made by the point $z$. [Figure not reproduced.]
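The statements in Exercises 5.7.5 and 5.7.6 about roots of unity and $n$-th roots are easy to confirm in floating point. The following sketch uses only the standard `cmath` module; the test value $z = 2 + 3i$ is an arbitrary illustration, not taken from the exercises.

```python
import cmath
import math

def roots_of_unity(n):
    """The n complex numbers exp(2 pi i k / n), k = 0, ..., n-1."""
    return [cmath.exp(2j * math.pi * k / n) for k in range(n)]

def nth_roots(z, n):
    """All n-th roots of z: modulus |z|^(1/n), phases (ph z + 2 pi k) / n."""
    rho = abs(z) ** (1.0 / n)
    phase = cmath.phase(z)
    return [cmath.rect(rho, (phase + 2 * math.pi * k) / n) for k in range(n)]

sixth_roots = roots_of_unity(6)
print(sixth_roots[1])            # close to 1/2 + (sqrt(3)/2) i
print(abs(sum(sixth_roots)))     # close to 0: the roots sum to zero

z = 2 + 3j                       # arbitrary sample point
for root in nth_roots(z, 6):
    # every root has the same modulus, and root**6 recovers z up to rounding
    print(abs(root), abs(root ** 6 - z))
```

Successive entries of `nth_roots` differ in phase by $2\pi/n$, which is the angle between neighboring vertices of the regular $n$-gon described in 5.7.6.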

♦ 5.7.7. (a) (i) $i$, $-i$; (ii) $e^{2\pi k i/5}$ for $k = 1, 2, 3$ or $4$; (iii) $e^{2\pi k i/9}$ for $k = 1, 2, 4, 5, 7$ or $8$; (b) $e^{2\pi k i/n}$ whenever $k$ and $n$ have no common factors, i.e., $k$ is relatively prime to $n$.

5.7.8. (a) Yes, the discrete Fourier coefficients are real for all $n$. (b) A function $f(x)$ has real discrete Fourier coefficients if and only if $f(x_k) = f(2\pi - x_k)$ on the sample points $x_0, \dots, x_{n-1}$. In particular, this holds when $f(x) = f(2\pi - x)$.

♥ 5.7.9. (a) In view of (2.13), formula (5.91) is equivalent to matrix multiplication $f = F_n c$, where
$F_n = \left( \omega_0\ \omega_1\ \dots\ \omega_{n-1} \right) = \begin{pmatrix} 1 & 1 & 1 & \cdots & 1 \\ 1 & \zeta & \zeta^2 & \cdots & \zeta^{n-1} \\ 1 & \zeta^2 & \zeta^4 & \cdots & \zeta^{2(n-1)} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 1 & \zeta^{n-1} & \zeta^{2(n-1)} & \cdots & \zeta^{(n-1)^2} \end{pmatrix}$
is the $n \times n$ matrix whose columns are the sampled exponential vectors (5.90). In particular,
$F_2 = \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}, \qquad F_3 = \begin{pmatrix} 1 & 1 & 1 \\ 1 & -\tfrac{1}{2} + \tfrac{\sqrt{3}}{2} i & -\tfrac{1}{2} - \tfrac{\sqrt{3}}{2} i \\ 1 & -\tfrac{1}{2} - \tfrac{\sqrt{3}}{2} i & -\tfrac{1}{2} + \tfrac{\sqrt{3}}{2} i \end{pmatrix}, \qquad F_4 = \begin{pmatrix} 1 & 1 & 1 & 1 \\ 1 & i & -1 & -i \\ 1 & -1 & 1 & -1 \\ 1 & -i & -1 & i \end{pmatrix},$
$F_8 = \begin{pmatrix} 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 \\ 1 & \tfrac{1+i}{\sqrt{2}} & i & \tfrac{-1+i}{\sqrt{2}} & -1 & \tfrac{-1-i}{\sqrt{2}} & -i & \tfrac{1-i}{\sqrt{2}} \\ 1 & i & -1 & -i & 1 & i & -1 & -i \\ 1 & \tfrac{-1+i}{\sqrt{2}} & -i & \tfrac{1+i}{\sqrt{2}} & -1 & \tfrac{1-i}{\sqrt{2}} & i & \tfrac{-1-i}{\sqrt{2}} \\ 1 & -1 & 1 & -1 & 1 & -1 & 1 & -1 \\ 1 & \tfrac{-1-i}{\sqrt{2}} & i & \tfrac{1-i}{\sqrt{2}} & -1 & \tfrac{1+i}{\sqrt{2}} & -i & \tfrac{-1+i}{\sqrt{2}} \\ 1 & -i & -1 & i & 1 & -i & -1 & i \\ 1 & \tfrac{1-i}{\sqrt{2}} & -i & \tfrac{-1-i}{\sqrt{2}} & -1 & \tfrac{-1+i}{\sqrt{2}} & i & \tfrac{1+i}{\sqrt{2}} \end{pmatrix}.$
(b) Clearly, if $f = F_n c$, then $c = F_n^{-1} f$. Moreover, formula (5.91) implies that the $(i, j)$ entry of $F_n^{-1}$ is $\tfrac{1}{n} \zeta_n^{-ij} = \tfrac{1}{n} \overline{\zeta_n^{\,ij}}$, which is $\tfrac{1}{n}$ times the complex conjugate of the $(j, i)$ entry of $F_n$.
(c) By part (b), $U_n^{-1} = \sqrt{n}\,F_n^{-1} = \tfrac{1}{\sqrt{n}} F_n^\dagger = U_n^\dagger$.

♠ 5.7.10. [Figure not reproduced: three plots over $0 \le x \le 2\pi$.]
Original function, 11 mode compression, 21 mode compression. The average absolute errors are .018565 and .007981; the maximal errors are .08956 and .04836, so the 21 mode compression is about twice as accurate.

♠ 5.7.11. (a)

[Figure not reproduced: three plots over $0 \le x \le 2\pi$.]
Original function, 11 mode compression, 21 mode compression. The largest errors are at the endpoints, which represent discontinuities. The average absolute errors are .32562 and .19475; the maximal errors are 2.8716 and 2.6262, so, except near the ends, the 21 mode compression is slightly less than twice as accurate.
(b) [Figure not reproduced: three plots over $0 \le x \le 2\pi$.]
Original function, 11 mode compression, 21 mode compression. The error is much less and more uniform than in cases with discontinuities. The average absolute errors are .02575 and .002475; the maximal errors are .09462 and .013755, so the 21 mode compression is roughly 10 times as accurate.
(c) [Figure not reproduced: three plots over $0 \le x \le 2\pi$.]
Original function, 11 mode compression, 21 mode compression. The only noticeable error is at the endpoints and the corner, $x = \pi$. The average absolute errors are .012814 and .003612; the maximal errors are .06334 and .02823, so the 21 mode compression is 3 times as accurate.

♣ 5.7.12. $l = 4, 27, 57$.

♣ 5.7.13. Very few are needed. In fact, if you take too many modes, you do worse! For example, if $\varepsilon = .1$,

[Figure not reproduced: plots over $0 \le x \le 2\pi$]
plots the noisy signal and the effect of retaining $2l + 1 = 3, 5, 11, 21$ modes. Only the first three give reasonable results. When $\varepsilon = .5$ the effect is even more pronounced:
[Figure not reproduced: plots over $0 \le x \le 2\pi$.]

♠ 5.7.14. For noise varying between $\pm 1$, and $256 = 2^8$ sample points, the errors are

# modes          3        5        7        9        11       13
average error    .8838    .1491    .0414    .0492    .0595    .0625
maximal error    1.5994   .2687    .1575    .1357    .1771    .1752

Thus, the optimal denoising is at $2l + 1 = 7$ or $9$ modes, after which the errors start to get worse. Sampling on fewer points, say $64 = 2^6$, leads to similar results with slightly worse performance:

# modes          3        5        7        9        11       13
average error    .8855    .1561    .0792    .0994    .1082    .1088
maximal error    1.5899   .3348    .1755    .3145    .3833    .4014

On the other hand, tripling the size of the error, to vary between $\pm 3$, leads to similar, and marginally worse performance:

# modes          3        5        7        9        11       13
average error    .8830    .1636    .1144    .3148    .1627    .1708
maximal error    1.6622   .4306    .1755    .3143    .3398    .4280

Note: the numbers will slightly vary each time the random number generator is run.

♣ 5.7.15. The “compressed” function differs significantly from the original signal. The following plots are the function, that obtained by retaining the first $l = 11$ modes, and then the first $l = 21$ modes:
[Figure not reproduced: three plots over $0 \le x \le 2\pi$.]

5.7.16. True for the odd case (5.103), but false for the even case (5.104). Since $\omega_{-k} = \overline{\omega_k}$, when $f$ is real, $c_{-k} = \overline{c_k}$, and so the terms $c_{-k}\,e^{-ikx} + c_k\,e^{ikx} = 2 \operatorname{Re} \left( c_k\,e^{ikx} \right)$ combine to form a real function. Moreover, the constant term $c_0$, which equals the average sample value of $f$, is also real. Thus, all terms in the odd sum pair up into real functions. However, in the even version, the initial term $c_{-m}\,e^{-imx}$ has no match, and remains, in general, complex. See Exercise 5.7.1 for examples.

0

0

1

1

0

0

1

3

1

1 0 4 B B C B 2 C B C C B1C B−1 + 1 i C B1C B 1C − B2C C B 4 B C B 2C (2) (0) (1) 4 C; C, c = c C, c C, c =B =B =B ♠ 5.7.17. (a) f = B B C C C B B1C B − 14 @1A A @ @2A @ 1 A 3 3 1 1 1 −2 −4 + 4 i 2 2 0 1 1 0 0 1 1 0 0 0 0 0 0 B 1 C C B 0 B 1 C C B B √ C B 0 C B B 1 C C B B C B B−2 i C B− i B 0 C 2 C B C C B B B 2 C C B B C B 1 C B 0 C B B 0 C B 1 C C B B B 0 C C B B C C B B 1 B C C B B √1 C B −1 C B B C B 1 C B C C B B 2i C B 0 C B B C B √1 C (2) (3) (0) (1) 2 B B C C B C, c C, c B , c = , c = c = (b) f = B = = B 0 C B B 0 C B 0 C C B 2 B B 0 C C B B C C B B 1− i C B B 1 C 1 C B B C B B C C B √ √ √ B − √1 C B− C B 2 2 C B 0 B 2C 2C B B C B B C C B 2 B B 1 C C B B C C B B B √ C C 0 B C C B @ 0 0 B −1 C B C @ A A @ 2 A @ @ A 1 1 1+ i √ √ 2i − √1 − √1 2

0

2

0

(c) f =

B B B B B B B B B B B B B B B B B @

1

0

π B C B πC B C B C B πC B C B C C π C (0) B B C, c =B B 0 C B C B C 1 C B B 4πC B C 1 C B π @ A 2 3 4π

3 4 1 2 1 4

2 2

2

1

π C 0 C C 1 C C π 2 C 1 C C 2πC , c(1) = 3 C C π C 4 C 1 C 4πC C 1 C π 4 A 3 4π

0 1 1 π 2 B 1 C B C B 2π C B C 1 B C B 2π C B C B C B 0 C B C, B 1π C B 2 C B C B 1 C B 4π C B C B 1 C @ 2π A − 41 π


0

c(2) =

0

1

1 π B 2 C B 1π C B 4 C B C B 0 C B C B C B 1π C B 4 C B C, B 1 C B 2π C B C B 1+ i C B 8 πC B C B C 0 A @ 1− i 8 π

c = c(3) =

B B B B B B B B B B B B B B B B B B B B B @

1 π √2 2+1 √ 8 2

0

√ 2−1 √ 8 2

0

√ 2−1 √ 8 2

0

√ 2+1 √ 8 2

1

C C C C C C C C C. C C C C C C C A

1 C C

πC C

C C C C C πC C C; C C C C πC C C C C A

π

(d)
$f = ( 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 )^T$,
$c^{(0)} = ( 1, 0, 1, 0, 1, 0, 0, 0, 1, 0, 1, 0, 1, 0, 0, 0 )^T$,
$c^{(1)} = ( .5, .5, .5, .5, .5, .5, 0, 0, .5, .5, .5, .5, .5, .5, 0, 0 )^T$,
$c^{(2)} = ( .5, .25 - .25\,i, 0, .25 + .25\,i, .25, .25, .25, .25, .5, .25 - .25\,i, 0, .25 + .25\,i, .25, .25, .25, .25 )^T$,
$c^{(3)} = ( .375, .2134 - .2134\,i, -.125\,i, .0366 + .0366\,i, .125, .0366 - .0366\,i, .125\,i, .2134 + .2134\,i, .375, .2134 - .2134\,i, -.125\,i, .0366 + .0366\,i, .125, .0366 - .0366\,i, .125\,i, .2134 + .2134\,i )^T$,
$c = c^{(4)} = ( .375, .1644 - .2461\,i, -.0442 - .1067\,i, .0422 + .0084\,i, .0625 - .0625\,i, -.0056 - .0282\,i, .0442 + .0183\,i, .0490 - .0327\,i, 0, .0490 + .0327\,i, .0442 - .0183\,i, -.0056 + .0282\,i, .0625 + .0625\,i, .0422 - .0084\,i, -.0442 + .1067\,i, .1644 + .2461\,i )^T$.
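The final coefficient vector in part (d) can be spot-checked with a short radix-2 Fast Fourier Transform. The sketch below is a plain recursive even/odd splitting, which produces the same $c$ but does not reproduce the intermediate stages $c^{(0)}, \dots, c^{(3)}$ of the bit-reversal version in the text; it assumes the normalization $c_k = \frac{1}{n} \sum_j f_j\,e^{-2\pi i jk/n}$, and the function names are illustrative.

```python
import cmath
import math

def _fft(values):
    """Unnormalized transform: sum_j values[j] * exp(-2 pi i j k / n)."""
    n = len(values)
    if n == 1:
        return list(values)
    even = _fft(values[0::2])   # transform of the even-indexed samples
    odd = _fft(values[1::2])    # transform of the odd-indexed samples
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * math.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out

def fft_coefficients(samples):
    """Discrete Fourier coefficients of a sample vector whose length is a power of 2."""
    n = len(samples)
    if n < 1 or n & (n - 1):
        raise ValueError("the number of samples must be a power of 2")
    return [value / n for value in _fft(samples)]

f = [1] * 6 + [0] * 10          # the sample vector of part (d)
c = fft_coefficients(f)
for k, ck in enumerate(c):
    print(k, round(ck.real, 4), round(ck.imag, 4))
```

The printed values can be compared entry by entry with $c = c^{(4)}$ above, which is given to four decimal places.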

♠ 5.7.18. (a) c =

0

1

1

C B B −1 C C, B @ 1A

0

−1

1 B0 B B B1 B B0 M2 = B B0 B B B0 B @0 0

0 1 0 1 0 0 0 0

f (0)

1 0 −1 0 0 0 0 0

1

1

C B B 1C C, B @ −1 A

0

2

1

C B B 0C B C, @ −2 A

f (1) =

−1

f = f (2) =

0

0

0

1

B C B0C B C; @4A

0

0

1

3 1 0 1 0 1 2 4 4 B √3 (1 + i ) C B C B 0C B 2C B 0C 2 B C C B B C B C B C C B B C B C 4 + 3 i B C B 0C B 4C B 0C B 3 C B C C B C B B √ (−1 + i ) C B 0C C B C B C 2 C, f (1) = B 0 C, f (2) = B 0 C, f = f (3) = B B C. =B B 2C B −1 C B 1C B C 5 B C B C C B B C B C C C B B B − √3 (1 + i ) C B −1 C B 3C B 3C B C B C C B C B 2 B C @ −1 A @ −2 A @ 3A B C 4 − 3i @ A 3 −1 0 3 √ (1 − i ) 0

1

2 B 2C B C B C B 0C B C B −1 C C (b) c = B B 2 C, B C C B B −1 C C B @ 0A −1 ♥ 5.7.19. (a) 0 1 0 0 B0 0 0 B B B0 0 1 B B0 0 0 M0 = B B0 1 0 B B B0 0 0 B @0 0 0 0 0 0 0

f (0) =

0

2

0 0 0 0 0 0 1 0

0

1

0 1 0 0 0 0 0 0

0 0 0 0 0 1 0 0

0 0 0 1 0 0 0 0

0 0C C 0C C C 0C C, 0C C C 0C C 0A 1

0 i 0 −i 0 0 0 0

0 0 0 0 1 0 1 0

0 0 0 0 0 1 0 1

0 0 0 0 1 0 −1 0

1

0 0C C 0C C C 0C C, 0C C C iC C 0A −i

1 B1 B B B0 B B0 M1 = B B0 B B B0 B @0 0 0 1 B0 B B B0 B B B0 B M3 = B B1 B B0 B B B0 @ 0

1 0 0 0 0 −1 0 0 0 0 0 1 1 0 0 0 1 −1 0 0 0 0 0 1 1 0 0 0 1 −1 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 1+ √i 1 0 0 0 2 0 1 0 0 0 0 0 1 0 0 0 0 0 −1 0 √i 1 0 0 0 − 1+ 2 0 1 0 0 0 0 0 1 0 0

0 0 0 0 0 0 1 1

1

0 0C C 0C C C 0C C, 0C C C 0C C 1A −1 0 0 0 0 i 0 i√ −1 0 2 0 0 0 0 −i 0 1− √i 0 2

1

C C C C C C C C C. C C C C C C A

(b) Because, by composition, $f = M_3 M_2 M_1 M_0\,c$. On the other hand, according to Exercise 5.7.9, $f = F_8\,c$, and so $M_3 M_2 M_1 M_0\,c = F_8\,c$. Since this holds for all $c$, the coefficient matrices must be equal: $F_8 = M_3 M_2 M_1 M_0$. (c) $c^{(0)} = N_0 f$, $c^{(1)} = N_1 c^{(0)}$, $c^{(2)} = N_2 c^{(1)}$, $c = c^{(3)} = N_3 c^{(2)}$, where $N_0 = M_0$, and $N_j = \tfrac{1}{2} \overline{M_j}$, $j = 1, 2, 3$, and $F_8^{-1} = \tfrac{1}{8} F_8^\dagger = N_3 N_2 N_1 N_0$.



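6.1.1. (a) $K = \begin{pmatrix} 3 & -2 \\ -2 & 3 \end{pmatrix}$; (b) $u = \begin{pmatrix} \tfrac{18}{5} \\ \tfrac{17}{5} \end{pmatrix} = \begin{pmatrix} 3.6 \\ 3.4 \end{pmatrix}$; (c) the first mass has moved the farthest; (d) $e = \left( \tfrac{18}{5}, -\tfrac{1}{5}, -\tfrac{17}{5} \right)^T = ( 3.6, -.2, -3.4 )^T$, so the first spring has stretched the most, while the third spring experiences the most compression.

6.1.2. (a) $K = \begin{pmatrix} 3 & -1 \\ -1 & 2 \end{pmatrix}$; (b) $u = \begin{pmatrix} \tfrac{11}{5} \\ \tfrac{13}{5} \end{pmatrix} = \begin{pmatrix} 2.2 \\ 2.6 \end{pmatrix}$; (c) the second mass has moved the farthest; (d) $e = \left( \tfrac{11}{5}, \tfrac{2}{5}, -\tfrac{13}{5} \right)^T = ( 2.2, .4, -2.6 )^T$, so the first spring has stretched the most, while the third spring experiences even more compression.

6.1.3.
6.1.1: (a) $K = \begin{pmatrix} 3 & -2 \\ -2 & 2 \end{pmatrix}$; (b) $u = \begin{pmatrix} 7 \\ \tfrac{17}{2} \end{pmatrix} = \begin{pmatrix} 7.0 \\ 8.5 \end{pmatrix}$; (c) the second mass has moved the farthest; (d) $e = \left( 7, \tfrac{3}{2} \right)^T = ( 7.0, 1.5 )^T$, so the first spring has stretched the most.
6.1.2: (a) $K = \begin{pmatrix} 3 & -1 \\ -1 & 1 \end{pmatrix}$; (b) $u = \begin{pmatrix} \tfrac{7}{2} \\ \tfrac{13}{2} \end{pmatrix} = \begin{pmatrix} 3.5 \\ 6.5 \end{pmatrix}$; (c) the second mass has moved the farthest; (d) $e = \left( \tfrac{7}{2}, 3 \right)^T = ( 3.5, 3. )^T$, so the first spring has stretched slightly farther.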
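The answers to Exercises 6.1.1 through 6.1.3 all follow the same recipe: build the incidence matrix $A$, form $K = A^T C A$, solve $K u = f$, and read off $e = A u$. The following sketch carries this out in exact arithmetic. The spring constants $(1, 2, 1)$ and the force vector $f = (4, 3)^T$ are assumptions inferred from the stiffness matrices and displacements above (they are consistent with the potential energy in Exercise 6.1.14), and the function names are illustrative.

```python
from fractions import Fraction as F

def incidence_matrix(n_masses, bottom_fixed):
    """Rows give the elongations e = A u of a vertical chain attached at the top."""
    n_springs = n_masses + 1 if bottom_fixed else n_masses
    A = [[F(0)] * n_masses for _ in range(n_springs)]
    for i in range(n_springs):
        if i < n_masses:
            A[i][i] = F(1)          # spring i is stretched by the mass below it
        if i > 0:
            A[i][i - 1] = F(-1)     # and shortened by the mass above it
    return A

def stiffness_matrix(A, springs):
    """K = A^T C A with C = diag(springs)."""
    n = len(A[0])
    return [[sum(F(c) * row[i] * row[j] for c, row in zip(springs, A))
             for j in range(n)] for i in range(n)]

def solve(matrix, rhs):
    """Exact Gauss-Jordan elimination for a small nonsingular system."""
    n = len(rhs)
    rows = [list(row) + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if rows[r][col] != 0)
        rows[col], rows[pivot] = rows[pivot], rows[col]
        for r in range(n):
            if r != col:
                factor = rows[r][col] / rows[col][col]
                rows[r] = [a - factor * b for a, b in zip(rows[r], rows[col])]
    return [rows[i][n] / rows[i][i] for i in range(n)]

def equilibrium(springs, forces, bottom_fixed):
    """Return the stiffness matrix, displacements and elongations."""
    A = incidence_matrix(len(forces), bottom_fixed)
    K = stiffness_matrix(A, springs)
    u = solve(K, [F(f) for f in forces])
    e = [sum(a * x for a, x in zip(row, u)) for row in A]
    return K, u, e

# Assumed data for Exercise 6.1.1: springs 1, 2, 1 and forces (4, 3), both ends fixed.
K, u, e = equilibrium([1, 2, 1], [4, 3], bottom_fixed=True)
print(K, u, e)
# Same chain with the bottom support (and third spring) removed, as in Exercise 6.1.3.
print(equilibrium([1, 2], [4, 3], bottom_fixed=False))
```

Switching `bottom_fixed` changes only the number of rows of $A$, which is why the free-end stiffness matrices in 6.1.3 differ from those in 6.1.1 and 6.1.2 only in the last diagonal entry.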

6.1.4. (a) $u = ( 1, 3, 3, 1 )^T$, $e = ( 1, 2, 0, -2, -1 )^T$. The solution is unique since $K$ is invertible. (b) Now $u = ( 2, 6, 7.5, 7.5 )^T$, $e = ( 2, 4, 1.5, 0 )^T$. The masses have all moved farther, and the springs are elongated more; in this case, no springs are compressed.

6.1.5. (a) Since $e_1 = u_1$, $e_j = u_j - u_{j-1}$ for $2 \le j \le n$, while $e_{n+1} = -u_n$, we have $e_1 + \cdots + e_{n+1} = u_1 + (u_2 - u_1) + (u_3 - u_2) + \cdots + (u_n - u_{n-1}) - u_n = 0$. Alternatively, note that $z = ( 1, 1, \dots, 1 )^T \in \operatorname{coker} A$ and hence $z \cdot e = e_1 + \cdots + e_{n+1} = 0$ since $e = A u \in \operatorname{rng} A$. (b) Now there are only $n$ springs, and so $e_1 + \cdots + e_n = u_1 + (u_2 - u_1) + (u_3 - u_2) + \cdots + (u_n - u_{n-1}) = u_n$. Thus, the average elongation $\tfrac{1}{n} (e_1 + \cdots + e_n) = \tfrac{1}{n} u_n$ equals the displacement of the last mass divided by the number of springs.

♦ 6.1.6. Since the stiffness matrix $K$ is symmetric, so is its inverse $K^{-1}$. The basis vector $e_i$ represents a unit force on the $i$th mass only; the resulting displacement is $u = K^{-1} e_i$, which is the $i$th column of $K^{-1}$. Thus, the $(j, i)$ entry of $K^{-1}$ is the displacement of the $j$th mass when subject to a unit force on the $i$th mass. Since $K^{-1}$ is a symmetric matrix, this is equal to its $(i, j)$ entry, which, for the same reason, is the displacement of the $i$th mass when subject to a unit force on the $j$th mass.

♣ 6.1.7. [Plots not reproduced; the horizontal axis in each runs from 0 to 100.] The cases plotted are:
Top and bottom support; constant force.
Top and bottom support; linear force.
Top and bottom support; quadratic force.
Top support only; constant force.
Top support only; linear force.
Top support only; quadratic force.

6.1.8. (a) For maximum displacement of the bottom mass, the springs should be arranged from weakest at the top to strongest at the bottom, so $c_1 = c = 1$, $c_2 = c' = 2$, $c_3 = c'' = 3$. (b) In this case, the order giving maximum displacement of the bottom mass is $c_1 = c = 2$, $c_2 = c' = 3$, $c_3 = c'' = 1$.

♣ 6.1.9. (a) When the bottom end is free, for maximum displacement of the bottom mass, the springs should be arranged from weakest at the top to strongest at the bottom. In fact, the $i$th elongation is $e_i = (n - i + 1)/c_i$. The displacement of the bottom mass is the sum of the elongations of all the springs above it, $u_n = \sum_{i=1}^{n} e_i = \sum_{i=1}^{n} \frac{n - i + 1}{c_i}$, and achieves its maximum value if and only if $c_1 \le c_2 \le \cdots \le c_n$. (b) In this case, the weakest spring should be at the bottom, while the remaining springs are arranged in order from second weakest at the top to strongest just above the last mass. A proof that this is best would be interesting...

6.1.10. The sub-diagonal entries of $L$ are $l_{i,i-1} = -\frac{i-1}{i}$, while the diagonal entries of $D$ are $d_{ii} = \frac{i+1}{i}$.

♥ 6.1.11. (a) Since $y = A u$, we have $y \in \operatorname{rng} A = \operatorname{corng} A^T$. Thus, according to Theorem 5.59, $y$ has minimal Euclidean norm among all solutions to $A^T y = f$. (b) To find the minimal norm solution to $A^T y = f$, we proceed as in Chapter 5, and append the conditions that $y$ is orthogonal to $\ker A^T = \operatorname{coker} A$. In the particular case of Example 6.1,
$A^T = \begin{pmatrix} 1 & -1 & 0 & 0 \\ 0 & 1 & -1 & 0 \\ 0 & 0 & 1 & -1 \end{pmatrix}$,
and $\ker A^T$ is spanned by $z = (1, 1, 1, 1)^T$. To find the minimal norm solution, we solve the enlarged system
$\begin{pmatrix} 1 & -1 & 0 & 0 \\ 0 & 1 & -1 & 0 \\ 0 & 0 & 1 & -1 \\ 1 & 1 & 1 & 1 \end{pmatrix} y = \begin{pmatrix} 0 \\ 1 \\ 0 \\ 0 \end{pmatrix}$
obtained by appending the compatibility condition $z \cdot y = 0$, whose solution $y = \left( \frac12, \frac12, -\frac12, -\frac12 \right)^T$ reproduces the stress in the example.
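As a quick check of Exercise 6.1.11(b), the following sketch verifies in exact arithmetic that the stated vector solves the enlarged system and that shifting it along z only increases its norm. The right-hand side f = (0, 1, 0)^T is read off from the enlarged system; the helper name `norm_sq_shifted` is purely illustrative.

```python
from fractions import Fraction as F

# Enlarged system: the three rows of A^T followed by z^T = (1, 1, 1, 1).
M = [[1, -1, 0, 0],
     [0, 1, -1, 0],
     [0, 0, 1, -1],
     [1, 1, 1, 1]]
rhs = [0, 1, 0, 0]   # assumed f = (0, 1, 0)^T, plus the condition z . y = 0
y = [F(1, 2), F(1, 2), F(-1, 2), F(-1, 2)]

# Residual of the enlarged system; every entry should vanish.
residual = [sum(m * yi for m, yi in zip(row, y)) - b for row, b in zip(M, rhs)]
norm_sq = sum(yi * yi for yi in y)

def norm_sq_shifted(t):
    """Squared norm of y + t z, another solution of A^T y = f."""
    return sum((yi + t) ** 2 for yi in y)
```

Because z is orthogonal to y, the shifted norm equals the original squared norm plus 4 t^2, so t = 0 gives the minimum.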

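6.1.12. Regular Gaussian Elimination reduces them to
$\begin{pmatrix} 2 & -1 & 0 \\ 0 & \frac32 & -1 \\ 0 & 0 & \frac43 \end{pmatrix}$ and $\begin{pmatrix} 2 & -1 & 0 \\ 0 & \frac32 & -1 \\ 0 & 0 & \frac13 \end{pmatrix}$,
respectively. Since all three pivots are positive, the matrices are positive definite.

6.1.13. Denoting the gravitational force by $g$:
(a) $p(u) = \frac12 \,( u_1 \; u_2 \; u_3 ) \begin{pmatrix} 2 & -1 & 0 \\ -1 & 2 & -1 \\ 0 & -1 & 1 \end{pmatrix} \begin{pmatrix} u_1 \\ u_2 \\ u_3 \end{pmatrix} - ( u_1 \; u_2 \; u_3 ) \begin{pmatrix} g \\ g \\ g \end{pmatrix}$
$= u_1^2 - u_1 u_2 + u_2^2 - u_2 u_3 + \frac12 u_3^2 - g\,(u_1 + u_2 + u_3)$.
(b) $p(u) = \frac12 \,( u_1 \; u_2 \; u_3 \; u_4 ) \begin{pmatrix} 2 & -1 & 0 & 0 \\ -1 & 2 & -1 & 0 \\ 0 & -1 & 2 & -1 \\ 0 & 0 & -1 & 2 \end{pmatrix} \begin{pmatrix} u_1 \\ u_2 \\ u_3 \\ u_4 \end{pmatrix} - ( u_1 \; u_2 \; u_3 \; u_4 ) \begin{pmatrix} g \\ g \\ g \\ g \end{pmatrix}$
$= u_1^2 - u_1 u_2 + u_2^2 - u_2 u_3 + u_3^2 - u_3 u_4 + u_4^2 - g\,(u_1 + u_2 + u_3 + u_4)$.
(c) $p(u) = \frac12 \,( u_1 \; u_2 \; u_3 \; u_4 ) \begin{pmatrix} 2 & -1 & 0 & 0 \\ -1 & 2 & -1 & 0 \\ 0 & -1 & 2 & -1 \\ 0 & 0 & -1 & 1 \end{pmatrix} \begin{pmatrix} u_1 \\ u_2 \\ u_3 \\ u_4 \end{pmatrix} - ( u_1 \; u_2 \; u_3 \; u_4 ) \begin{pmatrix} g \\ g \\ g \\ g \end{pmatrix}$
$= u_1^2 - u_1 u_2 + u_2^2 - u_2 u_3 + u_3^2 - u_3 u_4 + \frac12 u_4^2 - g\,(u_1 + u_2 + u_3 + u_4)$.

6.1.14. (a) $p(u) = \frac12 \,( u_1 \; u_2 ) \begin{pmatrix} 3 & -2 \\ -2 & 3 \end{pmatrix} \begin{pmatrix} u_1 \\ u_2 \end{pmatrix} - ( u_1 \; u_2 ) \begin{pmatrix} 4 \\ 3 \end{pmatrix} = \frac32 u_1^2 - 2\, u_1 u_2 + \frac32 u_2^2 - 4\, u_1 - 3\, u_2$, so $p(u^\star) = p( 3.6, 3.4 ) = -12.3$. (b) For instance, $p(1, 0) = -2.5$, $p(0, 1) = -1.5$, $p(3, 3) = -12$.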
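The numbers in Exercise 6.1.14 can be confirmed with a short exact computation. This is only a sketch: the helper names `solve2` and `p` are illustrative, and the minimizer is obtained by solving K u = f with Cramer's rule.

```python
from fractions import Fraction as F

def solve2(K, f):
    """Solve the 2x2 system K u = f by Cramer's rule (exact arithmetic)."""
    (a, b), (c, d) = K
    det = a * d - b * c
    return [(f[0] * d - b * f[1]) / det, (a * f[1] - c * f[0]) / det]

def p(K, f, u):
    """Quadratic potential energy p(u) = 1/2 u^T K u - u^T f."""
    quad = sum(u[i] * K[i][j] * u[j] for i in range(2) for j in range(2))
    return quad / 2 - sum(ui * fi for ui, fi in zip(u, f))

K = [[F(3), F(-2)], [F(-2), F(3)]]
f = [F(4), F(3)]
u_star = solve2(K, f)      # minimizer: solution of K u = f
p_star = p(K, f, u_star)   # minimum value of the energy
```

Here `u_star` comes out as (18/5, 17/5) = (3.6, 3.4) and `p_star` as −123/10, while the sample points in part (b) all give larger values.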

6.1.15. (a) $p(u) = \frac34 u_1^2 - \frac12 u_1 u_2 + \frac7{12} u_2^2 - \frac23 u_2 u_3 + \frac7{12} u_3^2 - \frac12 u_3 u_4 + \frac34 u_4^2 - u_2 - u_3$, so $p(u^\star) = p( 1, 3, 3, 1 ) = -3$. (b) For instance, $p(1, 0, 0, 0) = p(0, 0, 0, 1) = .75$, $p(0, 1, 0, 0) = p(0, 0, 1, 0) = -.4167$.

6.1.16. (a) Two masses, both ends fixed, $c_1 = 2$, $c_2 = 4$, $c_3 = 2$, $f = ( -1, 3 )^T$; equilibrium: $u^\star = ( .3, .7 )^T$.
(b) Two masses, top end fixed, $c_1 = 4$, $c_2 = 6$, $f = ( 0, -2 )^T$; equilibrium: $u^\star = \left( -\frac12, -\frac56 \right)^T = ( -.5, -.8333 )^T$.
(c) Three masses, top end fixed, $c_1 = 1$, $c_2 = 3$, $c_3 = 5$, $f = ( 1, 1, -1 )^T$; equilibrium: $u^\star = \left( 1, 1, \frac45 \right)^T = ( 1, 1, .8 )^T$.
(d) Four masses, both ends fixed, $c_1 = 3$, $c_2 = 1$, $c_3 = 1$, $c_4 = 1$, $c_5 = 3$, $f = ( -1, 0, 2, 0 )^T$; equilibrium: $u^\star = ( -.0606, .7576, 1.5758, .3939 )^T$.

6.1.17. In both cases, the homogeneous system $A u = 0$ requires $0 = u_1 = u_2 = \cdots = u_n$, and so $\ker A = \{0\}$, proving linear independence of its columns.

6.1.18. This is an immediate consequence of Exercise 4.2.9.

♥ 6.1.19. (a) When only the top end is supported, the potential energy is lowest when the springs are arranged from weakest at the top to strongest at the bottom: $c_1 = c = 1$, $c_2 = c' = 2$, $c_3 = c'' = 3$, with energy $-\frac{17}{3} = -5.66667$ under a unit gravitational force. (b) When both ends are fixed, the potential energy is minimized when either the springs are in the order $c_1 = c = 1$, $c_2 = c' = 3$, $c_3 = c'' = 2$ or the reverse order $c_1 = c = 2$, $c_2 = c' = 3$, $c_3 = c'' = 1$, both of which have energy $-\frac{15}{22} = -.681818$ under a unit gravitational force.

6.1.20. True. The potential energy function (6.16) uniquely determines the symmetric stiffness matrix $K$ and the external force vector $f$. According to (6.12), (6.15), the off-diagonal entries of $K$ determine the individual spring constants $c_2, \ldots, c_n$ of all but the first and (if there is one) last springs. But once we know $c_2$ and $c_n$, the remaining one or two constants, $c_1$ and $c_{n+1}$, are uniquely prescribed by the $(1, 1)$ and $(n, n)$ entries of $K$. If $c_{n+1} = 0$, then the bottom end is not attached to a support. Thus, the potential energy uniquely prescribes the entire mass–spring chain.
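The equilibria and energies quoted in Exercises 6.1.16 and 6.1.19 can be reproduced with a small exact solver. This is a sketch under the usual chain model (tridiagonal stiffness matrix with diagonal entries c_i + c_{i+1} and off-diagonal entries −c_{i+1}); the function names `stiffness`, `solve`, and `energy` are illustrative, not taken from the text.

```python
from fractions import Fraction as F

def stiffness(c, bottom_fixed):
    """Tridiagonal stiffness matrix of a chain with spring constants c.
    Both ends fixed: len(c) = n + 1.  Free bottom end: len(c) = n."""
    n = len(c) - 1 if bottom_fixed else len(c)
    cc = [F(x) for x in c] + ([] if bottom_fixed else [F(0)])
    K = [[F(0)] * n for _ in range(n)]
    for i in range(n):
        K[i][i] = cc[i] + cc[i + 1]
        if i + 1 < n:
            K[i][i + 1] = K[i + 1][i] = -cc[i + 1]
    return K

def solve(K, f):
    """Gaussian elimination without pivoting (K is positive definite)."""
    n = len(f)
    M = [row[:] + [F(b)] for row, b in zip(K, f)]
    for j in range(n):
        for i in range(j + 1, n):
            m = M[i][j] / M[j][j]
            M[i] = [a - m * b for a, b in zip(M[i], M[j])]
    u = [F(0)] * n
    for i in reversed(range(n)):
        u[i] = (M[i][n] - sum(M[i][k] * u[k] for k in range(i + 1, n))) / M[i][i]
    return u

def energy(c, f, bottom_fixed):
    """Minimum potential energy p(u*) = -1/2 f . u*."""
    u = solve(stiffness(c, bottom_fixed), f)
    return -sum(F(fi) * ui for fi, ui in zip(f, u)) / 2
```

For example, `solve(stiffness([4, 6], False), [0, -2])` returns the equilibrium of 6.1.16(b), and `energy([1, 3, 2], [1, 1], True)` returns the energy of the first ordering in 6.1.19(b).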


6.2.1. [Graphs (a)–(e) not reproduced.]

6.2.2. (a) $A = \begin{pmatrix} 1 & -1 & 0 & 0 \\ 1 & 0 & -1 & 0 \\ 1 & 0 & 0 & -1 \\ 0 & 1 & -1 & 0 \\ 0 & 1 & 0 & -1 \end{pmatrix}$;
(b) $\begin{pmatrix} 3 & -1 & -1 \\ -1 & 3 & -1 \\ -1 & -1 & 2 \end{pmatrix} \begin{pmatrix} u_1 \\ u_2 \\ u_3 \end{pmatrix} = \begin{pmatrix} 3 \\ 0 \\ 0 \end{pmatrix}$.
(c) $u = \left( \frac{15}8, \frac98, \frac32 \right)^T = ( 1.875, 1.125, 1.5 )^T$;
(d) $y = v = A u = \left( \frac34, \frac38, \frac{15}8, -\frac38, \frac98 \right)^T = ( .75, .375, 1.875, -.375, 1.125 )^T$.
(e) The bulb will be brightest when connected to wire 3, which has the most current flowing through.

6.2.3. The reduced incidence matrix is $A^\star = \begin{pmatrix} 1 & -1 \\ 1 & 0 \\ 1 & 0 \\ 0 & 1 \\ 0 & 1 \end{pmatrix}$, and the equilibrium equations are $\begin{pmatrix} 3 & -1 \\ -1 & 3 \end{pmatrix} u = \begin{pmatrix} 3 \\ 0 \end{pmatrix}$, with solution $u = \begin{pmatrix} \frac98 \\ \frac38 \end{pmatrix} = \begin{pmatrix} 1.125 \\ .375 \end{pmatrix}$; the resulting currents are $y = v = A u = \left( \frac34, \frac98, \frac98, \frac38, \frac38 \right)^T = ( .75, 1.125, 1.125, .375, .375 )^T$. Now, wires 2 and 3 both have the most current. Wire 1 is unchanged; the current in wire 2 has increased; the currents in wires 3 and 5 have decreased; the current in wire 4 has reversed direction.
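The computation in Exercise 6.2.3 can be retraced in a few lines. The sketch below assumes unit conductances, so that K = A^T A (this assumption reproduces the coefficient matrix quoted in the solution), and takes the right-hand side (3, 0)^T from the equilibrium equations.

```python
from fractions import Fraction as F

A = [[1, -1], [1, 0], [1, 0], [0, 1], [0, 1]]   # reduced incidence matrix
f = [F(3), F(0)]                                # right-hand side of K u = f

# K = A^T A, assuming every wire has unit conductance.
K = [[sum(F(r[i] * r[j]) for r in A) for j in range(2)] for i in range(2)]

# Solve the 2x2 system by Cramer's rule.
det = K[0][0] * K[1][1] - K[0][1] * K[1][0]
u = [(f[0] * K[1][1] - K[0][1] * f[1]) / det,
     (K[0][0] * f[1] - K[1][0] * f[0]) / det]

# Currents (equal to the voltages for unit conductances): y = A u.
y = [r[0] * u[0] + r[1] * u[1] for r in A]
```

The list `y` then holds the five wire currents, with the largest entries in positions 2 and 3.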

6.2.4. (a) $A = \begin{pmatrix} 1 & -1 & 0 & 0 & 0 \\ 0 & 1 & -1 & 0 & 0 \\ 1 & 0 & -1 & 0 & 0 \\ 1 & 0 & 0 & -1 & 0 \\ 0 & 1 & 0 & -1 & 0 \\ 0 & 0 & 1 & 0 & -1 \\ 0 & 0 & 0 & 1 & -1 \end{pmatrix}$;
(b) $u = \left( \frac{34}{35}, \frac{23}{35}, \frac{19}{35}, \frac{16}{35} \right)^T = ( .9714, .6571, .5429, .4571 )^T$;
$y = \left( \frac{11}{35}, \frac4{35}, \frac37, \frac9{35}, \frac15, \frac{19}{35}, \frac{16}{35} \right)^T = ( .3143, .1143, .4286, .2571, .2000, .5429, .4571 )^T$;
(c) wire 6.

6.2.5. (a) Same incidence matrix; (b) $u = ( .4714, -.3429, .0429, -.0429 )^T$; $y = ( .8143, -.3857, .4286, .2571, -.3000, .0429, -.0429 )^T$; (c) wire 1.

♠ 6.2.6. None.

♠ 6.2.7. There is no current on the two wires connecting the same poles of the batteries (positive-positive and negative-negative) and 2.5 amps along all the other wires.

♠ 6.2.8. (a) The potentials remain the same, but the currents are all twice as large. (b) The potentials are $u = ( -4.1804, 3.5996, -2.7675, -2.6396, .8490, .9376, -2.0416, 0 )^T$, while the currents are $y = ( 1.2200, -.7064, -.5136, .6876, .5324, -.6027, -.1037, -.4472, -.0664, .0849, .0852, -.1701 )^T$.

♣ 6.2.9. Resistors 3 and 4 should be on the battery wire and the opposite wire, in either order. Resistors 1 and 6 are connected to one end of resistor 3 while resistors 2 and 5 are connected to its other end; also, resistors 1 and 5 are connected to one end of resistor 4 while resistors 2 and 6 are connected to its other end. Once the wires are labeled, there are 8 possible configurations. The current through the light bulb is .4523.

♣ 6.2.10. (a) For $n = 2$, the potentials are
$\begin{pmatrix} \frac1{16} & \frac18 & \frac1{16} \\ \frac18 & \frac38 & \frac18 \\ \frac1{16} & \frac18 & \frac1{16} \end{pmatrix} = \begin{pmatrix} .0625 & .125 & .0625 \\ .125 & .375 & .125 \\ .0625 & .125 & .0625 \end{pmatrix}$.
The currents along the horizontal wires are
$\begin{pmatrix} -\frac1{16} & -\frac1{16} & \frac1{16} & \frac1{16} \\ -\frac18 & -\frac14 & \frac14 & \frac18 \\ -\frac1{16} & -\frac1{16} & \frac1{16} & \frac1{16} \end{pmatrix} = \begin{pmatrix} -.0625 & -.0625 & .0625 & .0625 \\ -.125 & -.25 & .25 & .125 \\ -.0625 & -.0625 & .0625 & .0625 \end{pmatrix}$,
where all wires are oriented from left to right, so the currents are all going away from the center. The currents in the vertical wires are given by the transpose of the matrix.

For $n = 3$, the potentials are
$\begin{pmatrix} .0288 & .0577 & .0769 & .0577 & .0288 \\ .0577 & .125 & .1923 & .125 & .0577 \\ .0769 & .1923 & .4423 & .1923 & .0769 \\ .0577 & .125 & .1923 & .125 & .0577 \\ .0288 & .0577 & .0769 & .0577 & .0288 \end{pmatrix}$.
The currents along the horizontal wires are
$\begin{pmatrix} -.0288 & -.0288 & -.0192 & .0192 & .0288 & .0288 \\ -.0577 & -.0673 & -.0673 & .0673 & .0673 & .0577 \\ -.0769 & -.1153 & -.25 & .25 & .1153 & .0769 \\ -.0577 & -.0673 & -.0673 & .0673 & .0673 & .0577 \\ -.0288 & -.0288 & -.0192 & .0192 & .0288 & .0288 \end{pmatrix}$,
where all wires are oriented from left to right, so the currents are all going away from the center. The currents in the vertical wires are given by the transpose of the matrix.

For $n = 4$, the potentials are
$\begin{pmatrix} .0165 & .0331 & .0478 & .0551 & .0478 & .0331 & .0165 \\ .0331 & .0680 & .1029 & .125 & .1029 & .0680 & .0331 \\ .0478 & .1029 & .1710 & .2390 & .1710 & .1029 & .0478 \\ .0551 & .125 & .2390 & .4890 & .2390 & .125 & .0551 \\ .0478 & .1029 & .1710 & .2390 & .1710 & .1029 & .0478 \\ .0331 & .0680 & .1029 & .125 & .1029 & .0680 & .0331 \\ .0165 & .0331 & .0478 & .0551 & .0478 & .0331 & .0165 \end{pmatrix}$.
The currents along the horizontal wires are
$\begin{pmatrix} -.0165 & -.0165 & -.0147 & -.0074 & .0074 & .0147 & .0165 & .0165 \\ -.0331 & -.0349 & -.0349 & -.0221 & .0221 & .0349 & .0349 & .0331 \\ -.0478 & -.0551 & -.0680 & -.0680 & .0680 & .0680 & .0551 & .0478 \\ -.0551 & -.0699 & -.1140 & -.25 & .25 & .1140 & .0699 & .0551 \\ -.0478 & -.0551 & -.0680 & -.0680 & .0680 & .0680 & .0551 & .0478 \\ -.0331 & -.0349 & -.0349 & -.0221 & .0221 & .0349 & .0349 & .0331 \\ -.0165 & -.0165 & -.0147 & -.0074 & .0074 & .0147 & .0165 & .0165 \end{pmatrix}$,
where all wires are oriented from left to right, so the currents are all going away from the center source. The currents in the vertical wires are given by the transpose of the matrix.

As $n \to \infty$ the potentials approach a limit, which is, in fact, the fundamental solution to the Dirichlet boundary value problem for Laplace's equation on the square, [47, 59]. The horizontal and vertical currents tend to the gradient of the fundamental solution. But this, of course, is the result of a more advanced analysis beyond the scope of this text. Here are graphs of the potentials and horizontal currents for $n = 2, 3, 4, 10$:

6.2.11. This is an immediate consequence of Theorem 5.59, which states that the minimum norm solution to $A^T y = f$ is characterized by the condition $y \in \operatorname{corng} A^T = \operatorname{rng} A$. But solving the system $A^T A u = f$ results in $y = A u \in \operatorname{rng} A$.

6.2.12. (a) (i) $u = ( 2, 1, 1, 0 )^T$, $y = ( 1, 0, 1 )^T$; (ii) $u = ( 3, 2, 1, 1, 0 )^T$, $y = ( 1, 1, 0, 1 )^T$; (iii) $u = ( 3, 2, 1, 1, 1, 0 )^T$, $y = ( 1, 1, 0, 0, 1 )^T$; (iv) $u = ( 3, 2, 2, 1, 1, 0 )^T$, $y = ( 1, 0, 1, 0, 1 )^T$; (v) $u = ( 3, 2, 2, 1, 1, 1, 0 )^T$, $y = ( 1, 0, 1, 0, 0, 1 )^T$. (b) In general, the current only goes through the wires directly connecting the top and bottom nodes. The potential at a node is equal to the number of wires transmitting the current that are between it and the grounded node.

6.2.13.
(i) $u = \left( \frac32, \frac12, 0, 0 \right)^T$, $y = \left( 1, \frac12, \frac12 \right)^T$;
(ii) $u = \left( \frac52, \frac32, \frac12, 0, 0 \right)^T$, $y = \left( 1, 1, \frac12, \frac12 \right)^T$;
(iii) $u = \left( \frac73, \frac43, \frac13, 0, 0, 0 \right)^T$, $y = \left( 1, 1, \frac13, \frac13, \frac13 \right)^T$;
(iv) $u = \left( \frac85, \frac35, 0, \frac15, 0, 0 \right)^T$, $y = \left( 1, \frac35, \frac25, \frac15, \frac15 \right)^T$;
(v) $u = \left( \frac{11}7, \frac47, 0, \frac17, 0, 0 \right)^T$, $y = \left( 1, \frac47, \frac37, \frac17, \frac17, \frac17 \right)^T$.

6.2.14. According to Exercise 2.6.8(b), a tree with $n$ nodes has $n - 1$ edges. Thus, the reduced incidence matrix $A^\star$ is square, of size $(n - 1) \times (n - 1)$, and is nonsingular since the tree is connected.

6.2.15. (a) True, since they satisfy the same systems of equilibrium equations $K u = -A^T C b = f$. (b) False, because the currents with the batteries are, by (6.37), $y = C v = C A u + C b$, while for the current sources they are $y = C v = C A u$.

6.2.16. In general, if $v_1, \ldots, v_m$ are the rows of the (reduced) incidence matrix $A$, then the resistivity matrix is $K = A^T C A = \sum_{i=1}^{m} c_i\, v_i^T v_i$. (This relies on the fact that $C$ is a diagonal matrix.) In the situation described in the problem, two rows of the incidence matrix are the same, $v_1 = v_2 = v$, and so their contribution to the sum will be $c_1 v^T v + c_2 v^T v = (c_1 + c_2)\, v^T v = c\, v^T v$, which is the same contribution as a single wire between the two vertices with conductance $c = c_1 + c_2$. The combined resistance is
$R = \frac1c = \frac1{c_1 + c_2} = \frac1{1/R_1 + 1/R_2} = \frac{R_1 R_2}{R_1 + R_2}$.
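The parallel-resistance formula of Exercise 6.2.16 is easy to check numerically. The function name `parallel_resistance` below is illustrative only; the sketch simply adds the two conductances and inverts the result.

```python
from fractions import Fraction as F

def parallel_resistance(R1, R2):
    """Resistance of two wires joining the same pair of nodes."""
    c = 1 / F(R1) + 1 / F(R2)   # conductances add: c = c1 + c2
    return 1 / c                # combined resistance R = 1 / c

R = parallel_resistance(2, 3)
```

With resistances 2 and 3 the result is 6/5, which agrees with R1 R2 / (R1 + R2).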

6.2.17. (a) If $f$ are the current sources at the nodes and $b$ the battery terms, then the nodal voltage potentials satisfy $A^T C A u = f - A^T C b$. (b) By linearity, the combined potentials (currents) are obtained by adding the potentials (currents) due to the batteries and those resulting from the current sources.

♦ 6.2.18. The resistivity matrix $K^\star$ is symmetric, and so is its inverse. The $(i, j)$ entry of $(K^\star)^{-1}$ is the $i$th entry of $u_j = (K^\star)^{-1} e_j$, which is the potential at the $i$th node due to a unit current source at the $j$th node. By symmetry, this equals the $(j, i)$ entry, which is the potential at the $j$th node due to a unit current source at the $i$th node.

6.2.19. If the graph has $k$ connected subgraphs, then there are $k$ independent compatibility conditions on the unreduced equilibrium equations $K u = f$. The conditions are that the sum of the current sources at the nodes on every connected subgraph must be equal to zero.

6.3.1. 8 cm

6.3.2. The bar will be stress-free provided the vertical force is 1.5 times the horizontal force.

6.3.3. (a) For a unit horizontal force on the two nodes, the displacement vector is $u = ( 1.5, -.5, 2.5, 2.5 )^T$, so the left node has moved slightly down and three times as far to the right, while the right node has moved five times as far up and to the right. Note that the force on the left node is transmitted through the top bar to the right node, which explains why it moves significantly further. The stresses are $e = ( .7071, 1, 0, -1.5811 )^T$, so the left and the top bar are elongated, the right bar is stress-free, and the reinforcing bar is significantly compressed. (b) For a unit horizontal force on the two nodes, $u = ( .75, -.25, .75, .25 )^T$, so the left node has moved slightly down and three times as far to the right, while the right node has moved by the same amount up and to the right. The stresses are $e = ( .353553, 0, -.353553, -.790569, .79056 )^T$, so the diagonal bars fixed at node 1 are elongated, the horizontal bar is stress-free, while the bars fixed at node 4 are both compressed. The reinforcing bars experience a little over twice the stress of the other two diagonal bars.

6.3.4. The swing set cannot support a uniform horizontal force, since $f_1 = f_2 = f$, $g_1 = g_2 = h_1 = h_2 = 0$ does not satisfy the constraint for equilibrium. Thus, the swing set will collapse. For the reinforced version, under a horizontal force of magnitude $f$, the displacements of the two free nodes are $( 14.5 f, 0, -3 f )^T$ and $( 14.5 f, 0, 3 f )^T$ respectively, so the first node has moved down and in the direction of the force, while the second node has moved up and in the same horizontal direction. The corresponding elongation vector is $e = ( 1.6583 f, 1.6583 f, 0, -1.6583 f, -1.6583 f, -3 f, -3 f )^T$, and so the horizontal bar experiences no elongation; the diagonal bars connecting the first node are stretched by an amount $1.6583 f$, the diagonals connecting the second node are compressed by the same amount, while the reinforcing vertical bars are, respectively, compressed and stretched by an amount $3 f$.

♥ 6.3.5. (a) $A = \begin{pmatrix} 0 & 1 & 0 & 0 \\ -1 & 0 & 1 & 0 \\ -\frac1{\sqrt2} & \frac1{\sqrt2} & 0 & 0 \\ 0 & 0 & \frac1{\sqrt2} & \frac1{\sqrt2} \\ 0 & 0 & 0 & 1 \end{pmatrix}$;
(b) $\frac32 u_1 - \frac12 v_1 - u_2 = 0$,
$-\frac12 u_1 + \frac32 v_1 = 0$,
$-u_1 + \frac32 u_2 + \frac12 v_2 = 0$,
$\frac12 u_2 + \frac32 v_2 = 0$.
(c) Stable, statically indeterminate.
(d) Write down $f = K e_1$, so $f_1 = \begin{pmatrix} \frac32 \\ -\frac12 \end{pmatrix}$, $f_2 = \begin{pmatrix} -1 \\ 0 \end{pmatrix}$. The horizontal bar; it is compressed by $-1$; the upper left to lower right bar is compressed $-\frac1{\sqrt2}$, while all other bars are stress free.
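For Exercise 6.3.5, the relation between the incidence matrix, the equilibrium equations, and the forces in part (d) can be checked numerically. The sketch below assumes unit bar stiffnesses, so that K = A^T A, and uses a 5 × 4 incidence matrix whose row order (left vertical bar, horizontal bar, left diagonal, right diagonal, right vertical bar) is an assumption chosen to reproduce the equations in part (b).

```python
import math

s = 1 / math.sqrt(2)

# Assumed incidence matrix; columns are (u1, v1, u2, v2).
A = [[0, 1, 0, 0],
     [-1, 0, 1, 0],
     [-s, s, 0, 0],
     [0, 0, s, s],
     [0, 0, 0, 1]]

# Stiffness matrix K = A^T A (unit stiffnesses assumed).
K = [[sum(r[i] * r[j] for r in A) for j in range(4)] for i in range(4)]

# Forces needed for the displacement e1 = (1, 0, 0, 0): f = K e1.
f = [row[0] for row in K]

# Bar elongations for that displacement: e = A e1.
e = [r[0] for r in A]
```

The rows of `K` give the coefficients of the four equilibrium equations, `f` splits into f1 = (3/2, −1/2) and f2 = (−1, 0), and `e` shows that only the horizontal bar and one diagonal change length.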

♥ 6.3.6. Under a uniform horizontal force, the displacements and stresses are:
Non-joined version: $u = ( 3, 1, 3, -1 )^T$, $e = \left( 1, 0, -1, \sqrt2, -\sqrt2 \right)^T$;
Joined version: $u = ( 5, 1, 5, -1, 2, 0 )^T$, $e = \left( 1, 0, -1, \sqrt2, -\sqrt2, \sqrt2, -\sqrt2 \right)^T$.
Thus, joining the nodes causes a larger horizontal displacement of the upper two nodes, but no change in the overall stresses on the bars.
Under a uniform vertical force, the displacements and elongations are:
Non-joined version: $u = \left( \frac17, \frac57, -\frac17, \frac57 \right)^T = ( .1429, .7143, -.1429, .7143 )^T$, $e = \left( \frac57, -\frac27, \frac57, \frac{2\sqrt2}7, \frac{2\sqrt2}7 \right)^T = ( .7143, -.2857, .7143, .4041, .4041 )^T$;
Joined version: $u = ( .0909, .8182, -.0909, .8182, 0, .3636 )^T$, $e = ( .8182, -.1818, .8182, .2571, .2571, .2571, .2571 )^T$.
Thus, joining the nodes causes a larger vertical displacement, but smaller horizontal displacement of the upper two nodes. The stresses on the vertical bars increase, while the horizontal bar and the diagonal bars have less stress (in magnitude).

6.3.7. (a) $A = \begin{pmatrix} 0 & 1 & 0 & 0 & 0 & 0 \\ -\frac1{\sqrt2} & -\frac1{\sqrt2} & \frac1{\sqrt2} & \frac1{\sqrt2} & 0 & 0 \\ 0 & 0 & -\frac1{\sqrt2} & \frac1{\sqrt2} & \frac1{\sqrt2} & -\frac1{\sqrt2} \\ 0 & 0 & 0 & 0 & 0 & 1 \end{pmatrix}$.
(b) There are two mechanisms: $u_1 = ( 1, 0, 1, 0, 1, 0 )^T$, where all three nodes move by the same amount to the right, and $u_2 = ( 2, 0, 1, 1, 0, 0 )^T$, where the upper left node moves to the right while the top node moves up and to the right.
(c) $f_1 + f_2 + f_3 = 0$, i.e., no net horizontal force, and $2 f_1 + f_2 + g_2 = 0$.
(d) You need to add two additional reinforcing bars; any pair, e.g. connecting the fixed nodes to the top node, will stabilize the structure.

♥ 6.3.8. (a) $A = \begin{pmatrix} 0 & 1 & 0 & 0 & 0 & 0 \\ -\frac3{\sqrt{10}} & -\frac1{\sqrt{10}} & \frac3{\sqrt{10}} & \frac1{\sqrt{10}} & 0 & 0 \\ -1 & 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & -\frac3{\sqrt{10}} & \frac1{\sqrt{10}} & \frac3{\sqrt{10}} & -\frac1{\sqrt{10}} \\ 0 & 0 & 0 & 0 & 0 & 1 \end{pmatrix} = \begin{pmatrix} 0 & 1 & 0 & 0 & 0 & 0 \\ -.9487 & -.3162 & .9487 & .3162 & 0 & 0 \\ -1 & 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & -.9487 & .3162 & .9487 & -.3162 \\ 0 & 0 & 0 & 0 & 0 & 1 \end{pmatrix}$;
(b) One instability: the mechanism of simultaneous horizontal motion of the three nodes.
(c) No net horizontal force: $f_1 + f_2 + f_3 = 0$. For example, if $f_1 = f_2 = f_3 = ( 0, 1 )^T$, then $e = \left( \frac32, \sqrt{\frac52}, -\frac32, \sqrt{\frac52}, \frac32 \right)^T = ( 1.5, 1.5811, -1.5, 1.5811, 1.5 )^T$, so the compressed diagonal bars have slightly more stress than the compressed vertical bars or the elongated horizontal bar.
(d) To stabilize, add in one more bar starting at one of the fixed nodes and going to one of the two movable nodes not already connected to it.
(e) In every case, $e = \left( \frac32, \sqrt{\frac52}, -\frac32, \sqrt{\frac52}, \frac32, 0 \right)^T = ( 1.5, 1.5811, -1.5, 1.5811, 1.5, 0 )^T$, so the stresses on the previous bars are all the same, while the reinforcing bar experiences no stress. (See Exercise 6.3.21 for the general principle.)

♣ 6.3.9. Two-dimensional house:

(a) $A = \begin{pmatrix} 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & -1 & 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & -\frac2{\sqrt5} & -\frac1{\sqrt5} & \frac2{\sqrt5} & \frac1{\sqrt5} & 0 & 0 & 0 & 0 \\ -1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & -\frac2{\sqrt5} & \frac1{\sqrt5} & \frac2{\sqrt5} & -\frac1{\sqrt5} & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 & -1 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 \end{pmatrix}$;
(b) 3 mechanisms: (i) simultaneous horizontal motion of the two middle nodes; (ii) simultaneous horizontal motion of the three upper nodes; (iii) the upper left node moves horizontally to the right by 1 unit, the top node moves vertically by 1 unit and to the right by $\frac12$ a unit.
(c) Numbering the five movable nodes in order starting at the middle left, the corresponding forces $\mathbf{f}_1 = ( f_1, g_1 )^T, \ldots, \mathbf{f}_5 = ( f_5, g_5 )^T$ must satisfy: $f_1 + f_5 = 0$, $f_2 + f_3 + f_4 = 0$, $f_2 + \frac12 f_3 + g_3 = 0$. When $\mathbf{f}_1 = \mathbf{f}_2 = \mathbf{f}_4 = \mathbf{f}_5 = ( 0, 1 )^T$, $\mathbf{f}_3 = 0$, then $e = ( 2, 1, 0, 0, 0, 1, 2 )^T$, so the lower vertical bars are compressed twice as much as the upper vertical bars, while the horizontal and diagonal bars experience no elongation or stress. When $\mathbf{f}_1 = \mathbf{f}_5 = 0$, $-\mathbf{f}_2 = \mathbf{f}_4 = ( 1, 0 )^T$, $\mathbf{f}_3 = ( 0, 1 )^T$, then $e = \left( \frac12, \frac12, \frac{\sqrt5}2, 0, \frac{\sqrt5}2, \frac12, \frac12 \right)^T$, so all the vertical bars are compressed by .5, the diagonal bars slightly more than twice as compressed, while the horizontal bar has no stress.
(d) To stabilize the structure, you need to add in at least three more bars.
(e) Suppose we add in an upper horizontal bar and two diagonal bars going from lower left to upper right. For the first set of forces, $e = ( 2, 1, 0, 0, 0, 1, 2, 0, 0, 0 )^T$; for the second set of forces, $e = \left( \frac12, \frac12, \frac{\sqrt5}2, 0, \frac{\sqrt5}2, \frac12, \frac12, 0, 0, 0 \right)^T$. In both cases, the stresses remain the same, and the reinforcing bars experience no stress.

Three-dimensional house: (a)

A=

0

B 0 B B B 0 B B B 0 B B B 0 B B −1 B B 0 B B B 0 B B 0 B B 0 B B B 0 B @ 0

0

0 − √1 2 −1 0 0 0 0 0 0 0 0 0 0

1 − √1 2 0 0 0 0 0 0 0 0 0 0 0

0 0 0 0 0 0 −1 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0

0

0

√1 2

0 − √1 2 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

− √1 2 −1 0 0

0 0 0 0 0 0 0 −1 0 0 0 0 0

√1 2

0

√1 2

0 0 0 0 0 0 0 0 0

0 0 0 0 0 0 0 0 1

− √1 2 0 0 0

0 0 0 0 0 0 1 0 0 0 0 0 0

0 0 1

√1 2

0 0 0 0 0 0 0 0 0

0 0 0 0 0 0 0 0 0

√1 2

0 − √1 2 0

0 0 0

− √1 2 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

√1 2

0

√1 2

0

0 0 0 0 0 0 0 1 0 0 0 0 0

0 0 0 0 0 0 0 0 0 0 1

√1 2

0

0 0 0 0 0 0 0 0 0 0 0 −

1

C C C C C C C C C C C C C C; C C C C C C C C C C 1 √ C A 2

1

(b) 5 mechanisms: horizontal motion of (i) the two topmost nodes in the direction of the bar connecting them; (ii) the two right side nodes in the direction of the bar connecting them; (iii) the two left side nodes in the direction of the bar connecting them; (iv) the three front nodes in the direction of the bar connecting the lower pair; (v) the three back nodes in the direction of the bar connecting the lower pair.
(c) Equilibrium requires no net force on each unstable pair of nodes in the direction of the instability.
(d) For example, when the two topmost nodes are subject to a unit downwards vertical force, the vertical bars have elongation/stress $-\frac12$, the diagonal bars have $\frac1{\sqrt2} = .7071$, while the front and back horizontal bars have $\frac12$. The longer horizontal bars have no stress.
(e) To stabilize, you need to add in at least five more bars, e.g., two diagonal bars across the front and back walls and a bar from a fixed node to the opposite topmost node. In all cases, if a minimal number of reinforcing bars are added, the stresses remain the same on the old bars, while the reinforcing bars experience no stress. See Exercise 6.3.21 for the general result.

♥ 6.3.10. (a) Letting $w_i$ denote the vertical displacement and $h_i$ the vertical component of the force on the $i$th mass, $2 w_1 - w_2 = h_1$, $-w_1 + 2 w_2 - w_3 = h_2$, $-w_2 + w_3 = h_3$. The system is statically determinate and stable. (b) Same equilibrium equations, but now the horizontal displacements $u_1, u_2, u_3$ are arbitrary, and so the structure is unstable: there are three independent mechanisms corresponding to horizontal motions of each individual mass. To maintain equilibrium, the horizontal force components must vanish: $f_1 = f_2 = f_3 = 0$. (c) Same equilibrium equations, but now the two sets of horizontal displacements $u_1, u_2, u_3, v_1, v_2, v_3$ are arbitrary, and so the structure is unstable: there are six independent mechanisms corresponding to the two independent horizontal motions of each individual mass. To maintain equilibrium, the horizontal force components must vanish: $f_1 = f_2 = f_3 = g_1 = g_2 = g_3 = 0$.

♥ 6.3.11. (a) The incidence matrix is
$A = \begin{pmatrix} 1 & & & & -1 \\ -1 & 1 & & & \\ & -1 & 1 & & \\ & & \ddots & \ddots & \\ & & & -1 & 1 \end{pmatrix}$
of size $n \times n$. The stiffness matrix is
$K = A^T C A = \begin{pmatrix} c_1 + c_2 & -c_2 & & & & & -c_1 \\ -c_2 & c_2 + c_3 & -c_3 & & & & \\ & -c_3 & c_3 + c_4 & -c_4 & & & \\ & & -c_4 & c_4 + c_5 & -c_5 & & \\ & & & \ddots & \ddots & \ddots & \\ & & & & -c_{n-1} & c_{n-1} + c_n & -c_n \\ -c_1 & & & & & -c_n & c_n + c_1 \end{pmatrix}$,
and the equilibrium system is $K u = f$.
(b) Observe that $\ker K = \ker A$ is one-dimensional with basis vector $z = ( 1, 1, \ldots, 1 )^T$. Thus, the stiffness matrix is singular, and the system is not stable. To maintain equilibrium, the force $f$ must be orthogonal to $z$, and so $f_1 + \cdots + f_n = 0$, i.e., the net force on the ring is zero.
(c) For instance, if $c_1 = c_2 = c_3 = c_4 = 1$, and $f = ( 1, -1, 0, 0 )^T$, then the solution is $u = \left( \frac14, -\frac12, -\frac14, 0 \right)^T + t\, ( 1, 1, 1, 1 )^T$ for any $t$. Nonuniqueness is telling us that the masses can all be moved by the same amount, i.e., the entire ring is rotated, without affecting the force balance at equilibrium.
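The ring example in Exercise 6.3.11(c) can be checked directly. The sketch below assembles the cyclic stiffness matrix spring by spring, assuming (as the corner entries −c_1 of K suggest) that spring i joins node i − 1 to node i with the first spring closing the ring; the function names are illustrative.

```python
from fractions import Fraction as F

def ring_stiffness(c):
    """Stiffness matrix of a ring of n masses joined by n springs."""
    n = len(c)
    K = [[F(0)] * n for _ in range(n)]
    for i in range(n):
        j = (i - 1) % n              # spring i joins node i-1 and node i
        K[i][i] += c[i]
        K[j][j] += c[i]
        K[i][j] -= c[i]
        K[j][i] -= c[i]
    return K

def matvec(K, u):
    """Matrix-vector product K u."""
    return [sum(k * x for k, x in zip(row, u)) for row in K]

K = ring_stiffness([1, 1, 1, 1])
f = [1, -1, 0, 0]
u0 = [F(1, 4), F(-1, 2), F(-1, 4), F(0)]

def shifted(t):
    """Particular solution plus t times the rigid motion (1, 1, 1, 1)."""
    return [x + t for x in u0]
```

Every vector `shifted(t)` satisfies K u = f, which is the nonuniqueness described in part (c), and the forces sum to zero as part (b) requires.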

♣ 6.3.12.
(a) $A = \begin{pmatrix} -1 & 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 \\ 0 & 0 & -1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & \frac1{\sqrt2} & -\frac1{\sqrt2} & 0 & -\frac1{\sqrt2} & \frac1{\sqrt2} & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & \frac1{\sqrt2} & 0 & -\frac1{\sqrt2} & 0 & 0 & 0 & -\frac1{\sqrt2} & 0 & \frac1{\sqrt2} \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & \frac1{\sqrt2} & -\frac1{\sqrt2} & 0 & -\frac1{\sqrt2} & \frac1{\sqrt2} \end{pmatrix}$;
(b) $v_1 = ( 1, 0, 0, 1, 0, 0, 1, 0, 0, 1, 0, 0 )^T$, $v_2 = ( 0, 1, 0, 0, 1, 0, 0, 1, 0, 0, 1, 0 )^T$, $v_3 = ( 0, 0, 1, 0, 0, 1, 0, 0, 1, 0, 0, 1 )^T$, $v_4 = ( 0, 0, 0, 0, 0, 0, 0, 0, -1, 0, 1, 0 )^T$, $v_5 = ( 0, 0, 0, 0, 0, -1, 0, 0, 0, 1, 0, 0 )^T$, $v_6 = ( 0, 0, 0, 0, -1, 0, 1, 0, 0, 0, 0, 0 )^T$;
(c) $v_1, v_2, v_3$ correspond to translations in, respectively, the $x, y, z$ directions;
(d) $v_4, v_5, v_6$ correspond to rotations around, respectively, the $x, y, z$ coordinate axes;
(e) $K = \begin{pmatrix} 1 & 0 & 0 & -1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 & 0 & 0 & 0 & -1 & 0 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & -1 \\ -1 & 0 & 0 & 2 & -\frac12 & -\frac12 & -\frac12 & \frac12 & 0 & -\frac12 & 0 & \frac12 \\ 0 & 0 & 0 & -\frac12 & \frac12 & 0 & \frac12 & -\frac12 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & -\frac12 & 0 & \frac12 & 0 & 0 & 0 & \frac12 & 0 & -\frac12 \\ 0 & 0 & 0 & -\frac12 & \frac12 & 0 & \frac12 & -\frac12 & 0 & 0 & 0 & 0 \\ 0 & -1 & 0 & \frac12 & -\frac12 & 0 & -\frac12 & 2 & -\frac12 & 0 & -\frac12 & \frac12 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & -\frac12 & \frac12 & 0 & \frac12 & -\frac12 \\ 0 & 0 & 0 & -\frac12 & 0 & \frac12 & 0 & 0 & 0 & \frac12 & 0 & -\frac12 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & -\frac12 & \frac12 & 0 & \frac12 & -\frac12 \\ 0 & 0 & -1 & \frac12 & 0 & -\frac12 & 0 & \frac12 & -\frac12 & -\frac12 & -\frac12 & 2 \end{pmatrix}$;

(f ) For fi = ( fi , gi , hi )T we require f1 + f2 + f3 + f4 = 0, g1 + g2 + g3 + g4 = 0, h1 + h2 + h3 + h4 = 0, h3 = g4 , h2 = f4 , g2 = f3 , i.e., there is no net horizontal force and no net moment of force around any axis. (g) You need to fix three nodes. Fixing two still leaves a rotation motion around the line connecting them. (h) Displacement of the top node: u4 = ( −1, −1, −1 )T ; since e = ( −1, 0, 0, 0 )T , only the vertical bar experiences compression of magnitude 1. ♣ 6.3.13.

(a)

Placing the vertices at

0

1 √1 B 3C B C B 0 C, @ A

0

A=

0 3 √ B 2 B 3 B√ B 2 B B 1 B√ B 3 B B B 0 B B B 0 B @

0

0 B B B @

−

1 √ 2 3 1 2

0 1 2

1

C C C, A

0 B B B @

1 √ 2 3 − 21

−

0

− 21 1 2

0

− √3 2

0

0

0

0

0

− √2

− √3

0

0

0

0 0

0 0

0

1

1 − √ 2 3

1 2

0√

− √2 3

0

0

0

0

0

√

3

0

0

C C C, A

0

B B B @

0 0 √

√2 3

1

C C C, A

we obtain

0

0

0

0

− 12

0

0

0

0

0

0

0

− √1

0

0 0

−1 0

0 0

0

1 √ 2 3

− 12

− √2

0 − 21

2

−

0

1

3

√

3

1 √ 2 3 1 √ 2 3 1

1 2

√ √2 3

0

√ √2 √3 √2 3

1

C C C C C C C C C; C C C C C C A

√ 1 √ 1 0 0 0 1 1 0 1 0 0 1 0 0 − √2 −√ 2 B B −2 C C B0C B0C B1C B 6 C B − 6 C B C C B C B C B C B B B C C C B C B C B C B B −1 C B √ C B0C B1C B0C B −1 C √ B B 0 C C C B C B C B C B B B B1C B0C B0C B −2 2 C 3 C 0 C B B C C C B C B C B C B B C B C C B C B C B C B C B C B 1 C B0C B1C B0C B 0 0 C C C B B B C B C B C B C C C B B B0C B0C B1C B 0 0 1 C, v = B √ C, v = B C; C, v = B C, v = B C, v = B √ (b) v1 = B C C B B− 3C B0C B1C B0C B 5 6 3 2 4 0 C C B −2 2 C B B C B C B C B C C C B B B C B C B C B B B 1 C B1C B0C B0C B 0 C 0 C C C C B B B C B C B C B B B 0 C B0C B0C B1C B 0 C 1 C C C C B B B C B C B C B C C C B B B C B C B C B C C B B 0 C B1C B0C B0C B 0 0 C C C B B B C B C B C B @ @ 0 A @0A @1A @0A @ 0 A 0 A 0 0 1 0 0 0 0

(c) v1 , v2 , v3 correspond to translations in, respectively, the x, y, z directions;

(d) v4 , v5 , v6 correspond to rotations around the top node;


(e) K = 0

11 6

B B B 0 B √ B B 2 B− 3 B B B −3 B 4 B √ B 3 B B 4 B B B 0 B B 3 B B −4 B √ B 3 B B− 4 B B B 0 B B B B −1 B 3 B B B 0 B √ @ 2 3

0

−

1 2

√ 2 3

0

− 43 √

3 4

√ 3 4 − 14

0

2 3

0

3 4 − 41

0

5 6 − √1 3 1 √ 3 2

3 2 − √1 6

√

0

0

0

− √1

3

−1

0

0

0

5 6 1 √ 3 1 √ 3 2 1 − 12 1 − √ 4 3 1 − √ 3 2

√1 3 3 2 √1 6 1 − √ 4 3 − 41 − √1 6

1 √ 3 2 √1 6 2 3 1 − √ 3 2 − √1 6 2 −3

0

0

0

0

0

0

−1

0

0

0

− 23

1 √ 4 3 1 − √ 3 2

0

1 √ 4 3 − 14 √1 6

0

0

0

1 − 12

0

0

0

0

0

0

0

0

√ 2 3

3 4

0

0

3 4 − 14

0

√

√

3 4 − 14

−

1 √ 3 2 − √1 6 2 3

0

−

−

0

0

√

− 34

0

0 1 √ 3 2 √1 6 − 32

−

(f ) For fi = ( fi , gi , hi )T we require

0

− 13 0

√ 2 3 1 − 12 1 √ 4 3 1 − √ 3 2 1 − 12 1 − √ 4 3 1 − √ 3 2 1 2

0

√

2 3

0

0

0

− 32

1 √ 4 3 − 41 √1 6 1 − √ 4 3 − 41 − √1 6

1 √ 3 2 √1 6 − 32 1 − √ 3 2 − √1 6 − 32

0

0

0

1 2

0

0

0

2

−

1

C C C C C C C C C C C C C C C C C C C C. C C C C C C C C C C C C C C C C C C A

$f_1 + f_2 + f_3 + f_4 = 0$, $g_1 + g_2 + g_3 + g_4 = 0$, $h_1 + h_2 + h_3 + h_4 = 0$,
$-\sqrt2\, f_1 + \sqrt6\, g_1 - h_1 - 2 \sqrt2\, f_2 + h_2 = 0$,
$-2\, g_1 + \sqrt3\, f_2 + g_2 - \sqrt3\, f_3 + g_3 = 0$,
$-\sqrt2\, f_1 - \sqrt6\, g_1 - h_1 - 2 \sqrt2\, f_3 + h_3 = 0$,
i.e., there is no net horizontal force and no net moment of force around any axis.
(g) You need to fix three nodes. Fixing only two nodes still permits a rotational motion around the line connecting them.
(h) Displacement of the top node: $u = \left( 0, 0, -\frac12 \right)^T$; all the bars connecting the top node experience compression of magnitude $\frac1{\sqrt6}$.

6.3.14. True, since stability only depends on whether the reduced incidence matrix has trivial kernel or not, which depends only on the geometry, not on the bar stiffnesses.

6.3.15. (a) True. Since $K u = f$, if $f \ne 0$ then $u \ne 0$ also. (b) False if the structure is unstable, since any $u \in \ker A$ yields a zero elongation vector $y = A u = 0$.

6.3.16. (a) $3 n$. (b) Example: a triangle each of whose nodes is connected to the ground by two additional, non-parallel bars.

♦ 6.3.17. As in Exercise 6.1.6, this follows from the symmetry of the stiffness matrix $K$, which implies that $K^{-1}$ is a symmetric matrix. Let $f_i = ( 0, \ldots, 0, n, 0, \ldots, 0 )^T$ denote the force vector corresponding to a unit force at node $i$ applied in the direction of the unit vector $n$. The resulting displacement is $u_i = K^{-1} f_i$, and we are interested in the displacement of node $j$ in the direction $n$, which equals the dot product $f_j \cdot u_i = f_j^T K^{-1} f_i = (K^{-1} f_j)^T f_i = f_i \cdot u_j$, proving equality.

6.3.18. False in general. If the nodes are collinear, a rotation around the line through the nodes will define a rigid motion. If the nodes are not collinear, then the statement is true.

6.3.19. This holds since $y = e = A u \in \operatorname{rng} A = \operatorname{corng} A^T$, which, according to Theorem 5.59, is the condition for the solution of minimal norm to the adjoint equation.

♦ 6.3.20. (a) We are assuming that $f \in \operatorname{rng} K = \operatorname{corng} A = \operatorname{rng} A^T$, cf. Exercise 3.4.31. Thus, we can write $f = A^T h = A^T C g$ where $g = C^{-1} h$. (b) The equilibrium equations $K u = f$ are $A^T C A u = A^T C g$, which are the normal equations (4.57) for the weighted least squares solution to $A u = g$.

♥ 6.3.21. Let $A$ be the reduced incidence matrix of the structure, so the equilibrium equations are $A^T C A u = f$, where we are assuming $f \in \operatorname{rng} K = \operatorname{rng} A^T C A$. We use Exercise 6.3.20 to write $f = A^T C g$ and characterize $u$ as the weighted least squares solution to $A u = g$, i.e., the vector that minimizes the weighted norm $\| A u - g \|^2$.
Now, the reduced incidence matrix for the reinforced structure is $\widetilde A = \begin{pmatrix} A \\ B \end{pmatrix}$, where the rows of $B$ represent the reinforcing bars. The structure will be stable if and only if $\operatorname{corng} \widetilde A = \operatorname{corng} A + \operatorname{corng} B = \mathbb{R}^n$, and the number of bars is minimal if and only if $\operatorname{corng} A \cap \operatorname{corng} B = \{0\}$. Thus, $\operatorname{corng} A$ and $\operatorname{corng} B$ are complementary subspaces of $\mathbb{R}^n$, which implies, as in Exercise 5.6.12, that their orthogonal complements $\ker A$ and $\ker B$ are also complementary subspaces, so $\ker A + \ker B = \mathbb{R}^n$, $\ker A \cap \ker B = \{0\}$.
The reinforced equilibrium equations for the new displacement $v$ are $\widetilde A^T \widetilde C \widetilde A v = A^T C A v + B^T D B v = f$, where $D$ is the diagonal matrix whose entries are the stiffnesses of the reinforcing bars, while $\widetilde C = \begin{pmatrix} C & O \\ O & D \end{pmatrix}$. Since $f = A^T C g = A^T C g + B^T D\, 0 = \widetilde A^T \widetilde C \begin{pmatrix} g \\ 0 \end{pmatrix}$, again using Exercise 6.3.20, the reinforced displacement $v$ is the least squares solution to the combined system $A v = g$, $B v = 0$, i.e., the vector that minimizes the combined weighted norm $\| A v - g \|^2 + \| B v \|^2$.
Now, since we are using the minimal number of bars, we can uniquely decompose $v = z + w$ where $z \in \ker A$ and $w \in \ker B$, and we find $\| A v - g \|^2 + \| B v \|^2 = \| A w - g \|^2 + \| B z \|^2$. Clearly this will be minimized if and only if $B z = 0$ and $w$ minimizes $\| A w - g \|^2$. Therefore, $w = u \in \ker B$, and so the entries of $B u = 0$ are the elongations of the reinforcing bars.

♥ 6.3.22. (a) $A^\star = \begin{pmatrix} \frac1{\sqrt2} & \frac1{\sqrt2} & 0 & 0 & 0 \\ -1 & 0 & 1 & 0 & 0 \\ 0 & 0 & -\frac1{\sqrt2} & \frac1{\sqrt2} & \frac1{\sqrt2} \end{pmatrix}$; $K^\star u = f^\star$ where $K^\star = \begin{pmatrix} \frac32 & \frac12 & -1 & 0 & 0 \\ \frac12 & \frac12 & 0 & 0 & 0 \\ -1 & 0 & \frac32 & -\frac12 & -\frac12 \\ 0 & 0 & -\frac12 & \frac12 & \frac12 \\ 0 & 0 & -\frac12 & \frac12 & \frac12 \end{pmatrix}$.
(b) Unstable: there are two mechanisms prescribed by the kernel basis elements $( 1, -1, 1, 1, 0 )^T$, which represents the same mechanism as when the end is fixed, and $( 1, -1, 1, 0, 1 )^T$, in which the roller and the right hand node move horizontally to the right, while the left node moves down and to the right.

♥ 6.3.23. (a) $A^\star = \begin{pmatrix} \frac1{\sqrt2} & \frac1{\sqrt2} & 0 & 0 & 0 \\ -1 & 0 & 1 & 0 & 0 \\ 0 & 0 & -\frac1{\sqrt2} & \frac1{\sqrt2} & -\frac1{\sqrt2} \end{pmatrix}$; $K^\star u = f^\star$ where $K^\star = \begin{pmatrix} \frac32 & \frac12 & -1 & 0 & 0 \\ \frac12 & \frac12 & 0 & 0 & 0 \\ -1 & 0 & \frac32 & -\frac12 & \frac12 \\ 0 & 0 & -\frac12 & \frac12 & -\frac12 \\ 0 & 0 & \frac12 & -\frac12 & \frac12 \end{pmatrix}$.
(b) Unstable: there are two mechanisms prescribed by the kernel basis elements $( 1, -1, 1, 1, 0 )^T$, which represents the same mechanism as when the end is fixed, and $( -1, 1, -1, 0, 1 )^T$, in which the roller moves up, the right hand node moves horizontally to the left, while the left node moves up and to the left.

♥ 6.3.24.

(a) Horizontal roller: $A^\star = \begin{pmatrix} .7071 & .7071 & 0 & 0 & 0 \\ -1 & 0 & 1 & 0 & 0 \\ 0 & 0 & -.7071 & .7071 & .7071 \\ -.9487 & .3162 & 0 & 0 & .9487 \end{pmatrix}$; $K^\star u = f^\star$ where $K^\star = \begin{pmatrix} 2.4 & .2 & -1 & 0 & -.9 \\ .2 & .6 & 0 & 0 & .3 \\ -1 & 0 & 1.5 & -.5 & -.5 \\ 0 & 0 & -.5 & .5 & .5 \\ -.9 & .3 & -.5 & .5 & 1.4 \end{pmatrix}$; unstable: there is one mechanism prescribed by the kernel basis element $( .75, -.75, .75, -.25, 1 )^T$, in which the roller moves horizontally to the right, the right hand node moves right and slightly down, while the left node moves right and down.

(b) Vertical roller: $A^\star = \begin{pmatrix} .7071 & .7071 & 0 & 0 & 0 \\ -1 & 0 & 1 & 0 & 0 \\ 0 & 0 & -.7071 & .7071 & -.7071 \\ -.9487 & .3162 & 0 & 0 & -.3162 \end{pmatrix}$; $K^\star u = f^\star$ where $K^\star = \begin{pmatrix} 2.4 & .2 & -1 & 0 & .3 \\ .2 & .6 & 0 & 0 & -.1 \\ -1 & 0 & 1.5 & -.5 & .5 \\ 0 & 0 & -.5 & .5 & -.5 \\ .3 & -.1 & .5 & -.5 & .6 \end{pmatrix}$; unstable: there is one mechanism prescribed by the kernel basis element $( -.25, .25, -.25, .75, 1 )^T$, in which the roller moves up, the right hand node moves up and slightly to the left, while the left node moves slightly left and up.
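The horizontal-roller matrices of Exercise 6.3.24(a) can be sanity-checked numerically. The sketch below assumes unit bar stiffnesses, so that the stiffness matrix is the Gram matrix of the reduced incidence matrix, and it uses the four-digit rounded entries quoted in the solution; the helper names `gram` and `apply` are illustrative only.

```python
# Reduced incidence matrix for the horizontal roller (rounded entries from the solution).
A = [
    [0.7071, 0.7071, 0.0, 0.0, 0.0],
    [-1.0, 0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, -0.7071, 0.7071, 0.7071],
    [-0.9487, 0.3162, 0.0, 0.0, 0.9487],
]

def gram(A):
    """Return A^T A, i.e. the stiffness matrix when every bar has unit stiffness (assumption)."""
    n = len(A[0])
    return [[sum(row[i] * row[j] for row in A) for j in range(n)] for i in range(n)]

def apply(A, z):
    """Matrix-vector product A z."""
    return [sum(a * x for a, x in zip(row, z)) for row in A]

K = gram(A)

# Candidate mechanism: it should lie in ker A (up to rounding), so no bar is elongated.
z = [0.75, -0.75, 0.75, -0.25, 1.0]
elongations = apply(A, z)
```

Because the entries are rounded, `K` agrees with the quoted stiffness matrix, and `elongations` vanishes, only up to a tolerance of about 10⁻³.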

6.3.25. (a) Yes, if the direction of the roller is perpendicular to the vector between the two nodes, the structure admits an (infinitesimal) rotation around the fixed node. (b) A total of six rollers is required to eliminate all six independent rigid motions. The rollers must not be “aligned”. For instance, if they all point in the same direction, they do not eliminate a translational mode.


Solutions — Chapter 7

7.1.1. Only (a) and (d) are linear.

7.1.2. Only (a), (d) and (f) are linear.

7.1.3. (a) $F(0, 0) = \begin{pmatrix} 2 \\ 0 \end{pmatrix} \ne \begin{pmatrix} 0 \\ 0 \end{pmatrix}$, (b) $F(2x, 2y) = 4 F(x, y) \ne 2 F(x, y)$, (c) $F(-x, -y) = F(x, y) \ne -F(x, y)$, (d) $F(2x, 2y) \ne 2 F(x, y)$, (e) $F(0, 0) \ne \begin{pmatrix} 0 \\ 0 \end{pmatrix}$.

0 7.1.4. Since T 0 identity map. 0

0 7.1.5. (a) B @1 0 (e)

0 B B B @

1 3 2 3 2 3 !

= 1

−1 0 0 2 3 − 13 2 3

−

!

a , linearity requires a = b = 0, so the only linear translation is the b

0 0C A, 1

0

(b)

2 3 2 3 − 13

1

B B B0 @

0

0

1

C C C, A

(f )

0

1 √2 3 2

1

B @0

0

0 1 0

1

0C C C, A

√ − 23 1 2 1

0 0C A, 0

(c)

(g)

!

!

1 , 1

0 B B B @

0

1

−

1

0 1 0

B @0

0

0 0C A, −1 − 31

1 6 5 6 1 3

5 6 1 6 1 3

1

(d)

0

0

B @1

0

0 0 1

1

1 0C A, 0

C 1C C. 3A 1 3

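7.1.6. $L \begin{pmatrix} x \\ y \end{pmatrix} = \frac52 x - \frac12 y$. Yes, because $\begin{pmatrix} 1 \\ 1 \end{pmatrix}$, $\begin{pmatrix} 1 \\ -1 \end{pmatrix}$ form a basis, so we can write any $v \in \mathbb{R}^2$ as a linear combination $v = c \begin{pmatrix} 1 \\ 1 \end{pmatrix} + d \begin{pmatrix} 1 \\ -1 \end{pmatrix}$. Thus, by linearity, $L[ v ] = c\, L \begin{pmatrix} 1 \\ 1 \end{pmatrix} + d\, L \begin{pmatrix} 1 \\ -1 \end{pmatrix} = 2 c + 3 d$ is uniquely determined by its values on the two basis vectors.

7.1.7. $L(x, y) = \begin{pmatrix} -2 x + \frac43 y \\ -\frac13 x - \frac13 y \end{pmatrix}$.

7.1.8. The linear function exists and is unique if and only if $\begin{pmatrix} x_1 \\ y_1 \end{pmatrix}$, $\begin{pmatrix} x_2 \\ y_2 \end{pmatrix}$ are linearly independent. In this case, the matrix form is $A = \begin{pmatrix} a_1 & a_2 \\ b_1 & b_2 \end{pmatrix} \begin{pmatrix} x_1 & x_2 \\ y_1 & y_2 \end{pmatrix}^{-1}$. On the other hand, if $c_1 \begin{pmatrix} x_1 \\ y_1 \end{pmatrix} + c_2 \begin{pmatrix} x_2 \\ y_2 \end{pmatrix} = 0$, then the linear function exists (but is not uniquely defined) if and only if $c_1 \begin{pmatrix} a_1 \\ b_1 \end{pmatrix} + c_2 \begin{pmatrix} a_2 \\ b_2 \end{pmatrix} = 0$.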

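7.1.9. No, because linearity would require
$L \begin{pmatrix} 0 \\ 1 \\ -1 \end{pmatrix} = L \left[ \begin{pmatrix} 1 \\ 0 \\ -1 \end{pmatrix} - \begin{pmatrix} 1 \\ -1 \\ 0 \end{pmatrix} \right] = L \begin{pmatrix} 1 \\ 0 \\ -1 \end{pmatrix} - L \begin{pmatrix} 1 \\ -1 \\ 0 \end{pmatrix} = 3 \ne -2$.

♦ 7.1.10. $L_a[ c v + d w ] = a \times (c v + d w) = c\, a \times v + d\, a \times w = c L_a[ v ] + d L_a[ w ]$; matrix representative: $\begin{pmatrix} 0 & -c & b \\ c & 0 & -a \\ -b & a & 0 \end{pmatrix}$.

7.1.11. No, since $N(-v) = N(v) \ne -N(v)$.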

7.1.12. False, since Q(c v) = c2 v 6= c v in general.

7.1.13. Set b = L(1). Then L(x) = L(x 1) = x L(1) = x b. The proof of linearity is straightforward; indeed, this is a special case of matrix multiplication.

7.1.14. (a) $L[ c X + d Y ] = A(c X + d Y) = c A X + d A Y = c L[ X ] + d L[ Y ]$; matrix representative: $\begin{pmatrix} a & 0 & b & 0 \\ 0 & a & 0 & b \\ c & 0 & d & 0 \\ 0 & c & 0 & d \end{pmatrix}$.
(b) $R[ c X + d Y ] = (c X + d Y) B = c X B + d Y B = c R[ X ] + d R[ Y ]$; matrix representative: $\begin{pmatrix} p & r & 0 & 0 \\ q & s & 0 & 0 \\ 0 & 0 & p & r \\ 0 & 0 & q & s \end{pmatrix}$.
(c) $K[ c X + d Y ] = A(c X + d Y) B = c A X B + d A Y B = c K[ X ] + d K[ Y ]$; matrix representative: $\begin{pmatrix} ap & ar & bp & br \\ aq & as & bq & bs \\ cp & cr & dp & dr \\ cq & cs & dq & ds \end{pmatrix}$.
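The three matrix representatives in Exercise 7.1.14 can be reproduced mechanically by applying each map to a basis. The following Python sketch assumes the basis E11, E12, E21, E22 of the 2 × 2 matrices (row-major order) and the labelling A = (a b; c d), B = (p q; r s); the sample integer entries and the helper `representative` are hypothetical choices made only for illustration.

```python
from itertools import product

def matmul(P, Q):
    """Plain matrix product of two lists of rows."""
    return [[sum(P[i][k] * Q[k][j] for k in range(len(Q)))
             for j in range(len(Q[0]))] for i in range(len(P))]

def representative(op):
    """4x4 matrix of a linear map on 2x2 matrices w.r.t. E11, E12, E21, E22."""
    images = []
    for i, j in product(range(2), repeat=2):
        E = [[0, 0], [0, 0]]
        E[i][j] = 1
        Y = op(E)
        # Flatten the image row by row; it becomes one column of the representative.
        images.append([Y[0][0], Y[0][1], Y[1][0], Y[1][1]])
    return [[images[k][r] for k in range(4)] for r in range(4)]

A = [[2, 3], [5, 7]]       # a, b, c, d = 2, 3, 5, 7
B = [[11, 13], [17, 19]]   # p, q, r, s = 11, 13, 17, 19

L_rep = representative(lambda X: matmul(A, X))             # X -> A X
R_rep = representative(lambda X: matmul(X, B))             # X -> X B
K_rep = representative(lambda X: matmul(matmul(A, X), B))  # X -> A X B
```

Since K is the composition of L and R, `K_rep` also equals the product `matmul(L_rep, R_rep)`, which gives an independent check on part (c).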

7.1.15. (a) Linear; target space = Mn×n . (b) Not linear; target space = Mn×n . (c) Linear; target space = Mn×n . (d) Not linear; target space = Mn×n . (e) Not linear; target space = R. (f ) Linear; target space = R. (g) Linear; target space = R n . (h) Linear; target space = R n . (i) Linear; target space = R.

♦ 7.1.16. (a) If L satisfies (7.1), then L[ c v + d w ] = L[ c v ] + L[ d w ] = c L[ v ] + d L[ w ], proving (7.3). Conversely, given (7.3), the first equation in (7.1) is the special case c = d = 1, while the second corresponds to d = 0. (b) Equations (7.1, 3) prove (7.4) for k = 1, 2. By induction, assuming the formula is true for k, to prove it for k + 1, we compute L[ c1 v1 + · · · + ck vk + ck+1 vk+1 ] = L[ c1 v1 + · · · + ck vk ] + ck+1 L[ vk+1 ] = c1 L[ v1 ] + · · · + ck L[ vk ] + ck+1 L[ vk+1 ].

♦ 7.1.17. If $v = c_1 v_1 + \cdots + c_n v_n$, then, by linearity, $L[ v ] = L[ c_1 v_1 + \cdots + c_n v_n ] = c_1 L[ v_1 ] + \cdots + c_n L[ v_n ] = c_1 w_1 + \cdots + c_n w_n$. Since $v_1, \ldots, v_n$ form a basis, the coefficients $c_1, \ldots, c_n$ are uniquely determined by $v \in V$ and hence the preceding formula uniquely determines $L[ v ]$.

♥ 7.1.18.
(a) $B(c v + \tilde c \tilde v, w) = (c v_1 + \tilde c \tilde v_1) w_1 - 2 (c v_2 + \tilde c \tilde v_2) w_2 = c (v_1 w_1 - 2 v_2 w_2) + \tilde c (\tilde v_1 w_1 - 2 \tilde v_2 w_2) = c B(v, w) + \tilde c B(\tilde v, w)$, so $B(v, w)$ is linear in $v$ for fixed $w$. Similarly, $B(v, c w + \tilde c \tilde w) = v_1 (c w_1 + \tilde c \tilde w_1) - 2 v_2 (c w_2 + \tilde c \tilde w_2) = c (v_1 w_1 - 2 v_2 w_2) + \tilde c (v_1 \tilde w_1 - 2 v_2 \tilde w_2) = c B(v, w) + \tilde c B(v, \tilde w)$, proving linearity in $w$ for fixed $v$.
(b) $B(c v + \tilde c \tilde v, w) = 2 (c v_1 + \tilde c \tilde v_1) w_2 - 3 (c v_2 + \tilde c \tilde v_2) w_3 = c (2 v_1 w_2 - 3 v_2 w_3) + \tilde c (2 \tilde v_1 w_2 - 3 \tilde v_2 w_3) = c B(v, w) + \tilde c B(\tilde v, w)$, so $B(v, w)$ is linear in $v$ for fixed $w$. Similarly, $B(v, c w + \tilde c \tilde w) = 2 v_1 (c w_2 + \tilde c \tilde w_2) - 3 v_2 (c w_3 + \tilde c \tilde w_3) = c (2 v_1 w_2 - 3 v_2 w_3) + \tilde c (2 v_1 \tilde w_2 - 3 v_2 \tilde w_3) = c B(v, w) + \tilde c B(v, \tilde w)$, proving bilinearity.
(c) $B(c v + \tilde c \tilde v, w) = \langle c v + \tilde c \tilde v, w \rangle = c \langle v, w \rangle + \tilde c \langle \tilde v, w \rangle = c B(v, w) + \tilde c B(\tilde v, w)$, $B(v, c w + \tilde c \tilde w) = \langle v, c w + \tilde c \tilde w \rangle = c \langle v, w \rangle + \tilde c \langle v, \tilde w \rangle = c B(v, w) + \tilde c B(v, \tilde w)$.
(d) $B(c v + \tilde c \tilde v, w) = (c v + \tilde c \tilde v)^T A w = c v^T A w + \tilde c \tilde v^T A w = c B(v, w) + \tilde c B(\tilde v, w)$, $B(v, c w + \tilde c \tilde w) = v^T A (c w + \tilde c \tilde w) = c v^T A w + \tilde c v^T A \tilde w = c B(v, w) + \tilde c B(v, \tilde w)$.
(e) Set $a_{ij} = B(e_i, e_j)$ for $i = 1, \ldots, m$, $j = 1, \ldots, n$. Then
$B(v, w) = B(v_1 e_1 + \cdots + v_m e_m, w) = v_1 B(e_1, w) + \cdots + v_m B(e_m, w) = v_1 B(e_1, w_1 e_1 + \cdots + w_n e_n) + \cdots + v_m B(e_m, w_1 e_1 + \cdots + w_n e_n) = \sum_{i=1}^m \sum_{j=1}^n v_i w_j B(e_i, e_j) = \sum_{i=1}^m \sum_{j=1}^n v_i w_j a_{ij} = v^T A w$.
(f) Let $B(v, w) = ( B_1(v, w), \ldots, B_k(v, w) )^T$. Then the bilinearity conditions $B(c v + \tilde c \tilde v, w) = c B(v, w) + \tilde c B(\tilde v, w)$ and $B(v, c w + \tilde c \tilde w) = c B(v, w) + \tilde c B(v, \tilde w)$ hold if and only if each component $B_j(v, w)$ is bilinear.
(g) False. $B(c (v, w) + \tilde c (\tilde v, \tilde w)) = B(c v + \tilde c \tilde v, c w + \tilde c \tilde w) = c^2 B(v, w) + c \tilde c B(v, \tilde w) + c \tilde c B(\tilde v, w) + \tilde c^2 B(\tilde v, \tilde w) \ne c B(v, w) + \tilde c B(\tilde v, \tilde w)$.

7.1.19. (a) Linear; target space = R. (b) Not linear; target space = R. (c) Linear; target space = R. (d) Linear; target space = R. (e) Linear; target space = C¹(R). (f) Linear; target space = C¹(R). (g) Not linear; target space = C¹(R). (h) Linear; target space = C⁰(R). (i) Linear; target space = C⁰(R). (j) Linear; target space = C⁰(R). (k) Not linear; target space = R. (l) Linear; target space = R. (m) Not linear; target space = R. (n) Linear; target space = C²(R). (o) Linear; target space = C²(R). (p) Not linear; target space = C¹(R). (q) Linear; target space = C¹(R). (r) Linear; target space = R. (s) Not linear; target space = C²(R).

7.1.20. True. For any constants c, d,
$A[ c f + d g ] = \frac{1}{b - a} \int_a^b [ c f(x) + d g(x) ]\, dx = \frac{c}{b - a} \int_a^b f(x)\, dx + \frac{d}{b - a} \int_a^b g(x)\, dx = c A[ f ] + d A[ g ]$.

7.1.21. $M_h[ c f(x) + d g(x) ] = h(x) (c f(x) + d g(x)) = c h(x) f(x) + d h(x) g(x) = c M_h[ f(x) ] + d M_h[ g(x) ]$. To show the target space is $C^n[ a, b ]$, you need the result that the product of two n times continuously differentiable functions is n times continuously differentiable.

7.1.22. $I_w[ c f + d g ] = \int_a^b [ c f(x) + d g(x) ]\, w(x)\, dx = c \int_a^b f(x) w(x)\, dx + d \int_a^b g(x) w(x)\, dx = c I_w[ f ] + d I_w[ g ]$.

7.1.23. (a) $\partial_x[ c f + d g ] = \frac{\partial}{\partial x} [ c f(x) + d g(x) ] = c \frac{\partial f}{\partial x} + d \frac{\partial g}{\partial x} = c\, \partial_x[ f ] + d\, \partial_x[ g ]$. The same proof works for $\partial_y$. (b) Linearity requires d = 0.

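7.1.24. $\Delta[ c f + d g ] = \frac{\partial^2}{\partial x^2} [ c f(x, y) + d g(x, y) ] + \frac{\partial^2}{\partial y^2} [ c f(x, y) + d g(x, y) ] = c \left( \frac{\partial^2 f}{\partial x^2} + \frac{\partial^2 f}{\partial y^2} \right) + d \left( \frac{\partial^2 g}{\partial x^2} + \frac{\partial^2 g}{\partial y^2} \right) = c \Delta[ f ] + d \Delta[ g ]$.

7.1.25. $G[ c f + d g ] = \nabla(c f + d g) = \begin{pmatrix} \frac{\partial}{\partial x} [ c f(x) + d g(x) ] \\ \frac{\partial}{\partial y} [ c f(x) + d g(x) ] \end{pmatrix} = c \begin{pmatrix} \frac{\partial f}{\partial x} \\ \frac{\partial f}{\partial y} \end{pmatrix} + d \begin{pmatrix} \frac{\partial g}{\partial x} \\ \frac{\partial g}{\partial y} \end{pmatrix} = c \nabla f + d \nabla g = c G[ f ] + d G[ g ]$.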

7.1.26. (a) Gradient: ∇(c f + d g) = c ∇f + d ∇g; domain is space of continuously differentiable scalar functions; target is space of continuous vector fields. (b) Curl: ∇ × (c f + d g) = c ∇ × f + d ∇ × g; domain is space of continuously differentiable vector fields; target is space of continuous vector fields. (c) Divergence: ∇ · (c f + d g) = c ∇ · f + d ∇ · g; domain is space of continuously differentiable vector fields; target is space of continuous scalar functions.

7.1.27. (a) dimension = 3; basis: ( 1, 0, 0 ), ( 0, 1, 0 ), ( 0, 0, 1 ).
(b) dimension = 4; basis: $\begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}$, $\begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}$, $\begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix}$, $\begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix}$.
(c) dimension = m n; basis: $E_{ij}$ with (i, j) entry equal to 1 and all other entries 0, for i = 1, . . . , m, j = 1, . . . , n.
(d) dimension = 4; basis given by $L_0, L_1, L_2, L_3$, where $L_i[ a_3 x^3 + a_2 x^2 + a_1 x + a_0 ] = a_i$.
(e) dimension = 6; basis given by $L_0, L_1, L_2, M_0, M_1, M_2$, where $L_i[ a_2 x^2 + a_1 x + a_0 ] = \begin{pmatrix} a_i \\ 0 \end{pmatrix}$, $M_i[ a_2 x^2 + a_1 x + a_0 ] = \begin{pmatrix} 0 \\ a_i \end{pmatrix}$.
(f) dimension = 9; basis given by $L_0, L_1, L_2, M_0, M_1, M_2, N_0, N_1, N_2$, where, for i = 0, 1, 2, $L_i[ a_2 x^2 + a_1 x + a_0 ] = a_i$, $M_i[ a_2 x^2 + a_1 x + a_0 ] = a_i x$, $N_i[ a_2 x^2 + a_1 x + a_0 ] = a_i x^2$.

7.1.28. True. The dimension is 2, with basis $\begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}$, $\begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix}$.

7.1.29. False. The zero function is not an element. “

7.1.30. (a) a = ( 3, −1, 2 )T , (b) a = 3, − 21 ,

2 3

”T

, (c) a =

“

1 5 5 4,−2, 4

”T

.

♦ 7.1.31. (a) a = K −1 rT since L[ v ] = r v = r K −1 K v = aT K v and K T = K. (b) (i) a = ! !−1 ! ! ! !−1 ! 2 2 3 0 2 1 2 2 −1 3 , = , (ii) a = . = (iii) a = −1 0 2 −1 0 −1 −1 3 −1 ♥ 7.1.32. (a) By linearity, Li [ x1 v1 + · · · + xn vn ] = x1 Li [ v1 ] + · · · + xn Li [ vn ] = xi . (b) Every real-valued linear function L ∈ V ∗ has the form L[ v ] = a1 x1 + · · · + an xn = a1 L1 [ v ] + · · · + an Ln [ v ] and so L = a1 L1 + · · · + an Ln proving that L1 , . . . , Ln span V ∗ . Moreover, they are linearly independent since a1 L1 + · · · + an Ln = O gives the trivial linear function if and only if a1 x1 + · · · + an xn = 0 for all x1 , . . . , xn , which implies a1 = · · · = an = 0. (c) Let ri denote the ith row of A−1 which we identify as the linear function (Li [ v ] = ri v. 1 i = j, The (i, j) entry of the equation A−1 A = I says that Li [ vj ] = ri vj = 0, i 6= j, which is the requirement for being a dual basis.


7.1.33. In ”all cases,“the dual basis consists of the liner functions L [ v ] = ri“v. (a) r1” = “ ” “ ” “ ” i 1 1 1 1 3 2 1 1 1 1 1 2 , 2 , r2 = 2 , − 2 , (b) r1 = 7 , 7 , r2 = 7 , − 7 , (c) r1 = 2 , 2 , − 2 , r2 = “

”

“

”

, r3 = − 12 , 21 , 21 , (d) r1 = ( 8, 1, 3 ) , r2 = ( 10, 1, 4 ) , r3 = ( 7, 1, 3 ), (e) r1 = ( 0, 1, −1, 1 ) , r2 = ( 1, −1, 2, −2 ) , r3 = ( −2, 2, −2, 3 ) , r4 = ( 1, −1, 1, −1 ). 1 1 1 2,−2, 2

7.1.34. (a) 9 − 36 x + 30 x2 ,

(b) 12 − 84 x + 90 x2 , (c) 1, (d) 38 − 192 x + 180 x2 .

7.1.35. 9 − 36 x + 30 x2 , −36 + 192 x − 180 x2 , 30 − 180 x + 180 x2 .

7.1.36. Let $w_1, \ldots, w_n$ be any basis of V. Write $v = y_1 w_1 + \cdots + y_n w_n$, so, by linearity, $L[ v ] = y_1 L[ w_1 ] + \cdots + y_n L[ w_n ] = b_1 y_1 + \cdots + b_n y_n = b^T y$, where $b_i = L[ w_i ]$. On the other hand, if we write $a = a_1 w_1 + \cdots + a_n w_n$, then
$\langle a, v \rangle = \sum_{i,j=1}^n a_i y_j \langle w_i, w_j \rangle = \sum_{i,j=1}^n k_{ij} a_i y_j = a^T K y$,
where, on the right hand side, a, y and b denote the corresponding coefficient vectors, and $k_{ij} = \langle w_i, w_j \rangle$ are the entries of the Gram matrix K based on $w_1, \ldots, w_n$. Thus, since K is symmetric, setting $a = K^{-1} b$ gives $L[ v ] = \langle a, v \rangle$.

7.1.37. (a) S ◦ T (b) S ◦ T (c) S ◦ T (d) S ◦ T

= T ◦ S = clockwise rotation by 60◦ = counterclockwise rotation by 300◦ ; = T ◦ S = reflection in the line y = x; = T ◦ S = rotation by 180◦ ; “ ” = counterclockwise rotation by cos−1 − 45 = 12 π − 2 tan−1 21 radians; “

T ◦ S = clockwise rotation by cos−1 − 54 (e) S ◦ T = T ◦ S = O; (f ) S ◦ T maps ( x, y )T to

(g) S ◦ T maps ( x, y ) (h) S ◦ T maps ( x, y )

T

T

“

1 2

“

− 25

(x + y), 0

=

1 2

π − 2 tan−1

to ( y, 0 ) ; T ◦ S maps ( x, y ) to

T ◦ S maps

x+

1 5

0

4 5

y, x −

”T 2 ; 5y !

0

1

radians;

; T ◦ S maps ( x, y )T to

1 −1 −1 0 ; (b) M = ; (c) −3 2 −3 2 (d) Each linear transformation is uniquely determined ! 2 −1 0 = (e) M [ ei ] = N ◦ L[ ei ] for i = 1, 2. 0 −3 2

7.1.38. (a) L =

1 2

T

!

T

”T

”

1

to

( 0, x )T ; “

“

1 2

x, 12 x

− 25 x +

”T

4 5

!

;

y, 51 x −

2 5

y

”T

.

2 1 ; 0 1 by ! its action on!a basis of R 2 , and 1 −1 1 . −3 2 1 N=

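7.1.39. (a) $R = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 0 & -1 \\ 0 & 1 & 0 \end{pmatrix}$, $S = \begin{pmatrix} 0 & -1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{pmatrix}$; (b) $R \circ S = \begin{pmatrix} 0 & -1 & 0 \\ 0 & 0 & -1 \\ 1 & 0 & 0 \end{pmatrix} \ne S \circ R = \begin{pmatrix} 0 & 0 & 1 \\ 1 & 0 & 0 \\ 0 & 1 & 0 \end{pmatrix}$; under $R \circ S$, the basis vectors $e_1, e_2, e_3$ go to $e_3, -e_1, -e_2$, respectively. Under $S \circ R$, they go to $e_2, e_3, e_1$. (c) Do it.

7.1.40. No. The matrix representatives for P, Q and $R = Q \circ P$ are, respectively,
$P = \begin{pmatrix} \frac23 & -\frac13 & \frac13 \\ -\frac13 & \frac23 & \frac13 \\ \frac13 & \frac13 & \frac23 \end{pmatrix}$, $Q = \begin{pmatrix} \frac23 & \frac13 & \frac13 \\ \frac13 & \frac23 & -\frac13 \\ \frac13 & -\frac13 & \frac23 \end{pmatrix}$, $R = Q P = \begin{pmatrix} \frac49 & \frac19 & \frac59 \\ -\frac19 & \frac29 & \frac19 \\ \frac59 & -\frac19 & \frac49 \end{pmatrix}$,
but orthogonal projection onto $L = \{ t\, ( 1, 0, 1 )^T \}$ has matrix representative $M = \begin{pmatrix} \frac12 & 0 & \frac12 \\ 0 & 0 & 0 \\ \frac12 & 0 & \frac12 \end{pmatrix} \ne R$.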
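The product in Exercise 7.1.40 is easy to recompute with exact rational arithmetic. The sketch below assumes, as an inference from the matrices P and Q themselves, that the two planes have normal vectors (1, 1, −1) and (1, −1, −1); the function names are illustrative.

```python
from fractions import Fraction

def plane_projection(n):
    """Orthogonal projection onto the plane through 0 with normal n: I - n n^T / (n . n)."""
    nn = sum(x * x for x in n)
    return [[Fraction(int(i == j)) - Fraction(n[i] * n[j], nn) for j in range(3)]
            for i in range(3)]

def line_projection(d):
    """Orthogonal projection onto the line spanned by d: d d^T / (d . d)."""
    dd = sum(x * x for x in d)
    return [[Fraction(d[i] * d[j], dd) for j in range(3)] for i in range(3)]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)] for i in range(3)]

def transpose(A):
    return [list(col) for col in zip(*A)]

P = plane_projection((1, 1, -1))   # assumed normal of the first plane
Q = plane_projection((1, -1, -1))  # assumed normal of the second plane
R = matmul(Q, P)                   # first project with P, then with Q
M = line_projection((1, 0, 1))     # projection onto the line where the planes meet
```

The composite `R` is not symmetric, so it cannot be an orthogonal projection at all, which is a slightly stronger statement than `R != M`.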

7.1.41. (a) L = E ◦ D where D[ f (x) ] = f ′ (x), E[ g(x) ] = g(0). No, they do not commute — D ◦ E is not even defined since the target of E, namely R, is not the domain of D, the space of differentiable functions. (b) e = 0 is the only condition. 7.1.42. L ◦ M = x D 2 + (1 − x2 )D − 2 x, M ◦ L = x D2 + (2 − x2 )D − x. They do not commute. 7.1.43. (a) According to Lemma 7.11, Ma ◦ D is linear, and hence, for the same reason, L = D ◦ (Ma ◦ D) is also linear. (b) L = a(x) D 2 + a′ (x) D. ♦ 7.1.44. (a) Given L: V → U , M : W → V and N : Z → W , we have, for z ∈ Z, ((L ◦ M ) ◦ N )[ z ] = (L ◦ M )[ N [ z ] ] = L[ M [ N [ z ] ] ] = L[ (M ◦ N )[ z ] ] = (L ◦ (M ◦ N ))[ z ] as elements of U . (b) Lemma 7.11 says that M ◦ N is linear, and hence, for the same reason, (L ◦ M ) ◦ N is linear. (c) When U = R m , V = R n , W = R p , Z = R q , then L is represented by an m × n matrix A, M is represented by a n × p matrix B, and N is represented by an p × q matrix C. Associativity of composition implies (A B) C = A (B C).

7.1.45. Given L = an Dn + · · · + a1 D + a0 , M = bn Dn + · · · + b1 D + b0 , with ai , bi constant, the linear combination c L + d M = (c an + d bn ) Dn + · · · + (c a1 + d b1 ) D + (c a0 + d b0 ), is also a constant coefficient linear differential operator, proving that it forms a subspace of the space of all linear operators. A basis is D n , Dn−1 , . . . , D, 1 and so its dimension is n + 1. 7.1.46. If p(x, y) =

X

cij xi y j then p(x, y) =

X

cij ∂xi ∂yj is a linear combination of linear opera-

tors, which can be built up as compositions ∂xi ◦ ∂yj = ∂x ◦ · · · ◦ ∂x ◦ ∂y ◦ · · · ◦ ∂y of the basic first order linear partial differential operators. ♥ 7.1.47. (a) Both L ◦ M and M ◦ L are linear by Lemma 7.11, and, since the linear transformations form a vector space, their difference L ◦ M − M ◦ L is also linear. (b) L ◦ M = M ◦ L if and only if [ L, M ] = L ◦ M − M ◦ L = O. 0 1 ! ! 0 2 0 0 −2 1 3 C , (iii) B , (ii) (c) (i) @ −2 0 −2 A. −2 0 0 −1 0 2 0 h i ◦ ◦ ◦ ◦ ◦ [ L, M ], N = (L M − M L) N − N (L M − M ◦ L) (d) h h

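$= L \circ M \circ N - M \circ L \circ N - N \circ L \circ M + N \circ M \circ L$,
$[ [ N, L ], M ] = (N \circ L - L \circ N) \circ M - M \circ (N \circ L - L \circ N) = N \circ L \circ M - L \circ N \circ M - M \circ N \circ L + M \circ L \circ N$,
$[ [ M, N ], L ] = (M \circ N - N \circ M) \circ L - L \circ (M \circ N - N \circ M) = M \circ N \circ L - N \circ M \circ L - L \circ M \circ N + L \circ N \circ M$,
which add up to O.
(e) $[ [ L, M ], N ] = \begin{pmatrix} 3 & -2 \\ 0 & -3 \end{pmatrix}$, $[ [ N, L ], M ] = \begin{pmatrix} 0 & 0 \\ -2 & 0 \end{pmatrix}$, $[ [ M, N ], L ] = \begin{pmatrix} -3 & 2 \\ 2 & 3 \end{pmatrix}$, whose sum is $\begin{pmatrix} 3 & -2 \\ 0 & -3 \end{pmatrix} + \begin{pmatrix} 0 & 0 \\ -2 & 0 \end{pmatrix} + \begin{pmatrix} -3 & 2 \\ 2 & 3 \end{pmatrix} = O$.
(f) $B(c L + \tilde c \tilde L, M) = [ c L + \tilde c \tilde L, M ] = (c L + \tilde c \tilde L) \circ M - M \circ (c L + \tilde c \tilde L) = c (L \circ M - M \circ L) + \tilde c (\tilde L \circ M - M \circ \tilde L) = c [ L, M ] + \tilde c [ \tilde L, M ] = c B(L, M) + \tilde c B(\tilde L, M)$,
$B(L, c M + \tilde c \tilde M) = - B(c M + \tilde c \tilde M, L) = - c B(M, L) - \tilde c B(\tilde M, L) = c B(L, M) + \tilde c B(L, \tilde M)$.
(Or the latter property can be proved directly.)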
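The Jacobi identity of part (d) can be spot-checked on concrete matrices. The following Python sketch uses three arbitrarily chosen integer matrices, which are hypothetical test data and not the operators of the exercise, so exact equality with the zero matrix is expected.

```python
def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def add(A, B):
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def sub(A, B):
    return [[a - b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def bracket(A, B):
    """Commutator [A, B] = A B - B A."""
    return sub(matmul(A, B), matmul(B, A))

# Arbitrary 3x3 integer matrices (illustrative test data).
L = [[1, 2, 0], [0, 1, 3], [4, 0, 1]]
M = [[0, 1, 1], [2, 0, 5], [1, 1, 0]]
N = [[3, 0, 2], [1, 1, 0], [0, 2, 1]]

# Left hand side of the Jacobi identity.
jacobi = add(add(bracket(bracket(L, M), N),
                 bracket(bracket(N, L), M)),
             bracket(bracket(M, N), L))
```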


♦ 7.1.48. (a) [ P, Q ] [ f ] = P ◦ Q[ f ] − Q ◦ P [ f ] = P [ x f ] − Q[ f ′ ] = (x f )′ − x f ′ = f . (b) According to Exercise 1.2.32, the trace of any matrix commutator is zero: tr[ P, Q ] = 0. On the other hand, tr I = n, the size of the matrix, not 0. ♥ 7.1.49. (a) D (1) is a subspace of the vector space of all linear operators acting on the space of polynomials, and so, by Proposition 2.9, one only needs to prove closure. If L = p(x) D + q(x) and M = r(x) D + s(x) are operators of the given form, so is c L + d M = [ c p(x) + d r(x) ] D + [ c q(x) + d s(x) ] for any scalars c, d ∈ R. It is an infinite dimensional vector space since the operators xi D and xj for i, j = 0, 1, 2, . . . are all linearly independent. (b) If L = p(x) D + q(x) and M = r(x) D + s(x), then L ◦ M = p r D 2 + (p r′ + q r + p s) D + (p s′ + q s), M ◦ L = p r D2 + (p′ r + q r + p s) D + (q ′ r + q s),

hence $[ L, M ] = (p r' - p' r) D + (p s' - q' r)$. (c) $[ L, M ] = L$, $[ M, N ] = N$, $[ N, L ] = -2 M$, and so
$[ [ L, M ], N ] + [ [ N, L ], M ] + [ [ M, N ], L ] = [ L, N ] - 2 [ M, M ] + [ N, L ] = O$.

7.1.50. Yes, it is a vector space, but the commutator of two second order differential operators is, in general, a third order differential operator. For example [ x D 2 , D2 ] = − 2 D 3 .

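7.1.51. (a) The inverse is the scaling transformation that halves the length of each vector. (b) The inverse is counterclockwise rotation by 45°. (c) The inverse is reflection through the y axis. (d) No inverse. (e) The inverse is the shearing transformation $\begin{pmatrix} 1 & -2 \\ 0 & 1 \end{pmatrix}$.

7.1.52.
(a) Function: $\begin{pmatrix} 2 & 0 \\ 0 & 2 \end{pmatrix}$; inverse: $\begin{pmatrix} \frac12 & 0 \\ 0 & \frac12 \end{pmatrix}$.
(b) Function: $\begin{pmatrix} \frac{1}{\sqrt2} & \frac{1}{\sqrt2} \\ -\frac{1}{\sqrt2} & \frac{1}{\sqrt2} \end{pmatrix}$; inverse: $\begin{pmatrix} \frac{1}{\sqrt2} & -\frac{1}{\sqrt2} \\ \frac{1}{\sqrt2} & \frac{1}{\sqrt2} \end{pmatrix}$.
(c) Function: $\begin{pmatrix} -1 & 0 \\ 0 & 1 \end{pmatrix}$; inverse: $\begin{pmatrix} -1 & 0 \\ 0 & 1 \end{pmatrix}$.
(d) Function: $\begin{pmatrix} \frac12 & \frac12 \\ \frac12 & \frac12 \end{pmatrix}$; no inverse.
(e) Function: $\begin{pmatrix} 1 & 2 \\ 0 & 1 \end{pmatrix}$; inverse: $\begin{pmatrix} 1 & -2 \\ 0 & 1 \end{pmatrix}$.

7.1.53. Since L has matrix representative $\begin{pmatrix} 1 & 3 \\ -1 & -2 \end{pmatrix}$, its inverse has matrix representative $\begin{pmatrix} -2 & -3 \\ 1 & 1 \end{pmatrix}$, and so $L^{-1}[ e_1 ] = \begin{pmatrix} -2 \\ 1 \end{pmatrix}$ and $L^{-1}[ e_2 ] = \begin{pmatrix} -3 \\ 1 \end{pmatrix}$.

7.1.54. Since L has matrix representative $\begin{pmatrix} 2 & 1 & -1 \\ 1 & 2 & 2 \\ -1 & 1 & 2 \end{pmatrix}$, its inverse has matrix representative $\begin{pmatrix} -\frac23 & 1 & -\frac43 \\ \frac43 & -1 & \frac53 \\ -1 & 1 & -1 \end{pmatrix}$, and so $L^{-1}[ e_1 ] = \begin{pmatrix} -\frac23 \\ \frac43 \\ -1 \end{pmatrix}$, $L^{-1}[ e_2 ] = \begin{pmatrix} 1 \\ -1 \\ 1 \end{pmatrix}$, $L^{-1}[ e_3 ] = \begin{pmatrix} -\frac43 \\ \frac53 \\ -1 \end{pmatrix}$.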
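The inverse quoted in Exercise 7.1.54 can be confirmed with exact arithmetic. This Python sketch takes the matrix of L and the claimed inverse as read from the solution above and multiplies them; the variable names are illustrative.

```python
from fractions import Fraction as F

# Matrix representative of L and the claimed inverse, as read from the solution.
L = [[2, 1, -1],
     [1, 2, 2],
     [-1, 1, 2]]
L_inv = [[F(-2, 3), F(1), F(-4, 3)],
         [F(4, 3), F(-1), F(5, 3)],
         [F(-1), F(1), F(-1)]]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)] for i in range(3)]

identity = [[F(int(i == j)) for j in range(3)] for i in range(3)]
product = matmul(L, L_inv)

# L^{-1}[e_j] is simply the j-th column of the inverse matrix.
images_of_basis = [[L_inv[i][j] for i in range(3)] for j in range(3)]
```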

♦ 7.1.55. If L ◦ M = L ◦ N = I W , M ◦ L = N ◦ L = I V , then, by associativity, M = M ◦ I W = M ◦ (L ◦ N ) = (M ◦ L) ◦ N = I V ◦ N = N .

♥ 7.1.56. (a) Every vector in V can be uniquely written as a linear combination of the basis elements: $v = c_1 v_1 + \cdots + c_n v_n$. Assuming linearity, we compute $L[ v ] = L[ c_1 v_1 + \cdots + c_n v_n ] = c_1 L[ v_1 ] + \cdots + c_n L[ v_n ] = c_1 w_1 + \cdots + c_n w_n$. Since the coefficients $c_1, \ldots, c_n$ of v are uniquely determined, this formula serves to uniquely define the function L: V → W. We must then check that the resulting function is linear. Given any two vectors $v = c_1 v_1 + \cdots + c_n v_n$, $w = d_1 v_1 + \cdots + d_n v_n$ in V, we have $L[ v ] = c_1 w_1 + \cdots + c_n w_n$, $L[ w ] = d_1 w_1 + \cdots + d_n w_n$. Then, for any a, b ∈ R,
$L[ a v + b w ] = L[ (a c_1 + b d_1) v_1 + \cdots + (a c_n + b d_n) v_n ] = (a c_1 + b d_1) w_1 + \cdots + (a c_n + b d_n) w_n = a ( c_1 w_1 + \cdots + c_n w_n ) + b ( d_1 w_1 + \cdots + d_n w_n ) = a L[ v ] + b L[ w ]$,

proving linearity of L. (b) The inverse is uniquely defined by the requirement that L−1 [ wi ] = vi , i = 1, . . . , n. Note that L ◦ L−1 [ wi ] = L[ vi ] = wi , and hence L ◦ L−1 = I W since w1 , . . . , wn is a basis. Similarly, L−1 ◦ L[ vi ] = L−1 [ wi ] = vi , and so L−1 ◦ L = I V . (c) If A = ( v1 v2 . . . vn ), B = ( w1 w2 . . . wn ), then L has matrix representative B A−1 , while L−1 has matrix representative A B −1 . 0 0 1 1 ! ! 1 3 1 1 2 −5 3 5 3 A, L−1 = @ 2 2 A; ; (ii) L = @ 3 , L−1 = (d) (i) L = 3 1 −1 3 1 2 1 −1 − 2 2 1 0 0 1 1 1 1 0 1 1 2 2C B − 2 B B −1 1 1 1C C 1 0 1C , L = (iii) L = B @ A. 2 −2 2A @ 1 1 0 1 1 1 2 2 −2

7.1.57. Let m = dim V = dim W. As guaranteed by Exercise 2.4.24, we can choose bases $v_1, \ldots, v_n$ and $w_1, \ldots, w_n$ of $\mathbb{R}^n$ such that $v_1, \ldots, v_m$ is a basis of V and $w_1, \ldots, w_m$ is a basis of W. We then define the invertible linear map $L: \mathbb{R}^n \to \mathbb{R}^n$ such that $L[ v_i ] = w_i$, i = 1, . . . , n, as in Exercise 7.1.56. Moreover, since $L[ v_i ] = w_i$, i = 1, . . . , m, maps the basis of V to the basis of W, it defines an invertible linear function from V to W.

7.1.58. Any m × n matrix of rank n < m. The inverse is not unique. For example, if $A = \begin{pmatrix} 1 \\ 0 \end{pmatrix}$, then $B = ( 1 \;\; a )$, for any scalar a, satisfies B A = I = (1).

♦ 7.1.59. Use associativity of composition: $N = N \circ I_W = N \circ L \circ M = I_V \circ M = M$.

7.1.60. (a) $L[ a x^2 + b x + c ] = a x^2 + (b + 2 a) x + (c + b)$; for $p(x) = a x^2 + b x + c$, $L^{-1}[ p ] = a x^2 + (b - 2 a) x + (c - b + 2 a) = e^{-x} \int_{-\infty}^x e^y p(y)\, dy$.
(b) Any of the functions $J_c[ p ] = \int_0^x p(y)\, dy + c$, where c is any constant, is a right inverse: $D \circ J_c = I$. There is no left inverse since ker D ≠ {0} contains all constant functions.
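The two formulas in Exercise 7.1.60(a) can be checked against each other on coefficient vectors. The sketch below assumes, consistently with the stated formula for L, that L = I + D on quadratic polynomials, and it stores a x² + b x + c as the list [c, b, a]; on this space the third derivative vanishes, so I − D + D² should invert I + D. All names are illustrative.

```python
def D(p):
    """Derivative of c + b x + a x^2, stored as [c, b, a]."""
    c, b, a = p
    return [b, 2 * a, 0]

def add(p, q):
    return [x + y for x, y in zip(p, q)]

def scale(k, p):
    return [k * x for x in p]

def L(p):
    """L = I + D, i.e. a x^2 + (b + 2a) x + (c + b)."""
    return add(p, D(p))

def L_inv(p):
    """I - D + D^2, i.e. a x^2 + (b - 2a) x + (c - b + 2a)."""
    return add(add(p, scale(-1, D(p))), D(D(p)))

p = [5, -3, 4]            # 4 x^2 - 3 x + 5
round_trip = L_inv(L(p))  # should give back p
```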

♥ 7.1.61. (a) It forms a three-dimensional subspace since it is spanned by the linearly independent functions $x^2 e^x$, $x e^x$, $e^x$.
(b) For $f = (a x^2 + b x + c)\, e^x$, $D[ f ] = ( a x^2 + (b + 2 a) x + (c + b) )\, e^x$ is invertible, with inverse $D^{-1}[ f ] = ( a x^2 + (b - 2 a) x + (c - b + 2 a) )\, e^x = \int_{-\infty}^x f(y)\, dy$.
(c) $D[ p(x) e^x ] = [ p'(x) + p(x) ]\, e^x$, while $D^{-1}[ p(x) e^x ] = \int_{-\infty}^x p(y)\, e^y\, dy$.

7.2.1. (a)

0

0 @

1 2

(d)

(e)

−

0 @

− −

1

2C A. √1 2 !

−1 0

(b) (c)

− √1

√1 B 2 @ 1 √ 2

3 5 4 5

Z x

−∞

Z x

−∞ y

f (y) dy.

p(y) e dy.

(i) The line y = x; (ii) the rotated square 0 ≤ x + y, x − y ≤ (iii) the unit disk.

0 . (i) The x axis; (ii) the square −1 ≤ x, y ≤ 0; (iii) the unit disk. −1

!

1 4 5 A. 3 5

(i) The line 4 x + 3 y = 0; (ii) the rotated square with vertices „ „ «T “ √ ” «T T ( 0, 0 )T , √1 , √1 , − √1 , √1 , 0, 2 ; (iii) the unit disk. 2

2

2

2

0 . (i) The line y = 4 x; (ii) the parallelogram with vertices ( 0, 0 )T , ( 1, 2 )T , 1 ( 1, 3 )T , ( 0, 1 )T ; (iii) the elliptical domain 5 x2 − 4 x y + y 2 ≤ 1. 1 3 2A 5 2

1 2 3 2

=

0

√1 B 2 @ 1 √ 2

− √1

1

1 0

2C A 1 √ 2

3 1

!

0 B @

−

1 √1 2C A. 1 √ 2

√1 2 √1 2

“

(i) The line y = 3 x; (ii) the parallelogram with vertices ( 0, 0 )T , − 21 , − 32 ( 1, 1 )T ,

(f )

0

1 @5 2 5

√ 2;

“

3 5 , 1 2 2

2 5 A. 4 5

”T

;

(iii) the elliptical domain

17 2

x2 − 9 x y +

(i) The line y = 2 x; (ii) the line segment

(iii) the line segment

7.2.2. Parallelogram with vertices



!

0 , 0

˛ ˛ ( x, 2 x )T ˛˛ !

!

1 , 2

n

√1 5

ﬀ

.

˛ ˛ ˛

,

3 5

o

0≤x≤

;

3 2.5

!

3 : 1

4 , 3

y 2 ≤ 1.

( x, 2 x )T

− √1 ≤ x ≤ 5

5 2

”T

2 1.5 1 0.5 1

2

3

1

2

3

4

1

(a) Parallelogram with vertices $( 0, 0 )^T$, $( 1, 1 )^T$, $( 4, -1 )^T$, $( 3, -2 )^T$. [Figure: plot of the parallelogram.]

(b) Parallelogram with vertices $( 0, 0 )^T$, $( 2, 1 )^T$, $( 3, 4 )^T$, $( 1, 3 )^T$. [Figure: plot of the parallelogram.]

(c) Parallelogram with vertices $( 0, 0 )^T$, $( 5, 7 )^T$, $( 10, 8 )^T$, $( 5, 1 )^T$. [Figure: plot of the parallelogram.]

(d) Parallelogram with vertices 0 √ 1 0 3 √ 1 ! √1 − 2 −√ + 2 2C 0 B 2 C B √ A, @ 3 2 √ A, ,@ 1 0 √ + √ +2 2 2 2

2

√ ! √2 : 2 2

3

2

1

-0.5

0.5

1

1

(e) Parallelogram with vertices $( 0, 0 )^T$, $( 3, 1 )^T$, $( 2, -1 )^T$, $( -1, -2 )^T$. [Figure: plot of the parallelogram.]

(f) Line segment between $( -\frac12, \frac12 )^T$ and $( 1, -1 )^T$. [Figure: plot of the line segment.]

(g) Line segment between $( -1, -2 )^T$ and $( 3, 6 )^T$. [Figure: plot of the line segment.]

7.2.3. (a) $L^2 = \begin{pmatrix} -1 & 0 \\ 0 & -1 \end{pmatrix}$ represents a rotation by θ = π; (b) L is clockwise rotation by 90°, or, equivalently, counterclockwise rotation by 270°.

7.2.4. $\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}^2 = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$. L represents a reflection through the line y = x. Reflecting twice brings you back where you started.

7.2.5. Writing $A = \begin{pmatrix} 1 & 0 \\ 2 & 1 \end{pmatrix} \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} \begin{pmatrix} 1 & 0 \\ -2 & 1 \end{pmatrix}$, we see that it is the composition of a reflection in the x axis followed by a shear along the y axis with shear factor 2, which is the same as first doing a shear along the y axis with shear factor −2 and then reflecting in the x axis. If we perform L twice, the shears cancel each other out, while reflecting twice brings us back where we started.

7.2.6. Its image is the line that goes through the image points $( -1, 2 )^T$, $( -4, -1 )^T$.

7.2.7. Example: $\begin{pmatrix} 2 & 0 \\ 0 & 3 \end{pmatrix}$. It is not unique, because you can compose it with any rotation or reflection, e.g., $\begin{pmatrix} 2 & 0 \\ 0 & 3 \end{pmatrix} \begin{pmatrix} \frac{1}{\sqrt2} & -\frac{1}{\sqrt2} \\ \frac{1}{\sqrt2} & \frac{1}{\sqrt2} \end{pmatrix} = \begin{pmatrix} \sqrt2 & -\sqrt2 \\ \frac{3}{\sqrt2} & \frac{3}{\sqrt2} \end{pmatrix}$.

7.2.8. Example: $A = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 4 \end{pmatrix}$. More generally, $\widehat A = A Q$ where Q is any 3 × 3 orthogonal matrix.

7.2.9. (a) True. (b) True. (c) False: in general, squares are mapped to parallelograms. (d) False: in general circles are mapped to ellipses. (e) True. ♦ 7.2.10. (a) The reflection through the line takes e1 to ( cos θ, sin θ )T and e2 to ( sin θ, − cos θ )T , and hence has the ! representative. ! indicated matrix ! cos ϕ sin ϕ cos θ sin θ cos(ϕ − θ) − sin(ϕ − θ) (b) = is rotation by angle sin ϕ − cos ϕ sin θ − cos θ sin(ϕ − θ) cos(ϕ − θ) ϕ − θ. Composing in the other order gives the opposite rotation, through angle θ − ϕ. ♦ 7.2.11. (a) Let z be a unit vector that is orthogonal to u, so u, z form an orthonormal basis of R 2 . Then L[ u ] = u = R u since uT u = 1, while L[ z ] = − z = R z since u · z = uT z = 0. v vT − I. Thus, L[ v ] = R v since they agree on a basis of R 2 . (b) R = 2 k v k02 0 1 1 ! ! 24 12 5 7 − − − − 0 1 1 0 25 25 A; 13 13 A. (iii) (c) (i) , (iv ) @ ; (ii) @ 24 12 7 5 1 0 0 −1 − 25 − 25 13 13 7.2.12. (a)

(b)

(c)

(d)

(e)

!

!

!

!

!

1 0 −3 0 0 1 0 2 1 − 13 : = 0 2 0 1 1 0 −3 1 0 1 1 a shear of magnitude − 3 along the x axis, followed by a scaling in the y direction by a factor of 2, followed by a scaling in the x direction by a factor of 3 coupled with a reflection in the by a!reflection ! in the line y = x. ! ! y axis, followed 1 1 1 0 1 0 1 1 : = 0 1 0 2 −1 1 −1 1 a shear of magnitude −1 along the x axis, followed by a scaling in the y direction by a factor of −1 ! ! of magnitude ! 2, followed! by a shear ! along the y axis. 1 3 1 3 0 1 0 1 0 1 3 : = 1 0 1 1 2 0 53 0 1 3 1 a shear of magnitude 13 along the x axis, followed by a scaling in the y direction by a factor of 53 , followed by a scaling of magnitude 3 in the x direction, followed by a shear 1 of magnitude along the 1 y0 axis. 0 1 30 10 10 10 10 1 1 1 0 1 0 0 1 0 0 1 0 0 1 0 0 1 0 0 1 1 0 B C B CB CB CB CB CB 1 0 A @ 0 1 0 A @ 0 −1 0 A @ 0 1 −1 A @ 0 1 0 C @1 0 1A = @1 1 0A@0 A: 0 1 1 0 0 1 0 −1 1 0 0 2 0 0 1 0 0 1 0 0 1 a shear of magnitude 1 along the x axis that fixes the xz plane, followed a shear of magnitude −1 along the y axis that fixes the xy plane, followed by a reflection in the xz plane, followed by a scaling in the z direction by a factor of 2, followed a shear of magnitude −1 along the z axis that fixes the xz plane, followed a shear of magnitude 1 along the y axis 1 that 0 fixes the yz1 plane. 10 1 0 0 10 10 10 1 2 0 1 2 0 1 0 0 1 0 0 1 0 0 1 0 0 1 0 0 C B C B CB CB CB CB 1 CB @ 2 4 1 A = @ 0 0 1 A @ 2 1 0 A @ 0 1 0 A @ 0 −3 0 A @ 0 1 − 3 A @ 0 1 0 A: 0 0 1 0 0 1 2 1 1 0 1 0 0 0 1 2 0 1 0 0 1 186

a shear of magnitude 2 along the x axis that fixes the xz plane, followed a shear of magnitude − 13 along the y axis that fixes the xy plane, followed by a scaling in the y direction by a factor of 3 coupled with a reflection in the xz plane, followed by a shear of magnitude 2 along the z axis that fixes the yz plane, followed shear of magnitude 2 along the y axis that fixes the yz plane, followed by a reflection in the plane y − z = 0. 7.2.13.

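(a) With $a = -\tan \frac12 \theta$ and $b = \sin \theta$, $\begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix} = \begin{pmatrix} 1 & a \\ 0 & 1 \end{pmatrix} \begin{pmatrix} 1 & 0 \\ b & 1 \end{pmatrix} \begin{pmatrix} 1 & a \\ 0 & 1 \end{pmatrix} = \begin{pmatrix} 1 + a b & 2 a + a^2 b \\ b & 1 + a b \end{pmatrix}$, since $1 + a b = 1 - \tan \frac12 \theta\, \sin\theta = 1 - 2 \sin^2 \frac12 \theta = \cos\theta$ and $2 a + a^2 b = -2 \tan \frac12 \theta + \tan^2 \frac12 \theta\, \sin\theta = -2 \tan \frac12 \theta\, ( 1 - \sin^2 \frac12 \theta ) = -2 \cos \frac12 \theta\, \sin \frac12 \theta = -\sin\theta$.
(b) The factorization is not valid when θ is an odd multiple of π, where the matrix $\begin{pmatrix} -1 & 0 \\ 0 & -1 \end{pmatrix}$ represents rotation by 180°.
(c) The first and third factors represent shears along the x axis with shear factor a, while the middle factor is a shear along the y axis with shear factor b.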
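The three-shear factorization of a rotation in Exercise 7.2.13 lends itself to a quick numerical check. The sketch below assumes the shear factors a = −tan(θ/2) and b = sin θ, which are the values implied by the identities 1 + ab = cos θ and 2a + a²b = −sin θ used in the solution; the function names are illustrative.

```python
import math

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def rotation(theta):
    """Counterclockwise rotation through angle theta."""
    return [[math.cos(theta), -math.sin(theta)],
            [math.sin(theta), math.cos(theta)]]

def three_shears(theta):
    """x-shear, then y-shear, then x-shear; undefined when theta is an odd multiple of pi."""
    a = -math.tan(theta / 2)
    b = math.sin(theta)
    shear_x = [[1.0, a], [0.0, 1.0]]
    shear_y = [[1.0, 0.0], [b, 1.0]]
    return matmul(matmul(shear_x, shear_y), shear_x)

def max_difference(theta):
    """Largest entrywise gap between the rotation matrix and the product of shears."""
    R, S = rotation(theta), three_shears(theta)
    return max(abs(R[i][j] - S[i][j]) for i in range(2) for j in range(2))
```

For angles away from odd multiples of π the gap is at the level of floating-point rounding; near θ = π the tangent blows up, matching the restriction noted in part (b).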

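7.2.14. $\begin{pmatrix} 1 & 0 & 0 \\ 0 & \frac12 & -\frac{\sqrt3}{2} \\ 0 & \frac{\sqrt3}{2} & \frac12 \end{pmatrix} = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & \sqrt3 & 1 \end{pmatrix} \begin{pmatrix} 1 & 0 & 0 \\ 0 & \frac12 & 0 \\ 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 2 \end{pmatrix} \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & -\sqrt3 \\ 0 & 0 & 1 \end{pmatrix}$: a shear of magnitude $-\sqrt3$ along the y axis that fixes the xy plane, followed by a scaling in the z direction by a factor of 2, followed by a scaling in the y direction by a factor of $\frac12$, followed by a shear of magnitude $\sqrt3$ along the z axis that fixes the xz plane.

7.2.15. (a) $\begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}$, (b) $\begin{pmatrix} \frac12 & \frac12 \\ \frac12 & \frac12 \end{pmatrix}$, (c) $\begin{pmatrix} \frac{4}{13} & -\frac{6}{13} \\ -\frac{6}{13} & \frac{9}{13} \end{pmatrix}$.

♦ 7.2.16.
(a) If $A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$ has rank 1 then its rows are linearly dependent, and hence either $( a, b ) = \lambda\, ( c, d )$ or c = d = 0. In the first case $A = \begin{pmatrix} \lambda \\ 1 \end{pmatrix} ( c, d )$; in the second, $A = \begin{pmatrix} 1 \\ 0 \end{pmatrix} ( a, b )$. (See Exercise 1.8.15 for the general case.)
(b) When u = v is a unit vector, or, more generally, when $v = u / \| u \|^2$.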

7.2.17.

0

1

B @0

0

0

0 B @1 0

0

0 B @0 1

0

1 B @0 0 0

0 B @0 1

0 1 0 1 0 0 0 1 0 0 0 1 1 0 0

1

0 0C A is the identity transformation; 1 1

0 0C A is a reflection in the plane x = y; 1 1

1 0C A is a reflection in the plane x = z; 0 1

0 1C A is a reflection in the plane y = z; 0 1

0 ◦ 1C A is rotation by 120 around the line x = y = z; 0


0

0

B @1

0

0 0 1

0

1 7.2.18. Xψ = B @0 0

1

1 ◦ 0C A is rotation by 240 around the line x = y = z. 0

0 cos ψ − sin ψ !

1

0 sin ψ C A. cos ψ

0

1

−1 0 0 −1 0 7.2.19. det = +1, representing a 180◦ rotation, while det B 0C @ 0 −1 A = −1, 0 −1 0 0 −1 and so is a reflection — but through the origin, not a plane, since it doesn’t fix any nonzero vectors.

♦ 7.2.20. If v is orthogonal to u, then uT v = 0, and so Qπ v = − v, while since k u k2 = uT u = 1, we have Qπ u = u. Thus, u is fixed, while every vector in the plane orthogonal to it is rotated through an angle π. This suffices to show that Qπ represents the indicated rotation. ♦ 7.2.21. (a) First, w = (u · v) u = (uT v) u = u uT v is the orthogonal projection of v onto the line in the direction of u. So the reflected vector is v − 2 w = ( I − 2 u uT )v. (b) RT R = ( I − 2 u uT )2 = I − 4 u uT + 4 u uT u uT = I because uT u = k u k2 = 1. it is improper because it reverses orientation. 1 1 0 0 1 0 2 24 72 2 151 24 7 − 169 − 31 C 0 − 25 169 169 3 3 C B B 25 C B 2 137 96 C 1 2C C, C. B B − 24 (iii) B (ii) B (c) (i) B 0 1 0C A, @ 169 169 169 A 3 −3 3A @ @ −

24 25

0

7 − 25

72 169

96 169

119 − 169

(d) Because the reflected vector is minus the rotated vector.

−

1 3

2 3

2 3

♦ 7.2.22. (a) In the formula for $R_\theta$, the first factor rotates to align a with the z axis, the second rotates around the z axis by angle θ, while the third factor $Q^T = Q^{-1}$ rotates the z axis back to the line through a. The combined effect is a rotation through angle θ around the axis a. (b) Set $Q = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 0 & -1 \\ 0 & 1 & 0 \end{pmatrix}$ and multiply out to produce $Q^T Z_\theta Q = Y_\theta$.

♥ 7.2.23. (a) (i) $3 + i + 2j$, (ii) $3 - i + j + k$, (iii) $-10 + 2i - 2j - 6k$, (iv) $18$.
(b) $q\,\bar q = (a + b\,i + c\,j + d\,k)(a - b\,i - c\,j - d\,k) = a^2 + b^2 + c^2 + d^2 = \|q\|^2$, since all other terms in the product cancel.
(c) This can be easily checked for all basic products, e.g., $(1\,i)\,j = k = 1\,(i\,j)$, $(i\,j)\,k = -1 = i\,(j\,k)$, $(i\,i)\,j = -j = i\,(i\,j)$, etc. The general case follows by using the distributive property (or by a direct computation).
(d) First note that if $a \in \mathbb{R} \subset \mathbb{H}$, then $a\,q = q\,a$ for any $q \in \mathbb{H}$. Thus, for any $a, b \in \mathbb{R}$, we use the distributive property to compute $L_q[a r + b s] = q\,(a r + b s) = a\,q r + b\,q s = a\,L_q[r] + b\,L_q[s]$, and $R_q[a r + b s] = (a r + b s)\,q = a\,r q + b\,s q = a\,R_q[r] + b\,R_q[s]$. The matrix representatives are
$L_q = \begin{pmatrix} a & -b & -c & -d \\ b & a & -d & c \\ c & d & a & -b \\ d & -c & b & a \end{pmatrix}$, $R_q = \begin{pmatrix} a & -b & -c & -d \\ b & a & d & -c \\ c & -d & a & b \\ d & c & -b & a \end{pmatrix}$.
(e) By direct computation: $L_q^T L_q = (a^2 + b^2 + c^2 + d^2)\, I = R_q^T R_q$, and so $L_q^T L_q = I = R_q^T R_q$ when $\|q\|^2 = a^2 + b^2 + c^2 + d^2 = 1$.
(f) For $q = (b, c, d)^T$, $r = (x, y, z)^T$, we have $q\,r = (b\,i + c\,j + d\,k)(x\,i + y\,j + z\,k) = -(b x + c y + d z) + (c z - d y)\,i + (d x - b z)\,j + (b y - c x)\,k$, while $q \cdot r = b x + c y + d z$ and $q \times r = (c z - d y,\ d x - b z,\ b y - c x)^T$. The associativity law $(q\,r)\,s = q\,(r\,s)$ implies that $(q \times r) \cdot s = q \cdot (r \times s)$, which defines the vector triple product, and the cross product identity $(q \times r) \times s - (q \cdot r)\,s = q \times (r \times s) - (r \cdot s)\,q$.
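The quaternion identities of Exercise 7.2.23 (b), (c) and (f) are easy to spot-check numerically. The following is a minimal sketch; the tuple encoding $(a, b, c, d) = a + b\,i + c\,j + d\,k$ and the sample values are illustrative choices, not taken from the text.

```python
def qmul(p, q):
    """Hamilton product of quaternions stored as (a, b, c, d) = a + b i + c j + d k."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,
            a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,
            a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,
            a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2)

def conj(q):
    """Quaternion conjugate a - b i - c j - d k."""
    a, b, c, d = q
    return (a, -b, -c, -d)

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

q = (1, 2, 3, 4)
norm_sq = qmul(q, conj(q))            # part (b): q times its conjugate

u, v = (1, 2, 3), (4, 5, 6)           # pure quaternions, as in part (f)
product = qmul((0,) + u, (0,) + v)    # equals (-u.v, u x v)

p, r = (2, -1, 0, 3), (0, 1, 1, -2)   # associativity spot check, part (c)
left = qmul(qmul(p, q), r)
right = qmul(p, qmul(q, r))
```

With integer inputs every comparison is exact, so `product` can be compared directly with `(-dot(u, v),) + cross(u, v)`.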

7.2.24. (a) 0

!

1 −2

−4 , (b) 3

−3 7.2.25. (a) B @ 6 1

7.2.26. (a) (b) (c) (d)

(e)

−1 1 1 !

1

−2 6C A, 0 !

!

1

− 43

−6 , (c) 3 0

−1 (b) B @ 0 0 !

0 −2 0

!

−1 2 1

0 0C A, 1

!

0 , (d) 5 (c)

0 B @

−1 0 1 5

0 − 52

!

0 , (e) 5

−3 2

!

−8 . 7

1

− 12 5 C 0 A. − 15

0 −2 0 !

1 2 1 0 1 0 , ; canonical form: ; , and Bases: , 2 1 0 1 0 1 0 1 0 1 0 1 ! ! ! 1 −4 0 1 0 0 2 1 B C B C B C ; ; canonical form: bases: @ 0 A , @ 0 A , @ 4 A, and , 0 0 0 1 −2 0 1 3 0 0 1 0 1 0 1 1 ! ! 2 3 4 1 0 0 1 B C B C B C C , and B , bases: @ 0 A , @ 4 A , @ −5 A; canonical form: @ 0 1 A; 1 0 −1 1 8 1 0 0 0 0 0 1 0 1 0 0 1 0 1 1 1 0 1 1 2 −1 1 0 B C B B C B C B C C B C bases: B @ 0 A , @ 1 A , @ −2 A, and @ 1 A , @ −1 A , @ −1 A; canonical form: @ 0 1 0 0 31 0 2 1 1 0 11 0 0 0 1 1 0 0 1 0 1 0 0 1 −2 −1 0 1 −1 0 −3 1 C C B C B B C B C B C B B C B 0C B0C B 1C B 2C B 1C B 1C B 1C C B 0C C; C, B C, B C, B B C, B C, B C, and B C, B bases: B @0A @1A @ 0A @ 4A @ −1 A @ −1 A @ 1 A @ 0 A 1 0 −1 0 1 0 0 00 1 1 0 0 0 B 0 1 0 0C C B C. canonical form: B @0 0 0 0A 0 0 0 0

1

0 0C A; 0

7.2.27. (a) Let v1 , . . . , vn be any basis for the domain space and choose wi = L[ vi ] for i = 1, . . . , n. Invertibility implies that w1 , . . . , wn are linearly independent, and so form a basis for the target space. (b) Only the identity transformation, since A1=0S I S −11 = I . 0 ! ! ! ! ! ! √1 − √1 C 0 1 0 2 0 1 B 2C B 2 , (c) (i) , and @ 1 A , @ , . (ii) , , and A. 1 0 2 0 1 0 √ √1 (iii)

1 0

!

,

!

0 , and 1

1 −2

!

,

2

!

2

0 . 1

♦ 7.2.28. (a) Given any $v \in \mathbb{R}^n$, write $v = c_1 v_1 + \cdots + c_n v_n$; then $A v = c_1 A v_1 + \cdots + c_n A v_n = c_1 w_1 + \cdots + c_n w_n$, and hence $A v$ is uniquely defined. In particular, the value of $A e_i$ uniquely specifies the $i$th column of $A$. (b) $A = C B^{-1}$, where $B = (v_1\ v_2\ \ldots\ v_n)$, $C = (w_1\ w_2\ \ldots\ w_n)$.

♦ 7.2.29. (a) Let $Q$ have columns $u_1, \ldots, u_n$, so $Q$ is an orthogonal matrix. Then the matrix representative in the orthonormal basis is $B = Q^{-1} A Q = Q^T A Q$, and $B^T = Q^T A^T (Q^T)^T = Q^T A Q = B$. (b) Not necessarily. For example, if $A = \begin{pmatrix} 1 & 0 \\ 0 & 2 \end{pmatrix}$ and $S = \begin{pmatrix} 1 & -1 \\ 1 & 0 \end{pmatrix}$, then $S^{-1} A S = \begin{pmatrix} 2 & 0 \\ 1 & 1 \end{pmatrix}$ is not symmetric.
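The counterexample in Exercise 7.2.29(b) can be reproduced with exact rational arithmetic. This is a small sketch; the helper names `inverse2` and `mat_mul` are hypothetical.

```python
from fractions import Fraction

def mat_mul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def inverse2(m):
    """Inverse of a 2 x 2 matrix via the adjugate formula."""
    (a, b), (c, d) = m
    det = Fraction(a * d - b * c)
    return [[d / det, -b / det], [-c / det, a / det]]

A = [[1, 0], [0, 2]]     # symmetric matrix
S = [[1, -1], [1, 0]]    # non-orthogonal change of basis

B = mat_mul(mat_mul(inverse2(S), A), S)   # S^{-1} A S
is_symmetric = B[0][1] == B[1][0]
```

Here `B` comes out as $\begin{pmatrix} 2 & 0 \\ 1 & 1 \end{pmatrix}$, so `is_symmetric` is false: symmetry survives a change of basis only when the basis is orthonormal, as in part (a).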

7.2.30. (a) Write h x , y i = xT K y where K > 0. Using the Cholesky factorization (3.70), write K = M M T where M is invertible. Let M −T = ( v1 v2 . . . vn ) define the basis. Then n X

x=

i=1

ci vi = M −T c, T

T

y= −1

n X

i=1 −T

T

di vi = M −T d,

implies that h x , y i = x K y = c M M M M d = cT d = c · d. The basis is not unique since one can right multiply by any orthogonal matrix Q to another one: 1 0 produce ! ! √ ! √1 2 , √0 ; (ii) 1 , B 3C e v e e M −T Q = ( v @ 2 A. 1 2 . . . vn ). (b) (i) 3 0 √ 0 3

7.3.1.
(a) (i) The horizontal line $y = -1$; (ii) the disk $(x - 2)^2 + (y + 1)^2 \le 1$ of radius 1 centered at $(2, -1)^T$; (iii) the square $\{ 2 \le x \le 3,\ -1 \le y \le 0 \}$.
(b) (i) The $x$-axis; (ii) the ellipse $\frac{1}{9}(x + 1)^2 + \frac{1}{4} y^2 \le 1$; (iii) the rectangle $\{ -1 \le x \le 2,\ 0 \le y \le 2 \}$.
(c) (i) The horizontal line $y = 2$; (ii) the elliptical domain $x^2 - 4xy + 5y^2 + 6x - 16y + 12 \le 0$; (iii) the parallelogram with vertices $(1, 2)^T$, $(2, 2)^T$, $(4, 3)^T$, $(3, 3)^T$.
(d) (i) The line $x = 1$; (ii) the disk $(x - 1)^2 + y^2 \le 1$ of radius 1 centered at $(1, 0)^T$; (iii) the square $\{ 1 \le x \le 2,\ -1 \le y \le 0 \}$.
(e) (i) The line $4x + 3y + 6 = 0$; (ii) the disk $(x + 3)^2 + (y - 2)^2 \le 1$ of radius 1 centered at $(-3, 2)^T$; (iii) the rotated square with corners $(-3, 2)$, $(-2.4, 1.2)$, $(-1.6, 1.8)$, $(-2.2, 2.6)$.
(f) (i) The line $y = x - 1$; (ii) the line segment from $\left( 1 - \frac{1}{\sqrt2},\ -\frac{1}{\sqrt2} \right)^T$ to $\left( 1 + \frac{1}{\sqrt2},\ \frac{1}{\sqrt2} \right)^T$; (iii) the line segment from $(1, 0)^T$ to $(2, 1)^T$.
(g) (i) The line $x + y + 1 = 0$; (ii) the disk $(x - 2)^2 + (y + 3)^2 \le 2$ of radius $\sqrt2$ centered at $(2, -3)^T$; (iii) the rotated square with corners $(2, -3)$, $(3, -4)$, $(4, -3)$, $(3, -2)$.
(h) (i) The line $x + y = 2$; (ii) the line segment from $(1 - \sqrt5,\ 1 + \sqrt5)^T$ to $(1 + \sqrt5,\ 1 - \sqrt5)^T$; (iii) the line segment from $(1, 1)^T$ to $(4, -2)^T$.

7.3.2.

−2 −1

(a) T3 ◦ T4 [ x ] =

1 0

!

x+

!

2 , 2

!

!

0 1 2 −2 1 = −1 0 1 −1 0 ! ! 0 1 3 , (b) T4 ◦ T3 [ x ] = x+ −1 −2 −1 with

with

0 −1

0

(c) T3 ◦ T6 [ x ] = @

3 2 1 2

1 −2 1

!

=

3 2 Ax + 1 2

0 −1 !

1 0

!

1 0

1 0

2 , 2


!

,

2 1

!

,

=

3 −1

!

1 0

=

2 1

+

1 2

!

1 ; 2 !

+

!

1 ; 0

with (d) T6 ◦ T3 [ x ] with

0

3 @2 1 2 0 1 = @ 12 2 0 1 @2 1 2

0 −4

(e) T7 ◦ T8 [ x ] =

1 3 2A= 1 1 0 2 0 1 3 2 Ax + @ 3 2 0 1 1 3 2A=@2 3 1 2 2 !

0 −2

!0 1 @2 1 2 1

x+

!

5 2 A, 3 2 1 1 2A 1 2 !

1 0

2 , 2

0 0 1 1 with = −4 −2 −1 1 ! ! 2 1 3 , x+ (f ) T8 ◦ T7 [ x ] = 0 −1 −3 1 −1

with

3 −3

!

2 −2

=

1 1 2 A, 1 2

2 1

2 1

!

1 −1

!

2 −2

!

2 2

0

,

1 1

!

!

,

,

1 0

=

1 5 @2A 3 2

1 −1

1 −1

!

2 1

0

1 @2 1 2

=

4 −3 2 0

!

!

!

1 0

1 1 2A 1 2

=

=

!

+

1 2

!

1 −1 2 −2

!

1 ; 2

+

1 1

1 −1

!

!

!

4 ; −3 !

!

1 2 + ; 1 −3 2 −3

!

+

!

1 . 1

7.3.3. (a) True. (b) True. (c) False: in general, squares are mapped to parallelograms. (d) False: in general, circles are mapped to ellipses. (e) True.

7.3.4. The triangle with vertices $(-1, -6)$, $(7, -2)$, $(1, 6)$.

7.3.5. (a) If and only if their matrices are mutual inverses: $B = A^{-1}$. (b) If and only if $c = B a + b = 0$, as in (7.34), so $b = -B a$.

7.3.6. (a) $F[x] = A x + b$ has an inverse if and only if $A$ is nonsingular. (b) Yes: $F^{-1}[x] = A^{-1} x - A^{-1} b$.
! ! ! ! ! ! 1 1 0 x −2 x x −1 −1 x 3 + 3 , + = (c) T1 = , T2 1 y y 1 y y 0 0 ! 2 ! ! ! ! ! ! 3 x 0 −1 x 1 −2 −1 x −1 x T3 = + , T4 = + −2 y 1 0 y y y 0 1 T5−1

x y

T7−1

x y

!

!

= =

0

.6 .8

1 @2 1 2

−.8 .6

− 12 1 2

!

1 A

x y

x y

!

!

+ 0

+@

!

3.4 , 1.2 −

1 5 2 A, 1 2

!

0 , −1

T6 has no inverse, T8 has no inverse.

♦ 7.3.7. (a) First, $b = w_0 = F[0]$, while $A v_i = F[v_i] - b = w_i - w_0$ for $i = 1, \ldots, n$. Therefore, knowing its action on the basis vectors uniquely prescribes the matrix $A$. (b) $A = (w_1 - w_0\ \ w_2 - w_0\ \ldots\ w_n - w_0)$ and $b = w_0$. (c) $A = C B^{-1}$, where $B = (v_1\ v_2\ \ldots\ v_n)$, $C = (w_1 - w_0\ \ldots\ w_n - w_0)$, while $b = w_0$.

7.3.8. It can be regarded as a subspace of the vector space of all functions from $\mathbb{R}^n$ to $\mathbb{R}^n$, and so one only needs to prove closure. If $F[x] = A x + b$ and $G[x] = C x + d$, then $(F + G)[x] = (A + C)\,x + (b + d)$ and $(c F)[x] = (c A)\,x + (c\, b)$ are affine for all scalars $c$. The dimension is $n^2 + n$; a basis consists of the $n^2$ linear functions $L_{ij}[x] = E_{ij}\, x$, where $E_{ij}$ is the $n \times n$ matrix with a single 1 in the $(i, j)$ entry and zeros everywhere else, along with the $n$ translations $T_i[x] = x + e_i$, where $e_i$ is the $i$th standard basis vector.


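♦ 7.3.9. (a) $\begin{pmatrix} A & b \\ 0 & 1 \end{pmatrix} \begin{pmatrix} x \\ 1 \end{pmatrix} = \begin{pmatrix} A x + b \\ 1 \end{pmatrix}$, (b) $\begin{pmatrix} B & b \\ 0 & 1 \end{pmatrix} \begin{pmatrix} A & a \\ 0 & 1 \end{pmatrix} = \begin{pmatrix} B A & B a + b \\ 0 & 1 \end{pmatrix}$.
(c) The inverse of $F[x] = A x + b$ is $F^{-1}[y] = A^{-1}(y - b) = A^{-1} y - A^{-1} b$. The inverse matrix is $\begin{pmatrix} A & b \\ 0 & 1 \end{pmatrix}^{-1} = \begin{pmatrix} A^{-1} & -A^{-1} b \\ 0 & 1 \end{pmatrix}$.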
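The block-matrix identities of Exercise 7.3.9 can be illustrated with a small exact computation. The sketch below uses made-up $2 \times 2$ data; the matrices `A`, `B`, the vectors `a`, `b`, and the helper names are assumptions chosen for illustration only.

```python
from fractions import Fraction

def augment(A, b):
    """Build the (n+1) x (n+1) matrix [[A, b], [0, 1]] encoding F[x] = A x + b."""
    n = len(A)
    rows = [list(A[i]) + [b[i]] for i in range(n)]
    rows.append([0] * n + [1])
    return rows

def mat_mul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def mat_vec(A, v):
    return [sum(p * q for p, q in zip(row, v)) for row in A]

def inverse2(A):
    """Inverse of a 2 x 2 matrix via the adjugate formula."""
    (p, q), (r, s) = A
    det = Fraction(p * s - q * r)
    return [[s / det, -q / det], [-r / det, p / det]]

A, a = [[2, 1], [1, 1]], [1, -1]   # F[x] = A x + a (illustrative data)
B, b = [[0, -1], [1, 0]], [3, 2]   # G[x] = B x + b (illustrative data)

# Part (b): the product of the augmented matrices encodes G o F.
GF = mat_mul(augment(B, b), augment(A, a))

# Part (c): the inverse map has matrix A^{-1} and vector -A^{-1} a.
A_inv = inverse2(A)
F_inv = augment(A_inv, [-c for c in mat_vec(A_inv, a)])
identity = mat_mul(F_inv, augment(A, a))
```

The top-right block of `GF` is $B a + b = (4, 3)^T$, matching part (b), and `identity` is the $3 \times 3$ identity matrix, matching part (c).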

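7.3.10. (a), (b), (e) are isometries.

7.3.11. (a) True. (b) True if $n$ is even; false if $n$ is odd, since $\det(-I_n) = (-1)^n$.

7.3.12. Write $y = F[x] = Q(x - a) + a$, where $Q = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}$ represents a rotation through an angle of $90^\circ$, and $a = \left( \frac32, -\frac12 \right)^T$. Thus, the vector $y - a = Q(x - a)$ is obtained by rotating the vector $x - a$ by $90^\circ$, and so the point $y$ is obtained by rotating $x$ by $90^\circ$ around the point $a$.

♦ 7.3.13. If $Q = I$, then $F[x] = x + a$ is a translation. Otherwise, since we are working in $\mathbb{R}^2$, by Exercise 1.5.7(c), the matrix $I - Q$ is invertible. Setting $c = (I - Q)^{-1} a$, so that $c = Q c + a$ is the fixed point of $F$, we rewrite $y = F[x]$ as $y - c = Q(x - c)$, so the vector $y - c$ is obtained by rotating $x - c$ according to $Q$. We conclude that $F$ represents a rotation around the point $c$.

7.3.14.
(a) $F[x] = \begin{pmatrix} \frac{1}{\sqrt2} & -\frac{1}{\sqrt2} \\ \frac{1}{\sqrt2} & \frac{1}{\sqrt2} \end{pmatrix} x$; $G[x] = x + \begin{pmatrix} 1 \\ 0 \end{pmatrix}$;
$F \circ G[x] = \begin{pmatrix} \frac{1}{\sqrt2} & -\frac{1}{\sqrt2} \\ \frac{1}{\sqrt2} & \frac{1}{\sqrt2} \end{pmatrix} x + \begin{pmatrix} \frac{1}{\sqrt2} \\ \frac{1}{\sqrt2} \end{pmatrix} = \begin{pmatrix} \frac{1}{\sqrt2} & -\frac{1}{\sqrt2} \\ \frac{1}{\sqrt2} & \frac{1}{\sqrt2} \end{pmatrix} \left[ x - \begin{pmatrix} -\frac12 \\ \frac{\sqrt2 + 1}{2} \end{pmatrix} \right] + \begin{pmatrix} -\frac12 \\ \frac{\sqrt2 + 1}{2} \end{pmatrix}$ is counterclockwise rotation around the point $\left( -\frac12, \frac{\sqrt2 + 1}{2} \right)^T$ by $45^\circ$;
$G \circ F[x] = \begin{pmatrix} \frac{1}{\sqrt2} & -\frac{1}{\sqrt2} \\ \frac{1}{\sqrt2} & \frac{1}{\sqrt2} \end{pmatrix} x + \begin{pmatrix} 1 \\ 0 \end{pmatrix} = \begin{pmatrix} \frac{1}{\sqrt2} & -\frac{1}{\sqrt2} \\ \frac{1}{\sqrt2} & \frac{1}{\sqrt2} \end{pmatrix} \left[ x - \begin{pmatrix} \frac12 \\ \frac{\sqrt2 + 1}{2} \end{pmatrix} \right] + \begin{pmatrix} \frac12 \\ \frac{\sqrt2 + 1}{2} \end{pmatrix}$ is counterclockwise rotation around the point $\left( \frac12, \frac{\sqrt2 + 1}{2} \right)^T$ by $45^\circ$.
(b) $F[x] = \begin{pmatrix} \frac{\sqrt3}{2} & -\frac12 \\ \frac12 & \frac{\sqrt3}{2} \end{pmatrix} \left[ x - \begin{pmatrix} 1 \\ 1 \end{pmatrix} \right] + \begin{pmatrix} 1 \\ 1 \end{pmatrix} = \begin{pmatrix} \frac{\sqrt3}{2} & -\frac12 \\ \frac12 & \frac{\sqrt3}{2} \end{pmatrix} x + \begin{pmatrix} \frac{3 - \sqrt3}{2} \\ \frac{1 - \sqrt3}{2} \end{pmatrix}$;
$G[x] = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix} \left[ x - \begin{pmatrix} -2 \\ 1 \end{pmatrix} \right] + \begin{pmatrix} -2 \\ 1 \end{pmatrix} = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix} x + \begin{pmatrix} -1 \\ 3 \end{pmatrix}$;
$F \circ G[x] = \begin{pmatrix} -\frac12 & -\frac{\sqrt3}{2} \\ \frac{\sqrt3}{2} & -\frac12 \end{pmatrix} x + \begin{pmatrix} -\sqrt3 \\ \sqrt3 \end{pmatrix} = \begin{pmatrix} -\frac12 & -\frac{\sqrt3}{2} \\ \frac{\sqrt3}{2} & -\frac12 \end{pmatrix} \left[ x - \begin{pmatrix} \frac{-1 - \sqrt3}{2} \\ \frac{-1 + \sqrt3}{2} \end{pmatrix} \right] + \begin{pmatrix} \frac{-1 - \sqrt3}{2} \\ \frac{-1 + \sqrt3}{2} \end{pmatrix}$ is counterclockwise rotation around the point $\left( \frac{-1 - \sqrt3}{2}, \frac{-1 + \sqrt3}{2} \right)^T$ by $120^\circ$;
$G \circ F[x] = \begin{pmatrix} -\frac12 & -\frac{\sqrt3}{2} \\ \frac{\sqrt3}{2} & -\frac12 \end{pmatrix} x + \begin{pmatrix} \frac{-3 + \sqrt3}{2} \\ \frac{9 - \sqrt3}{2} \end{pmatrix} = \begin{pmatrix} -\frac12 & -\frac{\sqrt3}{2} \\ \frac{\sqrt3}{2} & -\frac12 \end{pmatrix} \left[ x - \begin{pmatrix} \frac{-1 - \sqrt3}{2} \\ \frac{5 - \sqrt3}{2} \end{pmatrix} \right] + \begin{pmatrix} \frac{-1 - \sqrt3}{2} \\ \frac{5 - \sqrt3}{2} \end{pmatrix}$ is counterclockwise rotation around the point $\left( \frac{-1 - \sqrt3}{2}, \frac{5 - \sqrt3}{2} \right)^T$ by $120^\circ$.
(c) $F[x] = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} x + \begin{pmatrix} -1 \\ 1 \end{pmatrix}$; $G[x] = \begin{pmatrix} -1 & 0 \\ 0 & -1 \end{pmatrix} \left[ x - \begin{pmatrix} 1 \\ 1 \end{pmatrix} \right] + \begin{pmatrix} 1 \\ 1 \end{pmatrix} = \begin{pmatrix} -1 & 0 \\ 0 & -1 \end{pmatrix} x + \begin{pmatrix} 2 \\ 2 \end{pmatrix}$;
$F \circ G[x] = \begin{pmatrix} 0 & -1 \\ -1 & 0 \end{pmatrix} x + \begin{pmatrix} 1 \\ 3 \end{pmatrix}$ is a glide reflection (see Exercise 7.3.16) along the line $x + y = 2$ by a distance $\sqrt2$, in the direction $(-1, 1)^T$;
$G \circ F[x] = \begin{pmatrix} 0 & -1 \\ -1 & 0 \end{pmatrix} x + \begin{pmatrix} 3 \\ 1 \end{pmatrix}$ is a glide reflection (see Exercise 7.3.16) along the same line $x + y = 2$ by a distance $\sqrt2$, in the opposite direction $(1, -1)^T$. (The linear part $\begin{pmatrix} 0 & -1 \\ -1 & 0 \end{pmatrix}$ fixes the direction $(1, -1)^T$, so the glide axis has slope $-1$.)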
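The centers of rotation quoted in Exercise 7.3.14(a) can be recovered from the fixed-point equation $c = Q c + b$ of Exercise 7.3.13, i.e. $(I - Q)\,c = b$. The sketch below solves this $2 \times 2$ system in floating point; the function names are hypothetical.

```python
import math

def rotation(theta):
    """Counterclockwise rotation matrix through the angle theta."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]

def mat_vec(Q, v):
    return (Q[0][0] * v[0] + Q[0][1] * v[1],
            Q[1][0] * v[0] + Q[1][1] * v[1])

def rotation_center(Q, b):
    """Fixed point c of F[x] = Q x + b, found by solving (I - Q) c = b (needs Q != I)."""
    m00, m01 = 1 - Q[0][0], -Q[0][1]
    m10, m11 = -Q[1][0], 1 - Q[1][1]
    det = m00 * m11 - m01 * m10
    return ((m11 * b[0] - m01 * b[1]) / det,
            (m00 * b[1] - m10 * b[0]) / det)

Q = rotation(math.pi / 4)   # F = rotation by 45 degrees about the origin
shift = (1.0, 0.0)          # G = translation by one unit in the x direction

center_FG = rotation_center(Q, mat_vec(Q, shift))   # F o G[x] = Q x + Q e1
center_GF = rotation_center(Q, shift)               # G o F[x] = Q x + e1
```

Up to rounding, `center_FG` is $\left(-\frac12, \frac{\sqrt2+1}{2}\right)$ and `center_GF` is $\left(\frac12, \frac{\sqrt2+1}{2}\right)$, so the two compositions are rotations through the same angle about different points.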

♥ 7.3.15. (a) If $F[x] = Q x + a$ and $G[x] = R x + b$, then $G \circ F[x] = R Q x + (R a + b) = S x + c$ is an isometry since $S = R Q$, the product of two orthogonal matrices, is also an orthogonal matrix.
(b) If $F[x] = x + a$ and $G[x] = x + b$, then $G \circ F[x] = x + (a + b) = x + c$.
(c) Using Exercise 7.3.13, the rotation $F[x] = Q x + a$ has $Q \ne I$, while $G[x] = x + b$ is the translation. Then $G \circ F[x] = Q x + (a + b) = Q x + c$, and $F \circ G[x] = Q x + (a + Q b) = Q x + \tilde c$ are both rotations.
(d) From part (a), $G \circ F[x] = R Q x + (R a + b) = S x + c$ is a translation if and only if $R = Q^{-1}$, i.e., the two rotations are by the opposite angle.
(e) Write $x + c = G \circ F[x]$ where $F[x] = Q x$ and $G[x] = Q^{-1} x + c$ for any $Q \ne I$.

♦ 7.3.16.
(a) $\begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} x + \begin{pmatrix} 2 \\ 0 \end{pmatrix} = \begin{pmatrix} x + 2 \\ -y \end{pmatrix}$,
(b) $\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} x + \begin{pmatrix} \frac{3}{\sqrt2} \\ \frac{3}{\sqrt2} \end{pmatrix} = \begin{pmatrix} y + \frac{3}{\sqrt2} \\ x + \frac{3}{\sqrt2} \end{pmatrix}$,
(c) $\begin{pmatrix} 0 & -1 \\ -1 & 0 \end{pmatrix} \left( x - \begin{pmatrix} 1 \\ 0 \end{pmatrix} \right) + \begin{pmatrix} 1 \\ 0 \end{pmatrix} + \begin{pmatrix} \sqrt2 \\ -\sqrt2 \end{pmatrix} = \begin{pmatrix} -y + 1 + \sqrt2 \\ -x + 1 - \sqrt2 \end{pmatrix}$.

♦ 7.3.17. (a) $F[x] = R(x - a) + a$, where $R = 2\,u u^T - I$ is the elementary reflection matrix corresponding to the line in the direction of $u$ through the origin. (b) $G[x] = R(x - a) + a + d\,u$, where $R = 2\,u u^T - I$ is the same reflection matrix. (c) Let $F[x] = R x + b$ with $R$ an improper orthogonal matrix. According to (5.31) and Exercise 7.2.10, $R$ represents a reflection through a line $L = \ker(I - R)$ that goes through the origin. The affine map is a reflection through a line $\ell$ parallel to $L$ and passing through a point $a$ if it takes the form $F[x] = R(x - a) + a$, which requires $b \in \mathrm{rng}\,(I - R) = L^\perp$. Otherwise, we decompose $b = c + e$ with $c = (I - R)\,a \in L^\perp$ and $0 \ne e \in L = \ker(I - R)$. Then the affine map takes the form $F[x] = R(x - a) + a + e$, which is a glide reflection along the line $\ell$ by a distance $\|e\|$.

♥ 7.3.18. (a) Let $A = \{ x + b \mid x \in W \} \subsetneq \mathbb{R}^n$ be the affine subspace. If $a_i = x_i + b \in A$ for all $i$, then the differences $a_i - a_j = x_i - x_j \in W$ all belong to a proper subspace of $\mathbb{R}^n$ and so they cannot span $\mathbb{R}^n$. Conversely, let $W$ be the span of all $a_i - a_j$. Then we can write $a_i = x_i + b$ where $x_i = a_i - a_1 \in W$ and $b = a_1$, and so all $a_i$ belong to the affine subspace $A$.
(b) Let $v_i = a_i - a_0$, $w_i = b_i - b_0$, for $i = 1, \ldots, n$. Then, by the assumption, $v_i \cdot v_i = \|v_i\|^2 = \|a_i - a_0\|^2 = \|b_i - b_0\|^2 = \|w_i\|^2 = w_i \cdot w_i$ for all $i = 1, \ldots, n$, while $\|v_i\|^2 - 2\,v_i \cdot v_j + \|v_j\|^2 = \|v_i - v_j\|^2 = \|a_i - a_j\|^2 = \|b_i - b_j\|^2 = \|w_i - w_j\|^2 = \|w_i\|^2 - 2\,w_i \cdot w_j + \|w_j\|^2$, and hence $v_i \cdot v_j = w_i \cdot w_j$ for all $i \ne j$. Thus, we have verified the hypotheses of Exercise 5.3.20, and so there is an orthogonal matrix $Q$ such that $w_i = Q v_i$ for $i = 1, \ldots, n$. Therefore, $b_i = w_i + b_0 = Q v_i + b_0 = Q a_i + (b_0 - Q a_0) = F[a_i]$, where $F[x] = Q x + c$ with $c = b_0 - Q a_0$ is the desired isometry.

♦ 7.3.19. First,
$\|v + w\|^2 = \|v\|^2 + 2\,\langle v, w \rangle + \|w\|^2$,
$\|L[v + w]\|^2 = \|L[v] + L[w]\|^2 = \|L[v]\|^2 + 2\,\langle L[v], L[w] \rangle + \|L[w]\|^2$.
If $L$ is an isometry, $\|L[v + w]\| = \|v + w\|$, $\|L[v]\| = \|v\|$, $\|L[w]\| = \|w\|$. Thus, equating the previous two formulas, we conclude that $\langle L[v], L[w] \rangle = \langle v, w \rangle$.

♦ 7.3.20. First, if $L$ is an isometry and $\|u\| = 1$, then $\|L[u]\| = 1$, proving that $L[u] \in S_1$. Conversely, if $L$ preserves the unit sphere, and $0 \ne v \in V$, then $u = v/r \in S_1$ where $r = \|v\|$, so $\|L[v]\| = \|L[r\,u]\| = \|r\,L[u]\| = r\,\|L[u]\| = r = \|v\|$, proving (7.37).

7.3.21. (a) All affine transformations $F[x] = Q x + b$ where $b$ is arbitrary and $Q$ is a symmetry of the unit square, and so a rotation by 0, 90, 180 or 270 degrees, or a reflection in the $x$ axis, the $y$ axis, the line $y = x$ or the line $y = -x$. (b) Same form, but where $Q$ is one of 48 symmetries of the unit cube, consisting of 24 rotations and 24 reflections. The rotations are the identity, the 9 rotations by 90, 180 or 270 degrees around the coordinate axes, the 6 rotations by 180 degrees around the 6 lines through opposite edges, e.g., $x = \pm y$, $z = 0$, and the 8 rotations by 120 or 240 degrees around the 4 diagonal lines $x = \pm y = \pm z$. The reflections are obtained by multiplying each of the rotations by $-I$, and represent either a reflection through the origin, in the case of $-I$, or a reflection through the plane orthogonal to the axis of the rotation in the other cases.

7.3.22. Same answer as previous exercise. Now the transformation must preserve the unit diamond/octahedron, which has the same (linear) symmetries as the unit square/cube.

♥ 7.3.23.
(a) $q(H x) = (x \cosh\alpha + y \sinh\alpha)^2 - (x \sinh\alpha + y \cosh\alpha)^2 = (\cosh^2\alpha - \sinh^2\alpha)(x^2 - y^2) = x^2 - y^2 = q(x)$.
(b) $(a x + b y + e)^2 - (c x + d y + f)^2 = x^2 - y^2$ if and only if $a^2 - c^2 = 1$, $d^2 - b^2 = 1$, $a b = c d$, $e = f = 0$. Thus, $a = \pm\cosh\alpha$, $c = \sinh\alpha$, $d = \pm\cosh\beta$, $b = \sinh\beta$, and $\sinh(\alpha - \beta) = 0$, and so $\alpha = \beta$. Thus, the complete collection of linear (and affine) transformations preserving $q(x)$ is
$\begin{pmatrix} \cosh\alpha & \sinh\alpha \\ \sinh\alpha & \cosh\alpha \end{pmatrix}$, $\begin{pmatrix} \cosh\alpha & \sinh\alpha \\ -\sinh\alpha & -\cosh\alpha \end{pmatrix}$, $\begin{pmatrix} -\cosh\alpha & -\sinh\alpha \\ \sinh\alpha & \cosh\alpha \end{pmatrix}$, $\begin{pmatrix} -\cosh\alpha & -\sinh\alpha \\ -\sinh\alpha & -\cosh\alpha \end{pmatrix}$.

♥ 7.3.24. (a) $q = \left( \dfrac{x}{1 - y},\ 0 \right)^T$, (b) $q = \left( \dfrac{y}{1 + y - x},\ \dfrac{y}{1 + y - x} \right)^T$. The maps are nonlinear, hence not affine; they are not isometries because distance between points is not preserved.

7.4.1. (a) $L(x) = 3x$; domain $\mathbb{R}$; target $\mathbb{R}$; right hand side $-5$; inhomogeneous.

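(b) $L(x, y, z) = x - y - z$; domain $\mathbb{R}^3$; target $\mathbb{R}$; right hand side 0; homogeneous.
(c) $L(u, v, w) = \begin{pmatrix} u - 2v \\ v - w \end{pmatrix}$; domain $\mathbb{R}^3$; target $\mathbb{R}^2$; right hand side $\begin{pmatrix} -3 \\ -1 \end{pmatrix}$; inhomogeneous.
(d) $L(p, q) = \begin{pmatrix} 3p - 2q \\ p + q \end{pmatrix}$; domain $\mathbb{R}^2$; target $\mathbb{R}^2$; right hand side $\begin{pmatrix} 0 \\ 0 \end{pmatrix}$; homogeneous.
(e) $L[u] = u'(x) + 3x\,u(x)$; domain $C^1(\mathbb{R})$; target $C^0(\mathbb{R})$; right hand side 0; homogeneous.
(f) $L[u] = u'(x)$; domain $C^1(\mathbb{R})$; target $C^0(\mathbb{R})$; right hand side $-3x$; inhomogeneous.
(g) $L[u] = \begin{pmatrix} u'(x) - u(x) \\ u(0) \end{pmatrix}$; domain $C^1(\mathbb{R})$; target $C^0(\mathbb{R}) \times \mathbb{R}$; right hand side $\begin{pmatrix} 0 \\ 1 \end{pmatrix}$; inhomogeneous.
(h) $L[u] = \begin{pmatrix} u''(x) - u(x) \\ u(0) - 3\,u(1) \end{pmatrix}$; domain $C^2(\mathbb{R})$; target $C^0(\mathbb{R}) \times \mathbb{R}$; right hand side $\begin{pmatrix} e^x \\ 0 \end{pmatrix}$; inhomogeneous.
(i) $L[u] = \begin{pmatrix} u''(x) + x^2 u(x) \\ u(0) \\ u'(0) \end{pmatrix}$; domain $C^2(\mathbb{R})$; target $C^0(\mathbb{R}) \times \mathbb{R}^2$; right hand side $\begin{pmatrix} 3x \\ 1 \\ 0 \end{pmatrix}$; inhomogeneous.
(j) $L[u, v] = \begin{pmatrix} u'(x) - v(x) \\ -2\,u(x) + v'(x) \end{pmatrix}$; domain $C^1(\mathbb{R}) \times C^1(\mathbb{R})$; target $C^0(\mathbb{R}) \times C^0(\mathbb{R})$; right hand side $\begin{pmatrix} 0 \\ 0 \end{pmatrix}$; homogeneous.
(k) $L[u, v] = \begin{pmatrix} u''(x) - v''(x) - 2\,u(x) + v(x) \\ u(0) - v(0) \\ u(1) - v(1) \end{pmatrix}$; domain $C^2(\mathbb{R}) \times C^2(\mathbb{R})$; target $C^0(\mathbb{R}) \times \mathbb{R}^2$; right hand side $\begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix}$; homogeneous.
(l) $L[u] = u(x) + 3 \int_0^x u(y)\,dy$; domain $C^0(\mathbb{R})$; target $C^0(\mathbb{R})$; right hand side the constant function 1; inhomogeneous.
(m) $L[u] = \int_0^\infty u(t)\,e^{-s t}\,dt$; domain $C^0(\mathbb{R})$; target $C^0(\mathbb{R})$; right hand side $1 + s^2$; inhomogeneous.
(n) $L[u] = \int_0^1 u(x)\,dx - u\!\left( \tfrac12 \right)$; domain $C^0(\mathbb{R})$; target $\mathbb{R}$; right hand side 0; homogeneous.
(o) $L[u, v] = \int_0^1 u(y)\,dy - \int_0^1 y\,v(y)\,dy$; domain $C^0(\mathbb{R}) \times C^0(\mathbb{R})$; target $\mathbb{R}$; right hand side 0; homogeneous.
(p) $L[u] = \dfrac{\partial u}{\partial t} + 2\,\dfrac{\partial u}{\partial x}$; domain $C^1(\mathbb{R}^2)$; target $C^0(\mathbb{R}^2)$; right hand side the constant function 1; inhomogeneous.
(q) $L[u, v] = \begin{pmatrix} \partial u/\partial x - \partial v/\partial y \\ \partial u/\partial y + \partial v/\partial x \end{pmatrix}$; domain $C^1(\mathbb{R}^2) \times C^1(\mathbb{R}^2)$; target $C^0(\mathbb{R}^2)$; right hand side the constant vector-valued function 0; homogeneous.
(r) $L[u] = -\dfrac{\partial^2 u}{\partial x^2} - \dfrac{\partial^2 u}{\partial y^2}$; domain $C^2(\mathbb{R}^2)$; target $C^0(\mathbb{R}^2)$; right hand side $x^2 + y^2 - 1$; inhomogeneous.

7.4.2. $L[u] = u(x) + \int_a^b K(x, y)\,u(y)\,dy$. The domain is $C^0(\mathbb{R})$ and, since $L[u]$ is again a function of $x$, the target is $C^0(\mathbb{R})$. To show linearity, for constants $c, d$,
$L[c\,u + d\,v] = [\,c\,u(x) + d\,v(x)\,] + \int_a^b K(x, y)\,[\,c\,u(y) + d\,v(y)\,]\,dy$
$= c \left( u(x) + \int_a^b K(x, y)\,u(y)\,dy \right) + d \left( v(x) + \int_a^b K(x, y)\,v(y)\,dy \right) = c\,L[u] + d\,L[v]$.

7.4.3. $L[u] = u(t) + \int_a^t K(t, s)\,u(s)\,ds$. The domain is $C^0(\mathbb{R})$ and the target is $C^0(\mathbb{R})$. To show linearity, for any constants $c, d$,
$L[c\,u + d\,v] = [\,c\,u(t) + d\,v(t)\,] + \int_a^t K(t, s)\,[\,c\,u(s) + d\,v(s)\,]\,ds$
$= c \left( u(t) + \int_a^t K(t, s)\,u(s)\,ds \right) + d \left( v(t) + \int_a^t K(t, s)\,v(s)\,ds \right) = c\,L[u] + d\,L[v]$.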

7.4.4. (a) Since $a$ is constant, by the Fundamental Theorem of Calculus, $\dfrac{du}{dt} = \dfrac{d}{dt} \left( a + \int_0^t k(s)\,u(s)\,ds \right) = k(t)\,u(t)$. Moreover, $u(0) = a + \int_0^0 k(s)\,u(s)\,ds = a$.
(b) (i) $u(t) = 2\,e^{-t}$, (ii) $u(t) = e^{t^2 - 1}$, (iii) $u(t) = 3\,e^{e^t - 1}$.

7.4.5. True, since the equation can be written as $L[x] + b = c$, or $L[x] = c - b$.
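The answer to Exercise 7.4.4(b)(iii) can be checked against the integral equation numerically. The sketch below assumes the equation is $u(t) = 3 + \int_0^t e^s u(s)\,ds$, which is inferred from the printed answer via part (a) ($k = u'/u = e^t$, $a = u(0) = 3$) rather than quoted from the exercise; the Simpson helper is a generic quadrature rule, not something used in the text.

```python
import math

def u(t):
    """Candidate solution from part (b)(iii): u(t) = 3 exp(e^t - 1)."""
    return 3.0 * math.exp(math.exp(t) - 1.0)

def simpson(f, a, b, n=200):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def residual(t):
    """u(t) - (3 + integral_0^t e^s u(s) ds); should vanish up to quadrature error.

    The kernel k(s) = e^s and the constant a = 3 are assumptions inferred
    from the printed answer.
    """
    integral = simpson(lambda s: math.exp(s) * u(s), 0.0, t)
    return u(t) - (3.0 + integral)
```

Evaluating `residual(1.0)` gives a value at the level of the quadrature error, consistent with the equivalent initial value problem $u' = e^t u$, $u(0) = 3$.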

7.4.6.
(a) $u(x) = c_1 e^{2x} + c_2 e^{-2x}$, dim = 2;
(b) $u(x) = c_1 e^{4x} + c_2 e^{2x}$, dim = 2;
(c) $u(x) = c_1 + c_2 e^{3x} + c_3 e^{-3x}$, dim = 3;
(d) $u(x) = c_1 e^{-3x} + c_2 e^{-2x} + c_3 e^{-x} + c_4 e^{2x}$, dim = 4.

7.4.7. (a) If $y \in C^2[a, b]$, then $y'' \in C^0[a, b]$ and so $L[y] = y'' + y \in C^0[a, b]$. Further, $L[c\,y + d\,z] = (c\,y + d\,z)'' + (c\,y + d\,z) = c\,(y'' + y) + d\,(z'' + z) = c\,L[y] + d\,L[z]$. (b) $\ker L$ is the span of the basic solutions $\cos x$, $\sin x$.

7.4.8. (a) If $y \in C^2[a, b]$, then $y' \in C^1[a, b]$, $y'' \in C^0[a, b]$ and so $L[y] = 3y'' - 2y' - 5y \in C^0[a, b]$. Further, $L[c\,y + d\,z] = 3\,(c\,y + d\,z)'' - 2\,(c\,y + d\,z)' - 5\,(c\,y + d\,z) = c\,(3y'' - 2y' - 5y) + d\,(3z'' - 2z' - 5z) = c\,L[y] + d\,L[z]$. (b) $\ker L$ is the span of the basic solutions $e^{-x}$, $e^{5x/3}$.

7.4.9. (a) $p(D) = D^3 + 5D^2 + 3D - 9$. (b) $e^x$, $e^{-3x}$, $x\,e^{-3x}$. The general solution is $y(x) = c_1 e^x + c_2 e^{-3x} + c_3\,x\,e^{-3x}$.

7.4.10. (a) Minimal order 2: $u'' + u' - 6u = 0$. (b) Minimal order 2: $u'' + u' = 0$. (c) Minimal order 2: $u'' - 2u' + u = 0$. (d) Minimal order 3: $u''' - 6u'' + 11u' - 6u = 0$.

7.4.11. (a) $u = c_1 x + \dfrac{c_2}{x^5}$, (b) $u = c_1 x^2 + \dfrac{c_2}{\sqrt{|x|}}$, (c) $u = c_1 |x|^{(1 + \sqrt5)/2} + c_2 |x|^{(1 - \sqrt5)/2}$, (d) $u = c_1 |x|^{\sqrt3} + c_2 |x|^{-\sqrt3}$, (e) $u = c_1 x^3 + c_2 x^{-1/3}$, (f) $v = c_1 + \dfrac{c_2}{x}$.

7.4.12. $u = c_1 x + c_2 |x|^{\sqrt3} + c_3 |x|^{-\sqrt3}$. There is a three-dimensional solution space for $x > 0$; only those in the two-dimensional subspace spanned by $x$, $|x|^{\sqrt3}$ are continuously differentiable at $x = 0$.

7.4.13.
(i) Using the chain rule, with $x = e^t$ and $v(t) = u(e^t)$, $\dfrac{dv}{dt} = e^t\,\dfrac{du}{dx} = x\,\dfrac{du}{dx}$, $\dfrac{d^2 v}{dt^2} = e^{2t}\,\dfrac{d^2 u}{dx^2} + e^t\,\dfrac{du}{dx} = x^2\,\dfrac{d^2 u}{dx^2} + x\,\dfrac{du}{dx}$, and so $v(t)$ solves $a\,\dfrac{d^2 v}{dt^2} + (b - a)\,\dfrac{dv}{dt} + c\,v = 0$.
(ii) In all cases, $u(x) = v(\log x)$ gives the solutions in Exercise 7.4.11.
(a) $v'' + 4v' - 5v = 0$, with solution $v(t) = c_1 e^{t} + c_2 e^{-5t}$.
(b) $2v'' - 3v' - 2v = 0$, with solution $v(t) = c_1 e^{2t} + c_2 e^{-t/2}$.
(c) $v'' - v' - v = 0$, with solution $v(t) = c_1 e^{\frac12 (1 + \sqrt5)\,t} + c_2 e^{\frac12 (1 - \sqrt5)\,t}$.
(d) $v'' - 3v = 0$, with solution $v(t) = c_1 e^{\sqrt3\,t} + c_2 e^{-\sqrt3\,t}$.
(e) $3v'' - 8v' - 3v = 0$, with solution $v(t) = c_1 e^{3t} + c_2 e^{-t/3}$.
(f) $v'' + v' = 0$, with solution $v(t) = c_1 + c_2 e^{-t}$.

♦ 7.4.14. (a) $v(t) = c_1 e^{r t} + c_2\,t\,e^{r t}$, so $u(x) = c_1 |x|^r + c_2 |x|^r \log |x|$. (b) (i) $u(x) = c_1 x + c_2\,x \log |x|$, (ii) $u(x) = c_1 + c_2 \log |x|$.
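The reduction in Exercise 7.4.13 says that the exponents $r$ in the power solutions $u = |x|^r$ of $a\,x^2 u'' + b\,x\,u' + c\,u = 0$ are the roots of $a\,r^2 + (b - a)\,r + c = 0$. A minimal sketch of that computation follows; the coefficient triples are inferred from the constant-coefficient equations listed in part (ii), not quoted from Exercise 7.4.11, and the function name is hypothetical.

```python
import cmath

def euler_exponents(a, b, c):
    """Roots r of a r^2 + (b - a) r + c = 0, the exponents of |x|^r
    for the Euler equation a x^2 u'' + b x u' + c u = 0."""
    p = b - a
    disc = cmath.sqrt(p * p - 4 * a * c)
    return ((-p + disc) / (2 * a), (-p - disc) / (2 * a))

# Case (a): v'' + 4 v' - 5 v = 0 corresponds to a = 1, b = 5, c = -5 (inferred).
roots_a = euler_exponents(1, 5, -5)
# Case (e): 3 v'' - 8 v' - 3 v = 0 corresponds to a = 3, b = -5, c = -3 (inferred).
roots_e = euler_exponents(3, -5, -3)
```

The roots are returned as complex numbers so that the same function also covers complex exponents, which lead to solutions involving $\cos(\beta \log|x|)$ and $\sin(\beta \log|x|)$. A repeated root gives the logarithmic solutions of Exercise 7.4.14.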

7.4.15. $v'' - 4v = 0$, so $u(x) = c_1\,\dfrac{e^{2x}}{x} + c_2\,\dfrac{e^{-2x}}{x}$. The solutions with $c_1 + c_2 = 0$ are continuously differentiable at $x = 0$, but only the zero solution is twice continuously differentiable.

7.4.16. True if $S$ is a connected interval. If $S$ is disconnected, then $D[u] = 0$ implies $u$ is constant on each connected component. Thus, the dimension of $\ker D$ equals the number of connected components of $S$.

7.4.17. For $u = \log(x^2 + y^2)$, we compute $\dfrac{\partial^2 u}{\partial x^2} = \dfrac{2y^2 - 2x^2}{(x^2 + y^2)^2} = -\dfrac{\partial^2 u}{\partial y^2}$. Similarly, when $v = \dfrac{x}{x^2 + y^2}$, then $\dfrac{\partial^2 v}{\partial x^2} = \dfrac{2x^3 - 6xy^2}{(x^2 + y^2)^3} = -\dfrac{\partial^2 v}{\partial y^2}$. Or simply notice that $v = \dfrac12\,\dfrac{\partial u}{\partial x}$, and so if $\dfrac{\partial^2 u}{\partial x^2} + \dfrac{\partial^2 u}{\partial y^2} = 0$, then $\dfrac{\partial^2 v}{\partial x^2} + \dfrac{\partial^2 v}{\partial y^2} = \dfrac12\,\dfrac{\partial}{\partial x} \left( \dfrac{\partial^2 u}{\partial x^2} + \dfrac{\partial^2 u}{\partial y^2} \right) = 0$.

7.4.18. $u = c_1 + c_2 \log r$. The solutions form a two-dimensional vector space.

7.4.19. $\log\left( c\,(x - a)^2 + d\,(y - b)^2 \right)$. Not a vector space!

♥ 7.4.20.
(a) $\Delta[\,e^x \cos y\,] = \dfrac{\partial^2}{\partial x^2}\,e^x \cos y + \dfrac{\partial^2}{\partial y^2}\,e^x \cos y = e^x \cos y - e^x \cos y = 0$.
(b) $p_2(x, y) = 1 + x + \frac12 x^2 - \frac12 y^2$ satisfies $\Delta p_2 = 0$.
(c) Same for $p_3(x, y) = 1 + x + \frac12 x^2 - \frac12 y^2 + \frac16 x^3 - \frac12 x y^2$.
(d) If $u(x, y)$ is harmonic, then any of its Taylor polynomials are also harmonic. To prove this, we write $u(x, y) = p_n(x, y) + r_n(x, y)$, where $p_n(x, y)$ is the Taylor polynomial of degree $n$ and $r_n(x, y)$ is the remainder. Then $\Delta u(x, y) = \Delta p_n(x, y) + \Delta r_n(x, y)$, where $\Delta p_n(x, y)$ is a polynomial of degree $n - 2$, and hence the Taylor polynomial of degree $n - 2$ for $\Delta u$, while $\Delta r_n(x, y)$ is the remainder. If $\Delta u = 0$, then its Taylor polynomial $\Delta p_n = 0$ also, and hence $p_n$ is a harmonic polynomial.
(e) The Taylor polynomial of degree 4 is $p_4(x, y) = -2x - x^2 + y^2 - \frac23 x^3 + 2 x y^2 - \frac12 x^4 + 3 x^2 y^2 - \frac12 y^4$, which is harmonic: $\Delta p_4 = 0$.
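The claim $\Delta p_4 = 0$ in Exercise 7.4.20(e) can be confirmed by applying the Laplacian term by term. The sketch below stores a polynomial as a dictionary mapping exponent pairs $(i, j)$ to the coefficient of $x^i y^j$; this representation and the helper name are illustrative choices, not anything used in the text.

```python
from fractions import Fraction as Fr

def laplacian(poly):
    """Laplacian of a polynomial stored as {(i, j): coeff} for coeff * x^i * y^j."""
    out = {}
    for (i, j), c in poly.items():
        if i >= 2:   # d^2/dx^2 of x^i y^j
            out[(i - 2, j)] = out.get((i - 2, j), 0) + c * i * (i - 1)
        if j >= 2:   # d^2/dy^2 of x^i y^j
            out[(i, j - 2)] = out.get((i, j - 2), 0) + c * j * (j - 1)
    # Drop terms whose coefficients cancelled.
    return {k: v for k, v in out.items() if v != 0}

# p4 = -2x - x^2 + y^2 - (2/3)x^3 + 2xy^2 - (1/2)x^4 + 3x^2y^2 - (1/2)y^4
p4 = {(1, 0): Fr(-2), (2, 0): Fr(-1), (0, 2): Fr(1),
      (3, 0): Fr(-2, 3), (1, 2): Fr(2),
      (4, 0): Fr(-1, 2), (2, 2): Fr(3), (0, 4): Fr(-1, 2)}

lap_p4 = laplacian(p4)              # empty dict: p4 is harmonic
lap_x2 = laplacian({(2, 0): Fr(1)}) # x^2 alone is not harmonic
```

An empty result means every coefficient of the Laplacian cancelled. The same helper can be applied to $p_2$ and $p_3$ from parts (b) and (c).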

7.4.21. (a) Basis: $1,\ x,\ y,\ z,\ x^2 - y^2,\ x^2 - z^2,\ x y,\ x z,\ y z$; dimension = 9.
(b) Basis: $x^3 - 3xy^2,\ x^3 - 3xz^2,\ y^3 - 3x^2 y,\ y^3 - 3yz^2,\ z^3 - 3x^2 z,\ z^3 - 3y^2 z,\ x y z$; dimension = 7.

7.4.22. $u = c_1 + c_2 / r$. The solutions form a two-dimensional vector space.

7.4.23. (a) If $x \in \ker M$, then $L \circ M[x] = L[M[x]] = L[0] = 0$ and so $x \in \ker (L \circ M)$. (b) For example, if $L = O$ but $M \ne O$, then $\ker(L \circ M)$ is the entire domain space, which is not equal to $\ker M$. Other examples: $L = M = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}$, and $L = M = D$, the derivative function.

0

1

0

0

1

1

3 7 0 B−5 C B 2C B C B B C 6 C. (c) x = B 3 C C C; kernel element is 0. 7.4.24. (a) Not in the range. (b) x = B @ 1 A + zB @−4 A @−5 A 0 1 0 0 13 2 0 1 1 0 2 3 −2 B C7 6 B C B 0C B 0C7 6 B1C C C 7. (e) Not in the range. C + 6 y B C + wB B (d) x = B @ −3 A 5 4 @0A @ 2A 1 0 0 0

1

2 0

1

0

13

−2 0 1 6 B C B C B C7 6 B −1 C B 1C B1C7 C + 6 zB C + w B C 7. (f ) x = B @ 0A 4 @ 1A @0A5 0 1 0

7.4.25. (a) x = 1, y = −3, unique; (b) x = − 17 + 73 z, y = 47 + 72 z, not unique; (c) no solution; (d) u = 2, v = −1, w = 0, unique; (e) x = 2 + 4 w, y = − 2 w, z = − 1 − 6 w, not unique. 7.4.26. (a) u(x) = (b) u(x) = (c) u(x) =

4x 11 1 16 − 4 x + c e , 2 x/5 1 x cos 54 x + c2 e2 x/5 sin 45 6 e sin x + c1 e 3 x 3 x 1 − 19 e + c1 + c2 e3 x . 3 xe

x,

7.4.27. (a) u(x) = 41 ex − 14 e4− 3 x , (b) u(x) = 14 − 14 cos 2 x, 1 −x e − 13 x e− x , (c) u(x) = 49 e2 x − 21 ex + 18 1 11 − x 9 −x 1 (d) u(x) = − 10 cos x + 5 sin x + 10 e cos 2 x + 10 e sin 2 x, 1 3 1 x (e) u(x) = − x − 1 + 2 e + 2 cos x + 2 sin x. √ sin 2 x √ 7.4.28. (a) Unique solution: u(x) = x − π ; (b) no solution; (c) unique solution: sin 2 π u(x) = x + (x − 1) ex ; (d) infinitely many solutions: u(x) = 21 + c e− x sin x; (e) unique 3 e2 − 5 x 3 e − 5 2 x e + 2 e ; (f ) no solution; (g) unique solution: solution: u(x) = 2 x + 3 − 2 e −e e −e 36 5 x2 u(x) = − ; (h) infinitely many solutions: u(x) = c (x − x2 ). 31 x2 31 c 7.4.29. (a) u(x) = 12 x log x + c1 x + 2 , (b) u(x) = 21 log x + 43 + c1 x + c2 x2 , x c 3 (c) u(x) = 1 − 8 x + c1 x5 + 2 . x 7.4.30. (a) If b ∈ rng (L ◦ M ), then b = L ◦ M [ x ] for some x, and so b = L[ M [ x ] ] = L[ y ], 198

with $y = M[x]$, belongs to $\mathrm{rng}\,L$. (b) If $M = O$ but $L \ne O$, then $\mathrm{rng}\,(L \circ M) = \{0\} \ne \mathrm{rng}\,L$.

♦ 7.4.31. (a) First, $Y \subset \mathrm{rng}\,L$ since every $y \in Y$ can be written as $y = L[w]$ for some $w \in W \subset U$, and so $y \in \mathrm{rng}\,L$. If $y_1 = L[w_1]$ and $y_2 = L[w_2]$ are elements of $Y$, then so is $c\,y_1 + d\,y_2 = L[c\,w_1 + d\,w_2]$ for any scalars $c, d$ since $c\,w_1 + d\,w_2 \in W$, proving that $Y$ is a subspace. (b) Suppose $w_1, \ldots, w_k$ form a basis for $W$, so $\dim W = k$. Let $y = L[w] \in Y$ for $w \in W$. We can write $w = c_1 w_1 + \cdots + c_k w_k$, and so, by linearity, $y = c_1 L[w_1] + \cdots + c_k L[w_k]$. Therefore, the $k$ vectors $y_1 = L[w_1], \ldots, y_k = L[w_k]$ span $Y$, and hence, by Proposition 2.33, $\dim Y \le k$.

♦ 7.4.32. If $z \in \ker L$ then $L[z] = 0 \in \ker L$, which proves invariance. If $y = L[x] \in \mathrm{rng}\,L$ then $L[y] \in \mathrm{rng}\,L$, which proves invariance.

7.4.33. (a) $\{0\}$, the $x$ axis, the $y$ axis, $\mathbb{R}^2$; (b) $\{0\}$, the $x$ axis, $\mathbb{R}^2$; (c) If $\theta \ne 0, \pi$, then the only invariant subspaces are $\{0\}$ and $\mathbb{R}^2$. On the other hand, $R_0 = I$, $R_\pi = -I$, and so in these cases every subspace is invariant.

♦ 7.4.34. (a) If $L$ were invertible, then the solution to $L[x] = b$ would be unique, namely $x = L^{-1}[b]$. But according to Theorem 7.38, we can add in any element of the kernel to get another solution, which would violate uniqueness. (b) If $b \notin \mathrm{rng}\,L$, then we cannot solve $L[x] = b$, and so the inverse cannot exist. (c) On a finite-dimensional vector space, every linear function is equivalent to multiplication by a square matrix. If the kernel is trivial, then the matrix is nonsingular, and hence invertible. A counterexample in infinite dimensions: the integral operator $I[f](x) = \int_0^x f(y)\,dy$ on $V = C^0$ has trivial kernel, but is not invertible because any function $g$ with $g(0) \ne 0$ does not lie in the range of $I$, and hence $\mathrm{rng}\,I \ne V$.

7.4.35. (a) u(x) = 12 + 25 cos x + 15 sin x + c e− 2 x , 1 sin x + c1 e3 x + c2 e− 3 x , (b) u(x) = − 19 x − 10 1 (c) u(x) = 10 + 18 ex cos x + c1 ex cos 31 x + c2 ex sin 31 x, 1 x 1 e + 14 e− x + c1 ex + c2 e− 2 x , (d) u(x) = 6 x ex − 18 1 3x e + c1 + c2 cos 3 x + c3 sin 3 x. (e) u(x) = 91 x + 54 7.4.36. (a) u(x) = 5 x + 5 − 7 ex−1 , (b) u(x) = c1 (x + 1) + c2 ex . √ √ 7.4.37. u(x) = − 7 cos x − 3 sin x. 7.4.38. u′′ + x u = 2, u(0) = a, u(1) = b, for any a, b.

1 sin 3 x, (b) u(x) = 12 (x2 − 3 x + 2) e4 x , 7.4.39. (a) u(x) = 91 x + cos 3 x + 27 3 sin 2 x− 15 sin 3 x, (d) u(x) = 1− 21 (x+1) ex + 21 (x2 +2 x−1) ex−1 . (c) u(x) = 3 cos 2 x+ 10

7.4.40. $u(x, y) = \frac14 (x^2 + y^2) + \frac{1}{12} (x^4 + y^4)$.

♥ 7.4.41.
(a) If $u = v\,u_1$, then $u' = v' u_1 + v\,u_1'$, $u'' = v'' u_1 + 2 v' u_1' + v\,u_1''$, and so $0 = u'' + a\,u' + b\,u = u_1 v'' + (2 u_1' + a\,u_1)\,v' + (u_1'' + a\,u_1' + b\,u_1)\,v = u_1 w' + (2 u_1' + a\,u_1)\,w$, where $w = v'$ and the coefficient of $v$ vanishes because $u_1$ is a solution. This is a first order ordinary differential equation for $w$.
(b) (i) $u(x) = c_1 e^x + c_2\,x\,e^x$, (ii) $u(x) = c_1 (x - 1) + c_2 e^{-x}$, (iii) $u(x) = c_1 e^{-x^2} + c_2\,x\,e^{-x^2}$, (iv) $u(x) = c_1 e^{x^2/2} + c_2 e^{x^2/2} \int e^{-x^2}\,dx$.

♦ 7.4.42. We use linearity to compute $L[u^\star] = L[c_1 u_1^\star + \cdots + c_k u_k^\star] = c_1 L[u_1^\star] + \cdots + c_k L[u_k^\star] = c_1 f_1 + \cdots + c_k f_k$, and hence $u^\star$ is a particular solution to the differential equation (7.66). The second part of the theorem then follows from Theorem 7.38.

7.4.43. An example: Let $A = \begin{pmatrix} 1 & 2 \\ i & 2i \end{pmatrix}$. Then $\ker A$ consists of all vectors $c \begin{pmatrix} -2 \\ 1 \end{pmatrix}$, where $c = a + i\,b$ is any complex number. Then its real part $a \begin{pmatrix} -2 \\ 1 \end{pmatrix}$ and imaginary part $b \begin{pmatrix} -2 \\ 1 \end{pmatrix}$ are also solutions to the homogeneous system.

7.4.44.
(a) $u(x) = c_1 \cos 2x + c_2 \sin 2x$,
(b) $u(x) = c_1 e^{-3x} \cos x + c_2 e^{-3x} \sin x$,
(c) $u(x) = c_1 e^{x} + c_2 e^{-x/2} \cos \frac32 x + c_3 e^{-x/2} \sin \frac32 x$,
(d) $u(x) = c_1 e^{x/\sqrt2} \cos \frac{x}{\sqrt2} + c_2 e^{-x/\sqrt2} \cos \frac{x}{\sqrt2} + c_3 e^{x/\sqrt2} \sin \frac{x}{\sqrt2} + c_4 e^{-x/\sqrt2} \sin \frac{x}{\sqrt2}$,
(e) $u(x) = c_1 \cos 2x + c_2 \sin 2x + c_3 \cos 3x + c_4 \sin 3x$,
(f) $u(x) = c_1\,x \cos(\sqrt2 \log |x|) + c_2\,x \sin(\sqrt2 \log |x|)$,
(g) $u(x) = c_1 x^2 + c_2 \cos(2 \log |x|) + c_3 \sin(2 \log |x|)$.

7.4.45.
(a) Minimal order 2: $u'' + 2u' + 10u = 0$;
(b) minimal order 4: $u^{(iv)} + 2u'' + u = 0$;
(c) minimal order 5: $u^{(v)} + 4u^{(iv)} + 14u''' + 20u'' + 25u' = 0$;
(d) minimal order 4: $u^{(iv)} + 5u'' + 4u = 0$;
(e) minimal order 6: $u^{(vi)} + 3u^{(iv)} + 3u'' + u = 0$.

7.4.46.
(a) $u(x) = c\,e^{i x} = c \cos x + i\,c \sin x$.
(b) $u(x) = c_1 e^{x} + c_2 e^{(i - 1)x} = \left( c_1 e^{x} + c_2 e^{-x} \cos x \right) + i\,c_2 e^{-x} \sin x$.
(c) $u(x) = c_1 e^{(1 + i)x/\sqrt2} + c_2 e^{-(1 + i)x/\sqrt2} = \left[ c_1 e^{x/\sqrt2} \cos \frac{x}{\sqrt2} + c_2 e^{-x/\sqrt2} \cos \frac{x}{\sqrt2} \right] + i \left[ c_1 e^{x/\sqrt2} \sin \frac{x}{\sqrt2} - c_2 e^{-x/\sqrt2} \sin \frac{x}{\sqrt2} \right]$.

7.4.47. (a) $x^4 - 6x^2 y^2 + y^4$, $4x^3 y - 4x y^3$. (b) The polynomial $u(x, y) = a\,x^4 + b\,x^3 y + c\,x^2 y^2 + d\,x y^3 + e\,y^4$ solves $\dfrac{\partial^2 u}{\partial x^2} + \dfrac{\partial^2 u}{\partial y^2} = (12a + 2c)\,x^2 + (6b + 6d)\,x y + (2c + 12e)\,y^2 = 0$ if and only if $12a + 2c = 6b + 6d = 2c + 12e = 0$. The general solution to this homogeneous linear system is $a = e$, $b = -d$, $c = -6e$, where $d, e$ are the free variables. Thus, $u(x, y) = e\,(x^4 - 6x^2 y^2 + y^4) - \frac14\,d\,(4x^3 y - 4x y^3)$.

♥ 7.4.48. (a) $\dfrac{\partial u}{\partial t} = -k^2 e^{-k^2 t + i k x} = \dfrac{\partial^2 u}{\partial x^2}$; (b) $e^{-k^2 t - i k x}$; (c) $e^{-k^2 t} \cos k x$, $e^{-k^2 t} \sin k x$.
(d) Yes. When $k = a + i\,b$ is complex, we obtain the real solutions $e^{(b^2 - a^2)\,t - b x} \cos(a x - 2 a b\,t)$, $e^{(b^2 - a^2)\,t - b x} \sin(a x - 2 a b\,t)$ from $e^{-k^2 t + i k x}$, along with $e^{(b^2 - a^2)\,t + b x} \cos(a x + 2 a b\,t)$, $e^{(b^2 - a^2)\,t + b x} \sin(a x + 2 a b\,t)$ from $e^{-k^2 t - i k x}$.
(e) All those in part (a), as well as those in part (b) for $|a| > |b|$ (which, when $b = 0$, include those in part (a)).
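The real solutions in Exercise 7.4.48(d) are the real and imaginary parts of $e^{-k^2 t + i k x}$ with $k = a + i\,b$. The sketch below compares the closed form with the complex exponential and also estimates $u_t - u_{xx}$ by central differences; the sample values of $a$, $b$, $t$, $x$ and the step size are arbitrary illustrative choices.

```python
import cmath
import math

def complex_solution(k, t, x):
    """Complex exponential solution exp(-k^2 t + i k x) of u_t = u_xx."""
    return cmath.exp(-k * k * t + 1j * k * x)

def real_solution(a, b, t, x):
    """Real part of the complex solution for k = a + i b."""
    return math.exp((b * b - a * a) * t - b * x) * math.cos(a * x - 2 * a * b * t)

def heat_residual(u, t, x, h=1e-3):
    """Central-difference estimate of u_t - u_xx at the point (t, x)."""
    u_t = (u(t + h, x) - u(t - h, x)) / (2 * h)
    u_xx = (u(t, x + h) - 2 * u(t, x) + u(t, x - h)) / (h * h)
    return u_t - u_xx

a, b, t0, x0 = 1.5, 0.5, 0.3, 0.7   # arbitrary sample point
z = complex_solution(complex(a, b), t0, x0)
w = real_solution(a, b, t0, x0)
res = heat_residual(lambda t, x: real_solution(a, b, t, x), t0, x0)
```

Here `z.real` agrees with `w` to rounding error, and `res` is small, limited only by the finite-difference truncation error.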

7.4.49. $u(t, x) = e^{k^2 t + k x}$, $u(t, x) = e^{-k^2 t + k x}$, where $k$ is any real or complex number. When $k = a + i\,b$ is complex, we obtain the four independent real solutions
$e^{(a^2 - b^2)\,t + a x} \cos(b x + 2 a b\,t)$, $e^{(a^2 - b^2)\,t + a x} \sin(b x + 2 a b\,t)$,
$e^{(b^2 - a^2)\,t + a x} \cos(b x - 2 a b\,t)$, $e^{(b^2 - a^2)\,t + a x} \sin(b x - 2 a b\,t)$.

7.4.50. (a), (c), (e) are conjugated. Note: Case (e) is all of $\mathbb{C}^3$.

♦ 7.4.51. If $v = \mathrm{Re}\,u = \dfrac{u + \bar u}{2}$, then $\bar v = \overline{\left( \dfrac{u + \bar u}{2} \right)} = \dfrac{\bar u + u}{2} = v$. Similarly, if $w = \mathrm{Im}\,u = \dfrac{u - \bar u}{2i}$, then $\bar w = \dfrac{\bar u - u}{-2i} = \dfrac{u - \bar u}{2i} = w$.

♦ 7.4.52. Let $z_1, \ldots, z_n$ be a complex basis of $V$. Then $x_j = \mathrm{Re}\,z_j$, $y_j = \mathrm{Im}\,z_j$, $j = 1, \ldots, n$, are real vectors that span $V$ since each $z_j = x_j + i\,y_j$ is a linear combination thereof, cf. Exercise 2.3.19. Thus, by Exercise 2.4.22, we can find a basis of $V$ containing $n$ of the vectors $x_1, \ldots, x_n, y_1, \ldots, y_n$. Conversely, if $v_1, \ldots, v_n$ is a real basis, and $v = c_1 v_1 + \cdots + c_n v_n \in V$, then $\bar v = \bar c_1 v_1 + \cdots + \bar c_n v_n \in V$ also, so $V$ is conjugated.

♦ 7.4.53. Every linear function from $\mathbb{C}^n$ to $\mathbb{C}^m$ has the form $L[u] = A u$ for some $m \times n$ complex matrix $A$. The reality condition implies $A \bar u = L[\bar u] = \overline{L[u]} = \overline{A u} = \bar A\,\bar u$ for all $u \in \mathbb{C}^n$. Therefore, $A = \bar A$ and so $A$ is a real matrix.

♦ 7.4.54. $L[u] = L[v] + i\,L[w] = f$, and, since $L$ is real, the real and imaginary parts of this equation yield $L[v] = f$, $L[w] = 0$.

7.4.55. (a) $L[\bar u] = \overline{L[u]} = 0$. (b) $u = \left( \left( -\frac32 + \frac12 i \right) y + \left( -\frac12 - \frac12 i \right) z,\ y,\ z \right)^T$, where $y, z \in \mathbb{C}$ are the free variables, is the general solution to the first system, and so its complex conjugate $\bar u = \left( \left( -\frac32 - \frac12 i \right) \bar y + \left( -\frac12 + \frac12 i \right) \bar z,\ \bar y,\ \bar z \right)^T$, where $\bar y, \bar z \in \mathbb{C}$ are free variables, solves the conjugate system. (Note: Since $\bar y, \bar z$ are free, they could be renamed $y, z$ if desired.)

♦ 7.4.56. They are linearly independent if and only if $u$ is not a complex scalar multiple of a real solution. Indeed, if $u = (a + i\,b)\,v$ where $v$ is a real solution, then $x = a\,v$, $y = b\,v$ are linearly dependent. Conversely, if $y = 0$, then $u = x$ is already a real solution, while if $x = a\,y$ for $a$ real†, then $u = (a + i)\,y$ is a scalar multiple of the real solution $y$.

7.5.1. (a)

1 2

!

−1 , 3

(b)

0 @

7.5.2. Domain (a), target (b):

†

1

1 4 3

2 4

− 23 A , 3 !

−3 ; 9

(c)

0

13 @ 7 5 7

− 10 7 15 7

1

A.

domain (a), target (c):

a can’t be complex, as otherwise x wouldn’t be real.

3 1

!

−5 ; 10

domain (b), target (a): domain (c), target (a): 0

1 7.5.3. (a) B @1 0

−1 0 1

1

0 −1 C A, 2

0

1 @2 2 03 6 @7 5 7

(b) 0

1

− 12 A ; 11 − 17 A ; 5

domain (b), target (c): domain (c), target (b):

7

0

1

−2 0

0

2 3

B1 B @2

0

1

C − 32 C A,

2

1

(c)

0

B0 B B1 @

0

1 4 − 32 11 4

0

3 @2 1 03 12 @ 7 10 7 1

3 2C C . −4 C A 9 2

− 52 10 3

1

A;

− 37 15 7

1

A.

0

1

1 −2 0 1 −1 −1 7.5.4. Domain (a), target (b): B 0 −3 C domain (a), target (c): B 0 −2 C @1 A; @2 A; 0 2 6 1 4 5 0 1 1 0 1 −1 0 1 −1 −1 B1 C C B domain (b), target (a): B domain (b), target (c): B 0 − 12 C 0 −1 C @2 A; A; @ 1 4 5 1 1 2 0 3 3 3 3 03 0 1 1 1 1 1 − 1 −1 3C 4 2 4 B B C C C 1 1 B B ; domain (c), target (b): B . domain (c), target (a): B 0 −2 C 0 −6 C 2 2 @ @ A A 1 1 1 −4 −4 2 1 6 2 7.5.5. Domain (a), target (a): domain (a), target (c): domain (b), target (b):

0 2

−1 ; 1

2 8

0 8

−2 ; 4

0

1 @2

1 1 domain (c), target (a): @ 1 0 0

domain (c), target (c):

!

1 3

16 @ 7 18 7

!

0 4 3 2 7 4 7

8 7 16 7

domain (a), target (b): domain (b), target (a):

1

− 32 A ; 11 − 73 A ; 1

domain (b), target (c): domain (c), target (b):

7

1 − 47 A . 6 7

0

1 3

0 4

1 @2

1

2 3

1

0

8 03

8 3 4 7 8 7

@1

1

7.5.6. Using the monomial basis 1, x, x2 , the operator D has matrix representative 0 0

0

1

!

−3 . 3 − 12

1

A. 1 3!

−1

4 . 3 1 − 97 A . 3 7

$A = \begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 2 \\ 0 & 0 & 0 \end{pmatrix}$. The inner product is represented by the Hilbert matrix $K = \begin{pmatrix} 1 & \frac12 & \frac13 \\ \frac12 & \frac13 & \frac14 \\ \frac13 & \frac14 & \frac15 \end{pmatrix}$. Thus, the adjoint is represented by $K^{-1} A^T K = \begin{pmatrix} -6 & 2 & 3 \\ 12 & -24 & -26 \\ 0 & 30 & 30 \end{pmatrix}$, whose columns give $D^*[1] = -6 + 12 x$, $D^*[x] = 2 - 24 x + 30 x^2$, $D^*[x^2] = 3 - 26 x + 30 x^2$. One can check that $\langle p, D[q] \rangle = \int_0^1 p(x)\, q'(x)\, dx = \int_0^1 D^*[p(x)]\, q(x)\, dx = \langle D^*[p], q \rangle$ for all quadratic polynomials by verifying it on the monomial basis elements.
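The check on the monomial basis suggested in 7.5.6 can be sketched in exact rational arithmetic. This is a minimal illustration, not part of the original solution: polynomials are stored as coefficient lists (constant term first), and the helper names `inner`, `deriv` and `adjoint_identity_holds` are hypothetical. The adjoint images are read off from the columns of $K^{-1} A^T K$.

```python
from fractions import Fraction as F

def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists (constant term first)."""
    out = [F(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def integrate01(p):
    """Integral of the polynomial over [0, 1]."""
    return sum(c / (k + 1) for k, c in enumerate(p))

def inner(p, q):
    """L^2 inner product on [0, 1]."""
    return integrate01(poly_mul(p, q))

def deriv(p):
    """Derivative of a polynomial; the zero polynomial is returned as [0]."""
    return [k * c for k, c in enumerate(p)][1:] or [F(0)]

# Monomial basis 1, x, x^2 and the images under D^* taken from the
# columns of K^{-1} A^T K.
basis = [[F(1)], [F(0), F(1)], [F(0), F(0), F(1)]]
adjoint = [[F(-6), F(12)], [F(2), F(-24), F(30)], [F(3), F(-26), F(30)]]

def adjoint_identity_holds():
    """True when <p, D[q]> = <D^*[p], q> for all basis pairs p, q."""
    return all(
        inner(p, deriv(q)) == inner(dp, q)
        for p, dp in zip(basis, adjoint)
        for q in basis
    )
```

Calling `adjoint_identity_holds()` tests all nine basis pairs; by bilinearity this covers every pair of quadratic polynomials.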

♦ 7.5.7. Suppose $M, N\colon V \to U$ both satisfy $\langle u, M[v] \rangle = \langle\langle L[u], v \rangle\rangle = \langle u, N[v] \rangle$ for all $u \in U$, $v \in V$. Then $\langle u, (M - N)[v] \rangle = 0$ for all $u \in U$, and so $(M - N)[v] = 0$ for all $v \in V$, which proves that $M = N$.

♦ 7.5.8. (a) For all $u \in U$, $v \in V$, we have $\langle u, (L + M)^*[v] \rangle = \langle\langle (L + M)[u], v \rangle\rangle = \langle\langle L[u], v \rangle\rangle + \langle\langle M[u], v \rangle\rangle = \langle u, L^*[v] \rangle + \langle u, M^*[v] \rangle = \langle u, (L^* + M^*)[v] \rangle$. Since this holds for all $u \in U$, $v \in V$, we conclude that $(L + M)^* = L^* + M^*$. (b) $\langle u, (c L)^*[v] \rangle = \langle\langle (c L)[u], v \rangle\rangle = c\, \langle\langle L[u], v \rangle\rangle = c\, \langle u, L^*[v] \rangle = \langle u, c L^*[v] \rangle$. (c) $\langle\langle (L^*)^*[u], v \rangle\rangle = \langle u, L^*[v] \rangle = \langle\langle L[u], v \rangle\rangle$. (d) $(L^{-1})^* \circ L^* = (L \circ L^{-1})^* = I^* = I$, and $L^* \circ (L^{-1})^* = (L^{-1} \circ L)^* = I^* = I$.

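7.5.9. In all cases, $L = L^*$ if and only if its matrix representative $A$, with respect to the standard basis, is symmetric. (a) $A = \begin{pmatrix} -1 & 0 \\ 0 & -1 \end{pmatrix} = A^T$, (b) $A = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} = A^T$, (c) $A = \begin{pmatrix} 3 & 0 \\ 0 & 3 \end{pmatrix} = A^T$, (d) $A = \begin{pmatrix} \frac12 & \frac12 \\ \frac12 & \frac12 \end{pmatrix} = A^T$.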

♦ 7.5.10. According to (7.78), the adjoint $A^* = M^{-1} A^T M = A$ if and only if $M A = A^T M = (M^T A)^T = (M A)^T$ since $M = M^T$.

7.5.11. The inner product matrix is $M = \begin{pmatrix} 2 & 0 \\ 0 & 3 \end{pmatrix}$, so $M A = \begin{pmatrix} 12 & 6 \\ 6 & 12 \end{pmatrix}$ is symmetric, and hence, by Exercise 7.5.10, $A$ is self-adjoint.

7.5.12. (a) $a_{12} = \frac12 a_{21}$, $a_{13} = \frac13 a_{31}$, $\frac12 a_{23} = \frac13 a_{32}$, (b) $\begin{pmatrix} 0 & 1 & 1 \\ 2 & 1 & 2 \\ 3 & 3 & 2 \end{pmatrix}$.

7.5.13. (a) $2 a_{12} - a_{22} = -a_{11} + 2 a_{21} - a_{31}$, $2 a_{13} - a_{23} = -a_{21} + 2 a_{31}$, $-a_{13} + 2 a_{23} - a_{33} = -a_{22} + 2 a_{32}$, (b) $\begin{pmatrix} 0 & 1 & 1 \\ 2 & 3 & -6 \\ 5 & 3 & -16 \end{pmatrix}$.

7.5.14. True. $\langle I[u], v \rangle = \langle u, v \rangle = \langle u, I[v] \rangle$ for all $u, v \in U$, and so, according to (7.74), $I^* = I$.

7.5.15. False. For example, $A = \begin{pmatrix} 2 & 0 \\ 0 & 1 \end{pmatrix}$ is not self-adjoint with respect to the inner product defined by $M = \begin{pmatrix} 2 & -1 \\ -1 & 2 \end{pmatrix}$ since $M A = \begin{pmatrix} 4 & -1 \\ -2 & 2 \end{pmatrix}$ is not symmetric, and so fails the criterion of Exercise 7.5.10.
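The criterion of Exercise 7.5.10 (self-adjointness with respect to $\langle u, v \rangle = u^T M v$ is equivalent to symmetry of $M A$) is easy to test mechanically. The sketch below uses plain nested lists and the data of Exercise 7.5.15; the function names are illustrative only.

```python
def matmul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [
        [sum(X[i][k] * Y[k][j] for k in range(len(Y))) for j in range(len(Y[0]))]
        for i in range(len(X))
    ]

def is_symmetric(X):
    return all(X[i][j] == X[j][i] for i in range(len(X)) for j in range(len(X)))

def is_self_adjoint(A, M):
    """A is self-adjoint w.r.t. <u, v> = u^T M v exactly when M A is symmetric."""
    return is_symmetric(matmul(M, A))

# Data of Exercise 7.5.15.
M = [[2, -1], [-1, 2]]
A = [[2, 0], [0, 1]]
```

Here `matmul(M, A)` gives `[[4, -1], [-2, 2]]`, so `is_self_adjoint(A, M)` is false, while the same `A` passes the test for the dot product (`M` equal to the identity).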

7.5.16. (a) $(L + L^*)^* = L^* + (L^*)^* = L^* + L$. (b) Since $L \circ L^* = (L^*)^* \circ L^*$, this follows from Theorem 7.60. (Or it can be proved directly.)

♦ 7.5.17. (a) Write the condition as $\langle N[u], u \rangle = 0$ where $N = K - M$ is also self-adjoint. Then, for any $u, v \in U$, we have $0 = \langle N[u + v], u + v \rangle = \langle N[u], u \rangle + \langle N[u], v \rangle + \langle N[v], u \rangle + \langle N[v], v \rangle = 2\, \langle u, N[v] \rangle$, where we used the self-adjointness of $N$ to combine $\langle N[u], v \rangle = \langle u, N[v] \rangle = \langle N[v], u \rangle$. Since $\langle u, N[v] \rangle = 0$ for all $u, v$, we conclude that $N = K - M = O$. (b) When we take $U = \mathbb{R}^n$ with the dot product, then $K$ and $M$ are represented by $n \times n$ matrices, $A, B$, respectively, and the condition is $(A u) \cdot u = (B u) \cdot u$ for all $u \in \mathbb{R}^n$, which implies $A = B$ provided $A, B$ are symmetric matrices. In particular, if $A^T = -A$ is any skew-symmetric matrix, then $(A u) \cdot u = 0$ for all $u$.

7.5.18. (a) $\Longrightarrow$ (b): Suppose $\langle L[u], L[v] \rangle = \langle u, v \rangle$ for all $u, v \in U$; then $\| L[u] \| = \sqrt{\langle L[u], L[u] \rangle} = \sqrt{\langle u, u \rangle} = \| u \|$. (b) $\Longrightarrow$ (c): Suppose $\| L[u] \| = \| u \|$ for all $u \in U$. Then $\langle L^* \circ L[u], u \rangle = \langle L[u], L[u] \rangle = \langle u, u \rangle$. Thus, by Exercise 7.5.17, $L^* \circ L = I$. Since $L$ is assumed to be invertible, this proves, cf. Exercise 7.1.59, that $L^* = L^{-1}$. (c) $\Longrightarrow$ (a): If $L^* \circ L = I$, then $\langle L[u], L[v] \rangle = \langle u, L^* \circ L[v] \rangle = \langle u, v \rangle$ for all $u, v \in U$.

7.5.19. (a) $\langle M_a[u], v \rangle = \int_a^b M_a[u(x)]\, v(x)\, dx = \int_a^b a(x)\, u(x)\, v(x)\, dx = \int_a^b u(x)\, M_a[v(x)]\, dx = \langle u, M_a[v] \rangle$, proving self-adjointness. (b) Yes, by the same computation, $\langle\langle M_a[u], v \rangle\rangle = \int_a^b a(x)\, u(x)\, v(x)\, w(x)\, dx = \langle\langle u, M_a[v] \rangle\rangle$.

♥ 7.5.20. (a) If $A^T = -A$, then $(A u) \cdot v = (A u)^T v = u^T A^T v = -u^T A v = -u \cdot A v$ for all $u, v \in \mathbb{R}^n$, and so $A^* = -A$. (b) When $A^T M = -M A$. (c) $\langle (L - L^*)[u], v \rangle = \langle L[u], v \rangle - \langle L^*[u], v \rangle = \langle u, L^*[v] \rangle - \langle u, L[v] \rangle = \langle u, (L^* - L)[v] \rangle$. Thus, by the definition of adjoint, $(L - L^*)^* = L^* - L = -(L - L^*)$. (d) Write $L = K + S$, where $K = \frac12 (L + L^*)$ is self-adjoint and $S = \frac12 (L - L^*)$ is skew-adjoint.

♦ 7.5.21. Define $L\colon U \to V_1 \times V_2$ by $L[u] = (L_1[u], L_2[u])$. Using the induced inner product $\langle\langle\langle (v_1, v_2), (w_1, w_2) \rangle\rangle\rangle = \langle\langle v_1, w_1 \rangle\rangle_1 + \langle\langle v_2, w_2 \rangle\rangle_2$ on the Cartesian product $V_1 \times V_2$ given in Exercise 3.1.18, we find $\langle u, L^*[v_1, v_2] \rangle = \langle\langle\langle L[u], (v_1, v_2) \rangle\rangle\rangle = \langle\langle\langle (L_1[u], L_2[u]), (v_1, v_2) \rangle\rangle\rangle = \langle\langle L_1[u], v_1 \rangle\rangle_1 + \langle\langle L_2[u], v_2 \rangle\rangle_2 = \langle u, L_1^*[v_1] \rangle + \langle u, L_2^*[v_2] \rangle = \langle u, L_1^*[v_1] + L_2^*[v_2] \rangle$, and hence $L^*[v_1, v_2] = L_1^*[v_1] + L_2^*[v_2]$. As a result, $L^* \circ L[u] = L_1^* \circ L_1[u] + L_2^* \circ L_2[u] = K_1[u] + K_2[u] = K[u]$.

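7.5.22. Minimizer: $\left( \frac15, -\frac15 \right)^T$; minimum value: $-\frac15$.

7.5.23. Minimizer: $\left( \frac{14}{13}, \frac{2}{13}, -\frac{3}{13} \right)^T$; minimum value: $-\frac{31}{26}$.

7.5.24. Minimizer: $\left( \frac23, \frac13 \right)^T$; minimum value: $-2$.

7.5.25. (a) Minimizer: $\left( \frac{5}{18}, \frac{1}{18} \right)^T$; minimum value: $-\frac56$. (b) Minimizer: $\left( \frac53, \frac43 \right)^T$; minimum value: $-5$. (c) Minimizer: $\left( \frac16, \frac{1}{12} \right)^T$; minimum value: $-\frac12$.

7.5.26. (a) Minimizer: $\left( \frac{7}{13}, \frac{2}{13} \right)^T$; minimum value: $-\frac{7}{26}$. (b) Minimizer: $\left( \frac{11}{39}, \frac{1}{13} \right)^T$; minimum value: $-\frac{11}{78}$. (c) Minimizer: $\left( \frac{12}{13}, \frac{5}{26} \right)^T$; minimum value: $-\frac{43}{52}$. (d) Minimizer: $\left( \frac{19}{39}, \frac{4}{39} \right)^T$; minimum value: $-\frac{17}{39}$.

7.5.27. (a) $\frac13$, (b) $\frac{6}{11}$, (c) $\frac35$.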

♦ 7.5.28. Suppose $L\colon U \to V$ is a linear map between inner product spaces with $\ker L \neq \{0\}$ and adjoint map $L^*\colon V \to U$. Let $K = L^* \circ L\colon U \to U$ be the associated positive semi-definite operator. If $f \in \operatorname{rng} K$, then any solution to the linear system $K[u^\star] = f$ is a minimizer for the quadratic function $p(u) = \frac12 \| L[u] \|^2 - \langle u, f \rangle$. The minimum is not unique since if $u^\star$ is a minimizer, so is $u = u^\star + z$ for any $z \in \ker L$.


Solutions — Chapter 8

8.1.1. (a) $u(t) = -3\, e^{5 t}$, (b) $u(t) = 3\, e^{2(t-1)}$, (c) $u(t) = e^{-3(t+1)}$.

8.1.2. $\gamma = \log 2/100 \approx .0069$. After 10 years: 93.3033 grams; after 100 years: 50 grams; after 1000 years: .0977 grams.

8.1.3. Solve $e^{-(\log 2)\, t/5730} = .0624$ for $t = -5730 \log .0624/\log 2 = 22{,}933$ years.

8.1.4. By (8.6), $u(t) = u(0)\, e^{-(\log 2)\, t/t^\star} = u(0) \left( \frac12 \right)^{t/t^\star} = 2^{-n} u(0)$ when $t = n\, t^\star$. After every time period of duration $t^\star$, the amount of radioactive material is halved.

8.1.5. The solution is $u(t) = u(0)\, e^{1.3 t}$. To double, we need $e^{1.3 t} = 2$, so $t = \log 2/1.3 = .5332$. To quadruple takes twice as long, $t = 1.0664$. To reach 2 million needs $t = \log 10^6/1.3 = 10.6273$.

8.1.6. The solution is $u(t) = u(0)\, e^{.27 t}$. For the given initial conditions, $u(t) = 1{,}000{,}000$ when $t = \log(1000000/5000)/.27 = 19.6234$ years.

♦ 8.1.7. (a) If $u(t) \equiv u^\star = -\dfrac{b}{a}$, then $\dfrac{du}{dt} = 0 = a u + b$, hence it is a solution. (b) $v = u - u^\star$ satisfies $\dfrac{dv}{dt} = a v$, so $v(t) = c\, e^{a t}$, and $u(t) = c\, e^{a t} - \dfrac{b}{a}$. (c) The equilibrium solution is asymptotically stable if and only if $a < 0$ and is stable if $a = 0$.

8.1.8. (a) $u(t) = \frac12 + \frac12 e^{2 t}$, (b) $u(t) = -3$, (c) $u(t) = 2 - 3\, e^{-3(t-2)}$.

8.1.9. (a) $\dfrac{du}{dt} = -\dfrac{\log 2}{1000}\, u + 5 \approx -.000693\, u + 5$. (b) Stabilizes at the equilibrium solution $u^\star = 5000/\log 2 \approx 7213$ tons. (c) The solution is $u(t) = \dfrac{5000}{\log 2} \left[ 1 - \exp\left( -\dfrac{\log 2}{1000}\, t \right) \right]$, which equals 100 when $t = -\dfrac{1000}{\log 2} \log\left( 1 - \dfrac{100 \log 2}{5000} \right) \approx 20.14$ years.
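The decay computations in 8.1.2 and 8.1.3 all use $u(t) = u(0)\, e^{-(\log 2)\, t / t^\star}$ with half-life $t^\star$. A minimal sketch for reproducing the numbers follows; the function names are illustrative, and the initial amount of 100 grams in 8.1.2 is inferred from the stated value of 50 grams after one half-life rather than quoted from the exercise.

```python
import math

def decay_rate(half_life):
    """Decay constant gamma = log 2 / half-life."""
    return math.log(2) / half_life

def amount(u0, half_life, t):
    """Amount remaining at time t, starting from u0."""
    return u0 * math.exp(-decay_rate(half_life) * t)

def age_from_fraction(fraction, half_life):
    """Time at which the given fraction of the original material remains."""
    return -half_life * math.log(fraction) / math.log(2)
```

For instance, `amount(100, 100, 10)` reproduces the 10-year value in 8.1.2, and `age_from_fraction(.0624, 5730)` reproduces the carbon-dating estimate in 8.1.3.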

♥ 8.1.10. (a) The first term on the right hand side says that the rate of growth remains proportional to the population, while the second term reflects the fact that hunting decreases the population by a fixed amount. (This assumes hunting is done continually throughout the year, which is not what happens in real life.) (b) The solution is $u(t) = \left( 5000 - \frac{1000}{.27} \right) e^{.27 t} + \frac{1000}{.27}$. Solving $u(t) = 1000000$ gives $t = \dfrac{1}{.27} \log \dfrac{1000000 - 1000/.27}{5000 - 1000/.27} = 24.6094$ years. (c) To avoid extinction, the equilibrium $u^\star = b/.27$ must be less than the initial population, so $b < 1350$ deer.

♦ 8.1.11. (a) $| u_1(t) - u_2(t) | = e^{a t}\, | u_1(0) - u_2(0) | \to \infty$ when $a > 0$, since $u_1(0) = u_2(0)$ if and only if the solutions are the same. (b) $t = \log(1000/.05)/.02 = 495.17$.

8.1.12. (a) $u(t) = \frac13 e^{2 t/7}$. (b) One unit: $t = \log\left[ 1/(1/3 - .3333) \right]/(2/7) = 36.0813$; 1000 units: $t = \log\left[ 1000/(1/3 - .3333) \right]/(2/7) = 60.2585$. (c) One unit: $t \approx 30.2328$ solves $\frac13 e^{2 t/7} - .3333\, e^{.2857 t} = 1$. 1000 units: $t \approx 52.7548$ solves $\frac13 e^{2 t/7} - .3333\, e^{.2857 t} = 1000$. Note: The solutions to these nonlinear equations are found by a numerical equation solver, e.g., the bisection method, or Newton's method, [ 10 ].

♦ 8.1.13. According to Exercise 3.6.24, $\dfrac{du}{dt} = c\, a\, e^{a t} = a u$, and so $u(t)$ is a valid solution. By Euler's formula (3.84), if $\operatorname{Re} a > 0$, then $u(t) \to \infty$ as $t \to \infty$, and the origin is an unstable equilibrium. If $\operatorname{Re} a = 0$, then $u(t)$ remains bounded as $t \to \infty$, and the origin is a stable equilibrium. If $\operatorname{Re} a < 0$, then $u(t) \to 0$ as $t \to \infty$, and the origin is an asymptotically stable equilibrium.
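The closed-form times in 8.1.12(b) come from the fact that two solutions of $du/dt = a u$ separate like $e^{a t}\, | u_1(0) - u_2(0) |$ (compare Exercise 8.1.11). A small sketch under that assumption, with a hypothetical helper name:

```python
import math

def separation_time(gap, u0_exact, u0_rounded, rate):
    """Time at which solutions of du/dt = rate * u, started from the two
    initial values, differ by `gap`."""
    return math.log(gap / abs(u0_exact - u0_rounded)) / rate
```

With the data of 8.1.12(b), `separation_time(1, 1/3, .3333, 2/7)` and `separation_time(1000, 1/3, .3333, 2/7)` give the one-unit and 1000-unit times. Part (c) also rounds the rate, so it has no closed form and needs a numerical root finder, as the solution notes.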

8.2.1. (a) Eigenvalues: 3, −1; (b) Eigenvalues:

1 1 2, 3;

(c) Eigenvalue: 2;

eigenvectors: eigenvectors:

eigenvector:

(f )

(g) (h) (i)

(j)

!

!

−1 . 1

√ ! − i 2 , eigenvectors: 1 0 1 0 1 0 1 1 −1 1 C B C C B eigenvectors: B @ −1 A, @ 0 A, @ 2 A. 1 1 1 √

√ √ (d) Eigenvalues: 1 + i 2, 1 − i 2; (e) Eigenvalues: 4, 3, 1;

!

1 −1 . , 1 1 ! ! 1 4 . , 1 3

0

1

0

√3 √ 2 √3 2

−1 + B

i

√ ! 2 . 1

1 0

√ √3 √ 2 √3 2

−1 − B

2 C √ √ C B C B C, B Eigenvalues: 1, 6, − 6; eigenvectors: B @ 0 A, B A @ 2− @ 2+ 1 1 1 0 1 0 1 0 3 3 − 2i 3 + 2i C B C B Eigenvalues: 0, 1 + i , 1 − i ; eigenvectors: B @ 1 A, @ 3 − i A, @ 3 + i 01 1 1 0 1 0 1 −1 C B C Eigenvalues: 2, 0; eigenvectors: B @ −1 A, @ −3 A. 1 1 1 0 2 C −1 is a simple eigenvalue, with eigenvector B @ −1 A; 1 1 0 1 0 1 − 23 3 C B C 2 is a double eigenvalue, with eigenvectors B @ 0 A, @ 1 A. 1 0 1 1 0 0 0 1 C C B B B −1 C B 0 C C; C, B −1 is a double eigenvalue, with eigenvectors B @ 0 A @ −3 A 2 0


1

C A.

1

C C C. A

7 is also a double eigenvalue, with eigenvectors

(k) Eigenvalues: 1, 2, 3, 4;

0

1

1 0

0

1

B C B C B1C B0C B C, B C. @0A @1A

0 2 1 0 1 0 1 0 1 0 0 0 1 B C B C B C B C 0 0 1 C B C B C B1C B C, B C, B C, B C. eigenvectors: B @0A @1A @1A @0A 1 1 0 0 0

8.2.2. (a) The eigenvalues are $e^{\pm i \theta} = \cos\theta \pm i \sin\theta$ with eigenvectors $( 1, \mp i )^T$. They are real only for $\theta = 0$ and $\pi$. (b) Because $R_\theta - a I$ has an inverse if and only if $a$ is not an eigenvalue.

8.2.3. The eigenvalues are $\pm 1$ with eigenvectors $( \sin\theta, \pm 1 - \cos\theta )^T$.

8.2.4. (a) $O$, and (b) $-I$, are trivial examples.

8.2.5. (a) The characteristic equation is $-\lambda^3 + \gamma \lambda^2 + \beta \lambda + \alpha = 0$. (b) For example, $\begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ c & b & a \end{pmatrix}$.

8.2.6. The eigenvalues are $0$ and $\pm i \sqrt{a^2 + b^2 + c^2}$. If $a = b = c = 0$ then $A = O$ and all vectors are eigenvectors. Otherwise, the eigenvectors are $\begin{pmatrix} a \\ b \\ c \end{pmatrix}$, $\begin{pmatrix} b \\ -a \\ 0 \end{pmatrix} \mp \dfrac{i}{\sqrt{a^2 + b^2 + c^2}} \begin{pmatrix} -a c \\ -b c \\ a^2 + b^2 \end{pmatrix}$.

8.2.7.

!

!

−1 1 . , 1 0 ! √ √ i (2 ± 5) . (b) Eigenvalues: ± 5; eigenvectors: 1! ! 1 3 −1 + i 5 5 . , (c) Eigenvalues: −3, 2 i ; eigenvectors: 1 1 0 1 −1 C (d) −2 is a simple eigenvalue with eigenvector B @ −2 A; 1 0 1 0 −1 + i 1+ i B C B 0 i is a double eigenvalue with eigenvectors @ A, @ 1 1 0 (a) Eigenvalues: i , −1 + i ;

eigenvectors:

1

C A.

8.2.8. (a) Since $O v = 0 = 0\, v$, we conclude that 0 is the only eigenvalue; all nonzero vectors $v \neq 0$ are eigenvectors. (b) Since $I v = v = 1\, v$, we conclude that 1 is the only eigenvalue; all nonzero vectors $v \neq 0$ are eigenvectors.

8.2.9. For $n = 2$, the eigenvalues are 0, 2, and the eigenvectors are $\begin{pmatrix} -1 \\ 1 \end{pmatrix}$ and $\begin{pmatrix} 1 \\ 1 \end{pmatrix}$. For $n = 3$, the eigenvalues are 0, 0, 3, and the eigenvectors are $\begin{pmatrix} -1 \\ 0 \\ 1 \end{pmatrix}$, $\begin{pmatrix} -1 \\ 1 \\ 0 \end{pmatrix}$, and $\begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix}$. In general, the eigenvalues are 0, with multiplicity $n - 1$, and $n$, which is simple. The eigenvectors corresponding to the eigenvalue 0 are all nonzero vectors of the form $( v_1, v_2, \ldots, v_n )^T$ with $v_1 + \cdots + v_n = 0$. The eigenvectors corresponding to the eigenvalue $n$ are all nonzero vectors of the form $( v_1, v_2, \ldots, v_n )^T$ with $v_1 = \cdots = v_n$.
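The eigenvectors listed in 8.2.9 are consistent with the $n \times n$ matrix whose entries are all 1, which is what the sketch below assumes. For that matrix every entry of $A v$ equals the sum of the entries of $v$, so the check needs no matrix storage; `ones_matvec` is an illustrative name.

```python
def ones_matvec(v):
    """Apply the all-ones matrix: each entry of the result is sum(v)."""
    s = sum(v)
    return [s] * len(v)

def is_eigenpair(lam, v):
    """Check A v = lam * v for the all-ones matrix A."""
    return ones_matvec(v) == [lam * x for x in v]
```

Vectors with zero entry sum are sent to 0, and constant vectors are multiplied by $n$, matching the general statement above.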


♦ 8.2.10. (a) If $A v = \lambda v$, then $A(c v) = c A v = c \lambda v = \lambda (c v)$ and so $c v$ satisfies the eigenvector equation for the eigenvalue $\lambda$. Moreover, since $v \neq 0$, also $c v \neq 0$ for $c \neq 0$, and so $c v$ is a bona fide eigenvector. (b) If $A v = \lambda v$, $A w = \lambda w$, then $A(c v + d w) = c A v + d A w = c \lambda v + d \lambda w = \lambda (c v + d w)$. (c) Suppose $A v = \lambda v$, $A w = \mu w$. Then $v$ and $w$ must be linearly independent as otherwise they would be scalar multiples of each other and hence have the same eigenvalue. Thus, $A(c v + d w) = c A v + d A w = c \lambda v + d \mu w = \nu (c v + d w)$ if and only if $c \lambda = c \nu$ and $d \mu = d \nu$, which, when $\lambda \neq \mu$, is only possible when either $c = 0$ or $d = 0$.

8.2.11. True: by the same computation as in Exercise 8.2.10(a), $c v$ is an eigenvector for the same (real) eigenvalue $\lambda$.

♦ 8.2.12. Write $w = x + i y$. Then, since $\lambda$ is real, the real and imaginary parts of the eigenvector equation $A w = \lambda w$ are $A x = \lambda x$, $A y = \lambda y$, and hence $x, y$ are real eigenvectors of $A$. Thus $x = a_1 v_1 + \cdots + a_k v_k$, $y = b_1 v_1 + \cdots + b_k v_k$ for $a_1, \ldots, a_k, b_1, \ldots, b_k \in \mathbb{R}$, and hence $w = c_1 v_1 + \cdots + c_k v_k$ where $c_j = a_j + i\, b_j$.

♦ 8.2.13. (a) $A = \begin{pmatrix} 0 & 1 & 0 & \cdots & 0 \\ 0 & 0 & 1 & \cdots & 0 \\ \vdots & & \ddots & \ddots & \vdots \\ 0 & 0 & \cdots & 0 & 1 \\ 1 & 0 & \cdots & 0 & 0 \end{pmatrix}$. (b) $A^T A = I$ by direct computation, or, equivalently, note that the columns of $A$ are the standard orthonormal basis vectors $e_n, e_1, e_2, \ldots, e_{n-1}$, written in a slightly different order. (c) Since $\omega_k = \left( 1, e^{2 k \pi i/n}, e^{4 k \pi i/n}, \ldots, e^{2 (n-1) k \pi i/n} \right)^T$, we have $S\, \omega_k = \left( e^{2 k \pi i/n}, e^{4 k \pi i/n}, \ldots, e^{2 (n-1) k \pi i/n}, 1 \right)^T = e^{2 k \pi i/n}\, \omega_k$, so $\omega_k$ is an eigenvector with corresponding eigenvalue $e^{2 k \pi i/n}$ for each $k = 0, \ldots, n - 1$.

8.2.14. (a) Eigenvalues: $-3, 1, 5$; eigenvectors: $( 2, -3, 1 )^T$, $\left( -\frac23, -1, 1 \right)^T$, $( 2, 1, 1 )^T$; (b) $\operatorname{tr} A = 3 = -3 + 1 + 5$, (c) $\det A = -15 = (-3) \cdot 1 \cdot 5$.

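8.2.15.
(a) $\operatorname{tr} A = 2 = 3 + (-1)$; $\det A = -3 = 3 \cdot (-1)$.
(b) $\operatorname{tr} A = \frac56 = \frac12 + \frac13$; $\det A = \frac16 = \frac12 \cdot \frac13$.
(c) $\operatorname{tr} A = 4 = 2 + 2$; $\det A = 4 = 2 \cdot 2$.
(d) $\operatorname{tr} A = 2 = (1 + i \sqrt2) + (1 - i \sqrt2)$; $\det A = 3 = (1 + i \sqrt2) \cdot (1 - i \sqrt2)$.
(e) $\operatorname{tr} A = 8 = 4 + 3 + 1$; $\det A = 12 = 4 \cdot 3 \cdot 1$.
(f) $\operatorname{tr} A = 1 = 1 + \sqrt6 + (-\sqrt6)$; $\det A = -6 = 1 \cdot \sqrt6 \cdot (-\sqrt6)$.
(g) $\operatorname{tr} A = 2 = 0 + (1 + i) + (1 - i)$; $\det A = 0 = 0 \cdot (1 + i) \cdot (1 - i)$.
(h) $\operatorname{tr} A = 4 = 2 + 2 + 0$; $\det A = 0 = 2 \cdot 2 \cdot 0$.
(i) $\operatorname{tr} A = 3 = (-1) + 2 + 2$; $\det A = -4 = (-1) \cdot 2 \cdot 2$.
(j) $\operatorname{tr} A = 12 = (-1) + (-1) + 7 + 7$; $\det A = 49 = (-1) \cdot (-1) \cdot 7 \cdot 7$.
(k) $\operatorname{tr} A = 10 = 1 + 2 + 3 + 4$; $\det A = 24 = 1 \cdot 2 \cdot 3 \cdot 4$.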
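Each line of 8.2.15 is an instance of the rule that the trace is the sum and the determinant the product of the eigenvalues, counted with multiplicity. A short sketch that recomputes a few of the cases from the eigenvalue lists of Exercise 8.2.1 (the dictionary keys and function name are illustrative):

```python
import math
from functools import reduce
from operator import mul

def trace_and_det(eigenvalues):
    """Sum and product of a list of eigenvalues (with multiplicity)."""
    return sum(eigenvalues), reduce(mul, eigenvalues, 1)

s2 = math.sqrt(2)
cases = {
    "a": [3, -1],
    "d": [complex(1, s2), complex(1, -s2)],
    "g": [0, 1 + 1j, 1 - 1j],
    "j": [-1, -1, 7, 7],
}
```

The complex cases (d) and (g) come out real up to rounding, as they must for a real matrix whose complex eigenvalues occur in conjugate pairs.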


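8.2.16. (a) $a = a_{11} + a_{22} + a_{33} = \operatorname{tr} A$, $b = a_{11} a_{22} - a_{12} a_{21} + a_{11} a_{33} - a_{13} a_{31} + a_{22} a_{33} - a_{23} a_{32}$, $c = a_{11} a_{22} a_{33} + a_{12} a_{23} a_{31} + a_{13} a_{21} a_{32} - a_{11} a_{23} a_{32} - a_{12} a_{21} a_{33} - a_{13} a_{22} a_{31} = \det A$.

(b) When the factored form of the characteristic polynomial is multiplied out, we obtain $-(\lambda - \lambda_1)(\lambda - \lambda_2)(\lambda - \lambda_3) = -\lambda^3 + (\lambda_1 + \lambda_2 + \lambda_3) \lambda^2 - (\lambda_1 \lambda_2 + \lambda_1 \lambda_3 + \lambda_2 \lambda_3) \lambda + \lambda_1 \lambda_2 \lambda_3$, giving the eigenvalue formulas for $a, b, c$.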

8.2.17. If $U$ is upper triangular, so is $U - \lambda I$, and hence $p(\lambda) = \det(U - \lambda I)$ is the product of the diagonal entries, so $p(\lambda) = \prod_i (u_{ii} - \lambda)$. Thus, the roots of the characteristic equation are $u_{11}, \ldots, u_{nn}$, the diagonal entries of $U$.

♦ 8.2.18. Since $J_a - \lambda I$ is an upper triangular matrix with $a - \lambda$ on the diagonal, its determinant is $\det(J_a - \lambda I) = (a - \lambda)^n$ and hence its only eigenvalue is $\lambda = a$, of multiplicity $n$. (Or use Exercise 8.2.17.) Moreover, $(J_a - a I) v = ( v_2, v_3, \ldots, v_n, 0 )^T = 0$ if and only if $v = c\, e_1$.

♦ 8.2.19. Parts (a,b) are special cases of part (c): If $A v = \lambda v$ then $B v = (c A + d I) v = (c \lambda + d) v$.

8.2.20. If $A v = \lambda v$ then $A^2 v = \lambda A v = \lambda^2 v$, and hence $v$ is also an eigenvector of $A^2$ with eigenvalue $\lambda^2$.

8.2.21. (a) False. For example, 0 is an eigenvalue of both $\begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}$ and $\begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix}$, but the eigenvalues of $A + B = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$ are $\pm 1$. (b) True. If $A v = \lambda v$ and $B v = \mu v$, then $(A + B) v = (\lambda + \mu) v$, and so $v$ is an eigenvector with eigenvalue $\lambda + \mu$.

8.2.22. False in general, but true if the eigenvectors coincide: If $A v = \lambda v$ and $B v = \mu v$, then $A B v = (\lambda \mu) v$, and so $v$ is an eigenvector with eigenvalue $\lambda \mu$.

♦ 8.2.23. If $A B v = \lambda v$, then $B A w = \lambda w$, where $w = B v$. Thus, as long as $w \neq 0$, it is an eigenvector of $B A$ with eigenvalue $\lambda$. However, if $w = 0$, then $A B v = 0$, and so the eigenvalue is $\lambda = 0$, which implies that $A B$ is singular. But then so is $B A$, which also has 0 as an eigenvalue. Thus every eigenvalue of $A B$ is an eigenvalue of $B A$. The converse follows by the same reasoning. Note: This does not imply that their null eigenspaces (kernels) have the same dimension; compare Exercise 1.8.18. In anticipation of Section 8.6, even though $A B$ and $B A$ have the same eigenvalues, they may have different Jordan canonical forms.

♦ 8.2.24. (a) Starting with $A v = \lambda v$, multiply both sides by $A^{-1}$ and divide by $\lambda$ to obtain $A^{-1} v = (1/\lambda)\, v$. Therefore, $v$ is an eigenvector of $A^{-1}$ with eigenvalue $1/\lambda$. (b) If 0 is an eigenvalue, then $A$ is not invertible.

♦ 8.2.25. (a) If all $| \lambda_j | \leq 1$ then so is their product $1 \geq | \lambda_1 \cdots \lambda_n | = | \det A |$, which is a contradiction. (b) False. $A = \begin{pmatrix} 2 & 0 \\ 0 & \frac13 \end{pmatrix}$ has eigenvalues $2, \frac13$ while $\det A = \frac23$.

8.2.26. Recall that $A$ is singular if and only if $\ker A \neq \{0\}$. Any $v \in \ker A$ satisfies $A v = 0 = 0\, v$. Thus $\ker A$ is nonzero if and only if $A$ has a null eigenvector.

8.2.27. Let $v, w$ be any two linearly independent vectors. Then $A v = \lambda v$ and $A w = \mu w$ for some $\lambda, \mu$. But $v + w$ is an eigenvector if and only if $A(v + w) = \lambda v + \mu w = \nu (v + w)$, which requires $\lambda = \mu = \nu$. Thus, $A v = \lambda v$ for every $v$, which implies $A = \lambda I$.

8.2.28. If $\lambda$ is a simple real eigenvalue, then there are two real unit eigenvectors: $u$ and $-u$. For a complex eigenvalue, if $u$ is a unit complex eigenvector, so is $e^{i \theta} u$, and so there are infinitely many complex unit eigenvectors. (The same holds for a real eigenvalue if we also allow complex eigenvectors.) If $\lambda$ is a multiple real eigenvalue, with eigenspace of dimension greater than 1, then there are infinitely many unit real eigenvectors in the eigenspace.

8.2.29. All false. Simple $2 \times 2$ examples suffice to disprove them: (a) Start with $\begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}$, which has eigenvalues $i, -i$; $\begin{pmatrix} 0 & -1 \\ 1 & -2 \end{pmatrix}$ has eigenvalue $-1$; (b) $\begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}$ has eigenvalues $1, -1$; (c) $\begin{pmatrix} 0 & -4 \\ 1 & 0 \end{pmatrix}$ has eigenvalues $2 i, -2 i$.

8.2.30. False. The eigenvalue equation $A v = \lambda v$ is not linear in the eigenvalue and eigenvector since $A(v_1 + v_2) \neq (\lambda_1 + \lambda_2)(v_1 + v_2)$ in general.

8.2.31.

!

!

0

!

1

24 7 0 25 − 25 A. (a) (i) Q = . (ii) Q = @ 7 1 − 24 25 − 25 0 1 0 1 0 1 1 0 0 3 4 C Eigenvalues −1, 1; eigenvectors @ 54 A, @ 53 A. (iii) Q = B @ 0 −1 0 A. Eigenvalue −1 − 0 0 1 5 5 0 1 0 1 0 1 0 1 1 0 0 0 0 1 C B C B C B C has eigenvector: B @ 1 A; eigenvalue 1 has eigenvectors: @ 0 A, @ 0 A. (iv ) Q = @ 0 1 0 A. 0 1 1 0 0 0 0 1 0 1 0 1 −1 1 0 C B C B C Eigenvalue 1 has eigenvector: B @ 0 A; eigenvalue −1 has eigenvectors: @ 0 A, @ 1 A. 1 0 1 (b) u is an eigenvector with eigenvalue −1. All vectors orthogonal to u are eigenvectors with eigenvalue +1.

−1 0

0 . Eigenvalues −1, 1; eigenvectors 1

1 , 0

♦ 8.2.32. (a) $\det(B - \lambda I) = \det(S^{-1} A S - \lambda I) = \det\left[ S^{-1} (A - \lambda I) S \right] = \det S^{-1} \det(A - \lambda I) \det S = \det(A - \lambda I)$. (b) The eigenvalues are the roots of the common characteristic equation. (c) Not usually. If $w$ is an eigenvector of $B$, then $v = S w$ is an eigenvector of $A$ and conversely. (d) Both have 2 as a double eigenvalue. Suppose $\begin{pmatrix} 1 & 1 \\ -1 & 3 \end{pmatrix} = S^{-1} \begin{pmatrix} 2 & 0 \\ 0 & 2 \end{pmatrix} S$, or, equivalently, $S \begin{pmatrix} 1 & 1 \\ -1 & 3 \end{pmatrix} = \begin{pmatrix} 2 & 0 \\ 0 & 2 \end{pmatrix} S$ for some $S = \begin{pmatrix} x & y \\ z & w \end{pmatrix}$. Then, equating entries, we must have $x - y = 2 x$, $x + 3 y = 2 y$, $z - w = 2 z$, $z + 3 w = 2 w$, which implies $y = -x$ and $w = -z$, so $\det S = x w - y z = 0$, and $S$ is not invertible.

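8.2.33. (a) $p_{A^{-1}}(\lambda) = \det(A^{-1} - \lambda I) = \det\left[ \lambda A^{-1} \left( \dfrac{1}{\lambda} I - A \right) \right] = \dfrac{(-\lambda)^n}{\det A}\; p_A\!\left( \dfrac{1}{\lambda} \right)$. Or, equivalently, if $p_A(\lambda) = (-1)^n \lambda^n + c_{n-1} \lambda^{n-1} + \cdots + c_1 \lambda + c_0$, then, since $c_0 = \det A \neq 0$, $p_{A^{-1}}(\lambda) = (-1)^n \left[ \lambda^n + \dfrac{c_1}{c_0} \lambda^{n-1} + \cdots + \dfrac{c_{n-1}}{c_0} \lambda \right] + \dfrac{1}{c_0} = \dfrac{(-\lambda)^n}{c_0} \left[ \left( -\dfrac{1}{\lambda} \right)^{n} + c_{n-1} \left( \dfrac{1}{\lambda} \right)^{n-1} + \cdots + c_1 \dfrac{1}{\lambda} + c_0 \right] = \dfrac{(-\lambda)^n}{\det A}\; p_A\!\left( \dfrac{1}{\lambda} \right)$.

(b) (i) $A^{-1} = \begin{pmatrix} -2 & 1 \\ \frac32 & -\frac12 \end{pmatrix}$. Then $p_A(\lambda) = \lambda^2 - 5 \lambda - 2$, while $p_{A^{-1}}(\lambda) = \lambda^2 + \frac52 \lambda - \frac12 = -\dfrac{\lambda^2}{2} \left( \dfrac{1}{\lambda^2} - \dfrac{5}{\lambda} - 2 \right)$.

(ii) $A^{-1} = \begin{pmatrix} -\frac35 & -\frac45 & \frac45 \\ \frac65 & \frac35 & -\frac85 \\ -\frac45 & -\frac25 & \frac75 \end{pmatrix}$. Then $p_A(\lambda) = -\lambda^3 + 3 \lambda^2 - 7 \lambda + 5$, while $p_{A^{-1}}(\lambda) = -\lambda^3 + \frac75 \lambda^2 - \frac35 \lambda + \frac15 = \dfrac{-\lambda^3}{5} \left( -\dfrac{1}{\lambda^3} + \dfrac{3}{\lambda^2} - \dfrac{7}{\lambda} + 5 \right)$.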

♥ 8.2.34. k (a) If A v = λ v!then 0 = Ak v = λ! v and hence λk = 0, so λ = 0. 0 0 0 1 ; has A2 = (b) A = 0 0 0 0 0 1 0 0 1 1 0 0 1 0 1 1 0 0 0 B B C C C 3 2 A=B @ 0 0 1 A has A = @ 0 0 0 A, A = @ 0 0 0 A. 0 0 0 0 0 0 0 0 0 In general, A can be any upper triangular matrix with all zero entries on the diagonal, and all nonzero entries on the super-diagonal. ♥ 8.2.35. (a) det(AT − λ I ) = det(A − λ I )T = det(A − λ I ), and hence A and AT have the same characteristic polynomial, which implies that they have the same eigenvalues. (b) No. See the examples. (c) λ v · w = (A v)T w = vT AT w = µ v · w, so if µ 6= λ, v · w = 0 and the vectors are orthogonal. ! ! 1 −1 − 2 ; the (d) (i) The eigenvalues are 1, 2; the eigenvectors of A are v1 = , v2 = 1 1 ! ! 2 1 eigenvectors of AT are w1 = , w2 = , and v1 , w2 are orthogonal, as are 1 1 v2 , w1 . 0 1 0 1 1 1 C B C (ii) The eigenvalues are 1, −1, −2; the eigenvectors of A are v1 = B @ 1 A, v2 = @ 2 A, v3 = 0 1 0 0 0 1 1 1 0 1 3 2 1 0 B B B C C C B1C T @ 2 A; the eigenvectors of A are w1 = @ −2 A, w2 = @ −2 A, w3 = @ −1 A. Note 1 1 1 1 that vi is orthogonal to wj whenever i 6= j. 8.2.36. (a) The characteristic equation of a 3 × 3 1 matrix is a real cubic polynomial, and hence has at 0 0 1 0 0 B −1 0 0 0C C C has eigenvalues ± i . (c) No, since the characterisB least one real root. (b) B @ 0 0 0 1A 0 0 −1 0 tic polynomial is degree 5 and hence has at least one real root. 8.2.37. (a) If A v = λ v, 0 then v = 1 0 B B 0 −1 4 satisfy λ = 1. (b) B @0 0 0 0

A4 v = 1 λ4 v, and hence, since v 6= 0, all its eigenvalues must 0 0 0 0C C C. 0 −1 A 1 0 212

8.2.38. If $P v = \lambda v$ then $P^2 v = \lambda^2 v$. Since $P v = P^2 v$, we find $\lambda v = \lambda^2 v$. Since $v \neq 0$, it follows that $\lambda^2 = \lambda$, so the only eigenvalues are $\lambda = 0, 1$. All $v \in \operatorname{rng} P$ are eigenvectors with eigenvalue 1 since if $v = P u$, then $P v = P^2 u = P u = v$, whereas all $w \in \ker P$ are null eigenvectors.

8.2.39. False. For example, $\begin{pmatrix} 0 & 0 & 1 \\ 1 & 0 & 0 \\ 0 & 1 & 0 \end{pmatrix}$ has eigenvalues $1, -\frac12 \pm \frac{\sqrt3}{2}\, i$.
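The eigenvalues quoted in 8.2.39 are the three cube roots of unity, as expected for a cyclic permutation matrix. The sketch below checks this numerically; the eigenvector $(1, \lambda^2, \lambda)^T$ is not taken from the solution but worked out here from $P (x, y, z)^T = (z, x, y)^T$.

```python
import cmath

def shift(v):
    """Apply the permutation matrix of 8.2.39: (x, y, z) -> (z, x, y)."""
    x, y, z = v
    return (z, x, y)

def residual(k):
    """Max entry of P v - lam v for lam = exp(2 pi i k / 3), v = (1, lam^2, lam)."""
    lam = cmath.exp(2j * cmath.pi * k / 3)
    v = (1, lam ** 2, lam)
    return max(abs(a - lam * b) for a, b in zip(shift(v), v))
```

The residual is at rounding level for $k = 0, 1, 2$, and $k = 1, 2$ give the complex pair $-\frac12 \pm \frac{\sqrt3}{2}\, i$.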

8.2.40. (a) According to Exercise 1.2.29, if $z = ( 1, 1, \ldots, 1 )^T$, then $A z$ is the vector of row sums of $A$, and hence, by the assumption, $A z = z$. Thus, $z$ is an eigenvector with eigenvalue 1. (b) Yes, since the column sums of $A$ are the row sums of $A^T$, and Exercise 8.2.35 says that $A$ and $A^T$ have the same eigenvalues.

8.2.41. (a) If $Q v = \lambda v$, then $Q^T v = Q^{-1} v = \lambda^{-1} v$ and so $\lambda^{-1}$ is an eigenvalue of $Q^T$. Furthermore, Exercise 8.2.35 says that a matrix and its transpose have the same eigenvalues. (b) If $Q v = \lambda v$, then, by Exercise 5.3.16, $\| v \| = \| Q v \| = | \lambda |\, \| v \|$, and hence $| \lambda | = 1$. Note that this proof also applies to complex eigenvalues/eigenvectors, with $\| \cdot \|$ denoting the Hermitian norm in $\mathbb{C}^n$. (c) Let $\lambda = e^{i \theta}$ be the eigenvalue. Then $e^{i \theta} v^T v = (Q v)^T v = v^T Q^T v = v^T Q^{-1} v = e^{-i \theta} v^T v$. Thus, if $e^{i \theta} \neq e^{-i \theta}$, which happens if and only if it is not real, then $0 = v^T v = \| x \|^2 - \| y \|^2 + 2 i\, x \cdot y$, and so the result follows from taking real and imaginary parts of this equation.

♦ 8.2.42. (a) According to Exercise 8.2.36, a $3 \times 3$ orthogonal matrix has at least one real eigenvalue, which by Exercise 8.2.41 must be $\pm 1$. If the other two eigenvalues are complex conjugate, $\mu \pm i \nu$, then the product of the eigenvalues is $\pm (\mu^2 + \nu^2)$. Since this must equal the determinant of $Q$, which by assumption, is positive, we conclude that the real eigenvalue must be $+1$. Otherwise, all the eigenvalues of $Q$ are real, and they cannot all equal $-1$ as otherwise its determinant would be negative. (b) True. It must either have three real eigenvalues of $\pm 1$, of which at least one must be $-1$ as otherwise its determinant would be $+1$, or a complex conjugate pair of eigenvalues $\lambda, \overline{\lambda}$, and its determinant is $-1 = \pm | \lambda |^2$, so its real eigenvalue must be $-1$ and its complex eigenvalues $e^{\pm i \theta}$.

♦ 8.2.43. (a) The axis of the rotation is the eigenvector $v$ corresponding to the eigenvalue $+1$. Since $Q v = v$, the rotation fixes the axis, and hence must rotate around it. Choose an orthonormal basis $u_1, u_2, u_3$, where $u_1$ is a unit eigenvector in the direction of the axis of rotation, while $u_2 + i\, u_3$ is a complex eigenvector for the eigenvalue $e^{i \theta}$. In this basis, $Q$ has matrix form $\begin{pmatrix} 1 & 0 & 0 \\ 0 & \cos\theta & -\sin\theta \\ 0 & \sin\theta & \cos\theta \end{pmatrix}$, where $\theta$ is the angle of rotation. (b) The axis is the eigenvector $\begin{pmatrix} 2 \\ -5 \\ 1 \end{pmatrix}$ for the eigenvalue 1. The complex eigenvalue is $\frac{7}{13} + i\, \frac{2 \sqrt{30}}{13}$, and so the angle is $\theta = \cos^{-1} \frac{7}{13} \approx 1.00219$.

8.2.44. In general, besides the trivial invariant subspaces $\{0\}$ and $\mathbb{R}^3$, the axis of rotation and its orthogonal complement plane are invariant. If the rotation is by $180^\circ$, then any line in the orthogonal complement plane, as well as any plane spanned by such a line and the axis of rotation are also invariant. If $R = I$, then every subspace is invariant.

8.2.45. (a) $(Q - I)^T (Q - I) = Q^T Q - Q - Q^T + I = 2 I - Q - Q^T = K$ and hence $K$ is a Gram matrix, which is positive semi-definite by Theorem 3.28. (b) The Gram matrix is positive definite if and only if $\ker(Q - I) = \{0\}$, which means that $Q$ does not have an eigenvalue of 1.

♦ 8.2.46. If $Q = I$, then we have a translation. Otherwise, we decompose $b = c + d$, where $c \in \operatorname{rng}(Q - I)$ while $d \in \operatorname{coker}(Q - I) = \ker(Q^T - I)$. Thus, $c = (I - Q) a$, while $Q^T d = d$, and so $d = Q d$, so $d$ belongs to the axis of the rotation represented by $Q$. Thus, referring to (7.41), $F(x) = Q(x - a) + a + d$ represents either a rotation around the center point $a$, when $d = 0$, or a screw around the line in the direction of the axis of $Q$ passing through the point $a$, when $d \neq 0$.

♥ 8.2.47.

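(a) $M_2 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$: eigenvalues $1, -1$; eigenvectors $\begin{pmatrix} 1 \\ 1 \end{pmatrix}$, $\begin{pmatrix} -1 \\ 1 \end{pmatrix}$; $M_3 = \begin{pmatrix} 0 & 1 & 0 \\ 1 & 0 & 1 \\ 0 & 1 & 0 \end{pmatrix}$: eigenvalues $-\sqrt2, 0, \sqrt2$; eigenvectors $\begin{pmatrix} 1 \\ -\sqrt2 \\ 1 \end{pmatrix}$, $\begin{pmatrix} -1 \\ 0 \\ 1 \end{pmatrix}$, $\begin{pmatrix} 1 \\ \sqrt2 \\ 1 \end{pmatrix}$.
(b) The $j$th entry of the eigenvalue equation $M_n v_k = \lambda_k v_k$ reads $\sin \dfrac{(j - 1) k \pi}{n + 1} + \sin \dfrac{(j + 1) k \pi}{n + 1} = 2 \cos \dfrac{k \pi}{n + 1}\, \sin \dfrac{j k \pi}{n + 1}$, which follows from the trigonometric identity $\sin \alpha + \sin \beta = 2 \cos \dfrac{\alpha - \beta}{2}\, \sin \dfrac{\alpha + \beta}{2}$. These are all the eigenvalues because an $n \times n$ matrix has at most $n$ distinct eigenvalues.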
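The eigenvalue equation in 8.2.47(b) can be checked numerically for small $n$. The sketch assumes, consistently with $M_2$ and $M_3$ above, that $M_n$ has 1's on the sub- and super-diagonals and 0's elsewhere, and that the $j$th entry of $v_k$ is $\sin \frac{j k \pi}{n+1}$; the function names are illustrative.

```python
import math

def apply_M(v):
    """Apply M_n: entry j of the result is v[j-1] + v[j+1], with zero padding."""
    n = len(v)
    return [
        (v[j - 1] if j > 0 else 0.0) + (v[j + 1] if j < n - 1 else 0.0)
        for j in range(n)
    ]

def eigenpair(n, k):
    """Candidate eigenvalue 2 cos(k pi/(n+1)) and eigenvector (sin(j k pi/(n+1)))_j."""
    lam = 2 * math.cos(k * math.pi / (n + 1))
    v = [math.sin(j * k * math.pi / (n + 1)) for j in range(1, n + 1)]
    return lam, v

def residual(n, k):
    """Largest entry of M_n v_k - lam_k v_k in absolute value."""
    lam, v = eigenpair(n, k)
    return max(abs(a - lam * b) for a, b in zip(apply_M(v), v))
```

For every $n$ and $k = 1, \ldots, n$ the residual stays at rounding level, in line with the trigonometric identity used in the solution.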

♦ 8.2.48. We have $A = a I + b M_n$, so by Exercises 8.2.19 and 8.2.47 it has the same eigenvectors as $M_n$, while its corresponding eigenvalues are $a + b \lambda_k = a + 2 b \cos \dfrac{k \pi}{n + 1}$ for $k = 1, \ldots, n$.

♥ 8.2.49. For $k = 1, \ldots, n$, $\lambda_k = 2 \cos \dfrac{2 k \pi}{n}$, $v_k = \left( \cos \dfrac{2 k \pi}{n}, \cos \dfrac{4 k \pi}{n}, \cos \dfrac{6 k \pi}{n}, \ldots, \cos \dfrac{2 (n - 1) k \pi}{n}, 1 \right)^T$.

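8.2.50. Note first that if $A v = \lambda v$, then $D \begin{pmatrix} v \\ 0 \end{pmatrix} = \begin{pmatrix} A v \\ 0 \end{pmatrix} = \lambda \begin{pmatrix} v \\ 0 \end{pmatrix}$, and so $\begin{pmatrix} v \\ 0 \end{pmatrix}$ is an eigenvector for $D$ with eigenvalue $\lambda$. Similarly, each eigenvalue $\mu$ and eigenvector $w$ of $B$ gives an eigenvector $\begin{pmatrix} 0 \\ w \end{pmatrix}$ of $D$. Finally, to check that $D$ has no other eigenvalue, we compute $D \begin{pmatrix} v \\ w \end{pmatrix} = \begin{pmatrix} A v \\ B w \end{pmatrix} = \begin{pmatrix} \lambda v \\ \lambda w \end{pmatrix} = \lambda \begin{pmatrix} v \\ w \end{pmatrix}$ and hence, if $v \neq 0$, then $\lambda$ is an eigenvalue of $A$, while if $w \neq 0$, then it must also be an eigenvalue for $B$.

♥ 8.2.51. (a) Follows by direct computation: $p_A(A) = \begin{pmatrix} a^2 + b c & a b + b d \\ a c + c d & b c + d^2 \end{pmatrix} - (a + d) \begin{pmatrix} a & b \\ c & d \end{pmatrix} + (a d - b c) \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} = \begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix}$. (b) By part (a), $O = A^{-1} p_A(A) = A - (\operatorname{tr} A)\, I + (\det A)\, A^{-1}$, and the formula follows upon solving for $A^{-1}$. (c) $\operatorname{tr} A = 4$, $\det A = 7$ and one checks $A^2 - 4 A + 7 I = O$.

♥ 8.2.52.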

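(a) $B v = (A - v\, b^T) v = A v - (b \cdot v) v = (\lambda - \beta) v$.
(b) $B (w + c v) = (A - v\, b^T)(w + c v) = \mu w + (c (\lambda - \beta) - b \cdot w) v = \mu (w + c v)$ provided $c = b \cdot w/(\lambda - \beta - \mu)$.
(c) Set $B = A - \lambda_1 v_1 b^T$ where $v_1$ is the first eigenvector of $A$ and $b$ is any vector such that $b \cdot v_1 = 1$. For example, we can set $b = v_1/\| v_1 \|^2$. (Wielandt deflation, [ 10 ], chooses $b = r_j/(\lambda_1 v_{1,j})$ where $v_{1,j}$ is any nonzero entry of $v_1$ and $r_j$ is the corresponding row of $A$.)
(d) (i) The eigenvalues of $A$ are 6, 2 and the eigenvectors $\begin{pmatrix} 1 \\ 1 \end{pmatrix}$, $\begin{pmatrix} -3 \\ 1 \end{pmatrix}$. The deflated matrix $B = A - \dfrac{\lambda_1 v_1 v_1^T}{\| v_1 \|^2} = \begin{pmatrix} 0 & 0 \\ -2 & 2 \end{pmatrix}$ has eigenvalues 0, 2 and eigenvectors $\begin{pmatrix} 1 \\ 1 \end{pmatrix}$, $\begin{pmatrix} 0 \\ 1 \end{pmatrix}$.
(ii) The eigenvalues of $A$ are 4, 3, 1 and the eigenvectors $\begin{pmatrix} 1 \\ -1 \\ 1 \end{pmatrix}$, $\begin{pmatrix} -1 \\ 0 \\ 1 \end{pmatrix}$, $\begin{pmatrix} 1 \\ 2 \\ 1 \end{pmatrix}$. The deflated matrix $B = A - \dfrac{\lambda_1 v_1 v_1^T}{\| v_1 \|^2} = \begin{pmatrix} \frac53 & \frac13 & -\frac43 \\ \frac13 & \frac23 & \frac13 \\ -\frac43 & \frac13 & \frac53 \end{pmatrix}$ has eigenvalues 0, 3, 1 and eigenvectors $\begin{pmatrix} 1 \\ -1 \\ 1 \end{pmatrix}$, $\begin{pmatrix} -1 \\ 0 \\ 1 \end{pmatrix}$, $\begin{pmatrix} 1 \\ 2 \\ 1 \end{pmatrix}$.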
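The deflation of part (d)(i) can be reproduced in exact arithmetic. One caveat: the matrix $A = \begin{pmatrix} 3 & 3 \\ 1 & 5 \end{pmatrix}$ used below is not quoted in the solution. It is reconstructed here from the stated eigenvalues 6, 2 and eigenvectors $(1, 1)^T$, $(-3, 1)^T$, so treat it as an assumption. The function name `deflate` is illustrative.

```python
from fractions import Fraction as F

def deflate(A, lam, v):
    """Return B = A - lam * v v^T / ||v||^2 for an integer matrix A and vector v."""
    norm2 = sum(x * x for x in v)
    n = len(A)
    return [
        [F(A[i][j]) - F(lam * v[i] * v[j], norm2) for j in range(n)]
        for i in range(n)
    ]

def matvec(B, v):
    return [sum(b * x for b, x in zip(row, v)) for row in B]

# Assumed matrix with eigenpairs (6, (1, 1)) and (2, (-3, 1)).
A = [[3, 3], [1, 5]]
B = deflate(A, 6, [1, 1])
```

The result is the matrix $\begin{pmatrix} 0 & 0 \\ -2 & 2 \end{pmatrix}$ stated in (d)(i): the eigenvalue 6 has been replaced by 0 with the same eigenvector, while the eigenvalue 2 survives with the new eigenvector $(0, 1)^T$.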

8.3.1. (a) Complete; dim = 1 with basis ( 1, 1 )T . (b) Not complete; dim = 1 with basis ( 1, 0 )T . (c) Complete; dim = 1 with basis ( 0, 1, 0 )T . (d) Not an eigenvalue. (e) Complete; dim = 2 with basis ( 1, 0, 0 )T , ( 0, −1, 1 )T . (f ) Complete; dim = 1 with basis ( i , 0, 1 )T . (g) Not an eigenvalue. (h) Not complete; dim = 1 with basis ( 1, 0, 0, 0, 0 )T . 8.3.2.

(b) Eigenvalues: (c) Eigenvalues: (d) Eigenvalues: (e) Eigenvalue 3 (f ) Eigenvalue 2 (g) Eigenvalue 3

!

2 ; not complete. 1 ! ! 1 2 ; complete. , 2, −2; eigenvectors: 1 1 ! 1± i 1 ± 2 i ; eigenvectors: ; complete. 2 ! ! i −i ; complete. , 0, 2 i ; eigenvectors: 1 1 0 1 0 1 1 1 C B C has eigenspace basis B @ 1 A, @ 0 A; not complete. 0 1 0 1 1 0 1 0 −1 0 −2 B C B C C has eigenspace basis B @ 0 A, @ 1 A; eigenvalue −2 has @ −1 A; complete. 1 0 1 0 1 0 1 0 −1 C B C has eigenspace basis B @ 1 A; eigenvalue −2 has @ 1 A; not complete. 1 1

(a) Eigenvalue: 2; eigenvector:


(h) Eigenvalue 0 has

0

0

1

B C B0C B C; @0A

1

eigenvalue −1 has

0

0

1

B C B0C B C; @1A

eigenvalue 1 has

11 1 0 1 2 B C B 0 C B −1 C C B C, B C; eigenvalue 2 has (i) Eigenvalue 0 has eigenspace basis B @1A @ 0A 0 1 0

8.3.3.

!

−1 , 1

(a) Eigenvalues: −2, 4; the eigenvectors

!

1 1

sion is 0.

(c) Eigenvalue: 1; there is only one eigenvector v1 = space of R 2 . (d)

(e)

(f )

(g)

(h)

0

1

1 0

1

1

B C B C B3C B1C B C, B C; @1A @0A

complete.

0 1 1 −1 C B B 1C B C; not complete. @ −5 A 1 0

form a basis for R 2 .

!

i , 1

(b) Eigenvalues: 1 − 3 i , 1 + 3 i ; the eigenvectors

0

1 0

−i 1 !

!

, are not real, so the dimen-

spanning a one-dimensional sub-

1

0

1

1 1 C B C The eigenvalue 1 has eigenvector B @ 0 A, while the eigenvalue −1 has eigenvectors @ 1 A, 2 0 1 0 0 C B 3 @ −1 A. The eigenvectors form a basis for R . 0 0 1 0 1 1 0 B C C The eigenvalue 1 has eigenvector @ 0 A, while the eigenvalue −1 has eigenvector B @ 0 A. 0 1 The eigenvectors span a two-dimensional subspace0of R 31. 0 1 0 1 0 0 8 C B C B C The eigenvalues are −2, 0, 2. The eigenvectors are B @ −1 A, @ 1 A, and @ 5 A, forming a 1 1 7 3 basis for R . 0 1 0 1 0 1 −i i 0 C B C B C The eigenvalues are i , − i , 1. The eigenvectors are B @ 0 A, @ 0 A and @ 1 A. The real 1 1 0 eigenvectors span only a one-dimensional subspace of R 3 . 1 0 1 0 1 0 −1 4 0 B C B C B 1C B3C B i C C C, B C, B C, B The eigenvalues are −1, 1, − i − 1, − i + 1. The eigenvectors are B @0A @2A @−i A 1 0 1 6 0 −1 C B B−i C 4 C. The real eigenvectors span a two-dimensional subspace of R . B @ i A 1

8.3.4. Cases (a,b,d,f,g,h) have eigenvector bases of C n . 8.3.5. Examples:

0

1 (a) B @0 0

1 1 0

1

0 1C A, 1

0

1 (b) B @0 0

0 1 0

1

0 1C A. 1

8.3.6. (a) True. The standard basis vectors !are eigenvectors. 1 1 is incomplete since e1 is the only eigenvector. (b) False. The Jordan matrix 0 1


8.3.7. According to Exercise 8.2.19, every eigenvector of A is an eigenvector of $c A + d I$ with eigenvalue $c \lambda + d$, and hence if A has a basis of eigenvectors, so does $c A + d I$.

8.3.8. (a) Every eigenvector of A is an eigenvector of $A^2$ with eigenvalue $\lambda^2$, and hence if A has a basis of eigenvectors, so does $A^2$. (b) $A = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}$ with $A^2 = O$.

♦ 8.3.9. Suppose $A v = \lambda v$. Write $v = \sum_{i=1}^n c_i v_i$. Then $A v = \sum_{i=1}^n c_i \lambda_i v_i$ and hence, by linear independence, $\lambda_i c_i = \lambda c_i$. Thus, either $\lambda = \lambda_i$ or $c_i = 0$.

8.3.10. (a) If $A v = \lambda v$, then, by induction, $A^n v = \lambda^n v$, and hence v is an eigenvector with eigenvalue $\lambda^n$. (b) Conversely, if A is complete and $A^n$ has eigenvalue µ, then at least one of its complex nth roots $\lambda = \sqrt[n]{\mu}$ is an eigenvalue of A. Indeed, the eigenvector basis of A is an eigenvector basis of $A^n$, and hence, using Exercise 8.3.9, every eigenvalue of $A^n$ is the nth power of an eigenvalue of A.

♦ 8.3.11. As in Exercise 8.2.32, if v is an eigenvector of A then $S^{-1} v$ is an eigenvector of B. Moreover, if $v_1, \ldots, v_n$ form a basis, so do $S^{-1} v_1, \ldots, S^{-1} v_n$; see Exercise 2.4.21 for details.

8.3.12. According to Exercise 8.2.17, its only eigenvalue is λ, the common value of its diagonal entries, and so all eigenvectors belong to $\ker(U - \lambda I)$. Thus U is complete if and only if $\dim \ker(U - \lambda I) = n$, which happens if and only if $U - \lambda I = O$.

8.3.13. Let $V = \ker(A - \lambda I)$. If $v \in V$, then $A v \in V$ since $(A - \lambda I) A v = A (A - \lambda I) v = 0$.

♦ 8.3.14. (a) Let $v = x + i y$, $w = \bar v = x - i y$ be the corresponding eigenvectors, so $x = \frac12 v + \frac12 w$, $y = -\frac12 i v + \frac12 i w$. Thus, if $c x + d y = 0$ where c, d are real scalars, then
$$\left(\tfrac12 c - \tfrac12 i d\right) v + \left(\tfrac12 c + \tfrac12 i d\right) w = 0.$$
Since v, w are eigenvectors corresponding to distinct eigenvalues $\mu + i \nu \ne \mu - i \nu$, Lemma 8.13 implies they are linearly independent, and hence $\frac12 c - \frac12 i d = \frac12 c + \frac12 i d = 0$, which implies $c = d = 0$.

(b) Same reasoning: $x_j = \frac12 v_j + \frac12 w_j$, $y_j = -\frac12 i v_j + \frac12 i w_j$, where $v_j$, $w_j = \bar v_j$ are the complex eigenvectors corresponding to the eigenvalues $\mu_j + i \nu_j$, $\mu_j - i \nu_j$. Thus, if
$$0 = c_1 x_1 + d_1 y_1 + \cdots + c_k x_k + d_k y_k = \left(\tfrac12 c_1 - \tfrac12 i d_1\right) v_1 + \left(\tfrac12 c_1 + \tfrac12 i d_1\right) w_1 + \cdots + \left(\tfrac12 c_k - \tfrac12 i d_k\right) v_k + \left(\tfrac12 c_k + \tfrac12 i d_k\right) w_k,$$
then, again by Lemma 8.13, the complex eigenvectors $v_1, \ldots, v_k, w_1, \ldots, w_k$ are linearly independent, and so $\frac12 c_1 - \frac12 i d_1 = \frac12 c_1 + \frac12 i d_1 = \cdots = \frac12 c_k - \frac12 i d_k = \frac12 c_k + \frac12 i d_k = 0$. This implies $c_1 = \cdots = c_k = d_1 = \cdots = d_k = 0$, proving linear independence of $x_1, \ldots, x_k, y_1, \ldots, y_k$.

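8.3.15. In all cases, $A = S \Lambda S^{-1}$.

(a) $S = \begin{pmatrix} 3 & 3 \\ 1 & 2 \end{pmatrix}$, $\Lambda = \begin{pmatrix} 0 & 0 \\ 0 & -3 \end{pmatrix}$.

(b) $S = \begin{pmatrix} 2 & 1 \\ 1 & 1 \end{pmatrix}$, $\Lambda = \begin{pmatrix} 3 & 0 \\ 0 & 1 \end{pmatrix}$.

(c) $S = \begin{pmatrix} -\frac35 + \frac15 i & -\frac35 - \frac15 i \\ 1 & 1 \end{pmatrix}$, $\Lambda = \begin{pmatrix} -1 + i & 0 \\ 0 & -1 - i \end{pmatrix}$.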

(d) S =

0

1

0

0

0 (e) S = B @1 0

1

0

1

0

1 − 10 −2 C 1 C, Λ = B 0 @ −2 A 0 1

1 1 0

B B @0

21 −10 7

0

1 0 B 6C A, Λ = @ 0 3 0

1

0 1 0

0 0C A. 3 1

0 7 0 1

0 0C A. −1

0

2 + 3i − 1 − 53 i − 15 + 35 i −1 B C B 5 (f ) S = @ −1 −1 0 A, Λ = @ 0 0 1 1 1 0 1 0 1 1 0 −1 2 0 0 B (g) S = B 0C 0C @ 0 −1 A, Λ = @ 0 2 A. 0 1 1 0 0 −3 (h) S =

0

1

−4

12 0 B B −1 (i) S = B @ 0 1 0

0

−1 B B 1 (j) S = B B @ 0 0 1 1

8.3.16.

1 0

!

3 2

i 1 −2 − 2i 1+ i 1

1 −3 0 0

= !

0

0

− 23 i − 21 + 2 i 1− i 1

1

0

1 B B0 Λ=B @0 0

C C C, C A

10 0 AB √ @ 1− 5 0 2 1 !0 0 @ − 2i 12 A ; i 1 −i 2 2

√ √ 10 1+ 5 1− 5 2 A@ 2

√ 1+ 5 @ 2

1

1

!

0 0 C A. −2

1

3 1 0 −2 0 0 0 B 2 0 1C 0 −1 0 0C C C B C. C, Λ = B @ 0 6 0 0A 0 1 0A 0 0 0 0 0 0 2 1 0 1 −1 0 0 0 −1 0 1 C B C 0 1 0C B 0 −1 0 0 C C, Λ = B C. @ 0 0 1 0A 1 0 1A 0 0 0 1 0 1 0

B B −3 B @ 0

1

0 2 − 3i 0

0 −1 0 0

1

0 0 i 0

√1 5 1 −√ 5

0 0C C C. 0A −i

√ 1

− 1−√ 5

2 √5 C A. 1+√ 5 2 5

0 −1 i i −i 8.3.17. a rotation does not stretch any real = 1 0 0 1 1 vectors, but somehow corresponds to two complex stretches. 8.3.18. 0

0 (a) B @1 0

(b)

0 B B @

1

−1 0 0 5 13

0 − 12 13

0

0 i B 0C A=@1 1 0 0 1 0

1 12 13 C 0C A 5 13

0 B

=B @

10

−i 1 0

0 i B 0C A@ 0 1 0

−i 0 1

i 0 1

1

0

0CB B B 1C AB @ 0

0 −i 0

5+12 i 13

0 0

10

i

0 B−2 i 0C AB @ 2 1 0 0 5−12 i 13

0

1 2 1 2

1

0 C 0C A,

0 1 12

0

8.3.19. (a) Yes: distinct real eigenvalues −3, 2. (b) No: complex eigenvalues $1 \pm i \sqrt6$. (c) No: complex eigenvalues 1, $-\frac12 \pm \frac{\sqrt5}{2} i$. (d) No: incomplete eigenvalue 1 (and complete eigenvalue −2). (e) Yes: distinct real eigenvalues 1, 2, 4. (f) Yes: complete real eigenvalues 1, −1.


i

0C 2 CB CB B − i 0C 2 A@ 0 1

0 0 1

1 2 1 2

0

1

C C C. A

−1 8.3.20. In all cases, A . ! = S ΛS ! 1 −1 1+ i 0 , Λ= (a) S = . 1 1 0 −1 + i ! ! −2 + i 0 1− i 0 , Λ= . (b) S = 1 1 0 2+ i ! ! 1 1 4 0 (c) S = 1 − 2 − 2 i , Λ = . 0 −1 1 1 0 1 0 1 0 0 1 −1 B 1 C 3 , Λ = + i 0 1 − i (d) S = B @ A @ 1 −1 − i 5 5 0 0 0 1 1

0 0 −1 − i

1

C A.

8.3.21. Use the formula A = S Λ S −1 . For parts (e,f ) you can choose any other eigenvalues and eigenvectors you want to fill in S0and Λ. 1 ! ! 6 6 −2 7 4 3 0 , , (b) B (c) (a) 0C @ −2 −2 A, −8 −5 0 3 6 6 −4 0 1 0 1 1 ! 0 0 4 3 4 2 0C B C B 1 − 3 , (d) (e) example: @ 0 1 0 A, (f ) example: @ −2 3 0 A. 6 −3 0 0 −2 −2 −2 0 ! ! ! −1 0 11 −6 − 4 −6 . , (c) , (b) 8.3.22. (a) 0 2 18 −10 3 5

♦ 8.3.23. Let $S_1$ be the eigenvector matrix for A and $S_2$ the eigenvector matrix for B. Thus, by the hypothesis, $S_1^{-1} A S_1 = \Lambda = S_2^{-1} B S_2$ and hence $B = S_2 S_1^{-1} A S_1 S_2^{-1} = S^{-1} A S$ where $S = S_1 S_2^{-1}$.

8.3.24. The hypothesis says $B = P A P^T = P A P^{-1}$ where P is the corresponding permutation matrix, and we are using Exercise 1.6.14 to identify $P^T = P^{-1}$. Thus A and B are similar matrices, and so, according to Exercise 8.2.32, have the same eigenvalues. If v is an eigenvector for A, then the vector $w = P v$ obtained by permuting its entries is an eigenvector for B with the same eigenvalue, since $B w = P A P^T P v = P A v = \lambda P v = \lambda w$.

8.3.25. True. Let $\lambda_j = a_{jj}$ denote the jth diagonal entry of A, which is the same as the jth eigenvalue. We will prove that the corresponding eigenvector is a linear combination of $e_1, \ldots, e_j$, which is equivalent to the eigenvector matrix S being upper triangular. We use induction on the size n. Since A is upper triangular, it leaves the subspace V spanned by $e_1, \ldots, e_{n-1}$ invariant, and hence its restriction to the subspace is represented by an $(n-1) \times (n-1)$ upper triangular matrix. Thus, by induction and completeness, A possesses $n - 1$ eigenvectors of the required form. The remaining eigenvector $v_n$ cannot belong to V (otherwise the eigenvectors would be linearly dependent) and hence must involve $e_n$.

8.3.26. The diagonal entries are all eigenvalues, and so are obtained from each other by permutation. If all eigenvalues are distinct, then there are $n!$ different diagonal forms; otherwise, if it has distinct eigenvalues of multiplicities $j_1, \ldots, j_k$, there are $\dfrac{n!}{j_1! \cdots j_k!}$ distinct diagonal forms.

8.3.27. Let $A = S \Lambda S^{-1}$. Then $A^2 = I$ if and only if $\Lambda^2 = I$, and so all its eigenvalues are ±1. Examples: $A = \begin{pmatrix} 3 & -2 \\ 4 & -3 \end{pmatrix}$, with eigenvalues 1, −1 and eigenvectors $\begin{pmatrix} 1 \\ 1 \end{pmatrix}$, $\begin{pmatrix} 1 \\ 2 \end{pmatrix}$; or, even simpler, $A = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$.


♥ 8.3.28. (a) If $A = S \Lambda S^{-1}$ and $B = S D S^{-1}$ where Λ, D are diagonal, then $A B = S \Lambda D S^{-1} = S D \Lambda S^{-1} = B A$, since diagonal matrices commute. (b) According to Exercise 1.2.12(e), the only matrices that commute with an n × n diagonal matrix with distinct entries are the diagonal matrices. Thus, if $A B = B A$, and $A = S \Lambda S^{-1}$ where all entries of Λ are distinct, then $D = S^{-1} B S$ commutes with Λ and hence is a diagonal matrix. (c) No, the matrix $\begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}$ commutes with the identity matrix, but is not diagonalizable. See also Exercise 1.2.14.
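A small numerical illustration of parts (a) and (c), offered as a sketch only: the eigenvector matrix `S` and the diagonal entries below are hypothetical choices made for the example, not data from the exercise.

```python
def matmul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def diag(*entries):
    """Diagonal matrix with the given diagonal entries."""
    n = len(entries)
    return [[entries[i] if i == j else 0 for j in range(n)] for i in range(n)]

# Hypothetical shared eigenvector matrix (determinant 1, so S_inv is integral).
S = [[1, 1], [1, 2]]
S_inv = [[2, -1], [-1, 1]]

A = matmul(matmul(S, diag(1, -1)), S_inv)   # A = S Lambda S^{-1}
B = matmul(matmul(S, diag(2, 5)), S_inv)    # B = S D S^{-1}
print(matmul(A, B) == matmul(B, A))         # part (a): A and B commute

# Part (c): N commutes with the identity, yet N != O while N^2 = O,
# so N cannot be diagonalizable.
N = [[0, 1], [0, 0]]
I = diag(1, 1)
print(matmul(N, I) == matmul(I, N), matmul(N, N) == diag(0, 0))
```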

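8.4.1.

(a) Eigenvalues: 5, −10; eigenvectors: $\dfrac{1}{\sqrt5} \begin{pmatrix} 2 \\ 1 \end{pmatrix}$, $\dfrac{1}{\sqrt5} \begin{pmatrix} 1 \\ -2 \end{pmatrix}$.

(b) Eigenvalues: 7, 3; eigenvectors: $\dfrac{1}{\sqrt2} \begin{pmatrix} -1 \\ 1 \end{pmatrix}$, $\dfrac{1}{\sqrt2} \begin{pmatrix} 1 \\ 1 \end{pmatrix}$.

(c) Eigenvalues: $\dfrac{7 + \sqrt{13}}{2}$, $\dfrac{7 - \sqrt{13}}{2}$; eigenvectors: $\dfrac{2}{\sqrt{26 - 6\sqrt{13}}} \begin{pmatrix} \frac{3 - \sqrt{13}}{2} \\ 1 \end{pmatrix}$, $\dfrac{2}{\sqrt{26 + 6\sqrt{13}}} \begin{pmatrix} \frac{3 + \sqrt{13}}{2} \\ 1 \end{pmatrix}$.

(d) Eigenvalues: 6, 1, −4; eigenvectors: $\begin{pmatrix} \frac{4}{5\sqrt2} \\ \frac{3}{5\sqrt2} \\ \frac{1}{\sqrt2} \end{pmatrix}$, $\begin{pmatrix} -\frac35 \\ \frac45 \\ 0 \end{pmatrix}$, $\begin{pmatrix} -\frac{4}{5\sqrt2} \\ -\frac{3}{5\sqrt2} \\ \frac{1}{\sqrt2} \end{pmatrix}$.

(e) Eigenvalues: 12, 9, 2; eigenvectors: $\begin{pmatrix} \frac{1}{\sqrt6} \\ -\frac{1}{\sqrt6} \\ \frac{2}{\sqrt6} \end{pmatrix}$, $\begin{pmatrix} -\frac{1}{\sqrt3} \\ \frac{1}{\sqrt3} \\ \frac{1}{\sqrt3} \end{pmatrix}$, $\begin{pmatrix} \frac{1}{\sqrt2} \\ \frac{1}{\sqrt2} \\ 0 \end{pmatrix}$.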
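The eigenpairs in parts (d) and (e) can be confirmed with integer arithmetic by dropping the normalizing factors. This is a sketch under one assumption: the coefficient matrices are taken from the factorizations quoted in Exercise 8.4.15(d,e), since they are not restated here. The helper names are hypothetical.

```python
def matvec(M, v):
    """Multiply the matrix M (a list of rows) by the vector v."""
    return [sum(a * b for a, b in zip(row, v)) for row in M]

def is_eigenpair(M, lam, v):
    """True when M v = lam v holds exactly."""
    return matvec(M, v) == [lam * x for x in v]

def dot(v, w):
    return sum(a * b for a, b in zip(v, w))

# Matrices assumed from Exercise 8.4.15(d) and (e); eigenvectors are unnormalized.
A_d = [[1, 0, 4], [0, 1, 3], [4, 3, 1]]
pairs_d = [(6, [4, 3, 5]), (1, [-3, 4, 0]), (-4, [-4, -3, 5])]

A_e = [[6, -4, 1], [-4, 6, -1], [1, -1, 11]]
pairs_e = [(12, [1, -1, 2]), (9, [-1, 1, 1]), (2, [1, 1, 0])]

for M, pairs in ((A_d, pairs_d), (A_e, pairs_e)):
    print(all(is_eigenpair(M, lam, v) for lam, v in pairs))
    # Eigenvectors of a symmetric matrix for distinct eigenvalues are orthogonal.
    vectors = [v for _, v in pairs]
    print(all(dot(vectors[i], vectors[j]) == 0
              for i in range(3) for j in range(i + 1, 3)))
```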

8.4.2. (a) Eigenvalues $\frac52 \pm \frac12 \sqrt{17}$; positive definite. (b) Eigenvalues −3, 7; not positive definite. (c) Eigenvalues 0, 1, 3; positive semi-definite. (d) Eigenvalues 6, $3 \pm \sqrt3$; positive definite.

8.4.3. Use the fact that $K = -N$ is positive definite and so has all positive eigenvalues. The eigenvalues of $N = -K$ are $-\lambda_j$ where $\lambda_j$ are the eigenvalues of K. Alternatively, mimic the proof in the book for the positive definite case.

8.4.4. If all eigenvalues are distinct, there are $2^n$ different bases, governed by the choice of sign in each of the unit eigenvectors $\pm u_k$. If the eigenvalues are repeated, there are infinitely many, since any orthonormal basis of each eigenspace will contribute to an orthonormal eigenvector basis of the matrix.

8.4.5. (a) The characteristic equation $p(\lambda) = \lambda^2 - (a + d) \lambda + (a d - b c) = 0$ has real roots if and only if its discriminant is non-negative: $0 \le (a + d)^2 - 4 (a d - b c) = (a - d)^2 + 4 b c$, which is the necessary and sufficient condition for real eigenvalues. (b) If A is symmetric, then $b = c$ and so the discriminant is $(a - d)^2 + 4 b^2 \ge 0$. (c) Example: $\begin{pmatrix} 1 & 1 \\ 0 & 2 \end{pmatrix}$.


♥ 8.4.6. (a) If $A v = \lambda v$ and $v \ne 0$ is real, then
$$\lambda \|v\|^2 = (A v) \cdot v = (A v)^T v = v^T A^T v = -v^T A v = -v \cdot (A v) = -\lambda \|v\|^2,$$
and hence $\lambda = 0$.

(b) Using the Hermitian dot product,
$$\lambda \|v\|^2 = (A v) \cdot v = v^T A^T \bar v = -v^T A \bar v = -v \cdot (A v) = -\bar\lambda \|v\|^2,$$
and hence $\lambda = -\bar\lambda$, so λ is purely imaginary.

(c) Since det A = 0, cf. Exercise 1.9.10, at least one of the eigenvalues of A must be 0.

(d) The characteristic polynomial of $A = \begin{pmatrix} 0 & c & -b \\ -c & 0 & a \\ b & -a & 0 \end{pmatrix}$ is $-\lambda^3 - \lambda (a^2 + b^2 + c^2)$ and hence the eigenvalues are 0, $\pm i \sqrt{a^2 + b^2 + c^2}$, and so are all zero if and only if $A = O$.

(e) The eigenvalues are: (i) ± 2 i, (ii) 0, ± 5 i, (iii) 0, ± 3 i, (iv) ± 2 i, ± 3 i.

♥ 8.4.7. (a) Let $A v = \lambda v$. Using the Hermitian dot product,
$$\lambda \|v\|^2 = (A v) \cdot v = v^T A^T \bar v = v^T \overline{A v} = v \cdot (A v) = \bar\lambda \|v\|^2,$$
and hence $\lambda = \bar\lambda$, which implies that the eigenvalue λ is real.

(b) Let $A v = \lambda v$, $A w = \mu w$. Then
$$\lambda\, v \cdot w = (A v) \cdot w = v^T A^T \bar w = v^T \overline{A w} = v \cdot (A w) = \mu\, v \cdot w,$$
since µ is real. Thus, if $\lambda \ne \mu$ then $v \cdot w = 0$.

(c) (i) Eigenvalues $\pm \sqrt5$; eigenvectors: $\begin{pmatrix} (2 - \sqrt5)\, i \\ 1 \end{pmatrix}$, $\begin{pmatrix} (2 + \sqrt5)\, i \\ 1 \end{pmatrix}$.

(ii) Eigenvalues 4, −2; eigenvectors: $\begin{pmatrix} 2 - i \\ 1 \end{pmatrix}$, $\begin{pmatrix} -2 + i \\ 5 \end{pmatrix}$.

(iii) Eigenvalues 0, $\pm \sqrt2$; eigenvectors: $\begin{pmatrix} 1 \\ 0 \\ 1 \end{pmatrix}$, $\begin{pmatrix} -1 \\ i \sqrt2 \\ 1 \end{pmatrix}$, $\begin{pmatrix} -1 \\ -i \sqrt2 \\ 1 \end{pmatrix}$.
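The orthogonality claimed in 8.4.7(b) can be tested on the eigenvectors listed in part (c)(ii). The sketch below uses the Hermitian dot product with the conjugate on the second factor. The matrix `H` is an assumption: it is the Hermitian matrix rebuilt from the quoted eigenvalues 4, −2 and eigenvectors, not a matrix printed in the solution.

```python
def hdot(v, w):
    """Hermitian dot product: sum of v_i times the conjugate of w_i."""
    return sum(complex(a) * complex(b).conjugate() for a, b in zip(v, w))

def matvec(M, v):
    """Multiply the matrix M (a list of rows) by the vector v."""
    return [sum(a * b for a, b in zip(row, v)) for row in M]

# Assumed Hermitian matrix, reconstructed from the eigen-data of part (c)(ii).
H = [[3, 2 - 1j], [2 + 1j, -1]]
v = [2 - 1j, 1]     # eigenvector for the eigenvalue 4
w = [-2 + 1j, 5]    # eigenvector for the eigenvalue -2

print(matvec(H, v) == [4 * x for x in v])    # H v = 4 v
print(matvec(H, w) == [-2 * x for x in w])   # H w = -2 w
print(hdot(v, w) == 0)                       # distinct eigenvalues: orthogonal
```

All entries are small Gaussian integers, so the floating-point complex arithmetic here is exact and the equality tests are safe.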

♥ 8.4.8. (a) Rewrite (8.31) as $M^{-1} K v = \lambda v$, and so v is an eigenvector for $M^{-1} K$ with eigenvalue λ. The eigenvectors are the same.

(b) $M^{-1} K$ is not necessarily symmetric, and so we can't use Theorem 8.20 directly. If v is a generalized eigenvector, then since K, M are real matrices, $K \bar v = \bar\lambda M \bar v$. Therefore,
$$\lambda \|v\|^2 = \lambda v^T M \bar v = (\lambda M v)^T \bar v = (K v)^T \bar v = v^T (K \bar v) = \bar\lambda v^T M \bar v = \bar\lambda \|v\|^2,$$
and hence λ is real.

(c) If $K v = \lambda M v$, $K w = \mu M w$, with λ, µ and v, w real, then
$$\lambda \langle v, w \rangle = (\lambda M v)^T w = (K v)^T w = v^T (K w) = \mu v^T M w = \mu \langle v, w \rangle,$$
and so if $\lambda \ne \mu$ then $\langle v, w \rangle = 0$, proving orthogonality.

(d) If K > 0, then $\lambda \langle v, v \rangle = v^T (\lambda M v) = v^T K v > 0$, and so, since M is positive definite, λ > 0.

(e) Part (c) proves that the eigenvectors are orthogonal with respect to the inner product induced by M, and so the result follows immediately from Theorem 5.5.

8.4.9. (a) Eigenvalues:

5 1 3, 2;

eigenvectors:

(b) Eigenvalues: 2, 12 ; eigenvectors:

!

!

1 −3 , 2 . 1 1! ! 1 1 − 2 . , 1 1


1 2

(c) Eigenvalues: 7, 1; eigenvectors:

!

10

!

1 . 0

,

1 0

1 0

1

6 −6 2 C B C B C (d) Eigenvalues: 12, 9, 2; eigenvectors: B @ −3 A, @ 3 A, @ 1 A. 41 0 21 0 0 1 0 1 −1 1 C B C B 1C (e) Eigenvalues: 3, 1, 0; eigenvectors: B @ −2 A, @ 0 A, @ − 2 A. 1 10 1 0 1 1 1 −1 C B C (f ) 2 is a double eigenvalue with eigenvector basis B @ 0 A, @ 1 A, while 1 is a simple eigen0 1 1 0 2 C value with eigenvector B @ −2 A. For orthogonality you need to select an M orthogonal 1 basis of the two-dimensional eigenspace, say by using Gram–Schmidt. ♦ 8.4.10. If L[ v ] = λ v, then, using the inner product,

$\lambda \|v\|^2 = \langle L[v], v \rangle = \langle v, L[v] \rangle = \bar\lambda \|v\|^2$, which proves that the eigenvalue λ is real. Similarly, if $L[w] = \mu w$, then $\lambda \langle v, w \rangle = \langle L[v], w \rangle = \langle v, L[w] \rangle = \mu \langle v, w \rangle$, and so if $\lambda \ne \mu$, then $\langle v, w \rangle = 0$.

♦ 8.4.11. As shown in the text, since yi ∈ V ⊥ , its image A yi ∈ V ⊥ also, and hence A yi is a linear combination of the basis vectors y1 , . . . , yn−1 , proving the first statement. Furthermore, since y1 , . . . , yn−1 form an orthonormal basis, by (8.30), bij = yi · A yj = (A yi ) · yj = bji . ♥ 8.4.12. (a)

0 B B B B B B B B B B B @

−1 0

1 −1 0

0 1 −1 .. .

0 0 1 .. . 0

0 1

0 .. . −1 0

0

(b) Using Exercise 8.2.13(c),

.. 1 −1 0

.

01 0C C

C C C C C. C C 0C C 1A

−1

“

”

∆ωk = (S − I ) ωk = 1 − e2 k π i /n ωk ,

and so ωk is an eigenvector of ∆ with corresponding eigenvalue 1 − e2 k π i /n . (c) Since S is an orthogonal matrix, S T = S −1 , and so S T ωk = e− 2 k π i /n ωk . Therefore, K ωk = (S T − I )(S − I )ωk = (2 I − S − S T )ωk “

”

= 2 − e−2 k π i /n − e− 2 k π i /n ωk =

2 − 2 cos

kπ i n

!

ωk ,

kπ i . n (d) Yes, K > 0 since its eigenvalues are all positive; or note that K = ∆T ∆ is a Gram matrix, with ker ∆ = {0}. (n − k) π i kπ i = 2 − 2 cos for k 6= 21 n is double, with a two(e) Each eigenvalue 2 − 2 cos n n dimensional eigenspace spanned by ωk and ωn−k = ωk . The corresponding real eigenvectors are Re ωk = 12 ωk + 12 ωn−k and Im ωk = 21i ωk − 21i ωn−k . On the other hand, and hence ωk is an eigenvector of K with corresponding eigenvalue 2 − 2 cos


if $k = \frac12 n$ (which requires that n be even), the eigenvector $\omega_{n/2} = (1, -1, 1, -1, \ldots)^T$ is real.

♥ 8.4.13. (a) The shift matrix has $c_1 = 1$, $c_i = 0$ for $i \ne 1$; the difference matrix has $c_0 = -1$, $c_1 = 1$, and $c_i = 0$ for $i > 1$; the symmetric product K has $c_0 = 2$, $c_1 = c_{n-1} = -1$, and $c_i = 0$ for $1 < i < n - 1$.

(b) The eigenvector equation
$$C \omega_k = \left(c_0 + c_1 e^{2 k \pi i/n} + c_2 e^{4 k \pi i/n} + \cdots + c_{n-1} e^{2 (n-1) k \pi i/n}\right) \omega_k$$
can either be proved directly, or by noting that $C = c_0 I + c_1 S + c_2 S^2 + \cdots + c_{n-1} S^{n-1}$, and using Exercise 8.2.13(c).

(c) This follows since the individual columns of $F_n = (\omega_0, \ldots, \omega_{n-1})$ are the sampled exponential eigenvectors, and so the columns of the matrix equation $C F_n = F_n \Lambda$ are the eigenvector equations $C \omega_k = \lambda_k \omega_k$ for $k = 0, \ldots, n - 1$.

(d) (i) Eigenvalues 3, −1; eigenvectors $\begin{pmatrix} 1 \\ 1 \end{pmatrix}$, $\begin{pmatrix} 1 \\ -1 \end{pmatrix}$.

(ii) Eigenvalues 6, $-\frac32 - \frac{\sqrt3}{2} i$, $-\frac32 + \frac{\sqrt3}{2} i$; eigenvectors $\begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix}$, $\begin{pmatrix} 1 \\ -\frac12 + \frac{\sqrt3}{2} i \\ -\frac12 - \frac{\sqrt3}{2} i \end{pmatrix}$, $\begin{pmatrix} 1 \\ -\frac12 - \frac{\sqrt3}{2} i \\ -\frac12 + \frac{\sqrt3}{2} i \end{pmatrix}$.

(iii) Eigenvalues 0, 2 − 2 i, 0, 2 + 2 i; eigenvectors $\begin{pmatrix} 1 \\ 1 \\ 1 \\ 1 \end{pmatrix}$, $\begin{pmatrix} 1 \\ i \\ -1 \\ -i \end{pmatrix}$, $\begin{pmatrix} 1 \\ -1 \\ 1 \\ -1 \end{pmatrix}$, $\begin{pmatrix} 1 \\ -i \\ -1 \\ i \end{pmatrix}$.

(iv) Eigenvalues 0, 2, 4, 2; eigenvectors $\begin{pmatrix} 1 \\ 1 \\ 1 \\ 1 \end{pmatrix}$, $\begin{pmatrix} 1 \\ i \\ -1 \\ -i \end{pmatrix}$, $\begin{pmatrix} 1 \\ -1 \\ 1 \\ -1 \end{pmatrix}$, $\begin{pmatrix} 1 \\ -i \\ -1 \\ i \end{pmatrix}$.

(e) The eigenvalues are (i) 6, 3, 3; (ii) 6, 4, 4, 2; (iii) 6, $\frac{7 + \sqrt5}{2}$, $\frac{7 + \sqrt5}{2}$, $\frac{7 - \sqrt5}{2}$, $\frac{7 - \sqrt5}{2}$; (iv) in the n × n case, they are $4 + 2 \cos \dfrac{2 k \pi}{n}$ for $k = 0, \ldots, n - 1$. The eigenvalues are real and positive because the matrices are positive definite.

(f) Cases (i,ii) in (d) and all matrices in part (e) are invertible. In general, an n × n circulant matrix is invertible if and only if none of the roots of the polynomial $c_0 + c_1 x + \cdots + c_{n-1} x^{n-1} = 0$ is an nth root of unity: $x \ne e^{2 k \pi i/n}$.
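The eigenvalue formula of part (b) is easy to test numerically. The sketch below assumes the convention that the circulant matrix has entries $C_{ij} = c_{(j-i) \bmod n}$, which matches $C = c_0 I + c_1 S + \cdots$ when the shift S has ones on the superdiagonal; the function names are hypothetical.

```python
import cmath

def circulant(c):
    """Circulant matrix with entries C[i][j] = c[(j - i) mod n] (assumed convention)."""
    n = len(c)
    return [[c[(j - i) % n] for j in range(n)] for i in range(n)]

def omega(k, n):
    """Sampled exponential vector (1, z^k, z^(2k), ...) with z = exp(2 pi i / n)."""
    return [cmath.exp(2j * cmath.pi * k * m / n) for m in range(n)]

def eigenvalue(c, k):
    """c_0 + c_1 z^k + ... + c_{n-1} z^((n-1)k), the formula from part (b)."""
    n = len(c)
    return sum(c[m] * cmath.exp(2j * cmath.pi * k * m / n) for m in range(n))

def residual(c, k):
    """Largest entry of C w_k - lambda_k w_k in absolute value."""
    C, w, lam = circulant(c), omega(k, len(c)), eigenvalue(c, k)
    Cw = [sum(a * b for a, b in zip(row, w)) for row in C]
    return max(abs(x - lam * y) for x, y in zip(Cw, w))

c = [2, -1, 0, -1]   # the symmetric product K for n = 4, as in case (d)(iv)
for k in range(4):
    print(k, round(eigenvalue(c, k).real, 10), residual(c, k) < 1e-12)
```

For this choice of coefficients the computed eigenvalues agree with the values 0, 2, 4, 2 listed in (d)(iv), up to rounding.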

8.4.14. −3 4

(a)

2 −1

(b)

(c)

0

1

B @1

0

4 3

!

0

√1 B 5 @ 2 √ 5 0

− √2

1

5 0

0 −5

0 ! B @

√1 5 √2 5

1 √2 5C A. √1 5

5C A √1 − 5 √ √ √ 1 0 2 1+ 2 1− 2 ! √1− √ ! √ √ √ √ √ C 3+ 2 B 4−2 2 −1 0 B 4−2 4+2 2 C B B √ 2 √ @ =@ A 2 4 √ 1 √ √ 1 √ 0 3− 2 √1+ √ 4−2 2 4+2 2 4+2 2 0 1 0 1 1 0 √ √1 √1 √2 √1 1 − √1 6 2 3 6 6 6 3 0 0 B C B 1 0 CB B B 2 C 1 1 C 1 B C B C B √ √ √ √ 0 − 0 2 1A = B 6 @ 0 1 0 AB − 2 3C 2 @ A @ 1 1 1 1 1 1 1 0 0 0 √ √ √ √ √1 −√ 6 3 3 2 3 3

=


1 1 √ C 4−2 2 C A. √ 1 √ 4+2 2 1

√

C C C. C A

1

C C C. A

(d)

0

3

−1 2 0

B @ −1

−1

8.4.15.

1

0

B −1 B 0C A=B B @ 2

−

√2 6 √1 6 √1 6

0 − √1

2 √1 2

√1 3 √1 3 √1 3

1 0 C 4 CB CB 0 C@ A

0

0 2 0

1

0

0 CB B B 0C AB @ 1

−

√2 6

0 √1 3

√1 6 − √1 2 √1 3

√1 6 √1 2 √1 3

1

C C C. C A

0

0 1 1 ! √2 √1 √1 0 B 5 5C 5 5C = (a) @ 1 A A. √ 0 −10 − √2 − √2 5 5 1 1 05 ! ! √1 √1 √1 √1 − − 5 −2 5 0 B 2 2C 2 2C (b) =B @ A A. @ −2 5 √1 √1 √1 √1 0 −10 2 2 2 2 √ √ √ 1 0 0 √ 0 1 √3− 13 3+ 13 3− 13 ! √ 2 √ √ √ √ √ 7+ 13 C B 26−6√13 B 26−6 13 0 2 −1 26−6 13 26+6 13 2 B C B @ A √ √ (c) =@ A 2 −1 5 7− 13 @ √3+ 13 √ 2 √ √ 2 √ √ 0 √ √ 2 26−6 13 26+6 13 26+6 13 26+6 13 1 0 1 0 4 4 0 1 3 4 √ 0 1 − 35 − √ √ √ √1 5 2C 6 0 0 B B5 2 5 2 5 2 2C 1 0 4 B C B 3 CB C 4 3 C 4 3 CB 0 1 CB √ √ (d) B @0 1 3A = B 0C − 0 C, C B5 2 B @ A 5 5 5 5 2 A A @ @ 4 3 1 4 3 1 1 1 √ 0 0 −4 − √ − √ √ √ 0 5 2 5 2 2 2 2 0 1 1 0 0 1 √1 √1 √1 √2 0 1 − √1 − √1 6 3 2 6 6 6 12 0 0 B C C B 6 −4 1 B CB C CB √1 √1 √1 CB 0 9 0 CB − √1 √1 √1 C. (e) B 6 −1 C @ −4 A=B B − 6 C C B @ A 3 2 3 3 3 @ A A @ 1 −1 11 2 1 1 1 0 0 2 √ √ √ √ 0 0 6 3 2 2 0 1 1 0 24 3 1 57 25 − 25 A, (b) @ − 2 2 A. (c) None, since eigenvectors are not orthog8.4.16. (a) @ 3 1 43 − 24 25 2 −2 ! 25

2 6

6 −7

!

√2 B 5 @ 1 √ 05

2 0 . Note: even though the given eigenvectors are not orthogonal, one can 0 2 construct an orthogonal basis of the eigenspace. onal. (d)

8.4.17. (a) (b) (c)

(d)

(e)

„ «2 «2 11 1 √1 y √1 x + √3 y − (3 x + y)2 + 20 (− x + 3 y)2 , + 11 = 20 2 10 10 10 „ «2 «2 „ √2 x + √1 y 7 √1 x + √2 y + 11 = 75 (x + 2 y)2 + 52 (− 2 x + y)2 , − 2 5 5 5 5 „ «2 «2 „ “ ”2 4 3 3 4 √ √ −4 + − 53 x + 54 y x+ √ y − √1 z x+ √ y + √1 z + 6 5 2 5 2 2 5 2 5 2 2 2 2 2 1 3 2 = − 25 (4 x + 3 y − 5 z) + 25 (− 3 x + 4 y) + 25 (4 x + 3 y + 5 z) , «2 «2 «2 „ „ „ 1 √1 x + √1 y + √1 z √1 y + √1 z √2 x + √1 y + √1 z + − + 2 − 2 3 3 3 2 2 6 6 6 = 61 (x + y + z)2 + 21 (− y + z)2 + 31 (− 2 x + y + z)2 , «2 „ «2 „ «2 „ + 9 − √1 x + √1 y + √1 z + 12 √1 x − √1 y + √2 z 2 √1 x + √1 y 2 2 3 3 3 6 6 6 2 2 2 1 2

„

√3 10

x+

= (x + y) + 3 (− x + y + z) + 2 (x − y + 2 z) .

0 0 1 0 1 1 ♥ 8.4.18. 1 −1 1 B B C C C (a) λ1 = λ2 = 3, v1 = B @ 1 A, v2 = @ 0 A, λ3 = 0, v3 = @ −1 A; 1 1 0 (b) det A = λ1 λ2 λ3 = 0. (c) A is positive semi-definite, but not positive definite since it has a zero eigenvalue.


1

C C, A

0 B B

(d) u1 = B B (e)

0

(f )

0

B @

@

2 1 −1 1

1

B C @0A

0

1 2 1 =

0

1

− √1

1

0

√1 6C 3 B B C B B C C C, u = B √1 C, u = B − √1 B B C 3 2 6C 3 @ @ A A 2 1 √ √ 0 6 3 0 1 1 0 √ √1 1 − √1 6 3C 3 B 2 −1 B 1 CB √ √1 √1 CB 0 − 1C A=B B 2 C@ 6 3 @ A 2 1 2 0 √ √ 0 6 3

√1 2 √1 2

√1 2

u1 −

√1 6

u2 +

√1 3

1

C C C; C A

0 3 0

1

0

0 CB B B 0C AB − @ 0

√1 2 √1 6 √1 3

√1 2 √1 6 1 √ − 3

0 √2 6 √1 3

1

C C C; C A

u3 .

♦ 8.4.19. The simplest is A = I. More generally, any matrix of the form $A = S^T \Lambda S$, where $S = (u_1\ u_2\ \ldots\ u_n)$ and Λ is any real diagonal matrix.

8.4.20. True, assuming that the eigenvector basis is real. If Q is the orthogonal matrix formed by the eigenvector basis, then $A Q = Q \Lambda$ where Λ is the diagonal eigenvalue matrix. Thus, $A = Q \Lambda Q^{-1} = Q \Lambda Q^T = A^T$ is symmetric. For complex eigenvector bases, the result is false, even for real matrices. For example, any 2 × 2 rotation matrix $\begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix}$ has the orthogonal eigenvector basis $\begin{pmatrix} i \\ 1 \end{pmatrix}$, $\begin{pmatrix} -i \\ 1 \end{pmatrix}$. See Exercise 8.6.5 for details.

8.4.21. Using the spectral factorization, we have $x^T A x = (Q^T x)^T \Lambda (Q^T x) = \sum_{i=1}^n \lambda_i y_i^2$, where $y_i = u_i \cdot x = \|x\| \cos\theta_i$ denotes the ith entry of $Q^T x$.

8.4.22. Principal stretches = eigenvalues: $4 + \sqrt3$, $4 - \sqrt3$, 1; principal directions = eigenvectors: $(1, -1 + \sqrt3, 1)^T$, $(1, -1 - \sqrt3, 1)^T$, $(-1, 0, 1)^T$.

♦ 8.4.23. Moments of inertia: 4, 2, 1; principal directions: $(1, 2, 1)^T$, $(-1, 0, 1)^T$, $(1, -1, 1)^T$.

♦ 8.4.24. (a) Let $K = Q \Lambda Q^T$ be its spectral factorization. Then $x^T K x = y^T \Lambda y$ where $x = Q y$. The ellipse $y^T \Lambda y = \lambda_1 y_1^2 + \lambda_2 y_2^2 = 1$ has its principal axes aligned with the coordinate axes and semi-axes $1/\sqrt{\lambda_i}$, i = 1, 2. The map $x = Q y$ serves to rotate the coordinate axes to align with the columns of Q, i.e., the eigenvectors, while leaving the semi-axes unchanged.

(b) (i) [Plot of the ellipse.] Ellipse with semi-axes 1, $\frac12$ and principal axes $\begin{pmatrix} 1 \\ 0 \end{pmatrix}$, $\begin{pmatrix} 0 \\ 1 \end{pmatrix}$.

(ii) [Plot of the ellipse.] Ellipse with semi-axes $\sqrt2$, $\sqrt{\frac23}$, and principal axes $\begin{pmatrix} -1 \\ 1 \end{pmatrix}$, $\begin{pmatrix} 1 \\ 1 \end{pmatrix}$.

(iii) [Plot of the ellipse.] Ellipse with semi-axes $\dfrac{1}{\sqrt{2 + \sqrt2}}$, $\dfrac{1}{\sqrt{2 - \sqrt2}}$, and principal axes $\begin{pmatrix} 1 + \sqrt2 \\ 1 \end{pmatrix}$, $\begin{pmatrix} 1 - \sqrt2 \\ 1 \end{pmatrix}$.
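The recipe in part (a), semi-axes equal to $1/\sqrt{\lambda_i}$, can be sketched for a 2 × 2 symmetric positive definite matrix using the closed-form eigenvalues. The matrices below are assumptions: they are rebuilt from the semi-axes and principal axes quoted in (b), since the original quadratic forms are not restated in the solution.

```python
import math

def semi_axes(a, b, c):
    """Semi-axes (longest first) of the ellipse x^T K x = 1 for K = [[a, b], [b, c]].

    K is assumed symmetric positive definite; its eigenvalues are
    mean -/+ radius, and each semi-axis is 1 / sqrt(eigenvalue).
    """
    mean = (a + c) / 2
    radius = math.hypot((a - c) / 2, b)
    lam_small, lam_large = mean - radius, mean + radius
    return 1 / math.sqrt(lam_small), 1 / math.sqrt(lam_large)

# Assumed matrices consistent with the answers to (b)(i), (ii), (iii).
print(semi_axes(1, 0, 4))      # case (i): semi-axes 1 and 1/2
print(semi_axes(1, 0.5, 1))    # case (ii): sqrt(2) and sqrt(2/3)
print(semi_axes(3, 1, 1))      # case (iii): 1/sqrt(2 - sqrt 2) and 1/sqrt(2 + sqrt 2)
```

The smaller eigenvalue gives the longer semi-axis, which is why the function returns the pair in that order.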

(c) If K is positive semi-definite it is a parabola; if K is symmetric and indefinite, a hyperbola; if negative (semi-)definite, the empty set. If K is not symmetric, replace K by $\frac12 (K + K^T)$ as in Exercise 3.4.20, and then apply the preceding classification.

♦ 8.4.25. (a) Same method as in Exercise 8.4.24. Its principal axes are the eigenvectors of K, and the semi-axes are the reciprocals of the square roots of the eigenvalues. (b) Ellipsoid with principal axes: $(1, 0, 1)^T$, $(-1, -1, 1)^T$, $(-1, 2, 1)^T$ and semi-axes $\frac{1}{\sqrt6}$, $\frac{1}{\sqrt{12}}$, $\frac{1}{\sqrt{24}}$.

8.4.26. If $\Lambda = \mathrm{diag}(\lambda_1, \ldots, \lambda_n)$, then the (i, j) entry of $\Lambda M$ is $\lambda_i m_{ij}$, whereas the (i, j) entry of $M \Lambda$ is $\lambda_j m_{ij}$. These are equal if and only if either $m_{ij} = 0$ or $\lambda_i = \lambda_j$. Thus, $\Lambda M = M \Lambda$ with M having one or more non-zero off-diagonal entries, which includes the case of nonzero skew-symmetric matrices, if and only if Λ has one or more repeated diagonal entries. Next, suppose $A = Q \Lambda Q^T$ is symmetric with diagonal form Λ. If $A J = J A$ with $J^T = -J \ne O$, then $\Lambda M = M \Lambda$ where $M = Q^T J Q$ is also nonzero, skew-symmetric, and hence A has repeated eigenvalues. Conversely, if $\lambda_i = \lambda_j$, choose M such that $m_{ij} = 1 = -m_{ji}$, and then A commutes with $J = Q M Q^T$.

♦ 8.4.27. √ √ (a) Set B = Q Λ QT , where Λ is the diagonal matrix with the square roots of the eigenvalues of A along the diagonal. Uniqueness follows from the fact that the eigenvectors and eigenvalues are uniquely determined. (Permuting them does not change the final form of B.) ! √ √ √ ! √ 1 1 3+1 3−1 −1 1− 2 ; 2 2√ q √ √ ; (ii) (b) (i) √ √ 1 2 3−1 3+1 (2 − 2) 2 + 2 1 − 2 (iii)

0√ B @

2 0 0

√0 5 0

1

0 C 0 A; 3

(iv )

0 B B B B @

√1 + √1 2 3 1 √ √ − 1 −1+ 2 3 − 1 + √2 3

1+

√1 − √1 2 3 1 1 √ √ 1+ + 2 3 1 − √2 3

−1 +

√2 3 √2 3 √4 3

−1 + 1− 1+

1

C C C. C A

8.4.28. Only the identity matrix is both orthogonal and positive definite. Indeed, if $K = K^T > 0$ is orthogonal, then $K^2 = I$, and so its eigenvalues are all ± 1. Positive definiteness implies that all the eigenvalues must be +1, and hence its diagonal form is Λ = I. But then $K = Q I Q^T = I$ also.

♦ 8.4.29. If $A = Q B$, then $K = A^T A = B^T Q^T Q B = B^2$, and hence $B = \sqrt{K}$ is the positive definite square root of K. Moreover, $Q = A B^{-1}$ then satisfies $Q^T Q = B^{-T} A^T A B^{-1} = B^{-1} K B^{-1} = I$ since $K = B^2$. Finally, $\det A = \det Q \det B$, and $\det B > 0$ since B > 0. So if $\det A > 0$, then $\det Q = +1 > 0$.

8.4.30. (a)

0 2

1 0

2 1

(c) 0

0 (d) B @1 0 (e)

0

1

B @1

1

1 0 !

−3 0 4

!

0 1

=

0

1 0

1

√1 2 √1 2 0

1

0

=B @

−

!

2 0

0 , 1

10 √1 2 CB A@ √1 2

− 53 0

8 0 B 0C A = @1 6 0

0 −2 1

!

4 5

1 .3897 B 0C A = @ .5127 0 .7650

(b)

1 √1 √1 2 2C A, √1 √3 2 2 10 4 1 0 5 CB 0A@0 5 3 0 0 5

.0323 −.8378 .5450

2 1

−3 6

!

=

0

√2 B 5 @ 1 √ 5

1

− √1

5C A √2 5

√ 5 0

!

0 √ , 3 5

1

0 0C A, 10

10

.9204 1.6674 B −.1877 C A@ −.2604 −.3430 .3897

−.2604 2.2206 .0323

1

.3897 .0323 C A. .9204

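♥ 8.4.31. (i) This follows immediately from the spectral factorization. The rows of $\Lambda Q^T$ are $\lambda_1 u_1^T, \ldots, \lambda_n u_n^T$, and formula (8.34) follows from the alternative version of matrix multiplication given in Exercise 1.2.34.

(ii) (a) $\begin{pmatrix} -3 & 4 \\ 4 & 3 \end{pmatrix} = 5 \begin{pmatrix} \frac15 & \frac25 \\ \frac25 & \frac45 \end{pmatrix} - 5 \begin{pmatrix} \frac45 & -\frac25 \\ -\frac25 & \frac15 \end{pmatrix}$.

(b) $\begin{pmatrix} 2 & -1 \\ -1 & 4 \end{pmatrix} = (3 + \sqrt2) \begin{pmatrix} \frac{3 - 2\sqrt2}{4 - 2\sqrt2} & \frac{1 - \sqrt2}{4 - 2\sqrt2} \\ \frac{1 - \sqrt2}{4 - 2\sqrt2} & \frac{1}{4 - 2\sqrt2} \end{pmatrix} + (3 - \sqrt2) \begin{pmatrix} \frac{3 + 2\sqrt2}{4 + 2\sqrt2} & \frac{1 + \sqrt2}{4 + 2\sqrt2} \\ \frac{1 + \sqrt2}{4 + 2\sqrt2} & \frac{1}{4 + 2\sqrt2} \end{pmatrix}$.

(c) $\begin{pmatrix} 1 & 1 & 0 \\ 1 & 2 & 1 \\ 0 & 1 & 1 \end{pmatrix} = 3 \begin{pmatrix} \frac16 & \frac13 & \frac16 \\ \frac13 & \frac23 & \frac13 \\ \frac16 & \frac13 & \frac16 \end{pmatrix} + \begin{pmatrix} \frac12 & 0 & -\frac12 \\ 0 & 0 & 0 \\ -\frac12 & 0 & \frac12 \end{pmatrix}$.

(d) $\begin{pmatrix} 3 & -1 & -1 \\ -1 & 2 & 0 \\ -1 & 0 & 2 \end{pmatrix} = 4 \begin{pmatrix} \frac23 & -\frac13 & -\frac13 \\ -\frac13 & \frac16 & \frac16 \\ -\frac13 & \frac16 & \frac16 \end{pmatrix} + 2 \begin{pmatrix} 0 & 0 & 0 \\ 0 & \frac12 & -\frac12 \\ 0 & -\frac12 & \frac12 \end{pmatrix} + \begin{pmatrix} \frac13 & \frac13 & \frac13 \\ \frac13 & \frac13 & \frac13 \\ \frac13 & \frac13 & \frac13 \end{pmatrix}$.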
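The spectral decomposition $A = \sum_i \lambda_i\, u_i u_i^T$ from part (i) can be verified exactly for cases (c) and (d). The sketch below works with unnormalized integer eigenvectors and divides by $\|v\|^2$, which is equivalent to using unit vectors; the eigenpairs are the ones implied by the projection matrices quoted above, and the helper names are hypothetical.

```python
from fractions import Fraction

def rank_one_projection(v):
    """Orthogonal projection v v^T / ||v||^2 onto the line spanned by v."""
    norm_sq = sum(x * x for x in v)
    return [[Fraction(a * b, norm_sq) for b in v] for a in v]

def spectral_sum(pairs):
    """Sum of lam * v v^T / ||v||^2 over the given (lam, v) eigenpairs."""
    n = len(pairs[0][1])
    total = [[Fraction(0)] * n for _ in range(n)]
    for lam, v in pairs:
        P = rank_one_projection(v)
        for i in range(n):
            for j in range(n):
                total[i][j] += lam * P[i][j]
    return total

pairs_c = [(3, [1, 2, 1]), (1, [1, 0, -1]), (0, [1, -1, 1])]
pairs_d = [(4, [-2, 1, 1]), (2, [0, -1, 1]), (1, [1, 1, 1])]

print(spectral_sum(pairs_c) == [[1, 1, 0], [1, 2, 1], [0, 1, 1]])
print(spectral_sum(pairs_d) == [[3, -1, -1], [-1, 2, 0], [-1, 0, 2]])
```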

♦ 8.4.32. According to Exercise 8.4.7, the eigenvalues of an n × n Hermitian matrix are all real, and the eigenvectors corresponding to distinct eigenvalues are orthogonal with respect to the Hermitian inner product on $C^n$. Moreover, every Hermitian matrix is complete and has an orthonormal eigenvector basis of $C^n$; a proof follows along the same lines as the symmetric case in Theorem 8.20. Let U be the corresponding matrix whose columns are the orthonormal eigenvector basis. Orthonormality implies that U is a unitary matrix: $U^\dagger U = I$, and satisfies $H U = U \Lambda$ where Λ is the real matrix with the eigenvalues of H on the diagonal. Therefore, $H = U \Lambda U^\dagger$.

8.4.33. (a)

(b)

3 − 2i 6 1 + 2i

2i 6

!

0

=B @

1 − 2i 2

!

1 0 ! √2 − √i 5 5C 2 0 B 5 A @ 2i √ √1 √2 0 7 5 5 0 1 05 −1+2 1−2 ! √ i √ i B 6 30 C 7 0 B √ @ A @ √1 √5 0 1 6 6

√i

− =


1 2i √ 5C A, √1 5 1+2 √ i 6 √ i − 1+2 30

1 √1 √6 C A, √5 6

(c)

0

−1

B @ −5 i

−4

1

5i −1 −4 i

0

B −4 B 4i C A=B B @ 8

√1 6 √i 6 √2 6

− √i

−

√1 3 − √i 3 √1 3

2 √1 2

0

1 0 C 12 CB CB 0 C@ A

0

0 −6 0

0 1 0 CB B B 0C AB @

0

√1 6 √i 2 √1 3

−

− √i

6 √1 2 √i 3

1 √2 6C C 0 C C. A 1 √ 3

8.4.34. Maximum: 7; minimum: 3.

8.4.35. Maximum: $\frac{4 + \sqrt5}{2}$; minimum: $\frac{4 - \sqrt5}{2}$.

8.4.36.
(a) $\frac{5 + \sqrt5}{2} = \max\{\, 2 x^2 - 2 x y + 3 y^2 \mid x^2 + y^2 = 1 \,\}$, $\frac{5 - \sqrt5}{2} = \min\{\, 2 x^2 - 2 x y + 3 y^2 \mid x^2 + y^2 = 1 \,\}$;
(b) $5 = \max\{\, 4 x^2 + 2 x y + 4 y^2 \mid x^2 + y^2 = 1 \,\}$, $3 = \min\{\, 4 x^2 + 2 x y + 4 y^2 \mid x^2 + y^2 = 1 \,\}$;
(c) $12 = \max\{\, 6 x^2 - 8 x y + 2 x z + 6 y^2 - 2 y z + 11 z^2 \mid x^2 + y^2 + z^2 = 1 \,\}$, $2 = \min\{\, 6 x^2 - 8 x y + 2 x z + 6 y^2 - 2 y z + 11 z^2 \mid x^2 + y^2 + z^2 = 1 \,\}$;
(d) $6 = \max\{\, 4 x^2 - 2 x y - 4 x z + 4 y^2 - 2 y z + 4 z^2 \mid x^2 + y^2 + z^2 = 1 \,\}$, $3 - \sqrt3 = \min\{\, 4 x^2 - 2 x y - 4 x z + 4 y^2 - 2 y z + 4 z^2 \mid x^2 + y^2 + z^2 = 1 \,\}$.
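The extreme values in part (a) can be approximated by brute force, which gives a quick sanity check on the eigenvalue characterization. This is only a numerical sketch: the function names and the sample count are arbitrary choices made for the illustration.

```python
import math

def quadratic_form(x, y):
    """The form from part (a): q(x, y) = 2 x^2 - 2 x y + 3 y^2."""
    return 2 * x * x - 2 * x * y + 3 * y * y

def extremes_on_unit_circle(q, samples=100_000):
    """Approximate max and min of q over x^2 + y^2 = 1 by sampling angles."""
    values = [q(math.cos(t), math.sin(t))
              for t in (2 * math.pi * k / samples for k in range(samples))]
    return max(values), min(values)

q_max, q_min = extremes_on_unit_circle(quadratic_form)
# Compare with the largest and smallest eigenvalues (5 +/- sqrt 5) / 2.
print(abs(q_max - (5 + math.sqrt(5)) / 2) < 1e-6)
print(abs(q_min - (5 - math.sqrt(5)) / 2) < 1e-6)
```

Sampling works here because the constraint set is one-dimensional; for the three-variable forms in (c) and (d) computing the eigenvalues directly is the practical route.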

8.4.37. (c) $9 = \max\{\, 6 x^2 - 8 x y + 2 x z + 6 y^2 - 2 y z + 11 z^2 \mid x^2 + y^2 + z^2 = 1,\ x - y + 2 z = 0 \,\}$; (d) $3 + \sqrt3 = \max\{\, 4 x^2 - 2 x y - 4 x z + 4 y^2 - 2 y z + 4 z^2 \mid x^2 + y^2 + z^2 = 1,\ x - z = 0 \,\}$.

8.4.38. (a) Maximum: 3; minimum: −2; (b) maximum: $\frac52$; minimum: $-\frac12$; (c) maximum: $\frac{8 + \sqrt5}{2} = 5.11803$; minimum: $\frac{8 - \sqrt5}{2} = 2.88197$; (d) maximum: $\frac{4 + \sqrt{10}}{2} = 3.58114$; minimum: $\frac{4 - \sqrt{10}}{2} = .41886$.

8.4.39. Maximum: $\cos \dfrac{\pi}{n + 1}$; minimum: $-\cos \dfrac{\pi}{n + 1}$.

8.4.40. Maximum: $r^2 \lambda_1$; minimum: $r^2 \lambda_n$, where $\lambda_1$, $\lambda_n$ are, respectively, the maximum and minimum eigenvalues of K.

8.4.41. $\max\{\, x^T K x \mid \|x\| = 1 \,\} = \lambda_1$ is the largest eigenvalue of K. On the other hand, $K^{-1}$ is positive definite, cf. Exercise 3.4.10, and hence $\min\{\, x^T K^{-1} x \mid \|x\| = 1 \,\} = \mu_n$ is its smallest eigenvalue. But the eigenvalues of $K^{-1}$ are the reciprocals of the eigenvalues of K, and hence its smallest eigenvalue is $\mu_n = 1/\lambda_1$, and so the product is $\lambda_1 \mu_n = 1$.

♦ 8.4.42. According to the discussion preceding the statement of Theorem 8.30,
$$\lambda_j = \max\{\, y^T \Lambda y \mid \|y\| = 1,\ y \cdot e_1 = \cdots = y \cdot e_{j-1} = 0 \,\}.$$
Moreover, using (8.33), setting $x = Q y$ and using the fact that Q is an orthogonal matrix and so $(Q v) \cdot (Q w) = v \cdot w$ for any $v, w \in R^n$, we have
$$x^T A x = y^T \Lambda y, \qquad \|x\| = \|y\|, \qquad y \cdot e_i = x \cdot v_i,$$
where $v_i = Q e_i$ is the ith eigenvector of A. Therefore, by the preceding formula,
$$\lambda_j = \max\{\, x^T A x \mid \|x\| = 1,\ x \cdot v_1 = \cdots = x \cdot v_{j-1} = 0 \,\}.$$

♦ 8.4.43. Let A be a symmetric matrix with eigenvalues $\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_n$ and corresponding orthogonal eigenvectors $v_1, \ldots, v_n$. Then the minimal value of the quadratic form $x^T A x$ over all unit vectors which are orthogonal to the last n − j eigenvectors is the jth eigenvalue:
$$\lambda_j = \min\{\, x^T A x \mid \|x\| = 1,\ x \cdot v_{j+1} = \cdots = x \cdot v_n = 0 \,\}.$$

8.4.44. Note that $\dfrac{v^T K v}{\|v\|^2} = u^T K u$, where $u = \dfrac{v}{\|v\|}$ is a unit vector. Moreover, if v is orthogonal to an eigenvector $v_i$, so is u. Therefore, by Theorem 8.30,
$$\max\left\{\, \frac{v^T K v}{\|v\|^2} \;\middle|\; v \ne 0,\ v \cdot v_1 = \cdots = v \cdot v_{j-1} = 0 \,\right\} = \max\{\, u^T K u \mid \|u\| = 1,\ u \cdot v_1 = \cdots = u \cdot v_{j-1} = 0 \,\} = \lambda_j.$$

♥ 8.4.45. (a) Let $R = \sqrt{M}$ be the positive definite square root of M, and set $\widetilde K = R^{-1} K R^{-1}$. Then $x^T K x = y^T \widetilde K y$, $x^T M x = y^T y = \|y\|^2$, where $y = R x$. Thus,
$$\max\{\, x^T K x \mid x^T M x = 1 \,\} = \max\{\, y^T \widetilde K y \mid \|y\|^2 = 1 \,\} = \widetilde\lambda_1,$$
the largest eigenvalue of $\widetilde K$. But $\widetilde K y = \lambda y$ implies $K x = \lambda M x$, and so the generalized eigenvalues of K and the eigenvalues of $\widetilde K$ coincide.

(b) Write $y = \dfrac{x}{\sqrt{x^T M x}}$ so that $y^T M y = 1$. Then, by part (a),
$$\max\left\{\, \frac{x^T K x}{x^T M x} \;\middle|\; x \ne 0 \,\right\} = \max\{\, y^T K y \mid y^T M y = 1 \,\} = \lambda_1.$$

(c) λn = min{ xT K x | xT M x = 1 }. (d) λj = max{ xT K x | xT M x = 1, xT M v1 = · · · = xT M vj−1 = 0 } where v1 , . . . , vn are the generalized eigenvectors. √

√

8.4.46. (a) Maximum: 34 , minimum: 52 ; (b) maximum: 9+47 2 , minimum: 9−47 (c) maximum: 2, minimum: 12 ; (d) maximum: 4, minimum: 1. 1 0

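8.4.47. No. For example, if $A = \begin{pmatrix} 1 & b \\ 0 & 4 \end{pmatrix}$ has eigenvalues 1, 4, then $q(x) > 0$ for $x \neq 0$ if and only if $| b | < 4$.

8.5.1. (a) $\sqrt{3 \pm \sqrt{5}}$, (b) 1, 1; (c) $5\sqrt{2}$; (d) 3, 2; (e) $\sqrt{7}, \sqrt{2}$; (f) 3, 1.

8.5.2.
(a) $\begin{pmatrix} 1 & 1 \\ 0 & 2 \end{pmatrix} = \begin{pmatrix} \frac{-1+\sqrt5}{\sqrt{10-2\sqrt5}} & \frac{-1-\sqrt5}{\sqrt{10+2\sqrt5}} \\ \frac{2}{\sqrt{10-2\sqrt5}} & \frac{2}{\sqrt{10+2\sqrt5}} \end{pmatrix} \begin{pmatrix} \sqrt{3+\sqrt5} & 0 \\ 0 & \sqrt{3-\sqrt5} \end{pmatrix} \begin{pmatrix} \frac{-2+\sqrt5}{\sqrt{10-4\sqrt5}} & \frac{1}{\sqrt{10-4\sqrt5}} \\ \frac{-2-\sqrt5}{\sqrt{10+4\sqrt5}} & \frac{1}{\sqrt{10+4\sqrt5}} \end{pmatrix}$,
(b) $\begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$,
(c) $\begin{pmatrix} 1 & -2 \\ -3 & 6 \end{pmatrix} = \begin{pmatrix} -\frac{1}{\sqrt{10}} \\ \frac{3}{\sqrt{10}} \end{pmatrix} \left( 5\sqrt2 \right) \begin{pmatrix} -\frac{1}{\sqrt5} & \frac{2}{\sqrt5} \end{pmatrix}$,
(d) $\begin{pmatrix} 2 & 0 & 0 \\ 0 & 3 & 0 \end{pmatrix} = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \begin{pmatrix} 3 & 0 \\ 0 & 2 \end{pmatrix} \begin{pmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \end{pmatrix}$,
(e) $\begin{pmatrix} 2 & 1 & 0 & -1 \\ 0 & -1 & 1 & 1 \end{pmatrix} = \begin{pmatrix} -\frac{2}{\sqrt5} & \frac{1}{\sqrt5} \\ \frac{1}{\sqrt5} & \frac{2}{\sqrt5} \end{pmatrix} \begin{pmatrix} \sqrt7 & 0 \\ 0 & \sqrt2 \end{pmatrix} \begin{pmatrix} -\frac{4}{\sqrt{35}} & -\frac{3}{\sqrt{35}} & \frac{1}{\sqrt{35}} & \frac{3}{\sqrt{35}} \\ \frac{2}{\sqrt{10}} & -\frac{1}{\sqrt{10}} & \frac{2}{\sqrt{10}} & \frac{1}{\sqrt{10}} \end{pmatrix}$,
(f) $\begin{pmatrix} 1 & -1 & 0 \\ -1 & 2 & -1 \\ 0 & -1 & 1 \end{pmatrix} = \begin{pmatrix} -\frac{1}{\sqrt6} & -\frac{1}{\sqrt2} \\ \frac{\sqrt2}{\sqrt3} & 0 \\ -\frac{1}{\sqrt6} & \frac{1}{\sqrt2} \end{pmatrix} \begin{pmatrix} 3 & 0 \\ 0 & 1 \end{pmatrix} \begin{pmatrix} -\frac{1}{\sqrt6} & \frac{\sqrt2}{\sqrt3} & -\frac{1}{\sqrt6} \\ -\frac{1}{\sqrt2} & 0 & \frac{1}{\sqrt2} \end{pmatrix}$.

8.5.3.
(a) $\begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix} = \begin{pmatrix} \frac{1+\sqrt5}{\sqrt{10+2\sqrt5}} & \frac{1-\sqrt5}{\sqrt{10-2\sqrt5}} \\ \frac{2}{\sqrt{10+2\sqrt5}} & \frac{2}{\sqrt{10-2\sqrt5}} \end{pmatrix} \begin{pmatrix} \sqrt{\frac32 + \frac{\sqrt5}{2}} & 0 \\ 0 & \sqrt{\frac32 - \frac{\sqrt5}{2}} \end{pmatrix} \begin{pmatrix} \frac{-1+\sqrt5}{\sqrt{10-2\sqrt5}} & \frac{2}{\sqrt{10-2\sqrt5}} \\ \frac{-1-\sqrt5}{\sqrt{10+2\sqrt5}} & \frac{2}{\sqrt{10+2\sqrt5}} \end{pmatrix}$;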

(b) The first and last matrices are proper orthogonal, and so represent rotations, while the middle matrix is a stretch along the coordinate directions, in proportion to the singular values. Matrix multiplication corresponds to composition of the corresponding linear transformations.

8.5.4.
(a) The eigenvalues of $K = A^T A$ are $\frac{15}{2} \pm \frac{\sqrt{221}}{2} = 14.933,\ .0667$. The square roots of these eigenvalues give us the singular values of $A$, i.e., 3.8643, .2588. The condition number is 3.86433 / .25878 = 14.9330.
(b) The singular values are 1.50528, .030739, and so the condition number is 1.50528 / .030739 = 48.9697.
(c) The singular values are 3.1624, .0007273, and so the condition number is 3.1624 / .0007273 = 4348.17; slightly ill-conditioned.
(d) The singular values are 12.6557, 4.34391, .98226, so the condition number is 12.6557 / .98226 = 12.88418.
(e) The singular values are 239.138, 3.17545, .00131688, so the condition number is 239.138 / .00131688 = 181594; ill-conditioned.
(f) The singular values are 30.2887, 3.85806, .843107, .01015, so the condition number is 30.2887 / .01015 = 2984.09; slightly ill-conditioned.

♠ 8.5.5. In all cases, the large condition number results in an inaccurate solution.
(a) The exact solution is x = 1, y = −1; with three digit rounding, the computed solution is x = 1.56, y = −1.56. The singular values of the coefficient matrix are 1615.22, .274885, and the condition number is 5876.
(b) The exact solution is x = −1, y = −109, z = 231; with three digit rounding, the computed solution is x = −2.06, y = −75.7, z = 162. The singular values of the coefficient matrix are 265.6, 1.66, .0023, and the condition number is $1.17 \times 10^5$.
(c) The exact solution is x = −1165.01, y = 333.694, z = 499.292; with three digit rounding, the computed solution is x = −467, y = 134, z = 200. The singular values of the coefficient matrix are 8.1777, 3.3364, .00088, and the condition number is 9293.

♠ 8.5.6.
(a) The 2 × 2 Hilbert matrix has singular values 1.2676, .0657 and condition number 19.2815. The 3 × 3 Hilbert matrix has singular values 1.4083, .1223, .0027 and condition number 524.057. The 4 × 4 Hilbert matrix has singular values 1.5002, .1691, .006738, .0000967 and condition number 15513.7.
(b) The 5 × 5 Hilbert matrix has condition number $4.7661 \times 10^5$; the 6 × 6 Hilbert matrix has condition number $1.48294 \times 10^7$.

8.5.7. Let $A = v \in \mathbb{R}^n$ be the matrix (column vector) in question. (a) It has one singular value: $\| v \|$; (b) $P = \dfrac{v}{\| v \|}$, $\Sigma = \bigl( \| v \| \bigr)$, a 1 × 1 matrix, $Q = (1)$; (c) $v^+ = \dfrac{v^T}{\| v \|^2}$.

8.5.8. Let $A = v^T$, where $v \in \mathbb{R}^n$, be the matrix (row vector) in question. (a) It has one singular value: $\| v \|$; (b) $P = (1)$, $\Sigma = \bigl( \| v \| \bigr)$, $Q = \dfrac{v}{\| v \|}$; (c) $v^+ = \dfrac{v}{\| v \|^2}$.
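The pseudoinverse formula in Exercises 8.5.7 and 8.5.8 can be checked numerically. The following is a minimal sketch, not part of the original solutions: the helper name `pinv_column` and the sample vector are illustrative assumptions, and the formula is only applied to a nonzero vector.

```python
def pinv_column(v):
    """Pseudoinverse of a nonzero column vector v: the row vector v^T / ||v||^2."""
    norm_sq = sum(x * x for x in v)
    if norm_sq == 0:
        raise ValueError("this formula requires a nonzero vector")
    return [x / norm_sq for x in v]

v = [1.0, 2.0, 2.0]        # illustrative vector with ||v|| = 3
v_plus = pinv_column(v)    # entries of the row vector v^+

# v^+ v is a 1 x 1 matrix; it should equal 1, which gives A A^+ A = A.
scalar = sum(p * x for p, x in zip(v_plus, v))
a_aplus_a = [x * scalar for x in v]
print(scalar, a_aplus_a)
```

Since $v^+ v = 1$, the identities $A A^+ A = A$ and $A^+ A A^+ = A^+$ of Exercise 8.5.27 hold for this rank-one case.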

8.5.9. Almost true, with but one exception: the zero matrix.

8.5.10. Since $S^2 = K = A^T A$, the eigenvalues $\lambda$ of $K$ are the squares, $\lambda = \sigma^2$, of the eigenvalues $\sigma$ of $S$. Moreover, since $S \geq 0$, its eigenvalues are all non-negative, so $\sigma = +\sqrt{\lambda}$, and, by definition, the nonzero $\sigma > 0$ are the singular values of $A$.

8.5.11. True. If $A = P \Sigma Q^T$ is the singular value decomposition of $A$, then the transposed equation $A^T = Q \Sigma P^T$ gives the singular value decomposition of $A^T$, and so the diagonal entries of $\Sigma$ are also the singular values of $A^T$.

♦ 8.5.12. Since $A$ is nonsingular, so is $K = A^T A$, and hence all its eigenvalues are nonzero. Thus, $Q$, whose columns are the orthonormal eigenvector basis of $K$, is a square orthogonal matrix, as is $P$. Therefore, the singular value decomposition of the inverse matrix is $A^{-1} = Q^{-T} \Sigma^{-1} P^{-1} = Q \Sigma^{-1} P^T$. The diagonal entries of $\Sigma^{-1}$, which are the singular values of $A^{-1}$, are the reciprocals of the diagonal entries of $\Sigma$. Finally, since the largest singular value of $A^{-1}$ is $1/\sigma_n$ and the smallest is $1/\sigma_1$, $\kappa(A^{-1}) = (1/\sigma_n)/(1/\sigma_1) = \sigma_1/\sigma_n = \kappa(A)$.

♦ 8.5.13.
(a) When $A$ is nonsingular, all matrices in its singular value decomposition (8.40) are square. Thus, we can compute $\det A = \det P \,\det \Sigma \,\det Q^T = \pm 1 \cdot \det \Sigma = \pm\, \sigma_1 \sigma_2 \cdots \sigma_n$, since the determinant of an orthogonal matrix is ±1. The result follows upon taking absolute values of this equation and using the fact that the product of the singular values is nonnegative.
(b) No, even simple nondiagonal examples show this is false.
(c) Numbering the singular values in decreasing order, so $\sigma_k \geq \sigma_n$ for all $k$, we conclude $10^{-k} > | \det A | = \sigma_1 \sigma_2 \cdots \sigma_n \geq \sigma_n^n$, and the result follows by taking the $n$th root.
(d) Not necessarily, since all the singular values could be very small but equal, and in this case the condition number would be 1.
(e) The diagonal matrix with entries $10^k$ and $10^{-k}$ for $k \gg 0$, or more generally, any 2 × 2 matrix with singular values $10^k$ and $10^{-k}$, has condition number $10^{2k}$.

8.5.14. False. For example, the diagonal matrix with entries $2 \cdot 10^k$ and $10^{-k}$ for $k \gg 0$ has determinant 2 but condition number $2 \cdot 10^{2k}$.

8.5.15. False: the singular values are the absolute values of the nonzero eigenvalues.

8.5.16. False. For example, $U = \begin{pmatrix} 1 & 1 \\ 0 & 2 \end{pmatrix}$ has singular values $\sqrt{3 \pm \sqrt{5}}$.
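As a numerical illustration of Exercises 8.5.13(a) and 8.5.16, the sketch below computes the singular values of a 2 × 2 matrix from the trace and determinant of $K = A^T A$. The function name `singular_values_2x2` is an assumption introduced here; it is a hand-rolled closed form for the 2 × 2 case only, not a general SVD routine.

```python
import math

def singular_values_2x2(a, b, c, d):
    """Singular values of [[a, b], [c, d]], largest first.

    They are the square roots of the eigenvalues of K = A^T A, obtained
    from the trace and determinant of K via the quadratic formula.
    """
    trace = a * a + b * b + c * c + d * d      # tr(A^T A)
    det_k = (a * d - b * c) ** 2                # det(A^T A) = (det A)^2
    disc = math.sqrt(max(trace * trace - 4.0 * det_k, 0.0))
    lam_max = (trace + disc) / 2.0
    lam_min = max((trace - disc) / 2.0, 0.0)
    return math.sqrt(lam_max), math.sqrt(lam_min)

# U = [[1, 1], [0, 2]] has eigenvalues 1 and 2 (its diagonal entries).
s1, s2 = singular_values_2x2(1.0, 1.0, 0.0, 2.0)
print(s1 ** 2, s2 ** 2)   # squares of the singular values: 3 + sqrt(5), 3 - sqrt(5)
print(s1 * s2)            # product of singular values equals |det U| = 2
print(s1 / s2)            # condition number of U
```

The singular values differ from the eigenvalues 1, 2, while their product still equals $|\det U|$.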

8.5.17. False, unless $A$ is symmetric or, more generally, normal, meaning that $A^T A = A A^T$. For example, the singular values of $A = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}$ are $\sqrt{\frac32 \pm \frac{\sqrt5}{2}}$, while the singular values of $A^2 = \begin{pmatrix} 1 & 2 \\ 0 & 1 \end{pmatrix}$ are $\sqrt{3 \pm 2\sqrt2}$.

8.5.18. False. This is only true if $S$ is an orthogonal matrix.

♥ 8.5.19.
(a) If $\| x \| = 1$, then $y = A x$ satisfies the equation $y^T B y = 1$, where $B = A^{-T} A^{-1} = P \Sigma^{-2} P^T$. Thus, by Exercise 8.4.24, the principal axes of the ellipse are the columns of $P$, and the semi-axes are the reciprocals of the square roots of the diagonal entries of $\Sigma^{-2}$, which are precisely the singular values $\sigma_i$.
(b) If $A$ is symmetric (and nonsingular), $P = Q$ is the orthogonal eigenvector matrix, and so the columns of $P$ coincide with the eigenvectors of $A$. Moreover, the singular values $\sigma_i = | \lambda_i |$ are the absolute values of its eigenvalues.
(c) From elementary geometry, the area of an ellipse equals $\pi$ times the product of its semi-axes. Thus, area $E = \pi \sigma_1 \sigma_2 = \pi\, | \det A |$ using Exercise 8.5.13.
(d) (i) principal axes: $\begin{pmatrix} 2 \\ 1 - \sqrt5 \end{pmatrix}, \begin{pmatrix} 2 \\ 1 + \sqrt5 \end{pmatrix}$; semi-axes: $\sqrt{\frac32 + \frac{\sqrt5}{2}},\ \sqrt{\frac32 - \frac{\sqrt5}{2}}$; area: $\pi$.
(ii) principal axes: $\begin{pmatrix} 2 \\ -1 \end{pmatrix}, \begin{pmatrix} 1 \\ 2 \end{pmatrix}$ (but any orthogonal basis of $\mathbb{R}^2$ will also do); semi-axes: $\sqrt5, \sqrt5$ (it's a circle); area: $5\pi$.
(iii) principal axes: $\begin{pmatrix} 3 \\ 1 \end{pmatrix}, \begin{pmatrix} 1 \\ -3 \end{pmatrix}$; semi-axes: $3\sqrt5, \sqrt5$; area: $15\pi$.
(e) If $A = O$, then $E = \{0\}$ is a point. Otherwise, rank $A = 1$ and its singular value decomposition is $A = \sigma_1 p_1 q_1^T$ where $A q_1 = \sigma_1 p_1$. Then $E$ is a line segment in the direction of $p_1$ of length $2 \sigma_1$.

8.5.20. $\left( \frac{10}{11} u + \frac{3}{11} v \right)^2 + \left( \frac{3}{11} u + \frac{2}{11} v \right)^2 = 1$ or $109\, u^2 + 72\, u v + 13\, v^2 = 121$. Since $A$ is symmetric, the semi-axes are the eigenvalues, which are the same as the singular values, namely 11, 1, so the ellipse is very long and thin; the principal axes are the eigenvectors, $\begin{pmatrix} -1 \\ 3 \end{pmatrix}, \begin{pmatrix} 3 \\ 1 \end{pmatrix}$; the area is $11\pi$.

8.5.21.
(a) In view of the singular value decomposition of $A = P \Sigma Q^T$, the set $E$ is obtained by first rotating the unit sphere according to $Q^T$, which doesn't change it, then stretching it along the coordinate axes into an ellipsoid according to $\Sigma$, and then rotating it according to $P$, which aligns its principal axes with the columns of $P$. The equation is
$$(65 u + 43 v - 2 w)^2 + (43 u + 65 v + 2 w)^2 + (-2 u + 2 v + 20 w)^2 = 216^2,$$
or
$$1013\, u^2 + 1862\, u v + 1013\, v^2 - 28\, u w + 28\, v w + 68\, w^2 = 7776.$$
(b) The semi-axes are the eigenvalues: 12, 9, 2; the principal axes are the eigenvectors: $( 1, -1, 2 )^T$, $( -1, 1, 1 )^T$, $( 1, 1, 0 )^T$.
(c) Since the unit sphere has volume $\frac43 \pi$, the volume of $E$ is $\frac43 \pi \det A = 288\, \pi$.
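The expansion of the ellipsoid equation in part (a) is a routine but error-prone computation, so a short check may be useful. This is only a sketch: the helper `expand_sum_of_squares` is a name introduced here, and the common factor 6 is the one visible when comparing $216^2 = 46656$ with $7776$.

```python
from itertools import combinations_with_replacement

def expand_sum_of_squares(forms):
    """Coefficients of sum_i (f_i . x)^2, keyed by index pairs (j, k) with j <= k."""
    n = len(forms[0])
    coeffs = {}
    for j, k in combinations_with_replacement(range(n), 2):
        total = sum(f[j] * f[k] for f in forms)
        # Off-diagonal monomials x_j x_k appear twice in the expansion.
        coeffs[(j, k)] = total if j == k else 2 * total
    return coeffs

# The three linear forms in (u, v, w) from Exercise 8.5.21(a).
forms = [(65, 43, -2), (43, 65, 2), (-2, 2, 20)]
coeffs = expand_sum_of_squares(forms)

# Divide out the common factor 6 on both sides of the equation.
reduced = {key: value // 6 for key, value in coeffs.items()}
rhs = 216 ** 2 // 6
print(reduced, rhs)
```

Indices 0, 1, 2 stand for $u$, $v$, $w$, so `reduced[(0, 1)]` is the coefficient of $u v$.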

♦ 8.5.22.
(a) $\| A u \|^2 = (A u)^T A u = u^T K u$, where $K = A^T A$. According to Theorem 8.28, $\max\{\, u^T K u \mid \| u \| = 1 \,\}$ is the largest eigenvalue $\lambda_1$ of $K = A^T A$, hence the maximum value of $\| A u \| = \sqrt{u^T K u}$ is $\sqrt{\lambda_1} = \sigma_1$.
(b) This is true if rank $A = n$ by the same reasoning, but false if $\ker A \neq \{0\}$, since then the minimum is 0, but, according to our definition, singular values are always nonzero.
(c) The $k$th singular value $\sigma_k$ is obtained by maximizing $\| A u \|$ over all unit vectors which are orthogonal to the first $k - 1$ singular vectors.

♦ 8.5.23. Let $\lambda_1$ be the maximal eigenvalue, and let $u_1$ be a corresponding unit eigenvector. By Exercise 8.5.22, $\sigma_1 \geq \| A u_1 \| = | \lambda_1 |$.

8.5.24. By Exercise 8.5.22, the numerator is the largest singular value, while the denominator is the smallest, and so the ratio is the condition number.

8.5.25. (a)

0 @

−

1 20 1 20

3 − 20 3 20

1

A,

(b)

0 @

−

1 5 2 5

1 2 5 A, 1 5


(c)

0

1 @2

0

0 −1

1

0A , 0

(d)

0

0

B @0

1

0 −1 0

1

0 0C A, 0

(e) 8.5.26.

0 B B B @

−

1 15 1 15 1 15

2 − 15

1

C 2 C C, 15 A 2 − 15

(f )

0

1 @ 140 3 140

1 70 3 70

0

!

1 3 140 A, 9 140

(g)

1

0

1 B 9 B 5 B @ 18 1 18

− 91 2 9 4 9

0

!

2 9 1 18 7 − 18

1

C C C. A

1

1 3 1 1 1 1 ⋆ + 20 A, @ − 4 A; , A+ = @ 20 (a) A = x = A = 1 3 3 3 −2 − 141 20 20 0 0 1 1 0 1 −3 2 1 1 1 0 C ⋆ +B 3 6 A, (b) A = B 1C x = A A+ = @ 6 3 @2 @ −1 A = A, 7 1 1 − − 11 11 11 11 1 1 0 0 1 1 0

1 2

(c) A =

1 −1

!

1 , 1

A+ =

1 B7 B4 B @7 2 7

2 7 5 − 14 1 14

C C C, A

x⋆ = A+

5 2

!

=

!

;

9 B 7C B 15 C B C. @ 7 A 11 7

♥ 8.5.27. We repeatedly use the fact that the columns of $P, Q$ are orthonormal, and so $P^T P = I$, $Q^T Q = I$.
(a) Since $A^+ = Q \Sigma^{-1} P^T$ is the singular value decomposition of $A^+$, we have $(A^+)^+ = P (\Sigma^{-1})^{-1} Q^T = P \Sigma Q^T = A$.
(b) $A A^+ A = (P \Sigma Q^T)(Q \Sigma^{-1} P^T)(P \Sigma Q^T) = P \Sigma \Sigma^{-1} \Sigma Q^T = P \Sigma Q^T = A$.
(c) $A^+ A A^+ = (Q \Sigma^{-1} P^T)(P \Sigma Q^T)(Q \Sigma^{-1} P^T) = Q \Sigma^{-1} \Sigma \Sigma^{-1} P^T = Q \Sigma^{-1} P^T = A^+$. Or, you can use the fact that $(A^+)^+ = A$.
(d) $(A A^+)^T = (Q \Sigma^{-1} P^T)^T (P \Sigma Q^T)^T = P (\Sigma^{-1})^T Q^T Q \Sigma^T P^T = P (\Sigma^{-1})^T \Sigma^T P^T = P P^T = P \Sigma \Sigma^{-1} P^T = (P \Sigma Q^T)(Q \Sigma^{-1} P^T) = A A^+$.
(e) This follows from part (d) since $(A^+)^+ = A$.

8.5.28. In general, we know that $x^\star = A^+ b$ is the vector of minimum norm that minimizes the least squares error $\| A x - b \|$. In particular, if $b \in \operatorname{rng} A$, so $b = A x_0$ for some $x_0$, then the minimum least squares error is $0 = \| A x_0 - b \|$. If $\ker A = \{0\}$, then the solution is unique, and so $x^\star = x_0$; otherwise, $x^\star \in \operatorname{corng} A$ is the solution to $A x = b$ of minimum norm.

8.6.1. (a) U =

0 B @

0

(b) U = B @

0

(c) U = B @ (d) U =

0 B @

−

√1 2 √1 2

−

√1 2 √1 2

−

√3 13 √2 13

1+3 √ i 14 √ − √2 7

1 √1 2C A, √1 2 1 √1 2C A, √1 2 1 √2 13 C A, √3 13 √ 1 √2 7 C A, 1−3 √ i 14

!

∆=

2 0

−2 ; 2

∆=

3 0

0 ; −1

!

∆=

2 0

∆=

3i 0

!

15 ; −1 −2 − 3 i −3 i


!

;

0 B B

(e) U = B B @

(f ) U =

−

4 √ 3√5 5 3 2 √ 3 5

√1 5

0 √2 5

0 1 √ B 2 B B 0 B @ √1 2

√i − 1+

2 2 √1 2 1+ √i 2 2

2 3 − 32 1 3 1 2 1− i 2 − 21

1

C C C, C A 1

C C C, C A

0

−2

−1

@

0

1

0

0

B B

∆=B B 0 B

B ∆=B @

−1

1

0 0

i 0

22 √ 5 − √9 5

1

1

C C C; C A

1 1− √i 2√ C √ C . − 2 + i 2C A

−i

8.6.2. If $U$ is real, then $U^\dagger = U^T$ is the same as its transpose, and so (8.45) reduces to $U^T U = I$, which is the condition that $U$ be an orthogonal matrix.

♦ 8.6.3. If $U_1^\dagger U_1 = I = U_2^\dagger U_2$, then $(U_1 U_2)^\dagger (U_1 U_2) = U_2^\dagger U_1^\dagger U_1 U_2 = U_2^\dagger U_2 = I$, and so $U_1 U_2$ is also unitary.

♦ 8.6.4. If $A$ is symmetric, its eigenvalues are real, and hence its Schur Decomposition is $A = Q \Delta Q^T$, where $Q$ is an orthogonal matrix. But $A^T = (Q \Delta Q^T)^T = Q \Delta^T Q^T$, and hence $\Delta^T = \Delta$ is a symmetric upper triangular matrix, which implies that $\Delta = \Lambda$ is a diagonal matrix with the eigenvalues of $A$ along its diagonal.

♥ 8.6.5.
(a) If $A$ is real, $A^\dagger = A^T$, and so if $A = A^T$ then $A^T A = A^2 = A A^T$.
(b) If $A$ is unitary, then $A^\dagger A = I = A A^\dagger$.
(c) Every real orthogonal matrix is unitary, so this follows from part (b).
(d) When $A$ is upper triangular, the $i$th diagonal entry of the matrix equation $A^\dagger A = A A^\dagger$ is $| a_{ii} |^2 = \sum_{k=i}^{n} | a_{ik} |^2$, and hence $a_{ik} = 0$ for all $k > i$. Therefore $A$ is a diagonal matrix.
(e) Let $U = ( u_1\ u_2\ \ldots\ u_n )$ be the corresponding unitary matrix, with $U^{-1} = U^\dagger$. Then $A U = U \Lambda$, where $\Lambda$ is the diagonal eigenvalue matrix, and so $A = U \Lambda U^{-1} = U \Lambda U^\dagger$. Then $A A^\dagger = U \Lambda U^\dagger U \Lambda^\dagger U^\dagger = U \Lambda \Lambda^\dagger U^\dagger = A^\dagger A$ since $\Lambda \Lambda^\dagger = \Lambda^\dagger \Lambda$ as it is diagonal.
(f) Let $A = U \Delta U^\dagger$ be its Schur Decomposition. Then, as in part (e), $A A^\dagger = U \Delta \Delta^\dagger U^\dagger$, while $A^\dagger A = U \Delta^\dagger \Delta U^\dagger$. Thus, $A$ is normal if and only if $\Delta$ is; but part (d) says this happens if and only if $\Delta = \Lambda$ is diagonal, and hence $A = U \Lambda U^\dagger$ satisfies the conditions of part (e).
(g) If and only if it is symmetric. Indeed, by the argument in part (f), $A = Q \Lambda Q^T$ where $Q$ is real, orthogonal, which is just the spectral factorization of $A = A^T$.

8.6.6.
(a) One 2 × 2 Jordan block; eigenvalue 2; eigenvector $e_1$.
(b) Two 1 × 1 Jordan blocks; eigenvalues −3, 6; eigenvectors $e_1, e_2$.
(c) One 1 × 1 and one 2 × 2 Jordan blocks; eigenvalue 1; eigenvectors $e_1, e_2$.
(d) One 3 × 3 Jordan block; eigenvalue 0; eigenvector $e_1$.
(e) One 1 × 1, 2 × 2 and 1 × 1 Jordan blocks; eigenvalues 4, 3, 2; eigenvectors $e_1, e_2, e_4$.


0

8.6.7.

2

B B0 B @0

0 2 B B0 B @0 0 0

8.6.8.

8.6.9.

0

2 B @0 0 0 2 B @0 0

1

0 2 0 0 1 2 0 0

0 0 2 0 0 0 2 0

0 2 0 1 2 0

0 0C A, 5 1 0 0C A, 5

1

0 0C C C, 0A 2 1 0 0C C C, 1A 2 0

2 B @0 0 0 5 B @0 0

0

2

B B0 B @0

0 2 B B0 B @0 0 0

0 5 0 0 2 0

1 2 0 0 1 2 0 0 1

0 0C A, 2 1 0 1C A, 2

0

0 0C C C, 0A 2 1 0 0C C C, 0A 2

5 B @0 0 0 2 B @0 0

(c)

(d)

(e)

(f )

−3 0

!

1 . −3

0

0 2 0 0 0 2 0 0

0 2 B B0 B @0 0 0

1

0 0C A, 2 1 0 1C A, 5

1 0

(b) Eigenvalue: −3. Jordan basis: v1 =

2

B B0 B @0

0 2 0 0 5 0

(a) Eigenvalue: 2. Jordan basis: v1 =

Jordan canonical form:

0

1

0 0 2 0 0 1 2 0

!

1 2

0

2 B @0 0 0 5 B @0 0

1

0 5 0 1 5 0 0

, v2 = !

1

0 1 2 0 0 1 2 0

1 3

, v2 = 0

0 0C C C, 0A 2 1 0 0C C C, 1A 2

1

0 0C A, 5 1 0 0C A. 2

!

0

2

B B0 B @0

0 2 B B0 B @0 0 0

0

5 B @0 0

0 2 0 0 1 2 0 0

0 0 2 0 0 1 2 0

0 2 0

1

0 0C C C, 1A 2 1 0 0C C C. 1A 2

1

0 0C A, 5

0

5 B @0 0

0 1

!

1 . 2

!

.

0

1

0 1 0 C B C B C Eigenvalue: 1. Jordan basis: v1 = B @ 0 A , v2 = @ 1 A , v3 = @ −1 A. 1 0 0 0 1 1 1 0 C Jordan canonical form: B @ 0 1 1 A. 0 0 1 0 1 0 1 0 1 1 0 1 B C B C C Eigenvalue: −3. Jordan basis: v1 = B @ 0 A , v2 = @ 1 A , v3 = @ 0 A. 11 0 0 0 −3 1 0 Jordan canonical form: B 1C @ 0 −3 A. 0 0 −3 0 0 0 1 1 1 −1 0 0 B B C C C Eigenvalues: −2, 0. Jordan basis: v1 = B @ 0 A , v2 = @ −1 A , v3 = @ −1 A. 1 0 1 0 1 −2 1 0 C Jordan canonical form: B @ 0 −2 0 A. 0 0 0 0 1 0 1 1 0 −1 1 −2 1 B− C B C B B 2C 1C B0C C C C , v2 = B C , v3 = B B Eigenvalue: 2. Jordan basis: v1 = B B 1 C, v4 = @0A @ 1A @ 2A 0 0 0 0 1 2 0 0 0 B 0 2 1 0C C B C. Jordan canonical form: B @0 0 2 1A 0 0 0 2


0 0C A, 2

2 0

. Jordan canonical form:

1 2

1

0 5 0

0 B B B @

1

0 0C C C. 0A

− 21

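8.6.10.
$$J_{\lambda,n}^{-1} = \begin{pmatrix}
\lambda^{-1} & -\lambda^{-2} & \lambda^{-3} & -\lambda^{-4} & \cdots & -(-\lambda)^{-n} \\
0 & \lambda^{-1} & -\lambda^{-2} & \lambda^{-3} & \cdots & -(-\lambda)^{-(n-1)} \\
0 & 0 & \lambda^{-1} & -\lambda^{-2} & \cdots & -(-\lambda)^{-(n-2)} \\
0 & 0 & 0 & \lambda^{-1} & \cdots & -(-\lambda)^{-(n-3)} \\
\vdots & \vdots & \vdots & \vdots & \ddots & \vdots \\
0 & 0 & 0 & 0 & \cdots & \lambda^{-1}
\end{pmatrix}.$$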
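The closed form for the inverse of a Jordan block in Exercise 8.6.10 can be confirmed with exact rational arithmetic. In the sketch below, the function names and the sample values $n = 4$, $\lambda = 3$ are illustrative assumptions; the entry formula $(-1)^{j-i} \lambda^{-(j-i+1)}$ for $j \geq i$ is just the pattern of the displayed matrix written with 0-based indices.

```python
from fractions import Fraction

def jordan_block(lam, n):
    """n x n Jordan block: lam on the diagonal, 1 on the superdiagonal."""
    return [[lam if i == j else (1 if j == i + 1 else 0) for j in range(n)]
            for i in range(n)]

def jordan_block_inverse(lam, n):
    """Closed form: entry (i, j) with j >= i is (-1)^(j-i) * lam^-(j-i+1)."""
    lam = Fraction(lam)
    if lam == 0:
        raise ZeroDivisionError("a Jordan block with eigenvalue 0 is singular")
    return [[(-1) ** (j - i) / lam ** (j - i + 1) if j >= i else Fraction(0)
             for j in range(n)] for i in range(n)]

def matmul(a, b):
    """Plain triple-loop matrix product for lists of lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

n, lam = 4, Fraction(3)
product = matmul(jordan_block(lam, n), jordan_block_inverse(lam, n))
identity = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
print(product == identity)
```

Using `Fraction` avoids rounding, so the comparison with the identity matrix is exact rather than approximate.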

8.6.11. True. All Jordan chains have length one, and so consist only of eigenvectors.

♦ 8.6.12. Not in general. If an eigenvalue has multiplicity ≤ 3, then you can tell the size of its Jordan blocks by the number of linearly independent eigenvectors it has: if it has 3 linearly independent eigenvectors, then there are three 1 × 1 Jordan blocks; if it has 2 linearly independent eigenvectors then there are two Jordan blocks, of sizes 1 × 1 and 2 × 2, while if it only has one linearly independent eigenvector, then it corresponds to a single 3 × 3 Jordan block. But if the multiplicity of the eigenvalue is 4, and there are only 2 linearly independent eigenvectors, then it could have two 2 × 2 blocks, or a 1 × 1 and a 3 × 3 block. Distinguishing between the two cases is a difficult computational problem.

8.6.13. True. If $z_j = c\, w_j$, then $A z_j = c A w_j = c \lambda w_j + c\, w_{j-1} = \lambda z_j + z_{j-1}$.

8.6.14. False. Indeed, the square of a Jordan matrix is not necessarily a Jordan matrix, e.g.,
$$\begin{pmatrix} 1 & 1 & 0 \\ 0 & 1 & 1 \\ 0 & 0 & 1 \end{pmatrix}^2 = \begin{pmatrix} 1 & 2 & 1 \\ 0 & 1 & 2 \\ 0 & 0 & 1 \end{pmatrix}.$$

♦ 8.6.15.
(a) Let $A = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}$. Then $e_2$ is an eigenvector of $A^2 = O$, but is not an eigenvector of $A$.
(b) Suppose $A = S J S^{-1}$ where $J$ is the Jordan canonical form of $A$. Then $A^2 = S J^2 S^{-1}$. Now, even though $J^2$ is not necessarily a Jordan matrix, cf. Exercise 8.6.14, since $J$ is upper triangular with the eigenvalues on the diagonal, $J^2$ is also upper triangular and its diagonal entries, which are its eigenvalues and the eigenvalues of $A^2$, are the squares of the diagonal entries of $J$.

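8.6.16. Not necessarily. A simple example is $A = \begin{pmatrix} 1 & 1 \\ 0 & 0 \end{pmatrix}$, $B = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}$, so $A B = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}$, whereas $B A = \begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix}$.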

♦ 8.6.17. First, since Jλ,n is upper triangular, its eigenvalues are its diagonal entries, and hence λ is the only eigenvalue. Moreover, v = ( v1 , v2 , . . . , vn )T is an eigenvector if and only if (Jλ,n − λ I )v = ( v2 , . . . , vn , 0 )T = 0. This requires v2 = · · · = vn = 0, and hence v must be a scalar multiple of e1 .

♦ 8.6.18.
(a) Observe that $J_{0,n}^k$ is the matrix with 1's along the $k$th upper diagonal, i.e., in positions $(i, k + i)$. In particular, when $k = n$, all entries are 0, and so $J_{0,n}^n = O$.
(b) Since a Jordan matrix is upper triangular, the diagonal entries of $J^k$ are the $k$th powers of diagonal entries of $J$, and hence $J^m = O$ requires that all its diagonal entries are zero. Moreover, $J^k$ is a block matrix whose blocks are the $k$th powers of the original Jordan blocks, and hence $J^m = O$, where $m$ is the maximal size Jordan block.
(c) If $A = S J S^{-1}$, then $A^k = S J^k S^{-1}$ and hence $A^k = O$ if and only if $J^k = O$.
(d) This follows from parts (b–c).
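Part (a) of Exercise 8.6.18 is easy to watch happen on a small example. The following sketch builds the nilpotent Jordan block $J_{0,n}$ and lists where the nonzero entries of its powers sit; the helper names and the choice $n = 4$ are assumptions made for illustration, and positions are reported with 0-based indices.

```python
def nilpotent_block(n):
    """Jordan block J_{0,n}: zeros on the diagonal, ones on the superdiagonal."""
    return [[1 if j == i + 1 else 0 for j in range(n)] for i in range(n)]

def matmul(a, b):
    """Plain triple-loop matrix product for square lists of lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def matrix_power(a, k):
    """k-th power by repeated multiplication, starting from the identity."""
    n = len(a)
    result = [[int(i == j) for j in range(n)] for i in range(n)]
    for _ in range(k):
        result = matmul(result, a)
    return result

n = 4
J = nilpotent_block(n)
for k in range(1, n + 1):
    Jk = matrix_power(J, k)
    ones = [(i, j) for i in range(n) for j in range(n) if Jk[i][j] != 0]
    print(k, ones)   # nonzero entries occupy the positions (i, i + k)
```

The list of nonzero positions shrinks by one entry per power and is empty once $k = n$.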

8.6.19.
(a) Since $J^k$ is upper triangular, Exercise 8.3.12 says it is complete if and only if it is a diagonal matrix, which is the case if and only if $J$ is diagonal, or $J^k = O$.
(b) Write $A = S J S^{-1}$ in Jordan canonical form. Then $A^k = S J^k S^{-1}$ is complete if and only if $J^k$ is complete, so either $J$ is diagonal, whence $A$ is complete, or $J^k = O$ and so $A^k = O$.

♥ 8.6.20.
(a) If $D = \operatorname{diag}(d_1, \ldots, d_n)$, then $p_D(\lambda) = \prod_{i=1}^{n} (\lambda - d_i)$. Now $D - d_i I$ is a diagonal matrix with 0 in its $i$th diagonal position. The entries of the product $p_D(D) = \prod_{i=1}^{n} (D - d_i I)$ of diagonal matrices are the products of the individual diagonal entries, but each such product has at least one zero, and so the result is a diagonal matrix with all 0 diagonal entries, i.e., the zero matrix: $p_D(D) = O$.
(b) First, according to Exercise 8.2.32, similar matrices have the same characteristic polynomials, and so if $A = S D S^{-1}$ then $p_A(\lambda) = p_D(\lambda)$. On the other hand, if $p(\lambda)$ is any polynomial, then $p(S D S^{-1}) = S\, p(D)\, S^{-1}$. Therefore, if $A$ is complete, we can diagonalize $A = S D S^{-1}$, and so, by part (a) and the preceding two facts, $p_A(A) = p_A(S D S^{-1}) = S\, p_A(D)\, S^{-1} = S\, p_D(D)\, S^{-1} = O$.
(c) The characteristic polynomial of the upper triangular Jordan block matrix $J = J_{\mu,n}$ with eigenvalue $\mu$ is $p_J(\lambda) = (\lambda - \mu)^n$. Thus, $p_J(J) = (J - \mu I)^n = J_{0,n}^n = O$ by Exercise 8.6.18.
(d) The determinant of a (Jordan) block matrix is the product of the determinants of the individual blocks. Moreover, by part (c), substituting $J$ into the product of the characteristic polynomials for its Jordan blocks gives zero in each block, and so the product matrix vanishes.
(e) Same argument as in part (b), using the fact that a matrix and its Jordan canonical form have the same characteristic polynomial.

♦ 8.6.21. The $n$ vectors are divided into non-null Jordan chains, say $w_{1,k}, \ldots, w_{i_k,k}$, satisfying $B w_{i,k} = \lambda_k w_{i,k} + w_{i-1,k}$ with $\lambda_k \neq 0$ the eigenvalue (and $w_{0,k} = 0$ by convention), along with the null Jordan chains, say $y_{1,l}, \ldots, y_{i_l,l}, y_{i_l+1,l}$, supplemented by one additional vector, satisfying $B y_{i,l} = y_{i-1,l}$, and, in addition, the null vectors $z_1, \ldots, z_{n-r-k} \in \ker B \setminus \operatorname{rng} B$. Suppose some linear combination vanishes:
$$\sum_k \Bigl[\, a_{1,k} w_{1,k} + \cdots + a_{i_k,k} w_{i_k,k} \,\Bigr] + \sum_l \Bigl[\, b_{1,l} y_{1,l} + \cdots + b_{i_l,l} y_{i_l,l} + b_{i_l+1,l} y_{i_l+1,l} \,\Bigr] + \bigl( c_1 z_1 + \cdots + c_{n-r-k} z_{n-r-k} \bigr) = 0.$$
Multiplying by $B$ and using the Jordan chain equations, we find
$$\sum_k \Bigl[\, (\lambda_k a_{1,k} + a_{2,k})\, w_{1,k} + \cdots + (\lambda_k a_{i_k-1,k} + a_{i_k,k})\, w_{i_k-1,k} + \lambda_k a_{i_k,k} w_{i_k,k} \,\Bigr] + \sum_l \Bigl[\, b_{2,l} y_{1,l} + \cdots + b_{i_l+1,l} y_{i_l,l} \,\Bigr] = 0.$$
Since we started with a Jordan basis for $W = \operatorname{rng} B$, by linear independence, their coefficients in the preceding equation must all vanish, which implies that $a_{1,k} = \cdots = a_{i_k,k} = b_{2,l} = \cdots = b_{i_l+1,l} = 0$. Substituting this result back into the original equation, we are left with
$$\sum_l b_{1,l} y_{1,l} + \bigl( c_1 z_1 + \cdots + c_{n-r-k} z_{n-r-k} \bigr) = 0,$$
which implies all $b_{1,l} = c_j = 0$, since the remaining vectors are also linearly independent.



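9.1.1.
(i) (a) $u(t) = c_1 \cos 2t + c_2 \sin 2t$. (b) $\dfrac{d\mathbf{u}}{dt} = \begin{pmatrix} 0 & 1 \\ -4 & 0 \end{pmatrix} \mathbf{u}$. (c) $\mathbf{u}(t) = \begin{pmatrix} c_1 \cos 2t + c_2 \sin 2t \\ -2 c_1 \sin 2t + 2 c_2 \cos 2t \end{pmatrix}$. (d), (e) [graphs]

(ii) (a) $u(t) = c_1 e^{-2t} + c_2 e^{2t}$. (b) $\dfrac{d\mathbf{u}}{dt} = \begin{pmatrix} 0 & 1 \\ 4 & 0 \end{pmatrix} \mathbf{u}$. (c) $\mathbf{u}(t) = \begin{pmatrix} c_1 e^{-2t} + c_2 e^{2t} \\ -2 c_1 e^{-2t} + 2 c_2 e^{2t} \end{pmatrix}$. (d), (e) [graphs]

(iii) (a) $u(t) = c_1 e^{-t} + c_2\, t\, e^{-t}$. (b) $\dfrac{d\mathbf{u}}{dt} = \begin{pmatrix} 0 & 1 \\ -1 & -2 \end{pmatrix} \mathbf{u}$. (c) $\mathbf{u}(t) = \begin{pmatrix} c_1 e^{-t} + c_2\, t\, e^{-t} \\ (c_2 - c_1)\, e^{-t} - c_2\, t\, e^{-t} \end{pmatrix}$. (d), (e) [graphs]

(iv) (a) $u(t) = c_1 e^{-t} + c_2 e^{-3t}$. (b) $\dfrac{d\mathbf{u}}{dt} = \begin{pmatrix} 0 & 1 \\ -3 & -4 \end{pmatrix} \mathbf{u}$. (c) $\mathbf{u}(t) = \begin{pmatrix} c_1 e^{-t} + c_2 e^{-3t} \\ -c_1 e^{-t} - 3 c_2 e^{-3t} \end{pmatrix}$. (d), (e) [graphs]

(v) (a) $u(t) = c_1 e^{t} \cos 3t + c_2 e^{t} \sin 3t$. (b) $\dfrac{d\mathbf{u}}{dt} = \begin{pmatrix} 0 & 1 \\ -10 & 2 \end{pmatrix} \mathbf{u}$. (c) $\mathbf{u}(t) = \begin{pmatrix} c_1 e^{t} \cos 3t + c_2 e^{t} \sin 3t \\ (c_1 + 3 c_2)\, e^{t} \cos 3t - (3 c_1 - c_2)\, e^{t} \sin 3t \end{pmatrix}$. (d), (e) [graphs]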

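9.1.2.
(a) $\dfrac{d\mathbf{u}}{dt} = \begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ -12 & -4 & -3 \end{pmatrix} \mathbf{u}$.
(b) $u(t) = c_1 e^{-3t} + c_2 \cos 2t + c_3 \sin 2t$, $\mathbf{u}(t) = \begin{pmatrix} c_1 e^{-3t} + c_2 \cos 2t + c_3 \sin 2t \\ -3 c_1 e^{-3t} - 2 c_2 \sin 2t + 2 c_3 \cos 2t \\ 9 c_1 e^{-3t} - 4 c_2 \cos 2t - 4 c_3 \sin 2t \end{pmatrix}$,
(c) dimension = 3.

9.1.3. Set $u_1 = u$, $u_2 = \dot{u}$, $u_3 = v$, $u_4 = \dot{v}$ and $\mathbf{u}(t) = \begin{pmatrix} u_1(t) \\ u_2(t) \\ u_3(t) \\ u_4(t) \end{pmatrix}$. Then $\dfrac{d\mathbf{u}}{dt} = \begin{pmatrix} 0 & 1 & 0 & 0 \\ c & a & d & b \\ 0 & 0 & 0 & 1 \\ r & p & s & q \end{pmatrix} \mathbf{u}$.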

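9.1.4. False; by direct computation, we find that the functions $u_1(t), u_2(t)$ satisfy a quadratic equation $\alpha u_1^2 + \beta u_1 u_2 + \gamma u_2^2 + \delta u_1 + \varepsilon u_2 = c$ if and only if $c_1 = c_2 = 0$.

♦ 9.1.5.
(a) Use the chain rule to compute $\dfrac{d\mathbf{v}}{dt} = -\dfrac{d\mathbf{u}}{dt}(-t) = -A\, \mathbf{u}(-t) = -A \mathbf{v}$.
(b) Since $\mathbf{v}(t) = \mathbf{u}(-t)$ parametrizes the same curve as $\mathbf{u}(t)$, but in the reverse direction.
(c) (i) $\dfrac{d\mathbf{v}}{dt} = \begin{pmatrix} 0 & -1 \\ 4 & 0 \end{pmatrix} \mathbf{v}$; solution: $\mathbf{v}(t) = \begin{pmatrix} c_1 \cos 2t - c_2 \sin 2t \\ 2 c_1 \sin 2t + 2 c_2 \cos 2t \end{pmatrix}$.
(ii) $\dfrac{d\mathbf{v}}{dt} = \begin{pmatrix} 0 & -1 \\ -4 & 0 \end{pmatrix} \mathbf{v}$; solution: $\mathbf{v}(t) = \begin{pmatrix} c_1 e^{2t} + c_2 e^{-2t} \\ -2 c_1 e^{2t} + 2 c_2 e^{-2t} \end{pmatrix}$.
(iii) $\dfrac{d\mathbf{v}}{dt} = \begin{pmatrix} 0 & -1 \\ 1 & 2 \end{pmatrix} \mathbf{v}$; solution: $\mathbf{v}(t) = \begin{pmatrix} c_1 e^{t} - c_2\, t\, e^{t} \\ (c_2 - c_1)\, e^{t} + c_2\, t\, e^{t} \end{pmatrix}$.
(iv) $\dfrac{d\mathbf{v}}{dt} = \begin{pmatrix} 0 & -1 \\ 3 & 4 \end{pmatrix} \mathbf{v}$; solution: $\mathbf{v}(t) = \begin{pmatrix} c_1 e^{t} + c_2 e^{3t} \\ -c_1 e^{t} - 3 c_2 e^{3t} \end{pmatrix}$.
(v) $\dfrac{d\mathbf{v}}{dt} = \begin{pmatrix} 0 & -1 \\ 10 & -2 \end{pmatrix} \mathbf{v}$; solution: $\mathbf{v}(t) = \begin{pmatrix} c_1 e^{-t} \cos 3t - c_2 e^{-t} \sin 3t \\ (c_1 + 3 c_2)\, e^{-t} \cos 3t + (3 c_1 - c_2)\, e^{-t} \sin 3t \end{pmatrix}$.
(d) Time reversal changes $u_1(t) = u(t)$ into $v_1(t) = u_1(-t) = u(-t)$ and $u_2(t) = \dot{u}(t)$ into $v_2(t) = u_2(-t) = \dot{u}(-t) = -\dot{v}(t)$. The net effect is to change the sign of the coefficient of the first derivative term, so $\dfrac{d^2 u}{dt^2} + a \dfrac{du}{dt} + b\, u = 0$ becomes $\dfrac{d^2 v}{dt^2} - a \dfrac{dv}{dt} + b\, v = 0$.

9.1.6. (a) Use the chain rule to compute $\dfrac{d}{dt} \mathbf{v}(t) = 2\, \dfrac{d\mathbf{u}}{dt}(2t) = 2 A\, \mathbf{u}(2t) = 2 A\, \mathbf{v}(t)$, and so the coefficient matrix is multiplied by 2. (b) The solution trajectories are the same, but the solution moves twice as fast (in the same direction) along them.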

♦ 9.1.7.
(a) This is proved by induction, the case $k = 0$ being assumed. If $\dfrac{d^{k+1} u}{dt^{k+1}} = \dfrac{d}{dt}\left( \dfrac{d^k u}{dt^k} \right) = A\, \dfrac{d^k u}{dt^k}$, then differentiating the equation with respect to $t$ yields $\dfrac{d}{dt}\left( \dfrac{d^{k+1} u}{dt^{k+1}} \right) = \dfrac{d}{dt}\left( A\, \dfrac{d^k u}{dt^k} \right) = A\, \dfrac{d^{k+1} u}{dt^{k+1}}$, which proves the induction step.
(b) This is also proved by induction, the case $k = 1$ being assumed. If true for $k$, then
$$\frac{d^{k+1} u}{dt^{k+1}} = \frac{d}{dt}\left( \frac{d^k u}{dt^k} \right) = \frac{d}{dt}\, (A^k u) = A^k \frac{du}{dt} = A^k A\, u = A^{k+1} u.$$

9.1.8. False. If $\dot{u} = A u$ then the speed along the trajectory at the point $u(t)$ is $\| A u(t) \|$. So the speed is constant only if $\| A u(t) \|$ is constant. (Later, in Lemma 9.31, this will be shown to correspond to $A$ being a skew-symmetric matrix.)

♠ 9.1.9. In all cases, the $t$ axis is plotted vertically, and the three-dimensional solution curves $(u(t), \dot{u}(t), t)^T$ project to the phase plane trajectories $(u(t), \dot{u}(t))^T$.
(i) The solution curves are helices going around the $t$ axis.
(ii) Hyperbolic curves going away from the $t$ axis in both directions.
(iii) The solution curves converge on the $t$ axis as $t \to \infty$.
(iv) The solution curves converge on the $t$ axis as $t \to \infty$.
(v) The solution curves spiral away from the $t$ axis as $t \to \infty$.


♥ 9.1.10.
(a) Assuming $b \neq 0$, we have
$$v = \frac{1}{b}\, \dot{u} - \frac{a}{b}\, u, \qquad \dot{v} = \frac{b c - a d}{b}\, u + \frac{d}{b}\, \dot{u}. \qquad (*)$$
Differentiating the first equation yields $\dfrac{dv}{dt} = \dfrac{1}{b}\, \ddot{u} - \dfrac{a}{b}\, \dot{u}$. Equating this to the right hand side of the second equation leads to the second order differential equation
$$\ddot{u} - (a + d)\, \dot{u} + (a d - b c)\, u = 0. \qquad (**)$$
(b) If $u(t)$ solves (∗∗), then defining $v(t)$ by the first equation in (∗) yields a solution to the first order system. Vice versa, the first component of any solution $( u(t), v(t) )^T$ to the system gives a solution $u(t)$ to the second order equation.
(c) (i) $\ddot{u} + u = 0$, hence $u(t) = c_1 \cos t + c_2 \sin t$, $v(t) = -c_1 \sin t + c_2 \cos t$.
(ii) $\ddot{u} - 2 \dot{u} + 5 u = 0$, hence $u(t) = c_1 e^{t} \cos 2t + c_2 e^{t} \sin 2t$, $v(t) = (c_1 + 2 c_2)\, e^{t} \cos 2t + (-2 c_1 + c_2)\, e^{t} \sin 2t$.
(iii) $\ddot{u} - \dot{u} - 6 u = 0$, hence $u(t) = c_1 e^{3t} + c_2 e^{-2t}$, $v(t) = c_1 e^{3t} + 6 c_2 e^{-2t}$.
(iv) $\ddot{u} - 2 u = 0$, hence $u(t) = c_1 e^{\sqrt2\, t} + c_2 e^{-\sqrt2\, t}$, $v(t) = (\sqrt2 - 1)\, c_1 e^{\sqrt2\, t} - (\sqrt2 + 1)\, c_2 e^{-\sqrt2\, t}$.
(v) $\ddot{u} = 0$, hence $u(t) = c_1 t + c_2$, $v(t) = c_1$.
(d) For $c \neq 0$ we can solve for $u = \dfrac{1}{c}\, \dot{v} - \dfrac{d}{c}\, v$, $\dot{u} = \dfrac{b c - a d}{c}\, v + \dfrac{a}{c}\, \dot{v}$, leading to the same second order equation for $v$, namely, $\ddot{v} - (a + d)\, \dot{v} + (a d - b c)\, v = 0$.
(e) If $b = 0$ then $u$ solves a first order linear equation; once we solve the equation, we can then recover $v$ by solving an inhomogeneous first order equation. Although $u$ continues to solve the same second order equation, it is no longer the most general solution, and so the one-to-one correspondence between solutions breaks down.
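The reduction in part (a) of Exercise 9.1.10 can be spot-checked numerically. The sketch below assumes the first order system has the form $\dot{u} = a u + b v$, $\dot{v} = c u + d v$ (which is what the formulas (∗) imply), picks arbitrary illustrative coefficients, takes $u = e^{\lambda t}$ with $\lambda$ a root of the characteristic equation of (∗∗), recovers $v$ from the first formula in (∗), and measures how well the second equation of the system holds. The function name `check_reduction` is hypothetical.

```python
import cmath

def check_reduction(a, b, c, d, t=0.7):
    """Residuals |v' - (c u + d v)| for u = exp(lam t), lam a root of (**)."""
    if b == 0:
        raise ValueError("the reduction in part (a) assumes b != 0")
    trace, det = a + d, a * d - b * c
    disc = cmath.sqrt(trace * trace - 4 * det)
    residuals = []
    for lam in ((trace + disc) / 2, (trace - disc) / 2):
        u = cmath.exp(lam * t)
        u_dot = lam * u
        v = (u_dot - a * u) / b      # first formula in (*)
        v_dot = lam * v              # v is a constant multiple of exp(lam t)
        residuals.append(abs(v_dot - (c * u + d * v)))
    return residuals

# Illustrative coefficients (not taken from the exercise); complex roots here.
residuals = check_reduction(1.0, 2.0, -3.0, 4.0)
print(residuals)   # both residuals should be at rounding-error level
```

Using `cmath` lets the same check cover real and complex roots of the characteristic equation.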

9.1.11. $u(t) = \frac75\, e^{-5t} + \frac85\, e^{5t}$, $v(t) = -\frac{14}{5}\, e^{-5t} + \frac45\, e^{5t}$.

9.1.12. √ √ √ √ √ √ (a) u1 (t) = ( 10−1)c1 e(2+ 10) t −( 10+1)c2 e(2− 10) t , u2 (t) = c1 e(2+ 10) t +c2 e(2− 10) t ; (b) x1 (t) = − ch1 e− 5 t + 3 c2 e5 t , x2 (t) = 3 c1 e− 5 t + ch2 e5 t ; i i (c) y1 (t) = e2 t c1 cos t − (c1 + c2 ) sin t , y2 (t) = e2 t c2 cos t + (2 c1 + c2 ) sin t ; (d) y1 (t) = − c1 e− t − c2 et − 23 c3 , y2 (t) = c1 e− t − c2 et , y3 (t) = c1 e− t + c2 et + c3 ; (e) x1 (t) = 3 c1 et + 2 c2 e2 t + 2 c3 e4 t , x2 (t) = c1 et + 21 c2 e2 t , x3 (t) = c1 et + c2 e2 t + c3 e4 t .

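9.1.13.
(a) $\mathbf{u}(t) = \left( \frac12 e^{2-2t} + \frac12 e^{-2+2t},\ -\frac12 e^{2-2t} + \frac12 e^{-2+2t} \right)^T$,
(b) $\mathbf{u}(t) = \left( e^{-t} - 3 e^{3t},\ e^{-t} + 3 e^{3t} \right)^T$,
(c) $\mathbf{u}(t) = \left( e^{t} \cos \sqrt2\, t,\ -\frac{1}{\sqrt2}\, e^{t} \sin \sqrt2\, t \right)^T$,
(d) $\mathbf{u}(t) = \left( e^{-t} \bigl( 2 - \cos \sqrt6\, t \bigr),\ e^{-t} \bigl( 1 - \cos \sqrt6\, t + \sqrt{\tfrac23}\, \sin \sqrt6\, t \bigr),\ e^{-t} \bigl( 1 - \cos \sqrt6\, t \bigr) \right)^T$,
(e) $\mathbf{u}(t) = ( -4 - 6 \cos t - 9 \sin t,\ 2 + 3 \cos t + 6 \sin t,\ -1 - 3 \sin t )^T$,
(f) $\mathbf{u}(t) = \left( \frac12 e^{2-t} + \frac12 e^{-2+t},\ -\frac12 e^{4-2t} + \frac12 e^{-4+2t},\ -\frac12 e^{2-t} + \frac12 e^{-2+t},\ \frac12 e^{4-2t} + \frac12 e^{-4+2t} \right)^T$,
(g) $\mathbf{u}(t) = \left( -\frac12 e^{-t} + \frac32 \cos t - \frac32 \sin t,\ \frac32 e^{-t} - \frac52 \cos t + \frac32 \sin t,\ 2 \cos t,\ \cos t + \sin t \right)^T$.

9.1.14. (a) $x(t) = e^{-t} \cos t$, $y(t) = -e^{-t} \sin t$; (b) [graph]

9.1.15. $x(t) = e^{t/2} \left( \cos \frac{\sqrt3}{2} t - \sqrt3\, \sin \frac{\sqrt3}{2} t \right)$, $y(t) = e^{t/2} \left( \cos \frac{\sqrt3}{2} t - \frac{1}{\sqrt3} \sin \frac{\sqrt3}{2} t \right)$, and so at time $t = 1$, the position is $( x(1), y(1) )^T = ( -1.10719, .343028 )^T$.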
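The numerical position quoted in Exercise 9.1.15 can be reproduced by evaluating the solution formulas directly. This is a small sketch using only the standard `math` module; the function name `position` is introduced here for illustration.

```python
import math

def position(t):
    """Evaluate the solution (x(t), y(t)) of Exercise 9.1.15 at time t."""
    w = math.sqrt(3.0) / 2.0          # angular frequency sqrt(3)/2
    growth = math.exp(t / 2.0)        # common factor e^{t/2}
    x = growth * (math.cos(w * t) - math.sqrt(3.0) * math.sin(w * t))
    y = growth * (math.cos(w * t) - math.sin(w * t) / math.sqrt(3.0))
    return x, y

x1, y1 = position(1.0)
print(round(x1, 5), round(y1, 6))
```

At $t = 0$ the same function returns the initial position $(1, 1)$, which is a quick sanity check on the formulas.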

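9.1.16. The solution is $\mathbf{x}(t) = \left( c_1 \cos 2t - c_2 \sin 2t,\ c_1 \sin 2t + c_2 \cos 2t,\ c_3 e^{-t} \right)^T$. The origin is a stable, but not asymptotically stable, equilibrium point. Fluid particles starting on the $xy$ plane move counterclockwise, at constant speed with angular velocity 2, around circles centered at the origin. Particles starting on the $z$ axis move in to the origin at an exponential rate. All other fluid particles spiral counterclockwise around the $z$ axis, converging exponentially fast to a circle in the $xy$ plane.

9.1.17. The coefficient matrix has eigenvalues $\lambda_1 = -5$, $\lambda_2 = -7$, and, since the coefficient matrix is symmetric, orthogonal eigenvectors $v_1 = \begin{pmatrix} 1 \\ 1 \end{pmatrix}$, $v_2 = \begin{pmatrix} -1 \\ 1 \end{pmatrix}$. The general solution is
$$\mathbf{u}(t) = c_1 e^{-5t} \begin{pmatrix} 1 \\ 1 \end{pmatrix} + c_2 e^{-7t} \begin{pmatrix} -1 \\ 1 \end{pmatrix}.$$
For the initial conditions
$$\mathbf{u}(0) = c_1 \begin{pmatrix} 1 \\ 1 \end{pmatrix} + c_2 \begin{pmatrix} -1 \\ 1 \end{pmatrix} = \begin{pmatrix} 1 \\ 2 \end{pmatrix},$$
we can use orthogonality to find
$$c_1 = \frac{\langle \mathbf{u}(0), v_1 \rangle}{\| v_1 \|^2} = \frac32, \qquad c_2 = \frac{\langle \mathbf{u}(0), v_2 \rangle}{\| v_2 \|^2} = \frac12.$$
Therefore, the solution is
$$\mathbf{u}(t) = \frac32\, e^{-5t} \begin{pmatrix} 1 \\ 1 \end{pmatrix} + \frac12\, e^{-7t} \begin{pmatrix} -1 \\ 1 \end{pmatrix}.$$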
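The orthogonality shortcut used in Exercise 9.1.17 amounts to one dot product per coefficient. A minimal sketch, with the helper name `orthogonal_coefficients` introduced here and exact arithmetic via `Fraction`; the formula is only valid when the basis vectors are mutually orthogonal:

```python
from fractions import Fraction

def orthogonal_coefficients(u0, basis):
    """Coefficients c_i = <u0, v_i> / ||v_i||^2 for an orthogonal basis."""
    def dot(x, y):
        return sum(Fraction(a) * Fraction(b) for a, b in zip(x, y))
    return [dot(u0, v) / dot(v, v) for v in basis]

# Data of Exercise 9.1.17: initial vector and the two orthogonal eigenvectors.
u0 = [1, 2]
basis = [[1, 1], [-1, 1]]
coeffs = orthogonal_coefficients(u0, basis)

# Reassemble u(0) = c1 v1 + c2 v2 to confirm the expansion.
rebuilt = [sum(c * v[i] for c, v in zip(coeffs, basis)) for i in range(len(u0))]
print(coeffs, rebuilt)
```

No linear system needs to be solved, which is the point of choosing an orthogonal eigenvector basis.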

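9.1.18.
(a) Eigenvalues: $\lambda_1 = 0$, $\lambda_2 = 1$, $\lambda_3 = 3$; eigenvectors: $v_1 = \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix}$, $v_2 = \begin{pmatrix} 1 \\ 0 \\ -1 \end{pmatrix}$, $v_3 = \begin{pmatrix} 1 \\ -2 \\ 1 \end{pmatrix}$.
(b) By direct computation, $v_1 \cdot v_2 = v_1 \cdot v_3 = v_2 \cdot v_3 = 0$.
(c) The matrix is positive semi-definite since it has one zero eigenvalue and all the rest are positive.
(d) The general solution is
$$\mathbf{u}(t) = c_1 \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix} + c_2 e^{t} \begin{pmatrix} 1 \\ 0 \\ -1 \end{pmatrix} + c_3 e^{3t} \begin{pmatrix} 1 \\ -2 \\ 1 \end{pmatrix}.$$
For the given initial conditions
$$\mathbf{u}(0) = c_1 \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix} + c_2 \begin{pmatrix} 1 \\ 0 \\ -1 \end{pmatrix} + c_3 \begin{pmatrix} 1 \\ -2 \\ 1 \end{pmatrix} = \begin{pmatrix} 1 \\ 2 \\ -1 \end{pmatrix} = \mathbf{u}_0,$$
we can use orthogonality to find
$$c_1 = \frac{\langle \mathbf{u}_0, v_1 \rangle}{\| v_1 \|^2} = \frac23, \qquad c_2 = \frac{\langle \mathbf{u}_0, v_2 \rangle}{\| v_2 \|^2} = 1, \qquad c_3 = \frac{\langle \mathbf{u}_0, v_3 \rangle}{\| v_3 \|^2} = -\frac23.$$
Therefore, the solution is $\mathbf{u}(t) = \left( \frac23 + e^{t} - \frac23 e^{3t},\ \frac23 + \frac43 e^{3t},\ \frac23 - e^{t} - \frac23 e^{3t} \right)^T$.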

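9.1.19. The general complex solution to the system is
$$\mathbf{u}(t) = c_1 e^{-t} \begin{pmatrix} -1 \\ 1 \\ 1 \end{pmatrix} + c_2 e^{(1+2i)t} \begin{pmatrix} 1 \\ i \\ 1 \end{pmatrix} + c_3 e^{(1-2i)t} \begin{pmatrix} 1 \\ -i \\ 1 \end{pmatrix}.$$
Substituting into the initial conditions,
$$\mathbf{u}(0) = \begin{pmatrix} -c_1 + c_2 + c_3 \\ c_1 + i\, c_2 - i\, c_3 \\ c_1 + c_2 + c_3 \end{pmatrix} = \begin{pmatrix} 2 \\ -1 \\ -2 \end{pmatrix}, \qquad \text{we find} \qquad c_1 = -2, \quad c_2 = -\tfrac12 i, \quad c_3 = \tfrac12 i.$$
Thus, we obtain the same solution:
$$\mathbf{u}(t) = -2 e^{-t} \begin{pmatrix} -1 \\ 1 \\ 1 \end{pmatrix} - \tfrac12 i\, e^{(1+2i)t} \begin{pmatrix} 1 \\ i \\ 1 \end{pmatrix} + \tfrac12 i\, e^{(1-2i)t} \begin{pmatrix} 1 \\ -i \\ 1 \end{pmatrix} = \begin{pmatrix} 2 e^{-t} + e^{t} \sin 2t \\ -2 e^{-t} + e^{t} \cos 2t \\ -2 e^{-t} + e^{t} \sin 2t \end{pmatrix}.$$

9.1.20. Only (d) and (g) are linearly dependent.

♦ 9.1.21. Using the chain rule, $\dfrac{d}{dt} \widetilde{\mathbf{u}}(t) = \dfrac{d\mathbf{u}}{dt}(t - t_0) = A\, \mathbf{u}(t - t_0) = A\, \widetilde{\mathbf{u}}(t)$, and hence $\widetilde{\mathbf{u}}(t)$ solves the differential equation. Moreover, $\widetilde{\mathbf{u}}(t_0) = \mathbf{u}(0) = \mathbf{b}$ has the correct initial conditions. The trajectories are the same curves, but $\widetilde{\mathbf{u}}(t)$ is always ahead of $\mathbf{u}(t)$ by an amount $t_0$.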

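9.1.22. (a) This follows immediately from uniqueness, since they both solve the initial value problem $\dot{\mathbf{u}} = A \mathbf{u}$, $\mathbf{u}(t_1) = \mathbf{a}$, which has a unique solution, and so $\mathbf{u}(t) = \widetilde{\mathbf{u}}(t)$ for all $t$; (b) $\widetilde{\mathbf{u}}(t) = \mathbf{u}(t + t_2 - t_1)$ for all $t$.

9.1.23. We compute $\dfrac{d\mathbf{u}}{dt} = \left( \lambda_1 c_1 e^{\lambda_1 t}, \ldots, \lambda_n c_n e^{\lambda_n t} \right)^T = \Lambda \mathbf{u}$. Moreover, the solution is a linear combination of the $n$ linearly independent solutions $e^{\lambda_i t} e_i$, $i = 1, \ldots, n$.

9.1.24. $\dfrac{d\mathbf{v}}{dt} = S\, \dfrac{d\mathbf{u}}{dt} = S A \mathbf{u} = S A S^{-1} \mathbf{v} = B \mathbf{v}$.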

♦ 9.1.25. (i) This is an immediate consequence of the preceding two exercises. ! ! ! −2t ! c e c1 e3 t −1 1 −1 1 1 (ii) (a) u(t) = , (b) u(t) = , t 1 1 1 1 c2 e− t c2 e2 0 √ 1 √ ! √ (1+ i 2) t 2 i 2 i @ c1 e − A, √ (c) u(t) = 1 1 c2 e(1− i 02)t 1 −t 1 0 c e 1 2 1q 1q √ C B 2 2 CB (−1+ i 6) t C C, B 1 − i (d) u(t) = B A @1 1 + i c e 3 3 @ 2 A √ 1 1 1 (−1− i 6)t c3 e 0

4 (e) u(t) = B @ −2 1

3 + 2i −2 − i 1

3 − 2i −2 + i 1

10 CB AB @

1

c1 C c2 e i t C A, c3 e− i t


0

0 B B −1 (f ) u(t) = B @ 0 1

−1 0 1 0

0 1 0 1

10

−2t 1 C C C, C A

1 B c1 e c2 e− t 0C CB CB B 1 A@ c3 et 0 c e2 t 4

0

−1 B B 1 B (g) u(t) = B @ 0 0

3 2

i − 12 − 2 i 1+ i 1

−1 3 0 0

− 23 i − 12 + 2 i 1− i 1

10 CB CB CB CB AB @

1

c1 et C c2 e− t C C C. c3 e i t C A c4 e− i t

9.1.26.

0

!

(a) 0

c1 e2 t + c2 t e2 t , c2 e2 t

1

”

“

c e− 3 t + c2 12 + t e− 3 t A , (c) @ 1 2 c1 e− 3 t + 2 c2 t e− 3 t

”

“

0

1

c e− 3 t + c2 t e− 3 t + c3 1 + 21 t2 e− 3 t C B 1 C, (e) B c2 e− 3 t + c3 t e− 3 t A @ −3t −3t 2 −3t 1 c1 e + c2 te + 2 c3 t e

(g)

0

c e3 t B 1 B B B B @

9.1.27. (a)

1

+ c2 t e3 t − 14 c3 e− t − 14 c4 (t + 1) e− t C C c3 e− t + c4 t e− t C C, (h) −t 3t 1 C c2 e − 4 c4 e A c4 e− t du = dt

du (d) = dt 0

1 −5

0 du (g) =B @ −2 dt −2

− 12 1

2 0

1 −1 1 2

0 0

!

!

u,

u,

11 2C 0 A u,

0

(b)

du = dt 0

2 du =B (e) @0 dt 2

−1 −9

0 −3 3 0 1 du (h) =B @ −1 dt 0

(b) $\begin{pmatrix} c_1 e^{-t} + c_2 \left( \frac13 + t \right) e^{-t} \\ 3 c_1 e^{-t} + 3 c_2 t e^{-t} \end{pmatrix}$,

(d) $\begin{pmatrix} c_1 e^{-t} + c_2 e^{t} + c_3 t e^{t} \\ -c_1 e^{-t} - c_3 e^{t} \\ 2 c_1 e^{-t} + c_2 e^{t} + c_3 t e^{t} \end{pmatrix}$,

(f) $\begin{pmatrix} c_1 e^{-t} + c_2 t e^{-t} + \frac12 c_3 t^2 e^{-t} \\ c_2 e^{-t} + c_3 (t-1) e^{-t} \\ c_3 e^{-t} \end{pmatrix}$,

(h) $\begin{pmatrix} c_1 \cos t + c_2 \sin t + c_3 t \cos t + c_4 t \sin t \\ -c_1 \sin t + c_2 \cos t - c_3 t \sin t + c_4 t \cos t \\ c_3 \cos t + c_4 \sin t \\ -c_3 \sin t + c_4 \cos t \end{pmatrix}$.

!

1 u, (c) −1 1 0 0C (f ) A u, 0 1 1 0 2 C 1 −1 A u. 1 1 2

du 0 = 1 dt 0

0 0

0 du =B @ −1 dt 0

!

u,

1 0 0

1

0 0C A u, 0

9.1.28. (a) No, since neither $\dfrac{du_i}{dt}$ is a linear combination of $u_1, u_2$. Or note that the trajectories described by the solutions cross, violating uniqueness. (b) No, since polynomial solutions of a two-dimensional system can be at most first order in $t$. (c) No, since a two-dimensional system has at most 2 linearly independent solutions. (d) Yes: $\dot u = \begin{pmatrix} -1 & 0 \\ 0 & 1 \end{pmatrix} u$. (e) Yes: $\dot u = \begin{pmatrix} 2 & 3 \\ -3 & 2 \end{pmatrix} u$. (f) No, since neither $\dfrac{du_i}{dt}$ is a linear combination of $u_1, u_2$. Or note that both solutions have the unit circle as their trajectory, but traverse it in opposite directions, violating uniqueness. (g) Yes: $\dot u = \begin{pmatrix} 0 & 0 & 0 \\ 1 & 0 & 0 \\ -1 & 0 & 0 \end{pmatrix} u$. (h) Yes: $\dot u = u$. (i) No, since a three-dimensional system has at most 3 linearly independent solutions.

9.1.29. Setting $\mathbf{u}(t) = \begin{pmatrix} u(t) \\ \dot u(t) \\ \ddot u(t) \end{pmatrix}$, the first order system is $\dfrac{d\mathbf{u}}{dt} = \begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ -12 & -4 & -3 \end{pmatrix} \mathbf{u}$. The eigenvalues of the coefficient matrix are $-3, \pm 2i$ with eigenvectors $\begin{pmatrix} 1 \\ -3 \\ 9 \end{pmatrix}$, $\begin{pmatrix} 1 \\ \pm 2i \\ -4 \end{pmatrix}$, and the resulting solution is
$$\mathbf{u}(t) = \begin{pmatrix} c_1 e^{-3t} + c_2 \cos 2t + c_3 \sin 2t \\ -3 c_1 e^{-3t} - 2 c_2 \sin 2t + 2 c_3 \cos 2t \\ 9 c_1 e^{-3t} - 4 c_2 \cos 2t - 4 c_3 \sin 2t \end{pmatrix},$$
which is the same as that found in Exercise 9.1.2.

9.1.30. Setting $\mathbf{u}(t) = \begin{pmatrix} u(t) \\ \dot u(t) \\ v(t) \\ \dot v(t) \end{pmatrix}$, the first order system is $\dfrac{d\mathbf{u}}{dt} = \begin{pmatrix} 0 & 1 & 0 & 0 \\ 1 & 1 & -1 & 0 \\ 0 & 0 & 0 & 1 \\ -1 & 0 & 1 & 1 \end{pmatrix} \mathbf{u}$. The coefficient matrix has eigenvalues $-1, 0, 1, 2$ and eigenvectors $\begin{pmatrix} 1 \\ -1 \\ -1 \\ 1 \end{pmatrix}$, $\begin{pmatrix} 1 \\ 0 \\ 1 \\ 0 \end{pmatrix}$, $\begin{pmatrix} 1 \\ 1 \\ 1 \\ 1 \end{pmatrix}$, $\begin{pmatrix} 1 \\ 2 \\ -1 \\ -2 \end{pmatrix}$. Thus
$$\mathbf{u}(t) = \begin{pmatrix} c_1 e^{-t} + c_2 + c_3 e^{t} + c_4 e^{2t} \\ -c_1 e^{-t} + c_3 e^{t} + 2 c_4 e^{2t} \\ -c_1 e^{-t} + c_2 + c_3 e^{t} - c_4 e^{2t} \\ c_1 e^{-t} + c_3 e^{t} - 2 c_4 e^{2t} \end{pmatrix},$$
whose first and third components give the general solution $u(t) = c_1 e^{-t} + c_2 + c_3 e^{t} + c_4 e^{2t}$, $v(t) = -c_1 e^{-t} + c_2 + c_3 e^{t} - c_4 e^{2t}$ to the second order system.
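The eigenvalues and eigenvectors quoted in 9.1.30 can be confirmed by direct multiplication. The following Python sketch does this with exact integer arithmetic; the helper names `matvec` and `is_eigenpair` are ours and do not come from the text.

```python
# Coefficient matrix of the first order system in Exercise 9.1.30.
A = [
    [0, 1, 0, 0],
    [1, 1, -1, 0],
    [0, 0, 0, 1],
    [-1, 0, 1, 1],
]

# Eigenvalue / eigenvector pairs as stated in the solution.
pairs = [
    (-1, [1, -1, -1, 1]),
    (0, [1, 0, 1, 0]),
    (1, [1, 1, 1, 1]),
    (2, [1, 2, -1, -2]),
]


def matvec(M, x):
    """Return the matrix-vector product M x for a list-of-lists matrix."""
    return [sum(m * xi for m, xi in zip(row, x)) for row in M]


def is_eigenpair(M, lam, v):
    """True when M v equals lam * v exactly (integer arithmetic)."""
    return matvec(M, v) == [lam * vi for vi in v]


for lam, v in pairs:
    print(lam, v, is_eigenpair(A, lam, v))
```

Because every entry is an integer, the comparison is exact and no tolerance is needed.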

9.1.31. The degree is at most $n - 1$, and this occurs if and only if $A$ has only one Jordan chain in its Jordan basis.

♦ 9.1.32. (a) By direct computation,
$$\frac{du_j}{dt} = \lambda e^{\lambda t} \sum_{i=1}^{j} \frac{t^{j-i}}{(j-i)!}\, w_i + e^{\lambda t} \sum_{i=1}^{j-1} \frac{t^{j-i-1}}{(j-i-1)!}\, w_i,$$
which equals
$$A u_j = e^{\lambda t} \sum_{i=1}^{j} \frac{t^{j-i}}{(j-i)!}\, A w_i = e^{\lambda t} \left[ \frac{t^{j-1}}{(j-1)!}\, \lambda w_1 + \sum_{i=2}^{j} \frac{t^{j-i}}{(j-i)!}\, (\lambda w_i + w_{i-1}) \right].$$
(b) At $t = 0$, we have $u_j(0) = w_j$, and the Jordan chain vectors are linearly independent.

9.1.33. ¦ ¦ (a) The equilibrium solution satisfies A u⋆ = − b, and so v(t) = u(t) − u⋆ satisfies v = u = A u + b = A(u − u⋆ ) = A v, which is the homogeneous system. u(t) = − 2 c1 cos 2 t + 2 c2 sin 2 t − 3, u(t) = − 3 c1 e2 t + c2 e− 2 t − 41 , (ii) (b) (i) v(t) = c1 sin 2 t + c2 cos 2 t − 21 . v(t) = c1 e2 t + c2 e− 2 t + 14 .

9.2.1.
(a) Asymptotically stable: the eigenvalues are $-2 \pm i$;
(b) unstable: the eigenvalues are $\frac12 \pm \frac{\sqrt{11}}{2}\, i$;
(c) asymptotically stable: eigenvalue $-3$;
(d) stable: the eigenvalues are $\pm 4i$;
(e) stable: the eigenvalues are $0, -1$, with $0$ complete;
(f) unstable: the eigenvalues are $1, -1 \pm 2i$;
(g) asymptotically stable: the eigenvalues are $-1, -2$;
(h) unstable: the eigenvalues are $-1, 0$, with $0$ incomplete.
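The classifications in 9.2.1 all follow the same rule: look at the largest real part of the eigenvalues and, when it is zero, ask whether those eigenvalues are complete. A minimal Python sketch of that rule follows; the function name `classify` and the flag `zero_real_part_complete` are our own illustrative choices, and the completeness information must be supplied by the caller since it cannot be read off the eigenvalues alone.

```python
def classify(eigenvalues, zero_real_part_complete=True, tol=1e-12):
    """Classify the zero solution of du/dt = A u from the eigenvalues of A.

    zero_real_part_complete states whether every eigenvalue with zero real
    part is complete (has as many independent eigenvectors as its
    multiplicity). It only matters when the largest real part is zero.
    """
    max_re = max(complex(lam).real for lam in eigenvalues)
    if max_re < -tol:
        return "asymptotically stable"
    if max_re > tol:
        return "unstable"
    return "stable" if zero_real_part_complete else "unstable"


print(classify([-2 + 1j, -2 - 1j]))                         # case (a)
print(classify([4j, -4j]))                                  # case (d)
print(classify([-1, 0], zero_real_part_complete=False))     # case (h)
```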

9.2.2.
$u(t) = e^{-t} \left[ \left( c_1 + \sqrt{\tfrac23}\, c_2 \right) \cos \sqrt6\, t + \left( -\sqrt{\tfrac23}\, c_1 + c_2 \right) \sin \sqrt6\, t \right] + \tfrac12 c_3 e^{-2t}$,
$v(t) = e^{-t} \left[ c_1 \cos \sqrt6\, t + c_2 \sin \sqrt6\, t \right] + \tfrac12 c_3 e^{-2t}$,
$w(t) = e^{-t} \left[ c_1 \cos \sqrt6\, t + c_2 \sin \sqrt6\, t \right] + c_3 e^{-2t}$.
The system is stable because all terms in the solution are exponentially decreasing as $t \to \infty$.

9.2.3.
(a) $\dot u = -2u$, $\dot v = -2v$, with solution $u(t) = c_1 e^{-2t}$, $v(t) = c_2 e^{-2t}$.
(b) $\dot u = -v$, $\dot v = -u$, with solution $u(t) = c_1 e^{t} + c_2 e^{-t}$, $v(t) = -c_1 e^{t} + c_2 e^{-t}$.
(c) $\dot u = -8u + 2v$, $\dot v = 2u - 2v$, with solution
$u(t) = -c_1 \frac{\sqrt{13}+3}{2}\, e^{-(5+\sqrt{13})t} + c_2 \frac{\sqrt{13}-3}{2}\, e^{-(5-\sqrt{13})t}$, $v(t) = c_1 e^{-(5+\sqrt{13})t} + c_2 e^{-(5-\sqrt{13})t}$.
(d) $\dot u = -4u + v + 2w$, $\dot v = u - 4v + w$, $\dot w = 2u + v - 4w$, with solution
$u(t) = -c_1 e^{-6t} + c_2 e^{-(3+\sqrt3)t} + c_3 e^{-(3-\sqrt3)t}$,
$v(t) = -(\sqrt3 + 1)\, c_2 e^{-(3+\sqrt3)t} + (\sqrt3 - 1)\, c_3 e^{-(3-\sqrt3)t}$,
$w(t) = c_1 e^{-6t} + c_2 e^{-(3+\sqrt3)t} + c_3 e^{-(3-\sqrt3)t}$.

9.2.4.
(a) $\dot u = 2v$, $\dot v = -2u$, with solution $u(t) = c_1 \cos 2t + c_2 \sin 2t$, $v(t) = -c_1 \sin 2t + c_2 \cos 2t$; stable.
(b) $\dot u = u$, $\dot v = -v$, with solution $u(t) = c_1 e^{t}$, $v(t) = c_2 e^{-t}$; unstable.
(c) $\dot u = -2u + 2v$, $\dot v = -8u + 2v$, with solution $u(t) = \frac14 (c_1 - \sqrt3\, c_2) \cos 2\sqrt3\, t + \frac14 (\sqrt3\, c_1 + c_2) \sin 2\sqrt3\, t$, $v(t) = c_1 \cos 2\sqrt3\, t + c_2 \sin 2\sqrt3\, t$; stable.

9.2.5. (a) Gradient flow; asymptotically stable. (b) Neither; unstable. (c) Hamiltonian flow; unstable. (d) Hamiltonian flow; stable. (e) Neither; unstable.

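9.2.6. True. If $K = \begin{pmatrix} a & b \\ b & c \end{pmatrix}$, then we must have $\dfrac{\partial H}{\partial v} = a u + b v$, $\dfrac{\partial H}{\partial u} = -b u - c v$. Therefore, by equality of mixed partials, $\dfrac{\partial^2 H}{\partial u\, \partial v} = a = -c$. But if $K > 0$, both diagonal entries must be positive, $a, c > 0$, which is a contradiction.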

9.2.7. (a) The characteristic equation is $\lambda^4 + 2\lambda^2 + 1 = 0$, and so $\pm i$ are double eigenvalues. However, each has only one linearly independent eigenvector, namely $(1, \pm i, 0, 0)^T$.
(b) The general solution is $u(t) = \begin{pmatrix} c_1 \cos t + c_2 \sin t + c_3 t \cos t + c_4 t \sin t \\ -c_1 \sin t + c_2 \cos t - c_3 t \sin t + c_4 t \cos t \\ c_3 \cos t + c_4 \sin t \\ -c_3 \sin t + c_4 \cos t \end{pmatrix}$.
(c) All solutions with $c_3^2 + c_4^2 \ne 0$ spiral off to $\infty$ as $t \to \pm\infty$, while if $c_3 = c_4 = 0$, but $c_1^2 + c_2^2 \ne 0$, the solution goes periodically around a circle. Since the former solutions can start out arbitrarily close to $0$, the zero solution is not stable.

9.2.8. Every solution to a real first order system of period $P$ comes from complex conjugate eigenvalues $\pm 2\pi i / P$. A $3 \times 3$ real matrix has at least one real eigenvalue $\lambda_1$. Therefore, if the system has a solution of period $P$, its eigenvalues are $\lambda_1$ and $\pm 2\pi i / P$. If $\lambda_1 = 0$, every solution has period $P$. Otherwise, the solutions with no component in the direction of the real eigenvector all have period $P$, and are the only periodic solutions, proving the result. The system is stable (but never asymptotically stable) if and only if the real eigenvalue $\lambda_1 \le 0$.

9.2.9. No, since a $4 \times 4$ matrix could have two distinct complex conjugate pairs of purely imaginary eigenvalues, $\pm 2\pi i / P_1$, $\pm 2\pi i / P_2$, and would then have periodic solutions of periods $P_1$ and $P_2$. The general solution in such a case is quasi-periodic; see Section 9.5 for details.

9.2.10. The system is stable since $\pm i$ must be simple eigenvalues. Indeed, any $5 \times 5$ matrix has 5 eigenvalues, counting multiplicities, and the multiplicities of complex conjugate eigenvalues are the same. A $6 \times 6$ matrix can have $\pm i$ as complex conjugate, incomplete double eigenvalues, in addition to the simple real eigenvalues $-1, -2$, and in such a situation the origin would be unstable.

9.2.11. True, since $H_n > 0$ by Proposition 3.34.

9.2.12. True, because the eigenvalues of the coefficient matrix $-K$ are real and non-positive, $\lambda \le 0$. Moreover, as it is symmetric, all its eigenvalues, including $0$, are complete.

9.2.13. (a) $\dot v = B v = -A v$. (b) True, since the eigenvalues of $B = -A$ are minus the eigenvalues of $A$, and so will all have positive real parts. (c) False. For example, a saddle point, with one positive and one negative eigenvalue, is still unstable when going backwards in time. (d) False, unless all the eigenvalues of $A$ and hence $B$ are complete and purely imaginary or zero.

9.2.14. The eigenvalues of $-A^2$ are all of the form $-\lambda^2 \le 0$, where $\lambda$ is an eigenvalue of $A$. Thus, if $A$ is nonsingular, the result is true, while if $A$ is singular, then the equilibrium solutions are stable, since the $0$ eigenvalue is complete, but not asymptotically stable.

9.2.15. (a) True, since the sum of the eigenvalues equals the trace, so at least one must be positive or have positive real part in order that the trace be positive. (b) False. $A = \begin{pmatrix} -1 & 0 \\ 0 & -2 \end{pmatrix}$ gives an example of an asymptotically stable system with positive determinant.

9.2.16. (a) Every $v \in \ker K$ gives an equilibrium solution $u(t) \equiv v$. (b) Since $K$ is complete, the general solution has the form
$$u(t) = c_1 e^{-\lambda_1 t} v_1 + \cdots + c_r e^{-\lambda_r t} v_r + c_{r+1} v_{r+1} + \cdots + c_n v_n,$$
where $\lambda_1, \ldots, \lambda_r > 0$ are the positive eigenvalues of $K$ with (orthogonal) eigenvectors $v_1, \ldots, v_r$, while $v_{r+1}, \ldots, v_n$ form a basis for the null eigenspace, i.e., $\ker K$. Thus, as $t \to \infty$, $u(t) \to c_{r+1} v_{r+1} + \cdots + c_n v_n \in \ker K$, which is an equilibrium solution. (c) The origin is asymptotically stable if $K$ is positive definite, and stable if $K$ is positive semi-definite. (d) Note that $a = u(0) = c_1 v_1 + \cdots + c_r v_r + c_{r+1} v_{r+1} + \cdots + c_n v_n$. Since the eigenvectors are orthogonal, $c_{r+1} v_{r+1} + \cdots + c_n v_n$ is the orthogonal projection of $a$ onto $\ker K$.

9.2.17. (a) The tangent to the Hamiltonian trajectory at a point $(u, v)^T$ is $\mathbf{v} = (\partial H/\partial v, -\partial H/\partial u)^T$ while the tangent to the gradient flow trajectory is $\mathbf{w} = (\partial H/\partial u, \partial H/\partial v)^T$. Since $\mathbf{v} \cdot \mathbf{w} = 0$, the tangents are orthogonal.

(b) (i), (ii) [Two plots of the trajectories on the square $-1 \le u, v \le 1$; not reproduced here.]

9.2.18. False. Only positive definite Hamiltonian functions lead to stable gradient flows.


9.2.19. (a) When $q(u) = \frac12 u^T K u$ then
$$\frac{d}{dt}\, q(u) = \tfrac12 \dot u^T K u + \tfrac12 u^T K \dot u = u^T K \dot u = -(K u)^T K u = -\| K u \|^2.$$
(b) Since $K u \ne 0$ when $u \notin \ker K$, $\dfrac{d}{dt}\, q(u) = -\| K u \|^2 < 0$ and hence $q(u)$ is a strictly decreasing function of $t$ whenever $u(t)$ is not an equilibrium solution. Moreover, $u(t) \to u^\star$ goes to equilibrium exponentially fast, and hence its energy decreases exponentially fast to its equilibrium value: $q(u) \to q(u^\star)$.

♥ 9.2.20. (a) By the multivariable calculus chain rule
$$\frac{d}{dt}\, H(u(t), v(t)) = \frac{\partial H}{\partial u} \frac{du}{dt} + \frac{\partial H}{\partial v} \frac{dv}{dt} = \frac{\partial H}{\partial u} \frac{\partial H}{\partial v} + \frac{\partial H}{\partial v} \left( -\frac{\partial H}{\partial u} \right) \equiv 0.$$
Therefore $H(u(t), v(t)) \equiv c$ is constant, with its value $c = H(u_0, v_0)$ fixed by the initial conditions $u(t_0) = u_0$, $v(t_0) = v_0$.
(b) The solutions are $u(t) = c_1 \cos(2t) - c_1 \sin(2t) + 2 c_2 \sin(2t)$, $v(t) = c_2 \cos(2t) - c_1 \sin(2t) + c_2 \sin(2t)$, and leave the Hamiltonian function constant: $H(u(t), v(t)) = u(t)^2 - 2 u(t) v(t) + 2 v(t)^2 = c_1^2 - 2 c_1 c_2 + 2 c_2^2 = c$.
[Plot of the trajectories on the square $-1 \le u, v \le 1$; not reproduced here.]
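The conservation law in 9.2.20(b) is easy to test numerically: evaluate the stated solution at several times and compare $H$ with $c_1^2 - 2 c_1 c_2 + 2 c_2^2$. The sketch below does exactly that; the function names and the sample constants are arbitrary choices of ours.

```python
import math


def solution(t, c1, c2):
    """Solution (u, v) of Exercise 9.2.20(b) as stated in the text."""
    c, s = math.cos(2 * t), math.sin(2 * t)
    u = c1 * c - c1 * s + 2 * c2 * s
    v = c2 * c - c1 * s + c2 * s
    return u, v


def hamiltonian(u, v):
    """H(u, v) = u^2 - 2 u v + 2 v^2."""
    return u * u - 2 * u * v + 2 * v * v


c1, c2 = 1.5, -0.7                       # arbitrary sample constants
expected = c1 * c1 - 2 * c1 * c2 + 2 * c2 * c2
for t in (0.0, 0.4, 1.3, 5.0):
    # The difference should be zero up to rounding error.
    print(t, hamiltonian(*solution(t, c1, c2)) - expected)
```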

♦ 9.2.21. In both cases, $| f(t) | = t^k e^{\mu t}$. If $\mu > 0$, then $e^{\mu t} \to \infty$, while $t^k \ge 1$ as $t \to \infty$, and so $| f(t) | \ge e^{\mu t} \to \infty$. If $\mu = 0$, then $| f(t) | = 1$ when $k = 0$, while $| f(t) | = t^k \to \infty$ if $k > 0$. If $\mu < 0$, then $| f(t) | = e^{\mu t + k \log t} \to 0$ as $t \to \infty$, since $\mu t + k \log t \to -\infty$.

♦ 9.2.22. An eigensolution $u(t) = e^{\lambda t} v$ with $\lambda = \mu + i \nu$ is bounded in norm by $\| u(t) \| \le e^{\mu t} \| v \|$. Moreover, since exponentials grow faster than polynomials, any solution of the form $u(t) = e^{\lambda t} p(t)$, where $p(t)$ is a vector of polynomials, can be bounded by $C e^{a t}$ for any $a > \mu = \operatorname{Re} \lambda$ and some $C > 0$. Since every solution can be written as a linear combination of such solutions, every term is bounded by a multiple of $e^{a t}$ provided $a > a^\star = \max \operatorname{Re} \lambda$ and so, by the triangle inequality, is their sum. If the maximal eigenvalues are complete, then there are no polynomial terms, and we can use the eigensolution bound, so we can set $a = a^\star$.

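9.3.1.
(i) $A = \begin{pmatrix} 0 & -1 \\ 9 & 0 \end{pmatrix}$; $\lambda_1 = 3i$, $v_1 = \begin{pmatrix} i \\ 3 \end{pmatrix}$, $\lambda_2 = -3i$, $v_2 = \begin{pmatrix} -i \\ 3 \end{pmatrix}$,
$u_1(t) = c_1 \cos 3t + c_2 \sin 3t$, $u_2(t) = 3 c_1 \sin 3t - 3 c_2 \cos 3t$;
center; stable.

(ii) $A = \begin{pmatrix} -2 & 3 \\ -1 & 1 \end{pmatrix}$; $\lambda = -\frac12 \pm i \frac{\sqrt3}{2}$, $v = \begin{pmatrix} \frac32 \mp i \frac{\sqrt3}{2} \\ 1 \end{pmatrix}$,
$u_1(t) = e^{-t/2} \left[ \left( \tfrac32 c_1 - \tfrac{\sqrt3}{2} c_2 \right) \cos \tfrac{\sqrt3}{2} t + \left( \tfrac{\sqrt3}{2} c_1 + \tfrac32 c_2 \right) \sin \tfrac{\sqrt3}{2} t \right]$,
$u_2(t) = e^{-t/2} \left[ c_1 \cos \tfrac{\sqrt3}{2} t + c_2 \sin \tfrac{\sqrt3}{2} t \right]$;
stable focus; asymptotically stable.

(iii) $A = \begin{pmatrix} 3 & -2 \\ 2 & -2 \end{pmatrix}$; $\lambda_1 = -1$, $v_1 = \begin{pmatrix} 1 \\ 2 \end{pmatrix}$, $\lambda_2 = 2$, $v_2 = \begin{pmatrix} 2 \\ 1 \end{pmatrix}$,
$u_1(t) = c_1 e^{-t} + 2 c_2 e^{2t}$, $u_2(t) = 2 c_1 e^{-t} + c_2 e^{2t}$;
saddle point; unstable.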

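9.3.2.

(i) $u(t) = c_1 e^{-t} \begin{pmatrix} 1 \\ 1 \end{pmatrix} + c_2 e^{t} \begin{pmatrix} 1 \\ 3 \end{pmatrix}$;
saddle point; unstable.

(ii) $u(t) = c_1 e^{-t} \begin{pmatrix} 2 \cos t - \sin t \\ 5 \cos t \end{pmatrix} + c_2 e^{-t} \begin{pmatrix} 2 \sin t + \cos t \\ 5 \sin t \end{pmatrix}$;
stable focus; asymptotically stable.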

!

!

t 1 + c2 e− t/2 (iii) u(t) = c1 e ; t + 52 1 stable improper node; asymptotically stable. − t/2

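9.3.3.
(a) For the matrix $A = \begin{pmatrix} -1 & 4 \\ 1 & -2 \end{pmatrix}$, $\operatorname{tr} A = -3 < 0$, $\det A = -2 < 0$, $\Delta = 17 > 0$, so this is an unstable saddle point.
(b) For the matrix $A = \begin{pmatrix} -2 & 1 \\ 1 & -4 \end{pmatrix}$, $\operatorname{tr} A = -6 < 0$, $\det A = 7 > 0$, $\Delta = 8 > 0$, so this is a stable node.
(c) For the matrix $A = \begin{pmatrix} 5 & 4 \\ 1 & 2 \end{pmatrix}$, $\operatorname{tr} A = 7 > 0$, $\det A = 6 > 0$, $\Delta = 25 > 0$, so this is an unstable node.
(d) For the matrix $A = \begin{pmatrix} -3 & -2 \\ 3 & 2 \end{pmatrix}$, $\operatorname{tr} A = -1 < 0$, $\det A = 0$, $\Delta = 1 > 0$, so this is a stable line.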
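The trace, determinant and discriminant test used in 9.3.3 can be written as a small decision procedure. The Python sketch below is one way to organize it, with $\Delta = (\operatorname{tr} A)^2 - 4 \det A$ as in the values quoted above; the function name and the exact wording of the labels in the borderline cases ($\Delta = 0$, or $\operatorname{tr} A = \det A = 0$) are our own and are not taken from the text.

```python
def classify_planar(a, b, c, d):
    """Classify du/dt = [[a, b], [c, d]] u using tr A, det A and
    the discriminant disc = (tr A)^2 - 4 det A.

    Returns the tuple (tr, det, disc, label).
    """
    tr = a + d
    det = a * d - b * c
    disc = tr * tr - 4 * det
    if det < 0:
        label = "unstable saddle point"
    elif det == 0:
        if tr < 0:
            label = "stable line"
        elif tr > 0:
            label = "unstable line"
        else:
            label = "degenerate (tr A = det A = 0)"
    elif disc > 0:
        # det > 0 and disc > 0: real eigenvalues of the same sign.
        label = "stable node" if tr < 0 else "unstable node"
    elif disc == 0:
        # Repeated real eigenvalue: improper node or star.
        label = ("stable" if tr < 0 else "unstable") + " improper node or star"
    elif tr < 0:
        label = "stable focus"
    elif tr > 0:
        label = "unstable focus"
    else:
        label = "center"
    return tr, det, disc, label


for entries in [(-1, 4, 1, -2), (-2, 1, 1, -4), (5, 4, 1, 2), (-3, -2, 3, 2)]:
    print(entries, classify_planar(*entries))
```

With integer entries all three quantities are exact, so the equality tests on `det` and `disc` are safe; for floating point data a tolerance would be needed.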

9.3.4. (a)–(e) [Phase portraits; not reproduced here.]

9.3.5. (a) $u(t) = \left( e^{-2t} \cos t + 7 e^{-2t} \sin t,\; 3 e^{-2t} \cos t - 4 e^{-2t} \sin t \right)^T$;
(b) [Plot; not reproduced here.]
(c) Asymptotically stable since the coefficient matrix has $\operatorname{tr} A = -4 < 0$, $\det A = 5 > 0$, $\Delta = -4 < 0$, and hence it is a stable focus; equivalently, both eigenvalues $-2 \pm i$ have negative real part.

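♦ 9.3.6. For (9.31), the complex solution
$$e^{\lambda t} v = e^{(\mu + i \nu) t} (w + i z) = e^{\mu t} \left[ \cos(\nu t)\, w - \sin(\nu t)\, z \right] + i\, e^{\mu t} \left[ \sin(\nu t)\, w + \cos(\nu t)\, z \right]$$
leads to the general real solution
$$\begin{aligned} u(t) &= c_1 e^{\mu t} \left[ \cos(\nu t)\, w - \sin(\nu t)\, z \right] + c_2 e^{\mu t} \left[ \sin(\nu t)\, w + \cos(\nu t)\, z \right] \\ &= e^{\mu t} \left[ c_1 \cos(\nu t) + c_2 \sin(\nu t) \right] w + e^{\mu t} \left[ -c_1 \sin(\nu t) + c_2 \cos(\nu t) \right] z \\ &= r e^{\mu t} \left[ \cos(\nu t - \sigma)\, w - \sin(\nu t - \sigma)\, z \right], \end{aligned}$$
where $r = \sqrt{c_1^2 + c_2^2}$ and $\tan \sigma = c_2 / c_1$. To justify (9.32), we differentiate
$$\frac{du}{dt} = \frac{d}{dt} \left[ (c_1 + c_2 t) e^{\lambda t} v + c_2 e^{\lambda t} w \right] = \lambda \left[ (c_1 + c_2 t) e^{\lambda t} v + c_2 e^{\lambda t} w \right] + c_2 e^{\lambda t} v,$$
which is equal to $A u = (c_1 + c_2 t) e^{\lambda t} A v + c_2 e^{\lambda t} A w = (c_1 + c_2 t) e^{\lambda t} \lambda v + c_2 e^{\lambda t} (\lambda w + v)$ by the Jordan chain condition.

9.3.7. All except for cases IV(a–c), i.e., the stars and the trivial case.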

9.4.1.
(a) $\begin{pmatrix} \frac43 e^{t} - \frac13 e^{-2t} & -\frac13 e^{t} + \frac13 e^{-2t} \\ \frac43 e^{t} - \frac43 e^{-2t} & -\frac13 e^{t} + \frac43 e^{-2t} \end{pmatrix}$,
(b) $\begin{pmatrix} \frac12 e^{t} + \frac12 e^{-t} & \frac12 e^{t} - \frac12 e^{-t} \\ \frac12 e^{t} - \frac12 e^{-t} & \frac12 e^{t} + \frac12 e^{-t} \end{pmatrix} = \begin{pmatrix} \cosh t & \sinh t \\ \sinh t & \cosh t \end{pmatrix}$,
(c) $\begin{pmatrix} \cos t & -\sin t \\ \sin t & \cos t \end{pmatrix}$,
(d) $\begin{pmatrix} 1 & t \\ 0 & 1 \end{pmatrix}$,
(e) $\begin{pmatrix} e^{2t} \cos t - 3 e^{2t} \sin t & 2 e^{2t} \sin t \\ -5 e^{2t} \sin t & e^{2t} \cos t + 3 e^{2t} \sin t \end{pmatrix}$,
(f) $\begin{pmatrix} e^{-t} + 2 t e^{-t} & 2 t e^{-t} \\ -2 t e^{-t} & e^{-t} - 2 t e^{-t} \end{pmatrix}$.

9.4.2.
(a) $\begin{pmatrix} 1 & 0 & 0 \\ 2 \sin t & \cos t & \sin t \\ 2 \cos t - 2 & -\sin t & \cos t \end{pmatrix}$,
(b) $\begin{pmatrix} \frac16 e^{t} + \frac12 e^{3t} + \frac13 e^{4t} & \frac13 e^{t} - \frac13 e^{4t} & \frac16 e^{t} - \frac12 e^{3t} + \frac13 e^{4t} \\ \frac13 e^{t} - \frac13 e^{4t} & \frac23 e^{t} + \frac13 e^{4t} & \frac13 e^{t} - \frac13 e^{4t} \\ \frac16 e^{t} - \frac12 e^{3t} + \frac13 e^{4t} & \frac13 e^{t} - \frac13 e^{4t} & \frac16 e^{t} + \frac12 e^{3t} + \frac13 e^{4t} \end{pmatrix}$,
(c) $\begin{pmatrix} e^{-2t} + t e^{-2t} & t e^{-2t} & t e^{-2t} \\ -1 + e^{-2t} & e^{-2t} & -1 + e^{-2t} \\ 1 - e^{-2t} - t e^{-2t} & -t e^{-2t} & 1 - t e^{-2t} \end{pmatrix}$,
(d) $\begin{pmatrix} \alpha(t) & \beta(t) & \gamma(t) \\ \gamma(t) & \alpha(t) & \beta(t) \\ \beta(t) & \gamma(t) & \alpha(t) \end{pmatrix}$, where
$\alpha(t) = \frac13 e^{t} + \frac23 e^{-t/2} \cos \frac{\sqrt3}{2} t$,
$\beta(t) = \frac13 e^{t} - \frac13 e^{-t/2} \cos \frac{\sqrt3}{2} t - \frac{1}{\sqrt3} e^{-t/2} \sin \frac{\sqrt3}{2} t$,
$\gamma(t) = \frac13 e^{t} - \frac13 e^{-t/2} \cos \frac{\sqrt3}{2} t + \frac{1}{\sqrt3} e^{-t/2} \sin \frac{\sqrt3}{2} t$.
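The closed forms in 9.4.1 can be spot-checked against the power series for $e^{tA}$. The sketch below sums the series in plain Python and compares it with answers (c) and (d). The coefficient matrices used here, $\begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}$ and $\begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}$, are not quoted from the exercise: they are recovered by differentiating the stated answers at $t = 0$, so treat them as an assumption. The helper names are ours.

```python
import math


def matmul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]


def expm_series(A, t=1.0, terms=40):
    """Approximate e^{tA} by the partial sum of sum_n (tA)^n / n!."""
    n = len(A)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]            # current term (tA)^k / k!
    for k in range(1, terms):
        term = matmul(term, [[t * a / k for a in row] for row in A])
        result = [[r + s for r, s in zip(rr, ss)]
                  for rr, ss in zip(result, term)]
    return result


t = 0.7
rotation = expm_series([[0.0, -1.0], [1.0, 0.0]], t)   # compare with 9.4.1(c)
shear = expm_series([[0.0, 1.0], [0.0, 0.0]], t)       # compare with 9.4.1(d)
print(rotation, [[math.cos(t), -math.sin(t)], [math.sin(t), math.cos(t)]])
print(shear, [[1.0, t], [0.0, 1.0]])
```

A truncated series is adequate for small matrices of modest norm; it is not how one would compute matrix exponentials in production code.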

9.4.3. 9.4.1 (a) det et A = e− t = et tr A , (b) det et A = 1 = et tr A , (c) det et A = 1 = et tr A , (d) det et A = 1 = et tr A , (e) det et A = e4 t = et tr A , (f ) det et A = e− 2 t = et tr A . 9.4.2 (a) det et A = 1 = et tr A , (b) det et A = e8 t = et tr A , (c) det et A = e− 4 t = et tr A , (d) det et A = 1 = et tr A . 0 1 0 √ √ √ 1 ! 7 7 1 3 1 3 e cos 2 − 2 e sin 2 (e + e ) (e − e ) 3 −1 2 2 √ √ @ A A @ , , (b) , (c) 9.4.4. (a) 7 7 1 3 1 3 √1 e sin 2 4 −1 e cos 2 2 2 (e − e ) 2 (e + e ) 0

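(d) $\begin{pmatrix} e & 0 & 0 \\ 0 & e^{-2} & 0 \\ 0 & 0 & e^{-5} \end{pmatrix}$,
(e) $\begin{pmatrix} \frac49 + \frac59 \cos 3 & \frac49 - \frac49 \cos 3 + \frac13 \sin 3 & \frac29 - \frac29 \cos 3 - \frac23 \sin 3 \\ \frac49 - \frac49 \cos 3 - \frac13 \sin 3 & \frac49 + \frac59 \cos 3 & \frac29 - \frac29 \cos 3 + \frac23 \sin 3 \\ \frac29 - \frac29 \cos 3 + \frac23 \sin 3 & \frac29 - \frac29 \cos 3 - \frac23 \sin 3 & \frac19 + \frac89 \cos 3 \end{pmatrix}$.

9.4.5.
(a) $u(t) = \begin{pmatrix} \cos t & -\sin t \\ \sin t & \cos t \end{pmatrix} \begin{pmatrix} 1 \\ -2 \end{pmatrix} = \begin{pmatrix} \cos t + 2 \sin t \\ \sin t - 2 \cos t \end{pmatrix}$,
(b) $u(t) = \begin{pmatrix} 3 e^{-t} - 2 e^{-3t} & -3 e^{-t} + 3 e^{-3t} \\ 2 e^{-t} - 2 e^{-3t} & -2 e^{-t} + 3 e^{-3t} \end{pmatrix} \begin{pmatrix} -1 \\ 1 \end{pmatrix} = \begin{pmatrix} -6 e^{-t} + 5 e^{-3t} \\ -4 e^{-t} + 5 e^{-3t} \end{pmatrix}$,
(c) $u(t) = \begin{pmatrix} 3 e^{-t} - 2 \cos 3t - 2 \sin 3t & 3 e^{-t} - 3 \cos 3t - \sin 3t & 2 \sin 3t \\ -2 e^{-t} + 2 \cos 3t + 2 \sin 3t & -2 e^{-t} + 3 \cos 3t + \sin 3t & -2 \sin 3t \\ 2 e^{-t} - 2 \cos 3t & 2 e^{-t} - 2 \cos 3t + \sin 3t & \cos 3t + \sin 3t \end{pmatrix} \begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix} = \begin{pmatrix} 3 e^{-t} - 3 \cos 3t - \sin 3t \\ -2 e^{-t} + 3 \cos 3t + \sin 3t \\ 2 e^{-t} - 2 \cos 3t + \sin 3t \end{pmatrix}$.

9.4.6. $e^{tO} = I$ for all $t$.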

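9.4.7. There are none, since $e^{tA}$ is always invertible.

9.4.8. $e^{tA} = \begin{pmatrix} \cos 2\pi t & -\sin 2\pi t \\ \sin 2\pi t & \cos 2\pi t \end{pmatrix}$, and hence, when $t = 1$, $e^{A} = \begin{pmatrix} \cos 2\pi & -\sin 2\pi \\ \sin 2\pi & \cos 2\pi \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$.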

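9.4.9. (a) According to Exercise 8.2.51, $A^2 = -\delta^2 I$ since $\operatorname{tr} A = 0$, $\det A = \delta^2$. Thus, by induction, $A^{2m} = (-1)^m \delta^{2m} I$, $A^{2m+1} = (-1)^m \delta^{2m} A$. Therefore
$$e^{tA} = \sum_{n=0}^{\infty} \frac{t^n}{n!}\, A^n = \sum_{m=0}^{\infty} (-1)^m \frac{(\delta t)^{2m}}{(2m)!}\, I + \sum_{m=0}^{\infty} (-1)^m \frac{t^{2m+1} \delta^{2m}}{(2m+1)!}\, A = (\cos \delta t)\, I + \frac{\sin \delta t}{\delta}\, A.$$
Setting $t = 1$ proves the formula. (b) $e^{A} = (\cosh \delta)\, I + \dfrac{\sinh \delta}{\delta}\, A$, where $\delta = \sqrt{-\det A}$. (c) $e^{A} = I + A$ since $A^2 = O$ by Exercise 8.2.51.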
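The three closed forms in 9.4.9 can be compared with a direct summation of the exponential series. The sketch below assumes, as the exercise does, that $A$ is a real $2 \times 2$ matrix with $\operatorname{tr} A = 0$; the function names and the three sample matrices are our own illustrative choices.

```python
import math


def exp_trace_free(A):
    """Closed-form e^A for a real 2x2 matrix A with tr A = 0 (Exercise 9.4.9)."""
    (a, b), (c, d) = A
    det = a * d - b * c
    if det > 0:                      # case (a): det A = delta^2
        delta = math.sqrt(det)
        p, q = math.cos(delta), math.sin(delta) / delta
    elif det < 0:                    # case (b): delta = sqrt(-det A)
        delta = math.sqrt(-det)
        p, q = math.cosh(delta), math.sinh(delta) / delta
    else:                            # case (c): A^2 = O, so e^A = I + A
        p, q = 1.0, 1.0
    return [[p + q * a, q * b], [q * c, p + q * d]]


def exp_series(A, terms=60):
    """Reference value: partial sum of sum_n A^n / n! for a 2x2 matrix."""
    total = [[1.0, 0.0], [0.0, 1.0]]
    term = [[1.0, 0.0], [0.0, 1.0]]
    for n in range(1, terms):
        term = [[sum(term[i][k] * A[k][j] for k in range(2)) / n
                 for j in range(2)] for i in range(2)]
        total = [[total[i][j] + term[i][j] for j in range(2)]
                 for i in range(2)]
    return total


def max_diff(X, Y):
    """Largest absolute entrywise difference between two 2x2 matrices."""
    return max(abs(X[i][j] - Y[i][j]) for i in range(2) for j in range(2))


samples = [
    [[1.0, -5.0], [1.0, -1.0]],   # det = 4 > 0, delta = 2
    [[0.0, 1.0], [1.0, 0.0]],     # det = -1 < 0, delta = 1
    [[1.0, -1.0], [1.0, -1.0]],   # det = 0, A^2 = O
]
for A in samples:
    print(max_diff(exp_trace_free(A), exp_series(A)))
```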

♦ 9.4.10. Assuming $A$ is an $n \times n$ matrix, since $e^{tA}$ is a matrix solution, each of its $n$ individual columns must be solutions. Moreover, the columns are linearly independent since $e^{0A} = I$ is nonsingular. Therefore, they form a basis for the $n$-dimensional solution space.

9.4.11. (a) False, unless $A^{-1} = -A$. (b) True, since $A$ and $A^{-1}$ commute.

♦ 9.4.12. Fix $s$ and let $U(t) = e^{(t+s)A}$, $V(t) = e^{tA} e^{sA}$. Then, by the chain rule, $\dot U = A e^{(t+s)A} = A U$, while, by the matrix Leibniz rule (9.40), $\dot V = A e^{tA} e^{sA} = A V$. Moreover, $U(0) = e^{sA} = V(0)$. Thus $U(t)$ and $V(t)$ solve the same initial value problem, hence, by uniqueness, $U(t) = V(t)$ for all $t$, proving the result.

9.4.13. Set $U(t) = A e^{tA}$, $V(t) = e^{tA} A$. Then, by the matrix Leibniz formula (9.40), $\dot U = A^2 e^{tA} = A U$, $\dot V = A e^{tA} A = A V$, while $U(0) = A = V(0)$. Thus $U(t)$ and $V(t)$ solve the same initial value problem, hence, by uniqueness, $U(t) = V(t)$ for all $t$. Alternatively, one can use the power series formula (9.46): $A e^{tA} = \displaystyle\sum_{n=0}^{\infty} \frac{t^n}{n!}\, A^{n+1} = e^{tA} A$.

9.4.14. Set $U(t) = e^{-t\lambda} e^{tA}$. Then $\dot U = -\lambda e^{-t\lambda} e^{tA} + e^{-t\lambda} A e^{tA} = (A - \lambda I) U$. Moreover, $U(0) = I$. Therefore, by the definition of the matrix exponential, $U(t) = e^{t(A - \lambda I)}$.

♦ 9.4.15.
(a) Let $V(t) = (e^{tA})^T$. Then $\dfrac{dV}{dt} = \left( \dfrac{d}{dt}\, e^{tA} \right)^T = (e^{tA} A)^T = A^T (e^{tA})^T = A^T V$, and $V(0) = I$. Therefore, by the definition of the matrix exponential, $V(t) = e^{tA^T}$.
(b) The columns of $e^{tA}$ form a basis for the solutions to $\dot u = A u$, while its rows are a basis for the solutions to $\dot v = A^T v$. The stability properties of the two systems are the same since the eigenvalues of $A$ are the same as those of $A^T$.

♦ 9.4.16. First note that $A^n = S B^n S^{-1}$. Therefore, using (9.46),
$$e^{tA} = \sum_{n=0}^{\infty} \frac{t^n}{n!}\, A^n = \sum_{n=0}^{\infty} \frac{t^n}{n!}\, S B^n S^{-1} = S \left( \sum_{n=0}^{\infty} \frac{t^n}{n!}\, B^n \right) S^{-1} = S e^{tB} S^{-1}.$$
An alternative proof relies on the fact that $e^{tA}$ and $S e^{tB} S^{-1}$ both satisfy the initial value problem $\dot U = A U = S B S^{-1} U$, $U(0) = I$, and hence, by uniqueness, must be equal.

♦ 9.4.17. (a) $\dfrac{d}{dt} \operatorname{diag}(e^{t d_1}, \ldots, e^{t d_n}) = \operatorname{diag}(d_1 e^{t d_1}, \ldots, d_n e^{t d_n}) = D \operatorname{diag}(e^{t d_1}, \ldots, e^{t d_n})$. Moreover, at $t = 0$, we have $\operatorname{diag}(e^{0 d_1}, \ldots, e^{0 d_n}) = I$. Therefore, $\operatorname{diag}(e^{t d_1}, \ldots, e^{t d_n})$ satisfies the defining properties of $e^{tD}$. (b) See Exercise 9.4.16.

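(c) 9.4.1:
(a) $\begin{pmatrix} 1 & 1 \\ 1 & 4 \end{pmatrix} \begin{pmatrix} e^{t} & 0 \\ 0 & e^{-2t} \end{pmatrix} \begin{pmatrix} 1 & 1 \\ 1 & 4 \end{pmatrix}^{-1} = \begin{pmatrix} \frac43 e^{t} - \frac13 e^{-2t} & -\frac13 e^{t} + \frac13 e^{-2t} \\ \frac43 e^{t} - \frac43 e^{-2t} & -\frac13 e^{t} + \frac43 e^{-2t} \end{pmatrix}$;
(b) $\begin{pmatrix} 1 & -1 \\ 1 & 1 \end{pmatrix} \begin{pmatrix} e^{t} & 0 \\ 0 & e^{-t} \end{pmatrix} \begin{pmatrix} 1 & -1 \\ 1 & 1 \end{pmatrix}^{-1} = \begin{pmatrix} \frac12 e^{t} + \frac12 e^{-t} & \frac12 e^{t} - \frac12 e^{-t} \\ \frac12 e^{t} - \frac12 e^{-t} & \frac12 e^{t} + \frac12 e^{-t} \end{pmatrix}$;
(c) $\begin{pmatrix} i & -i \\ 1 & 1 \end{pmatrix} \begin{pmatrix} e^{it} & 0 \\ 0 & e^{-it} \end{pmatrix} \begin{pmatrix} i & -i \\ 1 & 1 \end{pmatrix}^{-1} = \begin{pmatrix} \cos t & -\sin t \\ \sin t & \cos t \end{pmatrix}$;
(d) not diagonalizable;
(e) $\begin{pmatrix} \frac35 - \frac15 i & \frac35 + \frac15 i \\ 1 & 1 \end{pmatrix} \begin{pmatrix} e^{(2+i)t} & 0 \\ 0 & e^{(2-i)t} \end{pmatrix} \begin{pmatrix} \frac35 - \frac15 i & \frac35 + \frac15 i \\ 1 & 1 \end{pmatrix}^{-1} = \begin{pmatrix} e^{2t} \cos t - 3 e^{2t} \sin t & 2 e^{2t} \sin t \\ -5 e^{2t} \sin t & e^{2t} \cos t + 3 e^{2t} \sin t \end{pmatrix}$;
(f) not diagonalizable.

9.4.2:
(a) $\begin{pmatrix} -1 & 0 & 0 \\ 0 & -i & i \\ 2 & 1 & 1 \end{pmatrix} \begin{pmatrix} 1 & 0 & 0 \\ 0 & e^{it} & 0 \\ 0 & 0 & e^{-it} \end{pmatrix} \begin{pmatrix} -1 & 0 & 0 \\ 0 & -i & i \\ 2 & 1 & 1 \end{pmatrix}^{-1} = \begin{pmatrix} 1 & 0 & 0 \\ 2 \sin t & \cos t & \sin t \\ 2 \cos t - 2 & -\sin t & \cos t \end{pmatrix}$;
(b) $\begin{pmatrix} 1 & -1 & 1 \\ 2 & 0 & -1 \\ 1 & 1 & 1 \end{pmatrix} \begin{pmatrix} e^{t} & 0 & 0 \\ 0 & e^{3t} & 0 \\ 0 & 0 & e^{4t} \end{pmatrix} \begin{pmatrix} 1 & -1 & 1 \\ 2 & 0 & -1 \\ 1 & 1 & 1 \end{pmatrix}^{-1} = \begin{pmatrix} \frac16 e^{t} + \frac12 e^{3t} + \frac13 e^{4t} & \frac13 e^{t} - \frac13 e^{4t} & \frac16 e^{t} - \frac12 e^{3t} + \frac13 e^{4t} \\ \frac13 e^{t} - \frac13 e^{4t} & \frac23 e^{t} + \frac13 e^{4t} & \frac13 e^{t} - \frac13 e^{4t} \\ \frac16 e^{t} - \frac12 e^{3t} + \frac13 e^{4t} & \frac13 e^{t} - \frac13 e^{4t} & \frac16 e^{t} + \frac12 e^{3t} + \frac13 e^{4t} \end{pmatrix}$;
(c) not diagonalizable;
(d) $S \begin{pmatrix} e^{t} & 0 & 0 \\ 0 & e^{-\left( \frac12 - i \frac{\sqrt3}{2} \right) t} & 0 \\ 0 & 0 & e^{-\left( \frac12 + i \frac{\sqrt3}{2} \right) t} \end{pmatrix} S^{-1}$, where $S = \begin{pmatrix} 1 & -\frac12 - i \frac{\sqrt3}{2} & -\frac12 + i \frac{\sqrt3}{2} \\ 1 & -\frac12 + i \frac{\sqrt3}{2} & -\frac12 - i \frac{\sqrt3}{2} \\ 1 & 1 & 1 \end{pmatrix}$,
which equals the same matrix exponential as before.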

♦ 9.4.18. Let $M$ have size $p \times q$ and $N$ have size $q \times r$. The derivative of the $(i, j)$ entry of the product matrix $M(t) N(t)$ is
$$\frac{d}{dt} \sum_{k=1}^{q} m_{ik}(t)\, n_{kj}(t) = \sum_{k=1}^{q} \frac{dm_{ik}}{dt}\, n_{kj}(t) + \sum_{k=1}^{q} m_{ik}(t)\, \frac{dn_{kj}}{dt}.$$
The first sum is the $(i, j)$ entry of $\dfrac{dM}{dt} N$ while the second is the $(i, j)$ entry of $M \dfrac{dN}{dt}$.

♦ 9.4.19. (a) The exponential series is a sum of real terms. Alternatively, one can choose a real basis for the solution space to construct the real matrix solution $U(t)$ before substituting into formula (9.42). (b) According to Lemma 9.28, $\det e^{A} = e^{\operatorname{tr} A} > 0$ since a real scalar exponential is always positive.

9.4.20. Lemma 9.28 implies $\det e^{tA} = e^{t \operatorname{tr} A} = 1$ for all $t$ if and only if $\operatorname{tr} A = 0$. (Even if $\operatorname{tr} A$ is allowed to be complex, by continuity, the only way this could hold for all $t$ is if $\operatorname{tr} A = 0$.)

♦ 9.4.21. Let $u(t) = e^{t\lambda} v$ where $v$ is the corresponding eigenvector of $A$. Then $\dfrac{du}{dt} = \lambda e^{t\lambda} v = \lambda u = A u$, and hence, by (9.41), $u(t) = e^{tA} u(0) = e^{tA} v$. Therefore, equating the two formulas for $u(t)$, we conclude that $e^{tA} v = e^{t\lambda} v$, which proves that $v$ is an eigenvector of $e^{tA}$ with eigenvalue $e^{t\lambda}$.

9.4.22. The origin is asymptotically stable if and only if all solutions tend to zero as $t \to \infty$. Thus, all columns of $e^{tA}$ tend to $0$ as $t \to \infty$, and hence $\lim_{t \to \infty} e^{tA} = O$. Conversely, if $\lim_{t \to \infty} e^{tA} = O$, then any solution has the form $u(t) = e^{tA} c$, and hence $u(t) \to 0$ as $t \to \infty$, proving asymptotic stability.

9.4.23. According to Exercise 9.4.21, the eigenvalues of $e^{A}$ are $e^{\lambda} = e^{\mu} \cos \nu + i\, e^{\mu} \sin \nu$, where $\lambda = \mu + i \nu$ are the eigenvalues of $A$. For $e^{\lambda}$ to have negative real part, we must have $\cos \nu < 0$, and so $\nu = \operatorname{Im} \lambda$ must lie between $\left( 2k + \frac12 \right) \pi < \nu < \left( 2k + \frac32 \right) \pi$ for some integer $k$.

9.4.24. Indeed, the columns of $\tilde U(t)$ are linear combinations of the columns of $U(t)$, and hence automatically solutions to the linear system. Alternatively, we can prove this directly using the Leibniz rule (9.40): $\dfrac{d\tilde U}{dt} = \dfrac{d}{dt} \left( U(t)\, C \right) = \dfrac{dU}{dt}\, C = A U C = A \tilde U$, since $C$ is constant.

♦ 9.4.25.
(a) If $U(t) = C e^{tB}$, then $\dfrac{dU}{dt} = C e^{tB} B = U B$, and so $U$ satisfies the differential equation. Moreover, $C = U(0)$. Thus, $U(t)$ is the unique solution to the initial value problem $\dot U = U B$, $U(0) = C$, where the initial value $C$ is arbitrary.
(b) By Exercise 9.4.16, $U(t) = C e^{tB} = e^{tA} C$ where $A = C B C^{-1}$. Thus, $\dot U = A U$ as claimed. Note that $A = B$ if and only if $A$ commutes with $U(0) = C$.

9.4.26. (a) Let $U(t) = ( u_1(t) \; \ldots \; u_n(t) )$ be the corresponding matrix-valued function with the indicated columns. Then $\dfrac{du_j}{dt} = \displaystyle\sum_{i=1}^{n} b_{ij} u_i$ for all $j = 1, \ldots, n$, if and only if $\dot U = U B$ where $B$ is the $n \times n$ matrix with entries $b_{ij}$. Therefore, by Exercise 9.4.25, $\dot U = A U$, where $A = C B C^{-1}$ with $C = U(0)$.
(b) According to Exercise 9.1.7, every derivative $\dfrac{d^k u}{dt^k}$ of a solution to $\dot u = A u$ is also a solution. Since the solution space is an $n$-dimensional vector space, at most $n$ of the derivatives are linearly independent, and so either
$$\frac{d^n u}{dt^n} = c_0 u + c_1 \frac{du}{dt} + \cdots + c_{n-1} \frac{d^{n-1} u}{dt^{n-1}}, \qquad (*)$$
for some constants $c_0, \ldots, c_{n-1}$, or, for some $k < n$, we have
$$\frac{d^k u}{dt^k} = a_0 u + a_1 \frac{du}{dt} + \cdots + a_{k-1} \frac{d^{k-1} u}{dt^{k-1}}. \qquad (**)$$
In the latter case,
$$\frac{d^n u}{dt^n} = \frac{d^{n-k}}{dt^{n-k}} \left( a_0 u + a_1 \frac{du}{dt} + \cdots + a_{k-1} \frac{d^{k-1} u}{dt^{k-1}} \right) = a_0 \frac{d^{n-k} u}{dt^{n-k}} + a_1 \frac{d^{n-k+1} u}{dt^{n-k+1}} + \cdots + a_{k-1} \frac{d^{n-1} u}{dt^{n-1}},$$
and so $(*)$ continues to hold, now with $c_0 = \cdots = c_{n-k-1} = 0$, $c_{n-k} = a_0, \ldots, c_{n-1} = a_{k-1}$.

9.4.27. Write the matrix solution to the initial value problem $\dfrac{dU}{dt} = A U$, $U(0) = I$, in block form $U(t) = e^{tA} = \begin{pmatrix} V(t) & W(t) \\ Y(t) & Z(t) \end{pmatrix}$. Then the differential equation decouples into $\dfrac{dV}{dt} = B V$, $\dfrac{dW}{dt} = B W$, $\dfrac{dY}{dt} = C Y$, $\dfrac{dZ}{dt} = C Z$, with initial conditions $V(0) = I$, $W(0) = O$, $Y(0) = O$, $Z(0) = I$. Thus, by uniqueness of solutions to the initial value problems, $V(t) = e^{tB}$, $W(t) = O$, $Y(t) = O$, $Z(t) = e^{tC}$.

♦ 9.4.28. (a) Let $U(t)$ denote the upper triangular matrix on the left below. Then
$$\frac{d}{dt} \begin{pmatrix} 1 & t & \frac{t^2}{2} & \frac{t^3}{6} & \cdots & \frac{t^n}{n!} \\ 0 & 1 & t & \frac{t^2}{2} & \cdots & \frac{t^{n-1}}{(n-1)!} \\ 0 & 0 & 1 & t & \cdots & \frac{t^{n-2}}{(n-2)!} \\ \vdots & \vdots & \vdots & \ddots & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & 1 & t \\ 0 & 0 & 0 & \cdots & 0 & 1 \end{pmatrix} = \begin{pmatrix} 0 & 1 & t & \frac{t^2}{2} & \cdots & \frac{t^{n-1}}{(n-1)!} \\ 0 & 0 & 1 & t & \cdots & \frac{t^{n-2}}{(n-2)!} \\ 0 & 0 & 0 & 1 & \cdots & \frac{t^{n-3}}{(n-3)!} \\ \vdots & \vdots & \vdots & \ddots & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & 0 & 1 \\ 0 & 0 & 0 & \cdots & 0 & 0 \end{pmatrix} = \begin{pmatrix} 0 & 1 & 0 & 0 & \cdots & 0 \\ 0 & 0 & 1 & 0 & \cdots & 0 \\ \vdots & \vdots & \ddots & \ddots & & \vdots \\ \vdots & \vdots & & \ddots & \ddots & 0 \\ 0 & 0 & 0 & \cdots & 0 & 1 \\ 0 & 0 & 0 & \cdots & 0 & 0 \end{pmatrix} U(t).$$

Thus, $U(t)$ satisfies the initial value problem $\dot U = J_{0,n} U$, $U(0) = I$, that characterizes the matrix exponential, so $U(t) = e^{t J_{0,n}}$. (b) Since $J_{\lambda,n} = \lambda I + J_{0,n}$, by Exercise 9.4.14, $e^{t J_{\lambda,n}} = e^{t\lambda} e^{t J_{0,n}}$, i.e., you merely multiply all entries in the previous formula by $e^{t\lambda}$. (c) According to Exercises 9.4.17, 27, if $A = S J S^{-1}$ where $J$ is the Jordan canonical form, then $e^{tA} = S e^{tJ} S^{-1}$, and $e^{tJ}$ is a block diagonal matrix given by the exponentials of its individual Jordan blocks, computed in part (b).

♦ 9.4.29. If $J$ is a Jordan matrix, then, by the arguments in Exercise 9.4.28, $e^{tJ}$ is upper triangular with diagonal entries given by $e^{t\lambda}$ where $\lambda$ is the eigenvalue appearing on the diagonal of the corresponding Jordan block of $A$. In particular, the multiplicity of $\lambda$, which is the number of times it appears on the diagonal of $J$, is the same as the multiplicity of $e^{t\lambda}$ for $e^{tJ}$. Moreover, since $e^{tA}$ is similar to $e^{tJ}$, its eigenvalues are the same, and of the same multiplicities.

♥ 9.4.30. (a) All matrix exponentials are nonsingular by the remark after (9.44). (b) Both $A = O$ and $A = \begin{pmatrix} 0 & -2\pi \\ 2\pi & 0 \end{pmatrix}$ have the identity matrix as their exponential $e^{A} = I$. (c) If $e^{A} = I$ and $\lambda$ is an eigenvalue of $A$, then $e^{\lambda} = 1$, since $1$ is the only eigenvalue of $I$. Therefore, the eigenvalues of $A$ must be integer multiples of $2\pi i$. Since $A$ is real, the eigenvalues must be complex conjugate, and hence either both $0$, or $\pm 2 n \pi i$ for some positive integer $n$. In the latter case, the characteristic equation of $A$ is $\lambda^2 + 4 n^2 \pi^2 = 0$, and hence $A$ must have zero trace and determinant $4 n^2 \pi^2$. Thus, $A = \begin{pmatrix} a & b \\ c & -a \end{pmatrix}$ with $a^2 + b c = -4 n^2 \pi^2$. If $A$ has both eigenvalues zero, it must be complete, and hence $A = O$, which is included in the previous formula.
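The Jordan block formula of Exercise 9.4.28(b) can be checked numerically through two properties every matrix exponential must have: $U(0) = I$ and $U(s) U(t) = U(s + t)$. The sketch below builds the matrix with entries $e^{\lambda t} t^{j-i}/(j-i)!$ on and above the diagonal; the function names, the block size, and the sample values are our own choices.

```python
import math


def jordan_exp(lam, n, t):
    """e^{tJ} for the n x n Jordan block J with eigenvalue lam.

    Entry (i, j) is e^{lam t} t^(j-i) / (j-i)! for j >= i and 0 otherwise.
    """
    scale = math.exp(lam * t)
    return [[scale * t ** (j - i) / math.factorial(j - i) if j >= i else 0.0
             for j in range(n)] for i in range(n)]


def matmul(X, Y):
    """Product of two square matrices stored as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


lam, n, s, t = -0.5, 4, 0.3, 0.5
product = matmul(jordan_exp(lam, n, s), jordan_exp(lam, n, t))
direct = jordan_exp(lam, n, s + t)
# Largest discrepancy between U(s) U(t) and U(s + t); should be rounding error.
print(max(abs(product[i][j] - direct[i][j]) for i in range(n) for j in range(n)))
```

The semigroup identity reduces entrywise to the binomial theorem for $(s + t)^{j-i}$, which is why it holds exactly in exact arithmetic.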

9.4.31. Even though this formula is correct in the scalar case, it is false in general. Would that life were so simple!

9.4.32. 1 −2t e − 14 e2 t , u2 (t) = 13 et − 13 e− 2 t ; (a) u1 (t) = 31 et − 12 (b) u1 (t) = et−1 − et + t et , u2 (t) = et−1 − et + t et ; (c) u1 (t) = 31 cos 2 t − 21 sin 2 t − 13 cos t, u2 (t) = cos 2 t + 32 sin 2 t − 4t 3 1 13 4 t 29 3 (d) u(t) = 13 16 e + 16 − 4 t, v(t) = 16 e − 16 + 4 t; 1 2 1 3 1 2 1 3 (e) p(t) = 2 t + 3 t , q(t) = 2 t − 3 t .

1 3

sin t;

9.4.33. (a) u1 (t) = 21 cos 2 t + 14 sin 2 t + 21 − 12 t, u2 (t) = 2 e− t − 21 cos 2 t − 14 sin 2 t − 23 + u3 (t) = 2 e− t − 41 cos 2 t − 34 sin 2 t − 47 + 32 t; (b) u1 (t) = − 32 et + 12 e− t + t e− t , u2 (t) = t e− t , u3 (t) = − 3 et + 2 e− t + 2 t e− t .

3 2

t,

9.4.34. Since $\lambda$ is not an eigenvalue, $A - \lambda I$ is nonsingular. Set $w = (A - \lambda I)^{-1} v$. Then $u^\star(t) = e^{\lambda t} w$ is a solution. The general solution is $u(t) = e^{\lambda t} w + z(t) = e^{\lambda t} w + e^{tA} b$, where $b$ is any vector.

9.4.35.
(a) $u(t) = \displaystyle\int_0^t e^{(t-s)A} b \, ds$.
(b) Yes, since if $b = A c$, then the integral can be evaluated as
$$u(t) = \int_0^t e^{(t-s)A} A c \, ds = -e^{(t-s)A} c \,\Big|_{s=0}^{t} = e^{tA} c - c = v(t) - u^\star,$$
where $v(t) = e^{tA} c$ solves the homogeneous system $\dot v = A v$, while $u^\star = c$ is the equilibrium solution.

9.4.36.
(a) $\begin{pmatrix} e^{2t} & 0 \\ 0 & 1 \end{pmatrix}$: scalings in the $x$ direction, which expand when $t > 0$ and contract when $t < 0$. The trajectories are half-lines parallel to the $x$ axis. Points on the $y$ axis are left fixed.
(b) $\begin{pmatrix} 1 & 0 \\ t & 1 \end{pmatrix}$: shear transformations in the $y$ direction. The trajectories are lines parallel to the $y$ axis. Points on the $y$ axis are fixed.
(c) $\begin{pmatrix} \cos 3t & \sin 3t \\ -\sin 3t & \cos 3t \end{pmatrix}$: rotations around the origin, starting in a clockwise direction for $t > 0$. The trajectories are the circles $x^2 + y^2 = c$. The origin is fixed.
(d) $\begin{pmatrix} \cos 2t & -\frac12 \sin 2t \\ 2 \sin 2t & \cos 2t \end{pmatrix}$: elliptical rotations around the origin. The trajectories are the ellipses $x^2 + \frac14 y^2 = c$. The origin is fixed.
(e) $\begin{pmatrix} \cosh t & \sinh t \\ \sinh t & \cosh t \end{pmatrix}$: hyperbolic rotations. Since the determinant is 1, these are area-preserving scalings: for $t > 0$, expanding by a factor of $e^{t}$ in the direction $x = y$ and contracting by the reciprocal factor $e^{-t}$ in the direction $x = -y$; the reverse holds for $t < 0$. The trajectories are the semi-hyperbolas $x^2 - y^2 = c$ and the four rays $x = \pm y$. The origin is fixed.

9.4.37.
(a) $\begin{pmatrix} e^{2t} & 0 & 0 \\ 0 & e^{t} & 0 \\ 0 & 0 & 1 \end{pmatrix}$: scalings by a factor $\lambda = e^{t}$ in the $y$ direction and $\lambda^2 = e^{2t}$ in the $x$ direction. The trajectories are the semi-parabolas $x = c y^2$, $z = d$ for $c, d$ constant, and the half-lines $x \ne 0$, $y = 0$, $z = d$ and $x = 0$, $y \ne 0$, $z = d$. Points on the $z$ axis are left fixed.
(b) $\begin{pmatrix} 1 & 0 & t \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}$: shear transformations in the $x$ direction, with magnitude proportional to the $z$ coordinate. The trajectories are lines parallel to the $x$ axis. Points on the $xy$ plane are fixed.
(c) $\begin{pmatrix} \cos 2t & 0 & -\sin 2t \\ 0 & 1 & 0 \\ \sin 2t & 0 & \cos 2t \end{pmatrix}$: rotations around the $y$ axis. The trajectories are the circles $x^2 + z^2 = c$, $y = d$. Points on the $y$ axis are fixed.
(d) $\begin{pmatrix} \cos t & \sin t & 0 \\ -\sin t & \cos t & 0 \\ 0 & 0 & e^{t} \end{pmatrix}$: spiral motions around the $z$ axis. The trajectories are the positive and negative $z$ axes, circles in the $xy$ plane, and cylindrical spirals (helices) winding around the $z$ axis while going away from the $xy$ plane at an exponentially increasing rate. The only fixed point is the origin.
(e) $\begin{pmatrix} \cosh t & 0 & \sinh t \\ 0 & 1 & 0 \\ \sinh t & 0 & \cosh t \end{pmatrix}$: hyperbolic rotations in the $xz$ plane, cf. Exercise 9.4.36(e). The trajectories are the semi-hyperbolas $x^2 - z^2 = c$, $y = d$, and the rays $x = \pm z$, $y = d$. The points on the $y$ axis are fixed.

9.4.38. (a) Writing $C = \cos \sqrt3\, t$ and $S = \sin \sqrt3\, t$,
$$e^{tA} = \begin{pmatrix} \frac13 + \frac23 C & \frac13 - \frac13 C + \frac{1}{\sqrt3} S & -\frac13 + \frac13 C + \frac{1}{\sqrt3} S \\ \frac13 - \frac13 C - \frac{1}{\sqrt3} S & \frac13 + \frac23 C & -\frac13 + \frac13 C - \frac{1}{\sqrt3} S \\ -\frac13 + \frac13 C - \frac{1}{\sqrt3} S & -\frac13 + \frac13 C + \frac{1}{\sqrt3} S & \frac13 + \frac23 C \end{pmatrix}.$$
(b) The axis is the null eigenvector: $(1, 1, -1)^T$.

♥ 9.4.39.
(a) Given $c, d \in \mathbb{R}$ and $x, y \in \mathbb{R}^3$, we have
$$L_v[c\,x + d\,y] = v \times (c\,x + d\,y) = c\, v \times x + d\, v \times y = c\, L_v[x] + d\, L_v[y],$$
proving linearity.
(b) If $v = (a, b, c)^T$ and $x = (x_1, x_2, x_3)^T$, then $v \times x = (b x_3 - c x_2,\; c x_1 - a x_3,\; a x_2 - b x_1)^T$, and so
$$A_v = \begin{pmatrix} 0 & -c & b \\ c & 0 & -a \\ -b & a & 0 \end{pmatrix} = -A_v^T.$$
(c) Every 3 × 3 skew-symmetric matrix has the form $A_v$ for some $v = (a, b, c)^T$.
(d) $x \in \ker A_v$ if and only if $0 = A_v x = v \times x$, which happens if and only if $x = c\,v$ is parallel to v.
(e) If $v = r e_3$, then
$$A_{r e_3} = \begin{pmatrix} 0 & -r & 0 \\ r & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix} \quad\text{and hence}\quad e^{t A_{r e_3}} = \begin{pmatrix} \cos rt & -\sin rt & 0 \\ \sin rt & \cos rt & 0 \\ 0 & 0 & 1 \end{pmatrix}$$
represents a rotation by angle $rt$ around the z axis. More generally, given v with $r = \|v\|$, let Q be any rotation matrix such that $Q v = r e_3$. Then $Q(v \times x) = (Q v) \times (Q x)$, since rotations preserve the orthogonality of the cross product and its right-handedness. Thus, if $\dot x = v \times x$ and we set $y = Q x$, then $\dot y = r e_3 \times y$. We conclude that the solutions x(t) are obtained by rotating the solutions y(t), and so are given by rotations around the axis v.
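The claim in 9.4.39(e), that the exponential of the cross product matrix for $v = r e_3$ is a rotation by angle $rt$ about the z axis, can be checked numerically. The following is only a sketch in plain Python: the helper names (`cross_matrix`, `expm_series`, `rotation_z`) and the sample values of r and t are illustrative assumptions, and the exponential is approximated by a truncated power series rather than computed exactly.

```python
import math

def cross_matrix(v):
    """Return the 3x3 matrix A_v satisfying A_v x = v x x (cross product)."""
    a, b, c = v
    return [[0.0, -c, b],
            [c, 0.0, -a],
            [-b, a, 0.0]]

def matmul(A, B):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def expm_series(A, t, terms=40):
    """Approximate e^{tA} by summing the first `terms` terms of its power series."""
    result = [[1.0 if i == j else 0.0 for j in range(3)] for i in range(3)]
    term = [row[:] for row in result]          # current term (tA)^n / n!
    for n in range(1, terms):
        term = [[t * x / n for x in row] for row in matmul(term, A)]
        result = [[result[i][j] + term[i][j] for j in range(3)]
                  for i in range(3)]
    return result

def rotation_z(theta):
    """Rotation by angle theta around the z axis."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def max_abs_diff(A, B):
    """Largest entrywise difference between two 3x3 matrices."""
    return max(abs(A[i][j] - B[i][j]) for i in range(3) for j in range(3))

r, t = 2.0, 0.7                                # hypothetical sample values
E = expm_series(cross_matrix((0.0, 0.0, r)), t)
print(max_abs_diff(E, rotation_z(r * t)))      # should be at round-off level
```

The same `expm_series` helper can be applied to `cross_matrix(v)` for any other v to observe that the result fixes v itself, in line with part (d).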

♥ 9.4.40.
(a) The solution is $x(t) = \begin{pmatrix} x_0 \cos t - y_0 \sin t \\ x_0 \sin t + y_0 \cos t \\ z_0 \end{pmatrix}$, which is a rotation by angle t around the z axis. The trajectory of the point $(x_0, y_0, z_0)^T$ is the circle of radius $r_0 = \sqrt{x_0^2 + y_0^2}$ at height $z_0$ centered on the z axis. The points on the z axis, with $r_0 = 0$, are fixed.
(b) For the inhomogeneous system, the solution is $x(t) = \begin{pmatrix} x_0 \cos t - y_0 \sin t \\ x_0 \sin t + y_0 \cos t \\ z_0 + t \end{pmatrix}$, which is a screw motion. If $r_0 = 0$, the trajectory is the z axis; otherwise it is a helix of radius $r_0$, spiraling up the z axis.
(c) The solution to the linear system $\dot x = a \times x$ is $x(t) = R_t x_0$, where $R_t$ is a rotation through angle $t \|a\|$ around the axis a. The solution to the inhomogeneous system is the screw motion $x(t) = R_t x_0 + t\,a$.

♥ 9.4.41.
(a) Since $A = \begin{pmatrix} 0 & -c & b \\ c & 0 & -a \\ -b & a & 0 \end{pmatrix}$, we have $A^2 = \begin{pmatrix} -b^2 - c^2 & ab & ac \\ ab & -a^2 - c^2 & bc \\ ac & bc & -a^2 - b^2 \end{pmatrix}$, while $A^3 = -(a^2 + b^2 + c^2) A = -A$. Therefore, by induction, $A^{2m+1} = (-1)^m A$ and $A^{2m} = (-1)^{m-1} A^2$ for $m \ge 1$. Thus
$$e^{tA} = \sum_{n=0}^{\infty} \frac{t^n}{n!} A^n = I + \sum_{m=0}^{\infty} (-1)^m \frac{t^{2m+1}}{(2m+1)!}\, A - \sum_{m=1}^{\infty} (-1)^m \frac{t^{2m}}{(2m)!}\, A^2.$$
The power series are, respectively, those of $\sin t$ and $\cos t - 1$ (since the constant term doesn't appear), proving the formula.
(b) Since $u \ne 0$, the matrices $I, A, A^2$ are linearly independent and hence, by the Euler–Rodrigues formula, $e^{tA} = I$ if and only if $\cos t = 1$, $\sin t = 0$, so t must be an integer multiple of $2\pi$.
(c) If $v = r u$ with $r = \|v\|$, then
$$e^{t A_v} = e^{t r A_u} = I + (\sin t r)\, A_u + (1 - \cos t r)\, A_u^2 = I + \frac{\sin t\|v\|}{\|v\|}\, A_v + \frac{1 - \cos t\|v\|}{\|v\|^2}\, A_v^2,$$
which equals the identity matrix if and only if $t = 2k\pi / \|v\|$ for some integer k.

9.4.42. None of them commute:

"

"

"

"

2 0

0 0

2 0

0 0

0 1

0 0

0 1

0 0

0 −3

!

!

!

!

3 0

0 1

,

0 0

,

0 4

,

0 −3

!

−1 0

0 1

, ,

!#

0 1

3 0 1 0

!#

!#

1 0

=

!#

= =

=

!#

=

0 −2

0 −8

−3 0

−1 0 6 0

!

0 , 0 !

−2 , 0 !

0 , 3 !

"

"

"

"

0 , 1 !

"

0 , −6 259

2 0

0 0

2 0

0 0

0 1

0 0

0 −3 0 4

!

!

!

3 0 −1 0

0 −3

,

3 0

!#

,

0 1

1 0

,

0 4

−1 0

!

!

!#

!#

,

0 4

−1 0

,

0 1

1 0

0 −2

2 , 0

=

!

1 0

=

!#

!#

6 , 0

=

=

!

0 6

=

9 0 −5 0

!

0 , −1

!

0 , −9 !

0 , 5

A2v ,

20

1

0

13

0

1

0 0 2 1 B 7 C 0C A 5 = @ 0 0 0 A, 0 0 0 0 0 0 20 1 0 13 0 1 2 0 0 0 0 −2 0 0 −4 6B C B C7 B 0A5 = @ 0 0 0C 4@0 1 0A, @0 0 A, 2 0 0 0 0 0 −4 0 0 1 0 20 13 0 1 0 1 0 2 0 0 0 1 0 C B 6B C7 B C 4 @ 0 1 0 A , @ −1 0 0 A 5 = @ 1 0 0 A, 0 0 0 0 0 1 0 0 0 20 1 0 0 1 13 2 0 0 0 0 1 0 0 2 6B C B B C7 C 4 @ 0 1 0 A , @ 0 0 0 A 5 = @ 0 0 0 A, 0 0 0 1 0 0 −2 0 0 20 1 0 13 1 0 0 0 1 0 0 −2 2 0 0 6B C B B C7 0C 0A5 = @0 0 4@0 0 0A, @0 0 A, 0 0 0 2 0 0 0 0 −2 20 1 0 13 1 0 0 0 1 0 1 0 0 0 1 6B C B C7 C B 4 @ 0 0 0 A , @ −1 0 0 A 5 = @ 0 0 1 A, 0 0 0 0 0 1 0 0 0 0 1 13 1 0 20 1 0 0 0 0 1 0 0 1 B C B C7 6B 0C A, 4@0 0 0A, @0 0 0A5 = @0 0 0 0 −1 1 0 0 0 0 0 0 1 13 20 1 0 0 0 −2 0 1 0 0 0 −2 B C B C7 6B 0C 4@0 0 A , @ −1 0 0 A 5 = @ 0 0 −2 A, −2 2 0 0 0 1 2 0 0 0 1 13 1 0 20 0 0 −2 0 0 1 −4 0 0 6B B C C7 B 0C A , @ 0 0 0 A 5 = @ 0 0 0 A, 4@0 0 0 0 4 1 0 0 2 0 0 20 1 0 13 0 1 0 1 0 0 0 1 0 0 −1 6B C B C7 B 0 −1 C 4 @ −1 0 0 A , @ 0 0 0 A 5 = @ 0 A. 0 0 1 1 0 0 1 −1 0 2

6B 4@0

0 1 0

0 0C A, 0

0

B @0

0 0 0

9.4.43.
(a) If U, V are upper triangular, so are U V and V U, and hence so is $[U, V] = U V - V U$.
(b) If $A^T = -A$, $B^T = -B$, then
$$[A, B]^T = (A B - B A)^T = B^T A^T - A^T B^T = B A - A B = -[A, B].$$
(c) No.

♦ 9.4.44. The sum of
$$[[A, B], C] = (A B - B A) C - C (A B - B A) = A B C - B A C - C A B + C B A,$$
$$[[B, C], A] = (B C - C B) A - A (B C - C B) = B C A - C B A - A B C + A C B,$$
$$[[C, A], B] = (C A - A C) B - B (C A - A C) = C A B - A C B - B C A + B A C,$$
is clearly zero.
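The cancellation in 9.4.44 (the Jacobi identity for the matrix commutator) can also be observed on sample data. The sketch below uses plain Python with integer matrices so that the arithmetic is exact; the helper names, the matrix size, and the random seed are arbitrary illustrative choices, not part of the exercise.

```python
import random

def matmul(A, B):
    """Product of two square matrices stored as nested lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(A, B):
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def sub(A, B):
    return [[a - b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def bracket(A, B):
    """Matrix commutator [A, B] = AB - BA."""
    return sub(matmul(A, B), matmul(B, A))

def random_matrix(n, rng):
    """n x n matrix with small integer entries (exact arithmetic)."""
    return [[rng.randint(-5, 5) for _ in range(n)] for _ in range(n)]

rng = random.Random(0)                     # fixed seed, chosen arbitrarily
A, B, C = (random_matrix(3, rng) for _ in range(3))

# Sum of the three cyclic double commutators from Exercise 9.4.44.
jacobi_sum = add(add(bracket(bracket(A, B), C),
                     bracket(bracket(B, C), A)),
                 bracket(bracket(C, A), B))
print(jacobi_sum)                          # every entry is zero
```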

9.4.45. In the matrix system $\dfrac{dU}{dt} = A\,U$, the equations in the last row are $\dfrac{du_{nj}}{dt} = 0$ for $j = 1, \ldots, n$, and hence the last row of U(t) is constant. In particular, for the exponential matrix solution $U(t) = e^{tA}$, the last row must equal the last row of the identity matrix $U(0) = I$, which is $e_n^T$.

9.4.46. Write the matrix solution as $U(t) = \begin{pmatrix} V(t) & f(t) \\ g(t) & w(t) \end{pmatrix}$, where f(t) is a column vector, g(t) a row vector, and w(t) is a scalar function. Then the matrix system $\dfrac{dU}{dt} = A\,U$ decouples into
$$\frac{dV}{dt} = B\,V, \qquad \frac{df}{dt} = B f + c\,w, \qquad \frac{dg}{dt} = 0, \qquad \frac{dw}{dt} = 0,$$
with initial conditions $V(0) = I$, $f(0) = 0$, $g(0) = 0$, $w(0) = 1$. Thus $g \equiv 0$ and $w \equiv 1$ are constant, while $V(t) = e^{tB}$. The equation for f(t) becomes $\dfrac{df}{dt} = B f + c$, $f(0) = 0$, and the solution is $f(t) = \displaystyle\int_0^t e^{(t-s)B}\, c \; ds$, cf. Exercise 9.4.35.

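♦ 9.4.47.
(a) $\begin{pmatrix} x + t \\ y \end{pmatrix}$: translations in the x direction.
(b) $\begin{pmatrix} e^{t} x \\ e^{-2t} y \end{pmatrix}$: scaling in the x and y directions by respective factors $e^{t}$, $e^{-2t}$.
(c) $\begin{pmatrix} (x+1) \cos t - y \sin t - 1 \\ (x+1) \sin t + y \cos t \end{pmatrix}$: rotations around the point $\begin{pmatrix} -1 \\ 0 \end{pmatrix}$.
(d) $\begin{pmatrix} e^{t}(x+1) - 1 \\ e^{-t}(y+2) - 2 \end{pmatrix}$: scaling in the x and y directions, centered at the point $\begin{pmatrix} -1 \\ -2 \end{pmatrix}$, by reciprocal factors $e^{t}$, $e^{-t}$.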

9.5.1. The vibrational frequency is $\omega = \sqrt{21/6} \approx 1.87083$, and so the number of Hertz is $\omega/(2\pi) \approx .297752$.

9.5.2. We need $\dfrac{\omega}{2\pi} = \dfrac{1}{2\pi\sqrt{m}} = 20$, and so $m = \dfrac{1}{1600\,\pi^2} \approx .0000633257$.
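The arithmetic in 9.5.1 and 9.5.2 is easy to reproduce. This short Python sketch simply re-evaluates the quoted expressions; the variable names are illustrative, and the unit spring stiffness in the second computation is read off from the formula $\omega = 1/\sqrt{m}$ used above.

```python
import math

# Exercise 9.5.1: frequency omega = sqrt(21/6), converted to cycles per unit time.
omega = math.sqrt(21 / 6)
hertz = omega / (2 * math.pi)
print(round(omega, 5), round(hertz, 6))

# Exercise 9.5.2: solve 1 / (2 pi sqrt(m)) = 20 for the mass m.
m = 1 / (1600 * math.pi ** 2)
check = 1 / (2 * math.pi * math.sqrt(m))   # should recover 20 Hertz
print(m, check)
```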

9.5.3. (Graphs omitted.)
(a) Periodic of period π.
(b) Periodic of period 2.
(c) Periodic of period 12.
(d) Quasi-periodic.
(e) Periodic of period 120 π.
(f) Quasi-periodic.
(g) $\sin t \sin 3t = \tfrac12 (\cos 2t - \cos 4t)$, and so periodic of period π.

9.5.4. The minimal period is $\dfrac{\pi m}{2^{k-1}}$, where m is the least common multiple of q and s, while $2^{k}$ is the largest power of 2 appearing in both p and r.

9.5.5. (a) $\sqrt2$, $\sqrt7$; (b) 4, since each eigenvalue gives two linearly independent solutions;
(c) $u(t) = r_1 \cos(\sqrt2\, t - \delta_1) \begin{pmatrix} 2 \\ 1 \end{pmatrix} + r_2 \cos(\sqrt7\, t - \delta_2) \begin{pmatrix} -1 \\ 2 \end{pmatrix}$;
(d) The solution is periodic if only one frequency is excited, i.e., $r_1 = 0$ or $r_2 = 0$; all other solutions are quasiperiodic.

9.5.6. (a) 5, 10; (b) 4, since each eigenvalue gives two linearly independent solutions;
(c) $u(t) = r_1 \cos(5t - \delta_1) \begin{pmatrix} -3 \\ 4 \end{pmatrix} + r_2 \cos(10t - \delta_2) \begin{pmatrix} 4 \\ 3 \end{pmatrix}$;
(d) All solutions are periodic; when $r_1 \ne 0$, the period is $\tfrac25 \pi$, while when $r_1 = 0$ the period is $\tfrac15 \pi$.

9.5.7.
(a) $u(t) = r_1 \cos(t - \delta_1) + r_2 \cos(\sqrt5\, t - \delta_2)$, $\quad v(t) = r_1 \cos(t - \delta_1) - r_2 \cos(\sqrt5\, t - \delta_2)$;
(b) $u(t) = r_1 \cos(\sqrt{10}\, t - \delta_1) - 2 r_2 \cos(\sqrt{15}\, t - \delta_2)$, $\quad v(t) = 2 r_1 \cos(\sqrt{10}\, t - \delta_1) + r_2 \cos(\sqrt{15}\, t - \delta_2)$;
(c) $u(t) = (\, r_1 \cos(t - \delta_1),\; r_2 \cos(2t - \delta_2),\; r_3 \cos(3t - \delta_3) \,)^T$;
(d) $u(t) = r_1 \cos(\sqrt2\, t - \delta_1) \begin{pmatrix} 1 \\ 1 \\ 0 \end{pmatrix} + r_2 \cos(3t - \delta_2) \begin{pmatrix} -1 \\ 1 \\ 1 \end{pmatrix} + r_3 \cos(\sqrt{12}\, t - \delta_3) \begin{pmatrix} 1 \\ -1 \\ 2 \end{pmatrix}$.

9.5.8. The system has stiffness matrix $K = (1 \;\; -1) \begin{pmatrix} c_1 & 0 \\ 0 & c_2 \end{pmatrix} \begin{pmatrix} 1 \\ -1 \end{pmatrix} = (c_1 + c_2)$, and so the dynamical equation is $m \ddot u + (c_1 + c_2)\, u = 0$, which is the same as a mass connected to a single spring with stiffness $c = c_1 + c_2$.

9.5.9. Yes. For example, $c_1 = 16$, $c_2 = 36$, $c_3 = 37$ leads to $K = \begin{pmatrix} 52 & -36 \\ -36 & 73 \end{pmatrix}$ with eigenvalues $\lambda_1 = 25$, $\lambda_2 = 100$, and hence natural frequencies $\omega_1 = 5$, $\omega_2 = 10$. Since $\omega_2$ is a rational multiple of $\omega_1$, every solution is periodic with period $\tfrac25 \pi$ or $\tfrac15 \pi$. Further examples can be constructed by solving the matrix equation $K = \begin{pmatrix} c_1 + c_2 & -c_2 \\ -c_2 & c_2 + c_3 \end{pmatrix} = Q^T \Lambda Q$ for $c_1, c_2, c_3$, where Λ is a diagonal matrix with entries $\omega^2$, $r^2 \omega^2$, where r is any rational number and Q is a suitable orthogonal matrix, making sure that the resulting stiffnesses are all positive: $c_1, c_2, c_3 > 0$.

♠ 9.5.10. (a) The vibrations slow down. (b) The vibrational frequencies are $\omega_1 = .44504$, $\omega_2 = 1.24698$, $\omega_3 = 1.80194$, each of which is a bit smaller than the fixed end case, which has frequencies $\omega_1 = \sqrt{2 - \sqrt2} = .76537$, $\omega_2 = \sqrt2 = 1.41421$, $\omega_3 = \sqrt{2 + \sqrt2} = 1.84776$. (c) Graphing the motions of the three masses for $0 \le t \le 50$, with bottom support and without bottom support. (Graphs omitted.)
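The example in 9.5.9 can be confirmed with the closed-form eigenvalues of a symmetric 2 × 2 matrix. The following is a minimal sketch; the function name `sym2_eigenvalues` is an illustrative helper, not something defined in the text.

```python
import math

def sym2_eigenvalues(a, b, d):
    """Eigenvalues of the symmetric matrix [[a, b], [b, d]], smallest first."""
    mean = (a + d) / 2
    radius = math.hypot((a - d) / 2, b)
    return mean - radius, mean + radius

# Stiffnesses from Exercise 9.5.9 and the resulting K = [[c1+c2, -c2], [-c2, c2+c3]].
c1, c2, c3 = 16, 36, 37
lam1, lam2 = sym2_eigenvalues(c1 + c2, -c2, c2 + c3)
omega1, omega2 = math.sqrt(lam1), math.sqrt(lam2)
print(lam1, lam2, omega1, omega2)

# Periods of the two normal modes.
print(2 * math.pi / omega1, 2 * math.pi / omega2)
```

Trying other triples $c_1, c_2, c_3$ in the same way shows how special the rational frequency ratio is: a generic choice gives an irrational ratio and hence quasi-periodic motion.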

♠ 9.5.11.
(a) The vibrational frequencies and eigenvectors are
$$\omega_1 = \sqrt{2 - \sqrt2} = .7654, \qquad \omega_2 = \sqrt2 = 1.4142, \qquad \omega_3 = \sqrt{2 + \sqrt2} = 1.8478,$$
$$v_1 = \begin{pmatrix} 1 \\ \sqrt2 \\ 1 \end{pmatrix}, \qquad v_2 = \begin{pmatrix} -1 \\ 0 \\ 1 \end{pmatrix}, \qquad v_3 = \begin{pmatrix} 1 \\ -\sqrt2 \\ 1 \end{pmatrix}.$$
Thus, in the slowest mode, all three masses are moving in the same direction, with the middle mass moving $\sqrt2$ times farther; in the middle mode, the two outer masses are moving in opposing directions by equal amounts, while the middle mass remains still; in the fastest mode, the two outer masses are moving in tandem, while the middle mass is moving farther in an opposing direction.
(b) The vibrational frequencies and eigenvectors are
$$\omega_1 = .4450, \qquad \omega_2 = 1.2470, \qquad \omega_3 = 1.8019,$$
$$v_1 = \begin{pmatrix} .3280 \\ .5910 \\ .7370 \end{pmatrix}, \qquad v_2 = \begin{pmatrix} .7370 \\ .3280 \\ -.5910 \end{pmatrix}, \qquad v_3 = \begin{pmatrix} -.5910 \\ .7370 \\ -.3280 \end{pmatrix}.$$
Thus, in the slowest mode, all three masses are moving in the same direction, each slightly farther than the one above it; in the middle mode, the top two masses are moving in the same direction, while the bottom, free mass moves in the opposite direction; in the fastest mode, the top and bottom masses are moving in the same direction, while the middle mass is moving in an opposing direction.

♥ 9.5.12. Let c be the common spring stiffness. The stiffness matrix K is tridiagonal with all diagonal entries equal to $2c$ and all sub- and super-diagonal entries equal to $-c$. Thus, by Exercise 8.2.48, the vibrational frequencies are
$$\sqrt{2c \left( 1 - \cos \frac{k\pi}{n+1} \right)} = 2 \sqrt{c}\, \sin \frac{k\pi}{2(n+1)} \qquad \text{for } k = 1, \ldots, n.$$
As $n \to \infty$, the frequencies form a denser and denser set of points on the graph of $2\sqrt{c}\, \sin\theta$ for $0 \le \theta \le \tfrac12 \pi$.

♣ 9.5.13. We take "fastest" to mean that the slowest vibrational frequency is as large as possible. Keep in mind that, for a chain between two fixed supports, completely reversing the order of the springs does not change the frequencies. For the indicated springs connecting 2 masses to fixed supports, the order 2, 1, 3 or its reverse, 3, 1, 2 is the fastest, with frequencies 2.14896, 1.54336. For the order 1, 2, 3, the frequencies are 2.49721, 1.32813, while for 1, 3, 2 the lowest frequency is the slowest, at 2.74616, 1.20773. Note that as the lower frequency slows down, the higher one speeds up. In general, placing the weakest spring in the middle leads to the fastest overall vibrations.
For a system of n springs with stiffnesses $c_1 > c_2 > \cdots > c_n$, when the bottom mass is unattached, the fastest vibration, as measured by the minimal vibrational frequency, occurs when the springs are connected in order $c_1, c_2, \ldots, c_n$ from stiffest to weakest, with the strongest attached to the support. For fixed supports, numerical computations show that the fastest vibrations occur when the springs are attached in the order $c_n, c_{n-3}, c_{n-5}, \ldots, c_3, c_1, c_2, c_4, \ldots, c_{n-1}$ when n is odd, and $c_n, c_{n-1}, c_{n-4}, c_{n-6}, \ldots, c_4, c_2, c_1, c_3, \ldots, c_{n-5}, c_{n-3}, c_{n-2}$ when n is even. Finding analytic proofs of these observations appears to be a challenge.
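The two-mass comparison quoted in 9.5.13 can be reproduced by brute force over all orderings of the three springs. This sketch assumes unit masses and spring stiffnesses 1, 2, 3 (the values consistent with the frequencies quoted above); the function name `frequencies` is an illustrative helper.

```python
import math
from itertools import permutations

def frequencies(c_top, c_mid, c_bot):
    """Vibrational frequencies (high, low) of two unit masses joined to fixed
    supports by springs of stiffness c_top, c_mid, c_bot, in that order."""
    a, b, d = c_top + c_mid, -c_mid, c_mid + c_bot   # K = [[a, b], [b, d]]
    mean = (a + d) / 2
    radius = math.hypot((a - d) / 2, b)
    return math.sqrt(mean + radius), math.sqrt(mean - radius)

# Tabulate every ordering of the three springs.
for order in permutations((1, 2, 3)):
    high, low = frequencies(*order)
    print(order, round(high, 5), round(low, 5))

# "Fastest" means the lowest frequency is as large as possible.
best = max(permutations((1, 2, 3)), key=lambda o: frequencies(*o)[1])
print("fastest order:", best)
```

Orderings that are reverses of each other give identical frequencies, so the search reports only the first of each tied pair.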

♣ 9.5.14.
(a) $$\frac{d^2 u}{dt^2} + \begin{pmatrix} \tfrac32 & -\tfrac12 & -1 & 0 \\ -\tfrac12 & \tfrac32 & 0 & 0 \\ -1 & 0 & \tfrac32 & \tfrac12 \\ 0 & 0 & \tfrac12 & \tfrac32 \end{pmatrix} u = 0, \qquad\text{where}\quad u(t) = \begin{pmatrix} u_1(t) \\ v_1(t) \\ u_2(t) \\ v_2(t) \end{pmatrix}$$
are the horizontal and vertical displacements of the two free nodes.
(b) 4;
(c) $\omega_1 = \sqrt{1 - \tfrac12 \sqrt2} = .541196$, $\omega_2 = \sqrt{2 - \tfrac12 \sqrt2} = 1.13705$, $\omega_3 = \sqrt{1 + \tfrac12 \sqrt2} = 1.30656$, $\omega_4 = \sqrt{2 + \tfrac12 \sqrt2} = 1.64533$;
(d) the corresponding eigenvectors are
$$v_1 = \begin{pmatrix} -1 - \sqrt2 \\ -1 \\ -1 - \sqrt2 \\ 1 \end{pmatrix} = \begin{pmatrix} -2.4142 \\ -1 \\ -2.4142 \\ 1 \end{pmatrix}, \qquad v_2 = \begin{pmatrix} -1 + \sqrt2 \\ 1 \\ 1 - \sqrt2 \\ 1 \end{pmatrix} = \begin{pmatrix} .4142 \\ 1 \\ -.4142 \\ 1 \end{pmatrix},$$
$$v_3 = \begin{pmatrix} -1 + \sqrt2 \\ -1 \\ -1 + \sqrt2 \\ 1 \end{pmatrix} = \begin{pmatrix} .4142 \\ -1 \\ .4142 \\ 1 \end{pmatrix}, \qquad v_4 = \begin{pmatrix} -1 - \sqrt2 \\ 1 \\ 1 + \sqrt2 \\ 1 \end{pmatrix} = \begin{pmatrix} -2.4142 \\ 1 \\ 2.4142 \\ 1 \end{pmatrix}.$$
In the first mode, the left corner moves down and to the left, while the right corner moves up and to the left, and then they periodically reverse directions; the horizontal motion is proportionately 2.4 times the vertical. In the second mode, both corners periodically move up and towards the center line and then down and away; the vertical motion is proportionately 2.4 times the horizontal. In the third mode, the left corner first moves down and to the right, while the right corner moves up and to the right, periodically reversing their directions; the vertical motion is proportionately 2.4 times the horizontal. In the fourth mode, both corners periodically move up and away from the center line and then down and towards it; the horizontal motion is proportionately 2.4 times the vertical.
(e) $u(t) = \dfrac{1}{4\sqrt2} \bigl( -\cos(\omega_1 t)\, v_1 + \cos(\omega_2 t)\, v_2 + \cos(\omega_3 t)\, v_3 - \cos(\omega_4 t)\, v_4 \bigr)$, which is a quasiperiodic combination of all four normal modes.

9.5.15. The system has periodic solutions whenever A has a complex conjugate pair of purely imaginary eigenvalues. Thus, a quasi-periodic solution requires two such pairs, ± i ω 1 and ± i ω2 , with the ratio ω1 /ω2 an irrational number. The smallest dimension where this can occur is 4.

9.5.16.
(a) $u(t) = a t + b + 2 r \cos(\sqrt5\, t - \delta)$, $\quad v(t) = -2 a t - 2 b + r \cos(\sqrt5\, t - \delta)$.
The unstable mode consists of the terms with a in them; it will not be excited if the initial conditions satisfy $\dot u(t_0) - 2\, \dot v(t_0) = 0$.
(b) $u(t) = -3 a t - 3 b + r \cos(\sqrt{10}\, t - \delta)$, $\quad v(t) = a t + b + 3 r \cos(\sqrt{10}\, t - \delta)$.
The unstable mode consists of the terms with a in them; it will not be excited if the initial conditions satisfy $-3\, \dot u(t_0) + \dot v(t_0) = 0$.
(c)
$$u(t) = -2 a t - 2 b - \frac{1 - \sqrt{13}}{4}\, r_1 \cos\left( \sqrt{\frac{7 + \sqrt{13}}{2}}\; t - \delta_1 \right) - \frac{1 + \sqrt{13}}{4}\, r_2 \cos\left( \sqrt{\frac{7 - \sqrt{13}}{2}}\; t - \delta_2 \right),$$
$$v(t) = -2 a t - 2 b + \frac{3 - \sqrt{13}}{4}\, r_1 \cos\left( \sqrt{\frac{7 + \sqrt{13}}{2}}\; t - \delta_1 \right) + \frac{3 + \sqrt{13}}{4}\, r_2 \cos\left( \sqrt{\frac{7 - \sqrt{13}}{2}}\; t - \delta_2 \right),$$
$$w(t) = a t + b + r_1 \cos\left( \sqrt{\frac{7 + \sqrt{13}}{2}}\; t - \delta_1 \right) + r_2 \cos\left( \sqrt{\frac{7 - \sqrt{13}}{2}}\; t - \delta_2 \right).$$
The unstable mode is the term containing a; it will not be excited if the initial conditions satisfy $-2\, \dot u(t_0) - 2\, \dot v(t_0) + \dot w(t_0) = 0$.
(d) $u(t) = (a_1 - 2 a_2)\, t + b_1 - 2 b_2 + r \cos(\sqrt6\, t - \delta)$,
$v(t) = a_1 t + b_1 - r \cos(\sqrt6\, t - \delta)$,
$w(t) = a_2 t + b_2 + 2 r \cos(\sqrt6\, t - \delta)$.
The unstable modes consist of the terms with $a_1$ and $a_2$ in them; they will not be excited if the initial conditions satisfy $\dot u(t_0) + \dot v(t_0) = 0$ and $-2\, \dot u(t_0) + \dot w(t_0) = 0$.

9.5.17.
(a) $Q = \begin{pmatrix} -\tfrac{1}{\sqrt2} & \tfrac{1}{\sqrt2} & 0 \\ 0 & 0 & 1 \\ \tfrac{1}{\sqrt2} & \tfrac{1}{\sqrt2} & 0 \end{pmatrix}$, $\quad \Lambda = \begin{pmatrix} 4 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 2 \end{pmatrix}$.
(b) Yes, because K is symmetric and has all positive eigenvalues.
(c) $u(t) = \left( \cos\sqrt2\, t,\; \tfrac{1}{\sqrt2} \sin\sqrt2\, t,\; \cos\sqrt2\, t \right)^T$.
(d) The solution u(t) is periodic with period $\sqrt2\, \pi$.
(e) No, since the frequencies $2$, $\sqrt2$ are not rational multiples of each other, the general solution is quasi-periodic.

9.5.18.
(a) $Q = \begin{pmatrix} \tfrac{1}{\sqrt3} & -\tfrac{1}{\sqrt2} & \tfrac{1}{\sqrt6} \\ -\tfrac{1}{\sqrt3} & 0 & \tfrac{2}{\sqrt6} \\ \tfrac{1}{\sqrt3} & \tfrac{1}{\sqrt2} & \tfrac{1}{\sqrt6} \end{pmatrix}$, $\quad \Lambda = \begin{pmatrix} 3 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 0 \end{pmatrix}$.
(b) No: K is only positive semi-definite.
(c) $u(t) = \begin{pmatrix} \tfrac13 (t+1) + \tfrac23 \cos\sqrt3\, t - \tfrac{1}{3\sqrt3} \sin\sqrt3\, t \\ \tfrac23 (t+1) - \tfrac23 \cos\sqrt3\, t + \tfrac{1}{3\sqrt3} \sin\sqrt3\, t \\ \tfrac13 (t+1) + \tfrac23 \cos\sqrt3\, t - \tfrac{1}{3\sqrt3} \sin\sqrt3\, t \end{pmatrix}$.
(d) The solution u(t) is unstable, and becomes unbounded as $|t| \to \infty$.
(e) No: the general solution is also unbounded.

9.5.19. The solution to the initial value problem $m \dfrac{d^2 u}{dt^2} + \varepsilon\, u = 0$, $u(t_0) = a$, $\dot u(t_0) = b$, is
$$u_\varepsilon(t) = a \cos\left( \sqrt{\frac{\varepsilon}{m}}\, (t - t_0) \right) + b \sqrt{\frac{m}{\varepsilon}}\, \sin\left( \sqrt{\frac{\varepsilon}{m}}\, (t - t_0) \right).$$
In the limit as $\varepsilon \to 0$, using the fact that $\lim_{h \to 0} \dfrac{\sin c h}{h} = c$, we find $u_\varepsilon(t) \to a + b (t - t_0)$, which is the solution to the unrestrained initial value problem $m \ddot u = 0$, $u(t_0) = a$, $\dot u(t_0) = b$. Thus, as the spring stiffness goes to zero, the motion converges to the unrestrained motion. However, since the former solution is periodic, while the latter moves along a straight line, the convergence is non-uniform on all of $\mathbb{R}$ and the solutions are close only for a period of time: if you wait long enough they will diverge.
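The non-uniform convergence described in 9.5.19 shows up clearly in a quick numerical experiment. In this sketch the values m = 1, a = 1, b = 2, t0 = 0 and the sample times are arbitrary illustrative assumptions; only the two solution formulas come from the text.

```python
import math

def u_eps(t, eps, m=1.0, a=1.0, b=2.0, t0=0.0):
    """Solution of m u'' + eps u = 0 with u(t0) = a, u'(t0) = b."""
    w = math.sqrt(eps / m)
    return a * math.cos(w * (t - t0)) + b * math.sin(w * (t - t0)) / w

def u_free(t, a=1.0, b=2.0, t0=0.0):
    """Solution of the unrestrained problem m u'' = 0 with the same data."""
    return a + b * (t - t0)

# At a fixed early time the error shrinks with eps, while at a late time
# the two solutions remain far apart even for a very weak spring.
for eps in (1e-2, 1e-4, 1e-6):
    early = abs(u_eps(1.0, eps) - u_free(1.0))
    late = abs(u_eps(1000.0, eps) - u_free(1000.0))
    print(eps, early, late)
```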

♠ 9.5.20.

r

r √ √ 1 3 − + (a) Frequencies: ω1 = 5 = .61803, ω2 = 1, ω3 = 2 2 5 = 1.618034; √ 1 √ 1 1 0 0 −1 2− 5 2+ 5 C B B B B −1 C C C B −1√ C 1√ C C, v2 = B C; unstable C , v3 = B B stable eigenvectors: v1 = B @ −1 A @ −2 + 5 A @ −2 − 5 A 1 1 1 0 1 1 B −1 C C C. In the lowest frequency mode, the nodes vibrate up and toB eigenvector: v4 = B @ 1A 1 wards each other and then down and away, the horizontal motion being less pronounced than the vertical; in the next mode, the nodes vibrate in the directions of the diagonal bars, with one moving towards the support while the other moves away; in the highest frequency mode, they vibrate up and away from each other and then down and towards, with the horizontal motion significantly more than the vertical; in the unstable mode the left node moves down and to the right, while the right hand node moves at the same rate up and to the right. (b) Frequencies: ω1 = .444569, ω2 = .758191, ω3 = 1.06792, ω4 = 1.757; eigenvectors: 0 1 0 1 0 1 0 1 .237270 −.122385 .500054 .823815 B B C B C B C −.117940 C C B .973375 C B .185046 C B .066249 C C, v2 = B B C , v3 = B C, v4 = B C. v1 = B @ .498965 A @ −.028695 A @ .666846 A @ −.552745 A .825123 .191675 −.520597 .106830 In the lowest frequency mode, the left node vibrates down and to the right, while the right hand node moves further up and to the right, then both reversing directions; in the second mode, the nodes vibrate up and to the right, and then down and to the left, the left node moving further; in the next mode, the left node vibrates up and to the right, while the right hand node moves further down and to the right, then both reversing directions; in the highest frequency mode, they move up and towards each other and then down and away, withrthe horizontal motionrmore than the vertical. r √ 3 2 20 = .4264, ω3 = 21 − 5 = 1.1399, ω = (c) Frequencies: ω1 = ω2 = 11 4 11 11 11 = r √ 3 1.3484, ω5 = 21 11 + 11 5 = 1.5871; stable eigenvectors: 3 2

1 2 0

0

√

1

0

1

0

√

1

1 1 01 −1 − 25 + 25 B 2 C C B 2 B 3C B0C B C C B 0C B 0 0 B C B B C C C B B C B B C C C B B0C −1 1 1 C, v = B √ C, v = B √ C; B C, v3 = B B B C C C B 1 4 5 5 5 1 1 B0C B− + B−3 C C C B− − B C 2 C 2 C B 2 B C B 2 @1A @ @ A A A @ 0 0 0 0 1 1 1 0 31 B 0C C B C B B −1 C C. In the two lowest frequency modes, the individunstable eigenvector: v6 = B B 3C C B @ 0A 1

01 B1C B C B C 0C B C, v2 = v1 = B B0C B C @0A 0 0

0


ual nodes vibrate horizontally and transverse to the swing; in the next lowest mode, the nodes vibrate together up and away from each other, and then down and towards each other; in the next mode, the nodes vibrate oppositely up and down, and towards and then away from each other; in the highest frequency mode, they also vibrate vibrate up and down in opposing motion, but in the same direction along the swing; in the unstable mode the left node moves down and in the direction of the bar, while the right hand node moves at the same rate up and in the same horizontal direction. ♥ 9.5.21. If the mass–spring molecule is allowed to move in space, then the vibrational modes and frequencies remain the same, while there are 14 independent solutions corresponding to the 7 modes of instability: 3 rigid translations, 3 (linearized) rotations, and 1 mechanism, which is the same as in the one-dimensional version. Thus, √ the general motion of the molecule in space is to vibrate quasi-periodically at frequencies 3 and 1, while simultaneously translating, rigidly rotating, and bending, all at a constant speed. ♠ 9.5.22. √ (a) There are 3 linearly independent normal modes of vibration: one of frequency 3, , in q which the triangle expands and contacts, and two of frequency 32 , , in which one of the edges expands and contracts while the opposite vertex moves out in the perpendicular direction while the edge is contracting, and in when it expands. (Although there are three such modes, the third is a linear combination of the other two.) There are 3 unstable null eigenmodes, corresponding to the planar rigid motions of the triangle. To avoid exciting the instabilities, the initial velocity must be orthogonal to the kernel; thus, if vi is the initial velocity of the ith mode, we require v1 + v2 + v3 = 0 and v1⊥ + v2⊥ + v3⊥ = 0 where vi⊥ denotes the angular component of the velocity vector with respect to the center of the triangle. 
√ (b) There are 4 normal modes of vibration, all of frequency 2, in which one of the edges expands and contracts while the two vertices not on the edge stay fixed. There are 4 unstable modes: 3 rigid motions and one mechanism where two opposite corners move towards each other while the other two move away from each other. To avoid exciting the instabilities, the initial velocity must be orthogonal to the kernel; thus, if the vertices are at ( ±1, ±1 )T and vi = ( vi , wi )T is the initial velocity of the ith mode, we require v1 + v2 = v3 + v4 = w1 + w4 = w2 + w3 = 0. √ (c) There are 6 normal modes of vibration: one of frequency 3, in which three nonadjacent edges expand and then contact,qwhile the other three edges simultaneously contract and then expand; two of frequency 52 , in which two opposite vertices move back and forth in the perpendicular direction to the line joining them (only two of these three q 3 modes are linearly independent); two of frequency 2 , in which two opposite vertices move back and forth towards each other (again, only two of these three modes are linearly independent); and one of frequency 1, in which the entire hexagon expands and contacts. There are 6 unstable modes: 3 rigid motions and 3 mechanisms where two opposite vertices move towards each other while the other four move away. As usual, to avoid exciting the instabilities, the initial velocity must be orthogonal to the kernel; “

thus, if the vertices are at cos 31 k π, sin 13 k π ity of the ith mode, we require v1 + v2 + v3 + v4 + v5 + v6 = 0, w1 + w2 + w3 + w4 + w5 + w6 = 0, √ √ 3 v5 + w5 + 3 v6 + w6 = 0,

”T

, and vi = ( vi , wi )T is the initial veloc−

√ √

3 v1 + w1 + 2 w2 = 0,

3 v1 + w1 + 2 w6 = 0, √ 2 w3 + 3 v4 + w4 = 0.

♠ 9.5.23. There are 6 linearly independent normal modes of vibration: one of frequency 2, in 267

√ which the tetrahedron expands and contacts; four of frequency 2, in which one of the edges expands and contracts while the opposite vertex stays fixed; and two of frequency √ 2, in which two opposite edges move towards and away from each other. (There are three different pairs, but the third mode is a linear combination of the other two.) There are 6 unstable null eigenmodes, corresponding to the three-dimensional rigid motions of the tetrahedron. To avoid exciting the instabilities, the initial velocity must be orthogonal to the kernel, and so, using the result of Exercise 6.3.13, if vi = ( ui , vi , wi )T is the initial velocity of the ith mode, we require √ √ √ − 2 u1 + 6 v1 − w1 − 2 2 u2 + w2 = 0, u1 + u2 + u3 + u4 = 0, √ √ v1 + v2 + v3 + v4 = 0, − 2 v1 + 3 u2 + v2 − 3 u3 + v3 = 0, √ √ √ w1 + w2 + w3 + w4 = 0, − 2 u1 − 6 v1 − w1 − 2 2 u3 + w3 = 0. q

♥ 9.5.24. (a) When $C = I$, then $K = A^T A$ and so the frequencies $\omega_i = \sqrt{\lambda_i}$ are the square roots of its positive eigenvalues, which, by definition, are the singular values of the reduced incidence matrix. (b) Thus, a structure with one or more very small frequencies $\omega_i \ll 1$, and hence one or more very slow vibrational modes, is almost unstable in that a small perturbation might create a null eigenvalue corresponding to an instability.

9.5.25. Since corng A is the orthogonal complement to ker A = ker K, the initial velocity is orthogonal to all modes of instability, and hence by Theorem 9.38, the solution remains bounded, vibrating around the fixed point prescribed by the initial position.

9.5.26. (a) u(t) = r1 cos

(d) u(t) =

√1 2

t − δ1

«

!

„

«

q 1 5 + r2 cos 3 t − δ2 2 ! „q « 2 8 + r2 cos t − δ 2 5 3

!

−3 , 1 ! −5 , 2

« √1 t − δ 1 3 √ 1 √ 1 !0 !0 r r √ √ 1+ 3 1− 3 3− 3 3+ 3 t − δ1 @ 2 A + r2 cos t − δ2 @ 2 A, r1 cos 2 2 1 1 0 1 0 1 0 1 3 0 ” ” −3 “√ “√ C C C 2 t − δ2 B 3 t − δ3 B r1 cos ( t − δ1 ) B @ 2 A + r3 cos @ 2 A, @ −1 A + r2 cos

(b) u(t) = r1 cos (c) u(t) =

„

„

1

1 ! 1 −1 2 (e) u(t) = r1 cos + r2 cos ( 2 t − δ2 ) , 3 t − δ1 1 1 0 0 0 1 1 1 2 −1 1 ” “√ B C C C 3 t − δ2 B (f ) u(t) = (a t + b) B @ −1 A + r1 cos ( t − δ1 ) @ 0 A + r2 cos @ −2 A. 1 1 1 „q

9.5.27. u1 (t) = u2 (t) = 9.5.28. u1 (t) = u2 (t) =

«

!

r √ √ √ 3− 3 3+1 3+ 3 √ cos cos t + t, 2 2 2 3 r r √ √ 1 1 √ cos 3−2 3 t − √ cos 3+2 3 t. 2 3 2 3 √ √ √ √ √ √ 5− 17 17−3 17+3 √ cos cos 5+2 17 t+ √ 2 2 17 2 17 √ √ √ √ √1 cos 5− 17 t − √1 cos 5+ 17 t. 2 2 17 17

√

3−1 √ 2 3

r

268

t,

1

♠ 9.5.29. The order does make a difference:

Mass order                  Frequencies
1, 3, 2  or  2, 3, 1        1.4943, 1.0867, .50281
1, 2, 3  or  3, 2, 1        1.5451, 1.0000, .52843
2, 1, 3  or  3, 1, 2        1.5848,  .9158, .56259

Note that, from top to bottom in the table, the fastest and slowest frequencies speed up, but the middle frequency slows down. ♣ 9.5.30. (a) We place the oxygen molecule at the origin, one hydrogen at ( 1, 0 )T and the other at 105 ( cos θ, sin θ )T = ( −0.2588, 0.9659 )T with θ = 180 π = 1.8326 radians. There are two independent vibrational modes, whose fundamental frequencies are ω1 = 1.0386, ω2 = 1.0229, with corresponding eigenvectors v1 = ( .0555, −.0426, −.7054, 0., −.1826, .6813 )T , v2 = ( −.0327, −.0426, .7061, 0., −.1827, .6820 )T . Thus, the (very slightly) higher frequency mode has one hydrogen atoms moving towards and the other away from the oxygen, which also slightly vibrates, and then reversing their motion, while in the lower frequency mode, they simultaneously move towards and then away from the oxygen atom. (b) We place the carbon atom at the origin and the chlorine atoms at «T „ √ 1 2 3 3 , 0, − 3

,

−

√

2 3 ,

r

2 1 3,−3

!T

,

−

√

2 3 ,

r

2 1 3,−3

!T

,

( 0, 0, 1 )T ,

which are the vertices of a unit tetrahedron. There are four independent vibrational modes, whose fundamental frequencies are ω1 = ω2 = ω3 = .139683, ω4 = .028571, with corresponding eigenvectors 1 0 0 1 1 0 1 0 .2248 −.5998 .9586 0. B −.7314 C B −.6510 C B B 0. C 0. C C B B C C B C B C B B C C B C B B −.1559 C B .6668 C C B C B 0. 0. C B B C C C B B B .1245 C B .0025 C B .4714 C B −.2191 C C B B C C C B B B B B B 0. C 0. C 0. C 0. C C B B C C C B B C B B C C B C B B −.0440 C B −.0009 C B .0775 C B −.1667 C C B B C C B C B B −.0318 C B −.1042 C B −.0548 C B −.2357 C C B B C C B C B C B C C B C B B B .1805 C, C C. B C B v = v1 = B .0949 .0551 , v = .4082 , v = 2 4 3 C B B C C B C B B −.0225 C B −.0737 C B −.0387 C B −.1667 C C B B C C B C B B .0246 C B .1130 C B −.0548 C B −.2357 C C B B C B C C B C B B C B C C B B .0427 C B .1957 C B −.0949 C B −.4082 C B C C B B C C B B .0174 C B .0799 C B −.0387 C B −.1667 C B C C B B C C B B C C B B C B C 0. 0. 0. 0. B C C B B C B C B C C B B C B C @ A @ @ A @ A 0. 0. 0. 0. A −.1714 0. .0401 .5000 The three high frequency modes are where two of the chlorine atoms remain fixed, while the other two vibrate in opposite directions into and away from the carbon atom, which slightly moves in the direction of the incoming atom. (Note: There are six possible pairs, but only three independent modes.) The low frequency mode is where the four chlorine atoms simultaneously move into and away from the carbon atom. (c) There are six independent vibrational modes, whose fundamental frequencies are ω 1 = 2.17533, ω2 = ω3 = 2.05542, ω4 = ω5 = 1.33239, ω6 = 1.12603. In all cases, the 269

bonds periodically lengthen and shorten. In the first mode, adjacent bonds have the opposite behavior; in the next two modes, two diametrically opposite bonds shorten while the other four bonds lengthen, and then the reverse happens; in the next two modes, two opposing nodes move in tandem along the line joining them while the other four twist accordingly; in the lowest frequency mode, the entire molecule expands and contracts. Note: The best way to understand the behavior is to run a movie of the different motions. 9.5.31. (a)

0 1 0 2 x 2 B 1C d B y1 C B B 0 C+B B dt2 @ x2 A @ −1

0 0 0 0

1 0 0 x 2 2 B 1C d B y1 C B B 0 B C+B dt2 @ x2 A @ −1

0 0 0 0

10

1

0

1

10

1

0

1

0 x1 −1 0 B B C C 0 0C B0C CB y1 C C = B C. CB 2 0 A@ x2 A @ 0 A y2 0 y2 0 0 0 √ Same vibrational frequencies: ω1 = 1, ω2 = 3, along with two unstable mechanisms corresponding to motions of either mass in the transverse direction. (b)

y2

0

r

−1 0 1 0

√

0 x1 0 B B C C 0C B0C CB y1 C CB C = B C. 0 A@ x2 A @ 0 A 0 0 y2 r

√

Same vibrational frequencies: ω1 = 3−2 5 , ω2 = 3+2 5 , along with two unstable mechanisms corresponding to motions of either mass in the transverse direction. (c) For a mass-spring chain with n masses, the two-dimensional motions are a combination of the same n one-dimensional vibrational motions in the longitudinal direction, coupled with n unstable motions of each individual mass in the transverse direction. 9.5.32. (a) −1 0 0 10 x1 1 0 0 1 B C B0C 0 0 0C CB y1 C B C CB C C 0 0 0 CB z1 C B B0C CB C = B C. B B0C C 2 0 0C CB x2 C B C 0 0 0 A@ y2 A @ 0 A 0 0 0 z2 0 √ Same vibrational frequencies: ω1 = 1, ω2 = 3, along with four unstable mechanisms corresponding to motions of either mass in the transverse directions. x1 1 0 2 B 0 By C B 1C 2 B C B d B z1 C B B 0 C+B B B −1 C x dt2 B B B 2C @ 0 @y A 2 0 z2 0

0 0 0 0 0 0

0 0 0 0 0 0

(b) x1 1 0 2 0 0 B 0 0 0 By C B 1C 2 B C B d B z1 C B B 0 0 0 C+B B B −1 0 0 C x dt2 B B B 2C @ 0 0 0 @y A 2 0 0 0 z2 r √ ω2 = 3+2 5 , along with 0

r

√

−1 0 0 1 0 0

0 0 0 0 0 0

0 10 x1 1 0 0 1 B C B0C 0C CB y1 C B C CB C C 0 CB z1 C B B0C CB C = B C. B C B0C 0C CB x2 C B C 0 A@ y2 A @ 0 A 0 z2 0

four unstable mechanisms corresponding to ω1 = 3−2 5 , motions of either mass in the transverse directions. (c) For a mass-spring chain with n masses, the three-dimensional motions are a combination of the same n one-dimensional vibrational motions in the longitudinal direction, coupled with 2 n unstable motions of each individual mass in the transverse directions. ♦ 9.5.33. K v = λ M v if and only if M −1 K v = λ v, and so the eigenvectors and eigenvalues are 270

the same. The characteristic equations are the same up to a multiple, since h

i

det(K − λ M ) = det M (M −1 K − λ I ) = det M det(P − λ I ). 9.5.34.

e d2 u d2 u fu e. = N = − N K u = − N K N −1 u = − K dt2 dt2 T −T T −T f is symmetric since K f =N Moreover, K K N = N −1 KN −1 since both N and K are symmetric. Positive definiteness follows since

(a) First,

fx eT K e =x e T N −1 K N −1 x e = xT K x > 0 x

for all

e = N x 6= 0. x

e = ω f produces two solutions e of K e 2 and corresponding eigenvector v (b) Each eigenvalue λ ( ) et cos ω fu e (t) = e to the modified system d2 u e /dt2 = − K e . The corresponding u v e sin ω t n o cos ω et e (t) = e and solutions to the original system are u(t) = N −1 u v, where ω = ω sin ω et −1 e v = N v. Finally, we observe that v is the generalized eigenvector for the generalized e of the matrix pair K, M . Indeed, K ev fv e = λ e implies K v = eigenvalue λ = ω 2 = λ −1 e N K N N v = λ v.

♦ 9.5.35. Let $v_1, \dots, v_k$ be the eigenvectors corresponding to the non-zero eigenvalues $\lambda_1 = \omega_1^2, \dots, \lambda_k = \omega_k^2$, and $v_{k+1}, \dots, v_n$ the null eigenvectors. Then the general solution to the vibrational system $\ddot u + K u = 0$ is
$$u(t) = \sum_{i=1}^{k} \bigl[\, c_i \cos(\omega_i (t - t_0)) + d_i \sin(\omega_i (t - t_0)) \,\bigr] v_i + \sum_{j=k+1}^{n} \bigl( p_j + q_j (t - t_0) \bigr) v_j ,$$
which represents a quasi-periodic vibration with frequencies $\omega_1, \dots, \omega_k$ around a linear motion with velocity $\sum_{j=k+1}^{n} q_j v_j$. Substituting into the initial conditions $u(t_0) = a$, $\dot u(t_0) = b$, and using orthogonality, we conclude that
$$c_i = \frac{\langle a, v_i\rangle}{\| v_i \|^2}, \qquad d_i = \frac{\langle b, v_i\rangle}{\omega_i \| v_i \|^2}, \qquad p_j = \frac{\langle a, v_j\rangle}{\| v_j \|^2}, \qquad q_j = \frac{\langle b, v_j\rangle}{\| v_j \|^2}.$$
In particular, the unstable modes are not excited if and only if all $q_j = 0$, which requires that the initial velocity b be orthogonal to the null eigenvectors $v_{k+1}, \dots, v_n$, which form a basis for the null eigenspace or kernel of K. This requires $\dot u(t_0) = b \in (\ker K)^\perp = \operatorname{corng} K = \operatorname{rng} K$, using the fundamental Theorem 2.49 and the symmetry of K.
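The coefficient formulas of Exercise 9.5.35 can be tried on a small example. The following is only a sketch: it assumes the Euclidean dot product and a hypothetical two-mass system with stiffness matrix K = [[1, -1], [-1, 1]] (not taken from the exercise), whose eigenvectors are (1, -1) with eigenvalue 2 and the null eigenvector (1, 1). The function name `modal_coefficients` is illustrative.

```python
import math

def dot(x, y):
    """Euclidean inner product of two equal-length sequences."""
    return sum(xi * yi for xi, yi in zip(x, y))

def modal_coefficients(a, b, vibrating, null):
    """Return ([(c_i, d_i)], [(p_j, q_j)]) for u(t0) = a, u'(t0) = b.

    vibrating: list of (omega_i, v_i) with omega_i > 0
    null:      list of null eigenvectors v_j
    The eigenvectors are assumed to be mutually orthogonal.
    """
    cd = [(dot(a, v) / dot(v, v), dot(b, v) / (w * dot(v, v)))
          for w, v in vibrating]
    pq = [(dot(a, v) / dot(v, v), dot(b, v) / dot(v, v)) for v in null]
    return cd, pq

# Hypothetical example: K = [[1, -1], [-1, 1]].
vibrating = [(math.sqrt(2.0), (1.0, -1.0))]
null = [(1.0, 1.0)]

# The initial velocity b = (1, -1) is orthogonal to the null eigenvector,
# so the coefficient q of the unstable mode comes out as zero.
cd, pq = modal_coefficients((1.0, 0.0), (1.0, -1.0), vibrating, null)
print(cd, pq)
```

Choosing an initial velocity with a nonzero component along (1, 1) would instead give q different from zero, i.e. a steady drift of both masses.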

9.5.36.
(a) $u(t) = t\, e^{-3t}$; critically damped.
(b) $u(t) = e^{-t} \bigl( \cos 3t + \tfrac23 \sin 3t \bigr)$; underdamped.
(c) $u(t) = \tfrac14 \sin 4(t - 1)$; undamped.
(d) $u(t) = \tfrac{2\sqrt3}{9}\, e^{-3t/2} \sin \tfrac{3\sqrt3}{2} t$; underdamped.
(e) $u(t) = 4 e^{-t/2} - 2 e^{-t}$; overdamped.
(f) $u(t) = e^{-3t} (3 \cos t + 7 \sin t)$; underdamped.

9.5.37. The solution is $u(t) = \tfrac14 (v + 5)\, e^{-t} - \tfrac14 (v + 1)\, e^{-5t}$, where $v = \dot u(0)$ is the initial velocity. This vanishes when $e^{4t} = \dfrac{v+1}{v+5}$, which happens when $t = t_\star > 0$ provided $\dfrac{v+1}{v+5} > 1$, and so the initial velocity must satisfy $v < -5$.

9.5.38. (a) By Hooke's Law, the spring stiffness is k = 16/6.4 = 2.5. The mass is 16/32 = .5. The equations of motion are $.5\, \ddot u + 2.5\, u = 0$. The natural frequency is $\omega = \sqrt5 = 2.23607$. (b) The solution to the initial value problem $.5\, \ddot u + \dot u + 2.5\, u = 0$, $u(0) = 2$, $\dot u(0) = 0$, is $u(t) = e^{-t} (2 \cos 2t + \sin 2t)$. (c) The system is underdamped, and the vibrations are less rapid than the undamped system.

9.5.39. The undamped case corresponds to a center; the underdamped case to a stable focus; the critically damped case to a stable improper node; and the overdamped case to a stable node.

♦ 9.5.40. (a) The general solution has the form $u(t) = c_1 e^{-at} + c_2 e^{-bt}$ for some $0 < a < b$. If $c_1 = 0$, $c_2 \ne 0$, the solution does not vanish. Otherwise, $u(t) = 0$ if and only if $e^{(b-a)t} = -c_2/c_1$, which, since $e^{(b-a)t}$ is monotonic, happens for at most one time $t = t_\star$. (b) Yes, since the solution is $u(t) = (c_1 + c_2 t)\, e^{-at}$ for some $a > 0$, which, for $c_2 \ne 0$, only vanishes when $t = -c_1/c_2$.

9.5.41. The general solution to $m \dfrac{d^2 u}{dt^2} + \beta \dfrac{du}{dt} = 0$ is $u(t) = c_1 + c_2 e^{-\beta t/m}$. Thus, the mass approaches its equilibrium position $u_\star = c_1$, which can be anywhere, at an exponentially fast rate.

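9.6.1.
(a) $\cos 8t - \cos 9t = 2 \sin \tfrac12 t \, \sin \tfrac{17}{2} t$; fast frequency: $\tfrac{17}{2}$, beat frequency: $\tfrac12$.
[Graph omitted: plot over approximately −5 ≤ t ≤ 20, amplitude between −2 and 2.]

(b) $\cos 26t - \cos 24t = -2 \sin t \, \sin 25t$; fast frequency: 25, beat frequency: 1.
[Graph omitted: plot over approximately −2 ≤ t ≤ 10, amplitude between −2 and 2.]

(c) $\cos 10t + \cos 9.5t = 2 \cos .25t \, \cos 9.75t$; fast frequency: 9.75, beat frequency: .25.
[Graph omitted: plot over approximately 0 ≤ t ≤ 30, amplitude between −2 and 2.]

(d) $\cos 5t - \sin 5.2t = 2 \sin\bigl( .1t - \tfrac14 \pi \bigr) \sin\bigl( 5.1t - \tfrac14 \pi \bigr)$; fast frequency: 5.1, beat frequency: .1.
[Graph omitted: plot over approximately 0 ≤ t ≤ 70, amplitude between −2 and 2.]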
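The product forms in parts (a), (b) and (d) of Exercise 9.6.1 can be spot-checked numerically. The sketch below simply samples both sides on a grid and records the largest discrepancy; the grid and the helper name `max_gap` are arbitrary choices, not part of the exercise.

```python
import math

def max_gap(lhs, rhs, ts):
    """Largest absolute difference between two functions on the sample points."""
    return max(abs(lhs(t) - rhs(t)) for t in ts)

# Sample points covering -5 <= t <= 70 in steps of 0.01.
ts = [0.01 * n for n in range(-500, 7001)]

gap_a = max_gap(lambda t: math.cos(8 * t) - math.cos(9 * t),
                lambda t: 2 * math.sin(0.5 * t) * math.sin(8.5 * t), ts)
gap_b = max_gap(lambda t: math.cos(26 * t) - math.cos(24 * t),
                lambda t: -2 * math.sin(t) * math.sin(25 * t), ts)
gap_d = max_gap(lambda t: math.cos(5 * t) - math.sin(5.2 * t),
                lambda t: 2 * math.sin(0.1 * t - math.pi / 4)
                            * math.sin(5.1 * t - math.pi / 4), ts)
print(gap_a, gap_b, gap_d)
```

All three gaps should be at the level of floating-point rounding, which is consistent with the slow factor acting as the beat envelope and the fast factor as the carrier.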

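9.6.2.
(a) $u(t) = \tfrac{1}{27} \cos 3t - \tfrac{1}{27} \cos 6t$,
(b) $u(t) = \tfrac{35}{50}\, t\, e^{-3t} - \tfrac{4}{50}\, e^{-3t} + \tfrac{4}{50} \cos t + \tfrac{3}{50} \sin t$,
(c) $u(t) = \tfrac12 \sin 2t + e^{-t/2} \Bigl( \cos \tfrac{\sqrt{15}}{2} t - \tfrac{\sqrt{15}}{5} \sin \tfrac{\sqrt{15}}{2} t \Bigr)$,
(d) $u(t) = \cos 3t - \tfrac12\, t \cos 3t - \tfrac16 \sin 3t$,
(e) $u(t) = \tfrac15 \cos \tfrac12 t + \tfrac35 \sin \tfrac12 t + \tfrac95\, e^{-t} + e^{-t/2}$,
(f) $u(t) = -\tfrac{1}{10} \cos t + \tfrac15 \sin t + \tfrac14\, e^{-t} - \tfrac{3}{20}\, e^{-t/3}$.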

9.6.3.
(a) $u(t) = \tfrac13 \cos 4t + \tfrac23 \cos 5t + \tfrac15 \sin 5t$; undamped periodic motion with fast frequency 4.5 and beat frequency .5.
[Graph omitted: plot over approximately 0 ≤ t ≤ 30, amplitude between −1 and 1.]

(b) $u(t) = 3 \cos 5t + 4 \sin 5t - e^{-2t} \bigl( 3 \cos 6t + \tfrac{13}{3} \sin 6t \bigr)$; the transient is an underdamped motion; the persistent motion is periodic of frequency 5 and amplitude 5.
[Graph omitted: plot over approximately 0 ≤ t ≤ 10, amplitude between −4 and 4.]

(c) $u(t) = -\tfrac{60}{29} \cos 2t + \tfrac{5}{29} \sin 2t - \tfrac{56}{29}\, e^{-5t} + 8\, e^{-t}$; the transient is an overdamped motion; the persistent motion is periodic.
[Graph omitted: plot over approximately 0 ≤ t ≤ 25, values between −2 and 4.]

(d) $u(t) = \tfrac{1}{32} \sin 4t - \tfrac18\, t \cos 4t$; resonant, unbounded motion.
[Graph omitted: plot over approximately 0 ≤ t ≤ 25, values between −3 and 3.]

9.6.4. In general, by (9.102), the maximal allowable amplitude is
$$\alpha = \sqrt{ m^2 (\omega^2 - \eta^2)^2 + \beta^2 \eta^2 } = \sqrt{ 625\, \eta^4 - 49.9999\, \eta^2 + 1 },$$
which, in the particular cases, is (a) .0975, (b) .002, (c) .1025.

9.6.5. $\eta \le .14142$ or $\eta \ge .24495$.

9.6.6. $\beta \ge 5 \sqrt{2 - \sqrt3} = 2.58819$.

9.6.7. The solution to $.5\, \ddot u + \dot u + 2.5\, u = 2 \cos 2t$, $u(0) = 2$, $\dot u(0) = 0$, is
$$u(t) = \tfrac{4}{17} \cos 2t + \tfrac{16}{17} \sin 2t + e^{-t} \bigl( \tfrac{30}{17} \cos 2t - \tfrac{1}{17} \sin 2t \bigr) = .9701 \cos(2t - 1.3258) + 1.7657\, e^{-t} \cos(2t + .0333).$$
The solution consists of a persistent periodic vibration at the forcing frequency of 2, with a phase lag of $\tan^{-1} 4 = 1.32582$ and amplitude $4/\sqrt{17} = .97014$, combined with a transient vibration at the same frequency with exponentially decreasing amplitude.

♠ 9.6.8. (a) Yes, the same fast oscillations and beats can be observed graphically. For example, the graph of $\cos t - .5 \cos 1.1 t$ on the interval $0 \le t \le 300$ shows them.
[Graph omitted: plot over 0 ≤ t ≤ 300, values between −1.5 and 1.5.]
To prove this observation, we invoke the trigonometric identity
$$a \cos \eta t - b \cos \omega t = (a - b) \cos\Bigl( \tfrac{\omega + \eta}{2} t \Bigr) \cos\Bigl( \tfrac{\omega - \eta}{2} t \Bigr) + (a + b) \sin\Bigl( \tfrac{\omega + \eta}{2} t \Bigr) \sin\Bigl( \tfrac{\omega - \eta}{2} t \Bigr) = R(t) \cos\Bigl( \tfrac{\omega + \eta}{2} t - \theta(t) \Bigr),$$
where $R(t), \theta(t)$ are the polar coordinates of the point
$$\Bigl( (a - b) \cos\bigl( \tfrac{\omega - \eta}{2} t \bigr),\ (a + b) \sin\bigl( \tfrac{\omega - \eta}{2} t \bigr) \Bigr)^T = \bigl( R(t) \cos \theta(t),\ R(t) \sin \theta(t) \bigr)^T,$$
and represent, respectively, the envelope or slowly varying amplitude of the oscillations, i.e., the beats, and a periodically varying phase shift.
(b) Beats are still observed, but the larger $| a - b |$ is (as prescribed by the initial conditions) the less pronounced the variation in the beat envelope. Also, when $a \ne b$, the fast oscillations are no longer precisely periodic, but exhibit a slowly varying phase shift over the period of the beat envelope.

♦ 9.6.9. (a) $u(t) = \dfrac{\alpha (\cos \eta t - \cos \omega t)}{m (\omega^2 - \eta^2)}$, (b) $u(t) = \dfrac{\alpha t}{2 m \omega} \sin \omega t$. (c) Use l'Hôpital's rule, differentiating with respect to $\eta$, to compute
$$\lim_{\eta \to \omega} \frac{\alpha (\cos \eta t - \cos \omega t)}{m (\omega^2 - \eta^2)} = \lim_{\eta \to \omega} \frac{\alpha\, t \sin \eta t}{2 m \eta} = \frac{\alpha t}{2 m \omega} \sin \omega t.$$

♦ 9.6.10. Using the method of undetermined coefficients, we set $u_\star(t) = A \cos \eta t + B \sin \eta t$. Substituting into the differential equation (9.101), and then equating coefficients of $\cos \eta t$, $\sin \eta t$, we find
$$m (\omega^2 - \eta^2) A + \beta \eta B = \alpha, \qquad -\beta \eta A + m (\omega^2 - \eta^2) B = 0,$$
where we replaced $k = m \omega^2$. Thus,
$$A = \frac{\alpha\, m (\omega^2 - \eta^2)}{m^2 (\omega^2 - \eta^2)^2 + \beta^2 \eta^2}, \qquad B = \frac{\alpha \beta \eta}{m^2 (\omega^2 - \eta^2)^2 + \beta^2 \eta^2}.$$
We then put the resulting solution in phase-amplitude form $u_\star(t) = a \cos(\eta t - \varepsilon)$, where, according to (2.7), $A = a \cos \varepsilon$, $B = a \sin \varepsilon$, which implies (9.102–103).
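As a sanity check on the formulas for A and B in Exercise 9.6.10, the sketch below evaluates them for the data of Exercise 9.6.7 (m = .5, β = 1, k = 2.5, forcing 2 cos 2t). It assumes that (9.101) has the form m u'' + β u' + k u = α cos ηt; the function name `forced_response` is illustrative.

```python
import math

def forced_response(m, beta, k, eta, alpha):
    """Coefficients of the particular solution A cos(eta t) + B sin(eta t)
    of m u'' + beta u' + k u = alpha cos(eta t), plus amplitude and phase lag."""
    omega2 = k / m                                   # omega^2 = k / m
    denom = m**2 * (omega2 - eta**2)**2 + beta**2 * eta**2
    A = alpha * m * (omega2 - eta**2) / denom
    B = alpha * beta * eta / denom
    amplitude = math.hypot(A, B)                     # a, with A = a cos(eps)
    phase = math.atan2(B, A)                         # eps, with B = a sin(eps)
    return A, B, amplitude, phase

A, B, amp, eps = forced_response(m=0.5, beta=1.0, k=2.5, eta=2.0, alpha=2.0)
print(A, B, amp, eps)
```

For these values the sketch should reproduce the persistent part 4/17 cos 2t + 16/17 sin 2t quoted in Exercise 9.6.7, with amplitude 4/√17 and phase lag arctan 4.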

9.6.11. (a) Underdamped, (b) overdamped, (c) critically damped, (d) underdamped, (e) underdamped.

9.6.12.
(a) $u(t) = e^{-t/4} \cos \tfrac14 t + e^{-t/4} \sin \tfrac14 t$,
(b) $u(t) = \tfrac32\, e^{-t/3} - \tfrac12\, e^{-t}$,
(c) $u(t) = e^{-t/3} + \tfrac13\, t\, e^{-t/3}$,
(d) $u(t) = e^{-t/5} \cos \tfrac{1}{10} t + 2\, e^{-t/5} \sin \tfrac{1}{10} t$,
(e) $u(t) = e^{-t/2} \cos \tfrac{1}{2\sqrt3} t + \sqrt3\, e^{-t/2} \sin \tfrac{1}{2\sqrt3} t$.

9.6.13.
$$u(t) = \tfrac{165}{41}\, e^{-t/4} \cos \tfrac14 t - \tfrac{91}{41}\, e^{-t/4} \sin \tfrac14 t - \tfrac{124}{41} \cos 2t + \tfrac{32}{41} \sin 2t$$
$$= 4.0244\, e^{-.25 t} \cos .25 t - 2.2195\, e^{-.25 t} \sin .25 t - 3.0244 \cos 2t + .7805 \sin 2t.$$

9.6.14. The natural vibrational frequency is $\omega = 1/\sqrt{R\,C}$. If $\eta \ne \omega$ then the circuit experiences a quasi-periodic vibration as a combination of the two frequencies. As $\eta$ gets close to $\omega$, the current amplitude becomes larger and larger, exhibiting beats. When $\eta = \omega$, the circuit is in resonance, and the current amplitude grows without bound.

9.6.15. (a) .02, (b) 2.8126, (c) 26.25.

9.6.16. $\eta \le .03577$ or $\eta \ge .04382$.

9.6.17. $R \ge .10051$.

9.6.18. (a) u(t) = cos t (b) u(t) = sin 3 t (c) u(t) = (d) u(t) = (e) u(t) =

(f ) u(t) =

(g) u(t) =

1 2

! ! 3 ! √ √ −2 1 14 2 t − δ ) 3 t − δ ) + r cos( , + r cos(2 1 2 2 1 1 1 2 7 ! √ ! q 1 √ 5 +r cos(4 − √5 t − δ ) −1 − 2 +r cos( 4 + 5 t − δ ) 2 1 1 2 2 −1

t sin 2 t + 31 cos 2 t 3 4 t sin 2 t

!

+ r1 cos

0

17 t − δ1

”

−3 2

!

+ r2 cos(2 t − δ2 )

√

!

5 ,

!

2 , 3

1 ! ! 2 q −3 1 1 5 1 @ 17 A + r2 cos( √ t − δ2 ) cos 2 t + r1 cos( 3 t − δ1 ) , 2 1 2 − 12 17 0 1 1 0 ! ! 1 1 q 2 −5 1 8 3 6 @ A A @ √ + sin 2 t + r1 cos( 5 t − δ1 ) , t − δ2 ) + r2 cos( cos t 3 3 2 − 13 − 23 1 0 0 0 1 1 0 1 6 1 1 −1 B 11 C √ √ B B C C B C B 5 C B B C B C C B cos t@ 11 A+r1 cos( 12 t − δ1 )@ −1 A+r2 cos(3 t − δ2 )@ 1 A+r3 cos( 2 t − δ3 )@ 1 C A, 1 2 0 1 11 0 1 0 0 1 1 0 1 1 −3 3 0 B8C √ √ B B C C B C C B C C B C B 3 C+r1 cos( 3 t − δ1 )B cos tB @ 2 A+r2 cos( 2 t − δ2 )@ 2 A+r3 cos(t − δ3 )@ −1 A. @8A

1

0

9.6.19.

“√

−1 + 2

(a) The resonant frequencies are $\sqrt{\tfrac{3 - \sqrt3}{2}} = .796225$, $\sqrt{\tfrac{3 + \sqrt3}{2}} = 1.53819$.
(b) For example, a forcing function of the form $\cos\Bigl( \sqrt{\tfrac{3 + \sqrt3}{2}}\; t \Bigr) w$, where $w = ( w_1, w_2 )^T$ is not orthogonal to the eigenvector $( -1 - \sqrt3,\ 1 )^T$, so $w_2 \ne (1 + \sqrt3)\, w_1$, will excite resonance.

9.6.20. When the bottom support is removed, the resonant frequencies are $\tfrac12 \sqrt{5 - \sqrt{17}} = .468213$, $\tfrac12 \sqrt{5 + \sqrt{17}} = 1.51022$. When the top support is removed, the resonant frequencies are $\sqrt{\tfrac{2 - \sqrt2}{2}} = .541196$, $\sqrt{\tfrac{2 + \sqrt2}{2}} = 1.30656$. In both cases the vibrations are slower. The previous forcing function will not excite resonance.

♣ 9.6.21. In each case, you need to force the system by cos(ω t)a where ω 2 = λ is an eigenvalue and a is orthogonal to the corresponding eigenvector. In order not to excite an instability, a needs to also be orthogonal to the kernel of the stiffness matrix spanned by the unstable mode vectors. (a) Resonant frequencies: ω1 = .5412, ω2 = 1.1371, ω3 = 1.3066, ω4 = 1.6453; 0 0 1 0 1 1 0 1 .2706 .6533 − .6533 .2706 B C B C C B B .2706 C B .2706 C B − .6533 C B .6533 C C C, v3 = B C, v2 = B B C; C, v4 = B eigenvectors: v1 = B @ − .2706 A @ .6533 A @ .6533 A @ .2706 A .2706 .6533 .6533 − .2706 no unstable modes. (b) Resonant frequencies: ω1 = .4209, ω2 = 1 (double), ω3 = 1.2783, ω4 = 1.6801, ω5 = 1.8347; eigenvectors: 0 0 0 0 0 0 1 − .5000 1 .6626 1 .5000 1 0 .2470 1 01 B .1426 C B − .2887 C B − .3825 C B −1 C B − .2887 C B1C B C B C C B B B B C C C B C B C B B B B C C C .6626 C B .5000 C C B − .5000 C B .2470 C B0C B 0C b B C; C , v4 = B C, v2 = B C , v5 = B C, v2 = B C, v3 = B v1 = B B − .2887 C B − .1426 C B − .2887 C B1C B .3825 C B 1C B C B B C B C C C C B B @ @ .2852 A @ @0A @ − .7651 A @ 1A 0 A 0 A .5774 0 .5774 1 0 0 no unstable modes. (c) Resonant frequencies: ω1 = .3542, ω2 = .9727, ω3 = 1.0279, ω4 = 1.6894, ω5 = 1.7372; eigenvectors: 0 0 0 0 0 .1251 1 − .1160 1 .6889 1 − .0989 1 .3914 1 B − .6940 C B − .0706 C B .2009 C B .6780 C B .1158 C C B C B C B B C B C C B C B B C B B C 0 .2319 0 C 0 − .7829 B C C C B B C B B C C, v3 = B C, v4 = B C; C, v2 = B C , v5 = B v1 = B B .0744 C B B − .1549 C B − .9851 C B 0 C 0 C B C C B C C B C B B @ − .1251 A @ − .1160 A @ − .6889 A @ .0989 A @ .3914 A − .6940 − .6780 .1158 − .0706 − .2009 T unstable mode: z = ( 1, 0, 1, 0, 1, 0 ) . To avoid exciting the unstable mode, the initial ¦ velocity must be orthogonal to the null eigenvector: z · u(t0 ) = 0, i.e., there is no net horizontal velocity of the masses. √ √ (d) Resonant frequencies: ω1 = 32 = 1.22474 √(double), ω2 = 3 = 1.73205; 0

eigenvectors:

v1 =

1 B 2 √ B B− 3 2 B B 1 B B √2 B 3 B B 2 B B @ 1

0

0

1

0

1

C C C C C C C, C C C C C A

1 B0C B C B C 1C B C, unstable modes: z1 = B B0C B C @1A 0

b = v 1

1 − 23 C B B C B −1 C B √2 C B 3 C B C B 2 C, B C B −1 C B C 2 B C B C @ 0 A

1

0

1

0 B1C B C B C 0C B C, z2 = B B1C B C @0A 1

z3 =

0

1

0 B C B −2 C B √ C B C B− 3C C; v2 = B B C B 1 C B √ C B C @ 3 A 1

0

B− B B B B B B B B B @

√ 1 3 2 C 1 C C 2 C C 0 C C. 1 C C C 0 A

0 To avoid exciting the unstable modes, the initial velocity must be orthogonal to the null ¦ ¦ ¦ eigenvectors: z1 · u(t0 ) = z2 · u(t0 ) = z3 · u(t0 ) = 0, i.e., there is no net linear or angular velocity of the three masses. √ (e) Resonant frequencies: ω1 = 1, ω2 = 3 = 1.73205; 276

0

1

0

1

0

1

1 1 1 B C C 0C eigenvectors: v1 = v2 = B @ −2 A; unstable mode: z = @ 1 A. A, −1 1 1 To avoid exciting the unstable mode, the initial velocity must be orthogonal to the null ¦ eigenvector: z · u(t0 ) = 0, i.e., there is no net horizontal velocity of the atoms. (f ) Resonant frequencies: ω1 = 1.0386, ω2 = 1.0229; 0 0 .0555 1 −.0327 1 B −.0426 C B −.0426 C C B C B C B C B B −.7054 C B .7061 C C; C, B eigenvectors: v1 = B v = 2 B B 0 C 0 C B C B C @ −.1826 A @ −.1827 A .6813 .6820 0 1 0 1 0 0 0 1 1 1 0 0 0 1 0 B1C B0C B 0 C B B0C C 0 B C B C B B B C C C B C B C B B B C C C 1 0 0 0 B C B C B 0 C B B C C C, z2 = B C, z3 = B C, z4 = B C=B C. unstable modes: z1 = B B0C C B1C B1C B 0 C B 0 B C C C B C B C B B ◦ @1A @0A @0A @ .9659 A @ sin 105 A 0 1 − cos 105◦ 0 .2588 To avoid exciting the unstable modes, the initial velocity must be orthogonal to the null ¦ ¦ ¦ ¦ eigenvectors: z1 · u(t0 ) = z2 · u(t0 ) = z3 · u(t0 ) = z4 · u(t0 ) = 0, i.e., there is no net linear velocity of the three atoms, and neither hydrogen atom has an angular velocity component around the oxygen. B @



10.1.1.
(a) $u^{(1)} = 2$, $u^{(10)} = 1024$, $u^{(20)} = 1048576$; unstable.
(b) $u^{(1)} = -.9$, $u^{(10)} = .348678$, $u^{(20)} = .121577$; asymptotically stable.
(c) $u^{(1)} = i$, $u^{(10)} = -1$, $u^{(20)} = 1$; stable.
(d) $u^{(1)} = 1 - 2i$, $u^{(10)} = 237 + 3116\, i$, $u^{(20)} = -9653287 + 1476984\, i$; unstable.
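The iterates in Exercise 10.1.1 can be reproduced by direct iteration. This is a minimal sketch that assumes the scalar system u(k+1) = λ u(k) with initial value u(0) = 1 (consistent with u(1) = λ in each part); the function names are illustrative.

```python
def iterate(lam, a, k):
    """Return u(k) for u(k+1) = lam * u(k), u(0) = a, by direct iteration."""
    u = a
    for _ in range(k):
        u = lam * u
    return u

def classify(lam):
    """Stability of the zero solution, decided by the modulus of lam."""
    r = abs(lam)
    if r < 1:
        return "asymptotically stable"
    if r == 1:
        return "stable"
    return "unstable"

for lam in (2, -0.9, 1j, 1 - 2j):
    print(lam, iterate(lam, 1, 10), iterate(lam, 1, 20), classify(lam))
```

For the integer and Gaussian-integer multipliers the products stay exactly representable, so the printed values agree digit for digit with parts (a), (c) and (d).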

10.1.2. (a) $u^{(k+1)} = 1.0325\, u^{(k)}$, $u^{(0)} = 100$, where $u^{(k)}$ represents the balance after k years. (b) $u^{(10)} = 1.0325^{10} \times 100 = 137.69$ dollars. (c) $u^{(k+1)} = (1 + .0325/12)\, u^{(k)} = 1.002708\, u^{(k)}$, $u^{(0)} = 100$, where $u^{(k)}$ represents the balance after k months. $u^{(120)} = (1 + .0325/12)^{120} \times 100 = 138.34$ dollars.

10.1.3. If r is the yearly interest rate, then $u^{(k+1)} = (1 + r/12)\, u^{(k)}$, where $u^{(k)}$ represents the balance after k months. Let $v^{(m)}$ denote the balance after m years, so $v^{(m)} = u^{(12 m)}$. Thus $v^{(m)}$ satisfies the iterative system $v^{(m+1)} = (1 + r/12)^{12}\, v^{(m)} = (1 + s)\, v^{(m)}$, where $s = (1 + r/12)^{12} - 1$ is the effective annual interest rate.

10.1.4. The balance after k years coming from compounding n times per year is $\bigl( 1 + \tfrac{r}{n} \bigr)^{n k} a \longrightarrow e^{r k} a$ as $n \to \infty$, by a standard calculus limit, [ 2, 58 ].
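The compounding formulas of Exercises 10.1.2 to 10.1.4 are easy to evaluate directly. The following sketch uses an illustrative helper `balance`; the choice of one million compounding periods per year to approximate the continuous limit is arbitrary.

```python
import math

def balance(principal, rate, periods_per_year, years):
    """Balance after compounding `periods_per_year` times per year."""
    growth = 1 + rate / periods_per_year
    return principal * growth ** (periods_per_year * years)

yearly = balance(100, 0.0325, 1, 10)          # Exercise 10.1.2(b)
monthly = balance(100, 0.0325, 12, 10)        # Exercise 10.1.2(c)
effective = (1 + 0.0325 / 12) ** 12 - 1       # Exercise 10.1.3: s
nearly_continuous = balance(100, 0.0325, 10**6, 10)
continuous = 100 * math.exp(0.0325 * 10)      # Exercise 10.1.4 limit
print(round(yearly, 2), round(monthly, 2), effective,
      nearly_continuous, continuous)
```

Monthly compounding beats yearly compounding by well under a dollar here, and the continuous limit adds only a few more cents.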

10.1.5. Since $u(t) = a\, e^{\alpha t}$ we have $u^{(k+1)} = u((k+1) h) = a\, e^{\alpha (k+1) h} = e^{\alpha h} \bigl( a\, e^{\alpha k h} \bigr) = e^{\alpha h}\, u^{(k)}$, and so $\lambda = e^{\alpha h}$. The stability properties are the same: since $| \lambda | = e^{h \operatorname{Re} \alpha}$ with $h > 0$, we have $| \lambda | < 1$ (equivalently $\operatorname{Re} \alpha < 0$) for asymptotic stability; $| \lambda | \le 1$ (equivalently $\operatorname{Re} \alpha \le 0$) for stability; and $| \lambda | > 1$ (equivalently $\operatorname{Re} \alpha > 0$) for an unstable system.

10.1.6. The solution $u^{(k)} = \lambda^k u^{(0)}$ is periodic of period m if and only if $\lambda^m = 1$, and hence $\lambda$ is an mth root of unity. Thus, $\lambda = e^{2 i k \pi / m}$ for some $k = 0, 1, 2, \dots, m - 1$. If k and m have a common factor, then the solution is of smaller period, and so the solutions of period exactly m are when k is relatively prime to m and $\lambda$ is a primitive mth root of unity, as defined in Exercise 5.7.7.

♠ 10.1.7. Let $\lambda = e^{i \theta}$ where $0 \le \theta < 2\pi$. The solution is then $u^{(k)} = a \lambda^k = a\, e^{i k \theta}$. If $\theta$ is a rational multiple of $\pi$, the solution is periodic, as in Exercise 10.1.6. When $\theta / \pi$ is irrational, the iterates eventually fill up (i.e., are dense in) the circle of radius $| a |$ in the complex plane.

10.1.8. $| u^{(k)} | = | \lambda |^k | a | > | v^{(k)} | = | \mu |^k | b |$ provided $k > \dfrac{\log | b | - \log | a |}{\log | \lambda | - \log | \mu |}$, where the inequality relies on the fact that $\log | \lambda | > \log | \mu |$.

10.1.9. The equilibrium solution is $u_\star = c/(1 - \lambda)$. Then $v^{(k)} = u^{(k)} - u_\star$ satisfies the homogeneous system $v^{(k+1)} = \lambda v^{(k)}$, and so $v^{(k)} = \lambda^k v^{(0)} = \lambda^k (a - u_\star)$. Thus, the solution to (10.5) is $u^{(k)} = \lambda^k (a - u_\star) + u_\star$. If $| \lambda | < 1$, then the equilibrium is asymptotically stable, with $u^{(k)} \to u_\star$ as $k \to \infty$; if $| \lambda | = 1$, it is stable, and solutions that start near $u_\star$ stay nearby; if $| \lambda | > 1$, it is unstable, and all non-equilibrium solutions become unbounded: $| u^{(k)} | \to \infty$.

10.1.10. Let $u^{(k)}$ represent the balance after k years. Then $u^{(k+1)} = 1.05\, u^{(k)} + 120$, with $u^{(0)} = 0$. The equilibrium solution is $u_\star = -120/.05 = -2400$, and so after k years the balance is $u^{(k)} = (1.05^k - 1) \cdot 2400$. Then $u^{(10)} = \$1{,}509.35$, $u^{(50)} = \$25{,}121.76$, $u^{(200)} = \$41{,}499{,}794$.

10.1.11. If $u^{(k)}$ represents the balance after k months, then $u^{(k+1)} = (1 + .05/12)\, u^{(k)} + 10$, $u^{(0)} = 0$. The balance after k months is $u^{(k)} = (1.0041667^k - 1) \cdot 2400$. Then $u^{(120)} = \$1{,}552.82$, $u^{(600)} = \$26{,}686.52$, $u^{(2400)} = \$51{,}774{,}174$.

♥ 10.1.12. The overall yearly growth rate is 1.2 − .05 = 1.15, and so the deer population satisfies the iterative system $u^{(k+1)} = 1.15\, u^{(k)} - 3600$, $u^{(0)} = 20000$. The equilibrium is $u_\star = 3600/.15 = 24000$. Since $\lambda = 1.15 > 1$, the equilibrium is unstable; if the initial number of deer is less than the equilibrium, the population will decrease to zero, while if it is greater, then the population will increase without limit. Two possible options: ban hunting for 2 years until the deer population reaches the equilibrium of 24,000 and then permit hunting at the current rate again; or, to keep the population at 20,000, allow hunting of only 3,000 deer per year. In both cases, the instability of the equilibrium makes it unlikely that the population will maintain a stable number, so constant monitoring of the deer population is required. (More realistic models incorporate nonlinear terms, and are less prone to such instabilities.)
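The closed-form solution of Exercise 10.1.9 can be compared against direct iteration of the affine system u(k+1) = λ u(k) + c. The sketch below does this for the savings and deer examples; the function names are illustrative.

```python
def affine_iterate(lam, c, a, k):
    """u(k) for u(k+1) = lam * u(k) + c, u(0) = a, by direct iteration."""
    u = a
    for _ in range(k):
        u = lam * u + c
    return u

def affine_closed_form(lam, c, a, k):
    """u(k) = lam^k (a - u_star) + u_star with u_star = c / (1 - lam)."""
    u_star = c / (1 - lam)
    return lam ** k * (a - u_star) + u_star

# Exercise 10.1.10: yearly deposits of 120 at 5% interest.
for years in (10, 50, 200):
    print(years, affine_iterate(1.05, 120, 0, years),
          affine_closed_form(1.05, 120, 0, years))

# Exercise 10.1.12: the deer population starting below the equilibrium 24000.
print([round(affine_iterate(1.15, -3600, 20000, k)) for k in range(6)])
```

Starting at 20000, the deer iterates move steadily away from the unstable equilibrium at 24000, as the solution describes.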

10.1.13.

− 3k + (−1)k 3k + (−1)k , v (k) = . 2 2 20 18 15 18 = − k + k , v (k) = − k + k . 3 2 3 √ √ √ √ √2 √ ( 5 + 2)(3 − 5)k + ( 5 − 2)(3 + 5)k (3 − 5)k − (3 + 5)k (k) √ √ = , v = . 2 5 2 5 18 1 1 27 = − 8 + k − k , v (k) = − 4 + k−1 , w(k) = k . 2 3 3 3 = 1 − 2k , v (k) = 1 + 2 (−1)k − 2k+1 , w(k) = 4 (−1)k − 2k .

(a) u(k) = (b) u(k) (c) u(k) (d) u(k) (e) u(k) 10.1.14. (a) u

(k)

(b) u

(k)

√ k − √2 = c1 (− 1 − 2) 1 0

= c1

“

0

1 2

(d) u(k)

3 2

i

+ c2 (− 1 + √ 1 ”k 5− i 3 “ 1 @ A+c 2 2 2 √

1

√

1

2)

− 0

k

√

3 2

√ ! 2 ; 1 0

i

”k

@

√ 1 5+ i 3 A 2

1

√

1

cos 13 k π + 23 sin 13 k π A + a @ 25 sin 13 k π − 23 cos 31 k π A; = a1 2 cos 31 k0 π 1 sin 31 k π 0 1 0 1 1 2 2 C kB C kB C = c1 B @ 2 A + c2 (−2) @ 3 A + c3 (− 3) @ 3 A; 0 2 3 0 1 0 1 0 1 1 1 0 “ ”k ”k “ “ ”k B C B C B C 1 1 = c1 − 12 1 A. @ 1 A + c2 − 3 @ 2 A + c3 @ 6 0 1 2 5 @2

(c) u(k)

+

√

!


10.1.15. (a) It suffices to note that the Lucas numbers are the general Fibonacci numbers (10.16) when a = L(0) = 2, b = L(1) = 1. (b) 2, 1, 3, 4, 7, 11, 18. (c) Because the first two are integers and so, by induction, L(k+2) = L(k+1) + L(k) is an integer whenever L(k+1) , L(k) are integers. 10.1.16.

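The second summand satisfies $\left| -\dfrac{1}{\sqrt5} \left( \dfrac{1 - \sqrt5}{2} \right)^{k} \right| < .448 \times .62^k < .5$ for all $k \ge 0$. Since the sum is an integer, adding the second summand to the first is the same as rounding the first off to the nearest integer.

10.1.17. $u^{(-k)} = (-1)^{k+1} u^{(k)}$. Indeed, since $\dfrac{1}{(1 + \sqrt5)/2} = \dfrac{-1 + \sqrt5}{2}$ and $\dfrac{1}{(1 - \sqrt5)/2} = \dfrac{-1 - \sqrt5}{2}$,
$$u^{(-k)} = \frac{1}{\sqrt5} \left[ \left( \frac{1 + \sqrt5}{2} \right)^{-k} - \left( \frac{1 - \sqrt5}{2} \right)^{-k} \right] = \frac{1}{\sqrt5} \left[ \left( \frac{-1 + \sqrt5}{2} \right)^{k} - \left( \frac{-1 - \sqrt5}{2} \right)^{k} \right]$$
$$= \frac{1}{\sqrt5} \left[ (-1)^k \left( \frac{1 - \sqrt5}{2} \right)^{k} - (-1)^k \left( \frac{1 + \sqrt5}{2} \right)^{k} \right] = \frac{(-1)^{k+1}}{\sqrt5} \left[ \left( \frac{1 + \sqrt5}{2} \right)^{k} - \left( \frac{1 - \sqrt5}{2} \right)^{k} \right] = (-1)^{k+1} u^{(k)}.$$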
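Both Fibonacci facts, the rounding rule of Exercise 10.1.16 and the negative-index identity of Exercise 10.1.17, can be checked with exact integer arithmetic. The sketch below assumes the standard normalization u(0) = 0, u(1) = 1 and extends the sequence to negative indices by running the recurrence backward; the function names are illustrative.

```python
import math

PHI = (1 + math.sqrt(5)) / 2  # golden ratio, the dominant eigenvalue

def fibonacci(k):
    """Fibonacci number u(k) for any integer k, with u(0) = 0, u(1) = 1."""
    a, b = 0, 1                      # (u(j), u(j+1)), starting at j = 0
    if k >= 0:
        for _ in range(k):
            a, b = b, a + b          # step forward
        return a
    for _ in range(-k):
        a, b = b - a, a              # step backward: u(j-1) = u(j+1) - u(j)
    return a

def fibonacci_by_rounding(k):
    """First Binet summand rounded to the nearest integer (k >= 0)."""
    return round(PHI ** k / math.sqrt(5))

print([fibonacci(k) for k in range(-6, 7)])
print([fibonacci_by_rounding(k) for k in range(10)])
```

The rounding version relies on floating point, so it is only trustworthy for moderate k; the integer recurrence has no such limit.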

√

(a)

5 2

(b)

4 −2

(c) 0

(e)

B @

1 1

B B B @

!k

2 1

−1 2 −1 1

= =

−1 2

!

0

1 0 k 1C A

1

=

!

0 1

√ 1k 3 5A 7 1 − k+1 (k) −@ u . 5 = (−1) 2 0

1 1 5 A, 2 5

!0

2 5 @ − 15 !

3k 0 0 2k !0 i @ (1 + i )k 1 0

−2 1

−1 1

!

,

10

i 2 i 2 1 3 1 6 − 12

0 A@ (1 − i )k 0−

10

−1 C B 4k B 0C A@ 0 1 0

1

0 1 0

0 CB B 0 C AB @ (−1)k

1 1 2 A, 1 2 1 3 − 13

1 3 1 6 1 2

0

1

C C C, A

=

2

u(k) v (k)

k

1 −2 1

1

u(k) v (k)

6 0

2 k B1 1C A =B @1 1 1

√ 3+ 5 2√ −1− 5 2

!

!

−i 1

1

1 0 0

2

√ 3− 5 2√ −1+ 5 2

10.1.19. (a) (c)

=

!k

1 2 1

0 0 −1 0

!k

1 1

1 −1

1 (d) B @1 2 0

2 2

0

20 √ 1k 6@ 1 + 5 A 4

(−1)k+1 √ = 5 10.1.18.

1− k 5A

= −i 1

0 √ «k 1 „ 1+ 5 1 CB 2 B CB B 1C AB 0 @

1 2 1

−1 2 i 1

0 „

0

!

!0 @

!

− 25 6k , − 51

− i (1 + i ) (1 − i )k

k

√ «k 1+ 5 2

(b) 1

A,


0

u(k) v (k) 0

!

1

0C

0

CB CB CB @ 0C A

√ −5−3 5 10 √ −5+3 5 10

√ −5− 5 10√ −5+ 5 10

√ 5+2 5 5√ 5−2 5 5

1

1

−1

1 =

1

−1 1 0

−1 2

1 u(k) B B (k) C (d) @ v A = B 1 @ w(k) 1

!

1 −2 1

!

3k , −2k+1 10

1

2 k −1 C B 3 4 C B 1 C C, 0C AB @ 3 A 1 0

1

C C C. A

0

1

0

u(k) B B (k) C B (e) @ v A = B @ w(k)

√ 3− 5 2√ −1+ 5 2

√ 3+ 5 2√ −1− 5 2

1

1

0

√ „ √ «k −5+3 5 1+ 5 B 10 2 1 CB CB √ «k √ „ B 1+ 5 −5+ 5 1C AB B 10 2 @ 1

1

−1

1

C C C C. C C A

10.1.20. (a) Since the coefficient matrix T has all integer entries, its product T u with any vector with integer entries also has integer entries; (b) $c_1 = -2$, $c_2 = 3$, $c_3 = -3$; (c) $u^{(1)} = ( 4, -2, -2 )^T$, $u^{(2)} = ( -26, 10, -2 )^T$, $u^{(3)} = ( 76, -32, 16 )^T$, $u^{(4)} = ( -164, 76, -44 )^T$, $u^{(5)} = ( 304, -152, 88 )^T$.

10.1.21. The vectors $\mathbf{u}^{(k)} = \bigl( u^{(k)}, u^{(k+1)}, \dots, u^{(k+j-1)} \bigr)^T \in \mathbb{R}^j$ satisfy $\mathbf{u}^{(k+1)} = T\, \mathbf{u}^{(k)}$, where
$$T = \begin{pmatrix} 0 & 1 & 0 & 0 & \dots & 0\\ 0 & 0 & 1 & 0 & \dots & 0\\ 0 & 0 & 0 & 1 & \dots & 0\\ \vdots & \vdots & \vdots & \vdots & \ddots & \vdots\\ 0 & 0 & 0 & 0 & \dots & 1\\ c_j & c_{j-1} & c_{j-2} & c_{j-3} & \dots & c_1 \end{pmatrix}.$$
The initial conditions are $\mathbf{u}^{(0)} = \mathbf{a} = ( a_0, a_1, \dots, a_{j-1} )^T$, and so $u^{(0)} = a_0$, $u^{(1)} = a_1$, …, $u^{(j-1)} = a_{j-1}$.
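The reduction in Exercise 10.1.21 can be sketched in a few lines. The code below assumes the scalar recurrence is written as u(k+j) = c_1 u(k+j-1) + ... + c_j u(k), which matches the last row (c_j, ..., c_1) of T; rather than storing T, it applies one step of the first-order system directly. The function names are illustrative.

```python
def companion_step(coeffs, state):
    """One step u(k+1) = T u(k) of the first-order system.

    coeffs = [c_1, ..., c_j]; state = [u(k), ..., u(k+j-1)].
    The superdiagonal of T shifts the entries up by one, and the last row
    (c_j, ..., c_1) produces the new entry u(k+j).
    """
    new_entry = sum(c * x for c, x in zip(coeffs, reversed(state)))
    return state[1:] + [new_entry]

def run(coeffs, initial, steps):
    """Apply `steps` iterations starting from the initial vector a."""
    state = list(initial)
    for _ in range(steps):
        state = companion_step(coeffs, state)
    return state

# Fibonacci: u(k+2) = u(k+1) + u(k), a = (0, 1).
print(run([1, 1], [0, 1], 10))
# Tribonacci: u(k+3) = u(k+2) + u(k+1) + u(k), started from (0, 1, 1).
print(run([1, 1, 1], [0, 1, 1], 4))
```

The first component of the state after k steps is the scalar iterate u(k), so the vector system carries the same information as the original higher-order recurrence.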

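10.1.22.
(a) $u^{(k)} = \tfrac43 - \tfrac13 (-2)^k$,
(b) $u^{(k)} = \bigl( \tfrac13 \bigr)^{k-1} + \bigl( -\tfrac14 \bigr)^{k-1}$,
(c) $u^{(k)} = \dfrac{ (5 - 3\sqrt5)(2 + \sqrt5)^k + (5 + 3\sqrt5)(2 - \sqrt5)^k }{10}$,
(d) $u^{(k)} = \bigl( \tfrac12 - i \bigr) (1 + i)^k + \bigl( \tfrac12 + i \bigr) (1 - i)^k = 2^{k/2} \bigl( \cos \tfrac14 k \pi + 2 \sin \tfrac14 k \pi \bigr)$,
(e) $u^{(k)} = -\tfrac12 - \tfrac12 (-1)^k + 2^k$,
(f) $u^{(k)} = -1 + \bigl( 1 + (-1)^k \bigr)\, 2^{k/2 - 1}$.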

♣ 10.1.23. (a) $u^{(k+3)} = u^{(k+2)} + u^{(k+1)} + u^{(k)}$,
(b) $u^{(4)} = 2$, $u^{(5)} = 4$, $u^{(6)} = 7$, $u^{(7)} = 13$,
(c) $u^{(k)} \approx .183 \times 1.839^k + 2 \operatorname{Re} \bigl[ (-.0914018 + .340547\, i)\, (-.419643 + .606291\, i)^k \bigr] = .183 \times 1.839^k - .737353^k \bigl( .182804 \cos 2.17623\, k + .681093 \sin 2.17623\, k \bigr)$.

♣ 10.1.24. (a) $u^{(k)} = u^{(k-1)} + u^{(k-2)} - u^{(k-8)}$. (b) 0, 1, 1, 2, 3, 5, 8, 13, 21, 33, 53, 84, 134, 213, 339, 539, 857, 1363, 2167, … . (c) $\mathbf{u}^{(k)} = \bigl( u^{(k)}, u^{(k+1)}, \dots, u^{(k+7)} \bigr)^T$ satisfies $\mathbf{u}^{(k+1)} = A\, \mathbf{u}^{(k)}$ where the 8 × 8 coefficient matrix A has 1's on the superdiagonal, last row ( −1, 0, 0, 0, 0, 0, 1, 1 ) and all other entries 0. (d) The growth rate is given by the largest eigenvalue in magnitude: $\lambda_1 = 1.59$, with $u^{(k)} \propto 1.59^k$. For more details, see [ 34 ].

10.1.25. $u_i^{(k)} = \displaystyle\sum_{j=1}^{n} c_j \left( 2 \cos \frac{j \pi}{n + 1} \right)^{k} \sin \frac{i j \pi}{n + 1}, \qquad i = 1, \dots, n.$

10.1.26. The key observation is that the coefficient matrix T is symmetric. Then, according to Exercise 8.5.19, the principal axes of the ellipse $E_1 = \{\, T x \mid \| x \| = 1 \,\}$ are the orthogonal eigenvectors of T. Moreover, $T^k$ is also symmetric and has the same eigenvectors. Hence, all the ellipses $E_k$ have the same principal axes. The semi-axes are the absolute values of the eigenvalues, and hence $E_k$ has semi-axes $(.8)^k$ and $(.4)^k$.


♠ 10.1.27.
(a)
E1: principal axes: $( -1, 1 )^T$, $( 1, 1 )^T$; semi-axes: 1, $\tfrac13$; area: $\tfrac13 \pi$.
E2: principal axes: $( -1, 1 )^T$, $( 1, 1 )^T$; semi-axes: 1, $\tfrac19$; area: $\tfrac19 \pi$.
E3: principal axes: $( -1, 1 )^T$, $( 1, 1 )^T$; semi-axes: 1, $\tfrac{1}{27}$; area: $\tfrac{1}{27} \pi$.
E4: principal axes: $( -1, 1 )^T$, $( 1, 1 )^T$; semi-axes: 1, $\tfrac{1}{81}$; area: $\tfrac{1}{81} \pi$.

(b)
E1: principal axes: $( 0, 1 )^T$, $( 1, 0 )^T$; semi-axes: 1.2, .4; area: .48 π = 1.5080.
E2: circle of radius .48; area: .2304 π = .7238.
E3: principal axes: $( 0, 1 )^T$, $( 1, 0 )^T$; semi-axes: .576, .192; area: .1106 π = .3474.
E4: circle of radius .2304; area: .0531 π = .1668.

(c)
E1: principal axes: $( .6407, .7678 )^T$, $( -.7678, .6407 )^T$; semi-axes: 1.0233, .3909; area: .4 π = 1.2566.
E2: principal axes: $( .6765, .7365 )^T$, $( -.7365, .6765 )^T$; semi-axes: 1.0394, .1539; area: .16 π = .5027.
E3: principal axes: $( .6941, .7199 )^T$, $( -.7199, .6941 )^T$; semi-axes: 1.0477, .0611; area: .064 π = .2011.
E4: principal axes: $( .7018, .7124 )^T$, $( -.7124, .7018 )^T$; semi-axes: 1.0515, .0243; area: .0256 π = .0804.

10.1.28. (a) This follows from Exercise 8.5.19(a), using the fact that $K = T^n$ is also positive definite. (b) True: they are the eigenvectors of T. (c) True: $r_1, s_1$ are the eigenvalues of T. (d) True, since the area is π times the product of the semi-axes, so $A_1 = \pi r_1 s_1$, so $\alpha = r_1 s_1 = | \det T |$. Then $A_n = \pi r_n s_n = \pi r_1^n s_1^n = \pi\, | \det T |^n = \pi \alpha^n$.

10.1.29. (a) This follows from Exercise 8.5.19(a) with $A = T^n$. (b) False; see Exercise 10.1.27(c) for a counterexample. (c) False: the singular values of $T^n$ are not, in general, the nth powers of the singular values of T. (d) True, since the product of the singular values is the absolute value of the determinant, and so $A_n = \pi\, | \det T |^n$.

10.1.30. $v^{(k)} = c_1 (\alpha \lambda_1 + \beta)^k v_1 + \cdots + c_n (\alpha \lambda_n + \beta)^k v_n$.

♦ 10.1.31. If $u^{(k)} = x^{(k)} + i\, y^{(k)}$ is a complex solution, then the iterative equation becomes $x^{(k+1)} + i\, y^{(k+1)} = T x^{(k)} + i\, T y^{(k)}$. Separating the real and imaginary parts of this complex vector equation and using the fact that T is real, we deduce $x^{(k+1)} = T x^{(k)}$, $y^{(k+1)} = T y^{(k)}$. Therefore, $x^{(k)}, y^{(k)}$ are real solutions to the iterative system.

♦ 10.1.32. The formula uniquely specifies $u^{(k+1)}$ once $u^{(k)}$ is known. Thus, by induction, once the initial value $u^{(0)}$ is fixed, there is only one possible solution $u^{(k)}$ for $k = 0, 1, 2, \dots$. Existence and uniqueness also hold for $k < 0$ when T is nonsingular, since $u^{(-k-1)} = T^{-1} u^{(-k)}$. If T is singular, the solution will not exist for $k < 0$ if any $u^{(-k)} \notin \operatorname{rng} T$, or, if it exists, is not unique since we can add any element of $\ker T$ to $u^{(-k)}$ without affecting $u^{(-k+1)}, u^{(-k+2)}, \dots$.

10.1.33. According to Theorem 8.20, the eigenvectors of T are real and form an orthogonal basis of $\mathbb{R}^n$ with respect to the Euclidean norm. The formula for the coefficients $c_j$ thus follows directly from (5.8).

10.1.34. Since matrix multiplication acts column-wise, cf. (1.11), the jth column of the matrix equation $T^{k+1} = T\, T^k$ is $c_j^{(k+1)} = T c_j^{(k)}$. Moreover, $T^0 = I$ has jth column $c_j^{(0)} = e_j$.

10.1.35. Separating the equation into its real and imaginary parts, we find
$$\begin{pmatrix} x^{(k+1)}\\ y^{(k+1)} \end{pmatrix} = \begin{pmatrix} \mu & -\nu\\ \nu & \mu \end{pmatrix} \begin{pmatrix} x^{(k)}\\ y^{(k)} \end{pmatrix}.$$
The eigenvalues of the coefficient matrix are $\mu \pm i \nu$, with eigenvectors $\begin{pmatrix} 1\\ \mp i \end{pmatrix}$, and so the solution is
$$\begin{pmatrix} x^{(k)}\\ y^{(k)} \end{pmatrix} = \frac{x^{(0)} + i\, y^{(0)}}{2}\, (\mu + i \nu)^k \begin{pmatrix} 1\\ -i \end{pmatrix} + \frac{x^{(0)} - i\, y^{(0)}}{2}\, (\mu - i \nu)^k \begin{pmatrix} 1\\ i \end{pmatrix}.$$
Therefore $z^{(k)} = x^{(k)} + i\, y^{(k)} = ( x^{(0)} + i\, y^{(0)} )\, (\mu + i \nu)^k = \lambda^k z^{(0)}$.
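The equivalence in Exercise 10.1.35 between the complex scalar iteration and the real two-dimensional system can be seen on a concrete multiplier. The sketch below uses λ = 1 + 2i as a hypothetical example (so μ = 1, ν = 2) and keeps everything in exact integer arithmetic; the function names are illustrative.

```python
def real_iterate(mu, nu, x, y, k):
    """Iterate the real system with coefficient matrix [[mu, -nu], [nu, mu]]."""
    for _ in range(k):
        x, y = mu * x - nu * y, nu * x + mu * y
    return x, y

def complex_iterate(lam, z, k):
    """Iterate the complex scalar system z(k+1) = lam * z(k)."""
    for _ in range(k):
        z = lam * z
    return z

x5, y5 = real_iterate(1, 2, 1, 0, 5)        # start at z(0) = 1 + 0i
z5 = complex_iterate(1 + 2j, 1 + 0j, 5)
print((x5, y5), z5)
```

The pair (x, y) produced by the real system should coincide with the real and imaginary parts of the complex iterate at every step.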

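♦ 10.1.36. (a) Proof by induction:
$$T^{k+1} w_i = T \left( \lambda^k w_i + k \lambda^{k-1} w_{i-1} + \binom{k}{2} \lambda^{k-2} w_{i-2} + \cdots \right)$$
$$= \lambda^k\, T w_i + k \lambda^{k-1}\, T w_{i-1} + \binom{k}{2} \lambda^{k-2}\, T w_{i-2} + \cdots$$
$$= \lambda^k ( \lambda w_i + w_{i-1} ) + k \lambda^{k-1} ( \lambda w_{i-1} + w_{i-2} ) + \binom{k}{2} \lambda^{k-2} ( \lambda w_{i-2} + w_{i-3} ) + \cdots$$
$$= \lambda^{k+1} w_i + (k + 1) \lambda^{k} w_{i-1} + \binom{k+1}{2} \lambda^{k-1} w_{i-2} + \cdots .$$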

(b) Each Jordan chain of length j is used to construct j linearly independent solutions by formula (10.23). Thus, for an n-dimensional system, the Jordan basis produces the required number of linearly independent (complex) solutions, and the general solution is obtained by taking linear combinations. Real solutions of a real iterative system are obtained by using the real and imaginary parts of the Jordan chain solutions corresponding to the complex conjugate pairs of eigenvalues.

10.1.37.
(a) $u^{(k)} = 2^k \bigl( c_1 + \tfrac12 k\, c_2 \bigr)$, $v^{(k)} = \tfrac13\, 2^k c_2$;
(b) $u^{(k)} = 3^k \bigl( c_1 + \bigl( \tfrac13 k - \tfrac12 \bigr) c_2 \bigr)$, $v^{(k)} = 3^k \bigl( 2 c_1 + \tfrac23 k\, c_2 \bigr)$;
(c) $u^{(k)} = (-1)^k \bigl( c_1 - k\, c_2 + \tfrac12 k (k - 1)\, c_3 \bigr)$, $v^{(k)} = (-1)^k \bigl( c_2 - (k + 1)\, c_3 \bigr)$, $w^{(k)} = (-1)^k c_3$;
(d) $u^{(k)} = 3^k \bigl( c_1 + \tfrac13 k\, c_2 + \bigl( \tfrac{1}{18} k (k - 1) + 1 \bigr) c_3 \bigr)$, $v^{(k)} = -3^k \bigl( c_2 + \tfrac13 k\, c_3 \bigr)$, $w^{(k)} = 3^k \bigl( c_1 + \tfrac13 k\, c_2 + \tfrac{1}{18} k (k - 1)\, c_3 \bigr)$;
(e) $u^{(0)} = -c_2$, $v^{(0)} = -c_1 + c_3$, $w^{(0)} = c_1 + c_2$, while, for $k > 0$, $u^{(k)} = -2^k \bigl( c_2 + \tfrac12 k\, c_3 \bigr)$, $v^{(k)} = 2^k c_3$, $w^{(k)} = 2^k \bigl( c_2 + \tfrac12 k\, c_3 \bigr)$;
(f) $u^{(k)} = -i^{k+1} c_1 - k\, i^{k} c_2 - (-i)^{k+1} c_3 - k (-i)^{k} c_4$, $v^{(k)} = i^{k} c_1 + k\, i^{k-1} c_2 + (-i)^{k} c_3 + k (-i)^{k-1} c_4$, $w^{(k)} = -i^{k+1} c_2 - (-i)^{k+1} c_4$, $z^{(k)} = i^{k} c_2 + (-i)^{k} c_4$.
k 10.1.38. Jλ,n =

0 k λ B B B B 0 B B B B B 0 B B B B B 0 B B B .. B . @

0

“ ” k k−2 2 λ

k λk−1 λk

k λk−1

0

λk

0 .. . 0

0 .. . 0

“ ” k k−3 3 λ “ ” k k−2 2 λ

... ...

k λk−1

...

λk .. . 0

... .. . ...

” k k−n+1 1 n−1 λ C C ” “ k k−n+2 C C λ C n−2 C C ” “ k k−n+3 C C λ n−3 C. C C “ ” k k−n+4 C λ C n−4 C C .. C C . A n “

λ
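The formula for powers of a Jordan block in Exercise 10.1.38 can be confirmed by brute-force matrix multiplication. The sketch below uses plain nested lists and integer eigenvalues so that the comparison is exact; entries whose binomial index exceeds k are taken to be zero. All function names are illustrative.

```python
from math import comb

def jordan_block(lam, n):
    """n x n Jordan block: lam on the diagonal, 1 on the superdiagonal."""
    return [[lam if i == j else 1 if j == i + 1 else 0 for j in range(n)]
            for i in range(n)]

def mat_mul(A, B):
    """Product of two square matrices stored as nested lists."""
    n = len(A)
    return [[sum(A[i][m] * B[m][j] for m in range(n)) for j in range(n)]
            for i in range(n)]

def mat_pow(A, k):
    """k-th power of A by repeated multiplication (k >= 0)."""
    n = len(A)
    P = [[int(i == j) for j in range(n)] for i in range(n)]
    for _ in range(k):
        P = mat_mul(P, A)
    return P

def jordan_power_formula(lam, n, k):
    """Entry (i, j) is C(k, j - i) * lam^(k - (j - i)) above the diagonal."""
    def entry(i, j):
        d = j - i
        return comb(k, d) * lam ** (k - d) if 0 <= d <= k else 0
    return [[entry(i, j) for j in range(n)] for i in range(n)]

for row in jordan_power_formula(2, 4, 5):
    print(row)
```

Comparing `mat_pow(jordan_block(lam, n), k)` with `jordan_power_formula(lam, n, k)` for a few small cases is a quick way to catch an index slip in the binomial coefficients.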

10.1.39. (a) Yes, if T is nonsingular. Indeed, in this case, the solution formula $u^{(k)} = T^{k - k_0} u^{(k_0)}$ is valid even when $k < k_0$. But if T is singular, then one can only assert that $u^{(k)} = \tilde u^{(k)}$ for $k \ge k_0$. (b) If T is nonsingular, then $u^{(k)} = \tilde u^{(k - k_0 + k_1)}$ for all k; if T is singular, then this only holds when $k \ge \max\{ k_0, k_1 \}$.

♥ 10.1.40. (a) The system has an equilibrium solution if and only if $( I - T )\, u_\star = b$ has a solution. In particular, if 1 is not an eigenvalue of T, every b leads to an equilibrium solution. (b) Since $v^{(k)} = u^{(k)} - u_\star$ satisfies the homogeneous system $v^{(k+1)} = T v^{(k)}$, the general solution is $u^{(k)} = u_\star + c_1 \lambda_1^k v_1 + c_2 \lambda_2^k v_2 + \cdots + c_n \lambda_n^k v_n$, where $v_1, \dots, v_n$ are the linearly independent eigenvectors and $\lambda_1, \dots, \lambda_n$ the corresponding eigenvalues of T.

1 2 3 A − 5k

!

0

1

−1 −3 (c) (i) u =@ − (−3)k @ 3 A; 1 −1 1 √ √ k ! √ ! (−1 + 2)k (−1 − 2) 1 − 2 (k) √ √ − (ii) u = + 1 1 2 2 2 2 (k)

(iii) u(k)

0

1

0

1

−1 1 B C 3C =B @ − 2 A − 3@ 2 A + 0 −1

0

1

√ ! 2 ; 1 0

1

2 2 kB C kB C 15 (−2) 3 3 A; − 5 (−3) @ @ A 2 2 3 284

(iv ) u(k) =

1 0 1 0 1 0 1 1 B6C “ ”k B 1 C “ ”k B 1 C “ ”k B 0 C B5C 1 1 7 B1C − 7 −1 B2C + 7 B 1 C. B C + 2 −2 @ A @ A @2A 2 3 3 6 @3A 3 0 1 1 2 0

(d) In general, using induction, the solution is
$$u^{(k)} = T^k c + ( I + T + T^2 + \cdots + T^{k-1} )\, b.$$
If we write $b = b_1 v_1 + \cdots + b_n v_n$, $c = c_1 v_1 + \cdots + c_n v_n$, in terms of the eigenvectors, then
$$u^{(k)} = \sum_{j=1}^{n} \Bigl[\, \lambda_j^k c_j + ( 1 + \lambda_j + \lambda_j^2 + \cdots + \lambda_j^{k-1} )\, b_j \,\Bigr] v_j .$$
If $\lambda_j \ne 1$, one can use the geometric sum formula $1 + \lambda_j + \lambda_j^2 + \cdots + \lambda_j^{k-1} = \dfrac{1 - \lambda_j^k}{1 - \lambda_j}$, while if $\lambda_j = 1$, then $1 + \lambda_j + \lambda_j^2 + \cdots + \lambda_j^{k-1} = k$. Incidentally, when it exists, the equilibrium solution is $u_\star = \displaystyle\sum_{\lambda_j \ne 1} \frac{b_j}{1 - \lambda_j}\, v_j$.

♣ 10.1.41. (a) The sequence is 3, 7, 0, 7, 7, 4, 1, 5, 6, 1, 7, 8, 5, 3, 8, 1, 9, 0, 9, 9, 8, 7, 5, 2, 7, 9, 6, 5, 1, 6, 7, 3, 0, 3, 3, 6, 9, 5, 4, 9, 3, 2, 5, 7, 2, 9, 1, 0, 1, 1, 2, 3, 5, 8, 3, 1, 4, 5, 9, 4, 3, 7, 0, and repeats when $u^{(60)} = u^{(0)} = 3$, $u^{(61)} = u^{(1)} = 7$. (b) When n = 10, other choices for the initial values $u^{(0)}, u^{(1)}$ lead to sequences that also start repeating at $u^{(60)}$; if $u^{(0)}, u^{(1)}$ occur as successive integers in the preceding sequence, then one obtains a shifted version of it; otherwise one ends up with a disjoint pseudo-random sequence. Other values of n lead to longer or shorter sequences; e.g., n = 9 repeats at $u^{(24)}$, while n = 11 already repeats at $u^{(10)}$.
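The repetition lengths quoted in Exercise 10.1.41 can be found by iterating the Fibonacci recurrence modulo n until the starting pair recurs. This sketch assumes the recurrence u(k+2) = u(k+1) + u(k) mod n; for n = 9 and n = 11 it uses the starting pair (0, 1) as an assumption, since the exercise's initial values for those cases are not restated in the solution. The function names are illustrative.

```python
def mod_sequence(n, u0, u1, length):
    """First `length` terms of u(k+2) = u(k+1) + u(k) (mod n)."""
    terms = [u0 % n, u1 % n]
    while len(terms) < length:
        terms.append((terms[-1] + terms[-2]) % n)
    return terms[:length]

def period(n, u0=0, u1=1):
    """Smallest k > 0 with (u(k), u(k+1)) = (u(0), u(1)) modulo n.

    The step (a, b) -> (b, a + b) is invertible modulo n, so the starting
    pair always recurs and the loop terminates.
    """
    start = (u0 % n, u1 % n)
    a, b = start
    k = 0
    while True:
        a, b = b, (a + b) % n
        k += 1
        if (a, b) == start:
            return k

print(mod_sequence(10, 3, 7, 12))
print(period(10, 3, 7), period(9), period(11))
```

Because the map on pairs is a bijection of a finite set, every starting pair lies on a cycle, which is why the sequence is periodic from the very first term rather than only eventually.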

10.2.1.
(a) Eigenvalues: $\tfrac{5 + \sqrt{33}}{2} \approx 5.3723$, $\tfrac{5 - \sqrt{33}}{2} \approx -.3723$; spectral radius: $\tfrac{5 + \sqrt{33}}{2} \approx 5.3723$.
(b) Eigenvalues: $\pm \tfrac{i}{6 \sqrt2} \approx \pm .11785\, i$; spectral radius: $\tfrac{1}{6 \sqrt2} \approx .11785$.
(c) Eigenvalues: 2, 1, −1; spectral radius: 2.
(d) Eigenvalues: 4, $-1 \pm 4 i$; spectral radius: $\sqrt{17} \approx 4.1231$.

10.2.2.
(a) Eigenvalues: $2 \pm 3 i$; spectral radius: $\sqrt{13} \approx 3.6056$; not convergent.
(b) Eigenvalues: .95414, .34586; spectral radius: .95414; convergent.
(c) Eigenvalues: $\tfrac45$, $\tfrac35$, 0; spectral radius: $\tfrac45$; convergent.
(d) Eigenvalues: 1., .547214, −.347214; spectral radius: 1; not convergent.

10.2.3.
(a) Unstable: eigenvalues −1, −3;
(b) unstable: eigenvalues $\tfrac{5 + \sqrt{73}}{12} \approx 1.12867$, $\tfrac{5 - \sqrt{73}}{12} \approx -.29533$;
(c) asymptotically stable: eigenvalues $\tfrac{1 \pm i}{2}$;
(d) stable: eigenvalues −1, ± i;
(e) unstable: eigenvalues $\tfrac54$, $\tfrac14$, $\tfrac14$;
(f) unstable: eigenvalues 2, 1, 1;
(g) asymptotically stable: eigenvalues $\tfrac12$, $\tfrac13$, 0.

10.2.4.
(a) $\lambda_1 = 3$, $\lambda_2 = 1 + 2 i$, $\lambda_3 = 1 - 2 i$, $\rho(T) = 3$.
(b) $\lambda_1 = \tfrac35$, $\lambda_2 = \tfrac15 + \tfrac25 i$, $\lambda_3 = \tfrac15 - \tfrac25 i$, $\rho(\tilde T) = \tfrac35$.
(c) $u^{(k)} \to 0$ in all cases; generically, the initial data has a non-zero component, $c_1 \ne 0$, in the direction of the dominant eigenvector $( -1, -1, 1 )^T$, and then $u^{(k)} \approx c_1 \bigl( \tfrac35 \bigr)^k ( -1, -1, 1 )^T$.
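Once the eigenvalues are known, the spectral radius and the convergence test used in Exercises 10.2.1 to 10.2.4 reduce to a one-line computation. The sketch below takes the eigenvalue lists quoted in the solutions as input; it does not compute eigenvalues from the matrices, which are not reproduced here. The function names are illustrative.

```python
import math

def spectral_radius(eigenvalues):
    """Largest modulus among the (possibly complex) eigenvalues."""
    return max(abs(lam) for lam in eigenvalues)

def is_convergent(eigenvalues):
    """A matrix is convergent exactly when its spectral radius is below 1."""
    return spectral_radius(eigenvalues) < 1

print(spectral_radius([4, -1 + 4j, -1 - 4j]))        # Exercise 10.2.1(d)
print(spectral_radius([2 + 3j, 2 - 3j]))             # Exercise 10.2.2(a)
print(is_convergent([0.95414, 0.34586]))             # Exercise 10.2.2(b)
print(is_convergent([1.0, 0.547214, -0.347214]))     # Exercise 10.2.2(d)
print(spectral_radius([3 / 5, 1 / 5 + 2j / 5, 1 / 5 - 2j / 5]))  # 10.2.4(b)
```

Note that in Exercise 10.2.1(d) the complex pair, not the real eigenvalue 4, determines the spectral radius.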

10.2.5.
(a) T has a double eigenvalue of 1, so ρ(T) = 1.
(b) Set u(0) = ( a, b )T. Then $T^k = \begin{pmatrix} 1 & k \\ 0 & 1 \end{pmatrix}$, and so u(k) = ( a + k b, b )T → ∞ provided b ≠ 0.
(c) In this example, ‖u(k)‖ = √( b² k² + 2 a b k + a² + b² ) ≈ b k → ∞ when b ≠ 0, while C ρ(T)^k = C is constant, so eventually ‖u(k)‖ > C ρ(T)^k no matter how large C is.
(d) For any σ > 1, we have b k ≤ C σ^k for k ≥ 0 provided C ≫ 0 is sufficiently large; more specifically, if C > b/ log σ.

10.2.6. A solution u(k) → 0 if and only if the initial vector u(0) = c1 v1 + · · · + cj vj is a linear combination of the eigenvectors (or more generally, Jordan chain vectors) corresponding to eigenvalues satisfying | λi | < 1 for i = 1, . . . , j.

10.2.7. Since ρ(c A) = | c | ρ(A), then c A is convergent if and only if | c | < 1/ρ(A). So, technically, there isn't a largest c.

♦ 10.2.8.
(a) Let u1, . . . , un be a unit eigenvector basis for T, so ‖uj‖ = 1. Let
mj = max { | cj | : ‖c1 u1 + · · · + cn un‖ ≤ 1 },
which is finite since we are maximizing a continuous function over a closed, bounded set. Let m⋆ = max{m1, . . . , mn}. Now, given ε > 0 small, if
‖u(0)‖ = ‖c1 u1 + · · · + cn un‖ < ε, then | cj | < m⋆ ε for j = 1, . . . , n.
Therefore, by (10.25), ‖u(k)‖ ≤ | c1 | + · · · + | cn | ≤ n m⋆ ε, and hence the solution remains close to 0.
(b) If any eigenvalue of modulus | λ | = 1 is incomplete, then, according to (10.23), the system has solutions of the form u(k) = λ^k wi + k λ^(k−1) wi−1 + · · · , which are unbounded as k → ∞. Thus, the origin is not stable in this case. On the other hand, if all eigenvalues of modulus 1 are complete, then the system is stable, even if there are incomplete eigenvalues of modulus < 1. The proof is a simple adaptation of that in part (a).

10.2.9. Assume u(0) = c1 v1 + · · · + cn vn with c1 ≠ 0. For k ≫ 0, u(k) ≈ c1 λ1^k v1 since | λ1^k | ≫ | λj^k | for all j > 1. Thus, the entries satisfy $u_i^{(k+1)} \approx \lambda_1 u_i^{(k)}$ and so, if nonzero, are just multiplied by λ1. Thus, if λ1 > 0 we expect to see the signs of all the entries of u(k) not change once k is sufficiently large, whereas if λ1 < 0, again for sufficiently large k, the signs alternate at each step of the iteration.

♥ 10.2.10. Writing u(0) = c1 v1 + · · · + cn vn, then for k ≫ 0,
u(k) ≈ c1 λ1^k v1 + c2 λ2^k v2, (∗)
which applies even when v1, v2 are complex eigenvectors of a real matrix. Thus this happens if and only if the iterates eventually belong (modulo a small error) to a two-dimensional subspace V, namely that spanned by the eigenvectors v1 and v2. In particular, for k ≫ 0, the iterates u(k) and u(k+1) form a basis for V (again modulo a small error) since if they were linearly dependent, then there would only be one eigenvalue of largest modulus. Thus, we can write u(k+2) ≈ a u(k+1) + b u(k) for some scalars a, b. We claim that the dominant eigenvalues λ1, λ2 are the roots of the quadratic equation λ² = a λ + b, which gives an effective algorithm for determining them. To prove the claim, for k ≫ 0, by formula (∗),
u(k+2) ≈ c1 λ1^(k+2) v1 + c2 λ2^(k+2) v2,
a u(k+1) + b u(k) ≈ c1 λ1^k (a λ1 + b) v1 + c2 λ2^k (a λ2 + b) v2.
Thus, by linear independence of the eigenvectors v1, v2, we conclude that
λ1² = a λ1 + b, λ2² = a λ2 + b,
which proves the claim. With the eigenvalues in hand, the determination of the eigenvectors is straightforward, either by directly solving the linear eigenvalue system, or by using equation (∗) for k and k + 1 sufficiently large.

10.2.11. If T has eigenvalues λj, then c T + d I has eigenvalues c λj + d. However, it is not necessarily true that the dominant eigenvalue of c T + d I is c λ1 + d when λ1 is the dominant eigenvalue of T. For instance, if λ1 = 3, λ2 = −2, so ρ(T) = 3, then λ1 − 2 = 1, λ2 − 2 = −4, so ρ(T − 2 I) = 4 ≠ ρ(T) − 2. Thus, you need to know all the eigenvalues to predict ρ(c T + d I), or, more accurately, the extreme eigenvalues, i.e., those such that all other eigenvalues lie in their convex hull in the complex plane.

10.2.12. By definition, the eigenvalues of AT A are λi = σi², and so the spectral radius of AT A is equal to ρ(AT A) = max{ σ1², . . . , σn² }. Thus ρ(AT A) = λ1 < 1 if and only if σ1 = √λ1 < 1.

♥ 10.2.13.
(a) ρ(Mn) = 2 cos( π/(n + 1) ).
(b) No, since its spectral radius is slightly less than 2.
(c) The entries of u(k) are $u_i^{(k)} = \sum_{j=1}^{n} c_j \left( 2 \cos \frac{j\pi}{n+1} \right)^{k} \sin \frac{i j \pi}{n+1}$, i = 1, . . . , n, where c1, . . . , cn are arbitrary constants.

♥ 10.2.14.
(a) The entries of u(k) are $u_i^{(k)} = \sum_{j=1}^{n} c_j \left( \beta + 2 \alpha \cos \frac{j\pi}{n+1} \right)^{k} \sin \frac{i j \pi}{n+1}$, i = 1, . . . , n.
(b) The system is asymptotically stable if and only if
$\rho(T_{\alpha,\beta}) = \max \left\{ \left| \beta + 2 \alpha \cos \frac{\pi}{n+1} \right|, \left| \beta - 2 \alpha \cos \frac{\pi}{n+1} \right| \right\} < 1.$
In particular, if | β ± 2 α | < 1 the system is asymptotically stable for any n.
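The algorithm described in Exercise 10.2.10 (fit u(k+2) ≈ a u(k+1) + b u(k), then solve λ² = a λ + b) can be sketched as follows. The test matrix `T`, the starting vector, the iteration count, and the use of a least-squares fit through the normal equations are all assumptions made for illustration; the matrix was chosen to have the dominant complex pair 1 ± 2i and a smaller eigenvalue 1/2.

```python
import cmath


def matvec(T, u):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(t * x for t, x in zip(row, u)) for row in T]


def dot(x, y):
    return sum(a * b for a, b in zip(x, y))


def dominant_pair(T, u0, k=30):
    """Estimate the two dominant eigenvalues of T from three successive iterates."""
    u = list(u0)
    for _ in range(k):
        u = matvec(T, u)
    q = u                 # u(k)
    p = matvec(T, q)      # u(k+1)
    r = matvec(T, p)      # u(k+2)
    # Least-squares fit r ≈ a p + b q via the 2 x 2 normal equations (Cramer's rule).
    pp, pq, qq = dot(p, p), dot(p, q), dot(q, q)
    pr, qr = dot(p, r), dot(q, r)
    det = pp * qq - pq * pq
    a = (pr * qq - qr * pq) / det
    b = (pp * qr - pq * pr) / det
    # Roots of lambda^2 = a lambda + b.
    disc = cmath.sqrt(a * a + 4 * b)
    return a, b, ((a + disc) / 2, (a - disc) / 2)


# Hypothetical example: eigenvalues 1 + 2i, 1 - 2i (dominant) and 0.5.
T = [[1.0, -2.0, 1.0],
     [2.0, 1.0, 0.0],
     [0.0, 0.0, 0.5]]
a, b, roots = dominant_pair(T, [1.0, 1.0, 1.0])
print(a, b, roots)
```

For this matrix the fitted quadratic should be close to λ² = 2 λ − 5. In practice one would rescale the iterates to avoid overflow for large k.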

10.2.15. (a) According to Exercise 8.2.25, T has at least one eigenvalue with | λ | > 1. (b) No. For example, $T = \begin{pmatrix} 2 & 0 \\ 0 & \frac13 \end{pmatrix}$ has det T = 2/3, but ρ(T) = 2, and so the iterative system is unstable.

10.2.16.
(a) False: ρ(c A) = | c | ρ(A).
(b) True, since the eigenvalues of A and S⁻¹ A S are the same.
(c) True, since the eigenvalues of A² are the squares of the eigenvalues of A.
(d) False, since ρ(A) = max | λ | whereas ρ(A⁻¹) = max 1/| λ |.
(e) False in almost all cases; for instance, if $A = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}$ and $B = \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix}$, then ρ(A) = ρ(B) = ρ(A + B) = 1 ≠ 2 = ρ(A) + ρ(B).
(f) False: using the matrices in (e), A B = O and so ρ(A B) = 0 ≠ 1 = ρ(A) ρ(B).

10.2.17. (a) True by part (c) of Exercise 10.2.16. (b) False. For example, $A = \begin{pmatrix} \frac12 & 1 \\ 0 & \frac12 \end{pmatrix}$ has ρ(A) = 1/2, whereas $A^T A = \begin{pmatrix} \frac14 & \frac12 \\ \frac12 & \frac54 \end{pmatrix}$ has ρ(AT A) = 3/4 + (1/2)√2 = 1.45711.

10.2.18. False. The first requires its eigenvalues satisfy Re λj < 0; the second requires | λj | < 1.

10.2.19.
(a) $A^2 = \left( \lim_{k\to\infty} T^k \right)^2 = \lim_{k\to\infty} T^{2k} = A$.
(b) The only eigenvalues of A are 1 and 0. Moreover, A must be complete, since if v1, v2 are the first two vectors in a Jordan chain, then A v1 = λ v1, A v2 = λ v2 + v1, with λ = 0 or 1, but A² v2 = λ² v2 + 2 λ v1 ≠ A v2 = λ v2 + v1, so there are no Jordan chains except for the ordinary eigenvectors. Therefore, A = S diag (1, . . . , 1, 0, . . . , 0) S⁻¹ for some nonsingular matrix S.
(c) If λ is an eigenvalue of T, then either | λ | < 1, or λ = 1 and is a complete eigenvalue.

10.2.20. If v has integer entries, so does A^k v for any k, and so the only way in which A^k v → 0 is if A^k v = 0 for some k. Now consider the basis vectors e1, . . . , en. Let ki be such that A^(ki) ei = 0. Let k = max{k1, . . . , kn}, so A^k ei = 0 for all i = 1, . . . , n. Then A^k I = A^k = O, and hence A is nilpotent. The simplest example is $\begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}$.

♥ 10.2.21. The equivalent first order system v(k+1) = C v(k) for $v^{(k)} = \begin{pmatrix} u^{(k)} \\ u^{(k+1)} \end{pmatrix}$ has coefficient matrix $C = \begin{pmatrix} O & I \\ B & A \end{pmatrix}$. To compute the eigenvalues of C we form $\det(C - \lambda I) = \det \begin{pmatrix} -\lambda I & I \\ B & A - \lambda I \end{pmatrix}$. Now use row operations to subtract appropriate multiples of the first n rows from the last n, and then a series of row interchanges to conclude that
$\det(C - \lambda I) = \det \begin{pmatrix} -\lambda I & I \\ B + \lambda A - \lambda^2 I & O \end{pmatrix} = \pm \det \begin{pmatrix} B + \lambda A - \lambda^2 I & O \\ -\lambda I & I \end{pmatrix} = \pm \det(B + \lambda A - \lambda^2 I).$
Thus, the quadratic eigenvalues are the same as the ordinary eigenvalues of C, and hence stability requires they all satisfy | λ | < 1.

♦ 10.2.22. Set σ = µ/λ > 1. If p(x) = ck x^k + · · · + c1 x + c0 has degree k, then p(n) ≤ a n^k for all n ≥ 1 where a = max | ci |. To prove a n^k ≤ C σ^n it suffices to prove that k log n < n log σ + log C − log a. Now h(n) = n log σ − k log n has a minimum when h′(n) = log σ − k/n = 0, so n = k/ log σ. The minimum value is h(k/ log σ) = k(1 − log(k/ log σ)). Thus, choosing log C > log a + k(log(k/ log σ) − 1) will ensure the desired inequality.

♦ 10.2.23. According to Exercise 10.1.36, there is a polynomial p(x) such that
$\| u^{(k)} \| \le \sum_i | \lambda_i |^k \, p_i(k) \le p(k) \, \rho(A)^k.$
Thus, by Exercise 10.2.22, ‖u(k)‖ ≤ C σ^k for any σ > ρ(A).

♦ 10.2.24.
(a) Rewriting the system as u(n+1) = M⁻¹ u(n), stability requires ρ(M⁻¹) < 1. The eigenvalues of M⁻¹ are the reciprocals of the eigenvalues of M, and hence ρ(M⁻¹) < 1 if and only if 1/| µi | < 1 for all i.
(b) Rewriting the system as u(n+1) = M⁻¹ K u(n), stability requires ρ(M⁻¹ K) < 1. Moreover, the eigenvalues of M⁻¹ K coincide with the generalized eigenvalues of the pair; see Exercise 8.4.8 for details.

10.2.25. (a) All scalar multiples of ( 1, 1 )T; (b) ( 0, 0 )T; (c) all scalar multiples of ( −1, −2, 1 )T; (d) all linear combinations of ( −1, 1, 0 )T, ( 1, 0, 1 )T.

10.2.26.
(a) The eigenvalues are 1, 1/2, so the fixed points are stable, while all other solutions go to a unique fixed point at rate (1/2)^k. When u(0) = ( 1, 0 )T, then u(k) → ( 3/5, 3/5 )T.
(b) The eigenvalues are −.9, .8, so the origin is a stable fixed point, and every nonzero solution goes to it, most at a rate of .9^k. When u(0) = ( 1, 0 )T, then u(k) → 0 also.
(c) The eigenvalues are −2, 1, 0, so the fixed points are unstable. Most solutions, specifically those with a nonzero component in the dominant eigenvector direction, become unbounded. However, when u(0) = ( 1, 0, 0 )T, then u(k) = ( −1, −2, 1 )T for k ≥ 1, and the solution stays at a fixed point.
(d) The eigenvalues are 5 and 1, so the fixed points are unstable. Most solutions, specifically those with a nonzero component in the dominant eigenvector direction, become unbounded, including that with u(0) = ( 1, 0, 0 )T.

10.2.27. Since T is symmetric, its eigenvectors v1, . . . , vn form an orthogonal basis of R^n. Writing u(0) = c1 v1 + · · · + cn vn, the coefficients are given by the usual orthogonality formula (5.7): ci = u(0) · vi /‖vi‖². Moreover, since λ1 = 1, while | λj | < 1 for j ≥ 2,
$u^{(k)} = c_1 v_1 + c_2 \lambda_2^k v_2 + \cdots + c_n \lambda_n^k v_n \longrightarrow c_1 v_1 = \frac{u^{(0)} \cdot v_1}{\| v_1 \|^2} \, v_1.$

10.2.28. False: T has an eigenvalue of 1, but convergence requires that all eigenvalues be less than 1 in modulus.

10.2.29. True. In this case T u = u for all u ∈ R^n and hence T = I.

♥ 10.2.30.
(a) The iterative system has a period 2 solution if and only if T has an eigenvalue of −1. Indeed, the condition u(k+2) = T² u(k) = u(k) implies that u(k) ≠ 0 is an eigenvector of T² with eigenvalue of 1. Thus, u(k) is an eigenvector of T with eigenvalue −1 because if the eigenvalue were 1 then u(k) = u(k+1), and the solution would be a fixed point. For example, $T = \begin{pmatrix} -1 & 0 \\ 0 & 2 \end{pmatrix}$ has the period 2 orbit u(k) = ( c (−1)^k, 0 )T for any c.
(b) The period 2 solution is never unique since any nonzero scalar multiple is also a period 2 solution.
(c) T must have an eigenvalue equal to a primitive mth root of unity; the non-primitive roots lead to solutions with smaller periods.

♦ 10.2.31.
(a) Let u⋆ = v1 be a fixed point of T, i.e., an eigenvector with eigenvalue λ1 = 1. Assuming T is complete, we form an eigenvector basis v1, . . . , vn with corresponding eigenvalues λ1 = 1, λ2, . . . , λn. Assume λ1, . . . , λj have modulus 1, while the remaining eigenvalues satisfy | λi | < 1 for i = j + 1, . . . , n. If the initial value u(0) = c1 v1 + · · · + cn vn is close to u⋆ = v1, then | c1 − 1 |, | c2 |, . . . , | cn | < ε for ε small. Then the corresponding solution u(k) = c1 v1 + c2 λ2^k v2 + · · · + cn λn^k vn satisfies
‖u(k) − u⋆‖ ≤ | c1 − 1 | ‖v1‖ + | c2 | | λ2 |^k ‖v2‖ + · · · + | cn | | λn |^k ‖vn‖ < ε ( ‖v1‖ + · · · + ‖vn‖ ) = C ε,
and hence any solution that starts near u⋆ stays near.
(b) If T has an incomplete eigenvalue of modulus | λ | = 1, then, according to the solution formula (10.23), the iterative system admits unbounded solutions $\widetilde{u}^{(k)} \to \infty$. Thus, for any ε > 0, there is an unbounded solution $u^\star + \varepsilon \, \widetilde{u}^{(k)} \to \infty$ that starts out arbitrarily close to the fixed point u⋆. On the other hand, if all eigenvalues of modulus 1 are complete, then the preceding proof works in essentially the same manner. The first j terms are bounded as before, while the remainder go to 0 as k → ∞.

10.3.1. (a) 3/4, convergent; (b) 3, inconclusive; (c) 8/7, inconclusive; (d) 7/4, inconclusive; (e) 8/7, inconclusive; (f) .9, convergent; (g) 7/3, inconclusive; (h) 1, inconclusive.

10.3.2. (a) .671855, convergent; (b) 2.5704, inconclusive; (c) .9755, convergent; (d) 1.9571, inconclusive; (e) 1.1066, inconclusive; (f) .8124, convergent; (g) 2.03426, inconclusive; (h) .7691, convergent.

10.3.3. (a) 2/3, convergent; (b) 1/2, convergent; (c) .9755, convergent; (d) 1.0308, divergent; (e) .9437, convergent; (f) .8124, convergent; (g) 2/3, convergent; (h) 2/3, convergent.

10.3.4. (a) ‖Ak‖∞ = k² + k, (b) ‖Ak‖2 = k² + 1, (c) ρ(Ak) = 0. (d) Thus, a convergent matrix can have arbitrarily large norm. (e) Because the norm in the inequality will depend on k.

10.3.5. For example, when $A = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}$, $A^2 = \begin{pmatrix} 1 & 2 \\ 0 & 1 \end{pmatrix}$, and (a) ‖A‖∞ = 2, ‖A²‖∞ = 3; (b) ‖A‖2 = √( (3 + √5)/2 ) = 1.6180, ‖A²‖2 = √( 3 + 2√2 ) = 2.4142.

10.3.6. Since ‖c A‖ = | c | ‖A‖ < 1.

♦ 10.3.7. For example, if $A = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}$, $B = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$, then ρ(A + B) = √2 > 0 + 1 = ρ(A) + ρ(B).

10.3.8. True: this implies ‖A‖2 = max σi < 1.

10.3.9. For example, if $A = \begin{pmatrix} \frac12 & 1 \\ 0 & \frac12 \end{pmatrix}$, then ρ(A) = 1/2. The singular values of A are σ1 = √( 3 + 2√2 )/2 = 1.2071 and σ2 = √( 3 − 2√2 )/2 = .2071.

10.3.10.
(a) False: For instance, if $A = \begin{pmatrix} 0 & 1 \\ 0 & 1 \end{pmatrix}$, $S = \begin{pmatrix} 1 & 2 \\ 0 & 1 \end{pmatrix}$, then $B = S^{-1} A S = \begin{pmatrix} 0 & -2 \\ 0 & 1 \end{pmatrix}$, and ‖B‖∞ = 2 ≠ 1 = ‖A‖∞.
(b) False: The same example has ‖B‖2 = √5 ≠ √2 = ‖A‖2.
(c) True, since A and B have the same eigenvalues.

10.3.11. By definition, κ(A) = σ1/σn. Now ‖A‖2 = σ1. On the other hand, by Exercise 8.5.12, the singular values of A⁻¹ are the reciprocals 1/σi of the singular values of A, and so the largest one is ‖A⁻¹‖2 = 1/σn.

♦ 10.3.12. (i) The 1 matrix norm is the maximum absolute column sum:
$\| A \|_1 = \max \left\{ \sum_{i=1}^{n} | a_{ij} | \;\middle|\; 1 \le j \le n \right\}.$
(ii) (a) 5/6, convergent; (b) 17/6, inconclusive; (c) 8/7, inconclusive; (d) 11/4, inconclusive; (e) 12/7, inconclusive; (f) .9, convergent; (g) 7/3, inconclusive; (h) 2/3, convergent.
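The norm tests used in Exercises 10.3.1 and 10.3.12 amount to computing the maximum absolute row sum (∞ norm) or column sum (1 norm) and comparing with 1. A minimal sketch, with illustrative function names and the matrices of Exercise 10.3.5 as sample input:

```python
def inf_norm(A):
    """Maximum absolute row sum of a matrix stored as a list of rows."""
    return max(sum(abs(x) for x in row) for row in A)


def one_norm(A):
    """Maximum absolute column sum."""
    return max(sum(abs(row[j]) for row in A) for j in range(len(A[0])))


def norm_test(value):
    """A norm below 1 guarantees convergence; otherwise the test says nothing."""
    return "convergent" if value < 1 else "inconclusive"


A = [[1, 1], [0, 1]]        # matrix A of Exercise 10.3.5
A2 = [[1, 2], [0, 1]]       # its square
print(inf_norm(A), inf_norm(A2))          # compare with part (a) of 10.3.5
print(one_norm(A), norm_test(one_norm(A)))
```

A result of "inconclusive" does not mean the matrix fails to be convergent; the spectral radius may still be less than 1, as several parts of Exercise 10.3.3 show.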

10.3.13. If a1, . . . , an are the rows of A, then the formula (10.40) can be rewritten as ‖A‖∞ = max{ ‖ai‖1 | i = 1, . . . , n }, i.e., the maximal 1 norm of the rows. Thus, by the properties of the 1-norm,
‖A + B‖∞ = max{ ‖ai + bi‖1 } ≤ max{ ‖ai‖1 + ‖bi‖1 } ≤ max{ ‖ai‖1 } + max{ ‖bi‖1 } = ‖A‖∞ + ‖B‖∞,
‖c A‖∞ = max{ ‖c ai‖1 } = max{ | c | ‖ai‖1 } = | c | max{ ‖ai‖1 } = | c | ‖A‖∞.
Finally, ‖A‖∞ ≥ 0 since we are maximizing non-negative quantities; moreover, ‖A‖∞ = 0 if and only if all its rows have ‖ai‖1 = 0 and hence all ai = 0, which means A = O.

♦ 10.3.14. ‖A‖ = max{σ1, . . . , σn} is the largest generalized singular value, meaning σi = √λi where λ1, . . . , λn are the generalized eigenvalues of the positive definite matrix pair AT K A and K, satisfying AT K A v = λ K v for some v ≠ 0, or, equivalently, the eigenvalues of K⁻¹ AT K A.

10.3.15.
(a) ‖A‖ = 7/2. The "unit sphere" for this norm is the rectangle with corners ( ± 1/2, ± 1/3 )T. It is mapped to the parallelogram with corners ± ( 5/6, −1/6 )T, ± ( 1/6, 7/6 )T, with respective norms 5/3 and 7/2, and so ‖A‖ = max{ ‖A v‖ | ‖v‖ = 1 } = 7/2.
(b) ‖A‖ = 8/3. The "unit sphere" for this norm is the diamond with corners ± ( 1/2, 0 )T, ± ( 0, 1/3 )T. It is mapped to the parallelogram with corners ± ( 1/2, 1/2 )T, ± ( 1/3, −2/3 )T, with respective norms 5/2 and 8/3, and so ‖A‖ = max{ ‖A v‖ | ‖v‖ = 1 } = 8/3.
(c) According to Exercise 10.3.14, ‖A‖ is the square root of the largest generalized eigenvalue of the matrix pair $K = \begin{pmatrix} 2 & 0 \\ 0 & 3 \end{pmatrix}$, $A^T K A = \begin{pmatrix} 5 & -4 \\ -4 & 14 \end{pmatrix}$. Thus, ‖A‖ = √( (43 + √553)/12 ) = 2.35436.
(d) According to Exercise 10.3.14, ‖A‖ is the square root of the largest generalized eigenvalue of the matrix pair $K = \begin{pmatrix} 2 & -1 \\ -1 & 2 \end{pmatrix}$, $A^T K A = \begin{pmatrix} 2 & -1 \\ -1 & 14 \end{pmatrix}$. Thus, ‖A‖ = 3.
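Parts (a) and (b) of Exercise 10.3.15 can be checked by evaluating the norm of A v at the corners of the unit "sphere", since a convex function on a polygon attains its maximum at a vertex. The sketch below rests on assumptions that are not stated in the solution itself: the matrix `A` is inferred from the listed corner images, and the two norms max{ 2|v1|, 3|v2| } and 2|v1| + 3|v2| are inferred from the shapes of the unit spheres.

```python
from fractions import Fraction as F

# Assumed matrix, inferred from the corner images listed in the solution.
A = [[F(1), F(1)],
     [F(1), F(-2)]]


def apply(A, v):
    """Matrix-vector product with exact rational arithmetic."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]


def norm_a(v):
    """Assumed weighted max norm whose unit sphere is the rectangle of part (a)."""
    return max(2 * abs(v[0]), 3 * abs(v[1]))


def norm_b(v):
    """Assumed weighted 1 norm whose unit sphere is the diamond of part (b)."""
    return 2 * abs(v[0]) + 3 * abs(v[1])


# Corners of each unit sphere, up to sign.
corners_a = [(F(1, 2), F(1, 3)), (F(1, 2), F(-1, 3))]
corners_b = [(F(1, 2), F(0)), (F(0), F(1, 3))]

norm_A_a = max(norm_a(apply(A, v)) for v in corners_a)
norm_A_b = max(norm_b(apply(A, v)) for v in corners_b)
print(norm_A_a, norm_A_b)   # compare with 7/2 and 8/3
```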

♥ 10.3.16. If we identify an n × n matrix with a vector in $\mathbb{R}^{n^2}$, then the Frobenius norm is the same as the ordinary Euclidean norm, and so the norm axioms are immediate. To check the multiplicative property, let $r_1^T, \ldots, r_n^T$ denote the rows of A and c1, . . . , cn the columns of B, so $\| A \|_F = \sqrt{ \sum_{i=1}^{n} \| r_i \|^2 }$, $\| B \|_F = \sqrt{ \sum_{j=1}^{n} \| c_j \|^2 }$. Then, setting C = A B, we have
$\| C \|_F = \sqrt{ \sum_{i,j=1}^{n} c_{ij}^2 } = \sqrt{ \sum_{i,j=1}^{n} ( r_i^T c_j )^2 } \le \sqrt{ \sum_{i,j=1}^{n} \| r_i \|^2 \, \| c_j \|^2 } = \| A \|_F \, \| B \|_F,$
where we used the Cauchy–Schwarz inequality in the middle.

10.3.17. (a) This is a restatement of Proposition 10.28.
(b) $\| \sigma \|_2^2 = \sum_{i=1}^{n} \sigma_i^2 = \sum_{i=1}^{n} \lambda_i = \operatorname{tr}(A^T A) = \sum_{i,j=1}^{n} a_{ij}^2 = \| A \|_F^2.$

10.3.18. If we identify a matrix A with a vector in $\mathbb{R}^{n^2}$, then this agrees with the ∞ norm on $\mathbb{R}^{n^2}$ and hence satisfies the norm axioms. For example, when $A = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}$, then $A^2 = \begin{pmatrix} 1 & 2 \\ 0 & 1 \end{pmatrix}$ and so ‖A²‖ = 2 > 1 = ‖A‖².

10.3.19. First, if x = ( a, b )T, y = ( c, d )T are any two linearly independent vectors in R², then the curve $(\cos \varphi) \, x - (\sin \varphi) \, y = \begin{pmatrix} a & -c \\ b & -d \end{pmatrix} \begin{pmatrix} \cos \varphi \\ \sin \varphi \end{pmatrix}$ is the image of the unit circle under the linear transformation defined by the nonsingular matrix $\begin{pmatrix} a & -c \\ b & -d \end{pmatrix}$, and hence defines an ellipse. The same argument shows that the curve (cos ϕ) x − (sin ϕ) y describes an ellipse in the two-dimensional plane spanned by the vectors x, y.

♦ 10.3.20.
(a) This follows from the formula (10.40) since | aij | ≤ si ≤ ‖A‖∞, where si is the ith absolute row sum.
(b) Let $a_{ij,n}$ denote the (i, j) entry of $A_n$. Then, by part (a), $\sum_{n=0}^{\infty} | a_{ij,n} | \le \sum_{n=0}^{\infty} \| A_n \|_\infty < \infty$ and hence $\sum_{n=0}^{\infty} a_{ij,n} = a^\star_{ij}$ is an absolutely convergent series, [ 2, 16 ]. Since each entry converges absolutely, the matrix series also converges.
(c) $e^{tA} = \sum_{n=0}^{\infty} \frac{t^n}{n!} A^n$ and the series of norms
$\sum_{n=0}^{\infty} \frac{| t |^n}{n!} \, \| A^n \|_\infty \le \sum_{n=0}^{\infty} \frac{| t |^n}{n!} \, \| A \|_\infty^n = e^{| t | \, \| A \|_\infty}$
is bounded by the standard scalar exponential series, which converges for all t, [ 2, 58 ]. Thus, the convergence follows from part (b).

10.3.21.
(a) Choosing a matrix norm such that a = ‖A‖ < 1, the norm series is bounded by a convergent geometric series:
$\sum_{n=0}^{\infty} \| A^n \| \le \sum_{n=0}^{\infty} \| A \|^n = \sum_{n=0}^{\infty} a^n = \frac{1}{1 - a}.$
Therefore, the matrix series converges.
(b) Moreover,
$(I - A) \sum_{n=0}^{\infty} A^n = \sum_{n=0}^{\infty} A^n - \sum_{n=0}^{\infty} A^{n+1} = I,$
since all other terms cancel. I − A is invertible if and only if 1 is not an eigenvalue of A, and we are assuming all eigenvalues are less than 1 in magnitude.

10.3.22.
(a) Gerschgorin disk: | z − 1 | ≤ 2; eigenvalues: 3, −1;
(b) Gerschgorin disks: | z − 1 | ≤ 2/3, | z + 1/6 | ≤ 1/2; eigenvalues: 1/2, 1/3;
(c) Gerschgorin disks: | z − 2 | ≤ 3, | z | ≤ 1; eigenvalues: 1 ± i √2;
(d) Gerschgorin disks: | z − 3 | ≤ 1, | z − 2 | ≤ 2; eigenvalues: 4, 3, 1;
(e) Gerschgorin disks: | z + 1 | ≤ 2, | z − 2 | ≤ 3, | z + 4 | ≤ 3; eigenvalues: −2.69805 ± .806289, 2.3961;
(f) Gerschgorin disks: z = 1/2, | z | ≤ 1/3, | z | ≤ 5/12; eigenvalues: 1/2, ± 1/(3√2);
(g) Gerschgorin disks: | z | ≤ 1, | z − 1 | ≤ 1; eigenvalues: 0, 1 ± i;
(h) Gerschgorin disks: | z − 3 | ≤ 2, | z − 2 | ≤ 1, | z | ≤ 1, | z − 1 | ≤ 2; eigenvalues: 1/2 ± √5/2, 5/2 ± √5/2.
[Each part is accompanied by a plot of the Gerschgorin disks and eigenvalues in the complex plane.]

10.3.23. False. Almost any non-symmetric matrix, e.g., $\begin{pmatrix} 2 & 1 \\ 0 & 1 \end{pmatrix}$, provides a counterexample.

♦ 10.3.24. (i) Because A and its transpose AT have the same eigenvalues, which must therefore belong to both $D_A$ and $D_{A^T}$.
(ii)
(a) Gerschgorin disk: | z − 1 | ≤ 2; eigenvalues: 3, −1;
(b) Gerschgorin disks: | z − 1 | ≤ 1/2, | z + 1/6 | ≤ 2/3; eigenvalues: 1/2, 1/3;
(c) Gerschgorin disks: | z − 2 | ≤ 1, | z | ≤ 3; eigenvalues: 1 ± i √2;
(d) Gerschgorin disks: | z − 3 | ≤ 1, | z − 2 | ≤ 2; eigenvalues: 4, 3, 1;
(e) Gerschgorin disks: | z + 1 | ≤ 2, | z − 2 | ≤ 4, | z + 4 | ≤ 2; eigenvalues: −2.69805 ± .806289, 2.3961;
(f) Gerschgorin disks: | z − 1/2 | ≤ 1/4, | z | ≤ 1/6, | z | ≤ 1/3; eigenvalues: 1/2, ± 1/(3√2);
(g) Gerschgorin disks: z = 0, | z − 1 | ≤ 2, | z − 1 | ≤ 1; eigenvalues: 0, 1 ± i;
(h) Gerschgorin disks: | z − 3 | ≤ 1, | z − 2 | ≤ 2, | z | ≤ 2, | z − 1 | ≤ 1; eigenvalues: 1/2 ± √5/2, 5/2 ± √5/2.
[Each part is accompanied by a plot of the Gerschgorin disks and eigenvalues in the complex plane.]
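The row and column Gerschgorin disks of Exercises 10.3.22 and 10.3.24 are straightforward to compute. The following sketch uses a hypothetical 2 × 2 matrix, chosen here only because its row disks, column disks, and eigenvalues agree with the values listed in part (b) of both exercises; the matrix itself does not appear in the solutions.

```python
from fractions import Fraction as F


def gerschgorin_disks(A):
    """Return a (center, radius) pair for each row of the square matrix A."""
    n = len(A)
    return [(A[i][i], sum(abs(A[i][j]) for j in range(n) if j != i))
            for i in range(n)]


def transpose(A):
    return [list(col) for col in zip(*A)]


def in_some_disk(z, disks):
    """True if z lies in at least one of the closed disks."""
    return any(abs(z - c) <= r for c, r in disks)


# Hypothetical matrix consistent with part (b); an assumption, not from the text.
A = [[F(1), F(-2, 3)],
     [F(1, 2), F(-1, 6)]]

row_disks = gerschgorin_disks(A)              # disks of A
col_disks = gerschgorin_disks(transpose(A))   # disks of the transpose
eigenvalues = [F(1, 2), F(1, 3)]

for lam in eigenvalues:
    # Each eigenvalue must lie in both the row domain and the column domain.
    print(lam, in_some_disk(lam, row_disks), in_some_disk(lam, col_disks))
```

Intersecting the two domains, as in part (i) of Exercise 10.3.24, can give a tighter enclosure than either domain alone.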

♦ 10.3.25. By elementary geometry, all points z in a closed disk of radius r centered at z = a satisfy max{ 0, | a | − r } ≤ | z | ≤ | a | + r. Thus, every point in the ith Gerschgorin disk satisfies max{ 0, | aii | − ri } ≤ | z | ≤ | aii | + ri = si. Since every eigenvalue lies in such a disk, they all satisfy max{ 0, t } ≤ | λi | ≤ s, and hence ρ(A) = max{ | λi | } does too.

10.3.26. (a) The absolute row sums of A are bounded by $s_i = \sum_{j=1}^{n} | a_{ij} | < 1$, and so ρ(A) ≤ s = max si < 1 by Exercise 10.3.25. (b) $A = \begin{pmatrix} \frac12 & \frac12 \\ \frac12 & \frac12 \end{pmatrix}$ has eigenvalues 0, 1 and hence ρ(A) = 1.

10.3.27. Using Exercise 10.3.25, we find ρ(A) ≤ s = max{ s1, . . . , sn } ≤ n a⋆.

10.3.28. For instance, any diagonal matrix whose diagonal entries satisfy 0 < | aii | < 1.

10.3.29. Both false. $\begin{pmatrix} 1 & 2 \\ 2 & 5 \end{pmatrix}$ is a counterexample to (a), while $\begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}$ is a counterexample to (b). However, see Exercise 10.3.30.

♦ 10.3.30. The eigenvalues of K are real by Theorem 8.20. The ith Gerschgorin disk is centered at kii > 0 and by diagonal dominance its radius is less than the distance from its center to the origin. Therefore, all eigenvalues of K must be positive and hence, by Theorem 8.23, K > 0.

10.3.31.
(a) For example, $A = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$ has Gerschgorin domain | z | ≤ 1.
(b) No; see the proof of Theorem 10.37.

10.3.32. The ith Gerschgorin disk is centered at aii < 0 and, by diagonal dominance, its radius is less than the distance from its center to the origin. Therefore, all eigenvalues of A lie in the left half plane: Re λ < 0, which, by Theorem 9.15, implies asymptotic stability of the differential equation.

10.4.1. (a) Not a transition matrix; (b) not a transition matrix; (c) regular transition matrix: ( 8/17, 9/17 )T; (d) regular transition matrix: ( 1/6, 5/6 )T; (e) not a regular transition matrix; (f) regular transition matrix: ( 1/3, 1/3, 1/3 )T; (g) regular transition matrix: ( .2415, .4348, .3237 )T; (h) not a transition matrix; (i) regular transition matrix: ( 6/13, 4/13, 3/13 )T = ( .4615, .3077, .2308 )T; (j) not a regular transition matrix; (k) not a transition matrix; (l) regular transition matrix (A⁴ has all positive entries): ( 251/1001, 225/1001, 235/1001, 290/1001 )T = ( .250749, .224775, .234765, .28971 )T; (m) regular transition matrix: ( .2509, .2914, .1977, .2600 )T.

10.4.2. (a) 20.5%; (b) 9.76% farmers, 26.83% laborers, 63.41% professionals.

10.4.3. 2004: 37,000 city, 23,000 country; 2005: 38,600 city, 21,400 country; 2006: 39,880 city, 20,120 country; 2007: 40,904 city, 19,096 country; 2008: 41,723 city, 18,277 country; Eventual: 45,000 in the city and 15,000 in the country.

10.4.4. 58.33% of the nights.

10.4.5. When in Atlanta he always goes to Boston; when in Boston he has a 50% probability of going to either Atlanta or Chicago; when in Chicago he has a 50% probability of going to either Atlanta or Boston. The transition matrix is regular because
$T^4 = \begin{pmatrix} .375 & .3125 & .3125 \\ .25 & .5625 & .5 \\ .375 & .125 & .1875 \end{pmatrix}$
has all positive entries. On average he visits Atlanta 33.33%, Boston 44.44%, and Chicago 22.22% of the time.
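The travel pattern described in Exercise 10.4.5 can be turned into a transition matrix and iterated numerically. In this sketch the matrix `T` is built directly from the stated probabilities (columns ordered Atlanta, Boston, Chicago); the helper names and the number of iterations are illustrative choices.

```python
# Column j holds the probabilities of moving from city j to each city.
T = [[0.0, 0.5, 0.5],   # to Atlanta
     [1.0, 0.0, 0.5],   # to Boston
     [0.0, 0.5, 0.0]]   # to Chicago


def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def step(T, u):
    """One step of the Markov process: u -> T u."""
    return [sum(T[i][j] * u[j] for j in range(len(u))) for i in range(len(T))]


def steady_state(T, u0, steps=200):
    """Iterate the chain; for a regular T this approaches the probability eigenvector."""
    u = list(u0)
    for _ in range(steps):
        u = step(T, u)
    return u


T2 = matmul(T, T)
T4 = matmul(T2, T2)
print(T4)                                  # all entries positive, so T is regular
print(steady_state(T, [1.0, 0.0, 0.0]))    # long-run fractions of time per city
```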

10.4.6. The transition matrix $T = \begin{pmatrix} 0 & \frac23 & \frac23 \\ 1 & 0 & \frac13 \\ 0 & \frac13 & 0 \end{pmatrix}$ is regular because $T^4 = \begin{pmatrix} \frac{14}{27} & \frac{26}{81} & \frac{26}{81} \\ \frac29 & \frac{49}{81} & \frac{16}{27} \\ \frac{7}{27} & \frac{2}{27} & \frac{7}{81} \end{pmatrix}$ has all positive entries. She visits branch A 40% of the time, branch B 45%, and branch C 15%.

10.4.7. 25% red, 50% pink, 25% white.

10.4.8. If u(0) = ( a, b )T is the initial state vector, then the subsequent state vectors switch back and forth between ( b, a )T and ( a, b )T. At each step in the process, all of the population in state 1 goes to state 2 and vice versa, so the system never settles down.

10.4.9. This is not a regular transition matrix, so we need to analyze the iterative process directly. The eigenvalues of A are λ1 = λ2 = 1 and λ3 = 1/2, with corresponding eigenvectors v1 = ( 1, 0, 0 )T, v2 = ( 0, 0, 1 )T, and v3 = ( 1, −2, 1 )T. Thus, the solution with u(0) = ( p0, q0, r0 )T is
$u^{(n)} = \left( p_0 + \tfrac12 q_0 \right) \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix} + \left( \tfrac12 q_0 + r_0 \right) \begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix} - \frac{q_0}{2^{n+1}} \begin{pmatrix} 1 \\ -2 \\ 1 \end{pmatrix} \longrightarrow \begin{pmatrix} p_0 + \tfrac12 q_0 \\ 0 \\ \tfrac12 q_0 + r_0 \end{pmatrix}.$

Therefore, this breeding process eventually results in a population with individuals of genotypes AA and aa only, the proportions of each depending upon the initial population.

10.4.10. Numbering the vertices from top to bottom and left to right, the transition matrix is
$T = \begin{pmatrix} 0 & \frac14 & \frac14 & 0 & 0 & 0 \\ \frac12 & 0 & \frac14 & \frac12 & \frac14 & 0 \\ \frac12 & \frac14 & 0 & 0 & \frac14 & \frac12 \\ 0 & \frac14 & 0 & 0 & \frac14 & 0 \\ 0 & \frac14 & \frac14 & \frac12 & 0 & \frac12 \\ 0 & 0 & \frac14 & 0 & \frac14 & 0 \end{pmatrix}.$
The probability eigenvector is ( 1/9, 2/9, 2/9, 1/9, 2/9, 1/9 )T, and so the bug spends, on average, twice as much time at the edge vertices as at the corner vertices.

10.4.11. Numbering the vertices from top to bottom and left to right, the transition matrix is
$T = \begin{pmatrix} 0 & \frac14 & \frac14 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ \frac12 & 0 & \frac14 & \frac14 & \frac16 & 0 & 0 & 0 & 0 & 0 \\ \frac12 & \frac14 & 0 & 0 & \frac16 & \frac14 & 0 & 0 & 0 & 0 \\ 0 & \frac14 & 0 & 0 & \frac16 & 0 & \frac12 & \frac14 & 0 & 0 \\ 0 & \frac14 & \frac14 & \frac14 & 0 & \frac14 & 0 & \frac14 & \frac14 & 0 \\ 0 & 0 & \frac14 & 0 & \frac16 & 0 & 0 & 0 & \frac14 & \frac12 \\ 0 & 0 & 0 & \frac14 & 0 & 0 & 0 & \frac14 & 0 & 0 \\ 0 & 0 & 0 & \frac14 & \frac16 & 0 & \frac12 & 0 & \frac14 & 0 \\ 0 & 0 & 0 & 0 & \frac16 & \frac14 & 0 & \frac14 & 0 & \frac12 \\ 0 & 0 & 0 & 0 & 0 & \frac14 & 0 & 0 & \frac14 & 0 \end{pmatrix}.$
The probability eigenvector is ( 1/18, 1/9, 1/9, 1/9, 1/6, 1/9, 1/18, 1/9, 1/9, 1/18 )T, and so the bug spends, on average, twice as much time on the edge vertices and three times as much time at the center vertex as at the corner vertices.

10.4.12. The transition matrix
$T = \begin{pmatrix} 0 & \frac13 & 0 & \frac13 & 0 & 0 & 0 & 0 & 0 \\ \frac12 & 0 & \frac12 & 0 & \frac14 & 0 & 0 & 0 & 0 \\ 0 & \frac13 & 0 & 0 & 0 & \frac13 & 0 & 0 & 0 \\ \frac12 & 0 & 0 & 0 & \frac14 & 0 & \frac12 & 0 & 0 \\ 0 & \frac13 & 0 & \frac13 & 0 & \frac13 & 0 & \frac13 & 0 \\ 0 & 0 & \frac12 & 0 & \frac14 & 0 & 0 & 0 & \frac12 \\ 0 & 0 & 0 & \frac13 & 0 & 0 & 0 & \frac13 & 0 \\ 0 & 0 & 0 & 0 & \frac14 & 0 & \frac12 & 0 & \frac12 \\ 0 & 0 & 0 & 0 & 0 & \frac13 & 0 & \frac13 & 0 \end{pmatrix}$
is not regular. Indeed, if, say, the bug starts out at a corner, then after an odd number of steps it can only be at one of the edge vertices, while after an even number of steps it will be either at a corner vertex or the center vertex. Thus, the iterates u(n) do not converge. If the bug starts at vertex i, so u(0) = ei, after a while, the probability vectors u(n) end up switching back and forth between the probability vectors v = T w = ( 0, 1/4, 0, 1/4, 0, 1/4, 0, 1/4, 0 )T, where the bug has an equal probability of being at any edge vertex, and w = T v = ( 1/6, 0, 1/6, 0, 1/3, 0, 1/6, 0, 1/6 )T, where the bug is either at a corner vertex or, twice as likely, at the middle vertex.

♦ 10.4.13. The ith column of T^k is $u_i^{(k)} = T^k e_i \to v$ by Theorem 10.40.

10.4.14. In view of Exercise 10.4.13, the limit is $\begin{pmatrix} \frac13 & \frac13 & \frac13 \\ \frac13 & \frac13 & \frac13 \\ \frac13 & \frac13 & \frac13 \end{pmatrix}$.

10.4.15. First, v is a probability vector since the sum of its entries is $\frac{q}{p+q} + \frac{p}{p+q} = 1$. Moreover,
$A v = \begin{pmatrix} \dfrac{(1 - p) \, q + q \, p}{p + q} \\[2ex] \dfrac{p \, q + (1 - q) \, p}{p + q} \end{pmatrix} = \begin{pmatrix} \dfrac{q}{p + q} \\[2ex] \dfrac{p}{p + q} \end{pmatrix} = v,$
proving it is an eigenvector for eigenvalue 1.

10.4.16. All equal probabilities: z = ( 1/n, . . . , 1/n )T.

10.4.17. z = ( 1/n, . . . , 1/n )T.

10.4.18. False: $\begin{pmatrix} .3 & .5 & .2 \\ .3 & .2 & .5 \\ .4 & .3 & .3 \end{pmatrix}$ is a counterexample.

10.4.19. False. For instance, if $T = \begin{pmatrix} \frac12 & \frac13 \\ \frac12 & \frac23 \end{pmatrix}$, then $T^{-1} = \begin{pmatrix} 4 & -2 \\ -3 & 3 \end{pmatrix}$, while $T = \begin{pmatrix} \frac12 & \frac12 \\ \frac12 & \frac12 \end{pmatrix}$ is not even invertible.

10.4.20. False. For instance, 0 is not a probability vector.

10.4.21. (a) The 1 norm.

10.4.22. True. If v = ( v1, v2, . . . , vn )T is a probability eigenvector, then $\sum_{i=1}^{n} v_i = 1$ and $\sum_{j=1}^{n} t_{ij} v_j = \lambda v_i$ for all i = 1, . . . , n. Summing the latter equations over i, we find
$\lambda = \lambda \sum_{i=1}^{n} v_i = \sum_{i=1}^{n} \sum_{j=1}^{n} t_{ij} v_j = \sum_{j=1}^{n} v_j = 1,$
since the column sums of a transition matrix are all equal to 1.

10.4.23. (a) $\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$; (b) $\begin{pmatrix} 0 & \frac12 \\ 1 & \frac12 \end{pmatrix}$.

♦ 10.4.24. The ith entry of v is $v_i = \sum_{j=1}^{n} t_{ij} u_j$. Since each tij ≥ 0 and uj ≥ 0, the sum vi ≥ 0 also. Moreover, $\sum_{i=1}^{n} v_i = \sum_{i,j=1}^{n} t_{ij} u_j = \sum_{j=1}^{n} u_j = 1$ because all the column sums of T are equal to 1, and u is a probability vector.

♦ 10.4.25. (a) The columns of T S are obtained by multiplying T by the columns of S. Since S is a transition matrix, its columns are probability vectors. Exercise 10.4.24 shows that each column of T S is also a probability vector, and so the product is a transition matrix. (b) This follows by induction from part (a), where we write T k+1 = T T k .

10.5.1. (a) The eigenvalues are −1/2, 1/3, so ρ(T) = 1/2.
(b) The iterates will converge to the fixed point ( −1/6, 1 )T at rate 1/2. Asymptotically, they come in to the fixed point along the direction of the dominant eigenvector ( −3, 2 )T.

10.5.2.
(a) ρ(T) = 2; the iterates diverge: ‖u(k)‖ → ∞ at a rate of 2.
(b) ρ(T) = 3/4; the iterates converge to the fixed point ( 1.6, .8, 7.2 )T at a rate 3/4, along the dominant eigenvector direction ( 1, 2, 6 )T.
(c) ρ(T) = 1/2; the iterates converge to the fixed point ( −1, .4, 2.6 )T at a rate 1/2, along the dominant eigenvector direction ( 0, −1, 1 )T.

10.5.3. (a,b,e,g) are diagonally dominant.

♠ 10.5.4. (a) x = 1/7 = .142857, y = −2/7 = −.285714; (b) x = −30, y = 48; (e) x = −1.9172, y = −.339703, z = −2.24204; (g) x = −.84507, y = −.464789, z = −.450704.

♠ 10.5.5. (c) Jacobi spectral radius = .547723, so Jacobi converges to the solution x = 8/7 = 1.142857, y = 19/7 = 2.71429; (d) Jacobi spectral radius = .5, so Jacobi converges to the solution x = −10/9 = −1.1111, y = −13/9 = −1.4444, z = 2/9 = .2222; (f) Jacobi spectral radius = 1.1180, so Jacobi does not converge.

10.5.6. (a) u = ( .7857, .3571 )T, (b) u = ( −4, 5 )T, (c) u = ( .3333, −1.0000, 1.3333 )T, (d) u = ( .7273, −3.1818, .6364 )T, (e) u = ( .8750, −.1250, −.1250, −.1250 )T, (f) u = ( 0., .7143, −.1429, −.2857 )T.

♣ 10.5.7. (a) | c | > 2.
(b) If c = 0, then D = c I = O, and Jacobi iteration isn't even defined. Otherwise, T = −D⁻¹(L + U) is tridiagonal with diagonal entries all 0 and sub- and super-diagonal entries equal to −1/c. According to Exercise 8.2.48, the eigenvalues are $-\frac{2}{c} \cos \frac{k\pi}{n+1}$ for k = 1, . . . , n, and so the spectral radius is $\rho(T) = \frac{2}{| c |} \cos \frac{\pi}{n+1}$. Thus, convergence requires $| c | > 2 \cos \frac{\pi}{n+1}$; in particular, | c | ≥ 2 will ensure convergence for any n.
(c) For n = 5, the solution is u = ( .8333, −.6667, .5000, −.3333, .1667 )T, with a convergence rate of ρ(T) = cos(π/6) = .8660. It takes 51 iterations to obtain 3 decimal place accuracy, while log(.5 × 10⁻³)/ log ρ(T) ≈ 53.
For n = 10, the solution is u = ( .9091, −.8182, .7273, −.6364, .5455, −.4545, .3636, −.2727, .1818, −.0909 )T, with a convergence rate of cos(π/11) = .9595. It takes 173 iterations to obtain 3 decimal place accuracy, while log(.5 × 10⁻³)/ log ρ(T) ≈ 184.
For n = 20, u = ( .9524, −.9048, .8571, −.8095, .7619, −.7143, .6667, −.6190, .5714, −.5238, .4762, −.4286, .3810, −.3333, .2857, −.2381, .1905, −.1429, .0952, −.0476 )T, with a convergence rate of cos(π/21) = .9888. It takes 637 iterations to obtain 3 decimal place accuracy, while log(.5 × 10⁻³)/ log ρ(T) ≈ 677.

10.5.8. If A u = 0, then D u = − (L + U ) u, and hence T u = − D −1 (L + U ) u = u, proving that u is a eigenvector for T with eigenvalue 1. Therefore, ρ(T ) ≥ 1, which implies that T is not a convergent matrix.

♦ 10.5.9. If A is nonsingular, then at least one of the terms in the general determinant expansion (1.85) is nonzero. If a_{1,π(1)} a_{2,π(2)} · · · a_{n,π(n)} ≠ 0 then each a_{i,π(i)} ≠ 0. Applying the permutation π to the rows of A will produce a matrix whose diagonal entries are all nonzero.

10.5.10. Assume, for simplicity, that T is complete with a single dominant eigenvalue λ1, so that ρ(T) = | λ1 | and | λ1 | > | λj | for j > 1. We expand the initial error e^(0) = c1 v1 + · · · + cn vn in terms of the eigenvectors. Then e^(k) = T^k e^(0) = c1 λ1^k v1 + · · · + cn λn^k vn, which, for k ≫ 0, is approximately e^(k) ≈ c1 λ1^k v1. Thus, ‖e^(k+j)‖ ≈ ρ(T)^j ‖e^(k)‖. In particular, if at iteration number k we have m decimal places of accuracy, so ‖e^(k)‖ ≤ .5 × 10^(−m), then, approximately, ‖e^(k+j)‖ ≤ .5 × 10^(−m + j log10 ρ(T)) = .5 × 10^(−m−1) provided j = −1/log10 ρ(T).
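The rule of thumb j = −1/log10 ρ(T) from 10.5.10 is easy to evaluate. The sketch below is illustrative only; the function name is hypothetical, and the sample spectral radii are values quoted elsewhere in these solutions.

```python
import math

def iterations_per_digit(rho):
    """Approximate number of iterations needed for one more decimal
    place of accuracy, j = -1 / log10(rho), valid for 0 < rho < 1."""
    if not 0.0 < rho < 1.0:
        raise ValueError("the estimate requires 0 < rho < 1")
    return -1.0 / math.log10(rho)

for rho in (0.40825, 0.16667, 0.9969):
    print(rho, round(iterations_per_digit(rho), 2))
```

A spectral radius close to 1 makes the estimate blow up, which is why a scheme with ρ = .9969 needs several hundred iterations per digit.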

10.5.11. False for elementary row operations of types 1 & 2, but true for those of type 3.

♥ 10.5.12. (a) x = ( 7/23, 6/23, 40/23 )^T = ( .30435, .26087, 1.73913 )^T;

(b) x^(1) = ( −.5, −.25, 1.75 )^T, x^(2) = ( .4375, .0625, 1.8125 )^T, x^(3) = ( .390625, .3125, 1.65625 )^T, with error e^(3) = ( .0862772, .0516304, −.0828804 )^T;

(c) x^(k+1) = $\begin{pmatrix} 0 & -\frac14 & \frac12 \\ \frac14 & 0 & \frac14 \\ -\frac14 & \frac14 & 0 \end{pmatrix}$ x^(k) + ( −1/2, −1/4, 7/4 )^T;

(d) x^(1) = ( −.5, −.375, 1.78125 )^T, x^(2) = ( .484375, .316406, 1.70801 )^T, x^(3) = ( .274902, .245728, 1.74271 )^T; the error at the third iteration is e^(3) = ( −.029446, −.015142, .003576 )^T, which is about 30% of the Jacobi error;

(e) x^(k+1) = $\begin{pmatrix} 0 & -\frac14 & \frac12 \\ 0 & -\frac1{16} & \frac38 \\ 0 & \frac3{64} & -\frac1{32} \end{pmatrix}$ x^(k) + ( −1/2, −3/8, 57/32 )^T;

(f) ρ(T_J) = √3/4 = .433013;

(g) ρ(T_GS) = (3 + √73)/64 = .180375, so Gauss–Seidel converges about log ρ_GS / log ρ_J = 2.046 times as fast.

(h) Approximately log(.5 × 10^−6)/log ρ_GS ≈ 8.5 iterations.

(i) Under Gauss–Seidel, x^(9) = ( .304347, .260869, 1.73913 )^T, with error e^(9) = 10^−6 ( −1.0475, −.4649, .1456 )^T.
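The Jacobi and Gauss–Seidel iterates of 10.5.12 can be reproduced with a few lines of code. This is a minimal sketch, not the authors' program. It assumes the fixed-point form x^(k+1) = T x^(k) + c with the Jacobi matrix T and vector c given below; the signs of T were inferred from the listed iterates and the solution ( 7/23, 6/23, 40/23 )^T, so treat them as an assumption. All function names are hypothetical.

```python
# Jacobi fixed-point form x_new = T x + c (entries inferred, see lead-in).
T = [[0.0, -0.25, 0.5],
     [0.25, 0.0, 0.25],
     [-0.25, 0.25, 0.0]]
c = [-0.5, -0.25, 1.75]

def jacobi_step(x):
    """One Jacobi sweep: every entry uses only the old iterate."""
    return [sum(T[i][j] * x[j] for j in range(3)) + c[i] for i in range(3)]

def gauss_seidel_step(x):
    """One Gauss-Seidel sweep: updated entries are used immediately."""
    x = list(x)
    for i in range(3):
        x[i] = sum(T[i][j] * x[j] for j in range(3)) + c[i]
    return x

def iterate(step, x0, k):
    """Apply the given sweep k times starting from x0."""
    x = list(x0)
    for _ in range(k):
        x = step(x)
    return x

x_jacobi = iterate(jacobi_step, [0.0, 0.0, 0.0], 3)
x_gs = iterate(gauss_seidel_step, [0.0, 0.0, 0.0], 3)
print(x_jacobi, x_gs)
```

Because the diagonal of T is zero, overwriting the entries of x in place is all that distinguishes the Gauss–Seidel sweep from the Jacobi sweep.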

♠ 10.5.13. (a) x = 1/7 = .142857, y = −2/7 = −.285714; (b) x = −30, y = 48; (e) x = −1.9172, y = −.339703, z = −2.24204; (g) x = −.84507, y = −.464789, z = −.450704.

10.5.14. (a) ρJ = .2582, ρGS = .0667; (b) ρJ = .7303, ρGS = .5333; (c) ρJ = .5477, ρGS = .3; (d) ρJ = .5, ρGS = .2887; (e) ρJ = .4541, ρGS = .2887; (f) ρJ = .3108, ρGS = .1667; (g) ρJ = 1.118, ρGS = .7071. Thus, all systems lead to convergent Gauss–Seidel schemes, with faster convergence than Jacobi (which doesn’t even converge in case (g)).

10.5.15.
(a) Solution: u = ( .7857, .3571 )^T; spectral radii: ρJ = 1/√15 = .2582, ρGS = 1/15 = .06667, so Gauss–Seidel converges exactly twice as fast;
(b) Solution: u = ( −4, 5 )^T; spectral radii: ρJ = 1/√2 = .7071, ρGS = 1/2 = .5, so Gauss–Seidel converges exactly twice as fast;
(c) Solution: u = ( .3333, −1.0000, 1.3333 )^T; spectral radii: ρJ = .7291, ρGS = .3104, so Gauss–Seidel converges log ρGS / log ρJ = 3.7019 times as fast;
(d) Solution: u = ( .7273, −3.1818, .6364 )^T; spectral radii: ρJ = 2/√15 = .5164, ρGS = 4/15 = .2667, so Gauss–Seidel converges exactly twice as fast;
(e) Solution: u = ( .8750, −.1250, −.1250, −.1250 )^T; spectral radii: ρJ = .6, ρGS = .1416, so Gauss–Seidel converges log ρGS / log ρJ = 3.8272 times as fast;
(f) Solution: u = ( 0., .7143, −.1429, −.2857 )^T; spectral radii: ρJ = .4714, ρGS = .3105, so Gauss–Seidel converges log ρGS / log ρJ = 1.5552 times as fast.

♣ 10.5.16. (a) | c | > 2; (b) c > 1.61804; (c) same answer; (d) Gauss–Seidel converges exactly twice as fast since ρGS = ρJ² for all values of c.

♠ 10.5.17. The solution is x = .083799, y = .21648, z = 1.21508. The Jacobi spectral radius is .8166, and so it converges reasonably rapidly to the solution; indeed, after 50 iterations, x^(50) = .0838107, y^(50) = .216476, z^(50) = 1.21514. On the other hand, the Gauss–Seidel spectral radius is 1.0994, and it slowly diverges; after 50 iterations, x^(50) = −30.5295, y^(50) = 9.07764, z^(50) = −90.8959.

♠ 10.5.18. The solution is x = y = z = w = 1. Gauss–Seidel converges, but extremely slowly. Starting with the initial guess x^(0) = y^(0) = z^(0) = w^(0) = 0, after 2000 iterations, the approximate solution x^(2000) = 1.00281, y^(2000) = .99831, z^(2000) = .999286, w^(2000) = 1.00042, is correct to 2 decimal places. The spectral radius is .9969 and so it takes, on average, 741 iterations per decimal place.

10.5.19. ρ(TJ) = 0, while ρ(TGS) = 2. Thus Jacobi converges extremely rapidly, whereas Gauss–Seidel diverges.

♠ 10.5.20. Jacobi doesn’t converge because its spectral radius is 3.4441. Gauss–Seidel converges, but extremely slowly, since its spectral radius is .999958.

♦ 10.5.21. For a general matrix, both Jacobi and Gauss–Seidel require k n (n − 1) multiplications and k n (n − 1) additions to perform k iterations, along with n² divisions to set up the initial matrix T and vector c. They are more efficient than Gaussian Elimination provided the number of steps k < n/3 (approximately).

♣ 10.5.22. (a) Diagonal dominance requires | z | > 4; (b) The solution is u = ( .0115385, −.0294314, −.0755853, .0536789, .31505, .0541806, −.0767559, −.032107, .0140468, .0115385 )^T. It takes 41 Jacobi iterations and 6 Gauss–Seidel iterations to compute the first three decimal places of the solution.
(c) Computing the spectral radius, we conclude that the Jacobi scheme converges to the solution whenever | z | > 3.6387, while the Gauss–Seidel scheme converges for z < −3.6386 or z > 2.

♣ 10.5.23. (a) If λ is an eigenvalue of T = I − A, then µ = 1 − λ is an eigenvalue of A, and hence the eigenvalues of A must satisfy | 1 − µ | < 1, i.e., they all lie within a distance 1 of 1. (b) The Gerschgorin disks are D1 = { | z − .8 | ≤ .2 }, D2 = { | z − 1.5 | ≤ .3 }, D3 = { | z − 1 | ≤ .3 }, and hence all eigenvalues of A are within a distance 1 of 1. Indeed, we can explicitly compute the eigenvalues of A, which are µ1 = 1.5026, µ2 = .8987 + .1469 i, µ3 = .8987 − .1469 i. Hence, the spectral radius of T = I − A is ρ(T) = max { | 1 − µj | } = .5026. Starting the iterations with u^(0) = 0, we arrive at the solution u⋆ = ( 1.36437, −.73836, 1.65329 )^T to 4 decimal places after 13 iterations.

♥ 10.5.24.
(a) u = ( 1.4, .2 )^T.
(b) The spectral radius is ρJ = .40825 and so it takes about −1/log10 ρJ ≈ 2.57 iterations to produce each additional decimal place of accuracy.
(c) The spectral radius is ρGS = .16667 and so it takes about −1/log10 ρGS ≈ 1.29 iterations to produce each additional decimal place of accuracy.
(d) u^(n+1) = $\begin{pmatrix} 1 - \omega & -\frac12\,\omega \\ -\frac13 (1 - \omega)\,\omega & \frac16\,\omega^2 - \omega + 1 \end{pmatrix}$ u^(n) + $\begin{pmatrix} \frac32\,\omega \\ \frac23\,\omega - \frac12\,\omega^2 \end{pmatrix}$.
(e) The SOR spectral radius is minimized when the two eigenvalues of Tω coincide, which occurs when ω⋆ = 1.04555, at which value ρ⋆ = ω⋆ − 1 = .04555, so the optimal SOR method is almost 3.5 times as fast as Jacobi, and about 1.7 times as fast as Gauss–Seidel.
(f) For Jacobi, about −5/log10 ρJ ≈ 13 iterations; for Gauss–Seidel, about −5/log10 ρGS = 7 iterations; for optimal SOR, about −5/log10 ρSOR ≈ 4 iterations.
(g) To obtain 5 decimal place accuracy, Jacobi requires 12 iterations, Gauss–Seidel requires 6 iterations, while optimal SOR requires 5 iterations.
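The optimal parameter in part (e) agrees with the formula ω⋆ = 2/(1 + √(1 − ρJ²)) that these solutions cite as (10.86). The sketch below evaluates that formula and runs a plain SOR sweep. It is an illustration under assumptions: the test system 2x + y = 3, x + 3y = 2 is inferred from the solution ( 1.4, .2 )^T and the spectral radius ρJ = 1/√6 ≈ .40825, not copied from the exercise statement, and the function names are hypothetical.

```python
import math

def optimal_omega(rho_jacobi):
    """SOR parameter 2 / (1 + sqrt(1 - rho_J^2)), the formula cited as (10.86)."""
    return 2.0 / (1.0 + math.sqrt(1.0 - rho_jacobi ** 2))

def sor_solve(A, b, omega, iterations):
    """Successive over-relaxation starting from u = 0."""
    n = len(b)
    u = [0.0] * n
    for _ in range(iterations):
        for i in range(n):
            sigma = sum(A[i][j] * u[j] for j in range(n) if j != i)
            gauss_seidel_value = (b[i] - sigma) / A[i][i]
            # Blend the old entry with the Gauss-Seidel update.
            u[i] = (1.0 - omega) * u[i] + omega * gauss_seidel_value
    return u

A = [[2.0, 1.0], [1.0, 3.0]]   # assumed test system (see lead-in)
b = [3.0, 2.0]
omega = optimal_omega(1.0 / math.sqrt(6.0))
u = sor_solve(A, b, omega, 30)
print(round(omega, 5), [round(x, 5) for x in u])
```

Setting omega = 1 in the same routine gives Gauss–Seidel, so one function covers both schemes compared in parts (c) and (e).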

♣ 10.5.25. (a) x = .5, y = .75, z = .25, w = .5. (b) To obtain 5 decimal place accuracy, Jacobi requires 14 iterations, Gauss–Seidel requires 8 iterations. One can get very good approximations of the spectral radii ρJ = .5, ρGS = .25, by taking ratios of entries of successive iterates, or the ratio of norms of successive error vectors. (c) The optimal SOR scheme has ω = 1.0718, and requires 6 iterations to get 5 decimal place accuracy. The SOR spectral radius is ρSOR = .0718.

♠ 10.5.26. (a) ρJ = (1 + √5)/4 = .809017, ρGS = (3 + √5)/8 = .654508; (b) no; (c) ω⋆ = 1.25962 and ρ⋆ = .25962; (d) The solution is x = ( .8, −.6, .4, −.2 )^T. Jacobi: predicted 44 iterations; actual 45 iterations. Gauss–Seidel: predicted 22 iterations; actual 22 iterations. Optimal SOR: predicted 7 iterations; actual 9 iterations.

♠ 10.5.27. (a) ρJ = (1 + √5)/4 = .809017, ρGS = (3 + √5)/8 = .654508; (b) no; (c) ω⋆ = 1.25962 and ρ⋆ = 1.51315, so SOR with that value of ω doesn’t converge! However, by numerically computing the spectral radius of Tω, the optimal value is found to be ω⋆ = .874785 (under-relaxation), with ρ⋆ = .125215. (d) The solution is x = ( .413793, −.172414, .0689655, −.0344828 )^T. Jacobi: predicted 44 iterations; actual 38 iterations. Gauss–Seidel: predicted 22 iterations; actual 19 iterations. Optimal SOR: predicted 5 iterations; actual 5 iterations.

♠ 10.5.28. (a) The Jacobi iteration matrix TJ = −D^{−1}(L + U) is tridiagonal with all 0’s on the main diagonal and 1/2’s on the sub- and super-diagonals. Thus, using Exercise 8.2.47, ρJ = cos(π/9) < 1, and so Jacobi converges. (b) ω⋆ = 2/(1 + sin(π/9)) = 1.49029. Since ρ⋆ = .490291, it takes log ρ⋆ / log ρJ ≈ 11.5 Jacobi steps per SOR step. (c) The solution is u = ( 8/9, 7/9, 2/3, 5/9, 4/9, 1/3, 2/9, 1/9 )^T = ( .888889, .7778, .6667, .5556, .4444, .3333, .2222, .1111 )^T. Starting with u^(0) = 0, it takes 116 Jacobi iterations versus 13 SOR iterations to achieve 3 place accuracy.

♠ 10.5.29. The optimal value for SOR is ω = 1.80063, with spectral radius ρSOR = .945621. Starting with x^(0) = y^(0) = z^(0) = w^(0) = 0, it takes 191 iterations to obtain 2 decimal place accuracy in the solution. Each additional decimal place requires about −1/log10 ρSOR ≈ 41 iterations, which is about 18 times as fast as Gauss–Seidel.

♠ 10.5.30. The Jacobi and Gauss–Seidel spectral radii are ρJ = √7/3 = .881917, ρGS = 7/9 = .777778, respectively. It takes 99 Jacobi iterations versus 6 Gauss–Seidel iterations to obtain the solution with 5 decimal place accuracy. Using (10.86) fixes the optimal SOR parameter at ω⋆ = 1.35925, with spectral radius ρ⋆ = .359246. However, it takes 16 iterations to obtain the solution with 5 decimal place accuracy, which is significantly slower than Gauss–Seidel, which converges much faster than it should, owing to the particular right hand side of the linear system.

♣ 10.5.31. (a) u = ( .0625, .125, .0625, .125, .375, .125, .0625, .125, .0625 )^T. (b) It takes 11 Jacobi iterations to compute the first two decimal places of the solution, and 17 iterations for 3 place accuracy. (c) It takes 6 Gauss–Seidel iterations to compute the first two decimal places of the solution, and 9 iterations for 3 place accuracy. (d) ρJ = 1/√2, and so, by (10.86), the optimal SOR parameter is ω⋆ = 1.17157. It takes only 4 iterations for 2 decimal place accuracy, and 6 iterations for 3 places.

♣ 10.5.32. Using (10.86), the optimal SOR parameter is ω⋆ = 2/(1 + √(1 − ρJ²)) = 2/(1 + sin( π/(n+1) )). For the n = 5 system, ρJ = √3/2, and ω⋆ = 4/3 with ρ⋆ = 1/3, and the convergence is about 8 times as fast as Jacobi, and 4 times as fast as Gauss–Seidel. For the n = 25 system, ρJ = .992709, and ω⋆ = 1.78486 with ρ⋆ = .78486, and the convergence is about 33 times as fast as Jacobi, and 16.5 times as fast as Gauss–Seidel.

♣ 10.5.33. The Jacobi spectral radius is ρJ = .909657. Using (10.86) to fix the SOR parameter ω = 1.41307 actually slows down the convergence since ρSOR = .509584 while ρGS = .32373. Computing the spectral radius directly, the optimal SOR parameter is ω⋆ = 1.17157 with ρ⋆ = .290435. Thus, optimal SOR is about 13 times as fast as Jacobi, but only marginally faster than Gauss–Seidel.

♦ 10.5.34. The two eigenvalues

$$\lambda_1 = \tfrac18\Bigl(\omega^2 - 8\,\omega + 8 + \omega\sqrt{\omega^2 - 16\,\omega + 16}\,\Bigr), \qquad \lambda_2 = \tfrac18\Bigl(\omega^2 - 8\,\omega + 8 - \omega\sqrt{\omega^2 - 16\,\omega + 16}\,\Bigr)$$

are real for 0 ≤ ω ≤ 8 − 4√3. A graph of the modulus of the eigenvalues over the range 0 ≤ ω ≤ 2 reveals that, as ω increases, the smaller eigenvalue is increasing and the larger decreasing until they meet at 8 − 4√3; after this point, both eigenvalues are complex conjugates of the same modulus. To prove this analytically, we compute

$$\frac{d\lambda_2}{d\omega} = \frac{\omega - 4}{4} - \frac{\omega^2 - 12\,\omega + 8}{4\sqrt{\omega^2 - 16\,\omega + 16}} > 0$$

for 1 < ω < 8 − 4√3, and so the smaller eigenvalue is increasing. Furthermore,

$$\frac{d\lambda_1}{d\omega} = \frac{\omega - 4}{4} + \frac{\omega^2 - 12\,\omega + 8}{4\sqrt{\omega^2 - 16\,\omega + 16}} < 0$$

on the same interval, so the larger eigenvalue is decreasing. Once ω > 8 − 4√3, the eigenvalues are complex conjugates, of equal modulus | λ1 | = | λ2 | = ω − 1 > ω⋆ − 1.

♥ 10.5.35. (a) u^(k+1) = u^(k) + D^{−1} r^(k) = u^(k) − D^{−1} A u^(k) + D^{−1} b = u^(k) − D^{−1}(L + D + U) u^(k) + D^{−1} b = −D^{−1}(L + U) u^(k) + D^{−1} b, which agrees with (10.65).

(b) u^(k+1) = u^(k) + (L + D)^{−1} r^(k) = u^(k) − (L + D)^{−1} A u^(k) + (L + D)^{−1} b = u^(k) − (L + D)^{−1}(L + D + U) u^(k) + (L + D)^{−1} b = −(L + D)^{−1} U u^(k) + (L + D)^{−1} b, which agrees with (10.71).

(c) u^(k+1) = u^(k) + (ω L + D)^{−1} r^(k) = u^(k) − (ω L + D)^{−1} A u^(k) + (ω L + D)^{−1} b = u^(k) − (ω L + D)^{−1}(L + D + U) u^(k) + (ω L + D)^{−1} b = −(ω L + D)^{−1} ( (1 − ω) D + U ) u^(k) + (ω L + D)^{−1} b, which agrees with (10.80).

(d) If u⋆ is the exact solution, so A u⋆ = b, then r^(k) = A(u⋆ − u^(k)) and so ‖u^(k) − u⋆‖ ≤ ‖A^{−1}‖ ‖r^(k)‖. Thus, if ‖r^(k)‖ is small, the iterate u^(k) is close to the solution u⋆ provided ‖A^{−1}‖ is not too large. For instance, if A = $\begin{pmatrix} 1 & 0 \\ 0 & .0001 \end{pmatrix}$ and b = ( 1, 0 )^T, then x = ( 1, 100 )^T has residual r = b − A x = ( 0, −.01 )^T, even though x is nowhere near the exact solution x⋆ = ( 1, 0 )^T.

10.5.36. Note that the iteration matrix is T = I − ε A, which has eigenvalues 1 − ε λj. When 0 < ε < 2/λ1, the iterations converge. The optimal value is ε = 2/(λ1 + λn), with spectral radius ρ(T) = (λ1 − λn)/(λ1 + λn).

10.5.37. In each solution, the last u_k is the actual solution, with residual r_k = f − K u_k = 0.

(a) r0 = ( 2, 1 )^T, u1 = ( .76923, .38462 )^T, r1 = ( .07692, −.15385 )^T, u2 = ( .78571, .35714 )^T;

(b) r0 = ( 1, 0, −2 )^T, u1 = ( .5, 0, −1 )^T, r1 = ( −1, −2, −.5 )^T, u2 = ( .51814, −.72539, −1.94301 )^T, r2 = ( 1.28497, −.80311, .64249 )^T, u3 = ( 1., −1.4, −2.2 )^T;

(c) r0 = ( −1, −2, 7 )^T, u1 = ( −.13466, −.26933, .94264 )^T, r1 = ( 2.36658, −4.01995, −.81047 )^T, u2 = ( −.13466, −.26933, .94264 )^T, r2 = ( .72321, .38287, .21271 )^T, u3 = ( .33333, −1.00000, 1.33333 )^T;

(d) r0 = ( 1, 2, 0, −1 )^T, u1 = ( .2, .4, 0, −.2 )^T, r1 = ( 1.2, −.8, −.8, −.4 )^T, u2 = ( .90654, .46729, −.33645, −.57009 )^T, r2 = ( −1.45794, −.59813, −.26168, −2.65421 )^T, u3 = ( 4.56612, .40985, −2.92409, −5.50820 )^T, r3 = ( −1.36993, 1.11307, −3.59606, .85621 )^T, u4 = ( 9.50, 1.25, −10.25, −13.00 )^T;

(e) r0 = ( 4, 0, 0, 0 )^T, u1 = ( .8, 0, 0, 0 )^T, r1 = ( 0., −.8, −.8, −.8 )^T, u2 = ( .875, −.125, −.125, −.125 )^T.

♣ 10.5.38. Remarkably, after only two iterations, the method finds the exact solution: u3 = u⋆ = ( .0625, .125, .0625, .125, .375, .125, .0625, .125, .0625 )^T, and hence the convergence is dramatically faster than the other iterative methods.

♣ 10.5.39. (a) n = 5: b = ( 2.28333, 1.45, 1.09286, .884524, .745635 )^T; n = 10: b = ( 2.92897, 2.01988, 1.60321, 1.3468, 1.16823, 1.0349, .930729, .846695, .777251, .718771 )^T; n = 30: b = ( 3.99499, 3.02725, 2.5585, 2.25546, 2.03488, 1.86345, 1.72456, 1.60873, 1.51004, 1.42457, 1.34957, 1.28306, 1.22353, 1.16986, 1.12116, 1.07672, 1.03596, .998411, .963689, .931466, .901466, .873454, .847231, .82262, .799472, .777654, .75705, .737556, .719084, .70155 )^T; (b) For regular Gaussian Elimination, using standard arithmetic in Mathematica, the maximal error, i.e., ∞ norm of the difference between the computed solution and u⋆, is: n = 5: 1.96931 × 10^−12; n = 10: 5.31355 × 10^−4; n = 30: 457.413. (c) Pivoting has little effect for n = 5, 10, but for n = 30 the error is reduced to 5.96011 × 10^−4. (d) Using the conjugate gradient algorithm all the way to completion results in the following errors: n = 5: 3.56512 × 10^−3; n = 10: 5.99222 × 10^−4; n = 30: 1.83103 × 10^−4, and so, at least for moderate values of n, it outperforms Gaussian Elimination with pivoting.

10.5.40. r0 = ( −2, 1, 1 )^T, u1 = ( .9231, −.4615, −.4615 )^T, r1 = ( .3077, 2.3846, −1.7692 )^T, u2 = ( 2.7377, −3.0988, −.2680 )^T, r2 = ( 5.5113, 4.6348, .7262 )^T, u3 = ( 7.2033, −9.1775, −4.3823 )^T, but the solution is u = ( 1, −1, 1 )^T. The problem is that the coefficient matrix is not positive definite, and so the fact that the solution is “orthogonal” to the conjugate vectors does not uniquely specify it.

10.5.41. False. For example, consider the homogeneous system K u = 0 where K = $\begin{pmatrix} .0001 & 0 \\ 0 & 1 \end{pmatrix}$, with solution u⋆ = 0. The residual for u = ( 1, 0 )^T is r = −K u = ( −.0001, 0 )^T with ‖r‖ = .0001, yet not even the leading digit of u agrees with the true solution. In general, if u⋆ is the true solution to K u = f, then the residual is r = f − K u = K(u⋆ − u), and hence ‖u⋆ − u‖ ≤ ‖K^{−1}‖ ‖r‖, so the result is valid only when ‖K^{−1}‖ ≤ 1.

10.5.42. Referring to the pseudocode program in the text, at each step, to compute

r_k = f − K u_k requires n² multiplications and n² additions;

‖r_k‖² requires n multiplications and n − 1 additions;

v_{k+1} = r_k + ( ‖r_k‖² / ‖r_{k−1}‖² ) v_k requires n + 1 multiplications and n additions, since ‖r_{k−1}‖² was already computed in the previous iteration (but when k = 0 this step requires no work);

v_{k+1}^T K v_{k+1} requires n² + n multiplications and n² − 1 additions;

u_{k+1} = u_k + ( ‖r_k‖² / (v_{k+1}^T K v_{k+1}) ) v_{k+1} requires n + 1 multiplications and n additions;

for a grand total of 2 (n + 1)² ≈ 2 n² multiplications and 2 n² + 3 n − 2 ≈ 2 n² additions.

Thus, if the number of steps k < n/6 (approximately), the conjugate gradient method is more efficient than Gaussian Elimination, which requires n³/3 operations of each type.
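The steps counted in 10.5.42 translate directly into code. The following is a minimal sketch of the conjugate gradient iteration, written to mirror those steps; it is not the pseudocode program from the text. The test system K = [[3, −1], [−1, 5]], f = ( 2, 1 )^T is an assumption: it was inferred from the iterates listed in 10.5.37(a), which it reproduces. Helper names are hypothetical.

```python
def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def matvec(K, x):
    return [dot(row, x) for row in K]

def conjugate_gradient(K, f, steps):
    """Return the list of iterates u_1, u_2, ... starting from u_0 = 0.

    K is assumed symmetric positive definite.
    """
    n = len(f)
    u = [0.0] * n
    v = [0.0] * n
    rr_old = None
    iterates = []
    for _ in range(steps):
        r = [fi - ki for fi, ki in zip(f, matvec(K, u))]   # r_k = f - K u_k
        rr = dot(r, r)                                     # ||r_k||^2
        if rr < 1e-30:                                     # already converged
            break
        beta = 0.0 if rr_old is None else rr / rr_old
        v = [ri + beta * vi for ri, vi in zip(r, v)]       # v_{k+1}
        t = rr / dot(v, matvec(K, v))                      # step length
        u = [ui + t * vi for ui, vi in zip(u, v)]          # u_{k+1}
        rr_old = rr
        iterates.append(u)
    return iterates

K = [[3.0, -1.0], [-1.0, 5.0]]   # assumed system, see lead-in
f = [2.0, 1.0]
for u in conjugate_gradient(K, f, 2):
    print([round(x, 5) for x in u])
```

For an n × n positive definite system the exact solution is reached in at most n steps in exact arithmetic, which is why two steps suffice for this 2 × 2 example.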

♦ 10.5.43. $t_k = \dfrac{\|r_k\|^2}{r_k^T K\, r_k} = \dfrac{u_k^T K^2 u_k - 2\, f^T K u_k + \|f\|^2}{u_k^T K^3 u_k - 2\, f^T K^2 u_k + f^T K f}$.

♠ 10.6.1. In all cases, we use the normalized version (10.101) starting with u^(0) = e1; the answers are correct to 4 decimal places. (a) After 17 iterations, λ = 2.00002, u = ( −.55470, .83205 )^T; (b) after 26 iterations, λ = −3.00003, u = ( .70711, .70710 )^T; (c) after 38 iterations, λ = 3.99996, u = ( .57737, −.57735, .57734 )^T; (d) after 121 iterations, λ = −3.30282, u = ( .35356, .81416, −.46059 )^T; (e) after 36 iterations, λ = 5.54911, u = ( −.39488, .71005, .58300 )^T; (f) after 9 iterations, λ = 5.23607, u = ( .53241, .53241, .65810 )^T; (g) after 36 iterations, λ = 3.61800, u = ( .37176, −.60151, .60150, −.37174 )^T; (h) after 30 iterations, λ = 5.99997, u = ( .50001, .50000, .50000, .50000 )^T.

♠ 10.6.2. For n = 10, it takes 159 iterations to obtain λ1 = 3.9189 = 2 + 2 cos(π/11) to 4 decimal places; for n = 20, it takes 510 iterations to obtain λ1 = 3.9776 = 2 + 2 cos(π/21) to 4 decimal places; for n = 50, it takes 2392 iterations to obtain λ1 = 3.9962 = 2 + 2 cos(π/51) to 4 decimal places.

♠ 10.6.3. In each case, to find the dominant singular value of a matrix A, we apply the power method to K = A^T A and take the square root of its dominant eigenvalue to find the dominant singular value σ1 = √λ1 of A.
(a) K = $\begin{pmatrix} 2 & -1 \\ -1 & 13 \end{pmatrix}$; after 11 iterations, λ1 = 13.0902 and σ1 = 3.6180;
(b) K = $\begin{pmatrix} 8 & -4 & -4 \\ -4 & 10 & 2 \\ -4 & 2 & 2 \end{pmatrix}$; after 15 iterations, λ1 = 14.4721 and σ1 = 3.8042;
(c) K = $\begin{pmatrix} 5 & 2 & 2 & -1 \\ 2 & 8 & 2 & -4 \\ 2 & 2 & 1 & -1 \\ -1 & -4 & -1 & 2 \end{pmatrix}$; after 16 iterations, λ1 = 11.6055 and σ1 = 3.4067;
(d) K = $\begin{pmatrix} 14 & -1 & 1 \\ -1 & 6 & -6 \\ 1 & -6 & 6 \end{pmatrix}$; after 39 iterations, λ1 = 14.7320 and σ1 = 3.8382.
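The computations in 10.6.1 through 10.6.3 all rest on the normalized power iteration u^(k+1) = A u^(k) / ‖A u^(k)‖. Below is a minimal sketch of that iteration, applied to the matrix K of 10.6.3(a). Estimating the eigenvalue by the Rayleigh quotient u · A u is an assumption of this sketch (appropriate for symmetric matrices); the text's own stopping rule and eigenvalue estimate may differ, so iteration counts need not match those quoted above.

```python
import math

def power_method(A, iterations):
    """Normalized power iteration started from e1.

    Returns (eigenvalue estimate, unit vector). The eigenvalue estimate is
    the Rayleigh quotient u . A u, an assumption made for this sketch.
    """
    n = len(A)
    u = [1.0] + [0.0] * (n - 1)
    lam = 0.0
    for _ in range(iterations):
        w = [sum(A[i][j] * u[j] for j in range(n)) for i in range(n)]
        lam = sum(ui * wi for ui, wi in zip(u, w))
        norm = math.sqrt(sum(wi * wi for wi in w))
        u = [wi / norm for wi in w]
    return lam, u

K = [[2.0, -1.0], [-1.0, 13.0]]        # K = A^T A from 10.6.3(a)
lam1, u1 = power_method(K, 60)
sigma1 = math.sqrt(lam1)               # dominant singular value of A
print(round(lam1, 4), round(sigma1, 4))
```

The error shrinks roughly by the factor | λ2/λ1 | per step, so matrices with well separated eigenvalues, such as this one, converge quickly.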


10.6.4. Since v^(k) → c1 λ1^k v1 as k → ∞,

$$u^{(k)} = \frac{v^{(k)}}{\|v^{(k)}\|} \;\longrightarrow\; \frac{c_1 \lambda_1^k\, v_1}{|c_1|\,|\lambda_1|^k\,\|v_1\|} = \begin{cases} u_1, & \lambda_1 > 0, \\ (-1)^k u_1, & \lambda_1 < 0, \end{cases} \qquad \text{where} \qquad u_1 = \operatorname{sign} c_1\, \frac{v_1}{\|v_1\|}$$

is one of the two real unit eigenvectors. Moreover,

$$A\, u^{(k)} \;\longrightarrow\; \begin{cases} \lambda_1 u_1, & \lambda_1 > 0, \\ (-1)^k \lambda_1 u_1, & \lambda_1 < 0, \end{cases}$$

so ‖A u^(k)‖ → | λ1 |. If λ1 > 0, the iterates u^(k) → u1 converge to one of the two dominant unit eigenvectors, whereas if λ1 < 0, the iterates u^(k) → (−1)^k u1 switch back and forth between the two real unit eigenvectors.

♦ 10.6.5.
(a) If A v = λ v then A^{−1} v = (1/λ) v, and so v is also the eigenvector of A^{−1}.
(b) If λ1, . . . , λn are the eigenvalues of A, with | λ1 | > | λ2 | > · · · > | λn | > 0 (recalling that 0 cannot be an eigenvalue if A is nonsingular), then 1/λ1, . . . , 1/λn are the eigenvalues of A^{−1}, and 1/| λn | > 1/| λn−1 | > · · · > 1/| λ1 |, and so 1/λn is the dominant eigenvalue of A^{−1}. Thus, applying the power method to A^{−1} will produce the reciprocal of the smallest (in absolute value) eigenvalue of A and its corresponding eigenvector.
(c) The rate of convergence of the algorithm is the ratio | λn/λn−1 | of the moduli of the smallest two eigenvalues.
(d) Once we factor P A = L U, we can solve the iteration equation A u^(k+1) = u^(k) by rewriting it in the form L U u^(k+1) = P u^(k), and then using Forward and Back Substitution to solve for u^(k+1). As we know, this is much faster than computing A^{−1}.
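The inverse power method of 10.6.5 can be sketched as follows. Each step solves a linear system instead of forming A^{−1}, in the spirit of part (d). For brevity this sketch re-runs Gaussian elimination with partial pivoting at every step rather than factoring P A = L U once; that simplification, the optional shift parameter mu (mu = 0 gives the plain inverse method), the Rayleigh-quotient eigenvalue estimate, and the test matrix are all assumptions of the example. Function names are hypothetical.

```python
import math

def solve(M, b):
    """Solve M x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    a = [row[:] + [b[i]] for i, row in enumerate(M)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(col + 1, n):
            m = a[r][col] / a[col][col]
            for k in range(col, n + 1):
                a[r][k] -= m * a[col][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = a[i][n] - sum(a[i][j] * x[j] for j in range(i + 1, n))
        x[i] = s / a[i][i]
    return x

def inverse_power_method(A, iterations, mu=0.0):
    """Power method applied to (A - mu I)^{-1}, started from e1.

    Returns the eigenvalue of A closest to mu and a unit eigenvector.
    """
    n = len(A)
    M = [[A[i][j] - (mu if i == j else 0.0) for j in range(n)] for i in range(n)]
    u = [1.0] + [0.0] * (n - 1)
    nu = 1.0
    for _ in range(iterations):
        w = solve(M, u)                            # w = (A - mu I)^{-1} u
        nu = sum(ui * wi for ui, wi in zip(u, w))  # estimate for the inverse
        norm = math.sqrt(sum(wi * wi for wi in w))
        u = [wi / norm for wi in w]
    return mu + 1.0 / nu, u

A = [[2.0, -1.0], [-1.0, 13.0]]    # hypothetical symmetric test matrix
smallest, _ = inverse_power_method(A, 60)
nearest_to_12, _ = inverse_power_method(A, 60, mu=12.0)
print(round(smallest, 5), round(nearest_to_12, 5))
```

In a production setting one would factor the matrix once and reuse the factors, exactly as part (d) recommends.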

♠ 10.6.6. (a) After 15 iterations, we obtain λ = .99998, u = ( .70711, −.70710 )^T; (b) after 24 iterations, we obtain λ = −1.99991, u = ( −.55469, −.83206 )^T; (c) after 12 iterations, we obtain λ = 1.00001, u = ( .40825, .81650, .40825 )^T; (d) after 6 iterations, we obtain λ = .30277, u = ( .35355, −.46060, .81415 )^T; (e) after 7 iterations, we obtain λ = −.88536, u = ( −.88751, −.29939, .35027 )^T; (f) after 7 iterations, we obtain λ = .76393, u = ( .32348, .25561, −.91106 )^T; (g) after 11 iterations, we obtain λ = .38197, u = ( .37175, .60150, .60150, .37175 )^T; (h) after 16 iterations, we obtain λ = 2.00006, u = ( .500015, −.50000, .499985, −.50000 )^T.

♦ 10.6.7. (a) According to Exercises 8.2.19, 8.2.24, if A has eigenvalues λ1, . . . , λn, then (A − µ I)^{−1} has eigenvalues νi = 1/(λi − µ). Thus, applying the power method to (A − µ I)^{−1} will produce its dominant eigenvalue ν⋆, for which | λ⋆ − µ | is the smallest. We then recover the eigenvalue λ⋆ = µ + 1/ν⋆ of A which is closest to µ. (b) The rate of convergence is the ratio | (λ⋆ − µ)/(λ⋆⋆ − µ) | of the moduli of the smallest two eigenvalues of the shifted matrix. (c) µ is an eigenvalue of A if and only if A − µ I is a singular matrix, and hence one cannot implement the method. Also, choosing µ too close to an eigenvalue will result in an ill-conditioned matrix, and so the algorithm may not converge properly.

♠ 10.6.8. (a) After 11 iterations, we obtain ν⋆ = 2.00002, so λ⋆ = 1.0000, u = ( .70711, −.70710 )^T;

(b) after 27 iterations, we obtain ν⋆ = −.40003, so λ⋆ = −1.9998, u = ( .55468, .83207 )^T; (c) after 10 iterations, we obtain ν⋆ = 2.00000, so λ⋆ = 1.00000, u = ( .40825, .81650, .40825 )^T; (d) after 7 iterations, we obtain ν⋆ = −5.07037, so λ⋆ = .30278, u = ( −.35355, .46060, −.81415 )^T; (e) after 8 iterations, we obtain ν⋆ = .72183, so λ⋆ = −.88537, u = ( .88753, .29937, −.35024 )^T; (f) after 6 iterations, we obtain ν⋆ = 3.78885, so λ⋆ = .76393, u = ( .28832, .27970, −.91577 )^T; (g) after 9 iterations, we obtain ν⋆ = −8.47213, so λ⋆ = .38197, u = ( −.37175, −.60150, −.60150, −.37175 )^T; (h) after 14 iterations, we obtain ν⋆ = .66665, so λ⋆ = 2.00003, u = ( .50001, −.50000, .49999, −.50000 )^T.

♠ 10.6.9. (i) First, compute the dominant eigenvalue λ1 and eigenvector v1 using the power method. Then set B = A − λ1 v1 b^T, where b is any vector such that b · v1 = 1, e.g., b = v1/‖v1‖². According to Exercise 8.2.52, B has eigenvalues 0, λ2, . . . , λn, and corresponding eigenvectors v1 and wj = vj − cj v1, where cj = b · vj /λj for j ≥ 1. Thus, applying the power method to B will produce the subdominant eigenvalue λ2 and the eigenvector w2 of the deflated matrix B, from which v2 can be reconstructed using the preceding formula.
(ii) In all cases, we use the normalized version (10.101) starting with u^(0) = e1; the answers are correct to 4 decimal places.
(a) Using the computed values λ1 = 2., v1 = ( −.55470, .83205 )^T, the deflated matrix is B = $\begin{pmatrix} -1.61538 & -1.07692 \\ 3.92308 & 2.61538 \end{pmatrix}$; it takes only 3 iterations to produce λ2 = 1.00000, v2 = ( −.38075, .924678 )^T.
(b) Using λ1 = −3., v1 = ( .70711, .70711 )^T, the deflated matrix is B = $\begin{pmatrix} -3.5 & 3.5 \\ -1.5 & 1.5 \end{pmatrix}$; it takes 3 iterations to produce λ2 = −2.00000.
(c) Using λ1 = 4., v1 = ( .57737, −.57735, .57735 )^T, the deflated matrix is B = $\begin{pmatrix} 1.66667 & .333333 & -1.33333 \\ .333333 & .666667 & .333333 \\ -1.33333 & .333333 & 1.66667 \end{pmatrix}$; it takes 11 iterations to produce λ2 = 2.99999.
(d) Using λ1 = −3.30278, v1 = ( −.35355, −.81415, .46060 )^T, the deflated matrix is B = $\begin{pmatrix} -1.5872 & .95069 & .46215 \\ -2.04931 & .18924 & -1.23854 \\ -2.53785 & 3.76146 & 4.70069 \end{pmatrix}$; it takes 10 iterations to produce λ2 = 3.00000.
(e) Using λ1 = 5.54913, v1 = ( −.39488, .71006, .58300 )^T, the deflated matrix is B = $\begin{pmatrix} -1.86527 & -.44409 & -.722519 \\ 2.55591 & -.797798 & 2.70287 \\ .277481 & 1.70287 & -1.88606 \end{pmatrix}$; it takes 13 iterations to produce λ2 = −3.66377.
(f) Using λ1 = 5.23607, v1 = ( 0.53241, 0.53241, 0.65809 )^T, the deflated matrix is B = $\begin{pmatrix} .5158 & .5158 & -.8346 \\ -.4842 & 1.5158 & -.8346 \\ .1654 & .1654 & -.2677 \end{pmatrix}$; it takes 36 iterations to produce λ2 = 1.00003.
(g) Using λ1 = 3.61803, v1 = ( .37176, −.60151, .60150, −.37174 )^T, the deflated matrix is B = $\begin{pmatrix} 1.5 & -.19098 & -.80902 & .5000 \\ -.19098 & .69098 & .30902 & -.80902 \\ -.80902 & .30902 & .69098 & -.19098 \\ .5000 & -.80902 & -.19098 & 1.5000 \end{pmatrix}$; it takes 18 iterations to produce λ2 = 2.61801.
(h) Using λ1 = 6., v1 = ( .5, .5, .5, .5 )^T, the deflated matrix is B = $\begin{pmatrix} 2.5 & -.5 & -1.5 & -.5 \\ -.5 & 2.5 & -.5 & -1.5 \\ -1.5 & -.5 & 2.5 & -.5 \\ -.5 & -1.5 & -.5 & 2.5 \end{pmatrix}$; it takes 17 iterations to produce λ2 = 3.99998.

10.6.10. That A is a singular matrix and 0 is an eigenvalue. The corresponding eigenvectors are the nonzero elements of ker A. In fact, assuming u^(k) ≠ 0, the iterates u^(0), . . . , u^(k) form a Jordan chain for the zero eigenvalue. To find other eigenvalues and eigenvectors, you need to try a different initial vector u^(0).

10.6.11.
(a) Eigenvalues: 6.7016, .2984; eigenvectors: ( .3310, .9436 )^T, ( .9436, −.3310 )^T.
(b) Eigenvalues: 5.4142, 2.5858; eigenvectors: ( −.3827, .9239 )^T, ( .9239, .3827 )^T.
(c) Eigenvalues: 4.7577, 1.9009, −1.6586; eigenvectors: ( .2726, .7519, .6003 )^T, ( .9454, −.0937, −.3120 )^T, ( −.1784, .6526, −.7364 )^T.
(d) Eigenvalues: 7.0988, 2.7191, −4.8180; eigenvectors: ( .6205, .6328, −.4632 )^T, ( −.5439, −.0782, −.8355 )^T, ( .5649, −.7704, −.2956 )^T.
(e) Eigenvalues: 4.6180, 3.6180, 2.3820, 1.3820; eigenvectors: ( −.3717, .6015, −.6015, .3717 )^T, ( −.6015, .3717, .3717, −.6015 )^T, ( −.6015, −.3717, .3717, .6015 )^T, ( .3717, .6015, .6015, .3717 )^T.
(f) Eigenvalues: 8.6091, 6.3083, 4.1793, 1.9033; eigenvectors: ( −.3182, −.9310, −.1008, .1480 )^T, ( .8294, −.2419, −.4976, −.0773 )^T, ( .4126, −.1093, .6419, .6370 )^T, ( −.2015, .2507, −.5746, .7526 )^T.

10.6.12. The iterates converge to the diagonal matrix An → $\begin{pmatrix} 6 & 0 & 0 \\ 0 & 9 & 0 \\ 0 & 0 & 3 \end{pmatrix}$. The eigenvalues appear along the diagonal, but not in decreasing order, because, when the eigenvalues are listed in decreasing order, the corresponding eigenvector matrix S = $\begin{pmatrix} 0 & 1 & 1 \\ 1 & -1 & 1 \\ -2 & -1 & 1 \end{pmatrix}$ (or, rather, its transpose) is not regular, and so Theorem 10.57 does not apply.
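The basic Q R algorithm used in these exercises factors A_k = Q_k R_k and then sets A_{k+1} = R_k Q_k. The sketch below implements it with Gram–Schmidt orthogonalization. It is an illustration only: the 2 × 2 symmetric test matrix is hypothetical (not taken from an exercise), the factorization assumes a nonsingular matrix, and the function names are made up for this example.

```python
import math

def qr_gram_schmidt(A):
    """Factor a nonsingular square matrix as A = Q R via Gram-Schmidt."""
    n = len(A)
    columns = [[A[i][j] for i in range(n)] for j in range(n)]
    q_columns = []
    R = [[0.0] * n for _ in range(n)]
    for j, a in enumerate(columns):
        w = a[:]
        for i, q in enumerate(q_columns):
            R[i][j] = sum(qk * wk for qk, wk in zip(q, w))
            w = [wk - R[i][j] * qk for wk, qk in zip(w, q)]
        R[j][j] = math.sqrt(sum(wk * wk for wk in w))
        q_columns.append([wk / R[j][j] for wk in w])
    Q = [[q_columns[j][i] for j in range(n)] for i in range(n)]
    return Q, R

def matmul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def qr_algorithm(A, iterations):
    """Iterate A <- R Q; for suitable A the diagonal tends to the eigenvalues."""
    for _ in range(iterations):
        Q, R = qr_gram_schmidt(A)
        A = matmul(R, Q)
    return A

A = [[3.0, 1.0], [1.0, 5.0]]      # hypothetical test matrix, eigenvalues 4 ± sqrt(2)
A_limit = qr_algorithm(A, 60)
print([round(A_limit[i][i], 4) for i in range(2)])
```

Applying the same routine to the permutation matrix with rows ( 0, 1 ) and ( 1, 0 ) returns the matrix unchanged, since Q = A and R = I; eigenvalues of equal magnitude defeat the unshifted iteration.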

10.6.13. (a) Eigenvalues: 2, 1; eigenvectors: ( −2, 3 )^T, ( −1, 1 )^T.
(b) Eigenvalues: 1.2087, 5.7913; eigenvectors: ( −.9669, .2550 )^T, ( −.6205, −.7842 )^T.
(c) Eigenvalues: 3.5842, −2.2899, 1.7057; eigenvectors: ( −.4466, −.7076, .5476 )^T, ( .1953, −.8380, −.5094 )^T, ( .7491, −.2204, .6247 )^T.
(d) Eigenvalues: 7.7474, −.2995, −3.4479; eigenvectors: ( −.4697, −.3806, −.7966 )^T, ( −.7799, .2433, .5767 )^T, ( .6487, −.7413, .1724 )^T.
(e) Eigenvalues: 18.3344, 4.2737, 0, −1.6081; eigenvectors: ( .4136, .8289, .2588, .2734 )^T, ( −.4183, .9016, −.0957, .0545 )^T, ( −.5774, −.5774, .5774, 0 )^T, ( −.2057, .4632, −.6168, .6022 )^T.

10.6.14. Yes. After 10 iterations, one finds R10 = $\begin{pmatrix} 2.0011 & 1.4154 & 4.8983 \\ 0 & .9999 & -.0004 \\ 0 & 0 & .9995 \end{pmatrix}$, S10 = $\begin{pmatrix} -.5773 & .4084 & .7071 \\ -.5774 & .4082 & -.7071 \\ -.5774 & -.8165 & .0002 \end{pmatrix}$, so the diagonal entries of R10 give the eigenvalues correct to 3 decimal places, and the columns of S10 are similar approximations to the orthonormal eigenvector basis.

10.6.15. It has eigenvalues ± 1, which have the same magnitude. The Q R factorization is trivial, with Q = A and R = I. Thus, R Q = A, and so nothing happens.

♦ 10.6.16. This follows directly from Exercise 8.2.23.

♦ 10.6.17.
(a) By induction, if Ak = Qk Rk = Rk^T Qk^T = Ak^T, then, since Qk is orthogonal, A_{k+1}^T = (Rk Qk)^T = Qk^T Rk^T = Qk^T Rk^T Qk^T Qk = Qk^T Qk Rk Qk = Rk Qk = A_{k+1}, proving symmetry of A_{k+1}. Again, proceeding by induction, if Ak = Qk Rk is tridiagonal, then its j th column is a linear combination of the standard basis vectors e_{j−1}, e_j, e_{j+1}. By the Gram–Schmidt formulas (5.19), the j th column of Qk is a linear combination of the first j columns of Ak, and hence is a linear combination of e1, . . . , e_{j+1}. Thus, all entries below the sub-diagonal of Qk are zero. Since Rk is upper triangular, this implies all entries of A_{k+1} = Rk Qk lying below the sub-diagonal are also zero. But we already proved that A_{k+1} is symmetric, and hence this implies all entries lying above the super-diagonal are also 0, which implies A_{k+1} is tridiagonal.
(b) The result does not hold if A is only tridiagonal and not symmetric. For example, when A = $\begin{pmatrix} 1 & -1 & 0 \\ 1 & 1 & 1 \\ 0 & 1 & 1 \end{pmatrix}$, then Q = $\begin{pmatrix} \frac{1}{\sqrt2} & -\frac{1}{\sqrt3} & \frac{1}{\sqrt6} \\ \frac{1}{\sqrt2} & \frac{1}{\sqrt3} & -\frac{1}{\sqrt6} \\ 0 & \frac{1}{\sqrt3} & \frac{2}{\sqrt6} \end{pmatrix}$, R = $\begin{pmatrix} \sqrt2 & 0 & \frac{1}{\sqrt2} \\ 0 & \sqrt3 & \frac{2}{\sqrt3} \\ 0 & 0 & \frac{1}{\sqrt6} \end{pmatrix}$, and A1 = R Q = $\begin{pmatrix} 1 & -\frac{1}{\sqrt6} & \frac{2}{\sqrt3} \\ \frac{\sqrt3}{\sqrt2} & \frac53 & \frac{1}{3\sqrt2} \\ 0 & \frac{1}{3\sqrt2} & \frac13 \end{pmatrix}$, which is not tridiagonal.

10.6.18.
(a) H = $\begin{pmatrix} 1 & 0 & 0 \\ 0 & -.9615 & .2747 \\ 0 & .2747 & .9615 \end{pmatrix}$, T = H A H = $\begin{pmatrix} 8.0000 & 7.2801 & 0 \\ 7.2801 & 20.0189 & 3.5660 \\ 0 & 3.5660 & 4.9811 \end{pmatrix}$.

(b) H1 = $\begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & -.4082 & .8165 & -.4082 \\ 0 & .8165 & .5266 & .2367 \\ 0 & -.4082 & .2367 & .8816 \end{pmatrix}$, T1 = H1 A H1 = $\begin{pmatrix} 5.0000 & -2.4495 & 0 & 0 \\ -2.4495 & 3.8333 & 1.3865 & .9397 \\ 0 & 1.3865 & 6.2801 & -.9566 \\ 0 & .9397 & -.9566 & 6.8865 \end{pmatrix}$,
H2 = $\begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & -.8278 & -.5610 \\ 0 & 0 & -.5610 & .8278 \end{pmatrix}$, T = H2 T1 H2 = $\begin{pmatrix} 5.0000 & -2.4495 & 0 & 0 \\ -2.4495 & 3.8333 & -1.6750 & 0 \\ 0 & -1.6750 & 5.5825 & .0728 \\ 0 & 0 & .0728 & 7.5842 \end{pmatrix}$.

(c) H1 = $\begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 0 & .7071 & -.7071 \\ 0 & .7071 & .5000 & .5000 \\ 0 & -.7071 & .5000 & .5000 \end{pmatrix}$, T1 = H1 A H1 = $\begin{pmatrix} 4.0000 & -1.4142 & 0 & 0 \\ -1.4142 & 2.5000 & .1464 & -.8536 \\ 0 & .1464 & 1.0429 & .7500 \\ 0 & -.8536 & .7500 & 2.4571 \end{pmatrix}$,
H2 = $\begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & -.1691 & .9856 \\ 0 & 0 & .9856 & .1691 \end{pmatrix}$, T = H2 T1 H2 = $\begin{pmatrix} 4.0000 & -1.4142 & 0 & 0 \\ -1.4142 & 2.5000 & -.8660 & 0 \\ 0 & -.8660 & 2.1667 & .9428 \\ 0 & 0 & .9428 & 1.3333 \end{pmatrix}$.
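Householder tridiagonalization of a symmetric matrix, as used in these exercises, can be sketched in a few lines. For each column j, the code takes the part x of the column below the diagonal, forms v = x + sign(x1) ‖x‖ e1 (padded with zeros), and applies the reflection H = I − 2 v v^T/(v · v) on both sides. The sign choice and the use of full n × n reflection matrices are assumptions made for clarity, not efficiency. The test matrix is the K = A^T A that appears in Exercise 10.6.20 of these solutions; the function names are hypothetical.

```python
import math

def matmul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def householder_tridiagonalize(A):
    """Reduce a symmetric matrix to tridiagonal form T = H A H (repeated)."""
    n = len(A)
    T = [row[:] for row in A]
    for j in range(n - 2):
        x = [T[i][j] for i in range(j + 1, n)]
        norm_x = math.sqrt(sum(xi * xi for xi in x))
        if norm_x == 0.0:
            continue                      # column is already in the right form
        v = [0.0] * n
        for i, xi in enumerate(x):
            v[j + 1 + i] = xi
        v[j + 1] += math.copysign(norm_x, x[0])
        vv = sum(vi * vi for vi in v)
        # Elementary reflection matrix H = I - 2 v v^T / (v . v).
        H = [[(1.0 if r == c else 0.0) - 2.0 * v[r] * v[c] / vv
              for c in range(n)] for r in range(n)]
        T = matmul(H, matmul(T, H))
    return T

K = [[5.0, 2.0, 2.0, -1.0],
     [2.0, 9.0, 0.0, -6.0],
     [2.0, 0.0, 5.0, 3.0],
     [-1.0, -6.0, 3.0, 6.0]]
T = householder_tridiagonalize(K)
for row in T:
    print([round(entry, 4) for entry in row])
```

Because each H is orthogonal and symmetric, T has the same eigenvalues as K, and in particular the same trace.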

♠ 10.6.19. (a) Eigenvalues: 24, 6, 3; (b) eigenvalues: 7.6180, 7.5414, 5.3820, 1.4586; (c) eigenvalues: 4.9354, 3.0000, 1.5374, .5272.

♠ 10.6.20. The singular values are the square roots of the non-zero eigenvalues of

K = A^T A = $\begin{pmatrix} 5 & 2 & 2 & -1 \\ 2 & 9 & 0 & -6 \\ 2 & 0 & 5 & 3 \\ -1 & -6 & 3 & 6 \end{pmatrix}$.

Applying the tridiagonalization algorithm, we find

H1 = $\begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & -.6667 & -.6667 & .3333 \\ 0 & -.6667 & .7333 & .1333 \\ 0 & .3333 & .1333 & .9333 \end{pmatrix}$, A1 = $\begin{pmatrix} 5.0000 & -3.0000 & 0 & 0 \\ -3.0000 & 8.2222 & 4.1556 & .7556 \\ 0 & 4.1556 & 8.4489 & 4.8089 \\ 0 & .7556 & 4.8089 & 3.3289 \end{pmatrix}$,

H2 = $\begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & -.9839 & -.1789 \\ 0 & 0 & -.1789 & .9839 \end{pmatrix}$, A2 = $\begin{pmatrix} 5.0000 & -3.0000 & 0 & 0 \\ -3.0000 & 8.2222 & -4.2237 & 0 \\ 0 & -4.2237 & 9.9778 & -3.6000 \\ 0 & 0 & -3.6000 & 1.8000 \end{pmatrix}$.

Applying the Q R algorithm to the final tridiagonal matrix, the eigenvalues are found to be 14.4131, 7.66204, 2.92482, 0, and so the singular values of the original matrix are 3.79646, 2.76804, 1.71021.

10.6.21.
(a) H1 = $\begin{pmatrix} 1 & 0 & 0 \\ 0 & -.4472 & -.8944 \\ 0 & -.8944 & .4472 \end{pmatrix}$, A1 = $\begin{pmatrix} 3.0000 & -1.3416 & 1.7889 \\ -2.2361 & -2.2000 & 1.6000 \\ 0 & -1.4000 & 4.2000 \end{pmatrix}$;

(b) H1 = $\begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & -.8944 & 0 & -.4472 \\ 0 & 0 & 1 & 0 \\ 0 & -.4472 & 0 & .8944 \end{pmatrix}$, A1 = $\begin{pmatrix} 3.0000 & -2.2361 & -1.0000 & 0 \\ -2.2361 & 3.8000 & 2.2361 & .4000 \\ 0 & 1.7889 & 2.0000 & -5.8138 \\ 0 & 1.4000 & -4.4721 & 1.2000 \end{pmatrix}$,
H2 = $\begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & -.7875 & -.6163 \\ 0 & 0 & -.6163 & .7875 \end{pmatrix}$, A2 = $\begin{pmatrix} 3.0000 & -2.2361 & .7875 & .6163 \\ -2.2361 & 3.8000 & -2.0074 & -1.0631 \\ 0 & -2.2716 & -3.2961 & 2.2950 \\ 0 & 0 & .9534 & 6.4961 \end{pmatrix}$;

(c) H1 = $\begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & -.5345 & .2673 & -.8018 \\ 0 & .2673 & .9535 & .1396 \\ 0 & -.8018 & .1396 & .5811 \end{pmatrix}$, A1 = $\begin{pmatrix} 1.0000 & -1.0690 & -.8138 & .4414 \\ -3.7417 & 1.0714 & -1.2091 & -1.4507 \\ 0 & -2.2316 & 1.7713 & 1.9190 \\ 0 & -2.1248 & .0482 & 3.1573 \end{pmatrix}$,
H2 = $\begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & -.7242 & -.6897 \\ 0 & 0 & -.6896 & .7242 \end{pmatrix}$, A2 = $\begin{pmatrix} 1.0000 & -1.0690 & .2850 & .8809 \\ -3.7417 & 1.0714 & 1.876 & -.2168 \\ 0 & 3.0814 & 3.4127 & -1.6758 \\ 0 & 0 & .1950 & 1.5159 \end{pmatrix}$.

♠ 10.6.22. (a) Eigenvalues: 4.51056, 2.74823, −2.25879, (b) eigenvalues: 7., 5.74606, −4.03877, 1.29271, (c) eigenvalues: 4.96894, 2.31549, −1.70869, 1.42426.

10.6.23. First, by Lemma 5.28, $H_1 x_1 = y_1$. Furthermore, since the first entry of $u_1$ is zero, $u_1^T e_1 = 0$, and so $H_1 e_1 = (I - 2\,u_1 u_1^T)\,e_1 = e_1$. Thus, the first column of $H_1 A$ is $H_1 (a_{11} e_1 + x_1) = a_{11} e_1 + y_1 = (a_{11}, \pm r, 0, \ldots, 0)^T$. Finally, again since the first entry of $u_1$ is zero, the first column of $H_1$ is $e_1$ and so multiplying $H_1 A$ on the right by $H_1$ doesn't affect its first column. We conclude that the first column of the symmetric matrix $H_1 A H_1$ has the form given in (10.109); symmetry implies that its first row is just the transpose of its first column, which completes the proof.

10.6.24. Since $T = H^{-1} A H$ where $H = H_1 H_2 \cdots H_n$ is the product of the Householder reflections, $A v = \lambda v$ if and only if $T w = \lambda w$ where $w = H^{-1} v$ is the corresponding eigenvector of the tridiagonalized matrix. Thus, to recover the eigenvectors of A we need to multiply $v = H w = H_1 H_2 \cdots H_n w$.

♦ 10.6.25. (a) Starting with a symmetric matrix $A = A_1$, for each j = 1, . . . , n − 1, the tridiagonalization algorithm produces a symmetric matrix $A_{j+1}$ from $A_j$ as follows. We first extract $x_j$, which requires no arithmetic operations, and then determine $v_j = x_j \pm \| x_j \|\, e_{j+1}$, which, since their first j entries are 0, requires n − j multiplications and additions and a square root. To compute $A_{j+1}$, we only need to work on the lower right (n − j) × (n − j) block of $A_j$ since its first j − 1 rows and columns are not affected, while the j-th row and column entries of $A_{j+1}$ are predetermined. Setting $u_j = v_j / \| v_j \|$,
$$A_{j+1} = H_j A_j H_j = (I - 2\,u_j u_j^T)\,A_j\,(I - 2\,u_j u_j^T) = A_j - \frac{2}{\| v_j \|^2} \left( v_j z_j^T + z_j v_j^T \right) + \frac{4\, v_j^T z_j}{\| v_j \|^4}\; v_j v_j^T,$$
where $z_j = A_j v_j$. Thus, the updated entries $\hat a_{ik}$ of $A_{j+1}$ are given by
$$\hat a_{ik} = a_{ik} - \alpha_j (v_i z_k + v_k z_i) + \alpha_j^2 \beta_j\, v_i v_k, \qquad \text{where} \quad \alpha_j = \frac{2}{\| v_j \|^2}, \quad \beta_j = z_j^T v_j = z_j \cdot v_j,$$

for i, k = j + 1, . . . , n, where aik are the entries of Aj and Aj+1 . To compute zj requires (n − j)2 multiplications and (n − j)(n − j − 1) additions; to compute the different products vi vk and vi zk requires, respectively, (n − j)2 and 12 (n − j)(n − j + 1) multiplications. Using these, to compute αj and βj requires 2(n − j − 1) additions and 1 division; finally, to compute the updated entries on and above the diagonal (the ones below the diagonal following from symmetry) requires 2(n − j)(n − j + 1) multiplications and 32 (n − j)(n − j + 1) additions. The total is 23 n3 − 12 n2 − −1 ≈ 23 n3 multiplications, 5 3 1 2 10 5 3 6 n + 2 n − 3 n + 2 ≈ 6 n additions and n − 1 square roots that are required to tridiagonalize A. (b) First, to factor a tridiagonal A = Q R using the pseudocode program on page 242, we note that at the beginning of the j th step, for j < n, the last n − j − 1 entries of the j th column and the last n − j − 2 entries of the (j + 1)st column are zero, while columns j + 2, . . . , n are still in tridaigonal form. Thus, to compute rjj requires j + 1 multiplications, j additions and a square root; to compute the nonzero aij requires j + 1 multiplications. We only need compute rjk for k = j +1, which requires j +1 multiplications and j additions, and, if j < n − 1, for k = j + 2, which requires just 1 multiplication. We update the entries in columns j + 1, which requires j + 1 multiplications and j + 1 additions and, when j < n − 1, column j + 2, which requires j + 1 multiplications and 1 addition. The final column only requires 2 n multiplications, n − 1 additions and a square root to normalize. The totals are 52 n2 + 29 n − 7 ≈ 25 n2 multiplications, 32 n2 + 23 n − 4 ≈ 23 n2 additions and n square roots. (c) Much faster: Once the matrix is tridiagonalized, each iteration requires 52 n2 versus n3 multiplications and 32 n2 versus n3 additions, as found in Exercise 5.3.31. 
Moreover, by part (a), the initial tridiagonalization only requires the effort of about $1\tfrac12$ Q R steps.

10.6.26.
Tridiagonalization
start
    set R = A
    for j = 1 to n − 2
        for i = 1 to j set x_i = 0 next i
        for i = j + 1 to n set x_i = r_ij next i
        set v = x + (sign x_{j+1}) ‖x‖ e_{j+1}
        if v ≠ 0, set u_j = v/‖v‖, H_j = I − 2 u_j u_j^T, R = H_j R H_j
        else set u_j = 0, H_j = I endif
    next j
end

The program also works as written for reducing a non-symmetric matrix A to upper Hessenberg form.
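The pseudocode of Exercise 10.6.26 can be sketched in Python as follows. This is only an illustrative translation using plain lists and 0-based indices; the helper names `matmul` and `tridiagonalize` are hypothetical, and the convention sign(0) = +1 is an assumption that the pseudocode leaves open. The test matrix is $K = A^T A$ from Exercise 10.6.20.

```python
import math

def matmul(A, B):
    """Product of two square matrices stored as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def tridiagonalize(A):
    """Householder reduction R = H_j R H_j, following the pseudocode above."""
    n = len(A)
    R = [[float(t) for t in row] for row in A]
    for j in range(n - 2):
        # x is column j of R with the entries on and above the diagonal zeroed
        x = [R[i][j] if i > j else 0.0 for i in range(n)]
        norm_x = math.sqrt(sum(t * t for t in x))
        sign = 1.0 if x[j + 1] >= 0 else -1.0   # assumption: sign(0) = +1
        v = x[:]
        v[j + 1] += sign * norm_x
        vv = sum(t * t for t in v)
        if vv == 0.0:
            continue                             # v = 0, so H_j = I
        # H = I - 2 u u^T with u = v / ||v||
        H = [[(1.0 if i == k else 0.0) - 2.0 * v[i] * v[k] / vv
              for k in range(n)] for i in range(n)]
        R = matmul(H, matmul(R, H))
    return R

K = [[5, 2, 2, -1], [2, 9, 0, -6], [2, 0, 5, 3], [-1, -6, 3, 6]]
T = tridiagonalize(K)
for row in T:
    print(" ".join("%8.4f" % t for t in row))
```

Forming each $H_j$ explicitly costs far more arithmetic than the entrywise update analysed in Exercise 10.6.25; the sketch favours readability over the operation count.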


Solutions — Chapter 11

11.1.1. The greatest displacement is at $x = \frac{d}{\alpha} + \frac12$, with $u(x) = \frac{\alpha}{8} + \frac{d}{2} + \frac{d^2}{2\alpha}$, when d < 2α, and at x = 1, with u(x) = d, when d ≥ 2α. The greatest stress and greatest strain are at x = 0, with $v(x) = w(x) = \frac{\alpha}{2} + d$.

11.1.2.
$$u(x) = \begin{cases} \frac14 x - \frac12 x^2, & 0 \le x \le \frac12, \\ \frac14 - \frac34 x + \frac12 x^2, & \frac12 \le x \le 1, \end{cases} \qquad v(x) = \begin{cases} \frac14 - x, & 0 \le x \le \frac12, \\ x - \frac34, & \frac12 \le x \le 1. \end{cases}$$
[Figure: graphs of u(x) and v(x) for 0 ≤ x ≤ 1.]

11.1.3.
(a) $$u(x) = b + \begin{cases} -\frac12 x^2, & 0 \le x \le \frac12 \ell, \\ \frac14 \ell^2 - \ell x + \frac12 x^2, & \frac12 \ell \le x \le 1, \end{cases}$$ for any b.
(b) $$v(x) = \begin{cases} -x, & 0 \le x \le \frac12 \ell, \\ x - \ell, & \frac12 \ell \le x \le 1, \end{cases}$$ which is the same for all equilibria.
(c) [Figure: graphs of u(x) and v(x) for 0 ≤ x ≤ 1.]

11.1.4.
$$u(x) = \frac{\log(x + 1)}{\log 2} - x, \qquad v(x) = u'(x) = \frac{1}{\log 2\,(x + 1)} - 1, \qquad w(x) = (1 + x)\,v(x) = \frac{1}{\log 2} - 1 - x.$$
[Figure: graphs of u(x), v(x) and w(x) for 0 ≤ x ≤ 1.]

Maximum displacement where u′ (x) = 0, so x = 1/ log 2 − 1 ≈ .4427. The bar will break at the point of maximum strain, which is at x = 0. 11.1.5.

$$u(x) = 2 \log(x + 1) - x, \qquad v(x) = u'(x) = \frac{1 - x}{1 + x}, \qquad w(x) = (1 + x)\,v(x) = 1 - x.$$
[Figure: graphs of u(x), v(x) and w(x) for 0 ≤ x ≤ 1.]

Maximum displacement at x = 1. Maximum strain at x = 0. 11.1.6. There is no equilibrium solution. 11.1.7. 0.2

u(x) = x3 −

3 2

x2 − x,

1

v(x) = 3 x2 − 3 x − 1.

[Figure: graphs of u(x) and v(x) for 0 ≤ x ≤ 2.]

The two points $x = 1 - \frac13 \sqrt 3 \approx .42265$ and $x = 1 + \frac13 \sqrt 3 \approx 1.57735$ have maximal (absolute) displacement, while the maximal stress is at x = 0 and 2, so either end is most likely to break.

11.1.8.
$$u(x) = \begin{cases} \frac78 x - \frac12 x^2, & 0 \le x \le 1, \\ \frac14 + \frac38 x - \frac14 x^2, & 1 \le x \le 2, \end{cases} \qquad v(x) = \begin{cases} \frac78 - x, & 0 \le x \le 1, \\ \frac38 - \frac12 x, & 1 \le x \le 2. \end{cases}$$
[Figure: graphs of u(x) and v(x) for 0 ≤ x ≤ 2.]

The greatest stress is at x = 0, where $v(0) = \frac78$, and the greatest strain is at x = 2, where $w(2) = 2\,v(2) = -\frac54$.

11.1.9. The bar will stretch farther if the stiffer half is on top. Both boundary value problems have the form ! d du − c(x) = 1, 0 < x < 2, u(0) = 0, u′ (2) = 0. dx dx ( 1, 0 ≤ x ≤ 1, When the stiffer bar is on bottom, c(x) = and the solution is 2, 1 ≤ x ≤ 2, 8 < 3 x − 1 x2 , 0 ≤ x ≤ 1, 2 with u(2) = 45 . When the stiffer bar is on top, u(x) = : 2 1 2 1 + x − x , 1 ≤ x ≤ 2, 4 4 8 ( < 3 x − 1 x2 , 0 ≤ x ≤ 1, 2, 0 ≤ x ≤ 1, 4 with u(2) = 74 . c(x) = and u(x) = : 2 1 2 1 1, 1 ≤ x ≤ 2, − 4 + 2 x − 2 x , 1 ≤ x ≤ 2, 11.1.10.

$$-\bigl((1 - x)\,u'\bigr)' = 1, \qquad u(0) = 0, \qquad w(1) = \lim_{x \to 1}\,(1 - x)\,u'(x) = 0.$$
The solution is u = x. Note that we still need the limiting strain at x = 1 to be zero, w(1) = 0, which requires u′(1) < ∞. Indeed, the general solution to the differential equation is u(x) = a + b log(1 − x) + x; the first boundary condition implies a = 0, while u′(1) < ∞ is required to eliminate the logarithmic term.

♥ 11.1.11. The boundary value problem is
$$-u'' = f(x), \qquad u(0) = u(2\pi), \qquad u'(0) = u'(2\pi).$$
Integrating the differential equation, we find
$$u(x) = a\,x + b - \int_0^x \left( \int_0^y f(z)\,dz \right) dy.$$
The first boundary condition implies that
$$a = \frac{1}{2\pi} \int_0^{2\pi} \left( \int_0^y f(z)\,dz \right) dy.$$
The second boundary condition requires
$$\langle f, 1 \rangle = \int_0^{2\pi} f(z)\,dz = 0, \qquad (*)$$

which is required for the forcing function to maintain equilibrium. The condition (∗) is precisely the Fredholm alternative in this case, since any constant function solves the homogeneous boundary value problem. For example, if f (x) = sin x, then the displacement is u(x) = b + sin x, where b is an arbitrary constant, and the stress is u′ (x) = cos x. 11.1.12. Displacement u and position x: meters. Strain v = u′ : no units. Stiffness c(x): N/m = kg/sec2 . Stress w = c v ′ : N/m = kg/sec2 . External force f (x): N/m2 = kg/(m sec2 ).

11.2.1. (a) 1, (b) 0, (c) e, (d) log 2, (e) $\frac19$, (f) 0.

11.2.2.
(a) $\varphi(x) = \delta(x)$; $\int_a^b \varphi(x)\,u(x)\,dx = u(0)$ for a < 0 < b.
(b) $\varphi(x) = \delta(x - 1)$; $\int_a^b \varphi(x)\,u(x)\,dx = u(1)$ for a < 1 < b.
(c) $\varphi(x) = 3\,\delta(x - 1) + 3\,\delta(x + 1)$; $\int_a^b \varphi(x)\,u(x)\,dx = 3\,u(1) + 3\,u(-1)$ for a < −1 < 1 < b.
(d) $\varphi(x) = \frac12\,\delta(x - 1)$; $\int_a^b \varphi(x)\,u(x)\,dx = \frac12\,u(1)$ for a < 1 < b.
(e) $\varphi(x) = \delta(x) - \delta(x - \pi) - \delta(x + \pi)$; $\int_a^b \varphi(x)\,u(x)\,dx = u(0) - u(\pi) - u(-\pi)$ for a < −π < π < b.
(f) $\varphi(x) = \frac12\,\delta(x - 1) + \frac15\,\delta(x - 2)$; $\int_a^b \varphi(x)\,u(x)\,dx = \frac12\,u(1) + \frac15\,u(2)$ for a < 1 < 2 < b.

11.2.3.
(a) $$x\,\delta(x) = \lim_{n \to \infty} \frac{n\,x}{\pi\,(1 + n^2 x^2)} = 0$$ for all x, including x = 0. Moreover, the functions $\frac{n\,x}{\pi\,(1 + n^2 x^2)}$ are all bounded in absolute value by $\frac12$, and so the limit, although non-uniform, is to an ordinary function.
(b) $\langle u(x), x\,\delta(x) \rangle = \int_a^b u(x)\,x\,\delta(x)\,dx = u(0) \cdot 0 = 0$ for all continuous functions u(x), and so x δ(x) has the same dual effect as the zero function: ⟨u(x), 0⟩ = 0 for all u.

11.2.4.
(a) $$\varphi(x) = \lim_{n \to \infty} \left[ \frac{n}{\pi\,(1 + n^2 x^2)} - \frac{3\,n}{\pi\,\bigl(1 + n^2 (x - 1)^2\bigr)} \right];$$
(b) $\int_a^b \varphi(x)\,u(x)\,dx = u(0) - 3\,u(1)$, for any interval with a < 0 < 1 < b.
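The limit in Exercise 11.2.4 can be checked numerically. The sketch below is only an illustration: the test function $u(x) = e^{-x^2}$, the interval [−5, 6], and the helper names `g`, `phi` and `pairing` are assumptions chosen for the demonstration, not part of the exercise. It integrates the n-th approximation against u with a midpoint rule and compares the result with u(0) − 3 u(1).

```python
import math

def g(n, x):
    """Lorentzian g_n(x) = n / (pi (1 + n^2 x^2)), which tends to delta(x)."""
    return n / (math.pi * (1.0 + (n * x) ** 2))

def phi(n, x):
    """n-th approximation to phi(x) = delta(x) - 3 delta(x - 1)."""
    return g(n, x) - 3.0 * g(n, x - 1.0)

def pairing(n, u, a=-5.0, b=6.0, steps=100_000):
    """Midpoint-rule approximation of the integral of phi_n(x) u(x) over [a, b]."""
    h = (b - a) / steps
    total = 0.0
    for i in range(steps):
        x = a + (i + 0.5) * h
        total += phi(n, x) * u(x)
    return h * total

u = lambda x: math.exp(-x * x)      # hypothetical smooth test function
limit = u(0.0) - 3.0 * u(1.0)       # value predicted by part (b)
for n in (10, 100, 1000):
    print(n, pairing(n, u), limit)
```

The printed values approach the limit slowly, roughly like 1/n, which reflects the heavy tails of the Lorentzian sequence.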

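♦ 11.2.5.
(a) Using the limiting sequence (11.31),
$$\delta(2x) = \lim_{n \to \infty} g_n(2x) = \lim_{n \to \infty} \frac{n}{\pi\,\bigl(1 + n^2 (2x)^2\bigr)} = \lim_{n \to \infty} \frac{n}{\pi\,\bigl(1 + (2n)^2 x^2\bigr)} = \lim_{m \to \infty} \frac{\frac12\,m}{\pi\,(1 + m^2 x^2)} = \tfrac12 \lim_{m \to \infty} g_m(x) = \tfrac12\,\delta(x),$$
where we set m = 2 n in the middle step.
(b) Use the change of variables $\hat x = 2x$ in the integral
$$\int_{-a}^{a} \delta(2x)\,f(x)\,dx = \int_{-2a}^{2a} \tfrac12\,\delta(\hat x)\,f\bigl(\tfrac12 \hat x\bigr)\,d\hat x = \tfrac12\,f(0) = \int_{-a}^{a} \tfrac12\,\delta(x)\,f(x)\,dx.$$
(c) $\delta(a\,x) = \frac{1}{a}\,\delta(x)$.

11.2.6.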

8 > > <

′

(a) f (x) = − δ(x + 1) − 9 δ(x − 3) + > > :

“

′

(b) g (x) = δ x +

1 2π

”

“

−δ x−

1 2π

”

2 x, 1, 0,

0 < x < 3, −1 < x < 0, otherwise.

4 2

-1

-2

1

2

3

4

1

2

3

-2 -4 -6

8 > > > <

1.5

− cos x,

+ > cos x, > > : 0,

− 12 π

< x < 0,

0 < x < 21 π, otherwise.

1 0.5 -3

-1

-2

-0.5 -1 -1.5 4

′

(c) h (x) = − e

−1

′

8 > > <

δ(x + 1) + > > :

8 > > > <

2

π cos π x, − 2 x, ex ,

2

-3

-2

-1

1

1,

′ (a) f (x) = > −1, ′′

> > :

cos x, −e− x ,

x < − π,

-6

-4

-2

0,

− π < x < 0,

-4

x > 0.

-6

= σ(x + 1) − 2 σ(x) + σ(x − 1),

otherwise,

f (x) = δ(x + 1) − 2 δ(x) + δ(x −81). > > −1, ′

> <

(b) k (x) = 2 δ(x + 2) − 2 δ(x − 2) + > 1, > > :

−2 < x < 0, 0 < x < 2,

0, otherwise, = 2 δ(x + 2) − 2 δ(x − 2) − σ(x + 2) + 2 σ(x) − σ(x − 2),

k′′ (x) = 2 δ ′ (x + 2) − 2 δ ′ (x − 2) − δ(x + 2) + 2 δ(x) 8 − δ(x − 2). ( < − π 2 cos πx, − π sin πx, −1 < x < 1, (c) s′ (x) = s′′ (x) = : 0, otherwise, 0,

11.2.8. (a) f ′ (x) = − sign x e− | x | ,

4

2

-2

−1 < x < 0, 0 < x < 1,

3

-2

11.2.7. 8 > > > <

2

-4

(d) k (x) = (1 + π )δ(x) + > 2 x, > > :

x > 1, −1 < x < 1, x < −1.

f ′′ (x) = e− | x | − 2 δ(x). 318

−1 < x < 1, otherwise.

8 > > <

′

(b) f (x) = > > :

8 <

′

(c) f (x) = :

−1 3, 1,

x < 0, 0 < x < 1, = −1 + 4 σ(x) − 2 σ(x − 1), x > 1,

2 x + 1,

x > 0,

or

f ′′ (x) = 4 δ(x) − 2 δ(x − 1).

x < −1,

− 2 x − 1,

−1 < x < 0, 8 < 2, ′′ f (x) = 2 δ(x + 1) + 2 δ(x) + : −82, <

(d) f ′ (x) = 4 δ(x + 2) − 4 δ(x − 2) + :

x > 0,

or

x < −1,

−1 < x < 0, 1, | x | > 2,

.

− 1, | x | < 2, = 4 δ(x + 2) − 4 δ(x − 2) + 1 − 2 σ(x + 2) + 2 σ(x − 2),

f ′′ (x) = 4 δ ′ (x + 2) − 4 δ ′ (x − 2) − 2 δ(x + 2) + 2 δ(x − 2). (e) f ′ (x) = sign x cos x, f ′′ (x) = 2 δ(x) − sin | x |. (f ) f ′ (x) = sign(sin x) cos x,

(g) f ′ (x) = 2 f ′′ (x) = 2

∞ X

k = −∞ ∞ X

k = −∞

f ′′ (x) = − | sin x | + 2

δ(x − 2 k π) − 2

∞ X

k = −∞ ∞ X

δ ′ (x − 2 k π) − 2

∞ X

n = −∞

δ(x − n π).

δ(x − (2 k + 1) π), δ ′ (x − (2 k + 1) π).

k = −∞

11.2.9.
(a) It suffices to note that, when λ > 0, the product λ x has the same sign as x, and so $\sigma(\lambda x) = \begin{cases} 1, & x > 0, \\ 0, & x < 0, \end{cases} = \sigma(x)$.
(b) If λ < 0, then $\sigma(\lambda x) = \begin{cases} 1, & x < 0, \\ 0, & x > 0, \end{cases} = 1 - \sigma(x)$.
(c) Use the chain rule. If λ > 0, then δ(x) = σ′(x) = λ σ′(λ x) = λ δ(λ x), while if λ < 0, then δ(x) = σ′(x) = − λ σ′(λ x) = − λ δ(λ x).

♦ 11.2.10.
$$\lim_{n \to \infty} \frac{n}{\sqrt\pi}\,e^{-n^2 x^2} = \begin{cases} 0, & x \ne 0, \\ \infty, & x = 0, \end{cases} \qquad \int_{-\infty}^{\infty} \frac{n}{\sqrt\pi}\,e^{-n^2 x^2}\,dx = \frac{1}{\sqrt\pi} \int_{-\infty}^{\infty} e^{-y^2}\,dy = 1.$$

♦ 11.2.11.
(a) First, by the definition of $m_n$, we have
$$\int_0^\ell \tilde g_n(x)\,dx = \int_0^\ell \frac{g_n(x - y)}{m_n}\,dx = 1.$$
Second, note that
$$m_n = \int_0^\ell g_n(x - y)\,dx = \frac{\tan^{-1} n(\ell - y) - \tan^{-1}(-n\,y)}{\pi}, \qquad \text{and hence} \quad \lim_{n \to \infty} m_n = 1.$$
Therefore, using (11.32),
$$\lim_{n \to \infty} \tilde g_n(x) = \lim_{n \to \infty} \frac{g_n(x - y)}{m_n} = 0 \qquad \text{whenever} \quad x \ne y.$$
(b) It suffices to note that $\lim_{n \to \infty} \int_0^\ell g_n(x - y)\,dx = \lim_{n \to \infty} m_n = 1$, as shown in part (a).

♥ 11.2.12.
(a) [Figure: graph of $g_n(x)$, equal to $\frac12 n$ for $-\frac1n < x < \frac1n$ and 0 otherwise.]
(b) First, $\lim_{n \to \infty} g_n(x) = 0$ for any x ≠ 0 since $g_n(x) = 0$ whenever n > 1/| x |. Moreover, $\int_{-\infty}^{\infty} g_n(x)\,dx = 1$, and hence the sequence satisfies (11.32–33), proving $\lim_{n \to \infty} g_n(x) = \delta(x)$.
(c) $$f_n(x) = \int_{-\infty}^{x} g_n(y)\,dy = \begin{cases} 0, & x < -\frac1n, \\ \frac12\,n\,x + \frac12, & | x | < \frac1n, \\ 1, & x > \frac1n. \end{cases}$$
[Figure: graph of $f_n(x)$.]
Yes, since $\frac1n \to 0$ as n → ∞, the limiting function is
$$\lim_{n \to \infty} f_n(x) = \sigma(x) = \begin{cases} 0, & x < 0, \\ \frac12, & x = 0, \\ 1, & x > 0. \end{cases}$$
(d) $h_n(x) = \frac12\,n\,\delta\bigl(x + \frac1n\bigr) - \frac12\,n\,\delta\bigl(x - \frac1n\bigr)$.
(e) Yes. To compute the limit, we use the dual interpretation of the delta function. Given any C¹ function u(x),
$$\langle h_n, u \rangle = \int_{-\infty}^{\infty} h_n(x)\,u(x)\,dx = \int_{-\infty}^{\infty} \Bigl[ \tfrac12\,n\,\delta\bigl(x + \tfrac1n\bigr) - \tfrac12\,n\,\delta\bigl(x - \tfrac1n\bigr) \Bigr]\,u(x)\,dx = \frac{u\bigl(-\frac1n\bigr) - u\bigl(\frac1n\bigr)}{\frac2n}.$$
The n → ∞ limit is − u′(0), since the final expression is minus the standard centered difference formula $u'(x) = \lim_{h \to 0} \frac{u(x + h) - u(x - h)}{2h}$ where $h = \frac1n$ at x = 0. (Alternatively, you can use l'Hôpital's rule to prove this.) Thus, $\lim_{n \to \infty} g_n'(x) = \delta'(x)$.

♥ 11.2.13.

(a) [Figure: graph of $g_n(x)$, a triangle of height n supported on $-\frac1n \le x \le \frac1n$.]
(b) First, $\lim_{n \to \infty} g_n(x) = 0$ for any x ≠ 0 since $g_n(x) = 0$ whenever n > 1/| x |. Moreover, $\int_{-\infty}^{\infty} g_n(x)\,dx = 1$ since its graph is a triangle of base $\frac2n$ and height n. We conclude that the sequence satisfies (11.32–33), proving $\lim_{n \to \infty} g_n(x) = \delta(x)$.
(c) $$f_n(x) = \begin{cases} 0, & x < -\frac1n, \\ \frac12 + n\,x + \frac12\,n^2 x^2, & -\frac1n < x < 0, \\ \frac12 + n\,x - \frac12\,n^2 x^2, & 0 < x < \frac1n, \\ 1, & x > \frac1n. \end{cases}$$
[Figure: graph of $f_n(x)$.]
(d) Yes, since $\frac1n \to 0$ as n → ∞, the limiting function is
$$\lim_{n \to \infty} f_n(x) = \sigma(x) = \begin{cases} 0, & x < 0, \\ \frac12, & x = 0, \\ 1, & x > 0. \end{cases}$$
(e) $$h_n(x) = \begin{cases} 0, & | x | > \frac1n, \\ n^2, & -\frac1n < x < 0, \\ -n^2, & 0 < x < \frac1n. \end{cases}$$
(f) Yes, using the same integration by parts argument in (11.54).

11.2.14. By duality, for any f ∈ C1 , D h “ ” “ ”i E h “ 1 1 n δ x − lim − δ x + , f = lim n f n n n→∞ n→∞

1 n

where we used l’Hˆ opital’s rule to evaluate the limit.

11.2.15. s(x) =

Z x

r(x) =

Z x

a

a

8 <

δy (t) dt = − σy (x) = :

a

11.2.16. h δy , u i =

0

provided u(ℓ) = 0.

−1,

σy (z) dz = ρy (x) + y − a =

s(x):

Z ℓ

0,

”

“

− f − n1

”i

= 2 f ′ (0) = − 2 h δ ′ , f i,

x > y, (

y

x < y, y − a,

x < y,

x − a,

x > y,

when

a

r(x):

−1 δy (x) u(x) dx = u(y), while h σy , u′ i =

Z ℓ y

y < a.

y

a−y u′ (x) dx = u(ℓ) − u(y) = − u(y)

11.2.17. Use induction on the order k. Let 0 < y < ℓ. Integrating by parts,
$$\langle \delta_y^{(k+1)}, u \rangle = \int_0^\ell \delta_y^{(k+1)}(x)\,u(x)\,dx = \delta_y^{(k)}(x)\,u(x)\,\Big|_{x = 0}^{\ell} - \int_0^\ell \delta_y^{(k)}(x)\,u'(x)\,dx = -\int_0^\ell \delta_y^{(k)}(x)\,u'(x)\,dx = -\langle \delta_y^{(k)}, u' \rangle = (-1)^{k+1}\,u^{(k+1)}(y),$$
by the induction hypothesis. The boundary terms at x = 0 and x = ℓ vanish since $\delta_y^{(k)}(x) = 0$ for any x ≠ y.

11.2.18. x δ′(x) = − δ(x) because they both yield the same value on a test function:
$$\langle x\,\delta', u \rangle = \int_{-\infty}^{\infty} x\,\delta'(x)\,u(x)\,dx = -\bigl[\,x\,u(x)\,\bigr]'\,\Big|_{x = 0} = -u(0) = -\int_{-\infty}^{\infty} \delta(x)\,u(x)\,dx = \langle -\delta, u \rangle.$$

See Exercise 11.2.20 for the general version. 11.2.19. The correct formulae are (f (x) δ(x))′ = f ′ (x) δ(x) + f (x) δ ′ (x) = f (0) δ ′ (x). Indeed, integrating by parts, h (f δ)′ , u i = hf δ′ ,ui = h f ′δ , u i =

Z

(f (x) δ(x))′ u(x) dx = −

Z

Z

f (x) δ(x) u′ (x) dx = − f (0) u′ (0), ˛

f (x) δ ′ (x) u(x) dx = −(f (x) u(x))′ ˛˛

x=0

Z

f ′ (x) δ(x) u(x) dx = f ′ (0) u(0).

= − f (0) u′ (0) − f ′ (0) u(0),

Adding the last two produces the first. On the other hand On the other hand, h f (0) δ ′ , u i =

Z

˛

f (0) δ ′ (x) u(x) dx = −(f (0) u(x))′ ˛˛

x=0

= − f (0) u′ (0)

also gives the same result, proving the second formula. (See also the following exercise.) ♦ 11.2.20. (a) For any text function, hf δ′ ,ui = = (b)

f (x) δ

(n)

Z ∞

−∞ Z ∞

−∞

(x) =

˛

u(x) f (x) δ ′ (x) dx = − (u(x) f (x))′ ˛˛

x=0

u(x) f (0) δ ′ (x) dx − u(x) f ′ (0) δ(x) = h f (0) δ ′ − f ′ (0) δ , u i.

n X

(−1)

i=1

11.2.21. (a) ϕ(x) = − 2 δ ′ (x) − δ(x), (b) ψ(x) = δ ′ (x),

= −u′ (0) f (0) − u(0) f ′ (0)

Z ∞

−∞

0 1 i @nA (i)

i

Z ∞

−∞

f

(0)δ (n−i) (x).

ϕ(x) u(x) dx = 2 u′ (0) − u(0);

ψ(x) u(x) dx = − u′ (0);

(c) χ(x) = δ(x − 1) − 4 δ ′ (x − 2) + 4 δ(x − 2),

Z ∞

−∞

χ(x) u(x) dx = u(1) + 4 u′ (2) + 4 u(2);

(d) γ(x) = e−1 δ ′′ (x + 1) − 2 e−1 δ ′ (x + 1) + e−1 δ(x + 1), Z ∞ u′′ (−1) + 2 u′ (−1) + u(−1) γ(x) u(x) dx = . −∞ e ♦ 11.2.22. If f (x0 ) > 0, then, by continuity, f (x) > 0 in some interval | x − x0 | < ε. But then the integral of f over this interval is positive:

Z x0 +ε x0 −ε

f (x) dx > 0, which is a contradiction. An

analogous argument shows that f (x0 ) < 0 is also not possible. We conclude that f (x) ≡ 0 for all x. ♦ 11.2.23. Suppose there is. Let us show that δy (z) = 0 for all 0 < z 6= y < ℓ. If δy (z) > 0 at some z 6= y, then, by continuity, δy (x) > 0 in a small interval 0 < z − ε < x < z + ε < ℓ. We can further assume ε < | z − y | so that y doesn’t lie in the interval. Choose u(x) to be a continuous function so that u(z) > 0 for z − ε < x < z + ε but u(x) = 0 for all 0 ≤ x ≤ ℓ ( ε − | x − z |, | x − z | ≤ ε, such that | z − x | ≥ ε; for example, u(x) = Note that, in 0, otherwise. particular, u(y) = 0. Then

Z ℓ 0

δy (x) u(x) dx =

Z z+ε z−ε

δy (x) u(x) dx > 0 because we are

integrating a positive continuous function. But this contradicts (11.39) since u(y) = 0. A similar argument shows that δy (z) < 0 also leads to a contradiction. Therefore, δy (x) = 0 for all 0 < x 6= y < ℓ and so, by continuity, δy (x) = 0 for all 0 ≤ x ≤ ℓ. But then 322

Z ℓ 0

δy (x) u(x) dx = 0 for all functions u(x) and so (11.39) doesn’t hold if u(y) 6= 0.

♦ 11.2.24. By definition of uniform convergence, for every δ > 0 there exists an n⋆ such that | f_n(x) − σ(x) | < δ for all n ≥ n⋆. However, if δ < ½, then there is no such n⋆ since each f_n(x) is continuous, but would have to satisfy f_n(x) < δ < ½ for x < 0 and f_n(x) > 1 − δ > ½ for x > 0, which is impossible for a continuous function, which, by the Intermediate Value Theorem, must assume every value in between δ and 1 − δ, [2].

11.2.25. .5 mm, by linearity and symmetry of the Green's function.

11.2.26. To determine the Green's function, we must solve the boundary value problem − c u′′ = δ(x − y), u(0) = 0, u′(1) = 0. The general solution to the differential equation is
$$u(x) = -\frac{\rho(x - y)}{c} + a\,x + b, \qquad u'(x) = -\frac{\sigma(x - y)}{c} + a.$$
The integration constants a, b are fixed by the boundary conditions
$$u(0) = b = 0, \qquad u'(1) = -\frac1c + a = 0.$$
Therefore, the Green's function for this problem is
$$G(x, y) = \begin{cases} x/c, & x \le y, \\ y/c, & x \ge y. \end{cases}$$

11.2.27.
(a) $G(x, y) = \begin{cases} \frac18\,x\,(4 - y), & x \le y, \\ \frac18\,y\,(4 - x), & x \ge y. \end{cases}$
(b) $G(x, y) = \begin{cases} \frac12\,x, & x \le y, \\ \frac12\,y, & x \ge y. \end{cases}$
(c) The free boundary value problem is not positive definite, and so there is not a unique solution.

11.2.28.
(a) $$G(x, y) = \begin{cases} \log(1 + x) \left( 1 - \dfrac{\log(1 + y)}{\log 2} \right), & x < y, \\[2ex] \log(1 + y) \left( 1 - \dfrac{\log(1 + x)}{\log 2} \right), & x > y. \end{cases}$$
(b) $$u(x) = \int_0^1 G(x, y)\,dy = \int_0^x \log(1 + y) \left( 1 - \frac{\log(1 + x)}{\log 2} \right) dy + \int_x^1 \log(1 + x) \left( 1 - \frac{\log(1 + y)}{\log 2} \right) dy = \frac{\log(x + 1)}{\log 2} - x.$$

♥ 11.2.29.
(a) $u(x) = \frac{9}{16}\,x - \frac12\,x^2 + \frac{3}{16}\,x^3 - \frac14\,x^4$, $\quad w(x) = \dfrac{u'(x)}{1 + x^2} = \frac{9}{16} - x$;
(b) $$G(x, y) = \begin{cases} \bigl(1 - \frac34\,y - \frac14\,y^3\bigr)\bigl(x + \frac13\,x^3\bigr), & x < y, \\ \bigl(1 - \frac34\,x - \frac14\,x^3\bigr)\bigl(y + \frac13\,y^3\bigr), & x > y. \end{cases}$$
(c) $$u(x) = \int_0^1 G(x, y)\,dy = \int_0^x \bigl(1 - \tfrac34\,x - \tfrac14\,x^3\bigr)\bigl(y + \tfrac13\,y^3\bigr)\,dy + \int_x^1 \bigl(1 - \tfrac34\,y - \tfrac14\,y^3\bigr)\bigl(x + \tfrac13\,x^3\bigr)\,dy = \tfrac{9}{16}\,x - \tfrac12\,x^2 + \tfrac{3}{16}\,x^3 - \tfrac14\,x^4.$$
(d) Under an impulse force at x = y, the maximal displacement is at the forcing point, namely $g(x) = G(x, x) = x - \frac34\,x^2 + \frac13\,x^3 - \frac12\,x^4 - \frac{1}{12}\,x^6$. The maximum value of $g(x_\star) = \frac13$ occurs at the solution
$$x_\star = \sqrt[3]{1 + \sqrt 2} - \frac{1}{\sqrt[3]{1 + \sqrt 2}} = .596072 \qquad \text{to} \qquad g'(x) = 1 - \tfrac32\,x + x^2 - 2\,x^3 - \tfrac12\,x^5 = 0.$$

11.2.30.
(a) $G(x, y) = \begin{cases} (1 + y)\,x, & x < y, \\ y\,(1 + x), & x > y; \end{cases}$
(b) all of them;
(c) $$u(x) = \int_0^1 G(x, y)\,f(y)\,dy = \int_0^x y\,(1 + x)\,f(y)\,dy + \int_x^1 (1 + y)\,x\,f(y)\,dy,$$
$$u'(x) = x\,(1 + x)\,f(x) + \int_0^x y\,f(y)\,dy - (1 + x)\,x\,f(x) + \int_x^1 (1 + y)\,f(y)\,dy = \int_0^x y\,f(y)\,dy + \int_x^1 (1 + y)\,f(y)\,dy,$$
$$u''(x) = x\,f(x) - (1 + x)\,f(x) = -f(x).$$
(d) The boundary value problem is not positive definite (the function u(x) = x solves the homogeneous problem) and so there is no Green's function.

♥ 11.2.31.

(a) $$u_n(x) = \begin{cases} x\,(y - 1), & 0 \le x \le y - \frac1n, \\[1ex] \frac{1}{4n} - \frac12\,x + \frac14\,n\,x^2 - \frac12\,y + \bigl(1 - \frac12\,n\bigr)\,x\,y + \frac14\,n\,y^2, & | x - y | \le \frac1n, \\[1ex] y\,(x - 1), & y + \frac1n \le x \le 1. \end{cases}$$
(b) Since $u_n(x) = G(x, y)$ for all $| x - y | \ge \frac1n$, we have $\lim_{n \to \infty} u_n(x) = G(x, y)$ for all x ≠ y, while
$$\lim_{n \to \infty} u_n(y) = \lim_{n \to \infty} \Bigl( y^2 - y + \frac{1}{4n} \Bigr) = y^2 - y = G(y, y).$$
(Or one can appeal to continuity to infer this.) This limit reflects the fact that the external forces converge to the delta function: $\lim_{n \to \infty} f_n(x) = \delta(x - y)$.
(c) [Figure: graphs of $u_n(x)$ for 0 ≤ x ≤ 1.]

11.2.32. Use formula (11.64) to compute
$$u(x) = \frac1c \int_0^x y\,f(y)\,dy + \frac1c \int_x^1 x\,f(y)\,dy,$$
$$u'(x) = \frac1c\,x\,f(x) - \frac1c\,x\,f(x) + \frac1c \int_x^1 f(y)\,dy = \frac1c \int_x^1 f(y)\,dy, \qquad u''(x) = -\frac1c\,f(x).$$
Moreover, $u(0) = \frac1c \int_0^0 y\,f(y)\,dy = 0$ and $u'(1) = \frac1c \int_1^1 f(y)\,dy = 0$.
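The superposition formula used in Exercise 11.2.32 can be tested numerically. The sketch below is an illustration under stated assumptions: the forcing f(y) = y, the stiffness c = 2, and the helper names `green` and `superpose` are hypothetical choices. For this forcing, integrating − c u′′ = x with u(0) = 0, u′(1) = 0 by hand gives $u(x) = \bigl(\frac12 x - \frac16 x^3\bigr)/c$, which is used as the reference value.

```python
def green(x, y, c=1.0):
    """G(x, y) for -c u'' = delta(x - y), u(0) = 0, u'(1) = 0 (Exercise 11.2.26)."""
    return min(x, y) / c

def superpose(f, x, c=1.0, steps=2000):
    """Midpoint-rule approximation of u(x) = integral over [0, 1] of G(x, y) f(y) dy."""
    h = 1.0 / steps
    total = 0.0
    for i in range(steps):
        y = (i + 0.5) * h
        total += green(x, y, c) * f(y)
    return h * total

f = lambda y: y                                 # hypothetical forcing function
c = 2.0                                         # hypothetical constant stiffness
exact = lambda x: (x / 2 - x ** 3 / 6) / c      # hand-integrated solution for this f
for x in (0.0, 0.25, 0.5, 1.0):
    print(x, superpose(f, x, c), exact(x))
```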

♠ 11.2.33. (a) The j-th column of G is the solution to the linear system K u = e_j/∆x, corresponding to a force of magnitude 1/∆x concentrated on the j-th mass. The total force on the chain is 1, since the force only acts over a distance ∆x, and so the forcing function represents a concentrated unit impulse, i.e., a delta function, at the sample point y_j = j/n. Thus, for n ≫ 0, the solution should approximate the sampled values of the Green's function of the limiting solution. (b) In fact, in this case, the entries of G are exactly equal to the sample values of the Green's function, and, at least in this very simple case, no limiting procedure is required.

♠ 11.2.34. Yes. For a system of n masses connected to both top and bottom supports by n + 1 springs, the spring lengths are $\Delta x = \frac{1}{n + 1}$, and we rescale the incidence matrix A by dividing by ∆x, and set $K = A^T A$. Again, the j-th column of $G = K^{-1}/\Delta x$ represents the response of the system to a concentrated unit force on the j-th mass, and so its entries approximate $G(x_i, y_j)$, where G(x, y) is the Green's function (11.59) for c = 1. Again, in this specific case, the matrix entries are, in fact, exactly equal to the sampled values of the continuum Green's function.
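The exactness claim in Exercise 11.2.34 can be checked without inverting K: if $G = K^{-1}/\Delta x$, then K G must equal I/∆x. The sketch below assumes that (11.59) with c = 1 is the standard fixed-end Green's function, x(1 − y) for x ≤ y and y(1 − x) for x ≥ y, and that $A^T A$ for the unscaled incidence matrix is the tridiagonal matrix with 2 on the diagonal and −1 beside it. The names `stiffness` and `green` and the choice n = 6 are hypothetical.

```python
def stiffness(n):
    """K = A^T A for n masses and n + 1 springs, with A divided by dx."""
    dx = 1.0 / (n + 1)
    K = [[0.0] * n for _ in range(n)]
    for i in range(n):
        K[i][i] = 2.0 / dx ** 2
        if i > 0:
            K[i][i - 1] = K[i - 1][i] = -1.0 / dx ** 2
    return K, dx

def green(x, y):
    """Assumed continuum Green's function for -u'' = delta(x - y), u(0) = u(1) = 0."""
    return x * (1.0 - y) if x <= y else y * (1.0 - x)

n = 6
K, dx = stiffness(n)
nodes = [(i + 1) * dx for i in range(n)]
G = [[green(x, y) for y in nodes] for x in nodes]

# If G = K^{-1} / dx, then K G equals the identity matrix divided by dx.
KG = [[sum(K[i][k] * G[k][j] for k in range(n)) for j in range(n)]
      for i in range(n)]
residual = max(abs(KG[i][j] - (1.0 / dx if i == j else 0.0))
               for i in range(n) for j in range(n))
print(residual)
```

The residual is at the level of rounding error, consistent with the statement that no limiting procedure is needed here.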

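♦ 11.2.35. Set
$$I(x, z, w) = \int_z^w F(x, y)\,dy, \qquad \text{so} \qquad \frac{\partial I}{\partial x} = \int_z^w \frac{\partial F}{\partial x}(x, y)\,dy, \qquad \frac{\partial I}{\partial z} = -F(x, z), \qquad \frac{\partial I}{\partial w} = F(x, w),$$
by the Fundamental Theorem of Calculus. Thus, by the multivariable chain rule,
$$\frac{d}{dx} \int_{\alpha(x)}^{\beta(x)} F(x, y)\,dy = \frac{d}{dx}\,I\bigl(x, \alpha(x), \beta(x)\bigr) = \frac{\partial I}{\partial x} + \frac{\partial I}{\partial z}\,\frac{d\alpha}{dx} + \frac{\partial I}{\partial w}\,\frac{d\beta}{dx} = F\bigl(x, \beta(x)\bigr)\,\frac{d\beta}{dx} - F\bigl(x, \alpha(x)\bigr)\,\frac{d\alpha}{dx} + \int_{\alpha(x)}^{\beta(x)} \frac{\partial F}{\partial x}(x, y)\,dy.$$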

11.3.1.
(a) Solution: $u_\star(x) = \frac52\,x - \frac52\,x^2$.
(b) $P[u_\star] = -\frac{25}{24} = -1.04167$,
(i) $P[\,x - x^2\,] = -\frac23 = -.66667$,
(ii) $P[\,\frac32\,x - \frac32\,x^3\,] = -\frac{39}{40} = -.975$,
(iii) $P[\,\frac23 \sin \pi x\,] = -\frac{20}{3\pi} + \frac19\,\pi^2 = -1.02544$,
(iv) $P[\,x^2 - x^4\,] = -\frac{16}{35} = -.45714$;
all are larger than the minimum $P[u_\star]$.
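The values in Exercise 11.3.1 can be reproduced numerically. The exercise statement is not shown here, so the sketch below assumes the functional is $P[u] = \int_0^1 \bigl( \frac12 (u')^2 - 5\,u \bigr)\,dx$ with u(0) = u(1) = 0, which is what the stated minimizer $u_\star$ implies (it solves − u′′ = 5). The helper names `simpson`, `P` and `trials` are hypothetical.

```python
import math

def simpson(fn, a, b, n=2000):
    """Composite Simpson rule with n (even) subintervals."""
    h = (b - a) / n
    s = fn(a) + fn(b)
    for i in range(1, n):
        s += (4 if i % 2 else 2) * fn(a + i * h)
    return s * h / 3

def P(u, du):
    """Assumed functional: integral over [0, 1] of (u')^2 / 2 - 5 u."""
    return simpson(lambda x: 0.5 * du(x) ** 2 - 5.0 * u(x), 0.0, 1.0)

# Each entry pairs a trial function with its derivative, coded by hand.
trials = {
    "u*":    (lambda x: 2.5 * (x - x * x),   lambda x: 2.5 * (1 - 2 * x)),
    "(i)":   (lambda x: x - x * x,           lambda x: 1 - 2 * x),
    "(ii)":  (lambda x: 1.5 * (x - x ** 3),  lambda x: 1.5 * (1 - 3 * x * x)),
    "(iii)": (lambda x: 2 / 3 * math.sin(math.pi * x),
              lambda x: 2 / 3 * math.pi * math.cos(math.pi * x)),
    "(iv)":  (lambda x: x * x - x ** 4,      lambda x: 2 * x - 4 * x ** 3),
}
values = {name: P(u, du) for name, (u, du) in trials.items()}
for name, val in values.items():
    print(name, round(val, 5))
```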

11.3.2. (i) u⋆ (x) = (iii) P[ u⋆ ] = −

P[ c x − c x2 ] =

11.3.3. (a) (i) u⋆ (x) = (ii) (iii) (iv ) (b) (i)

1 6x 1 90

1 6

−

1 c ≥ − 96

5 x4 − 36 , 3 (u′ )2 2 5 P[ u ] = + x u dx, u(−1) = u(1) = 0, 2 (x2 + 1) P[hu⋆ ] = − .0282187, i h i P − 51 (1 − x2 ) = − .018997, P − 51 cos 21 π x = − .0150593. u⋆ (x) = 21 − e− 1 + e− x−1 − 21 e− 2 x ,

(ii) P[ u ] =

1 6 18 x2 Z 1 4 −1

Z 1h 0

+

1 12

1 x ′ 2 2 e (u )

P

h

1 5

i

sin 12 π x) = − .0354279.

Z 2h

i 2 1 2 ′ 2 x (u ) − 3 x u dx, u′ (1) = u(2) = 0, 2 1 37 P[ u⋆ ] = − 20 = − 1.85, h i 2 1 P[ 2 x − x ] = 11 6 = − 1.83333, P cos 2 π (x − 1) = − 1.84534. u⋆ (x) = 31 x + 79 − 49 x−2 , Z −1 h i − 21 x3 (u′ )2 − x2 u dx, u(−2) = u(−1) = 0. P[ u ] = −2

(ii) P[ u ] =

(ii)

i

− e− x u dx, u(0) = u′ (1) = 0,

(iii) P[hu⋆ ] = − .0420967, i (iv ) P 52 x − 15 x2 ) = − .0386508, (c) (i) u⋆ (x) = 25 − x−1 − 12 x2 , (iii) (iv ) (d) (i)

i 1 (u′ )2 − x u dx, u(0) = 2 0 2 1 P[ c x − c x3 ] = 52 c2 − 15 c > − 90 2 1 π 2

u(1) = 0,

for c 6= 16 , 1 = − .01042, P[ c sin π x ] = 4 c − π c ≥ − 4 = − .01027. π

= − .01111, (iv )

1 1 2 6 c − 12

Z 1h

x3 , (ii) P[ u ] =

325

13 (iii) P[hu⋆ ] = − 216 = − .060185, i 1 (iv ) P − 4 (x + 1)(x + 2) = − .0536458,

h

i

3 P − 20 x(x + 1)(x + 2) = − .0457634.

11.3.4. (a) Boundary value problem: − u′′ = 3, u(0) = u(1) = 0; solution: u⋆ (x) = − 23 x2 + 32 x. “

”′

(b) Boundary value problem: − (x + 1) u′ = 5, u(0) = u(1) = 0; 5 solution: u⋆ (x) = log(x + 1) − 5 x. log 2 (c) Boundary value problem: − 2 (x u′ )′ = − 2, u(1) = u(3) = 0; 2 log x. solution: u⋆ (x) = x − 1 − log 3 (d) Boundary value problem: − (ex u′ )′ = 1 + ex , u(0) = u(1) = 0; solution: u⋆ (x) = (x − 1) e− x − x + 1. ! du d x 1 (e) Boundary value problem: − 2 , u(−1) = u(1) = 0; =− 2 dx 1 + x2 dx (x + 1)2 1 1 3 solution: u⋆ (x) = − 16 x + 16 x .

11.3.5. 3 log x 1 . (a) Unique minimizer: u⋆ (x) = x2 − 2 x + + 2 2 2 log 2 (b) No minimizer since x is not positive for all − π < x < π. ! cos 1(1 + sin x) log 1 (1 + sin 1) cos x ! . (c) Unique minimizer: u⋆ (x) = (x − 1) − 1 + sin 1 2 log 1 − sin 1 (d) No minimizer since 1 − x2 is not positive for all − 2 < x < 2. (e) Not a unique minimizer due to Neumann boundary conditions. 11.3.6. Yes: any function of the form u(x) = a + 41 x2 − 16 x3 satisfies the Neumann boundary value problem − u′′ = x − 21 , u′ (0) = u′ (1) = 0. Note that the right hand side satisfies D

the Fredholm condition 1 , x − 21 problem would have no solution.

E

=

Z 1“ 0

x−

1 2

”

dx = 0; otherwise the boundary value

11.3.7. Arguing as in Example 11.3, any minimum must satisfy the Neumann boundary value problem − (c(x) u′ )′ = f (x), u′ (0) = 0, u′ (1) = 0. The general solution to the differential ! Z x 1 Zx 1 Zy f (z) dz dy. Since u′ (x) = a − f (z) dz, equation is u(x) = a x + b − 0 c(y) 0 c(x) 0 the first boundary condition requires a = 0. The second boundary condition requires 1 Z1 f (x) dx = 0. The mean zero condition is both necessary and sufficient u′ (1) = c(x) 0 for the boundary value problem to have a (non-unique) solution, and hence the functional to achieve its minimum value. Note that all solutions to the boundary value problem give the same value for the functional, since P[ u + b ] = P[ u ] − b ♦ 11.3.8.

1 2

k D[ u ] k2 =

1 2

k v k2 =

Z ℓ

1 0 2

c(x) v(x)2 dx =

Z ℓ

1 0 2

Z 1 0

f (x) dx = P[ u ].

v(x) w(x) dx =

1 2

h v , w i.

11.3.9. According to (11.91–92) (or, using a direct integration by parts) P[ u ] = 12 h K[ u ] , u i − h u , f i, so when K[ u⋆ ] = f is the minimizer, P[ u⋆ ] = − 21 h u⋆ , f i = − 21 h u⋆ , K[ u⋆ ] i < 0 326

since K is positive definite and u⋆ is not the zero function when f 6≡ 0. 11.3.10. Yes. Same argument 11.3.11. u(x) = x2 satisfiess

Z 1 0

u′′ (x) u(x) dx =

2 3.

Positivity of

Z 1 0

for functions that satisfy the boundary conditions u(0) = u(1).

− u′′ (x) u(x) dx holds only

11.3.12. No. The boundary terms still vanish and so the integration by parts identity continues Z ℓh

i

to hold, but now u(x) = a constant makes − u′′ (x) u(x) dx = 0. The integral is ≥ 0 0 for all functions satisfying the Neumann boundary conditions. 11.3.13. hh I [ u ] , v ii = I ∗[ v ] =

Z ℓ 0

u(x) v(x) c(x) dx =

Z ℓ 0

u(x) v(x) ρ(x) dx = h u , I ∗ [ v ] i, provided

c(x) v(x) is a multiplication operator. ρ(x) Z ℓ

c(x) u(x) v(x) dx = 11.3.14. h K[ u ] , v i = 0 conditions are required.

Z ℓ 0

u(x) c(x) v(x) dx = h u , K[ v ] i. No boundary

11.3.15. Integrating by parts, Z ℓ Z ℓ i h i d h du u(x) v(x) c(x) dx v(x) c(x) dx = u(ℓ) c(ℓ) v(ℓ) − u(0) c(0) v(0) − hh D[ u ] , v ii = 0 0 dx dx ! * + Z ℓ h i 1 1 d(v c) d u(x) − = = h u , D ∗ [ v ] i, ρ(x) dx = u , − v(x) c(x) 0 ρ(x) dx ρ dx provided the boundary terms vanish: u(ℓ) c(ℓ) v(ℓ) − u(0) c(0) v(0) = 0.

Therefore,

i c(x) dv c′ (x) d h 1 c(x) v(x) = − − v(x). ρ(x) dx ρ(x) dx ρ(x) Note that we have the same boundary terms as in (11.87), and so all of our self-adjoint boundary conditions continue to be valid. The self-adjoint boundary value problem K[ u ] = D∗ ◦ D[ u ] = f is now given by 1 d du K[ u ] = − c(x) = f (x), ρ(x) dx dx

D∗ [ v(x) ] = −

along with the selected boundary conditions, e.g., Dirichlet conditions u(0) = u(ℓ) = 0. ♥ 11.3.16. (a) Integrating the first term by parts, we find hh L[ u ] , v ii =

Z 1h i u′ (x) v(x) + 2 x u(x) v(x) dx 0 Z 1h h i

= u(1) v(1) − u(0) v(0) +

0

= h u , − v + 2 x v i = h u , L∗ [ v ] i, ′

i

− u(x) v ′ (x) + 2 x u(x) v(x) dx

owing to the boundary conditions. Therefore, the adjoint operator is L∗ [ v ] = − v ′ +2 x v. (b) The operator K is positive definite because ker L = {0}. Indeed, the solution to L[ u ] = 0 2 is u(x) = c e− x and either boundary condition implies c = 0. (c) K[ u ] = (− D + 2 x)(D + 2 x)u = − u′′ + (4 x2 − 2)u = f (x), u(0) = u(1) = 0. 2 2 (d) The general solution to the differential equation (− D + 2 x)v = ex is v(x) = (b − x)ex , 327

2

and so the solution to (− D + 2 x)(D + 2 x)u = ex is found by solving 2

(D + 2 x)u = (b − x)ex ,

Imposing the boundary conditions, we find u(x) =

− x2

e

−e 4

2

2

u(x) = − 41 ex + a e− x + b e− x

so x2

+

Z x 2 e2 y 0 Z 1 2 y2 e dy 0

(e4 − 1) e− x 4

2

2

Z x 2 e2 y 0

dy.

dy .

(e) Because the boundary terms u(1) v(1) − u(0) v(0) in the integration by parts argument are not zero. Indeed, we should identify v(x) = u′ (x) + 2 x u(x), and so the correct form of the free boundary conditions v(0) = v(1) = 0 requires u′ (0) = u′ (1) + 2 u(1) = 0. On the other hand, although it is not self-adjoint and so doesn’t admit a minimization principle, the free boundary value problem does have a unique solution: “ 2 ” u(x) = − 41 e2 1 + e− x .

♥ 11.3.17.
(a) $a(x)\, u'' + b(x)\, u' = -\bigl(c(x)\, u'\bigr)' = -c(x)\, u'' - c'(x)\, u'$ if and only if a = −c and b = −c′ = a′.
(b) We require $\rho\, b = (\rho\, a)' = \rho'\, a + \rho\, a'$, and so $\dfrac{\rho'}{\rho} = \dfrac{b - a'}{a}$. Hence, $\rho = \exp\left( \displaystyle\int \frac{b - a'}{a}\, dx \right)$ is found by one integration.
(c) (i) No integrating factor needed: $-(x^2 u')' = x - 1$;
(ii) no integrating factor needed: $-(e^x u')' = -e^{2x}$;
(iii) integrating factor $\rho(x) = -e^{2x}$, so $-(e^{2x} u')' = -e^{2x}$;
(iv) integrating factor $\rho(x) = x^2$, so $-(x^3 u')' = x^3$;
(v) integrating factor $\rho(x) = -\dfrac{1}{\cos^2 x}$, so $-\left( \dfrac{u'}{\cos x} \right)' = -\dfrac{1}{\cos x}$.

11.3.18. (a) Integrating the first term by parts and using the boundary conditions,
$$\langle\langle L[u] , v \rangle\rangle = \int_0^1 \bigl[\, u'\, v_1 + u\, v_2 \,\bigr]\, dx = \int_0^1 u\, \bigl[ -v_1' + v_2 \bigr]\, dx = \langle u , L^*[v] \rangle,$$
and so the adjoint operator is $L^*[v] = -v_1' + v_2$.

(b) $-u'' + u = x - 1$, u(0) = u(1) = 0, with solution $u(x) = x - 1 + \dfrac{e^{2-x} - e^{x}}{e^2 - 1}$.

11.3.19. Use integration by parts:
$$\langle L[u] , v \rangle = \int_0^{2\pi} i\, u'(x)\, \overline{v(x)}\, dx = -\int_0^{2\pi} i\, u(x)\, \overline{v'(x)}\, dx = \int_0^{2\pi} u(x)\, \overline{i\, v'(x)}\, dx = \langle u , L[v] \rangle,$$
where the boundary terms vanish because both u and v are assumed to be 2π periodic.

11.3.20. Quadratic polynomials do not, in general, satisfy any of the allowable boundary conditions, and so, in this situation, the boundary terms will contribute to the computation of the adjoint.

11.3.21. Solving the corresponding boundary value problem $-\dfrac{d}{dx}\left( x\, \dfrac{du}{dx} \right) = -x^2$, u(1) = 0, u(2) = 1, gives $u(x) = \dfrac{x^3 - 1}{9} + \dfrac{2 \log x}{9 \log 2}$.

11.3.22. (a) (i) $-u'' = -1$, u(0) = 2, u(1) = 3; (ii) $u_\star(x) = \tfrac12 x^2 + \tfrac12 x + 2$.

(b) (i) − 2 (x u′ )′ = 2 x, u(1) = 1, u(e) = 1. (ii) u⋆ (x) = 54 − 14 x2 + 14 (e2 − 1) log x. ! 1 du d = 0, u(−1) = 1, u(1) = −1. (ii) u⋆ (x) = − 43 x − 41 x3 . (c) (i) − dx 1 + x2 dx (d) (i) − (e− x u′ )′ = 1, u(0) = −1, u(1) = 0; (ii) u⋆ (x) = (x − 1) e− x . 11.3.23. (a) (i) (b) (i)

Z πh 0 Z 1

−1

1 2 2 4

i

(u′ )2 − (cos x) u dx, u(0) = 1, u(π) = −2; 3

(u′ )2 − x2 u 5 dx, u(−1) = −1, u(1) = 1; 2(x2 + 1)

1 6 (ii) u⋆ (x) = − 18 x − Z 1h

1 12 i

x4 +

1 4

x3 +

3 4

x−

u(x) = α (1 − x) + β x +

= α (1 − x) + β x + 11.3.25. (a)

(b) (c) (d)

"

0 Z x 0

x . π

5 36 .

1 x ′ 2 2 e (u ) + u dx, u(0) = 1, u(1) = 0; 0 Z 1 h i 1 2 ′ 2 (d) (i) x (u ) − (x2 − x) u dx, u(1) = −1, 2 −1 5 (ii) u⋆ (x) = − 16 x2 + 21 x + 11 3 − x. h i e (x) = u(x) − α (1 − x) + β x 11.3.24. The function u e ′′ = f, u e (0) = u e (1) = 0, and so is conditions − u Z 1 e (x, y) = f (y) G(x, y) dy. Therefore, u 0 Z 1

(c) (i)

(ii) u⋆ (x) = cos x −

(ii) u⋆ (x) = e− x (1 − x). u(3) = 2;

satisfies the homogeneous boundary given by the superposition formula

f (y) G(x, y) dy (1 − x) y f (y) dy +

Z 1 x

x (1 − y) f (y) dy.

#

d du − (1 + x) = 1 − x, u(0) = 0, u(1) = .01; dx dx solution:0 u⋆ (x) = .25 x2 − 1.5 x + 1.8178 log(1 + x); 1 !2 Z 1 du @ 1 (1 + x) P[ u ] = − (1 − x) u A dx; 2 0 dx P[ u⋆ ] = −.0102; When u(x) = .1 x + α (x − x2 ), then P[ u ] = .25 α2 − .085 α − .00159, with minimum value −.0070535 at α = .17. When u(x) = .1 x + α sin πx, then P[ u ] = 3.7011 α2 − .3247 α − .0087, with minimum value −.0087 at α = .0439.

e (x) + h(x) where h(x) is any ♦ 11.3.26. We proceed as in the Dirichlet case, setting u(x) = u function that satisfies the mixed boundary conditions h(0) = α, h′ (ℓ) = β, e.g., h(x) = α + β x. The same calculation leads to (11.105), which now is hh u′ , h′ ii − h u , K[ h ] i = c(ℓ) h′ (ℓ) u(ℓ) − c(0) h′ (0) α = β c(ℓ) u(ℓ) − c(0) h′ (0) α. The second term, C1 = − c(0) h′ (0) α, doesn’t depend on u. Therefore, e ] = − β c(ℓ) u(ℓ) + P[ u ] − C + C , P[ u 1 0 e (x) minimizes P[ u e ] if and only if u(x) = u e (x) + h(x) minimizes the functional and so u (11.106).

11.3.27. Solving the boundary value problem $-\dfrac{d}{dx}\left( x\, \dfrac{du}{dx} \right) = -\tfrac12 x^2$ with the given boundary conditions gives $u(x) = \tfrac{1}{18} x^3 + \tfrac{17}{18} - \tfrac43 \log x$.

11.3.28. For any ε > 0, the functions
$$u(x) = \begin{cases} 0, & 0 \le x \le 1 - \varepsilon, \\ \dfrac{1}{2\varepsilon}\, (x - 1 + \varepsilon)^2, & 1 - \varepsilon \le x \le 1, \end{cases}$$
satisfy the boundary conditions, but $J[u] = \tfrac13 \varepsilon$, and hence J[u] can be made as close to 0 as desired. However, if u(x) is continuous, J[u] = 0 if and only if u(x) = c is a constant function. But no constant function satisfies both boundary conditions, and hence J[u] has no minimum value when subject to the boundary conditions.

♦ 11.3.29. The extra boundary terms serve to cancel those arising during the integration by parts computation:
$$\widehat P[u] = \int_a^b \bigl[ \tfrac12 (u')^2 + u_\star''\, u \bigr]\, dx - u_\star'(b)\, u(b) + u_\star'(a)\, u(a) = \int_a^b \bigl[ \tfrac12 (u')^2 - u_\star'\, u' \bigr]\, dx = \int_a^b \tfrac12 (u' - u_\star')^2\, dx - \int_a^b \tfrac12 (u_\star')^2\, dx.$$
Again, the minimum occurs when u = u⋆, but is no longer unique since we can add in any constant function, which solves the associated homogeneous boundary value problem.
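The value J[u] = ε/3 quoted in Exercise 11.3.28 can be reproduced numerically. This is a minimal sketch under the assumption that the functional is $J[u] = \int_0^1 u'(x)^2\, dx$ (the exercise statement itself is not reproduced in these solutions); the function names are illustrative.

```python
def u_prime(x, eps):
    """Derivative of u_eps: zero on [0, 1 - eps], (x - 1 + eps)/eps on [1 - eps, 1]."""
    return 0.0 if x <= 1 - eps else (x - 1 + eps) / eps

def J(eps, n=100000):
    """Midpoint-rule approximation of the integral of u'(x)^2 over [0, 1]."""
    h = 1.0 / n
    return sum(u_prime((k + 0.5) * h, eps) ** 2 for k in range(n)) * h
```

Evaluating `J(eps)` for decreasing `eps` shows the values shrinking in proportion to `eps`, which illustrates why the infimum 0 is approached but never attained.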

11.4.1.
(a) $u(x) = \tfrac{1}{24} x^4 - \tfrac{1}{12} x^3 + \tfrac{1}{24} x$, $w(x) = u''(x) = \tfrac12 x^2 - \tfrac12 x$.
(b) no equilibrium solution.
(c) $u(x) = \tfrac{1}{24} x^4 - \tfrac16 x^3 + \tfrac13 x$, $w(x) = u''(x) = \tfrac12 x^2 - x$.
(d) $u(x) = \tfrac{1}{24} x^4 - \tfrac16 x^3 + \tfrac14 x^2$, $w(x) = u''(x) = \tfrac12 x^2 - x + \tfrac12$.
(e) $u(x) = \tfrac{1}{24} x^4 - \tfrac16 x^3 + \tfrac16 x^2$, $w(x) = u''(x) = \tfrac12 x^2 - x + \tfrac13$.

♥ 11.4.2.
(a) Maximal displacement: $u(\tfrac12) = \tfrac{5}{384} = .01302$; maximal stress: $w(\tfrac12) = -\tfrac18 = -.125$;
(b) no solution;
(c) Maximal displacement: $u(1) = \tfrac{5}{24} = .2083$; maximal stress: $w(1) = -\tfrac12 = -.5$;
(d) Maximal displacement: $u(1) = \tfrac18 = .125$; maximal stress: $w(0) = \tfrac12 = .5$;
(e) Maximal displacement: $u(1) = \tfrac{1}{24} = .04167$; maximal stress: $w(0) = \tfrac13 = .3333$.
Thus, case (c) has the largest displacement, while cases (c,d) have the largest stress.

11.4.3. Except in (b), which has no minimization principle, we minimize
$$P[u] = \int_0^1 \bigl[ \tfrac12 u''(x)^2 - u(x) \bigr]\, dx$$
subject to the boundary conditions
(a) u(0) = u″(0) = u(1) = u″(1) = 0.
(c) u(0) = u″(0) = u′(1) = u‴(1) = 0.
(d) u(0) = u′(0) = u″(1) = u‴(1) = 0.
(e) u(0) = u′(0) = u′(1) = u‴(1) = 0.

11.4.4.
(a) (i) $G(x, y) = \begin{cases} \tfrac13 x y - \tfrac16 x^3 - \tfrac12 x y^2 + \tfrac16 x^3 y + \tfrac16 x y^3, & x < y, \\ \tfrac13 x y - \tfrac12 x^2 y - \tfrac16 y^3 + \tfrac16 x^3 y + \tfrac16 x y^3, & x > y; \end{cases}$
(ii) [Graph omitted.]
(iii) $u(x) = \displaystyle\int_0^x \bigl( \tfrac13 x y - \tfrac12 x^2 y - \tfrac16 y^3 + \tfrac16 x^3 y + \tfrac16 x y^3 \bigr) f(y)\, dy + \int_x^1 \bigl( \tfrac13 x y - \tfrac16 x^3 - \tfrac12 x y^2 + \tfrac16 x^3 y + \tfrac16 x y^3 \bigr) f(y)\, dy$;
(iv) maximal displacement at $x = \tfrac12$ with $G(\tfrac12, \tfrac12) = \tfrac{1}{48}$.
(b) no Green's function.
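The Green's function in 11.4.4(a) can be cross-checked against the unit-load solution of 11.4.1(a). The sketch below assumes the piecewise formula $G(x,y) = \tfrac13 xy - \tfrac16 x^3 - \tfrac12 x y^2 + \tfrac16 x^3 y + \tfrac16 x y^3$ for x < y, extended by symmetry to x > y, and uses exact rational arithmetic; because G is a cubic polynomial in y on each side of y = x, a single Simpson step on each piece is exact. The function names are illustrative.

```python
from fractions import Fraction as Fr

def G(x, y):
    """Green's function of u'''' = f with u(0) = u''(0) = u(1) = u''(1) = 0."""
    if x > y:               # G is symmetric, so reduce to the x <= y branch
        x, y = y, x
    return (Fr(1, 3) * x * y - Fr(1, 6) * x**3 - Fr(1, 2) * x * y**2
            + Fr(1, 6) * x**3 * y + Fr(1, 6) * x * y**3)

def simpson(g, a, b):
    """Simpson's rule on a single interval; exact for cubic polynomials."""
    return (b - a) / 6 * (g(a) + 4 * g((a + b) / 2) + g(b))

def displacement(x):
    """u(x) = integral of G(x, y) dy over [0, 1], i.e. the response to f = 1."""
    left = simpson(lambda y: G(x, y), Fr(0), x)
    right = simpson(lambda y: G(x, y), x, Fr(1))
    return left + right

def u_exact(x):
    """Simply supported beam under unit load, Exercise 11.4.1(a)."""
    return Fr(1, 24) * x**4 - Fr(1, 12) * x**3 + Fr(1, 24) * x
```

For example, `displacement(Fr(1, 2))` reproduces the midpoint deflection of 11.4.2(a), and `G(Fr(1, 2), Fr(1, 2))` the value quoted in part (iv).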

(c) (i) $G(x, y) = \begin{cases} x y - \tfrac16 x^3 - \tfrac12 x y^2, & x < y, \\ x y - \tfrac12 x^2 y - \tfrac16 y^3, & x > y; \end{cases}$
(ii) [Graph omitted.]
(iii) $u(x) = \displaystyle\int_0^x \bigl( x y - \tfrac12 x^2 y - \tfrac16 y^3 \bigr) f(y)\, dy + \int_x^1 \bigl( x y - \tfrac16 x^3 - \tfrac12 x y^2 \bigr) f(y)\, dy$;
(iv) maximal displacement at x = 1 with $G(1, 1) = \tfrac13$.

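(d) (i) $G(x, y) = \begin{cases} -\tfrac16 x^3 + \tfrac12 x^2 y, & x < y, \\ \tfrac12 x y^2 - \tfrac16 y^3, & x > y; \end{cases}$
(ii) [Graph omitted.]
(iii) $u(x) = \displaystyle\int_0^x \bigl( \tfrac12 x y^2 - \tfrac16 y^3 \bigr) f(y)\, dy + \int_x^1 \bigl( -\tfrac16 x^3 + \tfrac12 x^2 y \bigr) f(y)\, dy$;
(iv) maximal displacement at x = 1 with $G(1, 1) = \tfrac13$.
(e) (i) $G(x, y) = \begin{cases} -\tfrac16 x^3 + \tfrac12 x^2 y - \tfrac14 x^2 y^2, & x < y, \\ \tfrac12 x y^2 - \tfrac16 y^3 - \tfrac14 x^2 y^2, & x > y; \end{cases}$
(ii) [Graph omitted.]
(iii) $u(x) = \displaystyle\int_0^x \bigl( \tfrac12 x y^2 - \tfrac16 y^3 - \tfrac14 x^2 y^2 \bigr) f(y)\, dy + \int_x^1 \bigl( -\tfrac16 x^3 + \tfrac12 x^2 y - \tfrac14 x^2 y^2 \bigr) f(y)\, dy$;
(iv) maximal displacement at x = 1 with $G(1, 1) = \tfrac{1}{12}$.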

♥ 11.4.5. The boundary value problem for the bar is
$$-u'' = f, \qquad u(0) = u(\ell) = 0, \qquad\text{with solution}\quad u(x) = \tfrac12 f\, x\, (\ell - x).$$
The maximal displacement is at the midpoint, with $u(\tfrac12 \ell) = \tfrac18 f\, \ell^2$. The boundary value problem for the beam is
$$u'''' = f, \qquad u(0) = u'(0) = u(\ell) = u'(\ell) = 0, \qquad\text{with solution}\quad u(x) = \tfrac{1}{24} f\, x^2 (\ell - x)^2.$$
The maximal displacement is at the midpoint, with $u(\tfrac12 \ell) = \tfrac{1}{384} f\, \ell^4$. The beam displacement is greater than the bar displacement if and only if $\ell > 4 \sqrt{3}$.
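The crossover length in Exercise 11.4.5 follows from comparing the two midpoint values, $\tfrac{1}{384} f \ell^4 > \tfrac18 f \ell^2$, i.e. $\ell^2 > 48$. A short sketch, with illustrative function names, makes the comparison concrete:

```python
import math

def bar_midpoint(f, ell):
    """Midpoint displacement of the bar: f * ell^2 / 8."""
    return f * ell**2 / 8

def beam_midpoint(f, ell):
    """Midpoint displacement of the clamped beam: f * ell^4 / 384."""
    return f * ell**4 / 384

# Length at which the two midpoint displacements coincide (ell^2 = 48).
crossover = 4 * math.sqrt(3)
```

For lengths slightly below `crossover` the bar deflects more, and for lengths slightly above it the beam does, independently of the (positive) load f.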

11.4.6. The strain is $v(x) = \dfrac{x^2 - x}{2(1 + x^2)}$, with stress $w(x) = (1 + x^2)\, v(x) = \tfrac12 (x^2 - x)$, which is obtained by solving w″ = 1 subject to boundary conditions w(0) = w(1) = 0. The problem is statically determinate because the boundary conditions uniquely determine w(x) and v(x) without having to first find the displacement u(x) (which is rather complicated).

11.4.7.
(a) The differential equation is u″″ = f. Integrating once, the boundary conditions u‴(0) = u‴(1) = 0 imply
$$u'''(x) = \int_0^x f(y)\, dy \qquad\text{with}\qquad \int_0^1 f(y)\, dy = 0.$$
Integrating again, the other boundary conditions u″(0) = u″(1) = 0 imply that
$$u''(x) = \int_0^x \left( \int_0^z f(y)\, dy \right) dz \qquad\text{with}\qquad \int_0^1 \left( \int_0^z f(y)\, dy \right) dz = 0.$$
(b) The Fredholm alternative requires f(x) be orthogonal to the functions in ker K = ker L = ker D², with basis 1 and x. (Note that both functions satisfy the free boundary conditions.) Then
$$\langle f , 1 \rangle = \int_0^1 f(x)\, dx = 0, \qquad \langle f , x \rangle = \int_0^1 x\, f(x)\, dx = 0.$$
Comparing with the two previous constraints, the first are identical, while the second are equivalent using an integration by parts.
(c) $f(x) = x^2 - x + \tfrac16$ satisfies the constraints; the corresponding solution is $u(x) = \tfrac{1}{360} x^6 - \tfrac{1}{120} x^5 + \tfrac{1}{144} x^4 + c\, x + d$, where c, d are arbitrary constants.

11.4.8.
(a) The differential equation is u″″ = f. Integrating once, the boundary condition u‴(0) = 0 implies $u'''(x) = \int_0^x f(y)\, dy$. Integrating again, the boundary conditions u″(0) = u″(1) = 0 imply that
$$u''(x) = \int_0^x \left( \int_0^z f(y)\, dy \right) dz \qquad\text{with}\qquad \int_0^1 \left( \int_0^z f(y)\, dy \right) dz = 0.$$
Integrating twice more, one can arrange to satisfy the remaining boundary condition u(1) = 0, and so in this case there is only one constraint.
(b) The Fredholm alternative requires f(x) be orthogonal to the functions in ker K = ker L = ker D², with basis x − 1, and so
$$\langle f , x - 1 \rangle = \int_0^1 (x - 1)\, f(x)\, dx = 0.$$
This is equivalent to the previous constraint through an integration by parts.
(c) $f(x) = x - \tfrac13$ satisfies the constraint; the corresponding solution is $u(x) = \tfrac{1}{120} x^5 - \tfrac{1}{72} x^4 + c\, (x - 1) + \tfrac{1}{180}$, where c is an arbitrary constant.

11.4.9. False. Any boundary conditions leading to self-adjointness result in a symmetric Green's function.

11.4.10.

$$u = \frac16 \int_0^x y^2 (1 - x)^2 (3x - y - 2xy)\, f(y)\, dy + \frac16 \int_x^1 x^2 (1 - y)^2 (3y - x - 2xy)\, f(y)\, dy,$$
$$\frac{du}{dx} = \frac12 \int_0^x y^2 (1 - x)(1 - 3x + 2xy)\, f(y)\, dy + \frac12 \int_x^1 x\, (1 - y)^2 (2y - x - 2xy)\, f(y)\, dy,$$
$$\frac{d^2 u}{dx^2} = \int_0^x y^2 (-2 + 3x + y - 2xy)\, f(y)\, dy + \int_x^1 (1 - y)^2 (y - x - 2xy)\, f(y)\, dy,$$
$$\frac{d^3 u}{dx^3} = \int_0^x y^2 (3 - 2y)\, f(y)\, dy - \int_x^1 (1 - y)^2 (1 + 2y)\, f(y)\, dy,$$
$$\frac{d^4 u}{dx^4} = f(x).$$
Moreover, u(0) = u′(0) = u(1) = u′(1) = 0.
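As a sanity check on the superposition formula of Exercise 11.4.10, one can take the unit load f = 1, for which the clamped beam on [0, 1] has the solution $u(x) = \tfrac{1}{24} x^2 (1 - x)^2$ (Exercise 11.4.5 with ℓ = 1). The sketch below assumes the kernel $\tfrac16 y^2 (1-x)^2 (3x - y - 2xy)$ for y < x and $\tfrac16 x^2 (1-y)^2 (3y - x - 2xy)$ for y > x; both are cubic in y, so one Simpson step per piece is exact in rational arithmetic. The function names are illustrative.

```python
from fractions import Fraction as Fr

def simpson(g, a, b):
    """Simpson's rule on a single interval; exact for cubic polynomials."""
    return (b - a) / 6 * (g(a) + 4 * g((a + b) / 2) + g(b))

def u(x):
    """Superposition integral for the clamped beam with unit load f = 1."""
    left = simpson(lambda y: y**2 * (1 - x)**2 * (3*x - y - 2*x*y), Fr(0), x)
    right = simpson(lambda y: x**2 * (1 - y)**2 * (3*y - x - 2*x*y), x, Fr(1))
    return (left + right) / 6

def u_clamped(x):
    """Closed-form clamped beam displacement under unit load."""
    return x**2 * (1 - x)**2 / 24
```

Agreement of `u` and `u_clamped` at rational sample points confirms the formula for this particular load only; other loads would need their own quadrature.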

♥ 11.4.11. Same answer in all three parts: Since, in the absence of boundary conditions, ker D² = { a x + b }, we must impose boundary conditions in such a way that no non-zero affine function can satisfy them. The complete list is (i) fixed plus any other boundary condition; or (ii) simply supported plus any other except free. All other combinations have a nontrivial kernel, and so have non-unique equilibria, are not positive definite, and do not have a Green's function.

11.4.12.
(a) Simply supported end with right end raised 1 unit; solution u(x) = x.
(b) Clamped ends, the left is held horizontally and moved one unit down, the right is held at an angle $\tan^{-1} 2$; solution $u(x) = x^2 - 1$.
(c) Left end is clamped at a 45° angle, right end is free with an induced stress; solution $u(x) = x + \tfrac12 x^2$.
(d) Left end is simply supported and raised 2 units up, right end is sliding and tilted at an angle $-\tan^{-1} 2$; solution u(x) = 2 − 2x.

11.4.13.
(a) $P[u] = \displaystyle\int_0^1 \tfrac12 u''(x)^2\, dx$,
(b) $P[u] = \displaystyle\int_0^1 \tfrac12 u''(x)^2\, dx$,
(c) $P[u] = u'(1) + \displaystyle\int_0^1 \tfrac12 u''(x)^2\, dx$,
(d) $P[u] = \displaystyle\int_0^1 \tfrac12 u''(x)^2\, dx$.

11.4.14.

1

8 <

(a) u(x) = : 8 > > > <

0.5

−1.25 (x + 1)3 + 4.25 (x + 1) − 2, 3

2

1.25 x − 3.75 x + .5 x − 1,

−1 ≤ x ≤ 0,

(c) u(x) =

(d) u(x) = > > > > > > > :

1

0.5 -0.5 -1 -1.5 -2

2

− x3 + 2 x + 1,

0 ≤ x ≤ 1,

1.5

1 ≤ x ≤ 2,

− (x − 2)3 + 3 (x − 2)2 − (x − 2),

1 0.5

2 ≤ x ≤ 3.

0.5

1

1.5

2

2.5

2.5

(x − 2),

2

1 ≤ x ≤ 2,

1.5

2 ≤ x ≤ 4.

0.5

1

1.5

3

1.53571 (x + 2) − 4.53517 (x + 2) + 5,

2.5

3

3.5

−2 ≤ x ≤ −1,

− 3.67857 (x + 1)3 + 4.60714 (x + 1)2 + .07143 (x + 1) + 2, 3

2

−1 ≤ x ≤ 0,

2

4.17857 x − 6.42857 x − 1.75 x + 3,

0 ≤ x ≤ 1,

− 2.03571 (x − 1)3 + 6.10714 (x − 1)2 − 2.07143 (x − 1) − 1,

1 ≤ x ≤ 2.

5

4 3 2 1 -2

-1

1

2

-1

11.4.15. In general, the formulas for the homogeneous clamped spline coefficients a j , bj , cj , dj , for j = 0, . . . , n − 1, are aj = y j , j = 0, . . . , n − 1, dj =

cj+1 − cj 3 hj

b0 = 0,

,

dn−1 = −

j = 0, . . . , n − 2,

bj =

yj+1 − yj hj

bn−1 = “

3

3

8 < 2 (x − 1)3 − 11 (x − 1) + 3, 3 3 : 3 1 − 3 (x − 2) + 2 (x − 2)2 − 35 8 > > > > > > > <

-0.5

0 ≤ x ≤ 1.

(b) u(x) = > 2 (x − 1)3 − 3 (x − 1)2 − (x − 1) + 2, > > :

-1

where c = c0 , c1 , . . . , cn−1

”T

−

(2 cj + cj+1 ) hj 3

,

bn−1

3 h2n−1

−

j = 1, . . . , n − 2,

3(yn − yn−1 ) 1 − cn−1 hn−1 , 2 hn−1 2

solves A c = z =


“

2 cn−1 , 3 hn−1

z0 , z1 , . . . , zn−1

”T

, with

4

0

A=

B B B B B B B B B B B B B @

2 h0 h0

h0 2 (h0 + h1 ) h1

h1 2 (h1 + h2 ) h2

0

y − y0 z0 = 3 1 , h0

zj = 3 @

h2 2 (h2 + h3 ) h3 .. .. .. . . . hn−3 2 (hn−3 + hn−2 ) hn−2 hn−2 2 hn−2 + 32 hn−1

yj+1 − yj hj

−

yj − yj−1 hj−1

The particular solutions are:

1 A

,

1

C C C C C C C C, C C C C C A

j = 1, . . . , n − 2.

1

8 <

(a) u(x) = :

0.5

−5.25 (x + 1)3 + 8.25 (x + 1)2 − 2,

−1 ≤ x ≤ 0,

4.75 x3 − 7.5 x2 + .75 x + 1,

-1

-0.5

1

0.5 -0.5

0 ≤ x ≤ 1.

-1 -1.5 -2 2

8 > > > <

− 2.6 x3 + 3.6 x2 + 1,

0 ≤ x ≤ 1,

(b) u(x) = > 2.8 (x − 1)3 − 4.2 (x − 1)2 − .6 (x − 1) + 2, > > :

1 ≤ x ≤ 2,

− 2.6 (x − 2)3 + 4.2 (x − 2)2 − .6 (x − 2),

2 ≤ x ≤ 3.

1.5 1 0.5

0.5

1

1.5

2

2.5

3

1.5

2

2.5

3

3.5

4

3

8 <

(c) u(x) = : 8 > > > > > > > <

(d) u(x) = > > > > > > > :

2.5

3.5 (x − 1)3 − 6.5 (x − 1)2 + 3,

1 ≤ x ≤ 2,

− 1.125 (x − 2)3 + 4 (x − 2)2 − 2.5 (x − 2),

2 ≤ x ≤ 4.

2 1.5 1 0.5

-0.5

4.92857 (x + 2)3 − 7.92857 (x + 2)2 + 5, 3

−2 ≤ x ≤ −1,

2

− 4.78571 (x + 1) + 6.85714 (x + 1) − 1.07143 (x + 1) + 2, 3

−1 ≤ x ≤ 0,

2

5.21429 x − 7.5 x − 1.71429 x + 3, 3

0 ≤ x ≤ 1,

2

− 5.07143 (x − 1) + 8.14286 (x − 1) − 1.07143 (x − 1) − 1,

1 ≤ x ≤ 2.

5

4 3 2 1 -2

-1

1

2

-1

11.4.16. 8 > > > <

1

x3 − 2 x2 + 1,

(a) u(x) = > (x − 1)2 − (x − 1), > > :

3

0 ≤ x ≤ 1,

2

− (x − 2) + (x − 2) + (x − 2), 334

1 ≤ x ≤ 2,

2 ≤ x ≤ 3.

0.8 0.6 0.4 0.2 0.5 -0.2

1

1.5

2

2.5

3

2

8 > > > <

3

− x + 2 x + 1, 3

0 ≤ x ≤ 1,

2

(b) u(x) = > 2 (x − 1) − 3 (x − 1) − (x − 1) + 2, > > :

(c) u(x) =

(d) u(x) =

1.5

1 ≤ x ≤ 2,

− (x − 2)3 + 3 (x − 2)2 − (x − 2),

1 0.5

2 ≤ x ≤ 3.

8 5 3 > x − 49 x2 + 1, 0 > 4 > > > > 3 2 3 3 3 > < − (x − 1) + (x − 1) − (x − 1), 1 4 2 4 > 3 2 3 3 > > 2 > 4 (x − 2) − 4 (x − 2) , > > > : − 45 (x − 3)3 + 32 (x − 3)2 + 34 (x − 3), 3 8 2 3 9 3 > > − 2 (x + 2) + 4 (x + 2) + 4 (x + 2) + 1, > > > > > < 7 (x + 1)3 − 21 (x + 1)2 − 9 (x + 1) + 2, 2 4 4 > 3 21 2 9 > > − 2 x + x − x − 2, > 4 4 > > > : 1 3 2 3 9 2 (x − 1) − 4 (x − 1) + 4 (x − 1) − 1,

1

0.5

≤ x ≤ 1,

1.5

2

2.5

3

1 0.8

≤ x ≤ 2,

0.6 0.4

≤ x ≤ 3,

0.2

≤ x ≤ 4. −2 ≤ x ≤ −1, −1 ≤ x ≤ 0,

2

1

4

3

2 1

-1

-2

1

0 ≤ x ≤ 1,

2

-1

1 ≤ x ≤ 2.

-2

♠ 11.4.17. (a) If we measure x in radians, 8 > > > > > <

.9959 x − .1495 x3 , “

u(x) = > .5 + .8730 x − > > > > :

1 6

“

”

“

π − .2348 x −

.7071 + .6888 x −

1 4

”

1 6

“

π

π − .4686 x −

”2 1 4

“

− .297 x −

π

”2

“

1 6

π

”3

− .5966 x −

0 ≤ x ≤ 1, ,

1 4

π

”3

,

1 6

π≤x≤

1 4

π,

1 4

π≤x≤

1 3

π.

We plot the spline, a comparison with the exact graph, and a graph of the error: 0.8

0.8

0.003

0.6

0.6

0.002

0.4

0.4

0.001

0.2

0.2 0.2 0.2

0.4

0.6

0.8

1

0.2

0.4

0.6

0.8

0.4

0.6

1

0.8

1

(b) The maximal error in the spline is .002967 versus .000649 for the interpolating polynomial. The others all have larger error. ♣ 11.4.18. (a) 8 > > > > > <

2.2718 x − 4.3490 x3 , “

u(x) = > .5 + 1.4564 x − > > > > :

“

.75 + .5066 x −

1 4

”

9 16

“

− 3.2618 x −

”

“

+ .2224 x −

” “ ” 1 2 1 3 + 3.7164 x − , 4 4 ”2 “ ”3 9 9 − .1694 x − 16 , 16

0 ≤ x ≤ 1, 1 4

≤x≤

9 16

9 16 ,

≤ x ≤ 1.

We plot the spline, a comparison with the exact graph, and a graph of the error: 1

1 0.1

0.8

0.8

0.6

0.6

0.4

0.4

0.08 0.06 0.04 0.02 0.2

0.2 0.2

0.4

0.6

0.8

1

-0.02 0.2

0.4

0.6

0.8

1

0.2

0.4

0.6

0.8

1

(b) The maximal error in the spline is .1106 versus .1617 for the interpolating polynomial. (c) The least squares error for the spline is .0396, while the least squares cubic polynomial, p(x) = .88889 x3 − 1.90476 x2 + 1.90476 x + .12698 has larger maximal error .1270, but smaller least squares error .0112 (as it must!). 335

♠ 11.4.19. [Graphs of the three cubic spline interpolants omitted.]

The cubic spline interpolants do not exhibit any of the pathology associated with the interpolating polynomials. Indeed, the maximal absolute errors are, respectively, .4139, .1001, .003816, and so increasing the number of nodes significantly reduces the overall error. ♣ 11.4.20. Sample letters:

♣ 11.4.21. Sample letters:

Clearly, interpolating polynomials are completely unsuitable for typography! ♥ 11.4.22. (a)

8 > > > > > <

1−

19 15

x+

4 15

x3 ,

0 ≤ x ≤ 1,

7 C0 (x) = > − 15 (x − 1) + 54 (x − 1)2 − > > > > :

2 15

8 8 > > 5 > > > <

(x − 2) −

x−

C1 (x) = > 1 − > > > > : 8 > > > > > <

(x − 2)2 +

1 15

(x − 1)3 ,

(x − 2)3 ,

x3 ,

1 ≤ x ≤ 2,

0.2

(x − 1) − 59 (x − 1)2 + (x − 1)3 ,

1 ≤ x ≤ 2,

(x − 2)2 −

2 ≤ x ≤ 3,

− 25 x +

2 5

6 5

2 5

(x − 2)3 ,

x3 ,

0 ≤ x ≤ 1,

(x − 1) + 65 (x − 1)2 − (x − 1)3 , (x − 2) −

9 5

(x − 2)2 +

3 5

(x − 2)3 ,


0.6 0.4

2 ≤ x ≤ 3, 0 ≤ x ≤ 1,

− 45 (x − 2) +

4 5 > > > > : 1 5

C2 (x) = >

1 5

3 5

1 5

1 3

1 0.8

1 ≤ x ≤ 2, 2 ≤ x ≤ 3,

0.5

1

1.5

2

2.5

3

0.5

1

1.5

2

2.5

3

0.5

1

1.5

2

2.5

3

1 0.8 0.6 0.4 0.2

1 0.8 0.6 0.4 0.2

8 1 > > 15 > > > <

x−

1 15

x3 ,

0 ≤ x ≤ 1,

C3 (x) = > − 12 15 (x − 1) − 15 (x − 1)2 + > > > > :

7 15

(x − 2) +

4 5

(x − 2)2 −

4 15

1 3

(x − 1)3 ,

1 0.8 0.6

1 ≤ x ≤ 2,

(x − 2)3 ,

0.4 0.2

2 ≤ x ≤ 3.

1

0.5

1.5

2

2.5

3

(b) It suffices to note that any linear combination of natural splines is a natural spline. Moreover, $u(x_j) = y_0\, C_0(x_j) + y_1\, C_1(x_j) + \cdots + y_n\, C_n(x_j) = y_j$, as desired.
(c) The n + 1 cardinal splines $C_0(x), \ldots, C_n(x)$ form a basis. Part (b) shows that they span the space since we can interpolate any data. Moreover, they are linearly independent since, again by part (b), the only spline that interpolates the zero data, $u(x_j) = 0$ for all j = 0, …, n, is the trivial one u(x) ≡ 0.
(d) The same ideas apply to clamped splines with homogeneous boundary conditions, and to periodic splines. In the periodic case, since $C_0(x) = C_n(x)$, the vector space only has dimension n. The formulas for the clamped cardinal and periodic cardinal splines are different, though.

♥ 11.4.23.

(a) $\beta(x) = \begin{cases} (x + 2)^3, & -2 \le x \le -1, \\ 4 - 6x^2 - 3x^3, & -1 \le x \le 0, \\ 4 - 6x^2 + 3x^3, & 0 \le x \le 1, \\ (2 - x)^3, & 1 \le x \le 2. \end{cases}$ [Graph omitted.]
(b) By direct computation β′(−2) = 0 = β′(2);
(c) By the interpolation conditions, the natural boundary conditions and (b), β(−2) = β′(−2) = β″(−2) = 0 = β(2) = β′(2) = β″(2), and so periodicity is assured.
(d) Since β(x) is a spline, it is C² for −2 < x < 2, while the zero function is everywhere C². Thus, the only problematic points are at x = ±2, and, by part (c),
$$\beta^\star(-2^-) = 0 = \beta^\star(-2^+), \quad \beta^{\star\prime}(-2^-) = 0 = \beta^{\star\prime}(-2^+), \quad \beta^{\star\prime\prime}(-2^-) = 0 = \beta^{\star\prime\prime}(-2^+),$$
$$\beta^\star(2^-) = 0 = \beta^\star(2^+), \quad \beta^{\star\prime}(2^-) = 0 = \beta^{\star\prime}(2^+), \quad \beta^{\star\prime\prime}(2^-) = 0 = \beta^{\star\prime\prime}(2^+),$$
proving the continuity of β⋆(x) and its first two derivatives at ±2.

♥ 11.4.24.
(a) According to Exercise 11.4.23, the functions $B_j(x)$ are all periodic cubic splines. Moreover, their sample values
$$B_j(x_k) = \begin{cases} 4, & j = k, \\ 1, & |j - k| \equiv 1 \bmod n, \\ 0, & \text{otherwise}, \end{cases}$$
form a linearly independent set of vectors because the corresponding circulant tridiagonal n × n matrix B, with entries $b_{jk} = B_j(x_k)$ for j, k = 0, …, n − 1, is diagonally dominant.
(b) [Graphs of $B_0(x), \ldots, B_4(x)$ omitted.]
(c) Solve the linear system B α = y, where B is the matrix constructed in part (a), for the B-spline coefficients.
(d) $u(x) = \tfrac{5}{11} B_1(x) + \tfrac{2}{11} B_2(x) - \tfrac{2}{11} B_3(x) - \tfrac{5}{11} B_4(x)$. [Graph omitted.]
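The B-spline of Exercise 11.4.23 and the circulant system of Exercise 11.4.24 can be illustrated with a few lines of code. This is a sketch under stated assumptions: `beta` implements the piecewise cubic with pieces $(x+2)^3$, $4 - 6x^2 \mp 3x^3$, $(2-x)^3$, extended by zero outside [−2, 2], and `circulant_apply` multiplies a coefficient vector by the periodic matrix with 4 on the diagonal and 1 on the two cyclic off-diagonals. The data values being interpolated in part (d) are not stated in these solutions, so the code only computes what the listed coefficients imply.

```python
from fractions import Fraction as Fr

def beta(x):
    """Cubic B-spline on [-2, 2], extended by zero."""
    if -2 <= x <= -1:
        return (x + 2) ** 3
    if -1 < x <= 0:
        return 4 - 6 * x**2 - 3 * x**3
    if 0 < x <= 1:
        return 4 - 6 * x**2 + 3 * x**3
    if 1 < x <= 2:
        return (2 - x) ** 3
    return 0

def circulant_apply(alpha):
    """Multiply alpha by the periodic matrix B with entries 1, 4, 1 (cyclic)."""
    n = len(alpha)
    return [alpha[(k - 1) % n] + 4 * alpha[k] + alpha[(k + 1) % n]
            for k in range(n)]

# Coefficients from part (d), with the coefficient of B_0 equal to zero.
alpha = [Fr(0), Fr(5, 11), Fr(2, 11), Fr(-2, 11), Fr(-5, 11)]
samples = circulant_apply(alpha)   # nodal values u(x_0), ..., u(x_4)
```

The nodal values of `beta` at −1, 0, 1 are exactly the entries 1, 4, 1 of the matrix, which is why the system in part (c) is diagonally dominant.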

11.5.1. $u(x) = \dfrac{e^{3x/2} - e^{-3x/2}}{e^3 - e^{-3}} = \dfrac{\sinh \frac32 x}{\sinh 3}$; yes, the solution is unique.

11.5.2. True — the solution is u(x) = 1.

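11.5.3. The Green's function superposition formula is
$$u(x) = \int_0^{1/2} G(x, y)\, dy - \int_{1/2}^1 G(x, y)\, dy.$$
If $x \le \tfrac12$, then
$$u(x) = \int_0^x \frac{\sinh \omega (1 - x) \sinh \omega y}{\omega \sinh \omega}\, dy + \int_x^{1/2} \frac{\sinh \omega x \sinh \omega (1 - y)}{\omega \sinh \omega}\, dy - \int_{1/2}^1 \frac{\sinh \omega x \sinh \omega (1 - y)}{\omega \sinh \omega}\, dy = \frac{1}{\omega^2} - \frac{e^{\omega x} + e^{\omega/2} e^{-\omega x}}{\omega^2 (1 + e^{\omega/2})},$$
while if $x \ge \tfrac12$, then
$$u(x) = \int_0^{1/2} \frac{\sinh \omega (1 - x) \sinh \omega y}{\omega \sinh \omega}\, dy - \int_{1/2}^x \frac{\sinh \omega (1 - x) \sinh \omega y}{\omega \sinh \omega}\, dy - \int_x^1 \frac{\sinh \omega x \sinh \omega (1 - y)}{\omega \sinh \omega}\, dy = -\frac{1}{\omega^2} + \frac{e^{-\omega/2} e^{\omega x} + e^{\omega} e^{-\omega x}}{\omega^2 (1 + e^{\omega/2})}.$$

11.5.4.
(a) $G(x, y) = \begin{cases} \dfrac{\sinh \omega x\, \cosh \omega (1 - y)}{\omega \cosh \omega}, & x < y, \\[2ex] \dfrac{\cosh \omega (1 - x)\, \sinh \omega y}{\omega \cosh \omega}, & x > y. \end{cases}$
(b) If $x \le \tfrac12$, then
$$u(x) = \int_0^x \frac{\cosh \omega (1 - x) \sinh \omega y}{\omega \cosh \omega}\, dy + \int_x^{1/2} \frac{\sinh \omega x \cosh \omega (1 - y)}{\omega \cosh \omega}\, dy - \int_{1/2}^1 \frac{\sinh \omega x \cosh \omega (1 - y)}{\omega \cosh \omega}\, dy$$
$$= \frac{1}{\omega^2} - \frac{(e^{\omega/2} - e^{-\omega/2} + e^{-\omega})\, e^{\omega x} + (e^{\omega} - e^{\omega/2} + e^{-\omega/2})\, e^{-\omega x}}{\omega^2 (e^{\omega} + e^{-\omega})},$$
while if $x \ge \tfrac12$, then
$$u(x) = \int_0^{1/2} \frac{\cosh \omega (1 - x) \sinh \omega y}{\omega \cosh \omega}\, dy - \int_{1/2}^x \frac{\cosh \omega (1 - x) \sinh \omega y}{\omega \cosh \omega}\, dy - \int_x^1 \frac{\sinh \omega x \cosh \omega (1 - y)}{\omega \cosh \omega}\, dy$$
$$= -\frac{1}{\omega^2} + \frac{(e^{-\omega/2} - e^{-\omega} + e^{-3\omega/2})\, e^{\omega x} + (e^{3\omega/2} - e^{\omega} + e^{\omega/2})\, e^{-\omega x}}{\omega^2 (e^{\omega} + e^{-\omega})}.$$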
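The piecewise solution of Exercise 11.5.3 can be verified numerically. The sketch below assumes (this is inferred from the Green's function, not stated in these solutions) that the underlying problem is $-u'' + \omega^2 u = f$, u(0) = u(1) = 0, with forcing f = 1 on [0, ½) and f = −1 on (½, 1]. It checks the boundary values, continuity at the midpoint, and the finite-difference residual away from x = ½. The function names are illustrative.

```python
import math

def u(x, w):
    """Piecewise closed form from Exercise 11.5.3 for frequency parameter w."""
    d = w * w * (1 + math.exp(w / 2))
    if x <= 0.5:
        return 1 / w**2 - (math.exp(w * x) + math.exp(w / 2 - w * x)) / d
    return -1 / w**2 + (math.exp(w * x - w / 2) + math.exp(w - w * x)) / d

def residual(x, w, h=1e-4):
    """Central-difference residual of -u'' + w^2 u - f at a point x != 1/2."""
    u2 = (u(x - h, w) - 2 * u(x, w) + u(x + h, w)) / h**2
    f = 1.0 if x < 0.5 else -1.0
    return -u2 + w * w * u(x, w) - f
```

Because the forcing jumps at x = ½, the residual should only be evaluated at points whose stencil stays on one side of the midpoint.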

11.5.5. $G(x, y) = \begin{cases} \dfrac{\cosh \omega x\, \cosh \omega (1 - y)}{\omega \sinh \omega}, & x < y, \\[2ex] \dfrac{\cosh \omega (1 - x)\, \cosh \omega y}{\omega \sinh \omega}, & x > y. \end{cases}$
If $x \le \tfrac12$, then
$$u(x) = \int_0^x \frac{\cosh \omega (1 - x) \cosh \omega y}{\omega \sinh \omega}\, dy + \int_x^{1/2} \frac{\cosh \omega x \cosh \omega (1 - y)}{\omega \sinh \omega}\, dy - \int_{1/2}^1 \frac{\cosh \omega x \cosh \omega (1 - y)}{\omega \sinh \omega}\, dy = \frac{1}{\omega^2} - \frac{\cosh \omega x}{\omega^2 \cosh \frac12 \omega},$$
while if $x \ge \tfrac12$, then
$$u(x) = \int_0^{1/2} \frac{\cosh \omega (1 - x) \cosh \omega y}{\omega \sinh \omega}\, dy - \int_{1/2}^x \frac{\cosh \omega (1 - x) \cosh \omega y}{\omega \sinh \omega}\, dy - \int_x^1 \frac{\cosh \omega x \cosh \omega (1 - y)}{\omega \sinh \omega}\, dy = -\frac{1}{\omega^2} + \frac{\cosh \omega (1 - x)}{\omega^2 \cosh \frac12 \omega}.$$
This Neumann boundary value problem has a unique solution since ker K = ker L = {0}. Therefore, the Neumann boundary value problem is positive definite, and hence has a Green's function.

♦ 11.5.6. Since $L[u] = (u', u)^T$, clearly ker L = {0}, irrespective of any boundary conditions. Thus, every set of self-adjoint boundary conditions, including the Neumann boundary value problem, leads to a positive definite boundary value problem. The solution u⋆(x) to the homogeneous Neumann boundary value problem minimizes the same functional (11.156) among all u ∈ C²[a, b] satisfying the Neumann boundary conditions u′(a) = u′(b) = 0.

♥ 11.5.7.
(a) The solution is unique provided the homogeneous boundary value problem z″ + λ z = 0, z(0) = z(1) = 0, has only the zero solution z(x) ≡ 0, which occurs whenever $\lambda \ne n^2 \pi^2$ for n = 1, 2, 3, …. If $\lambda = n^2 \pi^2$, then z(x) = c sin nπx for any c ≠ 0 is a non-zero solution to the homogeneous boundary value problem, and so can be added in to any solution to the inhomogeneous system.
(b) For $\lambda = -\omega^2 < 0$, $G(x, y) = \begin{cases} -\dfrac{\sinh \omega (1 - y)\, \sinh \omega x}{\omega \sinh \omega}, & x < y, \\[2ex] -\dfrac{\sinh \omega (1 - x)\, \sinh \omega y}{\omega \sinh \omega}, & x > y; \end{cases}$
for λ = 0, $G(x, y) = \begin{cases} x (y - 1), & x < y, \\ y (x - 1), & x > y; \end{cases}$
for $\lambda = \omega^2 \ne n^2 \pi^2 > 0$, $G(x, y) = \begin{cases} -\dfrac{\sin \omega (1 - y)\, \sin \omega x}{\omega \sin \omega}, & x < y, \\[2ex] -\dfrac{\sin \omega (1 - x)\, \sin \omega y}{\omega \sin \omega}, & x > y. \end{cases}$
(c) The Fredholm alternative requires the forcing function to be orthogonal to the solutions to the homogeneous boundary value problem, and so
$$\langle h , \sin n \pi x \rangle = \int_0^1 h(x) \sin n \pi x\, dx = 0.$$

♦ 11.5.8. The first term is written as $-\dfrac{d}{dx}\left( p(x)\, \dfrac{du}{dx} \right) = D^* \circ D[u]$, where D[u] = u′, and the adjoint is computed with respect to the weighted L² inner product
$$\langle\langle v , \tilde v \rangle\rangle = \int_a^b p(x)\, v(x)\, \tilde v(x)\, dx.$$
The second term is written as $q(x)\, u = I^* \circ I[u]$, where I[u] = u is the identity operator, and the adjoint is computed with respect to the weighted L² inner product
$$\langle\langle w , \tilde w \rangle\rangle = \int_a^b q(x)\, w(x)\, \tilde w(x)\, dx.$$
Thus, by Exercise 7.5.21, the sum can be written in self-adjoint form
$$D^* \circ D[u] + I^* \circ I[u] = L^* \circ L[u], \qquad\text{with}\qquad L[u] = \begin{pmatrix} D[u] \\ I[u] \end{pmatrix} = \begin{pmatrix} u' \\ u \end{pmatrix}$$
taking its values in the Cartesian product space, with inner product exactly given by (11.154).

♥ 11.5.9.
(a) $\mu(x) \bigl[ a(x)\, u'' + b(x)\, u' + c(x)\, u \bigr] = -\bigl[ p(x)\, u' \bigr]' + q(x)\, u = -p(x)\, u'' - p'(x)\, u' + q(x)\, u$ if and only if μ a = −p, μ b = −p′, μ c = q. Thus, (μ a)′ = μ b, and so the formula for the integrating factor is
$$\mu(x) = \exp\left( \int \frac{b(x) - a'(x)}{a(x)}\, dx \right) = \frac{1}{a(x)} \exp\left( \int \frac{b(x)}{a(x)}\, dx \right).$$

(ii) µ(x) = x−4

(iii) µ(x) = − e− x

!

du + e2 x u = e 3 x , dx ! 1 3 1 du + 4u= 4, 2 x dx x x ! d − x du −x xe − e u = 0. yields − dx dx

d dx d yields − dx

(b) (i) µ(x) = e2 x yields −

(c) (i) Minimize P[ u ] =

Z bh

(ii) Minimize P[ u ] =

Z bh

(iii) Minimize P[ u ] =

a

a

1 2

e2 x

e2 x u′ (x)2 +

1 2

u(1) = u(2) = 0;

Zab h

i

e2 x u(x)2 − e3 x u(x) dx subject to

3 u(x)2 u(x) i u′ (x)2 + − 4 dx subject to u(1) = u(2) = 0; 2 4 2x 2x x i −x ′ 2 1 −x 1 u (x) − 2 e u(x)2 dx subject to u(1) = u(2) = 0. 2 xe

11.5.10. Since λ“> 0, the to ” the ordinary differential equation is √ general solution √ u(x) = e− x c1 cos λ x + c2 sin λ x . The first boundary condition implies c1 = 0, while √ the second implies c2 = 0 unless sin 2 λ = 0, and so the desired values are λ = 14 n2 π 2 for any positive integer n. ♦ 11.5.11. The solution is

1 (1 − e− ε ) eε x + (eε − 1) e− ε x . − ε2 ε2 (eε − e− ε ) Moreover, by l’Hˆ opital’s Rule, or using Taylor expansions in ε, u(x, ε) =

eε − eε x + eε (x−1) − e− ε + e− ε x − eε(1−x) = ε2 (eε − e− ε ) ε→0+ ε→0+ which is the solution to the limiting boundary value problem. lim u(x, ε) = lim

♦ 11.5.12. The solution is u(x, ε) = 1 −

1 2

x−

1 − e− 1/ε e1/ε − 1 x/ε e − e− x/ε . 1/ε − 1/ε 1/ε − 1/ε e −e e −e 340

1 2

x2 = u⋆ (x),

To evaluate the limit, we rewrite lim u(x, ε) = 1 − lim

ε→0+

ε→0+

− x/ε

since lim+ e ε→0

1 − e− 1/ε (x−1)/ε 1 − e− 1/ε − x/ε e − e = 1 − e− 2/ε 1 − e− 2/ε

(

1, 0,

0 < x < 1, x = 0 or x = 1,

= 0 for all x > 0. The convergence is non-uniform since the limiting

function is discontinuous. 11.5.13. To prove the first bilinearity condition: hh c v + d w , z ii = =c

Z 1h 0

Z 1h 0

i

p(x) (c v1 (x) + d w1 (x)) z1 (x) + q(x) (c v2 (x) + d w2 (x)) z2 (x) dx i

p(x) v1 (x) z1 (x) + q(x) v2 (x) z2 (x) dx + d

Z 1h 0

i

p(x) w1 (x) z1 (x) + q(x) w2 (x) z2 (x) dx

= c hh v , z ii + d hh w , z ii. The second has a similar proof, or follows from symmetry as in Exercise 3.1.9. To prove symmetry: hh v , w ii = = As for positivity,

Z 1h

0 Z 1h 0

i

p(x) v1 (x) w1 (x) + q(x) v2 (x) w2 (x) dx i

p(x) w1 (x) v1 (x) + q(x) w2 (x) v2 (x) dx = hh w , v ii.

hh v , v ii =

Z 1h 0

i

p(x) v1 (x)2 + q(x) v2 (x)2 dx ≥ 0,

since the integrand is a non-negative function. Moreover, since $p(x), q(x), v_1(x), v_2(x)$ are all continuous, so is $p(x)\, v_1(x)^2 + q(x)\, v_2(x)^2$, and hence ⟨⟨v, v⟩⟩ = 0 if and only if $p(x)\, v_1(x)^2 + q(x)\, v_2(x)^2 \equiv 0$. Since p(x) > 0, q(x) > 0, this implies $v(x) = ( v_1(x), v_2(x) )^T \equiv 0$.

♦ 11.5.14.
(a) $\sinh \alpha \cosh \beta + \cosh \alpha \sinh \beta = \tfrac14 (e^{\alpha} - e^{-\alpha})(e^{\beta} + e^{-\beta}) + \tfrac14 (e^{\alpha} + e^{-\alpha})(e^{\beta} - e^{-\beta}) = \tfrac12 (e^{\alpha + \beta} - e^{-\alpha - \beta}) = \sinh(\alpha + \beta)$.
(b) $\cosh \alpha \cosh \beta + \sinh \alpha \sinh \beta = \tfrac14 (e^{\alpha} + e^{-\alpha})(e^{\beta} + e^{-\beta}) + \tfrac14 (e^{\alpha} - e^{-\alpha})(e^{\beta} - e^{-\beta}) = \tfrac12 (e^{\alpha + \beta} + e^{-\alpha - \beta}) = \cosh(\alpha + \beta)$.

♣ 11.6.1. Exact solution: $u(x) = 2 e^2\, \dfrac{e^x - 1}{e^2 - 1} - x\, e^x$. The graphs compare the exact and finite element approximations to the solution for 6, 11 and 21 nodes: [Three graphs omitted.]
The respective maximal overall errors are .1973, .05577, .01476. Thus, halving the nodal spacing apparently reduces the error by a factor of 4.
♠ 11.6.2. (a) Solution: u(x) =

1 4

x − ρ2 (x − 1) =

8 < 1 4 : 1 4

x, x−

1 2

(x − 1)2 , 341

0 ≤ x ≤ 1, 1 ≤ x ≤ 2;

2

0.3 0.25 0.2 0.15

finite element sample values: 0.1 0.05 ( 0., .06, .12, .18, .24, .3, .32, .3, .24, .14, 0. ); maximal error at sample points .05; maximal overall error: .05.

1

0.5

1.5

2

0.08 0.06

(b) Solution: u(x) =

log x + 1 − x; log 2

0.04

finite element sample

0.02

0.2

0.4

0.6

0.8

1

values: ( 0., .03746, .06297, .07844, .08535, .08489, .07801, .06549, .04796, .02598, 0. ); maximal error at sample points .00007531; maximal overall error: .001659. 1.5

2

2.5

3

-0.05 -0.1

(c) Solution: u(x) =

1 2

x−2+

3 2

−1

x

;

finite element sample

-0.15 -0.2 -0.25

values: ( 0., −.1482, −.2264, −.2604, −.2648, −.2485, −.2170, −.1742, −.1225, −.0640, 0. ); maximal error at sample points .002175; maximal overall error: .01224. 0.4 0.3

(d) Solution: u(x) =

e2 + 1 − 2 e1−x − x; e2 − 1

0.2

finite element

0.1

-1

-0.5

1

0.5

sample values: ( 0., .2178, .3602, .4407, .4706, .4591, .4136, .3404, .2444, .1298, 0. ); maximal error at sample points .003143; maximal overall error: .01120. ♣ 11.6.3. (a) The finite element sample values are c = ( 0, .096, .168, .192, .144, 0 )T .

(b)

0.15

0.15

0.15

0.1

0.1

0.1

0.05

0.05

0.05

0.2

0.4

0.6

0.8

1

0.2

0.4

0.6

0.8

1

0.2

0.4

0.6

0.8

1

(c) (i) The maximal error at the mesh points is 2.7756×10−17 , almost 0! (ii) The maximal error on the interval using the piecewise affine interpolant is .0135046. (d) (i) The maximal error at the mesh points is the same 2.775 × 10−17 ; (ii) the maximal error on the interval using the spline interpolant is 5.5511 × 10−17 , making it, for all practical purposes, identical with the exact solution. ♣ 11.6.4. The only difference is the last basis function, which should changed to 8 > > > > > <

ϕn−1 (x) = > > > > > :

x − xn−2 , xn−1 − xn−2

1, 0,

xn−2 ≤ x ≤ xn−1 , xn−1 ≤ x ≤ b, x ≤ xn−2 ,

in order to satisfy the free boundary condition at xn = b. This only affects the bottom right entry s mn−1,n−1 = n−2 h2 342

of the finite element matrix (11.169) and the last entry bn−1 =

Z xn−1

1 h

xn−2

(x − xn−2 )f (x) dx +

Z xn

xn−1

f (x) dx ≈ h f (xn−1 ) +

1 2

h f (xn ),

of the vector (11.172). For the particular boundary value problem, the exact solution is u(x) = 4 log(x + 1) − x. We graph the finite element approximation and then a comparison with the solution: 0.3

0.3

0.2

0.2

0.1

0.1

0.2

0.4

0.6

1

0.8

0.2

0.4

0.6

1

0.8

They are almost identical; indeed, the maximal error on the interval is .003188. ♣ 11.6.5. (a) u(x) = x +

π ex π e−x + . 1 − e2 πZ 1 − e− 2 π 2πh

(b) Minimize P[ u ] =

0

1 2

u′ (x)2 +

1 2

i

u(x)2 − x u(x) dx over all C2 functions u(x) that

satisfy the boundary conditions u(0) = u(2 π), u′ (0) = u′ (2 π). (c) dim W5 = 4 since any piecewise affine function ϕ(x) that satisfies the conditions is uniquely determined by its 4 interior sample values c1 = ϕ(x4 ), with c0 = ϕ(x0 ) = 12 (c1 + c4 ) then determined so that ϕ(x0 ) ϕ′ (x0 ) = ϕ′ (x5 ). Thus, a basis consists of the following four functions polation values plotted in 1

ϕ1 :

1 2 , 1, 0, 0, 0

1

0.8

0.8

0.6

0.6

ϕ2 : 0, 0, 1, 0, 0

0.4 0.2 1

2

3

4

5

0.4 0.2

6

1

1

2

3

4

5

6

1

2

3

4

5

6

1

0.8

0.8

0.6

ϕ3 : 0, 0, 0, 1, 0

two boundary ϕ(x1 ), . . . , c4 = = ϕ(x5 ) and with listed inter-

0.6

ϕ4 :

0.4 0.2 1

2

(d) n = 5: maximal error .9435

3

4

5

0.4 0.2

6

4

4

3

3

2

2

1

1

1

(e) n = 10: maximal error .6219

1 2 , 0, 0, 0, 1

2

3

4

5

6

4

4

3

3

2

2

1

1

1

2


3

4

5

6

1

2

3

4

5

6

1

2

3

4

5

6

n = 20: maximal error .3792

4

4

3

3

2

2

1

1

1


3

4

5

6 4

3

3

2

2

1

1

1

1 2

Each decrease in the step size by

2

4

2

3

4

5

6

1

2

3

4

5

6

1

2

3

4

5

6

decreases the maximal error by slightly less than

1 2.

♣ 11.6.6.
(c) dim $W_5 = 5$, since a piecewise affine function $\varphi(x)$ that satisfies the two boundary conditions is uniquely determined by its values $c_j = \varphi(x_j)$, $j = 0, \dots, 4$. A basis consists of the 5 functions interpolating the values
\[ \varphi_i(x_j) = \begin{cases} 1, & i = j, \\ 0, & i \ne j, \end{cases} \qquad 0 \le i, j < 5, \]
and $\varphi_i(2\pi) = \varphi_i(x_5) = \varphi_i(x_0) = \varphi_i(0)$. The basis functions are graphed below:

[Figure: graphs of the five basis functions $\varphi_0, \dots, \varphi_4$ on 0 ≤ x ≤ 2π]

(d) n = 5: maximal error 1.8598 [Figure]

(e) n = 10: maximal error .9744 [Figure]

Each decrease in the step size by $\tfrac12$ also decreases the maximal error by about $\tfrac12$. However, the approximations in Exercise 11.6.5 are slightly more accurate.

♣ 11.6.7.
n = 5: maximum error .1117 [Figure]
n = 10: maximum error .02944 [Figure]
n = 20: maximum error .00704 [Figure]

11.6.8.
(a)
\[ L = \begin{pmatrix} 1 & & & & \\ -\tfrac12 & 1 & & & \\ & -\tfrac23 & 1 & & \\ & & -\tfrac34 & 1 & \\ & & & \ddots & \ddots \end{pmatrix}, \qquad D = \frac{1}{h} \begin{pmatrix} 2 & & & & \\ & \tfrac32 & & & \\ & & \tfrac43 & & \\ & & & \tfrac54 & \\ & & & & \ddots \end{pmatrix}; \]
since all pivots are positive, the matrix is positive definite.

(b) By Exercise 8.2.48, the eigenvalues of M are $\lambda_k = 2 - 2 \cos \dfrac{k\pi}{n+1} > 0$ for $k = 1, \dots, n$. Since M is symmetric and its eigenvalues are all positive, the matrix is positive definite.

11.6.9. (a) No. (b) Yes. The Jacobi iteration matrix T, cf. (10.65), is the tridiagonal matrix with 0’s on the main diagonal and $\tfrac12$’s on the sub- and super-diagonals, and hence, according to Exercise 8.2.48, has eigenvalues $\lambda_k = \cos \dfrac{k\pi}{n+1}$, $k = 1, \dots, n$. Thus, its spectral radius, and hence the rate of convergence of the iteration, is $\rho(T) = \cos \dfrac{\pi}{n+1} < 1$, proving convergence.

♣ 11.6.10.
(a) A basis for $W_n$ consists of the $n - 1$ polynomials $\varphi_k(x) = x^k (x - 1) = x^{k+1} - x^k$ for $k = 1, \dots, n - 1$.

(b) The matrix entries are
\[ m_{ij} = \langle\langle L[\varphi_i], L[\varphi_j] \rangle\rangle = \langle\langle \varphi_i', \varphi_j' \rangle\rangle = \int_0^1 \bigl[ (i+1)x^i - i\,x^{i-1} \bigr] \bigl[ (j+1)x^j - j\,x^{j-1} \bigr] (x + 1)\,dx \]
\[ = \frac{4 i^2 j + 4 i j^2 + 4 i j + j^2 + i^2 - i - j}{(i+j-1)(i+j)(i+j+1)(i+j+2)}, \qquad 1 \le i, j \le n - 1, \]
while the right hand side vector has entries
\[ b_i = \langle f, \varphi_i \rangle = \int_0^1 \bigl[ x^{i+1} - x^i \bigr] dx = -\frac{1}{i^2 + 3i + 2}, \qquad i = 1, \dots, n - 1. \]
Solving $M c = b$, the computed solutions $v(x) = \sum_{k=1}^{n-1} c_k \varphi_k(x)$ are almost identical to the exact solution: for n = 5, the maximal error is $2.00 \times 10^{-5}$, for n = 10, the maximal error is $1.55 \times 10^{-9}$, and for n = 20, the maximal error is $1.87 \times 10^{-10}$. Thus, the polynomial finite element method gives much closer approximations, although solving the linear system is (slightly) harder since the coefficient matrix is not sparse.

♣ 11.6.11.
(a) There is a unique solution provided $\lambda \ne -n^2$ for $n = 1, 2, 3, \dots$, namely
\[ u(x) = \begin{cases} \dfrac{x}{\lambda} - \dfrac{\pi \sinh \sqrt{\lambda}\,x}{\lambda \sinh \sqrt{\lambda}\,\pi}, & \lambda > 0, \\[2ex] \tfrac16 \pi^2 x - \tfrac16 x^3, & \lambda = 0, \\[2ex] \dfrac{x}{\lambda} - \dfrac{\pi \sin \sqrt{-\lambda}\,x}{\lambda \sin \sqrt{-\lambda}\,\pi}, & -n^2 \ne \lambda < 0. \end{cases} \]
When $\lambda = -n^2$, the boundary value problem has no solution.

(b) The minimization principle
\[ P[u] = \int_0^{\pi} \Bigl[ \tfrac12 u'(x)^2 + \tfrac12 \lambda\,u(x)^2 - x\,u(x) \Bigr] dx \]
over all $C^2$ functions $u(x)$ that satisfy the boundary conditions $u(0) = 0$, $u(\pi) = 0$, is only valid for $\lambda > 0$. The minimizer is unique. Otherwise, the functional has no minimum.

(c) Let $h = \pi/n$. The finite element equations are $M c = b$ where M is the $(n-1) \times (n-1)$ tridiagonal matrix whose diagonal entries are $\dfrac{2}{h} + \dfrac{2}{3} h \lambda$ and whose sub- and super-diagonal entries are $-\dfrac{1}{h} + \dfrac{1}{6} h \lambda$.

(d) According to Exercise 8.2.48, the eigenvalues of M are
\[ \frac{2}{h} + \frac{2}{3} h \lambda + \left( -\frac{2}{h} + \frac{1}{3} h \lambda \right) \cos k h, \qquad k = 1, \dots, n - 1. \]
The finite element system has a unique solution if and only if the matrix is not singular. The matrix is singular if and only if 0 is an eigenvalue, which occurs when $\lambda = \dfrac{6}{h^2}\,\dfrac{\cos k h - 1}{\cos k h + 2} \approx -k^2$, provided $k h \ll 1$, and so the eigenvalues of the finite element matrix converge to the eigenvalues of the boundary value problem. Interestingly, the finite element solution converges to the actual solution, provided one exists, even when the boundary value problem is not positive definite.

(e–f) Here are some sample plots comparing the finite element approximant with the actual solution. First, for $\lambda > 0$, the boundary value problem is positive definite, and we expect convergence to the unique solution.

λ = 1, n = 5, maximal error .1062: [Figure]
λ = 1, n = 10, maximal error .0318: [Figure]
λ = 100, n = 10, maximal error .00913: [Figure]
λ = 100, n = 30, maximal error .00237: [Figure]

Then, for negative λ not near an eigenvalue the convergence rate is similar:

λ = −.5, n = 5, maximal error .2958: [Figure]
λ = −.5, n = 10, maximal error .04757: [Figure]

Then, for λ very near an eigenvalue (in this case −1), convergence is much slower, but still occurs. Note that the solution is very large:

λ = −.99, n = 10, maximal error 76.0308: [Figure]
λ = −.99, n = 50, maximal error 5.333: [Figure]
λ = −.99, n = 50, maximal error .4124: [Figure]

On the other hand, when λ is an eigenvalue, the finite element solutions don’t converge. Note the scales on the two graphs:

λ = −1, n = 10, maximum value 244.3 [Figure]
λ = −1, n = 50, maximum value 6080.4 [Figure]

The final example shows convergence even for large negative λ. The convergence is slow because −50 is near an eigenvalue of −49:

λ = −50, n = 10, maximal error .3804: [Figure]
λ = −50, n = 50, maximal error 1.1292: [Figure]
λ = −50, n = 200, maximal error .0153: [Figure]
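The critical values of λ in 11.6.11 (d) can be checked numerically. What follows is a minimal Python sketch under stated assumptions, not part of the manual: the function names are hypothetical, and the matrix is assembled from the entries quoted in part (c). It evaluates the critical value $\lambda = 6(\cos kh - 1)/(h^2(\cos kh + 2))$ and confirms that the sampled sine mode $v_j = \sin(k j h)$ is then a null vector of M.

```python
import math


def fem_matrix(lam, n):
    """Tridiagonal finite element matrix of 11.6.11 (c) on [0, pi], h = pi/n."""
    h = math.pi / n
    diag = 2.0 / h + 2.0 * h * lam / 3.0   # diagonal entries
    off = -1.0 / h + h * lam / 6.0         # sub- and super-diagonal entries
    size = n - 1
    M = [[0.0] * size for _ in range(size)]
    for i in range(size):
        M[i][i] = diag
        if i + 1 < size:
            M[i][i + 1] = off
            M[i + 1][i] = off
    return M


def critical_lambda(k, n):
    """Value of lambda for which the k-th eigenvalue of M vanishes."""
    h = math.pi / n
    c = math.cos(k * h)
    return 6.0 * (c - 1.0) / (h * h * (c + 2.0))


def residual_of_sine_mode(k, n):
    """Largest entry of |M v| for v_j = sin(k j h) at the critical lambda."""
    h = math.pi / n
    M = fem_matrix(critical_lambda(k, n), n)
    v = [math.sin(k * (j + 1) * h) for j in range(n - 1)]
    return max(abs(sum(M[i][j] * v[j] for j in range(n - 1)))
               for i in range(n - 1))


if __name__ == "__main__":
    for n in (10, 50, 200):
        print(n, critical_lambda(1, n), residual_of_sine_mode(1, n))
```

For n = 10 and k = 1 the critical value is about −1.008 rather than −1, which is consistent with the finite element system for λ = −1, n = 10 being nearly, but not exactly, singular.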

♥ 11.6.12.
(a) We define $f(x) = u_i + \dfrac{u_{i+1} - u_i}{x_{i+1} - x_i}\,(x - x_i)$ for $x_i \le x \le x_{i+1}$.

(b) Clearly, since each hat function is piecewise affine, any linear combination is also piecewise affine. Since $\varphi_i(x_j) = \begin{cases} 1, & i = j, \\ 0, & i \ne j, \end{cases}$ we have $u(x_j) = \sum_{i=0}^{n} u_i\,\varphi_i(x_j) = u_j$, and so $u(x)$ has the correct interpolation values.

(c) $u(x) = 2\,\varphi_0(x) + 3\,\varphi_1(x) + 6\,\varphi_2(x) + 11\,\varphi_3(x) = \begin{cases} x + 2, & 0 \le x \le 1, \\ 3x, & 1 \le x \le 2, \\ 5x - 4, & 2 \le x \le 3. \end{cases}$

[Figure: graph of u(x) on 0 ≤ x ≤ 3]
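The interpolation property in 11.6.12 (b) and the piecewise formula in (c) can be confirmed with a short Python sketch. This is illustrative only: the helper names `hat`, `interpolant` and `closed_form` are hypothetical, and the nodes 0, 1, 2, 3 with values 2, 3, 6, 11 are read off from part (c).

```python
def hat(j, nodes, x):
    """Piecewise affine hat function: 1 at nodes[j], 0 at every other node."""
    xj = nodes[j]
    if j > 0 and nodes[j - 1] <= x <= xj:
        return (x - nodes[j - 1]) / (xj - nodes[j - 1])
    if j < len(nodes) - 1 and xj <= x <= nodes[j + 1]:
        return (nodes[j + 1] - x) / (nodes[j + 1] - xj)
    return 0.0


def interpolant(values, nodes, x):
    """Linear combination u(x) = sum of u_j * phi_j(x)."""
    return sum(u * hat(j, nodes, x) for j, u in enumerate(values))


def closed_form(x):
    """Piecewise formula quoted in 11.6.12 (c), valid for 0 <= x <= 3."""
    if x <= 1:
        return x + 2
    if x <= 2:
        return 3 * x
    return 5 * x - 4


nodes = [0.0, 1.0, 2.0, 3.0]
values = [2.0, 3.0, 6.0, 11.0]

if __name__ == "__main__":
    for i in range(13):
        x = 0.25 * i
        print(x, interpolant(values, nodes, x), closed_form(x))
```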

♦ 11.6.13.
(a) If $\alpha(x)$ is any continuously differentiable function at $x_j$, then $\alpha'(x_j^+) - \alpha'(x_j^-) = 0$. Now, each term in the sum is continuously differentiable at $x = x_j$ except for $\alpha_j(x) = c_j\,|x - x_j|$, and so $f'(x_j^+) - f'(x_j^-) = \alpha_j'(x_j^+) - \alpha_j'(x_j^-) = 2 c_j$, proving the formula. Further,
\[ a = f'(x_0^-) + \sum_{i=1}^{n} c_i, \qquad b = f(x_0) - a\,x_0 - \sum_{i=1}^{n} c_i\,|x_0 - x_i|. \]

(b) $\varphi_j(x) = \dfrac{1}{2h}\,|x - x_{j-1}| - \dfrac{1}{h}\,|x - x_j| + \dfrac{1}{2h}\,|x - x_{j+1}|$.

If $u'(x) = 0$ for $x < 0$ or $x > 3$, then
\[ u(x) = \tfrac{13}{2} + \tfrac12 |x| + |x - 1| + |x - 2| - \tfrac52 |x - 3|. \]
More generally, for any constants c, d, we can set
\[ u(x) = (3 - c + d)\,x - (3d + 1) + c\,|x| + |x - 1| + |x - 2| + d\,|x - 3|. \]
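As a sanity check on 11.6.13, the absolute value formulas can be evaluated directly. The sketch below is an illustration under stated assumptions, not the manual's method: the function names are hypothetical, and the target values 2, 3, 6, 11 at x = 0, 1, 2, 3 are taken from Exercise 11.6.12 (c). It checks that the general formula interpolates those values for any c and d, that the choice c = 1/2, d = −5/2 is constant outside [0, 3], and that the absolute value form of the hat function equals 1 at its own node and 0 at the other nodes.

```python
def u_abs(x, c=0.5, d=-2.5):
    """General absolute value representation from 11.6.13."""
    return ((3 - c + d) * x - (3 * d + 1)
            + c * abs(x) + abs(x - 1) + abs(x - 2) + d * abs(x - 3))


def hat_abs(x, xj, h):
    """Hat function centered at xj with mesh size h, written with absolute values."""
    return (abs(x - (xj - h)) / (2 * h)
            - abs(x - xj) / h
            + abs(x - (xj + h)) / (2 * h))


if __name__ == "__main__":
    for x in (-2.0, 0.0, 1.0, 2.0, 3.0, 5.0):
        print(x, u_abs(x))
```

With the default constants the printed values are 2 for x ≤ 0 and 11 for x ≥ 3, so the function is constant outside the interval.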

Applied Linear Algebra by Peter J. Olver and Chehrzad Shakiban

Corrections to Instructor’s Solution Manual Last updated: April 28, 2012

1.2.4 (d) $A = \begin{pmatrix} 2 & -1 & 2 \\ -1 & -1 & 3 \\ 3 & 0 & -2 \end{pmatrix}$, $x = \begin{pmatrix} u \\ v \\ w \end{pmatrix}$, $b = \begin{pmatrix} 2 \\ 1 \\ 1 \end{pmatrix}$;

(e) $A = \begin{pmatrix} 5 & 3 & -1 \\ 3 & 2 & -1 \\ 1 & 1 & 1 \end{pmatrix}$,

(f) $b = \begin{pmatrix} -3 \\ -5 \\ 2 \\ 1 \end{pmatrix}$.

1.4.15 (a) $\begin{pmatrix} 0 & 0 & 1 \\ 0 & 1 & 0 \\ 1 & 0 & 0 \end{pmatrix}$.

1.8.4 (i) $a \ne b$ and $b \ne 0$; (ii) $a = b \ne 0$, or $a = -2$, $b = 0$; (iii) $a \ne -2$, $b = 0$.

1.8.23 (e) $(0, 0, 0)^T$;

2.2.28 (a) By induction, we can show that, for $n \ge 1$ and $x > 0$,
\[ f^{(n)}(x) = \frac{Q_{n-1}(x)}{x^{2n}}\,e^{-1/x^2}, \]
where $Q_{n-1}(x)$ is a polynomial of degree $n - 1$. Thus,
\[ \lim_{x \to 0^+} f^{(n)}(x) = \lim_{x \to 0^+} \frac{Q_{n-1}(x)}{x^{2n}}\,e^{-1/x^2} = Q_{n-1}(0) \lim_{y \to \infty} y^{2n} e^{-y} = 0 = \lim_{x \to 0^-} f^{(n)}(x), \]
because the exponential $e^{-y}$ goes to zero faster than any power of y goes to ∞.

2.5.31 (d) . . . while coker U has basis $(0, 0, 0, 1)^T$.

4/28/12

1

c 2012

Peter J. Olver

2.5.32 (b) Yes, the preceding example can be put into row echelon form by the following elementary row operations of type #1:
\[ \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix} \xrightarrow{\;R_1 \mapsto R_1 + R_2\;} \begin{pmatrix} 1 & 0 \\ 1 & 0 \end{pmatrix} \xrightarrow{\;R_2 \mapsto R_2 - R_1\;} \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}. \]
Indeed, Exercise 1.4.18 shows how to interchange any two rows, modulo multiplying one by an inessential minus sign, using only elementary row operations of type #1. As a consequence, one can reduce any matrix to row echelon form by only elementary row operations of type #1! (The answer in the manual implicitly assumes that the row operations had to be done in the standard order. But this is not stated in the exercise as written.)

2.5.42 True. If ker A = ker B ⊂ $\mathbb{R}^n$, then both matrices have n columns, and so n − rank A = dim ker A = dim ker B = n − rank B.

3.1.6 (b) . . . plane has length . . .

3.1.10 (c) If v is any element of V, then we can write $v = c_1 v_1 + \cdots + c_n v_n$ as a linear combination of the basis elements, and so, by bilinearity,
\[ \langle x - y, v \rangle = c_1 \langle x - y, v_1 \rangle + \cdots + c_n \langle x - y, v_n \rangle = c_1 \bigl( \langle x, v_1 \rangle - \langle y, v_1 \rangle \bigr) + \cdots + c_n \bigl( \langle x, v_n \rangle - \langle y, v_n \rangle \bigr) = 0. \]
Since this holds for all v ∈ V, the result in part (a) implies x = y.

3.2.11 (b) Missing square on $\langle v, w \rangle$ in formula:
\[ \sin^2 \theta = 1 - \cos^2 \theta = \frac{\| v \|^2 \| w \|^2 - \langle v, w \rangle^2}{\| v \|^2 \| w \|^2} = \frac{(v \times w)^2}{\| v \|^2 \| w \|^2}. \]

3.4.22 (ii) and (v) Change “null vectors” to “null directions”.

3.4.33 (a) $L = (A^T)^T A^T$ is . . .

3.5.3 (b) $\begin{pmatrix} 1 & 1 & 0 \\ 1 & 3 & 1 \\ 0 & 1 & 1 \end{pmatrix} = \begin{pmatrix} 1 & 0 & 0 \\ 1 & 1 & 0 \\ 0 & \tfrac12 & 1 \end{pmatrix} \begin{pmatrix} 1 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & \tfrac12 \end{pmatrix} \begin{pmatrix} 1 & 1 & 0 \\ 0 & 1 & \tfrac12 \\ 0 & 0 & 1 \end{pmatrix}$.

3.6.33 Change all w’s to v’s.

4.2.4 (c) When | b | ≥ 2, the minimum is . . .

4.4.23 Delete “ (c)”. (Just the label, not the formula coming afterwards.)

4.4.27 (a) Change “the interpolating polynomial” to “an interpolating polynomial”.

4.4.52 The solution given in the manual is for the square $S = \{\, 0 \le x \le 1,\ 0 \le y \le 1 \,\}$. When $S = \{\, -1 \le x \le 1,\ -1 \le y \le 1 \,\}$, use the following:
Solution: (a) $z = \tfrac23$, (b) $z = \tfrac35 (x - y)$, (c) $z = 0$.

5.1.14 One way to solve this is by direct computation. A more sophisticated approach is to apply the Cholesky factorization (3.70) to the inner product matrix: $K = M M^T$. Then, $\langle v, w \rangle = v^T K w = \hat v^T \hat w$ where $\hat v = M^T v$, $\hat w = M^T w$. Therefore, $v_1, v_2$ form an orthonormal basis relative to $\langle v, w \rangle = v^T K w$ if and only if $\hat v_1 = M^T v_1$, $\hat v_2 = M^T v_2$ form an orthonormal basis for the dot product, and hence are of the form determined in Exercise 5.1.11. Using this we find:
(a) $M = \begin{pmatrix} 1 & 0 \\ 0 & \sqrt2 \end{pmatrix}$, so $v_1 = \begin{pmatrix} \cos\theta \\ \tfrac{1}{\sqrt2} \sin\theta \end{pmatrix}$, $v_2 = \pm \begin{pmatrix} -\sin\theta \\ \tfrac{1}{\sqrt2} \cos\theta \end{pmatrix}$, for any $0 \le \theta < 2\pi$.
(b) $M = \begin{pmatrix} 1 & 0 \\ -1 & 1 \end{pmatrix}$, so $v_1 = \begin{pmatrix} \cos\theta + \sin\theta \\ \sin\theta \end{pmatrix}$, $v_2 = \pm \begin{pmatrix} \cos\theta - \sin\theta \\ \cos\theta \end{pmatrix}$, for any $0 \le \theta < 2\pi$.

5.4.15 $p_0(x) = 1$, $p_1(x) = x$, $p_2(x) = x^2 - \tfrac13$, $p_3(x) = x^3 - \tfrac{9}{10} x$. (The solution given is for the interval [0, 1], not [−1, 1].)

5.5.6 (ii) (c) $\begin{pmatrix} \tfrac{23}{43} \\ \tfrac{19}{43} \\ -\tfrac{1}{43} \end{pmatrix} \approx \begin{pmatrix} .5349 \\ .4419 \\ -.0233 \end{pmatrix}$.

5.5.6 (ii) (d) $\begin{pmatrix} \tfrac{614}{26883} \\ -\tfrac{163}{927} \\ \tfrac{1876}{8961} \end{pmatrix} \approx \begin{pmatrix} .0228 \\ -.1758 \\ .2094 \end{pmatrix}$.

5.6.20 (c) The solution corresponds to the revised exercise for the system
$x_1 + 2 x_2 + 3 x_3 = b_1$, $x_2 + 2 x_3 = b_2$, $3 x_1 + 5 x_2 + 7 x_3 = b_3$, $-2 x_1 + x_2 + 4 x_3 = b_4$.
For the given system, the cokernel basis is $(-3, 1, 1, 0)^T$, and the compatibility condition is $-3 b_1 + b_2 + b_3 = 0$.

5.7.2 (a,b,c) To avoid any confusion, delete the superfluous last sample value in the first equation:
(a) (i) $f_0 = 2$, $f_1 = -1$, $f_2 = -1$. (ii) $e^{-ix} + e^{ix} = 2 \cos x$;
(b) (i) $f_0 = 1$, $f_1 = 1 - \sqrt5$, $f_2 = 1 + \sqrt5$, $f_3 = 1 + \sqrt5$, $f_4 = 1 - \sqrt5$; (ii) $e^{-2ix} - e^{-ix} + 1 - e^{ix} + e^{2ix} = 1 - 2 \cos x + 2 \cos 2x$;
(c) (i) $f_0 = 6$, $f_1 = 2 + 2 e^{2\pi i/5} + 2 e^{-4\pi i/5} = 1 + .7265\,i$, $f_2 = 2 + 2 e^{2\pi i/5} + 2 e^{4\pi i/5} = 1 + 3.0777\,i$, $f_3 = 2 + 2 e^{-2\pi i/5} + 2 e^{-4\pi i/5} = 1 - 3.0777\,i$, $f_4 = 2 + 2 e^{-2\pi i/5} + 2 e^{4\pi i/5} = 1 - .7265\,i$; (ii) $2 e^{-2ix} + 2 + 2 e^{ix} = 2 + 2 \cos x + 2 i \sin x + 2 \cos 2x - 2 i \sin 2x$;
(d) (i) $f_0 = f_1 = f_2 = f_4 = f_5 = 0$, $f_3 = 6$; (ii) $1 - e^{ix} + e^{2ix} - e^{3ix} + e^{4ix} - e^{5ix} = 1 - \cos x + \cos 2x - \cos 3x + \cos 4x - \cos 5x + i\,(-\sin x + \sin 2x - \sin 3x + \sin 4x - \sin 5x)$.

6.2.1 (b) The solution given in the manual corresponds to the revised exercise with

0 0 1 0 incidence matrix  0 −1 1 0 6.3.5 (b)

3 2

1 0 1 −1

 −1 −1  . For the given matrix, the solution is 0 0

$\tfrac32 u_1 - \tfrac12 v_1 - u_2 = f_1$, $\quad -\tfrac12 u_1 + \tfrac32 v_1 = g_1$,
$-u_1 + \tfrac32 u_2 + \tfrac12 v_2 = f_2$, $\quad \tfrac12 u_2 + \tfrac32 v_2 = g_2$.

7.4.13 (ii) (b) $v(t) = c_1 e^{2t} + c_2 e^{-t/2}$

7.4.19 Set d = c in the written solution.

7.5.8 (d) Note: The solution is correct provided, for L: V → V, one uses the same inner product on the domain and target copies of V. If different inner products are used, then the identity map is not self-adjoint, $I^* \ne I$, and so, in general, $(L^{-1})^* \ne (L^*)^{-1}$.

8.3.21 (a) $\begin{pmatrix} \tfrac53 & \tfrac43 \\ \tfrac83 & \tfrac13 \end{pmatrix}$,

8.5.26 Interchange solutions (b) and (c).

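10.5.12 The solution given in the manual is for $b = (-2, -1, 7)^T$. When $b = (4, 0, 4)^T$, use the following:
Solution:
(a) $x = \begin{pmatrix} \tfrac{88}{69} \\ \tfrac{12}{23} \\ \tfrac{56}{69} \end{pmatrix} = \begin{pmatrix} 1.27536 \\ .52174 \\ .81159 \end{pmatrix}$;
(b) $x^{(1)} = \begin{pmatrix} 1 \\ 0 \\ 1 \end{pmatrix}$, $x^{(2)} = \begin{pmatrix} 1.50 \\ .50 \\ .75 \end{pmatrix}$, $x^{(3)} = \begin{pmatrix} 1.2500 \\ .5625 \\ .7500 \end{pmatrix}$, with error $e^{(3)} = \begin{pmatrix} -.02536 \\ .04076 \\ -.06159 \end{pmatrix}$;
(c) $x^{(k+1)} = \begin{pmatrix} 0 & -\tfrac14 & \tfrac12 \\ \tfrac14 & 0 & \tfrac14 \\ -\tfrac14 & \tfrac14 & 0 \end{pmatrix} x^{(k)} + \begin{pmatrix} 1 \\ 0 \\ 1 \end{pmatrix}$;
(d) $x^{(1)} = \begin{pmatrix} 1.0000 \\ .2500 \\ .8125 \end{pmatrix}$, $x^{(2)} = \begin{pmatrix} 1.34375 \\ .53906 \\ .79883 \end{pmatrix}$, $x^{(3)} = \begin{pmatrix} 1.26465 \\ .51587 \\ .81281 \end{pmatrix}$; the error at the third iteration is $e^{(3)} = \begin{pmatrix} -.01071 \\ -.00587 \\ .001211 \end{pmatrix}$; the Gauss–Seidel approximation is more accurate.
(e) $x^{(k+1)} = \begin{pmatrix} 0 & -\tfrac14 & \tfrac12 \\ 0 & -\tfrac{1}{16} & \tfrac38 \\ 0 & \tfrac{3}{64} & -\tfrac{1}{32} \end{pmatrix} x^{(k)} + \begin{pmatrix} 1 \\ \tfrac14 \\ \tfrac{13}{16} \end{pmatrix}$;
(f) $\rho(T_J) = \dfrac{\sqrt3}{4} = .433013$,
(g) $\rho(T_{GS}) = \dfrac{3 + \sqrt{73}}{64} = .180375$, so Gauss–Seidel converges about $\log \rho_{GS} / \log \rho_J = 2.046$ times as fast.
(h) Approximately $\log(.5 \times 10^{-6}) / \log \rho_{GS} \approx 8.5$ iterations.
(i) Under Gauss–Seidel, $x^{(9)} = \begin{pmatrix} 1.27536 \\ .52174 \\ .81159 \end{pmatrix}$, with error $e^{(9)} = 10^{-6} \begin{pmatrix} -.3869 \\ -.1719 \\ .0536 \end{pmatrix}$.

11.1.11 Change upper integration limit in the formula for a: $a = \dfrac{1}{2\pi} \displaystyle\int_0^{2\pi} \int_0^{y} f(z)\,dz\,dy$.

11.2.2 (f) $\varphi(x) = \tfrac12\,\delta(x - 1) - \tfrac15\,\delta(x - 2)$; $\displaystyle\int_a^b \varphi(x)\,u(x)\,dx = \tfrac12\,u(1) - \tfrac15\,u(2)$ for $a < 1 < 2 < b$.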
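The numbers quoted for 10.5.12 can be reproduced with a few lines of Python. This is a hedged sketch rather than the manual's own computation: the coefficient matrix A below is an assumption, inferred from the Jacobi iteration matrix in part (c) by taking all diagonal entries equal to 4, and the function names are hypothetical. The starting vector is taken to be zero, which matches $x^{(1)} = (1, 0, 1)^T$.

```python
import math

# Assumed system: inferred from the Jacobi matrix in part (c), diagonal entries 4.
A = [[4.0, 1.0, -2.0],
     [-1.0, 4.0, -1.0],
     [1.0, -1.0, 4.0]]
b = [4.0, 0.0, 4.0]
exact = [88 / 69, 12 / 23, 56 / 69]


def jacobi_step(A, b, x):
    """One Jacobi sweep: every component uses only the previous iterate."""
    n = len(b)
    return [(b[i] - sum(A[i][j] * x[j] for j in range(n) if j != i)) / A[i][i]
            for i in range(n)]


def gauss_seidel_step(A, b, x):
    """One Gauss-Seidel sweep: updated components are used immediately."""
    x = list(x)
    n = len(b)
    for i in range(n):
        x[i] = (b[i] - sum(A[i][j] * x[j] for j in range(n) if j != i)) / A[i][i]
    return x


def iterate(step, A, b, k):
    """Apply k sweeps of the given method starting from the zero vector."""
    x = [0.0] * len(b)
    for _ in range(k):
        x = step(A, b, x)
    return x


# Spectral radii quoted in parts (f) and (g).
rho_j = math.sqrt(3) / 4
rho_gs = (3 + math.sqrt(73)) / 64
speedup = math.log(rho_gs) / math.log(rho_j)

if __name__ == "__main__":
    print(iterate(jacobi_step, A, b, 3))
    print(iterate(gauss_seidel_step, A, b, 3))
    print(rho_j, rho_gs, speedup)
```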

11.2.8 (d) $f'(x) = 4\,\delta(x + 2) + 4\,\delta(x - 2) + \begin{cases} 1, & |x| > 2, \\ -1, & |x| < 2, \end{cases} = 4\,\delta(x + 2) + 4\,\delta(x - 2) + 1 - 2\,\sigma(x + 2) + 2\,\sigma(x - 2)$,
$f''(x) = 4\,\delta'(x + 2) + 4\,\delta'(x - 2) - 2\,\delta(x + 2) + 2\,\delta(x - 2)$.

11.2.13 (a) The symbol on the vertical axis should be $n$, not $\tfrac12 n$.

11.2.15 Replace $-\sigma_y(x)$ by $\sigma_y(x) - 1$.

11.2.17 In displayed equation, delete extra parenthesis at end of $\delta_\xi^{(k)}(x)$ (twice); replace $u$ by $u'$ in penultimate term: $-\langle \delta_\xi^{(k)}, u' \rangle$.

11.2.19 Delete one repetition of “On the other hand”.

11.2.31 (a)
\[ u_n(x) = \begin{cases} x\,(1 - y), & 0 \le x \le y - \tfrac1n, \\[1ex] -\tfrac14 n\,x^2 + \bigl( \tfrac12 n - 1 \bigr) x y - \tfrac14 n\,y^2 + \tfrac12 y + \tfrac12 x - \tfrac{1}{4n}, & |x - y| \le \tfrac1n, \\[1ex] y\,(1 - x), & y + \tfrac1n \le x \le 1. \end{cases} \]
(b) Since $u_n(x) = G(x, y)$ for all $|x - y| \ge \tfrac1n$, we have $\lim_{n \to \infty} u_n(x) = G(x, y)$ for all $x \ne y$, while
\[ \lim_{n \to \infty} u_n(y) = \lim_{n \to \infty} \Bigl( y - y^2 - \frac{1}{4n} \Bigr) = y - y^2 = G(y, y). \]
(Or one can appeal to continuity to infer this.) This limit reflects the fact that the external forces converge to the delta function: $\lim_{n \to \infty} f_n(x) = \delta(x - y)$.
(c) [Figure: graph on 0 ≤ x ≤ 1]

11.2.35 End of first line: $\dfrac{\partial I}{\partial x} + \dfrac{\partial I}{\partial z} \dfrac{d\alpha}{dx} + \dfrac{\partial I}{\partial w} \dfrac{d\beta}{dx}$
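The piecewise formula in 11.2.31 (a) and the limit in (b) can be spot-checked numerically. The following is a small Python sketch under stated assumptions, not taken from the manual: the function names are hypothetical, `green` encodes $G(x, y) = x(1 - y)$ for $x \le y$ and $y(1 - x)$ for $x \ge y$ as used in part (b), and the impulse point y is assumed to satisfy $1/n < y < 1 - 1/n$.

```python
def green(x, y):
    """Green's function G(x, y) on [0, 1] referred to in 11.2.31 (b)."""
    return x * (1 - y) if x <= y else y * (1 - x)


def u_n(x, y, n):
    """Piecewise solution from 11.2.31 (a); assumes 1/n < y < 1 - 1/n."""
    if x <= y - 1.0 / n:
        return x * (1 - y)
    if x >= y + 1.0 / n:
        return y * (1 - x)
    return (-0.25 * n * x * x + (0.5 * n - 1) * x * y - 0.25 * n * y * y
            + 0.5 * y + 0.5 * x - 0.25 / n)


if __name__ == "__main__":
    y = 0.3
    for n in (10, 100, 1000):
        # u_n(y) should equal y - y^2 - 1/(4n) and approach G(y, y) = y - y^2.
        print(n, u_n(y, y, n), green(y, y))
```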

11.3.3 (c) (i) $u_\star(x) = \tfrac12 x^2 - \tfrac52 + x^{-1}$,
(ii) $P[u] = \displaystyle\int_1^2 \Bigl[ \tfrac12 x^2 (u')^2 + 3 x^2 u \Bigr] dx$, $\quad u'(1) = u(2) = 0$,
(iii) $P[u_\star] = -\tfrac{37}{20} = -1.85$,
(iv) $P[x^2 - 2x] = -\tfrac{11}{6} = -1.83333$, $P\bigl[ -\sin \tfrac12 \pi x \bigr] = -1.84534$.
(d) (iv) $P\bigl[ -\tfrac14 (x + 1)(x + 2) \bigr] = -.0536458$, $P\bigl[ \tfrac{3}{20}\,x (x + 1)(x + 2) \bigr] = -.0457634$.

11.3.5 (c) Unique minimizer: $u_\star(x) = \dfrac{x + 1}{2} - \dfrac12 \log \dfrac{(1 + \sin x) \cos 1}{(1 - \sin 1) \cos x}$.
(The given solution is for the boundary conditions u(−1) = u(1) = 0.)

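11.3.21 $u(x) = \dfrac{x^3 - 1}{9} + \dfrac{2 \log x}{9 \log 2}$

11.5.3 Change sign of last expression: $= -\dfrac{e^{-\omega/2} e^{\omega x} + e^{\omega} e^{-\omega x}}{\omega^2 (1 + e^{\omega/2})} + \dfrac{1}{\omega^2}$.

11.5.7 (b) For $\lambda = -\omega^2 < 0$,
\[ G(x, y) = \begin{cases} \dfrac{\sinh \omega (y - 1)\,\sinh \omega x}{\omega \sinh \omega}, & x < y, \\[2ex] \dfrac{\sinh \omega (x - 1)\,\sinh \omega y}{\omega \sinh \omega}, & x > y; \end{cases} \]
for $\lambda = 0$,
\[ G(x, y) = \begin{cases} x (y - 1), & x < y, \\ y (x - 1), & x > y; \end{cases} \]
for $\lambda = \omega^2 \ne n^2 \pi^2 > 0$,
\[ G(x, y) = \begin{cases} \dfrac{\sin \omega (y - 1)\,\sin \omega x}{\omega \sin \omega}, & x < y, \\[2ex] \dfrac{\sin \omega (x - 1)\,\sin \omega y}{\omega \sin \omega}, & x > y. \end{cases} \]

11.5.9 (c) (i, ii, iii) Replace $\displaystyle\int_a^b$ by $\displaystyle\int_1^2$.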

Acknowledgments: We would particularly like to thank David Hiebeler, University of Maine, for alerting us to many of these corrections.
