PERSPECTIVES IN MATHEMATICS DAVID E. PENNEY, The University of Georgia
W. A. BENJAMIN, INC.
Menlo Park, California
Copyright © 1972 by W. A. Benjamin, Inc. Philippines copyright 1972 by W. A. Benjamin, Inc. All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording, or otherwise, without the prior written permission of the publisher. Printed in the United States of America. Published simultaneously in Canada. Library of Congress Catalog Card No. 70-166540.
For B. J. Pettis, Harry H. Corson, A. C. Woods, Paul S. Mostert, G. O. Sabidussi, Paul F. Conrad, Fred B. Wright, A. D. Wallace, A. H. Clifford, G. S. Young, Jr., Frank D. Quigley, and especially L. B. Treybig
PREFACE
This book was planned primarily as a text for a one year course in mathematics for students not intending to take calculus. However, it could be used as a sourcebook by teachers and mathematicians or as outside reading by undergraduate mathematics majors. The usual topics from collegiate precalculus courses have specifically been excluded, as has calculus itself.

Each chapter contains a rather detailed study of a topic chosen from one of the major branches of modern mathematics. Although some of this material is available elsewhere, it has not been gathered before into a single volume written for the reader who does not have an extensive background in mathematics. I quite frankly admit to choosing topics I found particularly interesting, and I hope that the reader will be pleased with some of these choices.

The problems at the end of each chapter are in many cases meant to open up avenues of deeper exploration of the subject of the chapter. The chapters themselves are almost wholly independent of one another, thus enabling the student to learn about those topics most appropriate for him, in any order, and in the necessary degree of mathematical rigor. This should provide the flexibility desirable for so diverse an audience; even undergraduate mathematics majors are not commonly exposed to the majority of the material included, and I hope that they too might enjoy this book.

Although the formal prerequisites for one who wishes to profit from this book are minimal (some high-school algebra, a little geometry, and aptitude to do college-level work), it is not an easy book. It contains only incidental references to the history of mathematics. It should acquaint the reader with the various branches of modern mathematics, and each student is expected to find out what mathematics is by doing mathematics: guessing patterns, making conjectures, and (most important) proving theorems.
Most technical terms are defined upon the occasion of their first appearance, and in this case the term appears in boldface. The statements of theorems appear in italics. The reader is advised to have paper and pencil always available, and should not hesitate to consult the index. Generally, the last problems of each chapter are the most difficult, but there are numerous exceptions; the difficult problems have not been so indicated in order to increase the similarity to mathematical research.

I wish to thank for their patience those undergraduates who were exposed to this book in its preliminary form, the Department of Mathematics of the University of Georgia for making it possible for me to write this book, and my former teachers of mathematics, to whom it is dedicated. Special thanks are due Professor Gail S. Young of the University of Rochester and Professor Frank D. Quigley of Tulane University for their help with the second chapter, Professor F. A. Roach of the University of Houston for his help with the third, Professor H. S. M. Coxeter of the University of Toronto for his help with Chapter 5, and Professor D. B. Hinton of the University of Tennessee for his help with Chapter 8. Finally, I owe a great deal to my wife Carol, who read every word, suggested innumerable improvements, and invented a number of the better problems.

Athens, Georgia
November 1971
D.E.P.
CONTENTS

Chapter 1 The Bolyai-Gerwin Theorem
1.1 The fundamental definitions
1.2 The Bolyai-Gerwin theorem
1.3 How to cut a polygon into triangles
1.4 How to cut a triangle to make a parallelogram
1.5 How to cut a parallelogram to make a rectangle
1.6 How to cut a rectangle to make a square
1.7 How to cut several squares and reassemble the pieces to form a single square
1.8 How to cut a square into pieces which can be reassembled to form a given polygon of the same area
1.9 Some concluding remarks

Chapter 2 Brunnian Links
2.1 The simple closed curve
2.2 Links and their properties
2.3 A simple algebra
2.4 Brunnian 4-links
2.5 Notation and examples
2.6 Commutators
2.7 The final generalization

Chapter 3 The Well-Tempered Clavichord
3.1 Properties of logarithms
3.2 A peculiar manipulation
3.3 Continued fractions
3.4 The value of a continued fraction
3.5 Applications to baseball and grade distributions
3.6 Harmony
3.7 Tuning a piano, old style
3.8 Tuning a piano, new style
3.9 Improving the octave

Chapter 4 Group Theory
4.1 Some examples of groups
4.2 Subgroups
4.3 Cyclic groups and Abelian groups

Chapter 5 Polyhedra
5.1 The definition of polyhedron
5.2 Euler's formula
5.3 Regular solids
5.4 A converse of Euler's formula
5.5 Map coloring

Chapter 6 Infinite Sets
6.1 Sets
6.2 Functions
6.3 More on one-to-one correspondences
6.4 The Cantor-Schroeder-Bernstein theorem
6.5 Properties of finite and infinite sets
6.6 Nondenumerable infinite sets

Chapter 7 Number Theory
7.1 Divisibility
7.2 Well-ordering
7.3 The fundamental theorem of arithmetic
7.4 The greatest common divisor
7.5 Applications

Chapter 8 Animal Populations
8.1 Unrestricted growth of a single species
8.2 Growth of a single species under limiting conditions
8.3 The case of two competing species
8.4 The predator-prey case

Chapter 9 The Art Gallery Theorem
9.1 Convex sets
9.2 Intersections of convex sets
9.3 Hulls and kernels
9.4 Helly's theorem
9.5 Krasnoselskii's theorem
9.6 L-convexity

Chapter 10 The Real Number System
10.1 The rational numbers
10.2 Nested intervals of rational numbers
10.3 Construction of the real numbers
10.4 The order relation on R
10.5 Are there more numbers?
10.6 An unusual set of real numbers

Epilogue
Answers and Hints
Index
CHAPTER 1 THE BOLYAI-GERWIN THEOREM
You have probably seen some of the numerous puzzles and games involving a small number of plastic pieces of various shapes. In one variety of these, an accompanying booklet gives outlines of various figures, and one attempts to form the pieces into these figures. Also enjoying some popularity are board games in which one uses the varied shapes of his pieces to force or prevent certain moves by his opponent. It should not surprise you to find that many mathematicians like such puzzles, and moreover that some of the mathematical ideas behind such puzzles have been studied by mathematicians for many years. The puzzles usually feature one extra problem when one tires of constructing figures; for example, the plastic pieces come in a flat rectangular box, and we find it surprisingly difficult to return the pieces properly to the box.

This chapter will reveal, among other things, a method by which you can construct such puzzles of your own, even ones which will fit into a rectangular box, if you wish. You will be able to cut several plastic pieces which will fit into the box, but which you can reassemble to form a triangle, a hexagon, a symmetrical five-pointed star, or all three.

Of course, one restriction immediately becomes apparent. If one such figure can be cut up and the pieces reassembled to form another figure, the two must have the same area. On the other hand, it is hard to believe that a square could be cut into a finite number of pieces which could be reassembled to form a circular disk of the same area, but we cannot answer whether or not this is possible. This question presently enjoys the status of an Unsolved Problem in Mathematics. (Do not confuse this with the famous problem of Squaring the Circle; the distinction between the two will be discussed in the
exercises.) Because circles, curved cuts, and curved figures in general lead us beyond the frontiers of current mathematical knowledge, we shall restrict our attention to polygons and straight-line cuts. In these cases the mathematical problems involved have been solved, and the solution is remarkably simple. All that is required is that the two figures have the same area. If so, then either may be cut into pieces and reassembled to form the other. This is the essence of the Bolyai-Gerwin Theorem.

1.1 THE FUNDAMENTAL DEFINITIONS
We plan to lead you through a proof of the Bolyai-Gerwin Theorem for two reasons. First, it is in some sense a very "rich" proof in that the proof itself answers numerous questions which may already have occurred to you. Second, the proof is a good model of mathematical thinking, and you should develop the habit of mathematical thinking in order to profit from this book. Have no fear, though, that we are trying to turn you into a mathematician; we hope to convince you that mathematical thinking is not at all mysterious, but merely ordinary careful reasoning.

A little geometry will be helpful to you but not essential. Your intuitive idea of the area of a plane figure will serve, and we need but two definitions. Before the first definition, a small bit of notation will be introduced for clarity. If L is a straight-line segment in the two-dimensional plane, we will write L = [a, b] to indicate that a is one endpoint of L, and b is the other. The interval notation is used to remind us of what is in fact the case, that [a, b] consists of the two points a and b together with all the points between them on the straight line through a and b. And we call a the first point of L and b the last point of L.

A polygon is a plane figure of finite area bounded by a finite number of straight line segments $L_1, L_2, L_3, \ldots, L_n$, such that the last point of $L_1$ is the first point of $L_2$, the last point of $L_2$ is the first point of $L_3$, and so on, and the last point of $L_n$ is the first point of $L_1$; and moreover, other than as indicated above, no two of these line segments have any points in common. These line segments are called the edges of the polygon, and the endpoints of these line segments are called the vertices of the polygon. The points belonging to the edges of a polygon are to be considered part of the polygon; that is, the polygon consists of all its interior points together with all its boundary points.
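The edge conditions in this definition can be tested mechanically for a figure given by its list of vertices. The following Python sketch is only an illustration and not part of the text's development: the names `orient`, `on_segment`, `segments_meet`, and `is_polygon` are hypothetical, the vertices are assumed to be distinct points with integer or rational coordinates listed in order around the boundary, and the finite-area requirement is taken for granted.

```python
def orient(p, q, r):
    """Sign of the turn p -> q -> r: 1 for a left turn, -1 for a right turn, 0 if collinear."""
    v = (q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0])
    return (v > 0) - (v < 0)


def on_segment(p, q, r):
    """For collinear p, q, r: True if r lies on the segment [p, q]."""
    return (min(p[0], q[0]) <= r[0] <= max(p[0], q[0])
            and min(p[1], q[1]) <= r[1] <= max(p[1], q[1]))


def segments_meet(a, b, c, d):
    """True if the segments [a, b] and [c, d] have at least one point in common."""
    o1, o2 = orient(a, b, c), orient(a, b, d)
    o3, o4 = orient(c, d, a), orient(c, d, b)
    if o1 != o2 and o3 != o4:
        return True
    return ((o1 == 0 and on_segment(a, b, c)) or (o2 == 0 and on_segment(a, b, d))
            or (o3 == 0 and on_segment(c, d, a)) or (o4 == 0 and on_segment(c, d, b)))


def is_polygon(vertices):
    """Check the edge conditions of the definition for the closed chain of vertices."""
    n = len(vertices)
    if n < 3:
        return False
    edges = [(vertices[i], vertices[(i + 1) % n]) for i in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            a, b = edges[i]
            c, d = edges[j]
            if j == i + 1:                  # consecutive edges share the point b == c
                shared, p, q = b, a, d
            elif i == 0 and j == n - 1:     # the last edge returns to the first point a == d
                shared, p, q = a, b, c
            else:                           # all other pairs must have no common point
                if segments_meet(a, b, c, d):
                    return False
                continue
            # Neighbouring edges may share only their common endpoint,
            # so they must not fold back on top of one another.
            if orient(p, shared, q) == 0 and (on_segment(p, shared, q)
                                              or on_segment(shared, q, p)):
                return False
    return True
```

For instance, `is_polygon([(0, 0), (1, 0), (1, 1), (0, 1)])` accepts a unit square, while the "bow tie" `[(0, 0), (2, 2), (2, 0), (0, 2)]` is rejected because two of its non-neighbouring edges cross.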
Consequently, when we speak of a point x belonging to the polygon P, it is permissible that x be either an interior point or a boundary point of P. We require that the figure have finite area to avoid certain paradoxes (to be discussed in the exercises) and also so that we may attach the common meanings to many terms such as "square." We certainly want to mean by "square" a square boundary together with what's inside, and the requirement
that a polygon have finite area spares us the necessity of dealing with a square boundary together with what's outside.

Two polygons are said to be equidecomposable if it is possible to cut one up, using straight cuts, into a finite number of pieces which may be reassembled, omitting none, to form the other. You may object that this definition is somewhat vague. But an attempt to make it much more precise would also make it several times as long, and require the introduction of several other terms which would also have to be defined precisely. Again, as with the idea of area, we ask that you accept your natural understanding of the concept of equidecomposability as the correct one.

At this point you may test your concept of area and your understanding of equidecomposability by means of the following theorem.

Theorem 1.1 If two polygons are equidecomposable, then they have the same area.
If you feel that this proposition is obvious, then you surely understand enough to proceed. After all, whatever we may mean by "area," the area of a plane figure should not be altered by rigid (congruence) motions, so all that would be necessary for a proof of the above theorem is to cut the first polygon into several pieces and measure the area of each such piece. Then observe that as the pieces are assembled into each of the two given polygons the resulting total areas must be equal, for each is the sum of the same set of numbers.

But on a deeper level there are always a few "obvious" things which become less obvious when more closely examined. If we were to start with certain axioms about the geometry of the two-dimensional plane and attempt to define "area," the development of this definition would be lengthy, and the above theorem would have a rather complicated proof. For example, it is conceivable that some of the pieces into which a polygon might be cut simply cannot have area. It would be necessary to establish that if only straight cuts are used to form such pieces then each piece must in fact have area, and then that the total area of the polygon is equal to the sum of the areas of the individual pieces. Further consideration of these topics will be found in the exercises.

Exercises
1.1 Which of the figures shown in Fig. 1.1 is a polygon? Exactly in what way does each that is not a polygon fail to satisfy the definition?

1.2 What is the smallest number of vertices that a polygon can have? Prove that your answer is correct.

1.3 Construct a polygon having one and only one vertex in each of the four quadrants and not containing the origin, or prove that no such polygon can exist.
Fig. 1.1 Which of the above figures is a polygon?
1.4 Need a polygon have positive, rather than zero, area? Why?

1.5 Show that a polygon must have at least one interior angle less than 180°.

1.6 The relation of equality between numbers enjoys the following three properties.
a) The Reflexive Property: If x is a number, then x = x.
b) The Symmetric Property: If x and y are numbers such that x = y, then y = x.
c) The Transitive Property: If x, y, and z are numbers such that x = y and y = z, then x = z.
Such a relation is said to be an equivalence relation. If P and Q are polygons, let us write $P \sim Q$ provided that P and Q are equidecomposable; that is, if it is possible to cut P up, using straight cuts, into a finite number of pieces which can be reassembled, omitting none, to form Q. Show that equidecomposability is an equivalence relation: specifically, if P, Q, and R are polygons, then
a) $P \sim P$.
b) If $P \sim Q$, then $Q \sim P$.
c) If $P \sim Q$ and $Q \sim R$, then $P \sim R$.

1.7 Two polygons are said to be nonoverlapping if they have, at most, boundary segments in common. In particular, polygons with only one point, or nothing, in common are nonoverlapping. We take for granted in our definition and subsequent discussion of equidecomposability that the pieces one assembles to form a given polygon are to be nonoverlapping, but that they may have common edges. Show that a triangle is not the union of nonoverlapping parallelograms each pair of which are congruent. Is this still true if we drop the restriction that each two parallelograms are congruent, and require only that finitely many parallelograms be used?

1.8 Plane figures not of finite area can exhibit surprising behavior, even if the boundaries are required to be straight-line segments meeting in common endpoints, much as in the definition of a polygon. We might as well allow the use of half-infinite lines as well, with only one endpoint, that is also an endpoint of another such half-infinite line or line segment. Here, unfortunately, the notion of geometric congruence breaks down rather badly. It turns out that there is an example of such a figure which is congruent to a proper part of itself. Thus there can be no way to define "area" for such figures so that congruence motions would preserve area, or so that proportional figures with the same area would be congruent.
Find an example of such a plane figure with polygonal boundary congruent to a proper part of itself. (The figure must be unbounded; that is, contained in no circle, no matter how large the radius of the circle.)

1.9 In the terminology of the previous exercise, find an example of an unbounded plane figure with polygonal boundary such that the figure has finite area. It is permissible to use infinitely many segments to form the boundary.

1.10 It is also very easy to find an example of a plane figure that has a finite perimeter but infinite area. Please do so.

1.2 THE BOLYAI-GERWIN THEOREM
The Bolyai-Gerwin Theorem is just the converse of Theorem 1.1. The beauty of this theorem lies in the extremely simple condition for equidecomposability of two polygons and the curious fact that the theorem is unexpectedly true.
Theorem 1.2 (Bolyai-Gerwin) If two polygons have the same area then they are equidecomposable.
The proof has its own beauty: it is constructive. The statement of the theorem would be of little use to you should you wish to construct such puzzles as were mentioned earlier, but the proof provides a "how to do it" recipe. If the proof were not broken into a sequence of short steps, it would be easy to lose the thread of the argument, so we shall present the proof in the following sequence. We shall show, in order, how to cut

1) a polygon into triangles,
2) a triangle to make a parallelogram,
3) a parallelogram to make a rectangle,
4) a rectangle to make a square,
5) several squares to form a single square when the pieces are reassembled, and
6) a square into pieces which can be reassembled to form a given polygon of the same area.

Given two polygons of the same area, we can cut one into pieces which can be assembled into a square, using the first five steps above. We can do the same with the other polygon and assemble both into squares. The two squares will have the same area, and so can be superimposed. If all cuts now visible are made, then either square can be made into either polygon. Hence the first polygon can be made into the second, with one of the squares as a halfway point.

Finally, before we begin our sequence of six short proofs, note that in each proof, each construction can be performed by so-called "pure" geometric methods, that is, with unmarked straightedge and compass. This is not necessary for the truth of the Bolyai-Gerwin Theorem, but it does serve as an added attraction and makes the construction of puzzles considerably simpler. Also note that in each proof all cuts are straight-line cuts, as required in the definition of equidecomposability, and finally that it is never necessary to lift a piece and turn it over.

Exercises

Fig. 1.2 The Texan Rectangle.
1.11 Let R be a rectangle with short side of length one unit and long side of length two units, and let S be a square of the same area as R. Show how to cut R into a finite number of pieces, using straight cuts, so that the pieces can be reassembled to form S. Can you do this with only three cuts?

1.12 Show how to cut up the Texan Rectangle shown in Fig. 1.2, bounded by two parallel rays and a segment perpendicular to each, into infinitely many squares that can be reassembled to form a strip twice as wide. This example is one reason why we restrict our attention to figures of finite area when dealing with equidecomposability.

1.13 Show how to cut the Texan Rectangle into squares that can be reassembled to form the entire two-dimensional plane.

1.14 Show how a given triangle can be divided into two right triangles with a single cut.

1.15 Let T be a right triangle. Show how you can cut T into two pieces which can be reassembled to form the mirror image of T, without turning any piece over.

1.16 In the next section we will show how to cut any polygon into triangles. Thus the two previous exercises show how turning pieces over in our study of equidecomposability could be avoided, even if thought necessary in some construction. Explain how.
1.17 What is a theorem? A corollary? A lemma? Look up the answers. What is the difference between a mathematical definition and the sort of definition found in a dictionary?

1.18 How would you phrase the Bolyai-Gerwin Theorem for the one-dimensional case, that is, for sets which are subsets of the real number line? In order to answer this question, you will have to decide on a one-dimensional analogue of "polygon," and define equidecomposability for a figure like this.

1.19 Do you believe that the theorem you phrased in the previous exercise is true?

1.20 See Exercise 1.6. Given real numbers a and b, let us say that a is equivalent to b, and write a ~ b, provided that a - b is a whole number. Is the relation ~ an equivalence relation? If so, prove it; if not, show why not.
1.3 HOW TO CUT A POLYGON INTO TRIANGLES
It will be much more convenient if the mathematical objects with which we work have short and easily remembered names, so we begin by letting P be a plane polygon. Let its vertices be called $v_1, v_2, v_3, \ldots, v_n$. We shall also suppose that we have so named these vertices that the edges of P are the line segments $[v_1, v_2], [v_2, v_3], \ldots, [v_n, v_1]$. If P has only three vertices, it is already a triangle and no further construction is necessary.

If P has four or more vertices, there must be at least one at which the interior angle is less than 180° (why?). We may suppose that we have named this vertex $v_1$, and thus by the definition of polygon, $v_1$ lies on the two edges $[v_1, v_2]$ and $[v_n, v_1]$ of P. Consider the line segment K joining $v_2$ with $v_n$. Now K will not be an edge of P itself, but it may happen that except for the endpoints of K, K lies entirely within the interior of P. If so, we cut P along this new line segment, and thus obtain two new polygons. One is, of course, the triangle T whose vertices are $v_1$, $v_2$, and $v_n$, and the other is a new polygon P' with one vertex fewer than P.

On the other hand, what if K does not lie entirely within P? In this case we can produce another new line segment L, also joining two vertices of P, and lying entirely within P, as follows. Imagine a straight line M, parallel to the segment K, and passing through the vertex $v_1$. Then imagine this line M moving slowly toward K while remaining always parallel to it. Since K does not lie entirely within the polygon P, as M moves toward K it must eventually meet some part of the boundary of P that lies between $[v_1, v_2]$ and $[v_n, v_1]$. To be precise, M must intersect some part of the boundary of P that lies within the triangle T whose vertices are $v_1$, $v_2$, and $v_n$, as shown in Fig. 1.3. But when M first meets part of the boundary in this fashion, during its journey from $v_1$ to K, then M must simultaneously meet at least one vertex $v_i$ of P. Let L be the line segment from $v_1$ to $v_i$. Then L lies, except for its endpoints, wholly within the interior of P.

Fig. 1.3 Triangulation of a polygon.

Thus we can always produce some new line segment, either K or L, joining two vertices of P not previously joined by an edge of P, and lying except for its endpoints wholly within the interior of P. If we cut P along this new line, we produce two new polygons with the following important property: Each has fewer vertices than did P. We repeat this process on each resulting polygon that has more than three vertices until the process is forced to come to a halt by virtue of the fact that P has been cut entirely into triangles. All the cuts are straight-line cuts, and each construction can be performed with straightedge alone. The method is the "most efficient possible" in that it does not require the creation of new vertices. Thus we have shown how it is possible to cut a given polygon into triangles.
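The construction of this section can be followed step by step in a short program. The Python sketch below is one possible reading of it, not a definitive implementation: the names `cross`, `signed_area`, `in_triangle`, and `triangulate` are hypothetical, the polygon is assumed to be given as a list of distinct vertex coordinates in order around the boundary and to satisfy the definition of Section 1.1, and "the vertex first met by the moving line M" is taken to be the vertex inside the triangle T that lies farthest from K.

```python
def cross(o, a, b):
    """Twice the signed area of the triangle o, a, b (positive if counterclockwise)."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def signed_area(poly):
    """Signed area of the polygon: positive when the vertices run counterclockwise."""
    n = len(poly)
    return sum(poly[i][0] * poly[(i + 1) % n][1] - poly[(i + 1) % n][0] * poly[i][1]
               for i in range(n)) / 2


def in_triangle(q, a, b, c):
    """True if q lies in the closed counterclockwise triangle a, b, c."""
    return cross(a, b, q) >= 0 and cross(b, c, q) >= 0 and cross(c, a, q) >= 0


def triangulate(poly):
    """Cut a polygon into triangles using only segments that join existing vertices."""
    poly = list(poly)
    if signed_area(poly) < 0:          # work with counterclockwise order throughout
        poly.reverse()
    n = len(poly)
    if n == 3:
        return [tuple(poly)]
    # Choose v1: a vertex whose interior angle is less than 180 degrees.
    i = next(k for k in range(n)
             if cross(poly[k - 1], poly[k], poly[(k + 1) % n]) > 0)
    p = poly[i:] + poly[:i]            # rename so that p[0] is v1, p[1] is v2, p[-1] is vn
    v1, v2, vn = p[0], p[1], p[-1]
    # Vertices of P inside the triangle T = v1 v2 vn prevent a cut along K = [v2, vn].
    blockers = [j for j in range(2, n - 1) if in_triangle(p[j], v1, v2, vn)]
    if not blockers:
        # K lies inside P: cut off the triangle T and continue with P'.
        return [(v1, v2, vn)] + triangulate(p[1:])
    # Otherwise the line M, moving from v1 toward K, first meets the blocking
    # vertex farthest from K; cut along L = [v1, that vertex].
    j = max(blockers, key=lambda k: cross(v2, vn, p[k]))
    return triangulate(p[:j + 1]) + triangulate(p[j:] + [p[0]])
```

As a usage example, `triangulate([(0, 0), (4, 0), (4, 4), (2, 1), (0, 4)])` handles a five-sided polygon with a notch; the vertex (2, 1) lies inside the first triangle T, so the first cut is made along L rather than K. Summing `abs(cross(*t)) / 2` over the returned triangles gives back the area of the original polygon, as Theorem 1.1 requires.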
Exercises
1.21 Suppose that P is a polygon with vertices $v_1, v_2, v_3, \ldots, v_n$. In the construction of this section, in which a polygon such as P is decomposed into triangles, how many cuts must be made, and how many triangles will be obtained? The answer of course depends on the value of n, and should give the correct value in particular for n = 3 and n = 4.

1.22 Suppose that $\mathcal{S}$ is a statement meaningful for natural numbers; that is, $\mathcal{S}$ might be the statement "Every natural number is composite." In order to be meaningful it is not necessary that $\mathcal{S}$ be true, merely that for each natural number $\mathcal{S}$ is either true or false. Let us consider the statement $\mathcal{S}$, as an example, that "The sum of the first n natural numbers is n(n + 1)/2." In this case $\mathcal{S}$ happens to be always true, and for such statements there is frequently the possibility of proving them true by the method of induction. To prove by the method of induction that such a statement is true, we must first establish that the statement is true for n = 1; and then, assuming the truth of the statement for n = k, show that it follows that the statement is also true for n = k + 1. Consequently, since the statement is true for 1, it is then true for 2; then, since it is true for 2, it is also true for 3, and so on. Thus the statement must be true for each natural number. Use the method of induction to prove that your answer to the previous exercise is correct.

1.23 Continuing the previous exercise, let $\mathcal{S}$ be the statement

$$1 + 2 + 3 + \cdots + n = \frac{n(n + 1)}{2}.$$
Prove $\mathcal{S}$ by the method of induction.

1.24 The method used in this section to prove that each polygon can be triangulated is actually a proof by induction in disguise. Explain how to rephrase this proof so that the use of the method of induction becomes more apparent.

1.25 In the proof of the last section, we showed how to cut up a polygon into triangles without the introduction of new vertices. Do you think an analogous process will work for three-dimensional polyhedral solids? In other words, can each polyhedral solid be cut up into tetrahedra (not necessarily regular) without the introduction of new vertices?

1.26 See Exercise 1.22. Prove by induction: If 0 < a < 1 and n is a natural number, then $0 < a^n < 1$. Of course, you may use anything you know about inequalities.

1.27 Prove by induction: If n is a natural number, then

$$(-1)^n = \begin{cases} 1 & \text{if } n \text{ is even}, \\ -1 & \text{if } n \text{ is odd}. \end{cases}$$
1.28 Prove by induction: If n is a natural number, then $4^n - 1$ is evenly divisible by 3. Note that if m is a natural number evenly divisible by 3, then there exists a natural number k such that m = 3k.

1.29 Give an example of a pair of rectangles of the same area such that it is not possible to cut one into finitely many squares that can be reassembled to form the other.

1.30 Give an example of a rectangle that cannot be cut into finitely many squares each two of which have the same area.

1.4 HOW TO CUT A TRIANGLE TO MAKE A PARALLELOGRAM
This is much easier. In fact, you have probably already guessed how to do it. If not, one hint is that it can be done with only one cut. Let T be a triangle and let a and b be midpoints of two of its sides. Join a with b by the straight-line segment [a, b], and cut T along [a, b]. This will produce a small triangle and a trapezoid. Holding the trapezoid fixed, rotate the triangle half a turn about either a or b. The resulting figure will be a parallelogram. That it is indeed a parallelogram is the only fact not immediately obvious. Establishing this fact is left for the next exercise.

Exercises
1.31 Establish, as indicated in the proof of this section, that the reassembled triangle actually does form a parallelogram.

1.32 With what type of triangle will the method of this section produce a rectangle rather than just a parallelogram?

1.5 HOW TO CUT A PARALLELOGRAM TO MAKE A RECTANGLE
This is almost as easy as the previous proof. It is possible to construct in the parallelogram (which we suppose is not already a rectangle) a line from one vertex where the interior angle exceeds 90° to the opposite side, such that this line lies within the parallelogram and is perpendicular to the opposite side. Cut along this line; you will obtain a trapezoid and a triangle. If the triangle is moved without rotation to the opposite side of the trapezoid, the two will fit together to form a rectangle. Again, the only problem is to show that the new figure is indeed a rectangle, and we leave this for the exercises.

Exercises
1.33 Show how to construct the perpendicular needed for the cut in the proof of this section.
Fig. 1.4 Finding the solution of $x^2 = ab$.
1.34 Prove that the reassembled parallelogram in the proof of this section actually does form a rectangle.

1.35 Under what circumstances will the reassembled parallelogram in the last proof form a square, rather than just a rectangle?

1.6 HOW TO CUT A RECTANGLE TO MAKE A SQUARE
There is a trick to this construction. We must first determine the size of the square, draw it, and then use it to determine where to make the cuts on the rectangle. To draw the desired square, we use a method known to the ancient Greek geometers. Let the rectangle be called R, let its short side have length a, and let its long side have length b. (If R is already a square we are finished, and no construction is necessary.) Draw a straight line of length a + b, bisect it, and draw a circle with the center at the point of bisection so that the segment of length a + b forms a diameter of that circle. Since this diameter has length a + b, there is a point p on it which divides the diameter into two segments, one of length a and one of length b. Draw a perpendicular to the diameter at the point p. See Fig. 1.4. This perpendicular must meet the circle at some point q. It turns out that the length x of the segment [p, q] has the property that $x^2 = ab$ (to be established in an exercise), and hence a square of side length x will have the same area as the rectangle R.

There will now be a small technical problem if the rectangle R is more than four times as long as it is wide, but in an exercise we will ask you to show how such a rectangle can be converted into one which is less than four times as long as it is wide. Hence we may assume that R itself is in fact less than four times as long as it is wide.

Draw R and the square, which we shall call S, in such fashion that they overlap as shown in Fig. 1.5. Draw the extra line L also shown in the same figure, and cut R along that portion of L which lies in R. Then cut R along the line segment M, that part of the right-hand side of S which lies below L.

Fig. 1.5 How to turn a rectangle into a square of the same area.

Now R consists of several regions, numbered from 1 to 4 in the figure. S consists of the regions numbered 1, 3, 5, and 6. We reassemble the pieces of R to form S as follows:

• Leave piece 1 alone.
• Move piece 2 to the position of piece 5.
• Move the triangle composed of pieces 3 and 4 to the position of the triangle composed of pieces 3 and 6.
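The moves listed above can be checked numerically once coordinates are chosen. The Python sketch below rests on an assumed reading of Fig. 1.5, which is not reproduced here: R is placed as the rectangle with corners (0, 0) and (b, a), S as the square with corners (0, 0) and (x, x), and L is taken to run from the upper left corner of S to the lower right corner of R. The function names are hypothetical, and a numerical check of this kind illustrates the construction without replacing the congruence proof.

```python
import math


def rectangle_to_square(a, b):
    """Corner coordinates of the triangles moved when an a-by-b rectangle becomes a square.

    Assumed layout: R = [0, b] x [0, a], S = [0, x] x [0, x] with x * x = a * b,
    and L joins (0, x) to (b, 0). Requires a < b < 4a.
    """
    if not (0 < a < b < 4 * a):
        raise ValueError("need a < b < 4a")
    x = math.sqrt(a * b)                      # side of the square, from x^2 = ab
    tri_34 = [(b - x, a), (b, 0), (b, a)]     # part of R above L (pieces 3 and 4)
    tri_36 = [(0, x), (x, x - a), (x, x)]     # part of S above L (pieces 3 and 6)
    tri_2 = [(x, 0), (b, 0), (x, x - a)]      # part of R to the right of M, below L
    tri_5 = [(0, a), (b - x, a), (0, x)]      # part of S above R, below L
    return x, tri_34, tri_36, tri_2, tri_5


def translate(points, dx, dy):
    """Slide every point by (dx, dy); no rotation is involved."""
    return [(px + dx, py + dy) for px, py in points]


def same_points(ps, qs, tol=1e-9):
    """True if corresponding points agree up to rounding error."""
    return all(math.isclose(p[0], q[0], abs_tol=tol) and
               math.isclose(p[1], q[1], abs_tol=tol) for p, q in zip(ps, qs))


def check_dissection(a, b):
    """Check that two translations carry the pieces of R onto the pieces of S."""
    x, tri_34, tri_36, tri_2, tri_5 = rectangle_to_square(a, b)
    return (same_points(translate(tri_34, x - b, x - a), tri_36)
            and same_points(translate(tri_2, -x, a), tri_5))
```

For a 4-by-9 rectangle the square has side 6, and `check_dissection(4, 9)` confirms that each triangle lands exactly on its target after a pure translation. In this layout the line L crosses the top edge of R at distance b - x from the left, which is to the left of M only when b < 2x, that is, when b < 4a.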
No rotations are used in these motions. All we need to do is show that triangle 2 is congruent to triangle 5 and that the triangle composed of pieces 3 and 4 is congruent to the triangle composed of pieces 3 and 6. Again, the proof is left for the exercises. We wanted the rectangle R to be less than four times as long as wide, to ensure that the line L will angle down steeply enough to guarantee the existence of triangle 3. In any case, we have shown that with two straight cuts a rectangle can be reassembled into a square.

Exercises
1.36 In Section 1.6, an important preliminary to the construction is to show how to construct a line segment of length x such that $x^2 = ab$ (see Fig. 1.4). It is shown in many courses in plane geometry that if the lines from q to the endpoints of the diameter are drawn, then the angle formed at q is a right angle. Use this fact to show that $x^2 = ab$.

1.37 In the proof of this section, it was necessary to use a rectangle less than four times as long as wide. Suppose that R is a rectangle more than four times as long as wide. Show how to convert R by straight line cuts into a rectangle of equal area less than four times as long as wide. What if R is exactly four times as long as wide?

1.38 The construction in this section states that the rectangle R has to be less than four times as long as it is wide in order to ensure the existence of triangle 3 (see Fig. 1.5). Establish this by showing the following to be true: If the rectangle R is less than four times as long as it is wide, then the line L angles down steeply enough to guarantee the existence of triangle 3, and not otherwise.

1.39 The last step in the proof of this section is to show that triangle 2 is congruent to triangle 5, and that the triangle composed of pieces 3 and 4 is congruent to the triangle composed of pieces 3 and 6. Please do so.
1.7 HOW TO CUT SEVERAL SQUARES AND REASSEMBLE THE PIECES TO FORM A SINGLE SQUARE
Actually all we will show is how to assemble two squares $S_1$ and $S_2$ into a single square S. If there should happen to be a third square $S_3$, then we would repeat the process with S and $S_3$ to obtain a single square, and so on.

Let $S_1$ and $S_2$ be two squares. If they should happen to have the same area, then there is an extremely easy way to cut and reassemble them into a single square, a way left for you to discover. So let us suppose that the area of $S_1$ exceeds the area of $S_2$. As in Section 1.6, a preliminary construction is needed to determine where the cuts should be made in $S_1$ and $S_2$ in order that the resulting pieces can be reassembled to form the square S.

Fig. 1.6 How to turn two squares into a single square.

Draw $S_1$ and $S_2$ as shown in Fig. 1.6, as well as the auxiliary square A, and let $S_1$ and $S_2$ have side lengths a and b, respectively. Note in the figure that each of the two points γ and δ is located at distance b from the next vertex of A counterclockwise around A. Locate the two points α and β similarly. Draw the segments [α, β], [β, γ], [γ, δ], and [δ, α], shown as dashed lines in the figure. These segments form the boundary of the desired square S. Cut $S_1$ and $S_2$ along the parts of the boundary of S which lie in each. Now $S_1$ consists of three pieces numbered 1, 2, and 3; $S_2$ consists of two pieces marked 4 and 5; and S itself must be formed by rearranging these pieces so as to cover the regions numbered 1, 4, 6, and 8. We may ignore the pieces numbered 7, 9, and 10, since they do not enter the construction; they are merely artifacts introduced to help discover where to make the cuts, and are not part of $S_1$, $S_2$, or S.

Here is the recipe for reassembling the pieces of $S_1$ and $S_2$ to form S:
• Leave piece 1 alone.
• Remove piece 4 and leave it on the side for a while.
• Pick piece 2 up and move it, without rotation, so that it covers region 4 and part of region 8; the point marked γ goes to position δ.
• Rotate piece 5 counterclockwise 90° about the point δ, thus covering the top half of region 6.
• Use piece 3 to cover the rest of region 6; the point marked β on piece 3 goes to position γ.
• Find piece 4 (left on the side for a while) and use it to cover the rest of region 8; the point marked δ on piece 4 goes to position α.

A little easy geometry involving lengths and right angles can be used to show that the pieces actually fit correctly and do form a square. We chose this particular construction because Fig. 1.6 can also be used to prove the Pythagorean Theorem. There are other proofs of the Pythagorean Theorem, just as there are other proofs, using different figures, that two squares can be assembled into a single square, but we prefer this one because it does two jobs instead of one.

Exercises
1.40 Show that the polygon S bounded by the segments [α, β], [β, γ], [γ, δ], and [δ, α] is in fact a square. See Fig. 1.6.

1.41 Show that the reassembly in which the squares $S_1$ and $S_2$ are cut into pieces which form S actually works; that is, that the implied congruences are actual.

1.42 The Pythagorean Theorem states that if a right triangle has legs of length a and b, respectively, and hypotenuse of length c, then $a^2 + b^2 = c^2$. Show how ideas in the proof in this section and use of Fig. 1.6 can be used to establish this theorem.
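Before attempting Exercises 1.40 and 1.42, it may help to test the claims on a particular pair of squares. The sketch below is an illustration only. It assumes that the auxiliary square A has side a + b and sits with one corner at the origin (the text does not say so explicitly), and the helper names are our own. It locates the four marked points by the rule given in Section 1.7 and checks that they are the corners of a square of area $a^2 + b^2$.

```python
from fractions import Fraction

def marked_points(a, b):
    """Four points on the boundary of A, each at distance b from the next
    vertex of A counterclockwise. Assumes A has side a + b (our assumption)."""
    s = a + b
    corners = [(0, 0), (s, 0), (s, s), (0, s)]  # counterclockwise order
    t = Fraction(b, s)
    points = []
    for i in range(4):
        (x0, y0), (x1, y1) = corners[i], corners[(i + 1) % 4]
        # Start at the next vertex and step back a distance b along the edge.
        points.append((x1 + (x0 - x1) * t, y1 + (y0 - y1) * t))
    return points

def squared_length(p, q):
    """Square of the distance between the points p and q."""
    return (q[0] - p[0]) ** 2 + (q[1] - p[1]) ** 2

def is_square(points):
    """True if the four sides are equal and consecutive sides are perpendicular."""
    edges = [(points[(i + 1) % 4][0] - points[i][0],
              points[(i + 1) % 4][1] - points[i][1]) for i in range(4)]
    equal_sides = len({e[0] ** 2 + e[1] ** 2 for e in edges}) == 1
    right_angles = all(
        edges[i][0] * edges[(i + 1) % 4][0]
        + edges[i][1] * edges[(i + 1) % 4][1] == 0
        for i in range(4))
    return equal_sides and right_angles

a, b = 3, 2
pts = marked_points(a, b)
print(is_square(pts), squared_length(pts[0], pts[1]) == a * a + b * b)  # True True
```

Exact fractions are used so that the equality tests are not disturbed by rounding. One successful example proves nothing, of course, but a failure would show that the figure had been misread.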
1.8 HOW TO CUT A SQUARE INTO PIECES WHICH CAN BE REASSEMBLED TO FORM A GIVEN POLYGON OF THE SAME AREA
We discussed this procedure immediately after we listed in Section 1.2 the six steps in the proof of the Bolyai-Gerwin Theorem. What we do, of course, is make the polygon into a square, using the first five steps. Clearly, then, the actual process of assembling the pieces of the polygon to form a square may be reversed. Or if you prefer, Exercise 1.6 has some bearing on this problem, and can be used to show the existence of a method for cutting up a square into pieces which can be reassembled to form a given polygon of the same area.
To summarize, suppose that P and Q are two polygons of the same area. Chop each into triangles. Make each triangle into a parallelogram, each parallelogram into a rectangle, and each rectangle into a square. Assemble all the squares obtained from P into one giant square S, and all the squares obtained from Q into one giant square T. Then S and T will have the same area, since each has the area common to P and Q. If you draw T together with its cuts as a transparent overlay and place this overlay upon S, you will see where to make additional cuts on S. Using these cuts, we can reverse the construction of T from Q, using these new pieces of S and so reassemble S to form Q. This maneuver shows that P and Q are equidecomposable, and concludes the proof of the Bolyai-Gerwin Theorem.

Exercise
1.43 Exactly where in the proof of the Bolyai-Gerwin Theorem did we use the hypothesis that the polygons P and Q have the same area?

1.9 SOME CONCLUDING REMARKS
We have derived a great deal of pleasure from the actual construction of puzzles, using the techniques of the proof of the Bolyai-Gerwin Theorem. If you think you too would enjoy such constructions, here are two pieces of advice. One is aesthetic; one pertains to mechanical details.

For the aesthetic advice, an important point to remember is that the puzzles will be unsatisfactory if some of the pieces are too small or if there is a great disparity in their relative sizes. Constructions proposed for puzzles should be drawn carefully first, perhaps using cardboard, to verify that there are no tiny pieces. In addition, there is a code of honor among puzzlemakers to the effect that no unnecessary cuts should be made merely for the purpose of confusing the puzzle-worker. Since the only practical way to apply the proof of the theorem to the invention of a puzzle is to turn some pleasing polygon into a square, rather than the reverse, this is indeed the right approach; but variations in where the cuts are made should be tried not only to avoid very tiny pieces, but also to use the smallest possible number of pieces in the puzzle. It is quite difficult to work such a puzzle with more than fifteen pieces, and tremendously difficult to work one with more than twenty-five.

Mechanically, a very satisfactory method of constructing the puzzle pieces themselves is as follows: Cut a close-grained light hardwood about one-quarter inch thick into the shape of some polygon, such as a regular hexagon. Follow the steps of the construction, lightly drawing lines on the wood with a pencil. Make the necessary cuts after each step using a fine-bladed jigsaw, and the reassembled pieces may be glued in the new shapes to a piece of stiff cardboard to hold them fast for subsequent drawing and cutting. The glue
and the pencil marks may be sanded off after the final cuts are made and the cardboard removed. The grain of the wood may give a few clues to the puzzle-worker, but he will appreciate this. Keep a record of your construction in case your victim challenges you to work the puzzle.

A natural question that would occur at this point to a mathematician is the following: Does a theorem analogous to the Bolyai-Gerwin Theorem hold for a three-dimensional figure? That is, is it true that if we have two solid polyhedra with the same volume, then we can show that they are equidecomposable, still of course using only straight-line cuts? The answer is no. Of course, there are some pairs of solid polyhedra with the same volume which are equidecomposable; thickening up any two equidecomposable polygons, as you would in effect do if you constructed a puzzle, would provide such an example. That the proposed theorem is not true does not mean that it is always false, merely that there do exist pairs of polyhedra with the same volume which are not equidecomposable. In fact, what may be the simplest possible example (a cube and a regular tetrahedron of the same volume) provides an example of a pair of nonequidecomposable polyhedra with the same volume. This example was discovered by the German mathematician M. Dehn and first published in 1902.

One might next ask if, given two polyhedra with the same volume, it is possible to tell by a simple test whether or not they are equidecomposable. The Swiss mathematician H. Hadwiger found the answer to this question in 1949, but to provide the necessary definitions here to express the answer would take us too far afield. Let it suffice to say that, as one might expect, the question is resolved on the basis of the dihedral angles of the two polyhedra.

Exercises
1.44 A famous problem of antiquity, studied by early geometers, was the problem known as Squaring the Circle. Given a circle, the problem was to construct a square of equal area, using only an unmarked straightedge and compass. This problem is usually mentioned along with two companions, that of trisection of a given angle and the duplication of the cube (given a cube, to construct with straightedge and compass the edge length of a cube with double the volume of the first). These problems remained unsolved for thousands of years, but the solutions are now known. The answer is that in each of the three cases cited above no such construction can exist. Please do not misinterpret this statement. We do not mean that nobody can square the circle with straightedge and compass because nobody knows how. What we do mean is that in each case it has been proved that no such construction exists, nor ever can exist. Of course, mathematics departments of various famous universities receive from time to time proposed constructions. The tendency is to ignore them, or return them to the proposer with a note
to the effect that the construction is not correct, and that if the proposer wants to know where the error is, he should be willing to pay a fee for professional services. This has given mathematicians quite a reputation for dogmatism and narrow-mindedness, but most would prefer to work on the many problems to which the solution is yet unknown rather than search for an error in a construction they know in advance to be incorrect.

The reason that these constructions are impossible has to do with the fact that, given a line segment of length 1, unmarked straightedge, and compass, certain lengths cannot be constructed. However, if a number can be constructed, such as 2 or 1/4 (how?), its square root can also be constructed. The technique is concealed in the proof of the Bolyai-Gerwin Theorem, in Section 1.6 of this chapter. You are invited to discover the technique and use it to construct a line segment whose length is the square root of 5, given straightedge, compass, and a line segment of length 1.

Curiously enough, the impossibility of squaring the circle with straightedge and compass has little bearing, as far as we know, on the Unsolved Problem in Mathematics mentioned early in this chapter: the problem of whether a square and a circle of the same area are equidecomposable in the general sense that, naturally, curved cuts are to be allowed, but still that only finitely many pieces should be used. We do not require that only straightedge and compass be used. The problem here is one of finding where to make the cuts, or alternatively, to show that there can be no way to make the cuts that will work. This problem was first stated by A. Tarski in Fundamenta Mathematicae in 1925.

1.45 The first three stages in the construction of the Snowflake Curve are shown in Fig. 1.7. Some techniques of topology may be used to show that it makes sense to talk about "the figure obtained by continuing this process, over and over, once for each natural number."
That is, of each point in the plane it can be determined whether that point is eventually (and thus permanently) within the curve, or never within. The boundary of the figure, which is what we mean by the snowflake curve itself, consists of those points which eventually get on the boundary at some stage of the construction and then stay on the boundary in each successive stage. Since the figure is bounded its area is finite. However, the perimeter is infinite; that is, given any whole number n no matter how large, at some stage in the construction of the snowflake curve its perimeter exceeds n, and the perimeter increases at each stage. In fact, the perimeter is multiplied by 4/3 each time we pass to the next stage. So if the initial perimeter is 1, we obtain the following sequence of perimeters for the successive stages of the construction: $1, 4/3, (4/3)^2, (4/3)^3, (4/3)^4, (4/3)^5, \ldots$, and this sequence of numbers increases without bound. Why?
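The growth of the perimeter can be tabulated directly. The sketch below is only an illustration, with function names of our own choosing; it uses exact fractions and assumes, as in the text, that the initial perimeter is 1 and that the initial figure is counted as stage 0.

```python
from fractions import Fraction

def perimeter(stage, initial=Fraction(1)):
    """Perimeter after the given number of stages; each stage multiplies it by 4/3."""
    return initial * Fraction(4, 3) ** stage

def first_stage_exceeding(n):
    """Smallest stage at which the perimeter exceeds the whole number n."""
    stage = 0
    while perimeter(stage) <= n:
        stage += 1
    return stage

print([str(perimeter(k)) for k in range(4)])  # ['1', '4/3', '16/9', '64/27']
print(first_stage_exceeding(10))              # 9
```

The loop in `first_stage_exceeding` stops for every whole number n precisely because the sequence increases without bound; explaining why it must stop is the content of the question just asked.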
Fig. 1.7 First three stages in the construction of the Snowflake Curve.
If you know a little about infinite geometric series, you can use this knowledge to calculate the area of the figure bounded by the snowflake curve. Such a series is one in which each term is a fixed multiple, say r, of the previous one; that is, the series has the form

$$a_1 + a_2 + a_3 + a_4 + \cdots,$$

where $a_2 = ra_1$, $a_3 = ra_2$, $a_4 = ra_3$, and so on. The sum of such a series is the first term divided by $1 - r$. This formula works only if $-1 < r < 1$, but such will be the case if you find the right series for computing the area of the snowflake curve. See whether you can do this.

1.46 It is possible that a plane figure has no area. We do not mean "zero area," nor do we mean "infinite area." A much stranger phenomenon can take place. What ought "area" to mean? Suppose we list the properties it should have, and then ask first if any such thing exists. That is, does there exist a function A which assigns to each bounded subset of the two-dimensional plane a nonnegative real number, called the area of the subset, with the following properties:
a) The area of a line segment, or a point, or the "empty set," is zero.
b) If R and S are congruent bounded plane sets, then they have the same area.
c) If R and S are bounded plane sets which overlap in a set of zero area, then the area of $R \cup S$ is the sum of the area of R and the area of S.
d) If $R \subset S$ then the area of R does not exceed the area of S.
e) The area of a rectangle is the product of its length and its width.

The Polish mathematician S. Banach published in 1923 a proof that no such function A can exist. The only way, then, to get an effective area-measuring function such as we would like A to be is to drop one or more of the above stipulations as to its behavior. If property (e) is dropped, then it turns out that there is only one such function, and it is a rather dull one; it assigns area zero to every set. So we might as well keep property (e). But we surely do not want to give up any of the first four properties listed, and so the only solution is to drop the stipulation that the area function A measures every bounded subset of the plane. If we do this, it is then possible to construct a function A with all the desirable properties listed above; in fact, A will coincide with our intuitive notion of "area" for geometric figures. For example, A will assign the "correct" value for the area of a circle. What goes wrong, as shown by Banach, is that there will exist bounded plane figures to which A cannot assign any area value at all while still satisfying all the listed properties. Such a set is said to be nonmeasurable. No one has ever constructed an example of such a set; we merely have a proof of its existence. This is understandably not very satisfying, but it is also unavoidable. Incidentally, nonmeasurable sets exist in every dimension.

Can you use the techniques of this chapter to construct an area function A with the properties listed above so that A does assign an "area" to every bounded subset of the plane with polygonal boundary?
1.47 The idea of volume in dimension three is a natural one, although we have already mentioned that there are polyhedral figures of equal volume which are not equidecomposable. On the other hand, S. Banach and A. Tarski found that the situation is far worse than one might expect. It turns out that the solid ball of radius 1 and the solid ball of radius 2 are equidecomposable, though here we use the term in a much more general sense than previously, for the cuts are not straight, and most of the pieces into which the balls are cut are nonmeasurable (see the previous exercise). One dramatic way of putting it is this: If matter were homogeneous, it would be possible to cut the sun into pieces and reassemble these into a solid ball the size of a pea. This sort of thing cannot happen for bounded plane sets, but it is possible to solve the following problem: Show how to cut into infinitely many pieces a square of side 1 so that these pieces may be reassembled into a rectangle of twice the area of the square.

1.48 One method of cutting into pieces a polygon P and reassembling them into another polygon Q may be called more efficient than another such method if fewer pieces are involved. Very little is known about most efficient procedures in the general case other than the mere fact of their existence. Our methods, applied to turning an equilateral triangle into a square, involve cuts to form a total of seven pieces. Find a more efficient way, using shortcuts.

1.49 Suppose that P and Q are rectangles with short sides a and c, respectively, and long sides b and d, respectively; and suppose in addition that P and Q have different areas and that b is larger than d. Show how to cut P up into finitely many polygonal pieces which can be reassembled into a rectangle which has one side equal to d. This shows that two rectangles can be assembled into a single rectangle. How?

1.50 Can you use Exercise 1.49 and other information in this chapter to provide a shorter proof of the Bolyai-Gerwin Theorem? How?
1.51 It was mentioned in the last section that there do exist polyhedral solids of the same volume which are not equidecomposable. However, the Bolyai-Gerwin Theorem may well hold for certain types of polyhedra. For example, is it true that two rectangular parallelepipeds of the same volume are equidecomposable? Explain.

1.52 At the beginning of this chapter, it was stated that one can make a puzzle of a number of pieces that fit into a rectangular box, and which can be reassembled to form a triangle, a hexagon, a symmetrical five-pointed star, or all three. Of course, the Bolyai-Gerwin Theorem shows how to cut a rectangle into pieces that can be reassembled to make a hexagon, but how does it follow that a rectangle can be cut into pieces that can be reassembled to form all three of the above figures?
1.53 Is it possible to cut up two cubes of equal volume into pieces which can be reassembled to form a single cube? Explain.

1.54 Continuing the previous exercise, is it possible to cut up two cubes of equal volume into a number of small cubes of equal volume which may be reassembled to form a single cube? Reduce this problem to one of solving a certain simple algebraic equation.

1.55 Is it possible to assemble finitely many squares, no two of which have the same area, into a rectangle? Explain.

NOTES AND REFERENCES
The most convenient text covering the material of this chapter, as well as associated topics, is V. G. Boltyanskii's Equivalent and Equidecomposable Figures (Heath, 1963; translated by A. K. Henn and C. E. Watts from the first Russian edition, Moscow, 1956). In particular, the results of Dehn and Hadwiger about equidecomposable polyhedra are given in detail.

For those who wish to pursue these matters in the one-dimensional case, W. Sierpinski's long article "On the Congruence of Sets and Their Equivalence by Finite Decomposition" is excellent. This was originally published in Vol. 20 of Lucknow University Studies (1954), and is available in English in Monographs, published by Chelsea, containing the article by Sierpinski as well as one by Klein, one by Runge, and one by Dickson on other topics. One might also see Sierpinski's article, "Sur quelques problèmes concernant la congruence des ensembles de points," in Elemente der Mathematik, Vol. 5, pages 1-4 (1950).

With respect to Exercise 1.46, the paper of Banach's referred to is "Sur le Problème de Mesure," in Fundamenta Mathematicae, Vol. 4, pages 7-33 (1923).

A general reference covering a wide variety of problems about plane sets is the excellent text by Hadwiger, Debrunner, and Klee, Combinatorial Geometry in the Plane, published by Holt, Rinehart, and Winston (1964). A mention of equidecomposability is made on page 52, and several additional references are given.

Additional topics on triangulation, construction of an area-measuring function for the plane, and an alternate proof of the Bolyai-Gerwin Theorem may be found in E. E. Moise's Elementary Geometry from an Advanced Standpoint (Addison-Wesley, 1963), in Chapters 14 and 24. Chapter 19 also contains some interesting material on straightedge and compass constructions.

In mathematics, as well as in the physical sciences, it frequently happens that discoveries are made almost simultaneously by men working independently.
In the case of the Bolyai family, we have a double coincidence. Bolyai Farkas (whose name is frequently rendered as Wolfgang Bolyai de Bolya) was a Hungarian mathematician born in 1775. He studied at the University at Göttingen, where he and Gauss, then also a student there,
became close friends. The elder Bolyai returned to teach at Maros-Vasarhely, and during this time his son Bolyai Janos (or Johann Bolyai de Bolya) acquired an interest in mathematics. In 1832 Farkas published his major mathematical work, the Tentamen, covering a variety of geometrical ideas, and in which he stated the theorem we have called the Bolyai-Gerwin Theorem. At almost the same time-perhaps within a year-the German officer and amateur mathematician Gerwin also published the same result. Meanwhile, Janos had been working on the problem of the independence of Euclid's famous parallel postulate, and in the process discovered that the parallel postulate could be replaced by a nonequivalent alternative. The resulting geometry, known as non-Euclidean geometry, was at that time a major breakthrough in mathematical thought; indeed, so important that the reader is urged to consult a history of mathematics for a full appreciation of its implications. The work of the younger Bolyai was published as a 26-page appendix to his father's Tentamen. Again, almost simultaneously, the Russian mathematician Lobachevsky came up with much the same result; meanwhile, when he heard of this work of the younger Bolyai, Gauss revealed to Bolyai Farkas that he too had obtained such results, but had hesitated to publish them because he thought they might not be well received. This was quite disappointing to Bolyai Janos, who never again published any mathematical results; curiously enough, he is by far the better known of the two Bolyais. This is probably as it should be, inasmuch as his discovery of non-Euclidean geometry has had major implications in both the foundations of mathematics itself as well as in Einstein's Theory of Relativity. Bolyai's work went almost unnoticed for thirty-five years, until it was noted after his death by Richard Baltzer in 1867. Finally, in 1894, a memorial stone was placed on Bolyai Janos' grave in Maros-Vasarhely.
CHAPTER 2 BRUNNIAN LINKS
Examine the three rings shown in Fig. 2.1, close the book, and try to reproduce the drawing. Many people have some difficulty in correctly drawing the three rings of Fig. 2.1, known as the Borromean Rings, and will draw instead three rings linked much as the three shown in Fig. 2.2. As soon as the difference between the two is seen, though, it becomes easy to draw the Borromean Rings. The distinguishing property of the Borromean Rings is that each of the rings lies completely over one of the other two, and completely under the other. The three rings shown in Fig. 2.2, on the other hand, have the property that each actually links each of the other two. We shall examine in this chapter the implications of this difference, as well as properties of such figures made of other numbers of rings.

For the rest of this chapter, we shall assume that all such figures lie in ordinary three-dimensional space, so that if you wish you may construct examples made of string or wire. The circular shapes into which your material is formed will not have to be perfectly circular, nor is the material itself important with regard to the mathematical properties that will concern us. All we care about is the manner in which the various curves link one another. Hence we immediately pass to a convenient mathematical abstraction, that of the simple closed curve. Just as Euclid saw in the real world various rough approximations to a perfectly straight line (such as the line where the wall meets the ceiling), and passed to the abstraction of the geometric straight line without width or end, so also do mathematicians who wish to examine properties of knots and linking curves pass to the abstract idea of a simple closed curve.
Fig. 2.1 The Borromean rings.

Fig. 2.2 A (3, 1)-Brunnian link.

Fig. 2.3 A wild simple closed curve.

2.1 THE SIMPLE CLOSED CURVE
Let us consider the reality from which the concept of the simple closed curve is abstracted. Imagine an ordinary piece of string with the ends woven together so as to form a homogeneous and continuous curve; then think of the midline of this curve. This abstraction, the midline, is an example of a simple closed curve. The important properties are these:
a) No point separates a simple closed curve. That is, no single scissors-cut can separate it into two pieces.
b) Each set of two points does separate a simple closed curve. That is, two scissors-cuts will separate the curve into two (why not more?) pieces.
c) The simple closed curve is one-dimensional: every sufficiently small connected piece of it has the same dimensional properties as a small piece of Euclid's perfect geometric straight line.
d) The simple closed curve is bounded: it is contained in some sphere of sufficiently large radius.

Mathematicians were surprised, as you might be, to discover that the above four properties leave something to be desired if they are supposed to be defining properties of the common notion of a simple closed curve; at least, something to be desired if this definition is to be an accurate abstraction from reality. For this definition does not exclude such an object as is shown in Fig. 2.3.
Fig. 2.4 Tame simple closed curves.
The object shown in Fig. 2.3 does have the four properties listed above, and so ought to be considered a simple closed curve, but is not an abstraction from reality since it certainly cannot be tied with a piece of string. There are infinitely many loops, decreasing in size; of course, the entire figure cannot be drawn, and so most of it is concealed in the box. Such curves are said to be wild, as opposed to the tame ones we shall study.

We may prevent the occurrence of any wild simple closed curves henceforth by the following agreement: We consider a curve only if it can be continuously deformed without self-intersection onto a polygonal curve (one formed of a finite collection of straight-line segments). Mostly for aesthetic reasons we shall continue to draw our tame simple closed curves without corner points; just imagine if you wish that all the corners have been rounded off slightly. Each of the simple closed curves shown in Fig. 2.4 is tame; the one on the left is an approximation by a polygonal simple closed curve to the one on the right. Since the curves shown in Figs. 2.1 and 2.2 can be continuously deformed without self-intersections onto polygonal curves, they too are tame; but the curve shown in Fig. 2.3 does not have this property. (This is not obvious.)

Note that the Borromean Rings of Fig. 2.1 have the property that each ring lies completely over the one immediately clockwise to it. This means that if any one of these three simple closed curves is removed from the Borromean Rings, the remaining two come apart. However, you will have
little trouble in convincing yourself that the three do not come apart. Because of this interesting property, the Borromean Rings have been used for centuries in the Christian religions as a symbol of the Holy Trinity. On a less reverent note, they have also been used as the trademark of a well-known manufacturer of beer and ale.

The mathematician H. Brunn may have been the first to study generalizations of the Borromean Rings, about 1892, and these generalizations are known to some mathematicians as Brunnian links for this reason. What generalizations? Why, one would naturally ask if it is possible to construct four simple closed curves with the Borromean Property: The four are linked together but removal of any one causes the remaining three to fall apart. And why stop with four? Do there exist ten, or twenty, or even a hundred simple closed curves, the totality linked together, but such that the removal of any one causes the rest to fall apart? We shall examine these generalizations as well as others in this chapter.

Exercises
2.1 We listed earlier the four properties of a simple closed curve:
a) No point separates a simple closed curve.
b) Each two-point set separates a simple closed curve.
c) A simple closed curve is one-dimensional.
d) A simple closed curve is bounded.
If there exists a figure in three-dimensional space which is not a simple closed curve and has three of these properties but not the fourth, it shows that the fourth property is essential for an abstract definition of simple closed curve. Can you construct a figure having properties (a), (c), and (d), but not property (b)? What about other combinations?

2.2 Which of the defining properties of a simple closed curve does a straight-line segment have?

2.3 Which of the defining properties of a simple closed curve does an infinite straight line have?

2.4 Which of the defining properties of a simple closed curve does a theta-curve (a figure shaped like the Greek letter θ) have?

2.5 Which of the defining properties of a simple closed curve does a flat two-dimensional circular disk have?

2.6 Construct an example of a wild simple closed curve essentially different from that shown in Fig. 2.3.

2.7 Construct an example of four simple closed curves in three-dimensional space with the property that the removal of a certain one of these causes the other three to come apart, but removal of any one of the other three leaves the remaining three linked together.
2.8 Each of the curves shown in Fig. 2.4 is said to be knotted because neither can be continuously deformed into a circle in three-dimensional space without self-intersection at some stage. On the other hand, a circle or a square is an unknotted simple closed curve. Show that a square is unknotted.
2.9 Your intuition should tell you that a tame simple closed curve is unknotted (see Exercise 2.8) if and only if it can be continuously deformed in three-dimensional space until it lies in a flat plane without self-intersection. Show that this last condition is equivalent to the following: The tame simple closed curve can be continuously deformed in three-dimensional space until it lies on the surface of a round two-dimensional sphere without self-intersection.
2.10 It is difficult to show that each of the curves of Fig. 2.4 is knotted, but you can show that either (and thus both) can be deformed so as to lie on the surface of a two-dimensional torus without self-intersection. Please do so. (A torus is the mathematician's abstraction of the surface of a one-hole doughnut.)
2.2 LINKS AND THEIR PROPERTIES
We have already introduced some technical terms and concepts without definition. We need these definitions to ensure that the writer and the reader agree on precisely what is meant, but we will define these terms ("linking," "falling apart") in such a way as to be as near as possible to the common and natural interpretations of these terms. This is how we have been able so far to use these terms in a very nonmathematical fashion, without previous definition, with some assurance that no confusion as to precise meaning has yet arisen. However, after the definitions, we will begin to use the mathematical terms now current to remind you that these terms have been defined in a certain way, and to prevent the connotations associated with the common terms from creeping in and clouding the issues.

An n-link is a collection of n simple closed curves in three-dimensional space. (Remember that we shall deal only with tame simple closed curves; n is merely a positive whole number.) See Fig. 2.5.

Fig. 2.5 A 4-link.

An n-link is splittable if it is possible to deform it continuously in three-dimensional space in such a way that part of the link lies within $B_1$ and the rest of the link lies within $B_2$, where $B_1$ and $B_2$ are mutually exclusive solid balls in three-dimensional space. Figure 2.6 shows an example of a splittable 3-link.

Fig. 2.6 A splittable 3-link.

Your intuition is in good working order if it tells you that if an n-link composed of curves $C_1, C_2, \ldots, C_n$ is splittable, then no one of the curves $C_i$ can lie partly within $B_1$ and partly within $B_2$. What does happen is this: Some (but not all) of the curves lie entirely within $B_1$, some (but not all) of the curves lie entirely within $B_2$, and each curve lies entirely within exactly one of the two balls. For example, if a 3-link is splittable, then it can be continuously deformed so that two of its curves lie entirely within $B_1$ and the third entirely within $B_2$, where (as before) $B_1$ and $B_2$ are disjoint solid balls.

An n-link composed of the curves $C_1, C_2, \ldots, C_n$ is said to be completely splittable provided that there exist mutually exclusive balls $B_1, B_2, \ldots, B_n$ in three-dimensional space such that the link can be deformed continuously so that each $C_i$ lies entirely within the ball $B_i$. The example of Fig. 2.6 is a splittable 3-link which is not completely splittable, because $C_2$ and $C_3$ cannot be split apart. So all that splittability means is that the link comes at least partly apart; completely splittable links come entirely apart. If we say a link is nonsplittable, we mean that not even one of the curves involved, or any pair, or any combination, can be separated from the rest without cutting.

Finally, given any n-link composed of the curves $C_1, C_2, \ldots, C_n$, a sublink of this link is simply some subcollection of the curves $C_1, C_2, \ldots, C_n$, obtained, if you like, by erasing the ones not in the collection. A k-sublink of the link L is just a k-link which is a sublink of L. This of course makes sense only for $1 \le k < n$, where n is the number of curves used to form L. Figure 2.7 shows on the left a 4-link, and on the right one of its 3-sublinks.

Fig. 2.7 A 4-link and one of its 3-sublinks.

Please note that a collection of curves in a sublink retains the linking properties it enjoyed in the whole link, although removal of some curves to form a
sublink out of the remainder may cause splittability where none existed before. For example, consider the 2-sublinks of the Borromean 3-link.

We may use our new terminology to describe the property of interest that distinguishes the Borromean Ring example from the example shown in Fig. 2.2: The Borromean Rings form an example of a nonsplittable 3-link every 2-sublink of which is completely splittable. And the generalizations we seek may be described in the following way:
a) Does there exist a nonsplittable 4-link such that each of its 3-sublinks is completely splittable?
b) Given a whole number n > 5, does there exist a nonsplittable n-link with each of its (n - 1)-sublinks completely splittable?
c) Given whole numbers n and k with $1 \le k < n$, does there exist a nonsplittable n-link such that each of its k-sublinks is completely splittable but each of its (k + 1)-sublinks is nonsplittable? In particular, does there exist an example of a nonsplittable 4-link with every 3-sublink also nonsplittable, but with every 2-sublink completely splittable?

Exercises
2.11 Is every splittable 2-link also completely splittable? Why?
2.12 Is every splittable 3-link also completely splittable? Explain your answer.
2.13 Show how to construct, for each whole number n > 2, an example of a nonsplittable n-link each of whose sublinks is also nonsplittable.
2.14 Show how to construct, for each whole number n > 3, an example of a nonsplittable n-link one of whose (n - 1)-sublinks is completely splittable. Can this be done in such a way that one and only one of the (n - 1)-sublinks is completely splittable, and all the other (n - 1)-sublinks are nonsplittable?
2.15 Construct infinitely many circles $C_1, C_2, \ldots$ in three-dimensional space forming a nonsplittable link L such that removal of any one of the curves $C_j$ from L produces a splittable link. For the purposes of this exercise, how is the word "link" being used? What is a reasonable definition of "splittable"?

2.3 A SIMPLE ALGEBRA
We shall answer the questions raised at the end of the preceding section by use of an extremely simplified form of analytic geometry. Analytic geometry may be described as the process of assigning algebraic equations or expressions to geometric objects so that we can make deductions which may not be geometrically apparent. The simplification we make lies in our new simple
algebra. We shall use the customary symbols a, b, c, ... of ordinary algebra, but the rules of our algebra are very few in number. Here they are:
1. The only operation is "multiplication." We shall denote the product of a and b by ab.
2. There is a multiplicative identity, which we shall denote by 1. It has the property that for each a, 1a = a = a1.
3. The associative law for multiplication holds; that is, we may ignore parentheses. It will always be true that (ab)c = a(bc), so that we may simplify an expression such as a(b[cd(b)]) to abcdb.
4. Finally, for each symbol a in our algebra, there exists an element denoted by $a^{-1}$ (and pronounced "a-inverse") in our algebra such that $aa^{-1} = 1 = a^{-1}a$.
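The four rules are concrete enough to try out on a computer. What follows is only a sketch under an encoding of our own choosing, not something the text prescribes: a lowercase letter stands for a symbol of the algebra, the corresponding capital letter for its inverse, and the empty string for the identity 1. The names simplify, multiply, and inverse are hypothetical helpers introduced for illustration.

```python
# Assumed encoding (ours, not the text's): "a" is a symbol, "A" is its inverse,
# and the empty string "" is the identity 1. Words contain letters only.
IDENTITY = ""


def simplify(word):
    """Cancel adjacent pairs such as "aA" or "Aa" until none remain (rule 4)."""
    kept = []
    for ch in word:
        if kept and kept[-1] == ch.swapcase():
            kept.pop()  # a symbol met its inverse: together they are 1
        else:
            kept.append(ch)
    return "".join(kept)


def multiply(x, y):
    """Rule 1: the product xy is x followed by y, then simplified."""
    return simplify(x + y)


def inverse(x):
    """Undo x: reverse the order of the symbols and invert each one."""
    return x[::-1].swapcase()


a, b, c = "a", "b", "c"
print(multiply(IDENTITY, a) == a == multiply(a, IDENTITY))             # rule 2: True
print(multiply(multiply(a, b), c) == multiply(a, multiply(b, c)))      # rule 3: True
print(multiply(a, inverse(a)) == IDENTITY == multiply(inverse(a), a))  # rule 4: True
print(multiply(a, b) == multiply(b, a))                                # False
```

The last line prints False: nothing in the four rules permits us to interchange a and b.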
To save space, we shall abbreviate such expressions as aaaa by $a^4$, and the usual laws of exponents can be shown valid: $a^m a^n = a^{m+n}$ and $(a^m)^n = a^{mn}$ for all whole numbers m and n and each element a of our algebra. In particular, $(a^{-1})^n = a^{-n}$ and $aa^{-1} = a^0 = 1$, thus justifying our use of the notation $a^{-1}$ for the inverse of the element a. What we cannot use is the so-called commutative law; in general it will not be true that ab = ba.

The only complicated thing is to determine what the symbols a, b, c, ... represent. They are certainly not whole numbers, nor are they real numbers. Moreover, the operation which we have so blithely called "multiplication" certainly cannot be ordinary multiplication, since the objects we are "multiplying" are not numbers. We need to explain what the symbols a, b, c, and so on represent, and what the meaning of the "product" ab is.

Let A be a circle in three-dimensional space, such as is shown in Fig. 2.8. Warning: So far, "A" is just the name of a circle, and the symbol A will not be an element of our algebra. The object a of our algebra will, however, be closely associated with the circle A, and the association is so natural as to cause us to use almost the same name (differing only in upper as opposed to lower case) for two different things. Imagine a curve in three-dimensional space that starts at the tip of your nose, passes through the circle A once going away from you, curves around, and returns to the tip of your nose. (The tip of your nose is denoted by p in Fig. 2.8.) This new curve is almost the object a of our algebra. But all we care about is the fact that this new curve passes through the circle A a net of one time away from you, and it is the collection of all such curves that
deserves the name a. Thus what the element a of our algebra represents is the net effect of passing through the circle A one time going away, or alternatively, the set of all such curves that start and end at the tip of your nose and pass through the circle A once going away. Your intuition should tell you that if x is any other curve also passing through A a net of one time going away, then the curve x can be continuously deformed to the curve representing a (as shown in Fig. 2.8) without cutting x or the circle A, and in such a way that x never touches the circle A, and the ends of the curve x are never removed from the tip of your nose. Conversely, any curve x which can be so continuously deformed must pass through the circle A a net of one time going away. So if you imagine all possible curves that start at your nose, pass through the circle A a net of one time going away from you, and return to your nose, you may consider each of these as representing the element a of our algebra, since any one can be deformed to any other such in the manner described above.

Fig. 2.8

Now that you know what the symbols a, b, c, ... of our algebra represent, it is time to define the operation of multiplying two of them together. Let A and B be circles in three-dimensional space as shown in Fig. 2.9. The element a of our algebra can be represented by a curve that starts at p, passes through the circle A once going away from you (and not passing through B at all), and finally returns to p. Similarly, the element b can be
represented by a curve that behaves the same way with respect to the circle B. We define the product ab to be represented by each and any curve that has the net effect of "first doing a, then doing b." That is, a typical representative of ab is a curve that starts at p, passes away from you through A, then passes away from you through B, and finally returns to p. It is permissible to remove the end of a and the beginning of b from the point p and then join these ends together close by, as we have shown in Fig. 2.9, in order to clarify the picture.

Fig. 2.9 A curve representing the "product" ab.

A little experimentation with wire circles and two pieces of string should convince you that, at least in this example, $ab \ne ba$; that is, the curve ab cannot be deformed to the curve ba without cutting it or cutting the circles A and B or removing its ends from p. What sort of curve represents $a^{-1}$? Why, each curve that passes once through the circle A toward you. What is the meaning of $a^2$? This stands for those curves that have the net effect of passing through the circle A twice away from you. What about the multiplicative identity 1 of our algebra? That is represented by any curve that passes through no circle at all. These phenomena are illustrated in Fig. 2.10.
Fig. 2.10 Some algebraic illustrations.
It is easy to see that 1a = a = a1 for each element a of our algebra, and that $aa^{-1} = 1 = a^{-1}a$. We are, of course, interpreting equality as "has the same effect as," or "can be continuously deformed to," or "passes through the same curves in the same direction in the same order as." The associativity, which allows us to write abc for either (ab)c or a(bc), will be discussed in the exercises. You should now spend some time convincing yourself that the laws of our simple little algebra in fact hold, when you interpret the objects a, b, c, ... and the operation of "multiplication" as we have done.

From our point of view, one of the most useful things about this algebra is that we can tell when two "objects" or expressions in the algebra are actually equal by a very simple test: First, perform the only algebraic simplifications allowable in the algebra:
Replace $aa^{-1}$ (or $a^{-1}a$) by 1.
Replace 1a (or a1) by a.
Simplify expressions such as aaaa to $a^4$.
Perform all these simplifications on each expression. If the resulting expressions are identical, then the original expressions must have been equal. For example, we may ask whether $aba^2a^{-1}bbcac^{-1}cc$ and [second expression missing from the source] are equal. Each simplifies to $abab^2cac$. Since they have identical simplifications, they are equal.

Exercises
2.16 The definition of "group" is given in Section 4.1. In our simple algebra developed for the study of Brunnian links, we started with a set of objects a, b, c, ..., each representing a way of sending a simple closed curve through some fixed circle in three-dimensional space, and a "multiplication" of these objects, by which the product ab was interpreted to mean the effect of first doing a, then doing b. Verify informally that this set of objects together with this multiplication does indeed satisfy the definition of a group.
2.17 We saw that our simple algebra has the following property: Two strings of symbols are equal if they have identical simplifications. However, there may be equal expressions without identical simplifications. If the circles A and B are linked together as shown in Fig. 2.11, then ab = ba, but this relation cannot be derived from the simplifications listed in the preceding section. Why is ab = ba in this example?
2.18 Let us study just a little more the algebra associated with the 2-link shown in Fig. 2.11. Show that every expression in this algebra is equal to one of the form $a^n b^m$, where n and m are whole numbers (possibly zero; we
interpret $x^0$ as the identity, 1). For example, $b^{-1}a^2b^3a^5 = a^7b^2$. This shows that each expression in this algebra has a "standard form" that looks like $a^n b^m$. Are two expressions in this algebra equal if and only if their standard forms are identical? Explain your answer.

Fig. 2.11 ab = ba.

2.19 We return in this exercise to the general case of an algebra known only to satisfy the group axioms (see Exercise 2.16). Show that $(xy)^{-1} = y^{-1}x^{-1}$. Your proof should work not only in the case that x and y are single symbols in the algebra, but also in the case that x and y stand for more complicated expressions.
2.20 Show that the two expressions $aab^2b^{-1}ca^2b^3b^{-1}a^{-1}b^3$ and [second expression missing from the source] are equal.

2.4 BRUNNIAN 4-LINKS
Although we are going to use a very simple algebra, it will turn out to be a powerful tool in our examination of Brunnian links. You will now see how a fairly complex geometric problem can be transformed into a remarkably simple algebraic one. Suppose you had the task of drawing four simple closed curves to form a nonsplittable link each 3-sublink of which is to be completely splittable.
Fig. 2.12 Three fourths of a (4, 3)-Brunnian link.
Since each 3-sublink is to be completely splittable, it should be possible to draw the desired 4-link by first imagining one of the curves not present. Since the remaining three split completely, they could be drawn as separated circles much as in Fig. 2.12. The only problem would be to draw the fourth curve in such a way as to guarantee that the resulting 4-link would be nonsplittable, but such that all four possible 3-sublinks would be completely splittable. One-fourth of the latter task has already been completed: Clearly, if the fourth curve, not yet drawn, is removed, then the remaining three come apart. You will not be convinced of the difficulty of finding a geometric solution to this problem unless you give it a try, so spend some time trying to draw that fourth curve. Remember, in order to obtain the desired example, it must be true that if any one of the curves A, B, or C is removed from your figure, then the remaining three, two circles and your curve, must come
completely apart. Do not try to draw a circle for the fourth curve; so far as we know, it cannot be done that way. All you need is any tame simple closed curve for the fourth curve.

Let us see if the miniature algebra we have developed will serve to tell us why the Borromean Rings form a nonsplittable 3-link each 2-sublink of which is completely splittable. Perhaps this will provide some insight that will help in the construction of other examples. If you carefully draw apart the upper two rings of the Borromean example shown in Fig. 2.1, you will obtain a figure much like the one shown in Fig. 2.13. We have two separated circles A and B, which is what we might well want to start with if we wanted to invent the example of the Borromean rings. The tip of your nose is shown as the point p on the curve C, for we wish to calculate the algebraic formula of this curve. It first passes away from you through A, then toward you through B, then again toward you through A, and finally away from you through B. The algebraic formula we have learned to associate with such a curve is $ab^{-1}a^{-1}b$.

Fig. 2.13 The Borromean rings again.

This simple formula tells us a great deal about the Borromean rings. First, since no simplification of the formula $ab^{-1}a^{-1}b$ is possible, it does not reduce to 1. (This does not contradict Exercise 2.17, since the curves A and B are not linked.) Hence the curve representing this formula actually links the union of the curves A and B and cannot be removed without cutting.
Second, observe the effect of cutting and removing the circle A. The effect on the formula $ab^{-1}a^{-1}b$ is both simple and striking. When the circle A is taken away, the effect on the formula is to delete all occurrences of the symbol a (and $a^{-1}$ as well). The formula becomes $b^{-1}b$, which simplifies to 1. This shows that if A is removed, then B and C are completely splittable. Similarly, if B is removed, then we obtain $aa^{-1} = 1$, and so A and C are completely splittable. By our construction, since we drew A and B already separated, A and B split completely if C is removed. This shows that every 2-sublink of the Borromean rings is completely splittable. If the 3-link itself were splittable it would then be completely splittable by the above discussion. But this is not the case since $ab^{-1}a^{-1}b \ne 1$. Hence the 3-link is nonsplittable. This demonstrates that the Borromean rings actually do provide an example of a nonsplittable 3-link each 2-sublink of which is completely splittable.

The one item of crucial importance about the formula $ab^{-1}a^{-1}b$, the one phenomenon that makes the example work, is this: The deletion of all occurrences of any one symbol in the formula causes the formula to collapse to 1. What this means is that if any one circle is removed from the example of the Borromean rings, then the remaining two come apart.

To apply similar algebraic methods to the construction of a nonsplittable 4-link each 3-sublink of which is completely splittable, let us return to Fig. 2.12 and simply ask what sort of formula the fourth curve we must draw should have. First, the formula should involve all three of the symbols a, b, and c, so that the four curves together will form a nonsplittable link. Second, the formula must not collapse to 1. This too is needed to ensure that the 4-link is nonsplittable. Finally, the formula must have the property that the deletion of all occurrences of any one symbol (either a, b, or c) will cause the formula to collapse to 1. This will, as in the example of the Borromean rings discussed above, guarantee that each 3-sublink is completely splittable.

Exercise
2.21 Can you write down a formula involving all three symbols a, b, and c, such that the formula does not collapse to 1, but the deletion of all a's, or all b's, or all c's, does cause the resulting formula to collapse to 1? Or, alternatively, can you prove this impossible, thus demonstrating there is no analogy to the Borromean rings with four simple closed curves? We are sure you will be able to discover the answer for yourself, and in order for you to enjoy fully the thrill of mathematical discovery, we ask you not to read beyond this paragraph until you have worked on this problem for a while. Go back and look at the formula associated with the third curve in the Borromean rings; it will probably help.
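The three requirements placed on the formula in this section lend themselves to a mechanical check, which may be handy for testing your own candidates in Exercise 2.21. The sketch below is ours and rests on assumptions not found in the text: a lowercase letter encodes a symbol, the capital letter encodes its inverse, and the function names are hypothetical. It follows the text's criterion for separated circles, treating a formula as equal to 1 exactly when it simplifies to the empty word. It is tried here only on the Borromean formula, so it gives nothing away.

```python
def simplify(word):
    """Cancel adjacent symbol/inverse pairs such as "aA" until none remain."""
    kept = []
    for ch in word:
        if kept and kept[-1] == ch.swapcase():
            kept.pop()
        else:
            kept.append(ch)
    return "".join(kept)


def delete_symbol(word, symbol):
    """Remove a circle: drop every occurrence of the symbol and of its inverse."""
    return "".join(ch for ch in word if ch.lower() != symbol)


def is_brunnian_formula(word, symbols):
    """Check the three requirements the text places on the last curve."""
    uses_all = all(s in word.lower() for s in symbols)
    nontrivial = simplify(word) != ""
    collapses = all(simplify(delete_symbol(word, s)) == "" for s in symbols)
    return uses_all and nontrivial and collapses


borromean = "aBAb"  # encodes a b^-1 a^-1 b, the curve C of Fig. 2.13
print(simplify(delete_symbol(borromean, "a")) == "")  # True: b^-1 b = 1
print(is_brunnian_formula(borromean, "ab"))            # True
print(is_brunnian_formula("ab", "ab"))                 # False: deleting a leaves b
```

To test a candidate for Exercise 2.21, encode it in the same way and call is_brunnian_formula(candidate, "abc").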
2.5 NOTATION AND EXAMPLES
Let us introduce at this point a small bit of notation that will serve as a useful abbreviation. By an (n, k)-Brunnian link we mean a link of n simple closed curves in three-dimensional space such that each k-sublink is completely splittable, but each (k + 1)-sublink, and each (k + 2)-sublink, and so on up to the n-link itself, is nonsplittable. Thus an (8, 5)-Brunnian link would be a collection of eight simple closed curves such that the set of all eight, or any seven, or any six, would be nonsplittable, but any 5-sublink would be completely splittable. Clearly, if each 5-sublink is completely splittable, then so is any 4-sublink, any 3-sublink, and any 2-sublink. Of course, it makes sense to talk about an (n, k)-Brunnian link only if $1 \le k < n$; if k = 1 the definition still makes sense: Fig. 2.2 shows a (3, 1)-Brunnian link. And, of course, the Borromean rings form a (3, 2)-Brunnian link.

We have previously asked you to construct, first geometrically and then algebraically, an example of a (4, 3)-Brunnian link. If you have succeeded in doing this, we invite you to try a much harder problem: the construction of a (4, 2)-Brunnian link. Here is the hint or observation that may simplify the task: If a (4, 2)-Brunnian link exists, then the removal of any one curve from this link should produce a (3, 2)-Brunnian link. Thus one should perhaps begin by drawing a (3, 2)-Brunnian link and then attempt to find where to put the fourth curve so as to produce a (4, 2)-Brunnian link. Make use of the algebra we developed specifically to solve problems of this sort.

Let us return to the problem of producing a (4, 3)-Brunnian link, starting with the three circles A, B, and C as shown in Fig. 2.12. As we hope you have discovered, there does indeed exist a formula using the symbols a, b, and c such that the deletion of all occurrences of any one of these symbols causes the resulting formula to collapse to 1. One such formula is
$aba^{-1}b^{-1}cbab^{-1}a^{-1}c^{-1}$. The presence of the "subformula" $aba^{-1}b^{-1}$, so like the one associated with the Borromean rings, is no coincidence. As soon as this is noted, the pattern becomes clear for construction of even larger links, such as the (5, 4)-Brunnian link. To construct a (5, 4)-Brunnian link, first draw four separated circles A, B, C, and D. Find a formula, such as
$aba^{-1}b^{-1}cbab^{-1}a^{-1}c^{-1}dcaba^{-1}b^{-1}c^{-1}bab^{-1}a^{-1}d^{-1}$, with the property that all four symbols a, b, c, and d appear in the formula, and such that the deletion of all occurrences of any one of these symbols causes the resulting formula to collapse to 1. We have not chosen the shortest such formula above for the (5, 4)-Brunnian link, but it is one which continues
the pattern already established. If we then draw in a fifth curve with this formula, we can argue (as we did in the case of the Borromean rings) that what is produced is indeed a (5, 4)-Brunnian link. You will also have seen that the number of curves in these constructions is not material, except that large numbers of such curves produce rather long formulas. But since the formulas have the desired properties it follows that for each whole number n > 2, an (n, n - 1)-Brunnian link exists. Thus it is possible to construct, say, one hundred simple closed curves in three-dimensional space forming a nonsplittable link, but such that the removal of any one of these curves causes the remaining ninety-nine to split completely.

Exercises
2.22 Draw a (2, 1)-Brunnian link.
2.23 Draw a (4, 3)-Brunnian link.
2.24 Draw a (5, 4)-Brunnian link.
2.25 Find a shorter formula than that given in Section 2.5 for the fourth curve in a (4, 3)-Brunnian link.
2.26 Draw a (3, 2)-Brunnian link essentially different from the example of the Borromean rings shown in Fig. 2.1.
2.27 Show that an (n, 1)-Brunnian link can be constructed for each whole number n > 2. Compare this with Exercise 2.13.
2.28 If you have not already done so, construct an example of a (4, 2)-Brunnian link.
2.29 For appropriate choice of x and y, the formula for the third curve in a (3, 2)-Brunnian link was seen to have the form $xyx^{-1}y^{-1}$. Is this also true of the formula given in Section 2.5 for the fourth curve in a (4, 3)-Brunnian link? See also Exercise 2.19.
2.30 Construct four simple closed curves in three-dimensional space, say A, B, C, and D, with the following rather unsymmetrical linking property: The removal of A causes the remaining three to split completely, but of B, C, and D, exactly two must be removed to cause A and the third to split, and if only one of B, C, or D is removed, then the remaining three form a nonsplittable link.

2.6 COMMUTATORS
The frequent repetition of a mathematical concept justifies calling attention to that concept and supplying it with a name. You have undoubtedly noticed by now the frequent occurrence of a string of symbols of the form $xyx^{-1}y^{-1}$. In addition, you may have noticed that in our simple algebra, if we wish to construct the inverse of a string of symbols such as $ab^2a^{-1}cab^{-3}$,
the proper way to do it is to write the string of symbols in the reverse order, changing the sign of each exponent. (The exponent of a is 1.) Thus, in the example just mentioned, the inverse would be $b^3a^{-1}c^{-1}ab^{-2}a^{-1}$, for the product of the two will clearly produce 1. In general, we may state as a useful law of exponents in our algebra that $(xy)^{-1} = y^{-1}x^{-1}$ even if one or the other or both of x and y are strings of symbols rather than a single symbol.

Let us abbreviate by (x, y) the string of symbols $xyx^{-1}y^{-1}$. This object is a victim of frequent study in modern mathematics, and is called the commutator of x and y. Suppose we are given three symbols a, b, and c, and form first the commutator of a and b, and then form the commutator of that string of symbols with the element c. Symbolically, we would express this as ((a, b), c), which expands to the expression $aba^{-1}b^{-1}cbab^{-1}a^{-1}c^{-1}$. It is no coincidence that this is the formula by which we earlier produced a (4, 3)-Brunnian link, and indeed if you expand the multiple commutator (((a, b), c), d), you will obtain the formula for the fifth curve in a (5, 4)-Brunnian link. The reason that commutators produce this effect is this: Any of the multiple commutators we have been writing down has the property that if all occurrences of any one symbol are deleted, then the entire commutator collapses to 1. This is precisely the property needed to produce (n, n - 1)-Brunnian links.

Let us turn our attention to the problem of producing a (4, 2)-Brunnian link. As we have already observed, a reasonable starting point would be to draw a (3, 2)-Brunnian link; that is, the Borromean rings. We do this because if any one curve is deleted from a (4, 2)-Brunnian link, what is left is a (3, 2)-Brunnian link. So we draw the three, and puzzle over exactly how to draw the fourth curve. In terms of our algebra, if the three we draw are called A, B, and C, what is required is a formula using all the symbols a, b, and c, such that the deletion of all occurrences of any one of these symbols causes the formula to collapse, not to 1, but to the formula for a (3, 2)-Brunnian link. In other words, can we use the symbols a, b, and c to write a formula in which the deletion of all occurrences of any one symbol does not cause the formula to collapse to 1, but one in which the deletion of all occurrences of any two symbols does cause the formula to collapse to 1? If we could find such a formula, and draw in a fourth curve representing this formula, we would have a (4, 2)-Brunnian link as desired.

Here is where the concept of the commutator of two symbols proves useful. If the formula involves the commutator (a, b), then at least this much will remain if all occurrences of c in the formula are deleted, and some care is
taken in the construction of the formula. If in addition this were all that remained, then after removing curve C, we would have exactly what we want: a (3, 2)-Brunnian link. So let us first write down the commutator of a and b, and be sure that c is involved in any other commutators we write down so as to guarantee the disappearance of all these other commutators if all occurrences of c are deleted. At this point you may have already guessed what will happen next. About the only other sorts of things we can construct that are commutators involving the symbol c are the commutators (a, c) and (b, c). But then we notice that the product of all three has the desired property: $(a, b)(a, c)(b, c) = aba^{-1}b^{-1}aca^{-1}c^{-1}bcb^{-1}c^{-1}$, which will collapse to one of these three commutators if all occurrences of any one symbol are deleted. Whether we remove the curve A, or B, or C, what is left will be a (3, 2)-Brunnian link. Hence drawing a fourth curve representing the above formula will produce a (4, 2)-Brunnian link.

Exercises
2.31 Show that the inverse of (x, y) is (y, x).
2.32 Prove that (x, y) = 1 if and only if xy = yx.
2.33 Show that the deletion of any one symbol in all its occurrences from the multiple commutator (((a, b), c), d) causes the resulting expression to collapse to 1.
2.34 Use the ideas of this section to construct a (5, 2)-Brunnian link. Note: you should start by drawing a (4, 2)-Brunnian link.
2.35 Use the methods of this section and your experience with Exercise 2.34 to show that an (n, 2)-Brunnian link can be constructed for any whole number n > 3.

2.7 THE FINAL GENERALIZATION
At this point we have exhausted all possibilities for 4-links. As we have shown, it is not too hard to produce a (4, 3)-Brunnian link; it is easy to produce a (4, 1)-Brunnian link; and we finally succeeded in the last section in producing a (4, 2)-Brunnian link. It would be reasonable at this stage to guess that for any whole numbers n and k for which the notation makes sense, an (n, k)-Brunnian link can be constructed. But let us first examine the next case still open, that of a (5, 3)-Brunnian link.

As before, let us start with all but one of the curves already drawn. If we imagine a (5, 3)-Brunnian link with one of the curves removed, what we must have is a (4, 3)-Brunnian link. However, we know that a (4, 3)-Brunnian link can be constructed, so let us draw one. Let the four curves
involved be called A, B, C, and D. Then, as before, the problem is to draw a fifth curve E with certain properties, but this amounts only to finding a formula for E, a formula involving all the symbols a, b, c, and d, such that the deletion of all occurrences of any one of these symbols from the formula produces the formula for a (4, 3)-Brunnian link.

The formula for producing a (4, 3)-Brunnian link is itself a commutator, but a slightly more complicated commutator than we used in the (n, 2)-Brunnian examples (see Exercise 2.35). Recall that if separated circles A, B, and C are drawn, then drawing a fourth simple closed curve D according to the formula ((a, b), c) will produce a (4, 3)-Brunnian link, as in Fig. 2.14.

Fig. 2.14 A (4, 3)-Brunnian link.

To produce a (5, 3)-Brunnian link, we must provide a formula using the four symbols a, b, c, and d, so that (if possible) the deletion of all occurrences of any one of these symbols produces a commutator in three of these symbols, such as our original commutator ((a, b), c). For example, if a were deleted, it would be sufficient to have the formula collapse to ((b, c), d); if b were deleted, we would like to have the formula collapse to
Brunnian links
((a, c), d). But observe that even these more complicated commutators, such as ((a, b), c), do collapse to 1 upon the deletion of any symbol present in them. Therefore, as we did for the construction of the (n, 2)-Brunnian links, we write the product of all possible commutators involving three of the symbols a, b, c, and d, and obtain

((a, b), c)((a, b), d)((a, c), d)((b, c), d).

If all occurrences of any one symbol, such as c, are deleted, all the commutators involving that symbol collapse to 1. What we have left is the single commutator ((a, b), d), not involving the symbol c. But this shows that if we draw the fifth curve with the above formula, we have indeed produced a (5, 3)-Brunnian link.

We need but a single definition here, for the purpose of making our ideas easier to express. If a, b, c, ..., are symbols in our algebra, and k is a whole number at least 2, then a k-commutator of these symbols is simply one of the multiple commutators we have been considering, one that involves exactly k distinct symbols drawn from our algebra. For instance, using the above symbols, one example of a 4-commutator is (((a, b), c), d). It is important to note that a k-commutator has the property that if all occurrences of any one symbol are deleted from it, then the formula that remains will collapse to 1.

Let us see how the concept of a k-commutator will serve us in guessing the general procedure necessary to produce an (n, k)-Brunnian link for any meaningful choice of n and k. Suppose we first wished to draw a (3, 2)-Brunnian link. We draw two separated circles, label them A and B, and consider the associated symbols a and b in our algebra. Since we wish to form a (something, 2)-Brunnian link, we write down all possible 2-commutators using the symbols a and b. There is no distinction between $aba^{-1}b^{-1}$ and (for example) $b^{-1}aba^{-1}$ for our purposes. All we need to do is to write down only one such, say $aba^{-1}b^{-1}$. We then draw in a third curve with this formula and obtain a (3, 2)-Brunnian link.

Similarly, to produce a (4, 2)-Brunnian link, we first produce a (3, 2)-Brunnian link, label the three curves A, B, and C, and consider the three associated symbols a, b, and c in our algebra. We form all possible 2-commutators using the symbols a, b, and c, and write their product

(a, b)(a, c)(b, c).

If a fourth curve is drawn according to this formula, we obtain a (4, 2)-Brunnian link.

To produce a (4, 3)-Brunnian link, we draw three separated circles, call them A, B, and C, and form the product of all possible 3-commutators on the three symbols a, b, and c. There is only one, ((a, b), c). If a fourth curve is drawn with this formula, then a (4, 3)-Brunnian link is formed.
To produce a (5, 3)-Brunnian link, we first construct a (4, 3)-Brunnian link, call the four curves of this link A, B, C, and D, and then write the product of all possible 3-commutators using the four symbols a, b, c, and d. When a curve is drawn with this forty-symbol formula, a (5, 3)-Brunnian link is produced.

The following is the general procedure, and you may verify for yourself that it works. Given an (n, k)-Brunnian link, here is how to produce an (n + 1, k)-Brunnian link. Label the n simple closed curves of the (n, k)-Brunnian link $C_1, C_2, C_3, \ldots, C_n$. Let the algebraic symbols associated with these curves be called $c_1, c_2, c_3, \ldots, c_n$. Form the product of all possible k-commutators using these n symbols. Draw a curve with this formula. The resulting link will be an (n + 1, k)-Brunnian link.

For example, to produce a (7, 4)-Brunnian link, first construct a (5, 4)-Brunnian link. Label the five curves A, B, C, D, and E. Form the product of all possible 4-commutators using the five symbols a, b, c, d, and e. (Of the five possible, one such 4-commutator would be (((a, b), c), d).) Draw a sixth curve with this formula; label it F. This will give you a (6, 4)-Brunnian link. Form all possible 4-commutators of the six symbols a, b, c, d, e, and f, and multiply these commutators together. Draw a seventh curve with this formula. You will then have a (7, 4)-Brunnian link, as desired.

In general, the method for constructing an (n, k)-Brunnian link is this: Start with a (k + 1, k)-Brunnian link. By using k-commutators at each stage, form next a (k + 2, k)-Brunnian link, form from that a (k + 3, k)-Brunnian link, and continue the process until the desired (n, k)-Brunnian link is obtained. This establishes the only theorem of this chapter, which in conclusion we state below.
Theorem 2.1 For any whole numbers n and k for which the notation makes sense ($1 \le k < n$), there does exist an (n, k)-Brunnian link.
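The bookkeeping behind the construction can be tried out mechanically. What follows is only a sketch under our own assumptions: a formula is stored as a list of (symbol, exponent) pairs, and the names `reduce_word`, `k_commutator`, `brunnian_formula`, and `delete_symbol` are hypothetical helpers invented for this illustration. The sketch forms the product of all 3-commutators on a, b, c, and d, and checks what remains when one symbol is deleted; it says nothing about how the curve itself is drawn.

```python
from itertools import combinations

def inverse(word):
    """Inverse of a word: reverse the order and flip every exponent."""
    return [(s, -e) for (s, e) in reversed(word)]

def reduce_word(word):
    """Cancel adjacent pairs such as a a^-1 until none remain."""
    out = []
    for s, e in word:
        if out and out[-1] == (s, -e):
            out.pop()            # the new symbol cancels the previous one
        else:
            out.append((s, e))
    return out

def commutator(u, v):
    """The commutator (u, v) = u v u^-1 v^-1, with cancellations removed."""
    return reduce_word(u + v + inverse(u) + inverse(v))

def k_commutator(symbols):
    """The nested commutator (((s1, s2), s3), ...) on the given symbols."""
    word = [(symbols[0], 1)]
    for s in symbols[1:]:
        word = commutator(word, [(s, 1)])
    return word

def brunnian_formula(symbols, k):
    """Product of one k-commutator for each k-element choice of symbols."""
    word = []
    for subset in combinations(symbols, k):
        word += k_commutator(subset)
    return word

def delete_symbol(word, symbol):
    """Delete all occurrences of one symbol and let the rest collapse."""
    return reduce_word([(s, e) for (s, e) in word if s != symbol])

formula = brunnian_formula("abcd", 3)
print(len(formula))                                         # 40 symbols
print(delete_symbol(formula, "c") == k_commutator("abd"))   # True
```

Deleting c leaves exactly the single commutator ((a, b), d), as the argument above predicts; the same check can be repeated for each of the other three symbols.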
Exercises

2.36 Let S be an expression in our algebra; that is, S is a string of symbols with various exponents all multiplied together. Let a be an element of the algebra. Show that even though $aSa^{-1}$ need not be the same as S, the two do in some sense represent the same curve.

2.37 Find a shorter formula than that given in Section 2.5 for a (5, 4)-Brunnian link.

2.38 Prove that the deletion of all occurrences of a single symbol from a k-commutator causes the resulting formula to collapse to 1. (You may use proof by induction on the integer k; see Exercise 1.22.)

2.39 In what sense is our proof of Theorem 2.1 a proof by induction?
2.40 Sometimes it is difficult or impossible to draw certain examples of Brunnian links without having some of the curves pass over or under themselves. This may cause such curves to be knotted. Does this matter? Can it be prevented? How?

2.41 At certain points in the construction of Brunnian links one must write all possible essentially distinct k-commutators using n symbols, where 2 < k < n. This number of different k-commutators is the same as the number of different k-element subsets of a set with n elements. For sets, we may even allow k to take on the values 0 and 1, and as an example, let us consider a set with four elements. The number of subsets with 0, 1, 2, 3, and 4 elements is, respectively, 1, 4, 6, 4, and 1. In what other context have you seen this sequence? Can you obtain a general formula for calculating the number of k-element subsets of a set with n elements? See Exercise 1.22 for one method of attack.
2.42 Make up another problem like Exercise 2.30, but harder. 2.43 Make a Mobius strip by cutting a fairly long and narrow strip of paper, and then attaching the ends with glue or tape after giving the strip a half-twist. Show that the strip has only one side and only one edge.
2.44 Continuing the previous exercise, start drawing a line on the Mobius strip, starting at a point one-third of the way from one edge and continuing around the strip, always staying one-third of the way from that edge, until you reach the starting point. Predict what will happen if the Mobius strip is now cut in two down its center line. How many edges will the new object or objects have? How many strips will result? Verify your predictions by the experimental method. 2.45 Continuing the previous two exercises, what do you think will happen if a Mobius strip is cut in two by keeping the scissors always one third of the way from one edge, until the starting point is reached?
2.46 Repeat the previous three exercises for a strip with two half-twists. 2.47 Show that a strip with three half-twists has but one edge and one side, and that the edge is knotted.
2.48 Since a Mobius strip has but one edge, and that edge is a simple closed curve, one could imagine sewing two such strips together along the edges. One would obtain a surface with no boundary edge-like a sphere, but also very unlike a sphere in one way. How would the resulting surface differ from a sphere? 2.49 Can a surface have one side and two edges? Two sides and one edge? What are the impossible combinations?
2.50 A currently unsolved problem in mathematics is to find the answer to the following question: Does every simple closed curve in the plane contain the
vertices of a square? One method of attack might be to show that every simple closed curve in the plane contains the vertices of a parallelogram; another approach might be to restrict the sorts of simple closed curves under consideration. For an example of the latter approach, can you show that every triangular simple closed curve contains the vertices of a square? That is, given a triangle, is it always possible to construct a square whose vertices lie on the boundary of the triangle?

NOTES AND REFERENCES
The first examples of (n, n - 1)-Brunnian links were given by H. Brunn in his paper "Über Verkettung", published in Sitzungsberichte der Bayerischen Akademie der Wissenschaften, Mathematische-Physikalische Klasse, Vol. 22 (1892), pages 77-99. Hans Debrunner published in Vol. 28 (1961) of the Duke Mathematical Journal, pages 17-23, a paper entitled "Links of Brunnian Type," in which he showed that Brunn's examples had the properties claimed for them by Brunn, and further generalized these examples to show the existence of (n, k)-Brunnian links for arbitrary n and k, $1 \le k < n$. D. E. Penney's paper "Generalized Brunnian Links," published in Vol. 36 (1969) of the Duke Mathematical Journal, provided an alternative construction of (n, k)-Brunnian links.

This chapter is in essence providing sufficient material on homotopy theory for the elementary study of links. Additional material for the advanced student may be found, for example, in I. M. Singer and John A. Thorpe's Lecture Notes on Elementary Topology and Geometry, published by Scott, Foresman in 1967. Other material on links and knots is found in Crowell and Fox's Introduction to Knot Theory, published in 1963 by Ginn and Company, and in Topology of 3-Manifolds (edited by M. K. Fort, Jr.), published by Prentice-Hall in 1962. The existence of wild knots, such as the one shown in Fig. 2.3, is a consequence of the thesis of L. Antoine, published in 1921.

The study of knots, links, and wild curves and surfaces in general, has become in recent years an exciting and rapidly advancing branch of modern mathematics. In particular, although the structure of surfaces is now well understood, the problem of describing high-dimensional figures is much more difficult, and to date only very partial answers to most questions have been obtained.
CHAPTER 3 THE WELL-TEMPERED CLAVICHORD
As you probably know, the "octave" on a piano has this name because, counting both ends, it contains eight keys; these and the black keys between some of them are named as shown in Fig. 3.1. If we count black keys as well, and only the essentially different notes, there are actually twelve different keys in an octave, and most music of the Western Hemisphere is written using these and only these twelve different notes. For this reason, such music is referred to as twelve-tone music; but you are probably aware that music can be written in other systems.

We shall try to answer three questions in this chapter. First, why are there twelve different notes in an octave on the piano? Second, what was the proposed modification in the twelve-tone system used in the time of Johann Sebastian Bach, a modification that he sought to call attention to by the publication of his Wohltemperiertes Klavier? Finally, are further improvements in the scale system we now use possible, and if so, how can such improvements be effected?

Curiously enough, some information pertaining to the answers to all three of these questions is supplied by the methods of continued fractions, not an especially new branch of mathematics, but one which has begun to enjoy valuable applications in fields where accurate approximations to irrational numbers are needed, such as in the use of high-speed computers. Moreover, the use of continued fractions to help answer the above questions is particularly simple, and the only prerequisite for this particular application is a degree of familiarity with logarithms.
Fig. 3.1 The names of the notes on the piano. (White keys: C, D, E, F, G, A, B, C', D', E', F'. Black keys: C#, D#, F#, G#, A#, C#', D#', F#'.)
Partly to remind you of the properties of logarithms, and partly so that the necessary properties will be listed in this book for your convenience, we shall first take up some topics in logarithms, leaving some proofs for the exercises.

3.1 PROPERTIES OF LOGARITHMS
First, if b is any positive number other than 1, and a is any positive number whatsoever, the equation

$$b^x = a$$

always has a unique solution x, which may be positive, negative, or zero. This number x is called the logarithm of a to the base b, and we write

$$x = \log_b a.$$

Thus, $\log_b a$ is the power to which the base b must be raised in order to obtain the number a. For example, $\log_{10} 1000 = 3$ and $\log_2 32 = 5$
Fig. 3.2 The graph of $y = \log_{10} x$.
because $10^3 = 1000$ and $2^5 = 32$.
The graph of $y = \log_{10} x$ is shown in Fig. 3.2. Note that the graph of $y = \log_{10} x$ passes over only the positive x-axis; this indicates that only positive numbers have logarithms. Also, the graph passes through the point (1, 0), since $10^0 = 1$; the graph of $y = \log_b x$ will have this property for any acceptable value of b (namely, b positive but not equal to 1). Also, the graph of $y = \log_{10} x$ is always increasing; that is, if $0 < x_1 < x_2$, then $\log_{10} x_1 < \log_{10} x_2$. This is also the case for any other value of b so long as b exceeds 1.
One useful application of logarithms is in quickly finding approximate answers to problems involving the products and quotients of many, or large, numbers; another especially useful application is found in obtaining good approximations to such numbers as $(17)^{35.61}$. Because of this application, the base b = 10 is used for the tables of logarithms so frequently found in the backs of textbooks in algebra, trigonometry, and engineering mathematics. By choice of base 10, it becomes especially easy to locate the decimal point in the final answer. However, the laws of logarithms which make such shortcut computations possible are independent of the choice of base, and theoretically any other positive number (except 1) will do as well. The answers obtained by such computations in any case will usually be only approximations, because although you may see from a table of logarithms that $\log_{10} 2$ is given as 0.30103, actually this number has been rounded off to fit in the table; $\log_{10} 2$ is in fact an infinite nonrepeating decimal. Hence $10^{0.30103}$ is not equal to 2, but will be very close to 2.

The mathematical properties of logarithms which make such computations possible follow from the ordinary laws of exponentiation for real numbers, given below. For values for which the following expressions are defined, we have always:

$$a^x a^y = a^{x+y}, \qquad (a^x)^y = a^{xy}, \qquad a^0 = 1 \ (\text{for } a \ne 0), \qquad a^{-x} = \frac{1}{a^x}.$$
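If you have a computer at hand, a short numerical check may make the remark about rounding more concrete. This is only a sketch: it assumes the laws of exponents take their usual form, the sample values of a, x, and y are arbitrary choices of ours, and floating-point arithmetic is itself only approximate, so the comparisons are made up to round-off.

```python
import math

a, x, y = 1.7, 2.5, -0.75   # arbitrary sample values, chosen only for illustration

# Each law of exponents is checked up to floating-point round-off.
assert math.isclose(a**x * a**y, a**(x + y))
assert math.isclose((a**x)**y, a**(x * y))
assert a**0 == 1
assert math.isclose(a**(-x), 1 / a**x)

# The tabulated value 0.30103 is a rounded logarithm, so 10**0.30103
# comes out very close to 2 without being equal to it.
approx_two = 10**0.30103
print(approx_two)
print(math.log10(2))   # more digits of the logarithm than the table carries
```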
From these laws may be derived the various laws of logarithms. We need only two; the others are given in the exercises. Here are the two: Let b be a positive number other than 1. Then, for all positive numbers x and y,

$$\log_b xy = \log_b x + \log_b y.$$

Also, for every positive number x and every number y,

$$\log_b x^y = y \log_b x.$$

It happens that these are exactly the two properties which also make the approximate calculations mentioned earlier possible. For example, suppose we wish to find, using logarithms, the approximate value of the product of 82,425 and 46,037. Using a table of logarithms to the base 10, we find that $\log_{10} 82{,}425 = 4.91606$ and $\log_{10} 46{,}037 = 4.66311$.
Thus

$$\log_{10} (82{,}425)(46{,}037) = \log_{10} 82{,}425 + \log_{10} 46{,}037 = 4.91606 + 4.66311 = 9.57917.$$

Using our table of logarithms in reverse to find what number has its logarithm equal to 9.57917, we find that number to be 3,794,600,000 (the five zeroes at the end of this number mean that our table is not sufficiently accurate for us to be sure of the last five digits). The correct answer to the problem is 3,794,599,725. This problem takes almost as long to do with logarithms as actually multiplying the original two numbers together, but in a problem involving several products and quotients, logarithms can immensely shorten the time needed for the computation. The price paid is, of course, the sacrifice in accuracy.

For another example, suppose we wish to find an approximate value for $2^{50}$. This would take quite a while to multiply out by hand, but we find the logarithm of 2 in our table to be 0.30103, and thus

$$\log_{10} 2^{50} = 50 \log_{10} 2 = 50 \cdot 0.30103 = 15.05150.$$

Again using the logarithm table in reverse, we find that the number whose logarithm is 15.05150 is 1,125,900,000,000,000 (where, again, the eleven zeroes indicate that only the first five digits are reliable). Here the correct answer is 1,125,899,906,842,624.

Exercises
3.1 Why can we not use the positive number 1 as a base for taking logarithms?

3.2 Sketch the graph of $y = \log_2 x$.

3.3 Sketch the graph of $y = \log_{1/2} x$.

3.4 Show that if 0 < b < 1, then the graph of $y = \log_b x$ is always decreasing. You may assume that the graph of $y = \log_b x$ is increasing if 1 < b.

3.5 Show that if 0 < b and $b \ne 1$, then the graph of $y = \log_b x$ passes through the point (1, 0).

3.6 Use logarithms to the base 2 to compute the value of $4 \cdot 8$.

3.7 Use logarithms to the base 2 to compute the value of $4^3$.
3.8 Use the laws of exponents stated in this section to prove that $\log_b xy = \log_b x + \log_b y$.

3.9 Use the laws of exponents stated in this section to prove that $\log_b x^y = y \log_b x$.

3.10 We defined $x = \log_b a$ to mean that $b^x = a$. Can you use this definition in reverse, as a definition of the meaning of $b^x$, to prove the laws of exponents? Of course, you may use the properties of logarithms given in the previous two exercises.

3.11 Let a, b, and c be numbers for which the expressions below are meaningful. Prove that

$$(\log_a b) \cdot (\log_b c) = \log_a c.$$

Hint: Let $x = \log_a b$, $y = \log_b c$, and $z = \log_a c$. Then $a^x = b$, and so on.

3.12 Let b be a positive number not equal to 1, c a positive number, and a a positive number not equal to 1. Simplify the following expressions:

$$\log_b (1/b), \qquad \log_b bc, \qquad (\log_b a) \cdot (\log_a b).$$

3.13 Let b and c be positive numbers, neither equal to 1. Show that

$$\log_c b = \frac{1}{\log_b c}.$$

3.14 Let b be a number larger than 1; if you want to be specific, you may even assume for the purpose of this exercise that b = 10. One law of exponents is that if x and y are any two numbers such that x < y, then $b^x < b^y$. Use this fact to show that also, if x < y and both x and y are positive, then $\log_b x < \log_b y$.

3.15 You will need a table of logarithms for this exercise. Solve for x: $2^x = 3$.

3.2 A PECULIAR MANIPULATION
Let us assume that all logarithms to be used from now on are to the base b = 10, since tables of such logarithms are the most readily available, and we may then abbreviate $\log_{10} x$ by log x. Later, one of the most important computations we shall want to perform will involve expressing a number such as (log 3)/(log 2), which exceeds 1, as the sum of a whole number and a number between 0 and 1. We know that (log 3)/(log 2) exceeds 1 by virtue of Exercise 3.14: since 2 < 3, also log 2 < log 3, and hence (log 3)/(log 2) > 1.
Here is the computation by which we can express (log 3)/(log 2) as the sum of a whole number and a number between 0 and 1. Note how the laws of logarithms mentioned in Section 3.1 are used.

$$\frac{\log 3}{\log 2} = \frac{\log [(2) \cdot (3/2)]}{\log 2} = \frac{\log 2 + \log (3/2)}{\log 2} = \frac{\log 2}{\log 2} + \frac{\log (3/2)}{\log 2} = 1 + \frac{\log (3/2)}{\log 2}.$$

Now 3/2 < 2, and hence log (3/2) < log 2. Thus we have obtained the desired result provided both log (3/2) and log 2 are positive, so that their quotient does lie between 0 and 1. You will see in the exercises at the end of this section that it is very easy to determine that both log (3/2) and log 2 are positive, and so we have achieved the desired result. Given the number (log 3)/(log 2), we have expressed it as the sum of a whole number and a number between 0 and 1, in the form

$$\frac{\log 3}{\log 2} = 1 + \frac{\log (3/2)}{\log 2}.$$
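The same splitting can be checked numerically. The sketch below is an illustration of ours rather than part of the method: `split_whole_and_fraction` is a hypothetical helper, and it relies on the floating-point logarithms of Python's `math` module in place of a printed table.

```python
import math

def split_whole_and_fraction(value):
    """Return (whole, fraction) with value = whole + fraction and 0 <= fraction < 1."""
    whole = math.floor(value)
    return whole, value - whole

ratio = math.log10(3) / math.log10(2)
whole, fraction = split_whole_and_fraction(ratio)
print(whole, fraction)

# The fractional part should agree with log(3/2) / log 2 up to round-off.
print(math.isclose(fraction, math.log10(3 / 2) / math.log10(2)))
```

The whole number extracted is 1, and the remaining part agrees with (log (3/2))/(log 2), just as the manipulation with the laws of logarithms says it should.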
It may seem strange to you at this stage that these peculiar manipulations of
logarithms can provide us with information about problems involving musical scale systems, but they will. Exercises
3.16 Show that log (3/2) and log 2 are both larger than 0; that is, that each is positive. Hint: Use Exercise 3.14 and the fact that log 1 = 0.

3.17 Express the number 12/7 as the sum of a whole number and a number between 0 and 1.

3.18 Express the number (log 5)/(log 2) as the sum of a whole number and a number between 0 and 1.

3.19 Express (log 4)/(log 2) as the sum of a whole number and a number between 0 and 1.

3.20 Since we now know that 3/2 < 2 and log (3/2) < log 2, and that the latter numbers are positive, we can write

$$\frac{\log 2}{\log (3/2)} > 1.$$

Express (log 2)/(log (3/2)) as the sum of a whole number and a number between 0 and 1.
3.3 CONTINUED FRACTIONS
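Consider a fraction such as 12/7. We may convert this into a compound fraction of a certain standard form as follows.

$$\frac{12}{7} = 1 + \frac{5}{7} = 1 + \cfrac{1}{\dfrac{7}{5}} = 1 + \cfrac{1}{1 + \dfrac{2}{5}} = 1 + \cfrac{1}{1 + \cfrac{1}{\dfrac{5}{2}}} = 1 + \cfrac{1}{1 + \cfrac{1}{2 + \cfrac{1}{2}}}$$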
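The conversion just carried out by hand repeats one step over and over: take out the largest whole number, invert what is left, and continue. Here is a minimal sketch of that step in code, assuming exact rational arithmetic from Python's `fractions` module; the function name `continued_fraction` is our own, and the result is simply the list of whole numbers that appear in front of each plus sign.

```python
from fractions import Fraction

def continued_fraction(value):
    """Return the whole numbers [a0, a1, a2, ...] in the expansion of a rational number."""
    value = Fraction(value)
    terms = []
    while True:
        # Largest whole number not exceeding the current value.
        whole = value.numerator // value.denominator
        terms.append(whole)
        remainder = value - whole
        if remainder == 0:
            return terms
        value = 1 / remainder   # invert the fractional part and repeat

print(continued_fraction(Fraction(12, 7)))   # [1, 1, 2, 2]
```

Because the arithmetic is exact, the loop must stop for every rational input, and only the first whole number can come out negative.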
The "standard form" into which 12/7 has been converted is this: It is a compound fraction in which each numerator is 1, all signs are "+" rather than "-", and all numbers are positive whole numbers. If we had begun with a negative number, we could still obtain this standard representation for it, except that the first number on the left would be a negative number. It should be clear that every rational number can be so expressed, and that the resulting compound fraction must be finite because the denominators decrease at each stage. But if we allow nonterminating denominators, just as we allow nonterminating decimal expansions, each real number can be expressed as
such a compound fraction; these are called continued fractions. For example, the continued fraction for $\sqrt{2}$ is

$$\sqrt{2} = 1 + \cfrac{1}{2 + \cfrac{1}{2 + \cfrac{1}{2 + \cfrac{1}{2 + \cdots}}}}$$
Since we have agreed that all numerators are to be 1 and all signs positive, we can adopt a much more convenient abbreviation for a continued fraction. We merely list in order the numbers to the left of each fraction, setting the first such number off by a semicolon, since it, unlike the others, may be zero or negative. The above continued fraction would then be abbreviated by (1; 2, 2, 2, 2, 2, ...), and the continued fraction we previously obtained for 12/7 would be expressed as (1; 1, 2, 2). In the latter case, the case of a terminating continued fraction, we need to give the last denominator.

Of course, there is no difficulty in evaluating a terminating continued fraction, such as (1; 2, 3). Only the following simple arithmetic is needed.

$$(1; 2, 3) = 1 + \cfrac{1}{2 + \cfrac{1}{3}} = 1 + \cfrac{1}{\dfrac{7}{3}} = 1 + \frac{3}{7} = \frac{10}{7}.$$

Here is a method for "evaluating" nonterminating continued fractions in which the numbers repeat periodically. This method will always produce the correct value for the sort of continued fractions we will be dealing with. We apply the method to the continued fraction (1; 2, 2, 2, ...). Let x = (1; 2, 2, 2, ...). Then x = (1; 1 + x), and so

$$x = 1 + \frac{1}{1 + x}.$$

Thus

$$x \cdot (1 + x) = (1 + x) + 1,$$

so that

$$x + x^2 = x + 2,$$

or

$$x^2 = 2.$$
We discard the negative solution of this last equation for reasons to be discussed later, and find that $x = \sqrt{2}$. If you wish to work in the opposite direction and find the continued fraction expansion of $x = \sqrt{2}$, first write

$$x^2 = 2,$$

so that

$$x^2 - 1 = 1,$$

or

$$(x + 1)(x - 1) = 1.$$

Since we know that $x \ne -1$, we may divide both sides of this last equation by x + 1, and thus obtain

$$x - 1 = \frac{1}{1 + x}.$$

Thus

$$x = 1 + \frac{1}{1 + x}.$$

Since the entire right-hand side of this last equation is equal to x, we substitute the entire right-hand side for the x in the last denominator, and obtain

$$x = 1 + \cfrac{1}{1 + 1 + \cfrac{1}{1 + x}}$$

or

$$x = 1 + \cfrac{1}{2 + \cfrac{1}{1 + x}}.$$

If we continue this process of substitution for the x in the denominator of the right-hand side, we obtain x = (1; 2, 2, 2, ...), as desired.

Exercises
3.21 Evaluate the continued fraction (1; 2, 1, 3, 2).

3.22 Evaluate the continued fraction (1; 3, 3, 3, ...).

3.23 Find the continued fraction for $\sqrt{5}$.

3.24 Evaluate the continued fraction (1; 3, 4, 3, 4, 3, 4, 3, 4, ...).

3.25 Since $\sqrt{3}$ is between 1 and 2, the continued fraction expansion for $\sqrt{3}$ must be of the form

$$\sqrt{3} = (1; a_1, a_2, a_3, a_4, \ldots),$$

where the numbers $a_1, a_2, a_3, a_4, \ldots$ are all positive whole numbers. Show that the continued fraction expansion of $\sqrt{3}$ cannot be of the form (1; a, a, a, a, ...), where a is a positive whole number.
3.26 Find the continued fraction expansion of $\sqrt{2}$.

3.27 Evaluate (1; 1, 2, 2) and (1; 1, 2, 1, 1).

3.28 Evaluate (1; 1, 1, 1, 1, ...). Call the resulting number Q. Show that if a rectangle with sides of length 1 and Q is constructed, then divided into a square and a rectangle by a line perpendicular to the long side, the new small rectangle also has its sides in the proportion 1 : Q.

3.29 Show that any rectangle with the property of the one constructed in the previous exercise must have its sides in the proportion 1 : Q, where Q = (1; 1, 1, 1, 1, ...).

3.30 Look up The Golden Mean in a book on the history of mathematics.

3.4 THE VALUE OF A CONTINUED FRACTION
We went through a process in the previous section by which we concluded that the value of the continued fraction (1; 2, 2, 2, 2, ...) was $\sqrt{2}$. But how can we say that a number is "equal" to one of these nonterminating continued fractions when such a continued fraction, because it is nonterminating, clearly cannot be evaluated directly? It is impossible to actually perform the infinite number of additions and divisions necessary to "evaluate" a nonterminating continued fraction, so what we must do is give a definition of the value of a continued fraction in terms of processes which actually can be carried out.

This is how it is done. Consider again our continued fraction representation of $\sqrt{2}$; that is, $\sqrt{2}$ = (1; 2, 2, 2, 2, ...). Let us imagine that we were actually trying to evaluate this "infinite" fraction, and write the sequence of so-called partial quotients that we obtain by evaluating it out to a certain point and then forgetting about the rest. In this instance, we would first evaluate (1; ), then (1; 2), then (1; 2, 2), and so on. The partial quotients thus obtained form the following sequence:

1, 3/2, 7/5, 17/12, 41/29, 99/70, 239/169, ...,

or

$S_1$ = 1.000 000 000 ...
$S_2$ = 1.500 000 000 ...
$S_3$ = 1.400 000 000 ...
$S_4$ = 1.416 666 666 ...
$S_5$ = 1.413 793 103 ...
$S_6$ = 1.414 285 714 ...
$S_7$ = 1.414 201 183 ...
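The partial quotients listed above can be recomputed exactly. The following is a sketch under our own conventions: a terminating continued fraction (a0; a1, ..., an) is stored as the Python list [a0, a1, ..., an], and `evaluate` and `partial_quotients` are names chosen only for this illustration.

```python
from fractions import Fraction

def evaluate(terms):
    """Value of the terminating continued fraction (a0; a1, ..., an)."""
    value = Fraction(terms[-1])
    # Work from the innermost denominator outward.
    for a in reversed(terms[:-1]):
        value = a + 1 / value
    return value

def partial_quotients(terms):
    """Values of (a0; ), (a0; a1), (a0; a1, a2), ... in order."""
    return [evaluate(terms[:n]) for n in range(1, len(terms) + 1)]

for s in partial_quotients([1, 2, 2, 2, 2, 2, 2]):
    print(s, float(s))
```

Evaluating from the inside out keeps every intermediate result an exact fraction; the decimal form is produced only for display.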
If you know that $\sqrt{2}$ = 1.414 213 562 ..., you notice an interesting phenomenon. The sequence of partial quotients has bracketed the value of $\sqrt{2}$ much as in artillery fire; the values are bouncing back and forth on either side of the value of $\sqrt{2}$ and are getting closer and closer to it, as indicated in Fig. 3.3.

Fig. 3.3 $\sqrt{2}$ as the limit of a sequence.

The technical term used here is that $\sqrt{2}$ is the limit of the above sequence of numbers, and it is in this sense that we say that the continued fraction (1; 2, 2, 2, 2, ...) has the value $\sqrt{2}$. The method we used for "evaluating" (1; 2, 2, 2, 2, ...) in the previous section, the one in which we substituted the x on the left-hand side of an equation for the x on the right-hand side, is merely a device for avoiding examination of the sequence of partial quotients. This device is very useful when it works, but as you can see, it will work only on a continued fraction that repeats its terms periodically.

You may well wonder whether or not all continued fractions have values in the sense of the limit of the sequence of partial quotients. After all, it is conceivable that the sequence increases without bound. Fortunately for our purposes, there is a theorem in the theory of continued fractions which guarantees that if $a_0$ is any real number whatsoever and $a_1, a_2, a_3, a_4, \ldots$ are all positive whole numbers, then the sequence of partial quotients of the continued fraction $(a_0; a_1, a_2, a_3, a_4, \ldots)$ does indeed have a limit, and it is this limit we mean when we speak of the value of the continued fraction $(a_0; a_1, a_2, a_3, a_4, \ldots)$. Further topics on limits can be found in the next set of exercises.

But the above-mentioned theorem is by no means the most interesting fact about continued fractions. We shall use the following result to answer some of our questions about musical scale systems: A continued fraction
provides us with a "best possible" sequence of rational approximations to its limit. In the last example, you can see that 17/12 is a fairly good approximation to $\sqrt{2}$. The next fraction in the sequence is 41/29. The continued fraction for $\sqrt{2}$, namely (1; 2, 2, 2, 2, ...), will yield up to us the information as to which, if any, fractions with denominators between 12 and 29 are better approximations to $\sqrt{2}$ than 17/12. (Sometimes the method will produce a few which are not so good as 17/12, but these can be eliminated by inspection.)

The method will be easier to see if we first try a more complicated example; say, $\alpha$ = (1; 3, 1, 4, 1, 3, 1, 4, 1, ...). The value of $\alpha$ is approximately 1.261 28. The sequence of partial quotients associated with the continued fraction expansion of $\alpha$ is 1/1, 4/3, 5/4, 24/19, 29/23, .... Now 5/4 was obtained by calculating the value of (1; 3, 1) and 24/19 was obtained by calculating the value of (1; 3, 1, 4). To find the possibly better approximations to $\alpha$ with denominators between 4 and 19, one simply calculates the values of the so-called intermediate fractions. One may think of these as being obtained in the following way. We already have 5/4 = (1; 3, 1). The next approximation given in the above sequence, 24/19, is obtained by calculating the value of (1; 3, 1, 4). To get the intermediate fractions we instead successively calculate the values of the fractions

(1; 3, 1, 1)
(1; 3, 1, 2)
(1; 3, 1, 3)
or, in more mathematical notation, we calculate the values of the fractions (1; 3, 1, n), where n takes on all positive whole number values from 1 to 4 (since 4 is the last number in the fraction (1; 3, 1, 4) = 24/19); all but the last of these are the intermediate fractions, and the last is just 24/19 itself. If the last term of 24/19 = (1; 3, 1, 4) had been some much larger number, such as 30, then there would have been twenty-nine intermediate fractions rather than three, but the principle is the same. In any case, in the above example the three intermediate fractions turn out to be 9/7, 14/11, 19/15, and each can be tested to see whether or not it is a better approximation to $\alpha \approx 1.261\,28$ than 5/4. We insert these three new fractions into the sequence of partial quotients, and obtain

..., 5/4, 9/7, 14/11, 19/15, 24/19, ...
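The intermediate fractions are easy to generate mechanically. In the sketch below, which rests on our own conventions, `evaluate` computes a terminating continued fraction stored as a list, `intermediate_fractions` is a hypothetical helper, and `ALPHA` is only the five-place approximation 1.261 28 quoted in the text, not an exact value. The loop prints each fraction together with its distance from that approximation, so that each can be compared with 5/4.

```python
from fractions import Fraction

def evaluate(terms):
    """Value of the terminating continued fraction (a0; a1, ..., an)."""
    value = Fraction(terms[-1])
    for a in reversed(terms[:-1]):
        value = a + 1 / value
    return value

def intermediate_fractions(prefix, last):
    """Values of (prefix, n) for n = 1, 2, ..., last - 1."""
    return [evaluate(prefix + [n]) for n in range(1, last)]

ALPHA = 1.26128   # approximate value of alpha quoted in the text

candidates = ([Fraction(5, 4)]
              + intermediate_fractions([1, 3, 1], 4)
              + [Fraction(24, 19)])
for f in candidates:
    # fraction, its decimal value, and its distance from ALPHA
    print(f, round(float(f), 5), round(abs(float(f) - ALPHA), 5))
```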
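These five fractions have the following decimal values:

5/4 = 1.250 00 ...
9/7 = 1.285 71 ...
14/11 = 1.272 72 ...
19/15 = 1.266 66 ...
24/19 = 1.263 15 ...

Comparing these values with 1.261 28, we see that each intermediate fraction is a better approximation than the intermediate fraction before it, but that only 19/15 is a better approximation to $\alpha$ than 5/4: the fractions 9/7 and 14/11 miss $\alpha$ by about 0.0244 and 0.0114, while 5/4 misses it by about 0.0113. What the theorem guarantees is this: The intermediate fractions are the only candidates. Of all fractions with denominators between 4 and 19, none other than 9/7, 14/11, and 19/15 can be a better approximation to $\alpha$ than 5/4, and those among the three that are not so good are eliminated by inspection, as here. Similarly, there may be such intermediate fractions between other successive partial quotients. It will be necessary to examine the intermediate fractions in order to answer completely the question raised at the beginning of this chapter, on how improvements in the twelve-tone system might be accomplished.

Exercises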
3.31 Recall that we evaluated the continued fraction x = (1; 2, 2, 2, 2, ...) by algebraic methods in which we obtained the equation $x^2 = 2$. We then stated that the value of x must be $\sqrt{2}$, since the negative root can be ignored. If you know the definition of the value of a continued fraction in terms of its sequence of partial quotients, you will now be able to explain why the negative root can indeed be discarded. Please supply this explanation.

3.32 The sequence $S_1, S_2, S_3, S_4, \ldots$ of real numbers is said to have the number L as its limit if the following is true: Given the positive number $\epsilon$, no matter how small, there exists a whole number N (probably dependent on $\epsilon$) such that for all values of n > N, also $|S_n - L| < \epsilon$. This is the precise meaning of the word "limit" as used in the preceding section. With the aid of this definition, you can prove such theorems as this: If a sequence of real numbers has a limit, then it has only one limit. There are many others you can think of and prove. The concept of limit is fundamental to calculus, perhaps the best-known and most important branch of higher mathematics.

3.33 You may have noticed that the sequence of partial quotients $S_1, S_2, S_3, S_4, \ldots$ of the continued fraction expansion of $\sqrt{2}$ had the property that

$$S_1 < S_3 < S_5 < \cdots < \sqrt{2}$$

and

$$\sqrt{2} < \cdots < S_6 < S_4 < S_2.$$

Try to give an informal proof why this should always be true, restricting your attention to continued fractions with only positive whole number entries.
66
3.5
The well-tempered clavichord
3.34 Insert the intermediate fractions in the appropriate places in the sequence of partial quotients of the continued fraction (1; 2, 2, 2, 2, ... ). Indicate which of these are better approximations to $\sqrt{2}$ than their predecessors in the new sequence.

3.35 A desk calculator would be very helpful in working this problem. Suppose you wanted to find the continued fraction expansion of the square root of 10, and you already know that the decimal expansion of the square root of 10 looks like 3.162 277 660 . . . . You could then find the continued fraction expansion of 3.162 277 660 instead, since this is a good approximation to the square root of 10. You would expect that the answer would be similar to the continued fraction expansion of the square root of 10 itself: the entries should be the same until round-off error catches up with you. Knowing the decimal expansion, you could proceed as follows:

$$
\begin{aligned}
3.162\,277\,660 &= 3 + 0.162\,277\,660 = 3 + \cfrac{1}{\cfrac{1}{0.162\,277\,660}} \\
&= 3 + \cfrac{1}{6.162\,277\,666} = 3 + \cfrac{1}{6 + \cfrac{1}{\cfrac{1}{0.162\,277\,666}}} = \cdots
\end{aligned}
$$

At each stage, you extract the largest whole number you can from the last denominator, invert the remainder twice (to keep it equal to what it was before while producing a number larger than 1), and continue the process. After a few stages you will have a good guess as to the actual continued fraction expansion of the square root of 10 itself. Continue the above computations, and make that guess.

3.5 APPLICATIONS TO BASEBALL AND GRADE DISTRIBUTIONS
First, here is an example of how methods of continued fractions might be used to answer a question about baseball. When a player's batting average is given as, for example, 0.263, this number is computed by dividing the total number of the player's official hits by the total number of official at-bats at that point in the baseball season. The quotient is rounded off to three-place accuracy. Suppose that we are given that a player's batting average is 0.263. Can we draw any conclusions about his total number of hits and total number of at-bats? The number 0.263 is obtained by rounding off some longer decimal, obtained perhaps by dividing 76 by 289. But it would be far too much
trouble to try all possible fractions with denominators less than one thousand (a rather generous upper bound for the total number of at-bats available to one player in a single season) to find out which ones might round off to 0.263. However, 0.263 is itself a reasonably good approximation to the player's "true" or unrounded average, hence we can expand 0.263 into a continued fraction in order to find the good rational approximations to it, and thereby find the good rational approximations to the player's true average. Such rational numbers are the candidates we examine, the numerators providing us with possible numbers of hits, and the denominators being the corresponding numbers of at-bats. The continued fraction expansion of 0.263 is given below:

$$0.263 = \frac{263}{1000} = (0; 3, 1, 4, 17, 3).$$

The corresponding sequence of partial quotients is 1/3, 1/4, 5/19, 86/327, 263/1000. We insert the intermediate fractions and obtain

    1/1, 1/2, 1/3, 1/4, 2/7, 3/11, 4/15, 5/19, 6/23, 11/42, 16/61, 21/80, 26/99, 31/118,
    36/137, 41/156, 46/175, 51/194, 56/213, 61/232, 66/251, 71/270, 76/289, 81/308,
    86/327, 91/346, 177/673, 263/1000.

Of these, only 5/19 and those from 21/80 on give quotients which round off to 0.263. However, since 5/19 appears in the list, this prevents the occurrence of other forms of the same fraction, and so we should also include 10/38, 15/57, and so on. In any case, we now have all the possible combinations of hits and at-bats that give a percentage of 0.263. If you have additional information, such as the fact that the player is a little-used pinch-hitter, or that it is still early in the baseball season, this might be enough to conclude that the player must have had five hits out of nineteen times at-bat. Of course, with such advance knowledge we wouldn't have bothered to carry the listing of the fractions out so far, but would have terminated our list when the denominators became sufficiently large to take care of the maximum possible number of at-bats.
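The listing above can be reproduced mechanically. What follows is only a sketch in Python, not part of the original text: the function names are our own, and we assume that a quotient ending in exactly 5 in the fourth decimal place is rounded upward, which is what the text does in accepting 21/80 = 0.2625.

```python
from fractions import Fraction

def cf_entries(value):
    """Entries (a0; a1, a2, ...) of the continued fraction of a rational number."""
    entries = []
    num, den = value.numerator, value.denominator
    while den:
        whole, remainder = divmod(num, den)   # extract the largest whole number
        entries.append(whole)
        num, den = den, remainder             # invert the remainder
    return entries

def with_intermediates(entries):
    """Partial quotients together with the intermediate fractions between them."""
    h_prev, k_prev = 1, 0          # formal starting value 1/0
    h, k = entries[0], 1           # first partial quotient a0/1
    fractions = []
    for a in entries[1:]:
        # j = 1 .. a-1 gives intermediate fractions; j = a gives the next partial quotient
        for j in range(1, a + 1):
            fractions.append(Fraction(h_prev + j * h, k_prev + j * k))
        h_prev, k_prev, h, k = h, k, a * h + h_prev, a * k + k_prev
    return fractions

def rounds_to_263(fraction):
    """True if the fraction rounds to 0.263 (assumption: halves round upward)."""
    return Fraction(2625, 10000) <= fraction < Fraction(2635, 10000)

entries = cf_entries(Fraction(263, 1000))
candidates = with_intermediates(entries)
matches = [f for f in candidates if rounds_to_263(f)]
print(entries)
print(", ".join(str(f) for f in matches))
```

Each match is printed as hits/at-bats; multiples such as 10/38 are not generated, since every fraction in the list is in lowest terms.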
As a second example, suppose that you know that in a certain class of no more than sixty students, an instructor's final grade distribution was given as

    A's:  10.8%
    B's:  24.3%
    C's:  37.8%
    D's:   8.1%
    F's:  18.9%
Suppose that you wish to find exactly how many students were in the class. Consider first only the percentage of students receiving "A" grades. We find the continued fraction expansion of this number as follows:

$$
\begin{aligned}
\frac{108}{1000} &= 0 + \cfrac{1}{\cfrac{1000}{108}} = 0 + \cfrac{1}{9 + \cfrac{28}{108}} = 0 + \cfrac{1}{9 + \cfrac{1}{\cfrac{108}{28}}} \\
&= 0 + \cfrac{1}{9 + \cfrac{1}{3 + \cfrac{24}{28}}} = 0 + \cfrac{1}{9 + \cfrac{1}{3 + \cfrac{6}{7}}} = 0 + \cfrac{1}{9 + \cfrac{1}{3 + \cfrac{1}{1 + \cfrac{1}{6}}}}
\end{aligned}
$$

or (0; 9, 3, 1, 6). The associated sequence of partial quotients, together with their decimal expansions, is

    1/9    = 0.111 111...
    3/28   = 0.107 142...
    4/37   = 0.108 108...
    27/250 = 0.108 000...
Now 28 is too small a denominator, since the corresponding decimal does not round off to 0.108, and 250 is too large, since the denominator represents the class size and there are no more than 60 students. So any other possible approximations that round off to 0.108 can only be found among those intermediate fractions immediately after 3/28 and those immediately after 4/37. But there are no intermediate fractions between 3/28 and 4/37, and the first one after 4/37 is (0; 9, 3, 1, 1) = 7/65, which we reject since its denominator is too large. The following intermediate fractions can have only larger denominators, hence we are already finished. In effect, the only fraction with denominator between 1 and 60 which does round off to 0.108 is 4/37. Thus we know that there must have been 37 students in the class, and that four of these students received "A" grades.
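A class of at most sixty students is small enough that the conclusion can also be checked by trying every possibility. The following Python sketch is an illustration only; the function names are hypothetical, and we assume that a percentage ending in exactly 5 in the next decimal place would have been rounded upward.

```python
from fractions import Fraction

def rounds_to(fraction, target, places=3):
    """True if `fraction` rounds to `target` at the given number of decimal places."""
    half_unit = Fraction(1, 2 * 10**places)
    return target - half_unit <= fraction < target + half_unit

def admissible(target, max_size):
    """All (students with the grade, class size) pairs whose ratio rounds to `target`."""
    target = Fraction(target)                 # Fraction("0.108") is exactly 27/250
    pairs = []
    for size in range(1, max_size + 1):
        for count in range(size + 1):
            if rounds_to(Fraction(count, size), target):
                pairs.append((count, size))
    return pairs

print(admissible("0.108", 60))   # the "A" grades of the example
```

The same function can be pointed at the other percentages, or given a larger bound on the class size, to check your answers to the exercises that follow.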
Exercises

3.36 Repeat the preceding example, using the "B" grades instead. You should obtain a continued fraction in the form (0; 4, 8, 1, 2, 9) for 0.243, and 9/37 the only admissible ratio.

3.37 Repeat the preceding example, using the "C" grades. You should obtain two admissible fractions, 14/37 and 17/45.

3.38 Do this example again, this time using the "D" grades. You should obtain only one admissible fraction.

3.39 Repeat the example using the "F" grades. You should obtain more than one admissible fraction.

3.40 If you repeat Exercise 3.36, altering it only in that as many as 150 students may have been in the class, what results do you obtain for the number of students that may or must have been in the class?

3.41 If a baseball player's batting average at the end of a season is 0.338, what is the minimum number of times he could have batted (that is, what is the minimum number of official at-bats)?

3.42 Use the definition of limit given in Exercise 3.32 to prove that your guess as to the limit of the sequence 1, 1/2, 1/3, 1/4, 1/5, 1/6, ... is correct.

3.43 The value of a continued fraction has been defined as the limit of its sequence of partial quotients. How could one analogously define the "sum" of an infinite series such as

$$1 + 1/2 + 1/4 + 1/8 + 1/16 + \cdots\,?$$
3.44 What is your guess as to the sum of the infinite series given in the previous exercise?

3.45 How would you prove that your answer to the previous exercise is correct?

3.6 HARMONY
We are finally ready to turn our attention to the musical scale. A vibrating string sets up corresponding vibrations in the air about it, vibrations which are perceived as sound if they are sufficiently, but not excessively, rapid. The human ear is usually capable of perceiving vibrations between thirty and seventeen thousand hertz (the currently approved term for "cycles per second"), although at the extremes only fairly loud sounds can be heard. It was known to the ancient Greeks that the frequency of such a vibrating string is inversely proportional to its length, all other pertinent factors (tension, density, diameter, ... ) remaining the same. That is, doubling the length of a string produced a frequency half that of the original string; to triple the frequency it would be necessary to use a string one-third the original length. In addition, it was probably known to the Greeks that a vibrating string of length (say) twelve inches also vibrated to a certain extent as if it were two six-inch strings joined at the middle, as well as three four-inch strings, and so on. This phenomenon, which occurs in different proportion with different musical instruments, is known as the production of the higher harmonics of the so-called fundamental frequency of the vibrating string, and it follows that the frequencies of these harmonics are the whole number multiples of the fundamental frequency. See Fig. 3.4 for an indication of the way in which the higher harmonics are produced. It is the production of these higher harmonics in different proportions by the various orchestral instruments that enables a composer to create music of such wide ranges of sound. The particular combination of harmonics is one of the main reasons why a violin sounds different from a saxophone. 
[Fig. 3.4 Production of harmonics by a vibrating string. The panels show the fundamental and the second, third, fourth, and fifth harmonics.]

It also follows that if two strings are set in simultaneous vibration and the length of one is twice the length of the other, then the various harmonics match up, producing a pleasant (or at least a nondissonant) effect. For in this case the first harmonic of the longer string would be the fundamental frequency of the shorter, and all higher harmonics would have frequencies equal in pairs. However, though not unpleasant, the sound of two such tones is not a very rich sound. You can produce such a harmony by striking any two notes one octave apart on a properly tuned piano.

It must have been discovered early in the history of music that if two strings were simultaneously plucked, and the length of the second was two-thirds the length of the first, a pleasing harmony resulted. The shorter string produces a frequency three-halves that of the longer one, and many of the higher harmonics are equal in frequency. The mixture of two such tones gives a richer sound than can be obtained from the harmonics of a single string, as you can hear by comparing the sound of middle C on the piano together with the G immediately above it with the sound produced by two notes an octave apart. (The pitch of this G is, however, not precisely three-halves that of middle C, but only very close to this ratio; later in this chapter we shall see why this is so.)

Finally, the Greeks also discovered a principle which has come to be known as the Law of Small Whole Numbers: If the ratios of the lengths of two plucked strings could be expressed using small whole numbers, such as 2 : 1 in the case of an octave or 3 : 2 in the case of a fifth, then the sounds were harmonic rather than dissonant. But as the numbers needed to express such ratios become larger, the sounds become more dissonant; the ratios of 4 : 3 and 5 : 4 produce harmonies which most people find quite pleasant, whereas only the most avant-garde composers might consider that a 37 : 29 ratio produces a harmonious sound.

We may now speculate about the origin of our present twelve-tone scale system so widely used in our Western culture. If you will refer again to Fig. 3.1 at the beginning of this chapter, you will see that X# would be the black key of the piano immediately above the white key X. There is no black key between E and F, nor is there one between B and C, and so a musician would interpret E# to mean F and B# to mean C. If we have information about the frequencies of the notes in a single octave, we may use this information to find the frequencies of every other note on the piano, for each other note is some whole number of octaves above or below one of these, and its frequency may be found by the appropriate amount of doubling or halving the right one in our first octave. Finally, it is common to tune middle C to about 256 hertz, but this fact need concern us no further.
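One crude way to see the Law of Small Whole Numbers at work is to count how many harmonics of the higher tone coincide with harmonics of the lower one. The Python sketch below is our own illustration under a simplifying assumption, namely that each string produces exactly the whole-number multiples of its fundamental; the function name and the choice of twelve harmonics are arbitrary.

```python
from fractions import Fraction

def shared_harmonics(ratio, count=12):
    """Harmonic numbers n (1..count) of the higher tone that coincide with a
    harmonic of the lower tone, where `ratio` = higher frequency / lower frequency."""
    ratio = Fraction(ratio)
    # The n-th harmonic of the higher tone has frequency n * ratio times the lower
    # fundamental; it matches a harmonic of the lower tone when that is a whole number.
    return [n for n in range(1, count + 1) if (n * ratio).denominator == 1]

for name, ratio in [("octave 2:1", Fraction(2, 1)),
                    ("fifth 3:2", Fraction(3, 2)),
                    ("fourth 4:3", Fraction(4, 3)),
                    ("37:29", Fraction(37, 29))]:
    print(name, shared_harmonics(ratio))
```

With an octave every harmonic of the higher tone is matched, with a fifth every second one, and with the 37 : 29 ratio none of the first twelve is matched. This is not a theory of dissonance, only a tally that is consistent with the discussion above.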
After a brief excursion into the art of tuning a piano, we will be able to continue our study of the origins of the twelve-tone system.

Exercise

3.46 Strike the key middle C on a piano, while holding down the key C' (the C one octave above middle C), and release middle C immediately. You will hear C' sounding its natural frequency, showing that this frequency is one of the harmonics of C. Can you find which other keys show the same behavior, and make a table of the higher harmonics of middle C in terms of the other notes on the piano? Hint: Of course, C' has twice the frequency of C, and C'' (the C immediately above C') has four times the frequency of C, and so on. So, for example, you would look for the third harmonic of middle C somewhere between C' and C''.
3.7 TUNING A PIANO, OLD STYLE
We have already mentioned that the harmony produced by two strings in a 3: 2 length ratio can almost, but not exactly, be produced by striking middle C on the piano together with the G immediately above it. Why is not this G tuned to exactly three-halves the frequency of middle C? Suppose it were, and in fact that all such pairs of fifths-two notes seven half steps apart on the piano-were so tuned. Let the frequency of middle C be denoted by v. The argument that follows turns out to be independent of the value of v, since it cancels out, but if you wish you may assume that v = 256 hertz. Remember that if we move down one octave, or twelve half steps, on the piano, the frequency will be halved. We proceed to fill in the frequencies of all the other notes in our octave, using the fact (which we have assumed in order to reach an eventual contradiction) that each note a fifth above another has frequency exactly three-halves the frequency of the other.
    C       G         D'
    v       (3/2)v    (9/4)v

Since the note D' above has escaped our octave, we return it by dividing its frequency by 2:

    C       D         G
    v       (9/8)v    (3/2)v

From the frequency of D we can calculate the frequency of A, then E'; we then return the latter to our octave as before.

    C       D         E           G         A
    v       (9/8)v    (81/64)v    (3/2)v    (27/16)v

We continue this process for the other notes in the scale, and we finally end up with the following frequencies for each:

    C     v
    C#    (1/2)^4 (3/2)^7 v
    D     (1/2) (3/2)^2 v
    D#    (1/2)^5 (3/2)^9 v
    E     (1/2)^2 (3/2)^4 v
    F     (1/2)^6 (3/2)^11 v
    F#    (1/2)^3 (3/2)^6 v
    G     (3/2) v
    G#    (1/2)^4 (3/2)^8 v
    A     (1/2) (3/2)^3 v
    A#    (1/2)^5 (3/2)^10 v
    B     (1/2)^2 (3/2)^5 v
    C'    (1/2)^6 (3/2)^12 v
The frequency of C', the last to be obtained and the last in the above list, should give us the value v if divided by two, since C' is one octave above middle C, our starting point. Thus

$$(1/2)^7 (3/2)^{12} v = v.$$

We may cancel out v, and obtain next

$$(1/2)^7 (3/2)^{12} = 1.$$

When we simplify this equation, we get $3^{12} = 2^{19}$. But this last equality is impossible, since $3^{12}$ is odd but $2^{19}$ is even. This shows that it is impossible to have all the fifths on a piano tuned exactly so that the frequency ratio in each is exactly 3 : 2.

Of course, one could tune middle C to the correct frequency v and then tune all the other C's on the piano to the correct multiples of v. Then one could tune G above middle C to the frequency (3/2)v (this can be done quite easily by ear alone with the aid of a little experience), and then set all the other G's by that one. Proceeding in this manner, one would finally obtain the frequency $(1/2)^6 (3/2)^{11} v$ for the F above middle C, and then all the other F's could be tuned accordingly. All the fifths would have exactly the correct ratio except for one: the fifth from F to C. Here the ratio would be a little less than 1.48 rather than 1.5, quite enough difference to sound very peculiar. Pianos were tuned in this fashion for many years (actually, in Europe, this method of tuning disappeared about the same time as the harpsichord did), and the interval between F and C was known as the Wolf Interval, because wolves howl; this was an allusion to the howling higher harmonics of the two notes, few of which came anywhere near matching up, thus producing a number of dissonances.

Of course, if one were to play a composition in which the notes F and C were used as little as possible, as in a piece written in the key of B major, the Wolf Interval would be avoided and the music would sound quite good. (If you wished to compose in C, though, you would have to have your piano retuned so as to place the Wolf Interval on another, little-used, fifth.) However, transposition, changing the key of a piece by moving all notes up or down a fixed amount, was impossible in the modern sense, for various harmonic ratios would be changed by transposition.
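The tuning procedure just described is easy to carry out with exact fractions. The sketch below, in Python, is offered only as an illustration: the list of note names and the function name are ours, and the frequencies are expressed as multiples of v rather than in hertz.

```python
from fractions import Fraction

# Notes in the order in which they are reached by stacking fifths on middle C.
CIRCLE_OF_FIFTHS = ["C", "G", "D", "A", "E", "B", "F#", "C#", "G#", "D#", "A#", "F"]

def perfect_fifth_tuning():
    """Frequency of each note as an exact multiple of v, the frequency of middle C."""
    multiples = {}
    multiple = Fraction(1)
    for note in CIRCLE_OF_FIFTHS:
        multiples[note] = multiple
        multiple *= Fraction(3, 2)     # go up one perfect fifth
        while multiple >= 2:           # return the note to our octave
            multiple /= 2
    return multiples

tuning = perfect_fifth_tuning()
wolf = 2 / tuning["F"]                 # the fifth from F up to C', where C' = 2v
print("E =", tuning["E"], " F =", tuning["F"])
print("F to C' ratio:", wolf, "=", float(wolf))
print("3^12 =", 3**12, " 2^19 =", 2**19)
```

The ratio printed for the fifth from F to C' is the Wolf Interval; every other fifth in the dictionary is exactly 3/2 by construction.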
The key of composition was sufficiently important to composers of the seventeenth and eighteenth centuries so that they commonly included it in the titles of their works.

Exercises

3.47 Repeat the argument of this section, using a seven-tone scale system, and thus show that it is also impossible to have all fifths in exactly a 3 : 2 ratio in such a system. Note: In a seven-tone scale, a "fifth" would be an interval of four, rather than seven, half steps.

3.48 Repeat the argument of this section, using an n-tone scale system (where n is a positive whole number larger than 1), and thus show that it is also impossible to have all "fifths" in exactly a 3 : 2 ratio.

3.49 Suppose that middle C on the piano is tuned to the frequency v, and all other C's are set accordingly. Suppose also that each note on the piano is then tuned to a fixed multiple of the frequency of the note one half step below it; that is, calling this multiple k, then the frequency of C# would be kv, the frequency of D would be $k^2 v$, and so on. What is the value of k?

3.50 Since $2 < \sqrt{7} < 3$, we know the continued fraction expansion of $\sqrt{7}$ starts off like (2; ... ? ... ). So we start by writing

$$\sqrt{7} = 2 + (\sqrt{7} - 2),$$

because $\sqrt{7} - 2$ is a number between 0 and 1. We invert twice to obtain a number larger than 1, and thus

$$\sqrt{7} = 2 + \cfrac{1}{\cfrac{1}{\sqrt{7} - 2}}.$$

The formula $(a + b)(a - b) = a^2 - b^2$ can be used to eliminate $\sqrt{7}$ from the last denominator, as follows:

$$\frac{1}{\sqrt{7} - 2} = \frac{\sqrt{7} + 2}{(\sqrt{7} - 2)(\sqrt{7} + 2)} = \frac{\sqrt{7} + 2}{7 - 2^2} = \frac{\sqrt{7} + 2}{3}.$$

Thus

$$\sqrt{7} = 2 + \cfrac{1}{\cfrac{\sqrt{7} + 2}{3}}.$$

Since $2 < \sqrt{7} < 3$, it follows that $4 < \sqrt{7} + 2 < 5$, and so

$$\frac{4}{3} < \frac{\sqrt{7} + 2}{3} < \frac{5}{3}, \quad\text{or}\quad 0 < \frac{\sqrt{7} + 2}{3} - 1 < 1.$$

Hence

$$\sqrt{7} = 2 + \cfrac{1}{1 + \cfrac{\sqrt{7} - 1}{3}}.$$

If you continue this process, eventually a repetition will set in; you will find that

$$\sqrt{7} = 2 + \cfrac{1}{1 + \cfrac{1}{1 + \cfrac{1}{1 + \cfrac{1}{2 + \sqrt{7}}}}}.$$

Hence $\sqrt{7}$ = (2; 1, 1, 1, 4, 1, 1, 1, 4, 1, 1, 1, 4, ... ). Repeat this process in order to find the continued fraction expansion of ../i

3.8 TUNING A PIANO, NEW STYLE
Now $3^{12}$ is not equal to $2^{19}$, but the two numbers are relatively close, and careless tuning might convince someone that it would be possible, using twelve notes in an octave, to return to exactly the original frequency after tuning twelve successive "perfect" fifths. This may well have something to do with the genesis of the twelve-tone scale used almost exclusively in Western music. Other cultures have developed scale systems with other numbers of half steps in the octave, seven being one of the most common. But the same argument as used before, in which we obtained the contradictory result $3^{12} = 2^{19}$, can be repeated for such scale systems to show that an attempt to produce perfect fifths will always result in a Wolf Interval. Perhaps the first question of this chapter has been partially answered, as to why there should be twelve notes in an octave. We say "partially," for what we have done so far is to give a possible evolutionary justification for the twelve-tone system. Later we shall see that there is a mathematical reason as well.

Johann Sebastian Bach wrote his Well-Tempered Clavichord, a collection of forty-eight piano compositions, to draw attention to an alternate tuning method which was then being considered in European musical circles. This new method, known as well-tempering, involved tuning all the C's on the piano as before, but subsequently effecting a compromise. Rather than have most of the fifths in perfect harmony and one very bad Wolf Interval, it was proposed that all the fifths be tuned very slightly flat, with a ratio of approximately 1.498 307 rather than exactly 1.5; the number 1.498 307 was chosen because it is that number which will cause all the fifths, including the
last one tuned, to be in the same ratio. Rather than having a piano with one howling interval, we would have a piano with twelve, but each of these twelve wolves would howl so quietly and only at such high harmonics that they would be undetectable except to electronic instruments or to an uncommonly well-trained ear.

Avoiding the Wolf Interval was one of the reasons brought forth to support well-tempering, but another must surely have had wide appeal: Because all the intervals (not just the fifths) would have fixed ratios on a well-tempered piano, transposition becomes possible. A piece written in C# could be played in D or in G without changing any of the harmonic ratios, and only a person with a well-developed sense of pitch could tell the difference. The effect of transposition would be the same as if the piece were recorded at 33 1/3 rpm and played back slightly faster or slower. And with the possibility of transposition, vocal music becomes much easier to write and to sing. This advantage of well-tempering must have been an important factor contributing to its eventual adoption; for adopted it was, and is now used on virtually all pianos. And perhaps we have answered the second question raised at the beginning of this chapter, by showing one reason for the composition of the Wohltemperiertes Klavier.

The drawback to well-tempering is, of course, that none of the harmonies other than octaves are "perfect." As we have seen, a fifth is tuned with the higher note about 1.498 307 times the frequency of the lower, rather than 1.5 exactly. We now turn to the third question mentioned at the beginning of this chapter. Would further improvements in the scale system be possible, and how might they be accomplished? Should it be possible, by changing the number of notes in an octave from twelve to some other number, to well-temper a piano, but have the ratios of the fifths much closer to 1.5 than they are at present? It seems likely.
Theoretically we could have 1200 notes in an octave rather than 12, and the fifth might be better approximated by striking two notes 702 half steps apart rather than the 700 which would correspond to the sound of the twelve-tone fifth. Of course, the technical difficulties involved in the construction of such an instrument, to say nothing of the problem of playing the monstrosity, appear intimidating; but perhaps we do not need so many notes in order to improve the ratios of the fifths. Well, we seem to be asking for a "next best approximation," and this brings us back to the methods of continued fractions.

Exercises

3.51 It was mentioned in this section that in well-tempering, the ratio of the fifths is 1.498 307 rather than 1.5 exactly. How was this number obtained? Hint: See Exercise 3.49.

3.52 Any scale system that improves fifths will also improve fourths, pairs of notes five half steps apart in the twelve-tone system, with an ideal (but not actual) frequency ratio of 4 : 3. Why will improving fifths also improve fourths?

3.53 In music, thirds are used very extensively in producing harmonies. In the twelve-tone piano, a third can be produced by striking middle C and the E immediately above it, or any two notes four half steps apart. The ratio of frequencies in an ideal third would be 5 : 4, or 1.25, but such cannot be obtained for every third in the twelve-tone system for reasons similar to those which show that not all fifths can have the perfect ratio of 1.5. Show why this is so. In the well-tempered twelve-tone piano, what is the actual ratio of a third?

3.54 In what keys are the forty-eight compositions in the Well-Tempered Clavichord written?

3.55 You may need both a table of logarithms and a desk calculator for this problem. Suppose that a piano were constructed with 1200 notes in each octave. Would the note 702 half steps above middle C produce a harmony with middle C closer to a perfect fifth than the note 700 half steps above middle C? Would the former note be better than any other on such a piano for this purpose?
3.9 IMPROVING THE OCTAVE
Now $2^{7/12}$ is very nearly equal to 1.5; in fact, $2^{7/12}$ is approximately equal to 1.498 307. What we seek is some possibly larger number, say n, of notes in an octave, in which a "better" fifth would be (say) m notes up, so that $2^{m/n}$ would be a better approximation to 1.5 than $2^{7/12}$. As in our baseball example, it would be both difficult and time-consuming to test all possible fractions with denominators between 12 and 1000 to find such a better approximation. Instead we seek rational approximations to the solution x of the equation $2^x = 1.5$ by finding the continued fraction expansion of x. Note that it will be unnecessary to find explicitly the decimal expansion of the solution x in order to do this. Using the laws of logarithms, we first transform the equation $2^x = 3/2$ into a more tractable one, as follows:

$$
\begin{aligned}
2^x &= 3/2, \\
\log 2^x &= \log (3/2), \\
x \log 2 &= \log (3/2), \\
x &= \frac{\log (3/2)}{\log 2}.
\end{aligned}
$$
We find the continued fraction expansion of x, much as we found the fraction for $\sqrt{7}$ in Exercise 3.50. Since 3/2 < 2, then also log (3/2) < log 2, so initially we invert to obtain a fraction larger than 1 to work with. We then apply to the denominator the methods of Section 3.2, as follows:

$$x = \frac{\log (3/2)}{\log 2} = \cfrac{1}{\cfrac{\log 2}{\log (3/2)}} = \cfrac{1}{\cfrac{\log\,[(3/2)(4/3)]}{\log (3/2)}} = \cfrac{1}{1 + \cfrac{\log (4/3)}{\log (3/2)}}.$$

The fraction in the last denominator is less than 1, so again we doubly invert it to obtain a number exceeding 1, and proceed as before. By replacing log (3/2) with log [(4/3)(9/8)], we obtain as our second stage the fraction below:

$$x = \cfrac{1}{1 + \cfrac{1}{1 + \cfrac{\log (9/8)}{\log (4/3)}}}.$$

Again, we doubly invert the fraction in the last denominator to obtain a number larger than 1, but this time we obtain a number between 2 and 3, and slightly trickier methods are now needed:

$$\frac{\log (4/3)}{\log (9/8)} = \frac{\log\,[(9/8)^2 (256/243)]}{\log (9/8)} = 2 + \frac{\log (256/243)}{\log (9/8)}.$$

Thus we obtain for the third stage of our continued fraction expansion of x the following:

$$x = \cfrac{1}{1 + \cfrac{1}{1 + \cfrac{1}{2 + \cfrac{\log (256/243)}{\log (9/8)}}}}.$$
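The stages above follow one pattern: divide the larger fraction by the smaller as many times as possible, record how many divisions were made, and then invert. A sketch of this pattern in Python, using exact fractions so that no table of logarithms is needed, is given below. The function name and its arguments are our own, and the sketch assumes that both numbers are greater than 1.

```python
from fractions import Fraction

def log_ratio_entries(a, b, how_many):
    """First entries of the continued fraction of log(a)/log(b), for a, b > 1."""
    upper, lower = Fraction(a), Fraction(b)   # we expand log(upper)/log(lower)
    entries = []
    for _ in range(how_many):
        entry = 0
        # log(upper)/log(lower) >= 1 exactly when upper >= lower; each division of
        # upper by lower subtracts 1 from the ratio of the logarithms.
        while upper >= lower:
            upper /= lower
            entry += 1
        entries.append(entry)
        if upper == 1:                        # the expansion has terminated
            break
        upper, lower = lower, upper           # invert the remaining fraction
    return entries

print(log_ratio_entries(Fraction(3, 2), 2, 8))
```

For example, at the third stage the function divides 4/3 by 9/8 twice, leaving 256/243, exactly as in the computation above. The fractions involved grow quickly, which is the practical difficulty with a desk calculator; exact arithmetic on a computer does not mind.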
The numbers involved get sufficiently large to cause some practical difficulties even with the aid of a desk calculator in about two more stages, so we carry the above no further but provide you with the results of our labor.

$$x = (0; 1, 1, 2, 2, 3, 1, 5, 2, 23, \ldots).$$

The corresponding sequence of partial quotients is 1/1, 1/2, 3/5, 7/12, 24/41, 31/53, .... As you recall, the reason that the fraction 7/12 appears in this sequence is that 7/12 is a fairly good approximation to the solution x of the equation $2^x = 3/2$. Actually, the value of x is 0.584 962 501 ... , and the difference between 7/12 and x is 0.001 629 .... This is quite small, and provides us with a good mathematical reason for using twelve notes in an octave, but we will have to insert the intermediate fractions in the above sequence to find the other possibly better approximations.

Between 7/12 and 24/41 there are two intermediate fractions, 10/17 and 17/29. Some arithmetic calculations provide us with the information that the error between 10/17 and x is 0.003 272 ... , so that 10/17 is a poorer approximation to x than 7/12. Thus we cannot improve the harmonies of fifths by using a seventeen-tone scale. For 17/29 the error is 0.001 244 ... , so that a twenty-nine-tone scale would be slightly better than the twelve-tone scale, but not by much. It would seem hard to justify using more than twice as many notes in an octave in order to reduce the error by less than twenty-five percent. However, the error between 24/41 and x is much smaller, only 0.000 403 ... , and we leave it to musicians to decide whether reducing the error in the fifths to less than a quarter of its value in the twelve-tone scale would justify building instruments with forty-one steps in each octave and writing music for such instruments.

The important idea is this: The theory of continued fractions guarantees to us that the "next best" number of notes in an octave, after twelve, is twenty-nine, next after that is forty-one, and so on.
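The errors quoted above can be recomputed in a few lines. This Python sketch is only an illustration; the function names are ours, and it uses ordinary floating-point logarithms, which are more than accurate enough for six decimal places.

```python
import math

X = math.log2(3 / 2)    # the solution x of 2^x = 3/2

def fifth_error(steps, notes):
    """Distance between steps/notes and x, for a fifth of `steps` steps in a scale of `notes`."""
    return abs(steps / notes - X)

def tempered_fifth(steps, notes):
    """Frequency ratio of that fifth when every step has the same ratio."""
    return 2 ** (steps / notes)

for steps, notes in [(7, 12), (10, 17), (17, 29), (24, 41)]:
    print(f"{steps}/{notes}: error {fifth_error(steps, notes):.6f}, "
          f"fifth {tempered_fifth(steps, notes):.6f}")
```

The second column of the output shows how close each tempered fifth comes to the ideal ratio 1.5.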
Note finally that there is one intermediate fraction, namely 4/7, between the two partial quotients 3/5 and 7/12. This suggests that a seven-tone scale might be reasonably harmonious, and in fact some oriental music is written in such a system.

Exercises

3.56 It was stated in this section that the continued fraction expansion of the solution x of $2^x = 3/2$ is

$$x = (0; 1, 1, 2, 2, 3, 1, 5, 2, 23, \ldots).$$

But only the first four numbers were found by computations shown in the text. Verify that the next two are correct.
3.57 This is a very long problem. If you find the continued fraction expansion of the solution y of $2^y = 5/4$, you will be able to find what number of notes in an octave would produce better approximations to the "perfect third" ratio of 1.25 (see Exercise 3.53) than are possible in the usual twelve-tone scale system. With luck, one of these solutions might also produce better fifths. Find the next larger number of notes after 12 which will improve thirds. If possible, find the next larger number of notes after 12 which will improve both fifths and thirds.

3.58 Think of an argument against the well-tempering system of tuning a piano.

3.59 This is a highly speculative problem. All you have to do is think about it, or perhaps discuss it. The human eye perceives just about an "octave" of the spectrum of radiation frequencies, and it has been customary for centuries to think of this "octave" as made of seven "notes": red, orange, yellow, green, blue, indigo, and violet. Why seven? Is there a theory of color harmony lurking about somewhere? Is the division of the spectrum into seven colors a cultural phenomenon, or is there a good physical or mathematical reason for it? Note that 4/7 is a good approximation to the solution of $2^x = 3/2$. Do any cultures have twelve color names, or five, or forty-one?

3.60 Can a violinist play perfect fifths? Explain your answer.

NOTES AND REFERENCES
Continued Fractions, A. Ya. Khinchine's little book published in 1964 by the University of Chicago Press in English translation, is still a good reference work on the theory of continued fractions whose entries are real or positive whole numbers. Science and Music, by Sir James Jeans (Dover Publications 1968 edition), contains many topics on harmony, a thorough discussion of the history of scale systems, acoustics, and several other topics related to the material of this chapter. There is a minor error on page 188 of this edition, in which Sir James has neglected the possibility of a twenty-nine-tone scale system. Analytic Theory of Continued Fractions, by H. S. Wall (van Nostrand, 1948) contains a wealth of information of a much more abstract sort, most of which is of use only to students of advanced mathematics. Horns, Strings, and Harmony, by Arthur H. Benade (Doubleday, 1960) is another good book, one which should serve to show that this chapter deals with only one very simple consideration in the mathematical and physical theory of music.
CHAPTER 4
GROUP THEORY
Group theory is remarkable in that from such simple beginnings so much can be deduced, and in the wide variety of applications it has found, not only in almost every other branch of modern mathematics but also in such diverse fields as crystallography and quantum mechanics. We shall be concerned with only a very few major theorems in this chapter, but even in the introductory stages of group theory you will be able to appreciate its elegance, as well as learn enough to prove a wide variety of theorems on your own.
4.1
SOME EXAMPLES OF GROUPS
We first present some examples of mathematical systems which will turn out to satisfy our subsequent definition of "group." You should ask yourself what these examples have in common, for it is just these common properties that will lead us to the abstract definition.
Example 4.1 Consider first the set W of all whole numbers, positive, negative, and zero, together with the single operation of ordinary addition. This operation is said to be closed with respect to the set W, for if m and n are two whole numbers then so is m + n. In other words, we cannot get outside the set W using the operation of addition. Moreover, this operation also obeys the associative law: If m, n, and p are all elements of the set W, then
(m + n) + p = m + (n + p).
Since we are dealing only with the operation of addition, this law is a great convenience: it permits us to neglect parentheses in many cases, and we shall do so. To put it another way, the value of an expression such as
m + n + p
is independent of either choice of placement of parentheses, and hence is unambiguous. Third, there is one element of W, namely 0, with the property that if m is any element of W whatsoever, then
0 + m = m = m + 0.
For this reason the number 0 is called an identity element of the set W with respect to the operation of addition. We must say "an" identity element, for it is conceivable that there might be others. Also, if we were using a different operation (such as multiplication) the identity element might well be some other element of W. Finally, with the existence of an identity element assured, we may speak of an inverse of an element of W. In this example, each element of W does have an inverse; that is, given an element m in W, there does exist at least one element m' of W such that
m' + m = 0 = m + m'.
The element m' is called an inverse for m; again, it is conceivable that an element has more than one inverse. In this particular case, m' is of course just the so-called additive inverse -m of the whole number m. To summarize, then, as Example 4.1 we have a pair (W, +) consisting of a set W and a binary operation "+" on W, such that the operation is closed and associative, W contains an identity element 0 with respect to this operation, and each element of W has an inverse with respect to this operation.
Example 4.2 For our second example, we begin with an equilateral triangle in the plane. The elements of the set G that will form our group are going to be the congruence motions of this triangle. There are a number of different ways in which an equilateral triangle is congruent to itself, and the motions that produce such congruences will be the elements of the group, not the triangle itself; the triangle has been introduced for accounting purposes only, to keep track of the effect of each congruence motion. Three such motions are shown and labeled in Fig. 4.1. The motion which has the effect of rotating the triangle one-third revolution clockwise will be denoted by S. The motion of rotating the triangle one-third revolution counterclockwise will be denoted by R, and the motion (if we may be permitted use of such a term in this last case) of no movement at all will be
Fig. 4.1 Three congruence motions of an equilateral triangle.
denoted by I. If we are prohibited from moving the triangle out of the plane, these are all the possible motions. Now we have a set containing three rather unusual elements, the motions I, R, and S; in order to have a group we need an operation on this set, and so before we consider other possible motions of the triangle, let us see how we are going to define the necessary operation. We shall always refer to our group operation as multiplication, except when we happen to be using ordinary addition as in Example 4.1. To continue with this analogy, we shall write the product of the two group elements x and y as x · y or even simply as xy. In the present example, we define the multiplication of congruence motions in the following manner. Given motions x and y, the product xy is that motion which has the net effect of first doing x, then doing y. In Fig. 4.2, we have indicated the product of R with itself, and we see that RR = S, since the net effect of first doing one rotation one-third revolution counterclockwise and then another is the same as that of a single rotation one-third revolution clockwise. You can also see with the aid of figures similar to Fig. 4.2 that IR = R, RS = I, and so on.
Fig. 4.2 Products of congruence motions: R · R = S.
If we are allowed to lift the triangle out of the plane in order to perform congruence motions, three more motions become possible, as shown in Fig. 4.3. We have called one of these motions by the name A because the vertex labeled a of the triangle does not move as a consequence of this motion, one which amounts to rotation of the triangle about the bisector of the angle with vertex at a. For similar reasons B and C have received their names. The important thing to remember about this operation of "multiplication" of congruence motions is that the motion A does not mean to put the triangle into the configuration shown as a consequence of the motion A in Fig. 4.3, but means to rotate the triangle about its vertical axis, regardless of the names pasted on its vertices. Similarly, the motion R does not mean to put the triangle with the vertex labeled c at the top, a at the lower left, and b at the lower right, but simply means to rotate the triangle one-third revolution counterclockwise.
Fig. 4.3 Congruence motions of an equilateral triangle by rotation about an angle bisector.
If you wish, you might now cut an equilateral triangle from cardboard and label its vertices a, b, and c (on both sides of the cardboard), and then you can discover, for example, that AR = C, BC = S, and so on. We can summarize such information about this mathematical system by saying that the set of objects which will form the group consists of the six motions I, R, S, A, B, and C; the multiplication is defined as above; and a "multiplication table" for this system is shown below.

    | I  R  S  A  B  C
  --+------------------
  I | I  R  S  A  B  C
  R | R  S  I  B  C  A
  S | S  I  R  C  A  B
  A | A  C  B  I  S  R
  B | B  A  C  R  I  S
  C | C  B  A  S  R  I

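The entries of such a table can be checked mechanically. The Python sketch below is an illustration added here, not part of the text's own development. It assumes each motion is recorded as a permutation of the three fixed positions of the plane (top, lower left, lower right), and the names MOTIONS, NAMES, product, and TABLE are choices made only for this sketch. The product XY is computed as "first X, then Y."

```python
# Positions in the plane: 0 = top, 1 = lower left, 2 = lower right.
# A motion is a tuple p, where p[i] is the position to which the vertex
# currently sitting at position i is carried.
MOTIONS = {
    "I": (0, 1, 2),  # no movement at all
    "R": (1, 2, 0),  # one-third revolution counterclockwise
    "S": (2, 0, 1),  # one-third revolution clockwise
    "A": (0, 2, 1),  # flip about the axis through the top position
    "B": (2, 1, 0),  # flip about the axis through the lower-left position
    "C": (1, 0, 2),  # flip about the axis through the lower-right position
}
NAMES = {perm: name for name, perm in MOTIONS.items()}


def product(x, y):
    """Return the name of the motion 'first x, then y'."""
    px, py = MOTIONS[x], MOTIONS[y]
    return NAMES[tuple(py[px[i]] for i in range(3))]


# TABLE[x][y] is the product xy: x from the guide column, y from the guide row.
TABLE = {x: {y: product(x, y) for y in MOTIONS} for x in MOTIONS}

if __name__ == "__main__":
    print("  | " + " ".join(MOTIONS))
    for x in MOTIONS:
        print(x + " | " + " ".join(TABLE[x][y] for y in MOTIONS))
```

Because the motions are defined by what they do to positions rather than to the labels a, b, c, the sketch reflects the warning above that A means "rotate about the vertical axis," whatever labels happen to be where.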
This multiplication table is interpreted in the following way: The first element of the product XY appears in the leftmost column, the second in the topmost row, and the product XY itself is in the "obvious" place in the body of the table. With the aid of this table we can examine this mathematical
system for those same four properties we abstracted from Example 4.1, the example of whole numbers with the operation of ordinary addition. First, since each entry in the body of the table is in fact an element of the set G = {I, R, S, A, B, C}, we have the desired closure property. That is, if X and Y both belong to the set G, then so does their product XY. Next, the operation is associative for a simple, though perhaps subtle, reason: Both (XY)Z and X(YZ) mean, in effect, first do X, then do Y, and then do Z. This is the interpretation no matter which way the parentheses are placed, and the net effect on the triangle will be the same in either case. Hence for all elements X, Y, and Z of the set G, (XY)Z = X(YZ). Thus the operation is associative. The appearance of a row identical to the topmost or guide row (that is, the row with the element I at its left) and the appearance of a column identical to the leftmost column show us that
IX = X = XI
for all elements X of G, and hence the element I of G is an identity for this operation. Finally, the appearance of the identity element I at least once in each row and column assures us of the existence of inverses for each element of G; to find, for example, the inverse R' of the element R, locate I in the row to the right of R in the body of the table. Then find the element at the head of that column; in this case, the element is S, and so we know that RS = I. We verify also that SR = I, and hence S = R', the desired inverse of R, and similarly R = S' as well. If the other elements of G are also checked, you will find that for each congruence motion X, there is a congruence motion X' such that X'X = I = XX'. Thus each element of G has an inverse in G. There are thus four properties common to Examples 4.1 and 4.2. These are just the properties that will lead us to the definition of group. If we denote the multiplication in G by "·", then we have seen that in both the system (W, +) and the system (G, ·), the operation is closed and associative, an identity exists with respect to this operation, and each element in the system has an inverse in the system. However, there is very little else shared by these two examples. For instance, the operation in (W, +) is commutative: m + n = n + m is true for all values of m and n in W; but in the system (G, ·), RA = B whereas AR = C. Thus the order in which group elements are multiplied is important, and we cannot assume without prior knowledge that a given operation is commutative. Also, the set W is infinite while the set G is finite. The system (W, +) has an algebraic or numerical origin, but (G, ·) has a geometric origin. Other
than the four properties we have noted that they do have in common, there is very little that appears to suggest common properties of the two systems (W, +) and (G, ·). However, it is the aim of group theory to discover those properties which must be held in common by all (or certain classes of) groups.
Example 4.3 H = {0, 1, 2}. Now you know what set is to be used in this example. It remains only to define an operation, which amounts to filling in the following table.

    | 0  1  2
  --+---------
  0 |
  1 |
  2 |

We are using whole numbers for group elements here simply for convenience, since there may not be enough letters in our alphabet to cover all the examples we might wish to consider. But how do we fill in this table? We certainly cannot use ordinary addition or multiplication, for 2 + 2 = 4, and 2 · 2 = 4, and 4 is not an element of H. Under these circumstances our operation could not be closed. One way we might fill in the table is shown below.

    | 0  1  2
  --+---------
  0 | 0  1  2
  1 | 1  2  0
  2 | 2  0  1

At this point you may be a little disturbed, if you are asking yourself what this operation "is." But if you reflect a moment, you will see that you already know everything there is to know about this operation on H; the preceding table gives the answer to every conceivable question about the operation, except possibly the one question that bothers many people at this point. When they ask what this operation "is," they are really asking how it came into being. One possible response is that this question belongs to the realm of philosophy rather than mathematics. But mathematics should be able to answer reasonable questions dealing with the origin and existence of mathematical objects. So if you return to Example 4.2, and consider only the three congruence motions in the set {I, R, S}, you can see that the multiplication table for this subsystem is as follows.

    | I  R  S
  --+---------
  I | I  R  S
  R | R  S  I
  S | S  I  R

So our multiplication table for H = {0, 1, 2} is really the preceding table traveling under an alias; we have simply replaced I by 0, R by 1, and S by 2. So the above question about the origin of Example 4.3 may be answered in this case by demonstrating a geometric origin for the elements of H and their multiplication. It is appropriate to mention here that the other examples of this section, too, as well as all groups, can be thought of as having a geometric origin in much the above sense. There does remain the question of checking Example 4.3 to see that the four properties we are so interested in hold true. Closure, the existence of an identity, and the existence of an inverse for each element of H are easily verified by an examination of the multiplication table. And because we have demonstrated that H is actually of geometric origin, so that the operation in H may be thought of as combination of congruence motions as in Example 4.2, the operation in H is automatically associative. This is perhaps one of the easiest ways to demonstrate associativity; without previous knowledge of the origin of the operation in H, one would otherwise have to check for associativity by verifying that a rather large number of equations hold true in H. For example, one would test 0 · (1 · 2) and (0 · 1) · 2 for equality, and continue with twenty-six other cases. There is, in fact, a test for associativity of such a multiplication table that involves "looking at the table" in much the same way that a table can be visually examined for the other three properties, but it is quite complicated and generally takes longer than almost any alternative. We finally present, very briefly, a few more examples of groups. As you have seen, it is really sufficient to list the elements of the set and give the multiplication table. We will usually do this and no more in our examples.
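The direct check of all twenty-seven cases is tedious by hand but trivial for a machine. The following Python sketch is offered only as an illustration; the names TABLE and is_associative are invented for it, and the table is the one filled in for H = {0, 1, 2} in Example 4.3.

```python
from itertools import product

# Multiplication table of H = {0, 1, 2}: TABLE[x][y] is the product xy.
TABLE = [
    [0, 1, 2],
    [1, 2, 0],
    [2, 0, 1],
]


def is_associative(table):
    """Test (xy)z == x(yz) for every triple; n**3 cases for n elements."""
    n = len(table)
    return all(
        table[table[x][y]][z] == table[x][table[y][z]]
        for x, y, z in product(range(n), repeat=3)
    )


if __name__ == "__main__":
    print("cases checked:", len(TABLE) ** 3)
    print("associative:", is_associative(TABLE))
```

The same function applies to any finite table whose elements are numbered 0, 1, ..., n - 1, though the number of cases grows as the cube of the number of elements.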
Later, when you are making conjectures about properties of groups, you may find it useful to refer to these examples to test your conjectures.
Example 4.4 J = {0, 1, 2, 3}.

    | 0  1  2  3
  --+------------
  0 | 0  1  2  3
  1 | 1  2  3  0
  2 | 2  3  0  1
  3 | 3  0  1  2

Example 4.5 K = {e, a, b, c}.

    | e  a  b  c
  --+------------
  e | e  a  b  c
  a | a  e  c  b
  b | b  c  e  a
  c | c  b  a  e

Example 4.6 Z_n = {0, 1, 2, 3, ..., n - 1}. The product xy is the remainder upon division of x + y by n.
Example 4.7 E = {..., -4, -2, 0, 2, 4, ...}. The operation is ordinary addition.
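The rule of Example 4.6 is easy to turn into a table for any particular n. The short Python sketch below is an added illustration (the function name zn_table is hypothetical); for n = 3 it reproduces the table filled in for Example 4.3.

```python
def zn_table(n):
    """Table for Z_n: the entry in row x, column y is the remainder of x + y on division by n."""
    return [[(x + y) % n for y in range(n)] for x in range(n)]


if __name__ == "__main__":
    for row in zn_table(3):
        print(" ".join(str(entry) for entry in row))
```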
Example 4.8 R+ is the set of all positive real numbers. The operation is ordinary multiplication.
Example 4.9 Let D be a circular disk of radius 1 in the plane. The elements of the group L are to be all possible congruence motions of this disk, including infinitely many different rotations and infinitely many different ways of turning the disk over about a straight line through its center. We could easily name these congruence motions; for example, we could denote by R_30 the motion of rotating the disk 30 degrees counterclockwise, and by T_90 the motion of flipping the disk over about a line making a 90 degree angle with the x-axis. The operation on the elements of L will be the same as in the case of the congruence motions of the equilateral triangle, Example 4.2; namely, the product (R_30) · (T_90) will be that motion which has the net effect of first doing R_30, then T_90, to the disk. In this case the product is T_75, as you can easily verify. Again, for the same reasons as in Example 4.2, this operation is associative. It is only closure which is not so obvious.
Example 4.10 Let a regular tetrahedron in three-dimensional space have its vertices labeled as in Fig. 4.4. The set T is to consist of all the congruence motions of this tetrahedron. We will name these motions in such fashion as to make the effect of each easy to remember, and it will turn out that our labeling method also makes computation of the product of two such motions very simple. First, consider the motion of rotation of the tetrahedron one-third revolution counterclockwise about the altitude passing through the vertex labeled 4. This has the effect of moving the vertex in position 1 to position 2, the vertex in position 2 to position 3, and the vertex in position 3 to position 1. The vertex labeled 4 remains fixed.
Hence we abbreviate this motion by (1 2 3), where it is the order of the symbols that tells us what is going on: By the motion (1 2 3) we mean that "1 goes to 2, 2 goes to 3, and (because here the parentheses close up) 3 goes to 1." The omission of 4 from the symbol (1 2 3) means that the vertex at 4 is left fixed. The symbol (1 4)(2 3) represents another possible congruence motion, the one in which the vertices at 1 and 4 change places while the vertices at 2 and 3 also change places. Caution: There are some symbols that seem meaningful, such as (1 2), which do not represent possible congruence motions of the tetrahedron. As in Examples 4.2 and 4.9, the multiplication we use on this set T of all possible congruence motions of the tetrahedron goes as follows: The product xy of two such motions is that congruence motion which has the net effect of first doing
Fig. 4.4 A regular tetrahedron with vertex positions labeled.
x, then doing y to the tetrahedron. As we mentioned before, our method of labeling these congruence motions makes computation of products particularly simple. If, for example, you wish to find the product (1 2 3)(1 4 3), all you have to do is follow each vertex through each of these motions to find out where it finally ends up. Take the vertex in position 1. After applying (1 2 3) to the tetrahedron, this vertex lies in position 2. Then, after applying (1 4 3), the vertex in position 2 does not move. Hence the net effect of the product (1 2 3)(1 4 3) on the vertex in position 1 is to move that vertex to position 2. So we can begin writing down the answer; the product must look like the expression below:
(1 2 3)(1 4 3) = (1 2 ...
In the incomplete symbol on the right-hand side above, we next want to write down the symbol following the 2; since the product is to tell us where the vertex in position 2 ends up, we follow that vertex through the two motions (1 2 3) and (1 4 3), and find that the vertex in position 2 first moves to
position 3, and then, under the action of (1 4 3), is moved to position 1. We indicate this by closing up the parentheses: (1 2 3)(1 4 3) = (1 2) ... But our work is not yet complete, for it is quite possible that the vertices in positions 3 and 4 are also moved by the product (1 2 3)(1 4 3), so we follow each of them through the two motions (1 2 3) and (1 4 3), and finally discover that the desired answer is
(1 2 3)(1 4 3) = (1 2)(3 4).
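The vertex-chasing procedure just described can be sketched in a few lines of Python. This is only an illustration; the function names cycles_to_map, multiply, and map_to_cycles are invented here, and the cycles passed to cycles_to_map are assumed to be disjoint, as they are in every symbol used in this section.

```python
def cycles_to_map(cycles, points=(1, 2, 3, 4)):
    """Turn a list of disjoint cycles such as [(1, 2, 3)] into a dict position -> position."""
    mapping = {p: p for p in points}  # positions not mentioned stay fixed
    for cycle in cycles:
        for i, p in enumerate(cycle):
            mapping[p] = cycle[(i + 1) % len(cycle)]  # the last entry goes back to the first
    return mapping


def multiply(x, y):
    """Product xy: follow each position through x first, then through y."""
    return {p: y[x[p]] for p in x}


def map_to_cycles(mapping):
    """Write a mapping in cycle notation, omitting the positions left fixed."""
    seen, cycles = set(), []
    for start in sorted(mapping):
        if start in seen or mapping[start] == start:
            continue
        cycle, p = [], start
        while p not in seen:
            seen.add(p)
            cycle.append(p)
            p = mapping[p]
        cycles.append(tuple(cycle))
    return cycles


if __name__ == "__main__":
    x = cycles_to_map([(1, 2, 3)])
    y = cycles_to_map([(1, 4, 3)])
    print(map_to_cycles(multiply(x, y)))
```

Run on the two motions of the worked example, the sketch follows position 1 to 2, position 2 to 1, position 3 to 4, and position 4 to 3, which is the same bookkeeping carried out in the text.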
It turns out that the tetrahedron will admit twelve congruence motions, which we list below; since each product can be computed directly from the symbols involved, it is as unnecessary to give the multiplication table for T as it would be to give the complete multiplication table for the set R+ of Example 4.8. All one needs to know is that
T = {I, (1 2 3), (1 3 2), (1 2 4), (1 4 2), (1 3 4), (1 4 3), (2 3 4), (2 4 3), (1 2)(3 4), (1 3)(2 4), (1 4)(2 3)},
where I stands for the "motion" in which no vertex is moved. As in the previous example, only the closure property is not so obvious. To summarize, in all ten examples we have a set, together with an operation, such that the operation is both closed and associative, the set contains an identity with respect to this operation, and each element of the set has an inverse with respect to this operation. Any such mathematical system is known as a group. And you now have at your disposal an infinite number of different examples, because Example 4.6 is really infinitely many examples. On the other hand, Examples 4.4 and 4.3 are really special cases of Example 4.6, and in some sense Example 4.1 is really the same as Example 4.7.
Exercises
4.1 Verify that Example 4.4 is an example of a group. Note: The associativity provides the only difficulty. This can be established by "realization" of the example as a group of congruence motions, by direct computation, or by use of the division algorithm: If m and n are two natural numbers, then there exist unique integers q and r such that n = qm + r and 0 ≤ r < m. Some careful use of this theorem will produce a proof that the operation of the group given as Example 4.4 is associative, and in fact the same argument can also be used for Example 4.6.
4.2 You saw that Example 4.3 was really in some sense the "same" as the group {I, R, S} of rotations of an equilateral triangle. The sameness comes from the fact that it is possible to rename the elements of one group and thus obtain the other group (the multiplication table, when renamed, must
also correspond to the multiplication table of the other group). Show that, as stated at the end of this section, the groups of Examples 4.1 and 4.7 are really the "same" in the above sense.
4.3 Even though the groups of Examples 4.4 and 4.5 have the same number of elements, they are not the same in the sense of the previous exercise. Why not?
4.4 Let a square be given in the two-dimensional plane. Name each of the eight congruence motions of this square according to some reasonable scheme, and exhibit the multiplication table for this system. (As usual, the product of two motions is to be the motion that has the net effect of first doing one, then the other.) Show that this system is a group.
4.5 Let (G, ·) be an arbitrary group. Show that G cannot contain two different identity elements; that is, if e and f are two elements of G such that
ex = x = xe, and
fx = x = xf,
for all elements x of G, then e = f.
4.6 Let (G, ·) be an arbitrary group. Show that each element of G has only one inverse; that is, show that if x is an element of G, and y and z are elements of G each of which acts like an inverse for x, then y = z. Your first line of the proof might well be: "Let x be an element of G, and suppose that y and z are elements of G such that
yx = e = xy, and
zx = e = xz,
where e denotes the identity of G."
4.7 Show that if x is an element of the group G such that xx = x, then x = e, the identity of G.
4.8 By virtue of Exercise 4.6, we may refer to "the" (rather than "an") inverse of a group element. Let us denote the inverse of x by x^{-1}. Prove that if x is an element of the group G, then (x^{-1})^{-1} = x.
4.9 Let G be a group and w an element of G such that
wx = x
for some (not necessarily all) x in G. Prove that w must be the identity of G.
4.10 Let M = {0, 1}. Find all possible ways to fill in the multiplication table shown below so as to make M into a group.

    | 0  1
  --+------
  0 |
  1 |

4.11 Let H = {0, 1, 2}. Find all possible ways to fill in the multiplication table shown below so as to make H into a group. For simplicity, you should assume that in any case, 0 is to be the identity for the resulting group.

    | 0  1  2
  --+---------
  0 |
  1 |
  2 |

4.12 Verify that the system shown in Example 4.5 actually is a group.
4.13 Of the examples considered in this section, some have a commutative operation (xy = yx is always true) and some do not (xy = yx is sometimes false). List those groups for which the operation is commutative and those for which it is not.
4.14 See Exercise 4.13. Is there a way that one can decide if a group operation is commutative by simply looking at the group's multiplication table?
4.15 Write down the complete multiplication table for the group T given in Example 4.10.
4.16 Show that, in the multiplication table for a group, each element of the group appears exactly once in each row and column (excluding, of course, the guide row and guide column).
4.17 See Exercise 4.16. Fill in the multiplication table shown below in such a way that 0 is an identity element for the operation, each element appears exactly once in each row and column, but the resulting system is not a group. What must go wrong?

    | 0  1  2  3  4
  --+---------------
  0 |
  1 |
  2 |
  3 |
  4 |

4.18 Besides ordinary addition and multiplication, there are numerous other associative operations on the set R of all real numbers. Verify that the operation # defined by
x # y = x + y + 1
for all real numbers x and y is associative. See if you can find three other associative operations on R. We ask that these operations be closed, but not necessarily so chosen as to make R into a group. 4.19 An operation on a set S is said to be cancellative provided that both the following are true:
a) If ax = ay, then x = y.
b) If xa = ya, then x = y.
Show that every group is cancellative. 4.20 An operation on a set S is said to be cross-cancellative provided that, whenever xa = ay, then x = y. Give an example of a group in which the cross-cancellative law does not hold.
4.2 SUBGROUPS
Consider the subset
U = {I, (1 2)(3 4), (1 3)(2 4), (1 4)(2 3)}
of the group T given in Example 4.10. One would generally expect a subset of a group not to be a group in its own right if the same operation were used; one thing that might go wrong would be that the subset does not contain an identity (which would have to be the same as the identity of the original group). However, the subset U shown above does contain the identity of T. Another difficulty could be that the subset is not closed under the operation used in the group. We test U for closure by writing down its multiplication table as follows:

             | I           (1 2)(3 4)  (1 3)(2 4)  (1 4)(2 3)
  -----------+------------------------------------------------
  I          | I           (1 2)(3 4)  (1 3)(2 4)  (1 4)(2 3)
  (1 2)(3 4) | (1 2)(3 4)  I           (1 4)(2 3)  (1 3)(2 4)
  (1 3)(2 4) | (1 3)(2 4)  (1 4)(2 3)  I           (1 2)(3 4)
  (1 4)(2 3) | (1 4)(2 3)  (1 3)(2 4)  (1 2)(3 4)  I

We see immediately from this table that not only is the operation closed with respect to U, but also each element of U has an inverse in U. Since the operation is automatically associative on U, U together with the operation previously defined for T becomes a group in its own right. Under such circumstances U is said to be a subgroup of the group T, and the formal definition is given next. H is said to be a subgroup of the group G provided that H is a subset of G and (H, .) is a group, where the operation is the same as that in the group G.
Since the operation of G, when used on a subset H of G, is automatically associative, in order to show that H is a subgroup of the group G it is only necessary to show that
a) H is a subset of G,
b) the identity e of G belongs to H, and
c) if x and y are elements of H, then so are xy and x^{-1}.
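The three conditions (a), (b), and (c) translate directly into a test that can be run on small examples. The Python sketch below is an added illustration and not part of the text's argument; it assumes motions of the tetrahedron are stored as tuples giving the images of positions 1 through 4, and the names from_cycles, mul, inverse, and is_subgroup are invented for it. The sets T and U are those of Example 4.10 and the beginning of this section.

```python
from itertools import product

POINTS = (1, 2, 3, 4)
IDENTITY = (1, 2, 3, 4)


def from_cycles(*cycles):
    """A motion as the tuple (image of 1, image of 2, image of 3, image of 4)."""
    image = {p: p for p in POINTS}
    for cycle in cycles:
        for i, p in enumerate(cycle):
            image[p] = cycle[(i + 1) % len(cycle)]
    return tuple(image[p] for p in POINTS)


def mul(x, y):
    """Product xy: first do x, then do y."""
    return tuple(y[x[i] - 1] for i in range(4))


def inverse(x):
    """The motion that undoes x."""
    inv = [0] * 4
    for i in range(4):
        inv[x[i] - 1] = i + 1
    return tuple(inv)


def is_subgroup(h, g):
    """Conditions (a), (b), (c): subset, identity, closed under products and inverses."""
    return (
        h <= g
        and IDENTITY in h
        and all(mul(x, y) in h for x, y in product(h, repeat=2))
        and all(inverse(x) in h for x in h)
    )


THREE_CYCLES = [(1, 2, 3), (1, 3, 2), (1, 2, 4), (1, 4, 2),
                (1, 3, 4), (1, 4, 3), (2, 3, 4), (2, 4, 3)]
U = {IDENTITY, from_cycles((1, 2), (3, 4)),
     from_cycles((1, 3), (2, 4)), from_cycles((1, 4), (2, 3))}
T = U | {from_cycles(c) for c in THREE_CYCLES}

if __name__ == "__main__":
    print("U is a subgroup of T:", is_subgroup(U, T))
    print("T is closed, so a subgroup of itself:", is_subgroup(T, T))
```

Applying is_subgroup(T, T) also settles the closure of T, which the text described as the one property that is not so obvious.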
As mentioned in Exercise 4.8, since each group element x has one and only one inverse, we may refer to that inverse with the notation x^{-1}. Note that a group G is always a subgroup of itself, and that the subset {e} of G consisting of the identity e of G alone is also always a subgroup of G. These two subgroups are called improper subgroups of G; all others, if any, are called proper subgroups of G. The order o(G) of a group G is just the number of elements in the set G. If this set should be infinite, we say that the group G has infinite order. For example, o(T) = 12, where T is the group of Example 4.10; the order of its subgroup U discussed previously is 4. The group (W, +) of whole numbers with the operation of addition provides an example of a group of infinite order. A group consisting of a single element (which must necessarily act as its identity) is the only possible way one can have a group of order 1. For each natural number n, Example 4.6 provides an example of a group of order n, namely Z_n. So there is at least one group of each given order. There may be more than one; see Exercise 4.3. If you now check some examples of groups, such as the ones previously given in this chapter, you will discover no exception to the following rule: If G is a finite group and H is a subgroup of G, then o(H) divides o(G) evenly. For example, the group T of Example 4.10 has subgroups of order 1, 2, 3, 4, and 12, and all of these numbers are divisors of 12, the order of T itself. This important principle is known as Lagrange's Theorem, and because it has so many useful consequences we now provide a proof.
Theorem 4.1 (Lagrange's Theorem) If G is a finite group of order n and H is a subgroup of G of order m, then m | n.
Proof. The notation m | n is just a convenient abbreviation for the phrase "m divides evenly into n." Note that m | n is true if and only if there exists a whole number k such that n = mk; that is, if n is an integral multiple of m.
So let G be the group of order n, and let H be a subgroup of G of order m. Now if x is any element of G whatsoever, we can form what is called a left coset of H in G, obtained by multiplying each element of H on the left by x, and thus obtaining a subset of G. Our notation for the left coset of H by x in G will be xH, and an alternate way of stating the above definition is
xH = {xh | h ∈ H}.
We now establish the following things about the left cosets of H in G:
a) H is a left coset of H in G.
b) Any two left cosets of H in G contain the same number of elements.
c) Any two left cosets of H in G are either the same, or else have no elements of G in common.
d) Each element of G belongs to some left coset of H in G.
Consequently, the left cosets of H in G actually have the effect of chopping G up into a number, say k, of nonoverlapping subsets, each of which has the same number of elements as H itself, namely m. Hence n = mk, and hence m | n, as we wish to show. In the next set of exercises, outlines of the methods of establishing the four propositions above will be given, with the details left for you to provide. Contingent upon these four exercises, the proof of Lagrange's Theorem will be completed. To illustrate how this proof works in a special case, let us examine what happens if we use this coset procedure on the group G of Example 4.2. Now o(G) = 6, so we want to show that any subgroup of G has order 1, 2, 3, or 6. Let us use for the subgroup H the set H = {I, R, S}. Pretend you do not immediately see that o(H) = 3, one of the divisors of o(G). First we list the left cosets of H in G as follows:
IH = {II, IR, IS} = {I, R, S}.
RH = {RI, RR, RS} = {R, S, I}.
SH = {SI, SR, SS} = {S, I, R}.
AH = {AI, AR, AS} = {A, C, B}.
BH = {BI, BR, BS} = {B, A, C}.
CH = {CI, CR, CS} = {C, B, A}.
Note how each of the four propositions we need to establish for the proof of Lagrange's Theorem is illustrated. First, H = {I, R, S} is indeed one of the left cosets of H in G. Second, any two of the left cosets of H in G indeed contain the same number of elements. Third, any two left cosets of H in G are either the same or else have nothing in common. Finally, every element of G appears in some left coset of H in G. Since there are exactly two distinct left cosets of H in G in this example, the number of elements in G must be double the number in each coset. In particular, the number of elements of G must be double the number in its subgroup H, and so since o(G) = n is an integral multiple of o(H) = m, we see that m | n is true. Note that Lagrange's Theorem tells us that a group of order 30 (for example) can have subgroups only of order 1, 2, 3, 5, 6, 10, 15, and 30, but not that such subgroups must exist. The group T of Example 4.10 has order 12 but no subgroup of order 6, even though 6 | 12. So when you use Lagrange's Theorem, be sure to use it with care.
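The coset computation for the triangle group can be repeated by machine. The following Python sketch is an illustration only; as an assumption of the sketch, each motion is encoded as a permutation of the positions top, lower left, and lower right, and the names MOTIONS, product, and left_coset are invented here. It collects the distinct left cosets of H = {I, R, S} and confirms that they split the six elements of G into equal, nonoverlapping pieces.

```python
# Positions: 0 = top, 1 = lower left, 2 = lower right.
# MOTIONS[x][i] is where the vertex at position i is carried by the motion x.
MOTIONS = {
    "I": (0, 1, 2), "R": (1, 2, 0), "S": (2, 0, 1),
    "A": (0, 2, 1), "B": (2, 1, 0), "C": (1, 0, 2),
}
NAMES = {perm: name for name, perm in MOTIONS.items()}


def product(x, y):
    """Name of the motion 'first x, then y'."""
    px, py = MOTIONS[x], MOTIONS[y]
    return NAMES[tuple(py[px[i]] for i in range(3))]


def left_coset(x, h):
    """The left coset xH = {xh | h in H}, as a frozenset of names."""
    return frozenset(product(x, y) for y in h)


G = set(MOTIONS)
H = {"I", "R", "S"}
COSETS = {left_coset(x, H) for x in G}  # a set, so repeated cosets collapse

if __name__ == "__main__":
    for coset in COSETS:
        print(sorted(coset))
    print("o(G) =", len(G), "= o(H) x number of cosets =", len(H), "x", len(COSETS))
```

Storing each coset as a frozenset makes RH and SH collapse into the same object as IH, which is proposition (c) of the proof seen in miniature.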
Suppose that G is a group and 9 is one of its elements. By g1 we mean g. Also, by g2 we mean gg, by g3 we mean ggg, and so on; so if n is a natural number, g" is just an abbreviation for the product of g"-1 and g. In analogy to calling our group operation multiplication, and speaking of gg as the product of 9 and g, we call n an exponent in the expression g", and call g" the !,lth power of g. We can also define negative powers of g. If n is a natural number, then by g-" we mean (g-1)". Finally, we convene that gO = e, the identity of G. With these definitions of exponentiation, certain laws of exponents hold; namely, a) gmg " = gm+", b) (gm)" = gm". However, as a general rule, only those laws of exponents hold which involve but one element of G; it is not in general true that if 9 and h are elements of G, then (gh)" = g"h". There is one very useful law of exponents, though, that does hold when two elements of G are involved: c) (gh)-1 = h- 1g- 1. One item worth remembering is that the two laws (a) and (b) above hold for all integral values of m and n, positive, negative, and zero; for example, from (b) one can derive that
$(g^n)^{-1} = g^{-n} = (g^{-1})^n$ for all integral values of n. Suppose we were to pick an element g from a group G, and examine all positive powers of g. For example, in our group G of Example 4.2, the set of all positive powers of R is $\{R^1, R^2, R^3, R^4, R^5, R^6, R^7, \ldots\} = \{R, S, I, R, S, I, R, \ldots\}$.
As another example, in the group R+ of Example 4.8, the set of all positive powers of the element 3 is $\{3^1, 3^2, 3^3, 3^4, 3^5, 3^6, 3^7, \ldots\} = \{3, 9, 27, 81, 243, 729, 2187, \ldots\}$.
When we consider the set of all positive powers of the element g of an arbitrary group G, the set $\{g^1, g^2, g^3, g^4, g^5, g^6, g^7, \ldots\}$ may or may not contain e, the identity of G. In our first example above, the set does contain the identity: $R^3 = I$, the identity of the group of congruence motions of an equilateral triangle. In the second example, in the group R+, no positive power of 3 is equal to 1, the identity of R+.
If no positive power of the element g in G is the identity, then g is said to have infinite order. If some positive power of g is the identity, then there is a least such positive power, and this exponent is called the order of g. So in the above two examples, the order of R is 3, and the element 3 of R+ has infinite order. It should be clear that in a finite group, each element has finite order; an infinite group may contain only one element of finite order (the identity), or may consist entirely of elements of finite order, or may even be mixed. However, in a finite group, the set consisting of all positive powers of an element turns out to be a subgroup; for instance, in the first example above, the set of all positive powers of R is {I, R, S}, which we have already seen is a subgroup of the group G of Example 4.2. Moreover, this subgroup has order 3, the same as the order of the element R which generates it. This is no coincidence; if a group element g has finite order m, then the subset $\{g^1, g^2, g^3, \ldots, g^m\}$ always turns out to be a subgroup of order m; and if the group involved is finite of order n, then m | n by LaGrange's Theorem. Consequently, we have another theorem, with details of the proof left for the exercises.
Theorem 4.2 If G is a finite group of order n and g is an element of G of order m, then m | n.
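Theorem 4.2 is easy to test numerically. The following is a minimal sketch, assuming that the group in question is the integers 0, 1, ..., 11 under addition modulo 12 (so that "powers" of g are the multiples g, 2g, 3g, ...). The function names `order` and `generated` are illustrative choices, not notation from the text.

```python
def order(g, n):
    """Order of g in the integers mod n under addition:
    the least k >= 1 such that k*g is congruent to 0 mod n."""
    k, total = 1, g % n
    while total != 0:
        total = (total + g) % n
        k += 1
    return k

def generated(g, n):
    """The first m 'powers' (here: multiples) of g, where m is the order of g."""
    m = order(g, n)
    return {(k * g) % n for k in range(1, m + 1)}

n = 12  # assumption: a group of order 12, chosen only for illustration
orders = {g: order(g, n) for g in range(n)}

for g, m in orders.items():
    # Each order m divides n, and the generated subset has exactly m elements.
    print(g, m, n % m == 0, sorted(generated(g, n)))
```

Every order that appears divides 12, and each generated subset has as many elements as the order of its generator, as the discussion above predicts.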
In summary, the order of each subgroup of a group of order n, and the order of each element of a group of order n, must always be a divisor of n. These two facts will be found very useful in subsequent exercises.
Exercises
4.21 Find three different subgroups of the group (W, +) of Example 4.1. Does this group contain any subgroups of finite order?
4.22 Find all the subgroups of the group G of Example 4.2. Give the order of each subgroup of G and the order of each element of G.
4.23 Does the group G of Example 4.2 consist of all positive powers of some one of its elements? Why?
4.24 Does the group H of Example 4.3 consist of all positive powers of some one of its elements? Explain.
4.25 Give the order of each element of the group J of Example 4.4, and the order of each element of the group K of Example 4.5. Does this information provide a possible method of working Exercise 4.3?
4.26 What is the order of the identity e of a group G? Do any other elements of G have the same order as e?
4.27 Does the group L of Example 4.9 contain elements of finite order other than its identity? How many? Does the group L of Example 4.9 contain any elements of infinite order? How many?
4.28 Does the group L of Example 4.9 contain any subgroups of order two? How many? Does L contain a subgroup of order three? Four? Any natural number? Does L contain any subgroups of infinite order in addition to L itself?
4.29 Give the order of each element of the group T of Example 4.10. Is every divisor of o(T) = 12 represented among these numbers?
4.30 Let G be a group and g an element of G of order n. Show that, for each element x of G, $x^{-1}gx$ also has order n. Warning: It is not generally true that $(x^{-1}gx)^n = (x^{-1})^n g^n x^n$. Another warning: Just because $y^n = e$, it does not immediately follow that y has order n.
4.31 Find an example of a group G in which the equation $(gh)^2 = g^2h^2$ does not hold for all elements g and h of G.
4.32 Let G be a group in which, for each two elements g and h of G, $(gh)^2 = g^2h^2$. Prove that G must be commutative.
4.33 Let G be a group with identity e such that $x^2 = e$ for all elements x in G. Prove that G must be commutative.
4.34 Let G be a group with subgroups H and K. Prove that $H \cap K$ is also a subgroup of G.
4.35 Can a group G contain exactly one element of order 3?
4.36 Let G be a group and a and b elements of G. Suppose that the order of ab is finite. Prove that ba must then have the same order as ab.
4.37 Let G be a group and a and b elements of G, neither the identity of G. Suppose that $a^5 = e$ and $aba^{-1} = b^2$. What is the order of b?
4.38 Let G be a commutative group and let H be the set of all elements x of G such that $x^2 = e$. Prove that H is a subgroup of G.
4.39 Prove that if a, b, and c are elements of a finite group, then the elements abc, bca, and cab have the same order.
4.40 Find all the subgroups of the group T of Example 4.10. What is the order of each of these subgroups? Is every divisor of o(T) = 12 represented among these orders?
4.41 This is how to show part (a) of the proof of LaGrange's Theorem: If G is a group and H is a subgroup of G, then H itself is a left coset of H in G. It suffices to find an element x of G such that xH = H. What is a reasonable candidate for x? Show that your candidate works.
4.42 This is how to show part (b) of the proof of LaGrange's Theorem: Any two cosets of the subgroup H in the finite group G contain the same number of elements. Let xH and yH be two left cosets of H in G (of course, x and y are elements of G). Then a typical element of xH has the form xh where h is some element of H. Let $p = yx^{-1}$. Simplify pxh. Show that pxh is an element of yH. Show that the set yH can be obtained by applying p as a left multiplier to each element of xH. Show that this transformation of xH into yH by application of p as a left multiplier actually sets up a one-to-one correspondence between the elements of xH and yH. It follows that xH and yH have the same number of elements.
4.43 This is how to show part (c) of the proof of LaGrange's Theorem: Any two left cosets of the subgroup H in the group G are either the same, or else have no elements of G in common. Let xH and yH be two such cosets. If xH and yH have no elements in common, we have no more to prove. Suppose then that xH and yH have some element of G, say g, in common. Then g must have the form $xh_1$ for some element $h_1$ of H; but g must also have the form $yh_2$ for some element $h_2$ of H. Hence $xh_1 = yh_2$. From this equation, derive the equation $y^{-1}x = h_2h_1^{-1}$. Then show that $y^{-1}x$ must be an element of H. Table this result for a while. Now, to show that xH = yH, choose a typical element, say xh (with $h \in H$) of xH, and prove that xh must belong also to yH. Use the fact that $(y^{-1}x)H = H$, which follows from the fact that $y^{-1}x \in H$ (why?). Of course, a similar proof can be used to show that each element of yH is an element of xH, and it follows that xH = yH. Hence if two left cosets of H in G overlap, they must be equal.
4.44 This is how to show part (d) of the proof of LaGrange's Theorem: Each element of G belongs to some left coset of the subgroup H in G. Let g be an arbitrary element of G. What is a reasonable candidate for the left coset xH to which g might belong? Show that, for your choice of x, g actually does belong to xH.
4.45 Show that, in a group G with elements g and h, $(gh)^{-1} = h^{-1}g^{-1}$.
4.46 Show that each element of a finite group has finite order.
4.47 Let G be a finite group, and g an element of G of order m. Let H consist of the first m powers of g; that is, $H = \{g^1, g^2, g^3, \ldots, g^m\}$. Show that H must be a subgroup of G. Hint: You may use the fact that for each natural number n, the group $Z_n$ of Example 4.6 actually is a group.
4.48 With respect to the previous exercise, show in addition that o(H) = m. Hint: Clearly, o(H) ≤ m. What would happen if o(H) < m?
4.49 Let G be a finite group. Show that there exists a fixed natural number k such that, for each element g of G, $g^k = e$.
4.50 Let G be a finite group of even order. Prove that G must contain an element of order 2.
4.3 CYCLIC GROUPS AND ABELIAN GROUPS
The group G is said to be cyclic provided that G contains an element g such that each element of G is a power of g. With respect to this definition, one is allowed to use all integral powers of g, positive, negative, and zero. For example, the group (W, +) of Example 4.1 is cyclic, because every element of W is a "power" of 1. Here, because of the additive rather than multiplicative notation, we interpret $1^n$ as $n \cdot 1$, for $g^n$ means iteration of the group operation, using the element g, n times. So in the case of the element 1 of the group (W, +), $1^n$ would mean the sum of n 1's: $1 + 1 + 1 + \cdots + 1 = n \cdot 1$. If G is a cyclic group, then an element g of G whose powers produce all elements of G is called a generator of G, and G is said to be generated by g. It is usually the case that a cyclic group has more than one generator. The group (W, +) contains another generator in addition to 1; what is it? The group G of Example 4.2 is not cyclic, as you were asked to verify in Exercise 4.23. Of the two groups J and K of Examples 4.4 and 4.5, one is cyclic and one is not; which is which? See also Exercise 4.25. The group G is said to be Abelian if the operation in G is commutative; that is, if gh = hg for all elements g and h of G. Numerous examples of Abelian groups have already been examined. See Exercises 4.13, 4.32, and 4.33. Our next theorem is sometimes quite useful.
Theorem 4.3 Every cyclic group is Abelian.
Proof Let x and y be elements of the cyclic group G, and let g be a generator of G. Then, by definition, there exist integers m and n such that $x = g^m$ and $y = g^n$. Hence $xy = (g^m)(g^n) = g^{m+n} = g^{n+m} = (g^n)(g^m) = yx$.
Therefore G is Abelian. As is frequently the case in group theory, the converse of this theorem does not hold. That is, there do exist Abelian groups which are not cyclic. However, there is one very important class of groups which must be cyclic merely as a consequence of their order. Recall that a positive whole number p is said to be prime if p has exactly two natural number divisors, itself and 1. For example, the first ten prime numbers are 2, 3, 5, 7, 11, 13, 17, 19, 23, 29.
Theorem 4.4 Every group of prime order is cyclic.
Proof Let G be a group of prime order p. Since p ≥ 2, G contains at least one element g not equal to the identity e of G.
Let H be the subgroup of G generated by g. That is, $H = \{g^1, g^2, g^3, \ldots, g^n\}$, where, as we have seen, n is the order of g. Since g ≠ e, g has order at least 2. But the order n of g must be a divisor of p = o(G), by Theorem 4.2. Since p is prime, its only divisors are 1 and p, and hence n = p. But H also has order n = p, and hence H is a subset of G containing as many elements as G itself does. Hence H = G. So, since each element of H is a power of g, so is each element of G. Therefore by definition, G is cyclic. As usual, the converse of this theorem does not hold. It does not follow that a cyclic group must be of prime order.
Exercises
4.51 Is every subgroup of (W, +) cyclic?
4.52 Show that the group $Z_n$ of Example 4.6 is cyclic for each natural number n.
4.53 Show that if a group G has prime order, then G is generated by each of its elements other than the identity. Hint: Look at the proof of Theorem 4.4.
4.54 Let G be an Abelian group, n an integer, and g and h elements of G. Show that $(gh)^n = g^n h^n$.
4.55 Let G be an Abelian group and H the subset of G consisting of all elements of G of finite order. Show that H must be a subgroup of G. Hint: Use the previous exercise.
4.56 Let G be a finite Abelian group and let k be a fixed natural number. Let H be the set of all kth powers of all elements of G. Prove that H must be a subgroup of G.
4.57 Let G be a group and n a natural number such that, for all elements x and y of G, $(xy)^n = x^n y^n$, $(xy)^{n+1} = x^{n+1}y^{n+1}$, and $(xy)^{n+2} = x^{n+2}y^{n+2}$. Prove that G must be Abelian.
4.58 Prove that every subgroup of a cyclic group is cyclic. Hint: Let G be a cyclic group and H a subgroup of G. Let g be a generator of G. Then each element of H is a power of g. Some positive power of g must appear in H (unless H = {e}: but then H is cyclic); let k be the least positive integer such that $g^k \in H$. Prove that $g^k$ generates H.
4.59 Let G be a group. The center Z of G is the set of all elements z of G such that zg = gz for all $g \in G$. Prove that Z is a subgroup of G.
4.60 Let G be a finite group and g a fixed element of G. Let N(g) be the set of all elements x of G such that gx = xg. Prove that N(g) is a subgroup of G.
4.61 This is difficult. A semigroup consists of a set together with a closed and associative binary operation; the set must be nonempty. Prove that a finite cancellative semigroup is a group. See Exercise 4.19.
4.62 Suppose that G is a finite group with no proper subgroups. Prove that G must have prime order. Warning: The converse of LaGrange's Theorem is false.
4.63 Let G be a group, H a subgroup of G, and g a fixed element of G. Let $g^{-1}Hg$ consist of all elements of G of the form $g^{-1}hg$, where h is an element of H. That is, $g^{-1}Hg = \{g^{-1}hg \mid h \in H\}$. Prove that $g^{-1}Hg$ must also be a subgroup of G.
4.64 Let G be a group and g and h elements of G. Prove that the equation xgx = h has a solution x in G if and only if $gh = a^2$ for some element a in G. Warning: Group G is not necessarily Abelian.
4.65 Suppose that g is the only element of order two in the group G. Prove that g must belong to Z, the center of G. See Exercise 4.59.
4.66 Let G be a group containing elements x and y such that $xy^2 = y^3x$ and $yx^2 = x^3y$. Prove that x = e = y.
4.67 The subgroup H of the group G is said to be normal in G provided that xH = Hx for all x in G. (Of course, $Hx = \{hx \mid h \in H\}$.) Prove that the center Z of a group G is a normal subgroup of G. See Exercise 4.59.
4.68 Let G be a finite group of order 2n and suppose that G contains a subgroup H of order n, where n is a natural number. Prove that H is a normal subgroup of G. See Exercise 4.67.
4.69 Let G be a finite group of order 2n and suppose that G contains a subgroup H of order n, where n is a natural number. Prove that H contains all the elements of G of odd order.
4.70 Prove that the equation $x^2ax = a^{-1}$ has a solution x in the group G if and only if $a = g^3$ for some element g of G.
4.71 Let A be a subset of the finite group G and suppose that A contains more than half the elements of G. Prove that each element of G is the product of two elements of A.
4.72 Prove that if H and K are two normal subgroups of the group G and have only the identity of G in common, then every element of H commutes with every element of K. That is, if $h \in H$ and $k \in K$, then hk = kh. See Exercise 4.67.
4.73 Let G be a finite group of order n, where n is not 1 and n is not prime. Prove that G must contain at least one proper subgroup.
4.74 (Azriel Rosenfeld) Prove that a group with only finitely many subgroups must be finite.
4.75 (Mira Bhargava) Prove that the union of two subgroups of a group is itself a subgroup if and only if one of the two original subgroups contains the other.
4.76 (F. M. Sioson) Prove that any semigroup in which $x^2y = y = yx^2$ for all its elements x and y must be a group. See Exercise 4.61.
4.77 (W. A. McWorter) Let G be a group of order $n^2$ and H a subgroup of G of order n. Prove that for each element x of G, $H \cap x^{-1}Hx$ contains more than one element. See Exercise 4.63.
4.78 (Michael Gemignani) Let G be a group with identity e and A a subgroup of G such that $(G - A) \cup \{e\}$ is also a subgroup of G. Prove that either A = {e} or A = G. Note: G - A is the subset of G consisting of those elements of G not belonging to A.
4.79 (Alan Schwartz) Let G be a finite Abelian group of order n. Then n is odd if and only if each element of G is a square. (The element g of G is a square if $g = h^2$ for some $h \in G$.)
4.80 (Erwin Just and Mary R. Embry) Let G be a group in which no element has order 2. Prove that if $(xy)^2 = (yx)^2$ for all elements x and y of G, then G must be Abelian.
4.81 (W. A. Donnell) Let S be a semigroup satisfying the cross-cancellation law. Prove that S is both Abelian and cancellative. Need S be a group? See Exercises 4.20 and 4.61.
4.82 This and the next three exercises deal with one of the major theorems about homomorphisms. A homomorphism $\varphi: G \to H$ is just a function $\varphi$ from the group G to the group H that "preserves the group operations"; that is, for each two elements $g_1$ and $g_2$ of G, $\varphi(g_1g_2) = [\varphi(g_1)][\varphi(g_2)]$. Of course, the operation between the square brackets is the group operation in H. The image I of such a homomorphism $\varphi: G \to H$ is the set of elements h of H such that $h = \varphi(g)$ for some $g \in G$. Show that, in the above notation, I is a subgroup of H.
4.83 Continuing the previous exercise, we define the kernel K of a homomorphism $\varphi: G \to H$ to be the set of all elements g of G such that $\varphi(g) = e$, where e denotes the identity of H. Show that, in the above notation, K is a normal subgroup of G.
4.84 Continuing the two previous exercises, let $\varphi: G \to H$ be a group homomorphism with kernel K. Let G/K denote the collection of left cosets of K in G. For two such left cosets xK and yK, we define an operation on the set G/K as follows: $(xK)(yK) = (xy)K$. Show that G/K together with this operation becomes a group.
4.85 Finally, in the notation of the previous exercises, let $\theta: G/K \to I$ be defined according to the rule $\theta(xK) = \varphi(x)$, where I denotes the image of $\varphi$. Show that $\theta$ is then also a homomorphism, and that in fact $\theta$ is both one-to-one and onto (see Chapter 7 for definitions of the last two terms).
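The objects in Exercises 4.82 through 4.85 can be made concrete on a small example before attempting the proofs. The sketch below is an illustration under stated assumptions, not a solution: it takes both G and H to be the integers modulo 12 under addition, and uses the hypothetical map `phi(x) = 3x mod 12`. The names `phi`, `coset`, and `theta` are introduced here only for this purpose.

```python
n = 12
G = range(n)  # assumption: G = H = integers mod 12 under addition

def phi(x):
    """A sample homomorphism from G to H: multiplication by 3, mod 12."""
    return (3 * x) % n

# phi preserves the group operation (addition mod 12).
is_homomorphism = all(
    phi((a + b) % n) == (phi(a) + phi(b)) % n for a in G for b in G
)

image = {phi(x) for x in G}                # the image I
kernel = {x for x in G if phi(x) == 0}     # the kernel K (0 is the identity of H)

def coset(x):
    """The left coset x + K, written additively."""
    return frozenset((x + k) % n for k in kernel)

cosets = {coset(x) for x in G}             # the collection G/K

# theta(xK) = phi(x); record the set of values phi takes on each coset.
theta = {c: {phi(x) for x in c} for c in cosets}

print("homomorphism:", is_homomorphism)
print("image:", sorted(image))
print("kernel:", sorted(kernel))
print("number of cosets:", len(cosets))
```

In this example phi takes a single value on each coset, so the rule for theta is unambiguous, and the number of cosets equals the number of elements of the image. A check on one example is of course no substitute for the general argument the exercises ask for.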
NOTES AND REFERENCES
There are so many excellent texts and references on group theory that it would be almost a hopeless task to list them all. A few are given below for the convenience of the reader; however, most of those listed are fairly advanced.
Birkhoff, G., and S. MacLane, A Survey of Modern Algebra (Macmillan, 1963).
Hall, M., The Theory of Groups (Macmillan, 1959).
Herstein, I. N., Topics in Algebra (Blaisdell, 1964).
Kurosh, A. G., The Theory of Groups (two volumes, translated by K. A. Hirsch; Chelsea, 1955).
Jacobson, N., Lectures in Abstract Algebra, volume I (van Nostrand, 1951).
MacLane, S., and G. Birkhoff, Algebra (Macmillan, 1967).
Passman, D. S., Permutation Groups (Benjamin, 1968).
I wish to thank the editors of the American Mathematical Monthly for permission to use several problems submitted as part of the regular problems section of that journal. These are Exercises 4.74 (E 1522, Vol. 69, 1962), 4.75 (E 1592, Vol. 70, 1963), 4.76 (E 1629, Vol. 70, 1963), 4.77 (E 1874, Vol. 73, 1966), 4.78 (E 1764, Vol. 72, 1965), 4.79 (E 1794, Vol. 72, 1965), 4.80 (E 1996, Vol. 74, 1967, and solution in Vol. 75, 1968), and 4.81 (E 2007, Vol. 74, 1967).
Joseph Louis LaGrange, who first formulated and proved the theorem that bears his name, was born in Turin in 1736. He was recognized as an exceptionally able mathematician while still in his teens, and many consider him second only to Euler of the mathematicians of the eighteenth century. He solved a large number of significant and difficult problems, was one of the first to insist on accuracy in mathematical proofs, and late in his life he became known as a great teacher of mathematics. His major researches, curiously enough, were not in group theory, but in mechanics, the calculus of variations, and number theory.
CHAPTER 5 POLYHEDRA
In this chapter we shall be concerned mostly with the numerical relations that must exist between the numbers of faces, edges, and vertices of a solid polyhedron in three-dimensional space, and with the connection between such relations and the mapmaker's problem of coloring each face of such a figure so that faces with a common boundary are colored differently. Although the material of this chapter is much simpler than much of that in the previous four, it is unusual and surprising that under such circumstances we shall find ourselves on the very edge of the mathematically unknown. For example, suppose you are given a map of connected countries on a large island, and you wish to color different countries in such a way that countries with a common boundary have different colors. It has been proved that five colors are always sufficient, but no one has ever been able to construct such a map in which all five colors are actually required. Since it is easy to construct a map that requires four colors, one sees that the answer to the question of how many colors are needed is either "four" or "five"; but although amateur and professional mathematicians have worked on this problem for over a century, no one has yet discovered which is the correct answer. Apparently the answer has something to do with Euler's Formula, which gives one numerical relation that must exist between the numbers of vertices, edges, and faces of a polyhedral solid; "apparently," because the proof that five colors are always sufficient in the case mentioned above does require Euler's Formula. Hence we begin with some introductory material about polyhedra.
5.1 THE DEFINITION OF POLYHEDRON
A peculiarity of much of mathematics is its instability under slight changes. A theorem that is true may become false with only the slightest alteration in its wording; it is even possible that a change in the order of two clauses
may alter the truth of the statement of the theorem. Hence it is quite important that the things about which these theorems are proved are defined precisely, not out of any innate desire for precision, but from the need for accuracy. Although you may be quite sure that you know what is meant by a "solid polyhedron in three-dimensional space," someone else may well have a definition that encompasses a wholly different set of objects. So the problem arises of formulating a definition of such objects, merely for purposes of communication alone. Of course, it would be preferable if the resulting definition were exactly in accord with every man's preconceived notion of the term defined, but this is not essential and is frequently impossible. For example, you may well feel that a polyhedron ought to be connected (that is, come in a single lump rather than several), but you could easily imagine how one could defend an alternative definition in which polyhedra were not required to be connected. If there were in the literature of mathematics a single and well-established definition of the term "polyhedron," it would be the one used here; but since there is not, we may define this term as we choose so long as, for practical reasons, our definition is reasonably close to the commonly accepted meaning of the term. So we choose a definition close to the common meaning, but suited to the purposes of this chapter. What we seek is a way of defining a solid three-dimensional object, a subset of three-dimensional space bounded by "flats" with straight edges, and connected so as to assure that the object "comes in a single piece." Since we must begin with some primitive or undefined terms, we assume as given and understood the terms "point," "line segment," and "plane" of ordinary euclidean geometry. Part of the definition of polyhedron will involve the notion of a polygon, so first we rephrase the definition of "polygon" as given in Chapter 1.
A polygon is a plane figure of finite area bounded by a finite number of straight-line segments with the property that each endpoint of each line segment is in fact the endpoint of exactly two of the line segments, and such that each two such line segments meet, if at all, in a single common endpoint of each. Finally, each polygon is to be connected, and so is the boundary of each polygon. The condition that the polygon be connected means that
a) For each two points x and y within the polygon, there is a polygonal line L with endpoints x and y lying entirely within the polygon.
And the condition that the boundary of each polygon be connected means that
b) For each two points x and y on the boundary of the polygon, there is a polygonal line L with endpoints x and y lying entirely within the boundary of the polygon.
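One clause of this definition, that each endpoint is the endpoint of exactly two boundary segments, is simple enough to test by counting. The sketch below checks only that one clause; it says nothing about finite area, crossing sides, or connectedness. It assumes, for illustration, that a boundary is given as a list of segments, each a pair of distinct points with coordinates, and the name `endpoints_ok` is hypothetical.

```python
from collections import Counter

def endpoints_ok(segments):
    """True if every endpoint is the endpoint of exactly two segments.

    Assumes each segment is a pair of distinct points (tuples of coordinates).
    """
    counts = Counter(point for segment in segments for point in segment)
    return all(count == 2 for count in counts.values())

# A unit square: every corner is an endpoint of exactly two sides.
square = [((0, 0), (1, 0)), ((1, 0), (1, 1)),
          ((1, 1), (0, 1)), ((0, 1), (0, 0))]

# Two triangles sharing only the vertex (0, 0): that point is an
# endpoint of four segments, so the clause fails.
pinched = [((0, 0), (1, 1)), ((1, 1), (1, -1)), ((1, -1), (0, 0)),
           ((0, 0), (-1, 1)), ((-1, 1), (-1, -1)), ((-1, -1), (0, 0))]

print(endpoints_ok(square))   # True
print(endpoints_ok(pinched))  # False
```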
Fig. 5.1 The exterior of a polygonal boundary is not a polygon.
To understand this definition fully, you should determine exactly what sorts of polygon-like objects are ruled out by this definition. First, the condition that a polygon have finite area prohibits such an object as that shown in Fig. 5.1 from being a polygon. Next, the condition that each endpoint of each line segment be, in fact, the endpoint of exactly two line segments is really a condition on the boundary of the polygon (by which we mean the union of that collection of boundary segments) and will prevent an object such as that shown in Fig. 5.2 from being a polygon. That each two line segments meet, if at all, in a common endpoint of each eliminates the set shown in Fig. 5.3 as a possible polygon. In this figure the two crossing sides are understood to cross at a point not an endpoint of either side. The condition that the polygon be connected will prevent the polygonal-like object shown in Fig. 5.4 from being a single polygon: this example is the union of three polygons. The last condition, that the boundary be connected, will prevent a polygon from having a hole in it, as shown in Fig. 5.5. This consequence is a special case of the Schoenflies Theorem, in which it is proved that a simple closed curve in the two-dimensional plane is the boundary of a topological disk (a "topological disk" is one that can be continuously deformed into a circular disk). It turns out to be surprisingly difficult to prove this "obvious" theorem, and it is beyond the scope of this text even to attempt to prove it for the special case of a polygonal simple closed curve.
Fig. 5.2 The figure is not a polygon because one point lies on more than two boundary segments.
Fig. 5.3 The figure is not a polygon because two sides intersect in a point not a common endpoint.
Fig. 5.4 The figure is not a polygon because it is not connected.
Fig. 5.5 The figure is not a polygon because the boundary is not connected.
Fig. 5.6 A slight alteration to make the boundary connected.
These polygons, as defined above, are to be the faces of our polyhedral solids. We could actually alter the definition and allow polygons to have polygonal holes in them, but over and over again in the subsequent proofs we would have to convert each such polygon into one fitting the above definition by slight alterations in its structure, of the sort shown in Fig. 5.6. Now by a polygon we will mean a plane figure as defined above together with its boundary, so in future use of the term "polygon" it will be permissible for a subset of a polygon to lie partly or wholly on the boundary of the polygon. Using the term "polygon" as defined above, we can now define exactly what is meant by a polyhedron, or polyhedral solid. A polyhedron K is a subset of three-dimensional space such that a) K has positive but finite volume; b) each two points of K can be joined by a polygonal line lying entirely within (or partly on the boundary of) K; and c) the boundary of K is the union of finitely many polygons, such that each two of these polygons meet, if at all, in a single common vertex or a single common edge; and such that each three such polygons meet, if at all, in a single common vertex. Condition (c) of this definition is meant to prevent such an object as that shown in Fig. 5.7 from being a polyhedron; however, it will be permissible for a polyhedron to have a hole in it, as shown in Fig. 5.8.
Fig. 5.7 The figure is not a polyhedron because more than two faces have an edge in common.
Fig. 5.8 A polyhedron with a hole running all the way through.
Fig. 5.9 The polygonal closed curve in the boundary does not separate the boundary.
The vertices of the polygons forming the boundary of the polyhedron K are called the vertices of K; the edges of the polygons are called the edges of K; the polygons themselves are called the faces of K. In some of the theorems which follow we shall need an additional hypothesis that will assure us that our polyhedra have no holes in them. Imagine a large cube with a small cube removed from its interior. We could prevent this phenomenon by requiring that the boundary of each polyhedron be connected; that is, each two points of the boundary could be joined by a polygonal line lying entirely within the boundary. But this condition would not prevent a hole running all the way through the polyhedron, as in Fig. 5.8. If we need to exclude such a possibility, so as to consider only such polyhedra as could be molded from a cube of clay without cutting or poking holes, we could add the following condition: that each polygonal closed curve in the boundary of a polyhedron separates the boundary into two parts. The polyhedron shown in Fig. 5.8 does not have this last property; Fig. 5.9 shows a polygonal closed curve in its boundary that does not separate the boundary. Since we may need both the conditions we have been discussing, in order to prevent an internal hole or a hole all the way through, we combine them into a single definition.
The connected polyhedron K is said to be 2-connected provided that K has connected boundary and each closed polygonal curve on the boundary of K separates the boundary of K into two parts. By virtue of condition (b) of our definition of polyhedron, each time we use the term "polyhedron" we mean one which is connected; but we refer to a connected polyhedron in the above definition to emphasize that this definition is unambiguous only for connected polyhedra. In any case, a 2-connected polyhedron is what most people think of as a "polyhedron"; the five regular solids, for example, are each 2-connected. A large piece of Swiss cheese, even with polyhedral holes, is not.
Exercises
5.1 Draw a map of connected countries requiring four colors for a "proper" coloring, that is, one in which two countries that have a boundary segment in common must be colored differently.
5.2 Is there a solution to the previous exercise in which only four countries are drawn? Why?
5.3 Can each of the countries drawn in Exercise 5.1 be shaped like a rectangle? Explain your answer.
5.4 Can each of the countries drawn in Exercise 5.1 be square? Give a reason for your answer.
5.5 Suppose that a map of countries is drawn in which each vertex lies on the boundary of exactly four countries. How many colors are needed for a proper coloring of such a map? Can you prove this?
5.6 Draw a map of connected countries on the surface of a sphere; use a tennis ball if you wish. Consider each country to be a "face," each boundary segment between two vertices to be an "edge," and each point where three or more boundary segments meet to be a "vertex." Let F be the number of faces, E the number of edges, and V the number of vertices. Evaluate the number V - E + F. Repeat this experiment a number of times.
5.7 Repeat the previous exercise, but use a torus (a figure shaped like a doughnut) instead of a sphere. Use only countries that have connected boundaries.
5.8 What happens in the previous exercise if some countries do not have connected boundaries?
5.9 Why was it required in the definition of "polyhedron" that a polyhedron have finite volume?
5.10 A subset of three-dimensional space is said to be convex provided that each two points of the set can be joined by a straight-line segment lying entirely within (or on the boundary of) the subset. Must a convex polyhedron be 2-connected? Must a 2-connected polyhedron be convex? Explain.
5.1
The definition of polyhedron
117
Fig.5.10 A Mobius strip.
5.11 If countries are not required to be connected (such as Pakistan) but the mapmaker still wishes to have all parts of a given country colored the same color, the solution to the coloring problem becomes more complicated. Suppose that you know that each country comes in no more than two connected parts-can you construct a map of such countries in the plane requiring eight colors for a proper coloring? Ten? More? 5.12 Find the minimum number of colors necessary to properly color each of the regular solids, considering each face as a different country. 5.13 A Mobius strip is formed by cutting a long rectangular piece of paper, and then joining the short sides not in the expected fashion, but after giving the strip a half twist. An example is shown in Fig. 5.10. Can you draw a map of connected countries on a Mobius strip such that you need five colors to color the map properly? 5.14 What happens to the mapmaker's problem in three dimensions? Consider the "countries" to be three-dimensional solids, such as rectangular polyhedra. We require that two "countries" be colored differently if they have part of a boundary face in common. Can you construct a "map" of such countries requiring six colors for a proper coloring? Eight? Ten? More? 5.15 Repeat Exercise 5.11, but allow the countries to come in arbitrarily large numbers of pieces. For example, you may have one country in two pieces, a second in three, a third in four, and so on. Is there any upper limit to the number of colors needed to color all such maps?
5.2 EULER'S FORMULA
Let K be a 2-connected polyhedron and let V, E, and F denote respectively the number of vertices, edges, and faces of K. The Swiss mathematician L. Euler discovered a simple relation that must always hold between the numbers V, E, and F: V - E + F = 2. This is what you should have discovered in working Exercise 5.6. It is the aim of this section to provide a proof of this relation, known as Euler's Formula.
Since this formula has nothing to do with the interior of the polyhedron K, we immediately forget its existence, and consider only the boundary of K, as composed of a number of vertices, edges, and faces. Choose a face of K that you particularly like (or dislike), and remove this face from K, leaving only the edges that it had in common with other faces of K. Since K is 2-connected, it is now possible to deform the cup-shaped figure composed of the remaining boundary polygons in such a way that it lies in a plane, without changing the number of vertices or edges or the way they are connected. The remaining faces of K thus become polygons in the plane, bounded by the same edges as before; of course, the new faces cannot remain congruent to the old ones, nor can the edges retain their original lengths in this deformation. Figure 5.11 shows a cube before such a deformation, and Fig. 5.12 shows the resulting plane figure, which is sometimes known as the net of K.
The net of K may be thought of as a number of polygons, some of which may be triangles, but some of which may not. We need next to convert all these polygons into triangles. But first we count the number of vertices, edges, and faces in the net, denoting these numbers by V, E, and F, respectively, and form the expression V - E + F. Note that V and E are the same for K and the net of K, while K has one more face than its net since the unbounded exterior domain of the net is not counted as a face. In Chapter 1 we provided a proof that any plane polygon can be triangulated without introducing additional vertices. In this proof, straight-line segments were drawn in the polygon, joining various pairs of its vertices, until the polygon had been completely triangulated. If such a line segment is drawn in one of the polygons in the net of K, thus joining two vertices of some polygon of the net, the value of E will be increased by one. But the value of F is simultaneously increased by one as well, and consequently the value of V - E + F will be unchanged by this process. Thus we proceed to triangulate each polygon in the net of K, and when we are done, though the values of E and F will generally change, the value of the expression V - E + F is not altered.
Fig. 5.11 The boundary of a cube in three-dimensional space.
Fig. 5.12 After one face of the cube is removed, its boundary can be deformed so as to lie in a plane.
Fig. 5.13 The net of the cube is now completely triangulated.
In Fig. 5.13 we show the net of K completely triangulated; we are continuing with the cube as the model for this proof. We now attack this collection of triangles with an eraser. Our aim is to erase all the triangles but one, one at a time, without changing the value of V - E + F (though we are certainly going to change the values of the individual terms). In the erasing process, we also want to have a connected net of triangles at each stage. Two types of erasing will be needed. Figure 5.14 shows that one can erase a single edge, thus removing one edge and one face. The line to be erased is shown as a dashed line in that figure. Since this erasure decreases the value of each of E and F by 1, the value of V - E + F is unchanged. Figure 5.15 shows an example of the other type of erasing needed; one can also erase a vertex and two edges, shown as dashed lines in that figure. This second type of erasing decreases each of V and F by 1 and simultaneously decreases E by 2. Again, the value of V - E + F is unchanged. At each stage of the erasing process, at least one of these two types of erasures can be performed. We take care only to erase vertices and edges lying on the "outside" of the net, so that the net remains connected at each stage. Since at each erasure the value of F is decreased by 1, the process can be continued until after a finite number of steps the value of F becomes 1 and a single triangle is all that remains.
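The bookkeeping in this argument can be checked mechanically. The following Python sketch tracks only the three counts for the cube of Figs. 5.11 through 5.13; it does not model any geometry, and the order in which the two kinds of erasure are applied here is an arbitrary assumption made for illustration, not the order one would be forced to use on paper.

```python
# Bookkeeping sketch for the triangulate-and-erase argument, using the cube.
# Only the counts V, E, F are tracked; no geometry is modeled.

def euler(v, e, f):
    """Return the value of V - E + F."""
    return v - e + f

# Net of the cube after one face is removed: 8 vertices, 12 edges, 5 squares.
v, e, f = 8, 12, 5
start = euler(v, e, f)

# Triangulate: one diagonal per square adds one edge and one face.
for _ in range(5):
    e, f = e + 1, f + 1
    assert euler(v, e, f) == start

# Erase down to a single triangle (V = 3, E = 3, F = 1).
# Every erasure removes one face; only the second type removes a vertex,
# so exactly v - 3 erasures of the second type are needed.
second_type = v - 3
first_type = (f - 1) - second_type

history = []
# Assumption for illustration: the order of erasures below is arbitrary.
for kind in ["vertex"] * second_type + ["edge"] * first_type:
    if kind == "edge":          # first type: one edge and one face disappear
        e, f = e - 1, f - 1
    else:                       # second type: one vertex, two edges, one face
        v, e, f = v - 1, e - 2, f - 1
    history.append(euler(v, e, f))

print("final counts:", (v, e, f))
print("values of V - E + F seen along the way:", set(history) | {start})
```

The counts end at those of a single triangle, and the value of V - E + F never moves from its starting value.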
Fig. 5.14 Removing one triangle by erasing an edge.
Fig. 5.15 Removing one triangle by erasing two edges and their common vertex.
The value of V - E + F is the same for this triangle as it was for the net of K, but since a triangle has V = 3, E = 3, and F = 1, the value of V - E + F is 3 - 3 + 1 = 1. Thus the value of V - E + F must also be 1 for the net of K. Now K itself has the same number of vertices and edges as its net, and one more face, so the value of V - E + F for K must be 2. This establishes Euler's Formula: for any 2-connected polyhedron,
$$V - E + F = 2.$$
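As a small check of Euler's Formula, the sketch below counts V, E, and F for a polyhedron described by its faces. The function name count_vef and the vertex labelings chosen for the tetrahedron and the cube are our own, introduced only for illustration.

```python
def count_vef(faces):
    """Count vertices, edges, and faces of a polyhedron given as a list of
    faces, each face a tuple of vertex labels listed in order around it."""
    vertices = {v for face in faces for v in face}
    edges = set()
    for face in faces:
        for i, a in enumerate(face):
            b = face[(i + 1) % len(face)]      # next vertex around the face
            edges.add(frozenset((a, b)))       # an edge is an unordered pair
    return len(vertices), len(edges), len(faces)

# Hypothetical labelings: any consistent labeling would do.
tetrahedron = [(0, 1, 2), (0, 1, 3), (0, 2, 3), (1, 2, 3)]
cube = [(0, 1, 2, 3), (4, 5, 6, 7),            # bottom and top
        (0, 1, 5, 4), (1, 2, 6, 5),            # four side faces
        (2, 3, 7, 6), (3, 0, 4, 7)]

for name, faces in [("tetrahedron", tetrahedron), ("cube", cube)]:
    v, e, f = count_vef(faces)
    print(name, (v, e, f), "V - E + F =", v - e + f)
```

Because each edge is stored as an unordered pair, an edge shared by two faces is counted once, which is exactly the convention used in the proof.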
Exercises
5.16 Calculate the value of V - E + F for a polyhedron shaped roughly like a cube with a single hole running all the way through, such as the one shown in Fig. 5.8. Repeat this for a polyhedron with two holes, three holes, and four holes. Generalize. How might you go about proving your guess is correct?
5.17 You can see from the proof of Euler's Formula that it is not actually necessary for the edges involved to be straight. Hence for any map of connected countries on the surface of a sphere, if F is the number of countries, E the number of boundary segments, and V the number of vertices, then it is still true that V - E + F = 2. We can also use the formula V - E + F = 2 for a map of connected countries in the two-dimensional plane, provided that we interpret the unbounded outside region as a large country. You can use this somewhat generalized form of Euler's Formula to work a number of problems, including this one: Suppose you are given five points marked in the plane and lines are drawn from each to the other four, detouring around vertices. This will make ten lines in all. Prove that two of these lines must intersect.
5.18 Prove that three houses cannot each be connected to each of three wells by nonintersecting paths. Hint: See Exercise 5.17.
5.19 Prove that if each country of a map on a sphere has exactly three edges, then the number of countries is even.
5.20 If a map of connected countries on a sphere has 60 vertices and each country has exactly three edges, how many countries are there?
5.21 Suppose that a map of connected countries on a sphere has the property that each vertex lies on an even number of edges. What is the minimum number of colors required to properly color such a map?
5.22 Prove that there is no map covering the surface of a sphere such that each vertex lies on exactly four edges and each country has exactly six sides.
5.23 Suppose a map is given in the plane such that each vertex lies on exactly three edges and no two countries have in common more than a single vertex or single edge. Show that at least one country in the map must have five or fewer edges. Hint: For each natural number n, let $F_n$ be the number of countries that have exactly n edges. Suppose that a map such as the one mentioned above has no country with five or fewer edges. Then
$$F_1 = F_2 = F_3 = F_4 = F_5 = 0,$$
and
$$F_6 + F_7 + F_8 + \cdots = F.$$
Moreover, since each vertex lies on exactly three edges, 3V = 2E. Finally, the number of edges in the map can be counted as follows: For each country with six edges, we count all the edges and get $6F_6$. Similarly, we count the edges using countries with exactly seven edges, and get $7F_7$. The total number of edges counted in this fashion is then
$$6F_6 + 7F_7 + 8F_8 + \cdots,$$
and this process counts each edge exactly twice, so
$$2E = 6F_6 + 7F_7 + 8F_8 + \cdots \ge 6F_6 + 6F_7 + 6F_8 + \cdots = 6F.$$
Now you have the following three relationships:
$$V - E + F = 2, \qquad 3V = 2E, \qquad 2E \ge 6F.$$
Does this lead to a contradiction?
5.24 Figure 5.16 shows a map of countries on an island. Prove that this map cannot be colored with only three colors in such a way that adjacent countries are colored differently.
Fig. 5.16 A map that cannot be properly colored with only three colors.
5.25 Suppose the Martians have divided their planet into two countries, one occupying the Northern hemisphere and called Northia, and one occupying the Southern hemisphere and called Southia. There is then a single boundary edge, the Martian equator. Does Euler's Formula hold for this sort of map? If not, what convention about boundary edges is necessary to make the formula valid?
5.3 REGULAR SOLIDS
A regular solid is a 2-connected polyhedron each of whose faces is congruent to a given regular polygon and each of whose vertices lies on the same number of edges. It was known to the ancient Greeks that there could be only five regular solids, but a careful use of Euler's Formula will show a much more general result. It turns out that the metric or geometric properties are not in themselves what prevent the existence of a sixth regular solid, but merely the numerical relationships between the numbers V, E, and F.
Theorem 5.1 There can be no more than five regular solids.
Proof. Suppose we have a regular solid; that is, a 2-connected polyhedron with the property that the same number of edges meet at each vertex, and such that each face is bounded by the same number of edges. The latter condition means that we are actually considering figures that are much more general than regular solids, for in a regular solid not only is each face bounded by the same number of edges, but in fact each face is congruent to a fixed regular polygon. But we shall show, nevertheless, that there can be but five such figures, and it then follows that only five regular solids can exist. Since the number of edges incident at each vertex is the same, let us denote this number by p; similarly, let q denote the fixed number of edges bounding each face. Let us first count the number of edges by multiplying the number p incident at each vertex by the number V of vertices; we obtain pV. But since this process counts each edge twice, we find that
$$pV = 2E.$$
Let us also count the number of edges by multiplying the number q bounding each face by the number F of faces; we obtain qF. This process also counts each edge twice, and hence
$$qF = 2E.$$
Using the above two equations, we express each of E and F in terms of V, and we obtain
$$E = \frac{pV}{2} \qquad \text{and} \qquad F = \frac{pV}{q}.$$
We substitute the above in Euler's Formula, V - E + F = 2, which holds for such a figure as we are considering, and we find that
$$V - \frac{pV}{2} + \frac{pV}{q} = 2,$$
or
$$2Vq - pqV + 2pV = 4q.$$
We solve for V, and we obtain the equation
$$V = \frac{4q}{2p + 2q - pq}.$$
Now V is a positive whole number, and so is 4q, so that the last denominator must be positive. That is,
$$2p + 2q - pq > 0,$$
or
$$pq - 2p - 2q < 0.$$
We next add 4 to both sides of this last inequality in order to make it possible to factor the left-hand side, and we get
$$pq - 2p - 2q + 4 < 4,$$
which, when factored, becomes
$$(p - 2)(q - 2) < 4.$$
Our definition of polyhedron implies that each face has at least three bounding edges and that there are at least three edges incident at each vertex. Thus each of p and q is no less than 3. Hence both p - 2 and q - 2 are positive whole numbers, and their product is less than 4. The only possibilities are these:
p = 3, q = 3: The tetrahedron
p = 4, q = 3: The octahedron
p = 3, q = 4: The cube
p = 5, q = 3: The icosahedron
p = 3, q = 5: The dodecahedron
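The five possibilities can be reproduced by a short search. The sketch below is only an illustration: the function name regular_candidates and the search bound of 12 are arbitrary assumptions, and the factored inequality already guarantees that no larger values of p or q qualify. The counts are computed from the relations derived in the proof, V = 4q/(2p + 2q - pq), E = pV/2, and F = pV/q.

```python
def regular_candidates(limit=12):
    """Search 3 <= p, q <= limit for pairs with (p - 2)(q - 2) < 4 and
    return a list of tuples (p, q, V, E, F)."""
    found = []
    for p in range(3, limit + 1):          # p = edges meeting at each vertex
        for q in range(3, limit + 1):      # q = edges bounding each face
            if (p - 2) * (q - 2) < 4:
                denom = 2 * p + 2 * q - p * q   # positive exactly in this case
                # The divisions below must come out even for a real solid.
                assert (4 * q) % denom == 0
                v = (4 * q) // denom
                assert (p * v) % 2 == 0 and (p * v) % q == 0
                e = (p * v) // 2
                f = (p * v) // q
                found.append((p, q, v, e, f))
    return found

for p, q, v, e, f in regular_candidates():
    print(f"p = {p}, q = {q}:  V = {v}, E = {e}, F = {f}")
```

Each row printed satisfies V - E + F = 2, as it must, since the formula for V was obtained from Euler's Formula in the first place.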
The values of V, E, and F may be found by substitution in the equations
$$V = \frac{4q}{2p + 2q - pq}, \qquad E = \frac{pV}{2}, \qquad F = \frac{pV}{q}.$$
For example, for p = 5 and q = 3, we obtain
$$V = 12, \qquad E = 30, \qquad F = 20,$$
so that the figure in question must have 20 faces, each a triangle (since q = 3); 30 edges; and 12 vertices, each the meeting place of exactly five edges (since p = 5). We have referred to such a figure above as an icosahedron, whereas this is more properly the name of the regular solid with such a number of vertices, edges, and faces. What we have in fact shown is this: If one wishes to glue together a number of not necessarily equilateral triangles to form a 2-connected polyhedron with five edges incident at each vertex, then 20 triangles must be used, and no other construction is possible. In particular, there can be at most five regular solids in the classical sense.
Exercise
5.26 We have shown in Theorem 5.1 that there can be at most five regular solids, but the theorem does not show that the five solids actually exist. Some very careful work with blocks of wood, solid geometry, and a bandsaw may make it seem very likely that all five do exist, but there is a difficulty. No matter how carefully the bandsaw is used, and no matter how carefully you measure lengths and angles, there is always the possibility that the figure you have constructed is almost, but not quite, regular, since errors in lengths and angles may be beyond the limits of physical measurement. Suppose you were given eight equilateral and congruent triangles, and were asked to prove that these can be assembled to form the boundary of a regular octahedron. It would not be difficult to show that the first seven could be matched up, each edge coinciding with another edge in most cases, but then you would have the problem of showing that the last triangle would exactly fit the triangular hole left in the constructed figure. This would probably involve some very difficult solid analytic geometry and all sorts of lengthy equations. Can you think of a way to prove the last triangle would fit without all this agony? Then proceed to prove the existence of the other four regular solids.
5.4 A CONVERSE OF EULER'S FORMULA
Suppose you are given three positive whole numbers, say a, b, and c. Suppose also that these numbers satisfy the relationship of Euler's Formula: a - b + c = 2. Need there exist a 2-connected polyhedron with V = a, E = b, and F = c? Obviously not, for it seems clear that in any polyhedron, the numbers V and F must be at least 4. But even if this condition is met, does such a polyhedron have to exist? If not, can we find conditions on the numbers a, b, and c that will assure its existence? What we are asking for is this: Given positive whole numbers V, E, and F, such that V - E + F = 2, what conditions on V, E, and F will assure the existence of a 2-connected polyhedron with V vertices, E edges, and F faces? E. Steinitz answered this question in a paper published in 1906. It turns out that the answer to the above question is the double inequality
$$V + 4 \le 2F \le 4V - 8.$$
This inequality says, roughly, that the number of faces must be a little more than half the number of vertices, but not quite so much as twice the number of vertices. If you wish, you may convert the above inequality to the alternative form
$$F + 4 \le 2V \le 4F - 8.$$
This shows that there is some symmetry hidden in the original inequality. Later we will show a more natural geometric interpretation of this inequality. We state Steinitz's result as a theorem and proceed with its proof.
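For readers who want to see the conversion between the two forms, each half of the original inequality rearranges into one half of the alternative form. The short derivation below uses nothing beyond multiplying or dividing by 2 and moving a constant across.

```latex
\begin{align*}
V + 4 \le 2F \;&\Longleftrightarrow\; 2V + 8 \le 4F \;\Longleftrightarrow\; 2V \le 4F - 8,\\
2F \le 4V - 8 \;&\Longleftrightarrow\; F \le 2V - 4 \;\Longleftrightarrow\; F + 4 \le 2V.
\end{align*}
```

Taken together, the two right-hand statements read F + 4 ≤ 2V ≤ 4F - 8, which is the original inequality with the roles of V and F exchanged.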
Theorem 5.2 Let V, E, and F be three positive whole numbers. Then there exists a 2-connected polyhedron with V vertices, E edges, and F faces if and only if both of the following relations hold:
$$V - E + F = 2,$$
$$V + 4 \le 2F \le 4V - 8.$$
This is called an "if and only if" theorem, for we must really supply two almost independent proofs for the two theorems that Theorem 5.2 really is. First we have to show that if K is a 2-connected polyhedron, then both the above relations hold. Then we must show that if the numbers V, E, and F satisfy both the above relations, then there exists a 2-connected polyhedron K with the corresponding numbers of vertices, edges, and faces. The second part will be a constructive proof; that is, we will show how one actually would go about building the needed polyhedron. But the first part is much easier, and so that is where we begin the proof.
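The two relations of Theorem 5.2 are easy to test for any particular triple of numbers. A minimal sketch follows; the function name satisfies_steinitz is a hypothetical one chosen here, and the function checks only the arithmetic conditions. It does not build a polyhedron.

```python
def satisfies_steinitz(v, e, f):
    """Return True when the positive whole numbers v, e, f meet both
    relations of Theorem 5.2:  v - e + f = 2  and  v + 4 <= 2f <= 4v - 8."""
    if min(v, e, f) <= 0:
        return False
    return v - e + f == 2 and v + 4 <= 2 * f <= 4 * v - 8

# A few sample triples (V, E, F); the last one satisfies Euler's Formula
# but has too many faces for only four vertices.
for triple in [(4, 6, 4), (8, 12, 6), (6, 12, 8), (4, 9, 7)]:
    print(triple, satisfies_steinitz(*triple))
```

The last sample shows why Euler's Formula alone is not enough: the numbers 4, 9, and 7 satisfy V - E + F = 2, yet they fail the double inequality.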
Proof. Suppose that K is a 2-connected polyhedron with V vertices, E edges, and F faces. We have already shown in Section 5.2 that V - E + F = 2, and so it remains only to show that
$$V + 4 \le 2F \le 4V - 8.$$
Now there are at least three edges incident at each vertex. If we were to count all the edges at each vertex and there happened to be exactly three edges incident at each vertex, we would obtain the number 3V, and we could then say that 3V = 2E since in this process each edge would be counted twice. However, since there may be more than three edges at some vertices, let us only count three and ignore any others. The number we obtain by such careless counting will still be 3V, but since we may have ignored some edges, we can say only that $3V \le 2E$. Similarly, we count only three edges in the boundary of each face, even though some faces may have more. If each face had exactly three edges, we would find that 3F = 2E; since we ignore some edges in this process we can say only that $3F \le 2E$. We know that Euler's Formula holds for K, and so our information to this point may be summarized as follows:
$$V - E + F = 2, \qquad 3V \le 2E, \qquad 3F \le 2E.$$
We transform the first equation into
$$4 + 2E = 2V + 2F,$$
and replace the quantity 2E which appears in the resulting equation with quantities known to be smaller (or at least no larger): first 3V, then 3F. We thus obtain the two inequalities
$$4 + 3V \le 2V + 2F,$$
$$4 + 3F \le 2V + 2F.$$
We simplify each; the first becomes
$$4 + V \le 2F,$$
and the second becomes
$$4 + F \le 2V,$$
or
$$F \le 2V - 4,$$
or, finally,
$$2F \le 4V - 8.$$
We now have both $4 + V \le 2F$ and $2F \le 4V - 8$. Combining these two into a single relationship, we get
$$V + 4 \le 2F \le 4V - 8,$$
and the first part of the proof is complete.
To finish the proof of the theorem, we need to show that if V, E, and F are three positive whole numbers satisfying both the relations
$$V - E + F = 2$$
and
$$V + 4 \le 2F \le 4V - 8,$$
then there exists a 2-connected polyhedron K with V vertices, E edges, and F faces. We shall construct K by starting with a regular tetrahedron and literally sawing off various chunks until we obtain the desired polyhedron. You can imagine these saw cuts as actually being done with a bandsaw. There are three types of cuts that will be needed. Each is to be applied to a vertex where exactly three edges of the polyhedron meet. Since we are beginning with a tetrahedron, it will always be possible to make the first cut. We will need to show that in the construction of K, each cut we perform leaves us with at least one vertex where exactly three edges meet, so that the process can be continued until K is obtained.
Cut of Type A: See Fig. 5.17. Let P be a 2-connected polyhedron and v a vertex of P where exactly three edges meet. Choose three points a, b, and c, one on each of these three edges, and much closer to v than to the other vertices of P. Cut along the plane determined by a, b, and c, and discard the small tetrahedron with vertices v, a, b, and c.
Fig. 5.17 A cut of type A.
Cut of Type B: See Fig. 5.18. Let P be a 2-connected polyhedron and v a vertex of P where exactly three edges meet. Choose two points a and b, one on each of two of these edges, and much closer to v than to the other vertices of P. Let c be the vertex other than v on the third edge incident at v. Cut along the plane determined by a, b, and c, and discard the small tetrahedron with vertices v, a, b, and c.
Fig. 5.18 A cut of type B.
Cut of Type C: Let P be a 2-connected polyhedron and v a vertex of P where exactly three edges meet. Let a be a point on one of these edges much closer to v than to the other vertices of P, and let b and c be, respectively, the vertices of P other than v on the other two edges incident at v. As indicated in Fig. 5.19, cut along the plane determined by a, b, and c, and discard the small tetrahedron with vertices v, a, b, and c.
Fig. 5.19 A cut of type C.
In each case, a new vertex will be formed at the point a, and it is easy to see that exactly three edges must be incident at this vertex. Hence if we use only cuts of the three types described above, and begin with a tetrahedron, we shall always have available at least one vertex where exactly three edges meet. So this cutting process can be continued as long as we please. Moreover, it should be at least intuitively clear (though it is not difficult to prove) that since we begin with a tetrahedron, which is 2-connected, we produce a 2-connected polyhedron at each stage as a result of each of these cuts.
It is easy to show the following important facts about the results of these types of cuts. First, a cut of type A increases the number of vertices by two and the number of faces by one. Second, a cut of type B increases the number of vertices by one and the number of faces also by one. However, some care must be used in considering what happens as a result of a cut of type C. We must avoid using a cut of type C at a vertex v where three edges meet, but where also the three faces meeting at v are all triangles. Figure 5.20 shows what would happen if a cut of type C were applied in such a case. The number of vertices, edges, and faces would not change, so a cut of type C here would do us no good in constructing our desired polyhedron K. We must be sure, when applying a cut of type C, that not only is this cut applied at a vertex v where exactly three edges are incident, but in addition one of the three faces incident at v must have four or more edges. Moreover, we must choose the points b and c indicated in Fig. 5.19 both in the boundary of the face with four or more edges. Then when a cut of type C is applied, the face with four or more edges is divided by the line from b to c into two faces.
Fig. 5.20 The case when a cut of type C should not be used.
Hence, under the right conditions, a cut of type C will increase the number of faces by one without changing the number of vertices. Note that we do not care what happens to the number of edges as a result of any of these cuts, for the following reason: When we finally succeed in constructing a polyhedron K with the desired number of vertices and faces, the number of edges will take care of itself because Euler's Formula must hold for K.
Now we are ready to proceed with our construction of the polyhedron K, which is to have V vertices, E edges, and F faces. We begin with a tetrahedron, which has four vertices, six edges, and four faces.
Suppose first that V = F. In this case, simply apply F - 4 cuts of type B. The number of vertices and faces of the tetrahedron will then each be increased by F - 4, so the resulting polyhedron will have 4 + (F - 4) vertices, or F vertices. But in this case, V = F, so that the resulting number of vertices will be V, as desired. Moreover, the resulting polyhedron will have 4 + (F - 4) = F faces, also as desired. As we mentioned earlier, the value of E takes care of itself, as in any case E = V + F - 2.
The second possibility is that F < V. If so, first apply to the tetrahedron 2F - V - 4 cuts of type B and then V - F cuts of type A. Again, it is not hard to verify that the resulting polyhedron does indeed have V vertices and F faces.
Finally, the only complicated case: that in which V < F. Here we must use cuts of type C, but we must make sure that such a cut is applied only to a vertex where three edges meet such that, of the three faces also incident there, at least one has four or more edges. We do so as follows: We apply one cut of type B followed by one of type C, using for the second cut one of the new vertices formed by the cut of type B. It is not hard to see that a cut of type B will always produce at least one new face with four or more edges, and that one of the new vertices created on the edges of this face will be a vertex where exactly three edges meet. So we use this vertex for the next cut of type C. We perform this process of first applying a cut of type B, then a cut of type C, exactly F - V times. This is possible since V < F. After this process is completed, we next apply exactly 2V - F - 4 cuts of type B. Again, it is easy to verify that the resulting polyhedron does have exactly V vertices and F faces, as desired. This concludes the proof of Steinitz's Theorem.
Fig. 5.21 The graphical interpretation of the formula $V + 4 \le 2F \le 4V - 8$ (horizontal axis: values of F; vertical axis: values of V).
Exercises
5.27 The graph shown in Fig. 5.21 shows plotted dots, each representing a possible value pair for V and F. The two lines are the graphs of
$$V + 4 = 2F$$
and
$$2F = 4V - 8.$$
The shaded region between the lines represents possible value pairs for V and F for which a polyhedron can exist; outside this region none can exist. Why?
5.28 The proof of Steinitz's Theorem shows how one may begin with a tetrahedron and construct a polyhedron with given values of V and F. Note that the tetrahedron itself is represented by the point of intersection of the two lines shown in Fig. 5.21. Give a geometric interpretation of the construction in Steinitz's Theorem with the aid of Fig. 5.21.
5.29 Show that the application of a cut of any one of the three types used in the proof of Steinitz's Theorem to a convex polyhedron produces a convex polyhedron. See Exercise 5.10. As a consequence of this exercise and Exercise 5.10, the construction used in the proof of the second part of Steinitz's Theorem works because each polyhedron used is convex. Why is this enough to make the construction work?
5.30 If a 2-connected polyhedron has 20 edges, what is the maximum number of vertices it can have? What is the minimum number?
5.31 Verify that a cut of type A does indeed increase the number of vertices of a polyhedron by two and the number of faces by one.
5.32 Verify that a cut of type B does indeed increase the number of vertices of a polyhedron by one and the number of faces by one as well.
5.33 Verify that application of a cut of type C in the proper place on a polyhedron does increase the number of faces by one without changing the number of vertices.
5.34 In the proof of Steinitz's Theorem, where the polyhedron K is constructed, it is shown only for the case V = F that the constructed polyhedron actually has the desired number of vertices and faces. Verify that the values of V and F also come out right in the other two cases, the cases of F < V and V < F.
5.35 To continue the previous exercise, how many cuts are actually needed to produce the polyhedron K from a tetrahedron? Your answer should be in terms of the number F of faces that K is supposed to have.
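The counting in the construction of K from the proof of Steinitz's Theorem can also be tried out by machine, as a numerical companion to working the cases by hand. The sketch below, with the hypothetical names EFFECT, cut_plan, and apply_plan, lists the cuts the proof prescribes for given V and F and tallies their effect on the counts, starting from the tetrahedron. It assumes the increments stated in the proof for each type of cut (with type C used only "in the proper place") and does not model the geometry of the saw cuts.

```python
# (vertices gained, faces gained) for each type of cut, as stated in the proof.
EFFECT = {"A": (2, 1), "B": (1, 1), "C": (0, 1)}

def cut_plan(v, f):
    """Return the sequence of cut types the proof applies to a tetrahedron
    in order to reach v vertices and f faces."""
    if not (v + 4 <= 2 * f <= 4 * v - 8):
        raise ValueError("the relation V + 4 <= 2F <= 4V - 8 fails")
    if v == f:
        return ["B"] * (f - 4)
    if f < v:
        return ["B"] * (2 * f - v - 4) + ["A"] * (v - f)
    # v < f: alternate B then C, so that each C has a suitable vertex,
    # then finish with cuts of type B.
    return ["B", "C"] * (f - v) + ["B"] * (2 * v - f - 4)

def apply_plan(plan):
    """Tally vertices and faces after applying the cuts to a tetrahedron."""
    v, f = 4, 4
    for cut in plan:
        dv, df = EFFECT[cut]
        v, f = v + dv, f + df
    return v, f

# Sample targets: the vertex and face counts of the cube and the octahedron.
# The polyhedra produced need not be those solids, only share their counts.
for target in [(8, 6), (6, 8), (5, 5)]:
    plan = cut_plan(*target)
    print(target, plan, apply_plan(plan))
```

A run over many admissible pairs of V and F is no substitute for the general verification asked for in Exercise 5.34, but it is a quick way to catch an arithmetic slip.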
5.5 MAP COLORING
As we mentioned at the beginning of this chapter, it is an unsolved problem whether or not four colors are sufficient to color every map of connected countries on the surface of a sphere (or, for that matter, on the surface of any 2-connected polyhedron). With the aid of Exercise 5.23, which depends itself on Euler's Formula, it is possible to show that each such map can be colored using no more than five colors; the answer to Exercise 5.1 shows that four colors are necessary. But no one has ever been able to construct such a map requiring five colors, nor has anyone ever been able to prove that four colors are sufficient for every possible map. This appears to be a very difficult problem. However, in what should be a more complicated case, the problem has been solved. We refer to the case of a map on the surface of a torus. Actually, the problem has even been solved for a number of other surfaces as well, including the Mobius strip and the surface of a two-hole torus. It seems quite strange that the problem has not been solved in what really ought to be the simplest case of all, the case of the plane or sphere (the two are equivalent problems).
We present next the proof that the answer to the coloring problem for the torus is seven; that is, that every map on the surface of a torus can be colored with seven or fewer colors, and that there do exist maps that require seven colors. Of course, we always refer to a proper coloring, in which adjacent countries have different colors; moreover, in the case of maps on the torus, we require not only that each of the countries be connected, but also that each has connected boundary and that no country encircle the torus in either of the two possible fashions. We could really eliminate these last conditions by a simple device, but we will make these assumptions for simplicity.
As you might expect, it will first be necessary to derive Euler's Formula for a torus. Let a map of countries be given on a torus, subject to the conditions mentioned above. As in Fig. 5.22, draw a circle around the torus the short way, avoiding vertices, and passing through a country no more than once. Remove the countries and boundaries crossed by this curve. This operation will decrease the values of F and E for this map by the same amount, so that the value of V - E + F remains unchanged. If there should be any free vertices in the resulting map, where only two edges meet, remove these as well. Each such operation decreases the value of V and E by one each, so that after this is done, the value of V - E + F is still unchanged.
Fig. 5.22 The dashed line shows which countries and edges to remove in order to make the torus into a cylinder.
What we now have is a map on a cylinder, which can then be deformed into a map in the plane in the shape of an annulus, as in Fig. 5.23.
Fig. 5.23 The plane map resulting after the cylinder is deformed.
We add a dummy country in the hole in the middle of the annulus, and note that we have here a map of the sort examined in the proof of Euler's Formula in Section 5.2. There is no need to triangulate this map and remove triangles, for we already know that for this map, V - E + F = 1. We remove the dummy country in the hole, thus decreasing F by 1, and find that V - E + F = 0. But the value of V - E + F is the same for the annular map as for the original map on the torus, and so we discover that Euler's Formula for a torus is V - E + F = 0. This is a result you should have obtained in Exercise 5.7. We are now ready to prove that seven is the "map-coloring number" for the torus.
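One simple family of maps on which the formula for the torus can be checked consists of an m-by-n grid of four-sided countries wrapped around the torus in both directions. This family and the function name torus_grid_counts are our own illustration, not part of the proof; we assume m and n are at least 3 so that no two countries share more than one edge.

```python
def torus_grid_counts(m, n):
    """Count V, E, F for an m-by-n grid of four-sided countries on a torus.
    Grid positions wrap around in both directions (arithmetic mod m and n)."""
    if m < 3 or n < 3:
        raise ValueError("this sketch assumes m >= 3 and n >= 3")
    vertices = {(i, j) for i in range(m) for j in range(n)}
    edges = set()
    for (i, j) in vertices:
        # One edge to the next grid point in each of the two directions.
        edges.add(frozenset({(i, j), ((i + 1) % m, j)}))
        edges.add(frozenset({(i, j), (i, (j + 1) % n)}))
    # One country per grid cell, labeled by its corner with smallest indices.
    faces = {(i, j) for i in range(m) for j in range(n)}
    return len(vertices), len(edges), len(faces)

for size in [(3, 3), (4, 5), (6, 7)]:
    v, e, f = torus_grid_counts(*size)
    print(size, (v, e, f), "V - E + F =", v - e + f)
```

In every such grid there are as many vertices as countries and twice as many edges, so the value of V - E + F is 0, in agreement with the formula just derived.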
Fig. 5.24 Adjusting a map so that only three edges meet at each vertex.
Theorem 5.3 Each map of connected countries each with connected boundary on the surface of a torus can be colored with seven or fewer colors.
Proof The theorem is clearly true for a map consisting of seven or fewer countries. Suppose it is false for some value of F larger than seven. Then there is a least such value of F for which the theorem is false; and so, letting that value be itself denoted by F, there is a map on a torus with F countries that cannot be colored with seven colors. Remember that F is the smallest number for which we are supposing the theorem to be false, so that for example, any map with F - 1 countries can be colored with seven colors. We have a map of F countries on the surface of a torus, and this map requires eight or more colors for a proper coloring. We next adjust this map so that only three edges meet at each vertex. One way to do this is shown in Fig. 5.24, in which a few edges are moved slightly to one side or the other. This will increase the value of V and E, but F will not be changed. Since three edges meet at each vertex, for this adjusted map we can multiply the number of edges at each vertex by the number of vertices, and obtain the number 3V. This counts each edge twice, hence
3V = 2E, or 6V = 4E.
Also, for this map
V - E + F = 0,
and so F = E - V, so that
6F = 6E - 6V.
Since we also know that 6V = 4E, then 6F = 6E - 4E, or 6F = 2E.
We now claim that one of these countries has fewer than seven boundary segments, and hence is bounded by fewer than seven other countries. For if not, we let F_n denote the number of countries of this map that have exactly n boundary segments, for all natural number values of n. If no country has fewer than seven boundary segments, then
F_1 = F_2 = F_3 = F_4 = F_5 = F_6 = 0,
and
F = F_7 + F_8 + F_9 + ... .
We count the number of edges by counting the number bounding each country. Since seven edges bound each of the F_7 countries, eight edges bound each of the F_8 countries, and so on, and because this process counts each edge exactly twice, we find that
2E = 7F_7 + 8F_8 + 9F_9 + ... ≥ 7F_7 + 7F_8 + 7F_9 + ... = 7(F_7 + F_8 + F_9 + ...) = 7F.
Hence 2E ≥ 7F > 6F. But we had previously established that, on the contrary, 2E = 6F. This contradiction shows that at least one country must have six or fewer boundary segments. Remove such a country temporarily from the map, and allow the countries which formerly bounded it to annex the now unclaimed territory, as in Fig. 5.25. These six or fewer countries bound exactly the same countries as before the annexation, provided that the annexation is carried out properly. This new map has F - 1 countries, and hence can be colored using no more than seven colors. Color it. We now ask the countries that took part in the annexation to cede their new territory back to the country that was temporarily removed, and we replace that country. Since it is bounded by no more than six countries, it touches only six other colors at most, and the seventh color is available to color it properly. Color it that color. At the beginning of the proof, we made an adjustment of the edges so that only three edges met at each vertex. We now reverse that adjustment, restoring the original map. No new boundaries between countries are set up in this process; indeed, some countries may no longer bound ones they did
Fig. 5.25 Annexation of a country by its neighbors.
previously. Hence our coloring with seven colors will also work for the original map. This contradicts the assumption that the map could not be colored using only seven colors, and hence our supposition that the theorem was false is itself a false assumption. Thus the theorem is true, and we have completed the proof of Theorem 5.3. All that is left is to establish the existence of a map on the torus that actually does require seven colors for a proper coloring. One such, using in fact only seven countries, is shown in Fig. 5.26. This figure requires some explanation. It is too confusing to draw a view of a semitransparent torus on a two-dimensional piece of paper and try to show the various countries. What appears in Fig. 5.26 is instead a recipe for building such a torus; or, if you prefer, a set of directions for painting such a map on the surface of an old inner tube. If you make a copy of Fig. 5.26 on a flat but flexible rectangle of paper and glue together the two long sides of the rectangle, you will have a long cylinder. Then glue together the ends of the cylinder; you will obtain a torus. Hence, in Fig. 5.26, opposite sides of the rectangle are to be thought of as attached, and so a country such as number 2 (for example) has a common boundary with country number 7. You can verify that each of the seven countries actually shares a boundary with each of the other six. We have drawn the figure, for simplicity, so that the sides of the rectangle are also boundaries of countries, except for the four corners, all of which belong to country number 1.
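Both halves of the argument lend themselves to a small computational check. The Python sketch below assumes only what is stated above about Fig. 5.26, namely that each of the seven countries borders the other six. The function name color_by_removal is hypothetical, and deleting a country from the adjacency table stands in for the annexation step of the proof, which is a simplification.

```python
# Adjacency table of the seven-country map: every pair of countries shares a border.
countries = list(range(1, 8))
neighbors = {c: {d for d in countries if d != c} for c in countries}


def color_by_removal(neighbors, num_colors=7):
    """Color a map by repeatedly removing a country that has fewer than
    num_colors remaining neighbors, then restoring the countries in
    reverse order and giving each one a color its neighbors do not use."""
    remaining = {c: set(ns) for c, ns in neighbors.items()}
    removed = []
    while remaining:
        # Pick a country with the fewest neighbors still on the map.
        c = min(remaining, key=lambda k: len(remaining[k]))
        if len(remaining[c]) >= num_colors:
            raise ValueError("no country with few enough neighbors")
        removed.append(c)
        for d in remaining.pop(c):
            remaining[d].discard(c)
    coloring = {}
    for c in reversed(removed):
        used = {coloring[d] for d in neighbors[c] if d in coloring}
        coloring[c] = min(k for k in range(1, num_colors + 1) if k not in used)
    return coloring


coloring = color_by_removal(neighbors)
print(len(set(coloring.values())))  # 7
```

Because the seven countries are mutually adjacent, no two of them can share a color, so all seven colors appear in the result.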
Fig. 5.26 A map on the torus requiring seven colors.
This example, together with Theorem 5.3, shows that the map-coloring number for the torus is indeed seven. The example shows that seven colors are necessary; the theorem shows that more are unnecessary.
Exercises
5.36 Each face of a certain regular solid is a pentagon, and exactly three edges meet at each of its vertices. Use the techniques of this chapter, but not the results of Theorem 5.1, to find the possible number of faces such a figure can have.
5.37 Suppose that a map on the surface of a torus has 11 vertices and each country has three edges. How many countries can there be? Why?
5.38 What is the value of V - E + F for the Mobius strip? (See Exercise 5.13.) Prove that your answer is correct.
5.39 What is the map-coloring number for the Mobius strip? Hint: Use Exercise 5.38.
5.40 A map on the surface of a sphere consists of pentagons and hexagons attached as shown in Fig. 5.27. Note that: a) five hexagons surround each pentagon; b) three pentagons and three hexagons alternate, surrounding each hexagon; c) each vertex necessarily lies on exactly three edges; and d) not all of the map is shown in Fig. 5.27.
Fig. 5.27 The pattern of hexagons and pentagons for Exercise 5.40.
Find how many pentagons and hexagons there can be. Hint: Let P denote the number of pentagons and H the number of hexagons. Then P + H = F, where F is the number of faces; and, as usual, V - E + F = 2.
Incidentally, the figure describes one possible pattern for a geodesic dome, such as has been used for large structural work, and also gives the pattern of polypeptides in the beet virus molecule.
5.41 The author's wife has called to his attention the following generalization of the previous exercise: Suppose that a map on a sphere consists of countries each having either five or six sides, and such that each vertex lies on exactly three edges. Then exactly 12 of the countries have five sides. See whether you can prove this.
5.42 Continuing the previous exercise, what are the possible numbers of countries with six sides, proceeding under the following "regularity" assumption: Each pentagonal country bounds the same number of hexagonal countries?
5.43 If you look over the proof of Steinitz's Theorem, you will see a number of instances in which the next step is to perform (for example) 2V - F - 4 cuts of one type, and so on. Check each such statement in the proof to make sure the number of cuts given is never negative, for if it were, this would invalidate the proof. Doing this exercise will help show exactly why the hypothesis V + 4 ≤ 2F ≤ 4V - 8 is needed.
5.44 Suppose that a map on the surface of a sphere consists of a number T of triangular countries and a number Q of four-sided countries, such that each vertex lies on exactly three edges, each triangular country is bounded by three four-sided countries, and each four-sided country is bounded by exactly two triangular countries. How many triangular and how many four-sided countries are there?
5.45 Can a map such as the one described in the previous exercise exist on the surface of a torus?
5.46 Suppose that a map on the surface of a sphere consists of a number T of triangular countries and a number Q of four-sided countries, such that each vertex lies on exactly four edges, each triangular country is bounded by exactly three four-sided countries, and each four-sided country is bounded by exactly two triangular countries meeting it in opposite sides. How many triangular countries and how many four-sided countries are there?
5.47 Can a map such as the one described in the previous exercise exist on the surface of a torus? Explain your answer.
5.48 The situation involved in describing "regular solid tori" is in one way simpler, in another way more complicated, than in the case of the sphere. Assume that a map on the surface of a torus consists of a number of countries such that a) each country has connected boundary, b) each vertex lies on the same number d ≥ 3 of edges, and c) each country has the same number n ≥ 3 of sides. (This is in analogy to the case for the ordinary regular solids.) Use the torus formula V - E + F = 0 to show that there are only three possibilities for the shapes of the countries: triangles, quadrilaterals, or hexagons.
5.49 Show that at least one map for each of the three possibilities of the previous exercise exists.
5.50 Show that, in continuation of the previous two exercises, more than one map of each of the sorts discovered can exist. In fact, infinitely many exist, of each type, and thus there are infinitely many "regular tori"; infinitely many are composed of triangular countries, infinitely many of four-sided countries, and infinitely many of hexagonal countries.
NOTES AND REFERENCES
The paper of Steinitz mentioned in Section 5.4 is his "Über die Eulersche Polyederrelationen," Arch. Math. Phys. (3), 11 (1906), pp. 86-88. The proof given in this text is new and possibly simpler. Some other general references to polyhedra and map coloring are listed below:
Coxeter, H. S. M., Introduction to Geometry (Wiley, 1961).
Coxeter, H. S. M., Regular Polytopes, second edition (Macmillan, 1963).
Dynkin, E. B. and V. A. Uspenskii, Multicolor Problems, translated by N. D. Whaland, Jr., and R. B. Brown (Heath, 1963).
Grünbaum, B., Convex Polytopes (Interscience, 1967).
Hilbert, D. and S. Cohn-Vossen, Geometry and the Imagination, translated by P. Nemenyi (Chelsea, 1952).
Lyusternik, L. A., Convex Figures and Polyhedra, translated by T. Jefferson Smith (Dover, 1963).
Ore, O., The Four-Color Problem (Academic Press, 1967).
Tietze, H., Famous Problems of Mathematics (Graylock, 1965).
A complete proof of the Five-Color Theorem for the sphere and plane may be found in Sherman Stein's Mathematics: The Man-Made Universe, second edition (Freeman, 1969), in the chapter on map-coloring. The great Swiss mathematician Leonhard Euler was born in Basle in 1707. He was one of the most productive of all mathematicians, continuing his research even after he became totally blind in his sixtieth year until his death seventeen years later. He spent most of his life either at St. Petersburg (Leningrad) or at Berlin, under the sponsorship of royalty, and contributed to mechanics, the calculus of variations, the three-body problem, number theory-indeed, to virtually every branch of mathematics. His interests were not restricted to mathematics. In his stay in Russia he developed a mathematical theory of investment (out of which our present theory of annuities grew), wrote most of the mathematics textbooks for the Russian school system (he was a superb textbook writer), and reformed the Russian system of weights and measures. The four-color problem is not so old as most people think. The best evidence we have is that Francis Guthrie (later a professor of mathematics) asked one of his teachers at University College, London, for a proof that four colors were sufficient to color any map in the plane. The teacher, who was the well-known mathematician Augustus de Morgan, communicated this problem to his colleague, the famous Sir William Rowan Hamilton, in a letter written in 1852. Apparently the problem did not formally appear in print until about 1878; at that time incorrect proofs were published by Kempe and Tait. P. J. Heawood found the flaw in Kempe's proof, and published in 1890 a paper showing how Kempe's proof could be modified to show that five colors are sufficient. Last-minute note: The author has just received a copy of the paper "The Four Color Theorem," by Professor Emeritus (of Duke University) Joseph Miller Thomas. In this paper Dr. 
Thomas states that each map on the sphere can be properly colored using no more than four colors, and gives a proof of some twelve printed pages. Perhaps this century-old problem has finally been solved.
CHAPTER 6 INFINITE SETS
Imagine a hotel with so many rooms that it takes all the positive whole numbers to number the rooms. That is, the hotel has its rooms numbered 1, 2, 3, 4, 5, ... , without any largest room number. If the hotel is filled with guests, one in each room, it might seem difficult to provide space for an additional guest. But should such an extra guest arrive at the hotel, a clever room clerk can arrange for this guest to have a room to himself with only minor inconvenience to the other guests: The room clerk can request the guest in room 1 to move to room 2, the guest in room 2 to move to room 3, and in general request the guest in room n to move to room n + 1. He can then assign the new guest to room 1. The hotel is still full, all the guests still have private rooms, and the new guest has been accommodated. In order to repay all the other guests for their courtesy, the new guest devises a scheme for improving the financial status of each. He posts a large sign in the hotel lobby directing each guest in rooms numbered 1 through 10 to deliver one dollar to the guest in room 1, each guest in rooms numbered 11 through 20 to deliver one dollar to the guest in room 2, each guest in rooms numbered 21 through 30 to deliver one dollar to the guest in room 3, and in general, each guest in rooms numbered 10n + 1 through 10n + 10 to deliver one dollar to the guest in the room numbered n + 1. (The number n is supposed to take on all positive whole number values.) Now this is no great inconvenience, for each guest need make no more than one trip, and the net result is that each guest receives ten dollars while paying out one, thus making a profit of nine dollars. Actually, our clever friend in room 1 does best of all by this scheme, for he receives nine dollars just as everyone else does, but he does not have to do any walking. Of course, this plan will not work properly unless each guest has at least one dollar to start with; but if some do not, there is still a way to handle the
problem. Suppose we have the extreme case in which the guests in the odd-numbered rooms have no money, but all the others have at least one dollar. The guests in the even-numbered rooms could then be directed thus: Each is to deliver one dollar to the guest in the room with the number half his room number. Then the guest in room 2 would deliver a dollar to the guest in room 1, the guest in room 4 would deliver a dollar to the guest in room 2, and in general the guest in room numbered 2n would deliver a dollar to the guest in the room numbered n. You can easily verify for yourself that each guest in an odd-numbered room receives one dollar, and each guest in an even-numbered room pays out and receives one dollar. Thus there would be no change in the financial status of the guests in the even-numbered rooms, while each guest in an odd-numbered room would now have one dollar. Then the original scheme proposed above could take place without any difficulty. Such peculiar manipulations are possible, of course, because we are dealing with infinite sets: the hotel has infinitely many rooms as well as infinitely many guests, and infinitely many dollars are involved in the financial transactions. However, in spite of the apparent contradictions involved here, we hope to show you that such phenomena are natural, even common, when one deals with infinite sets. Throughout much of the following material, the central idea is the concept of a one-to-one correspondence between two sets. With this concept we will be able to compare the numbers of elements of two infinite sets because it turns out to be quite possible that one infinite set actually contains "more" elements than another. The basic tools are some ideas about sets and functions.
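The bookkeeping behind these schemes can be checked on any finite stretch of the hotel. The Python sketch below is only an illustration: the function names are hypothetical, and a computer can of course examine only finitely many rooms.

```python
def new_room(n):
    """Room to which the guest in room n moves to make space for a new guest."""
    return n + 1


def payee_tens(room):
    """Room receiving the dollar from `room` under the first scheme:
    rooms 1-10 pay room 1, rooms 11-20 pay room 2, and so on."""
    return (room - 1) // 10 + 1


def payee_halving(room):
    """Room receiving the dollar from an even-numbered room under the second scheme."""
    if room % 2 != 0:
        raise ValueError("only guests in even-numbered rooms pay")
    return room // 2


def payers_tens(k):
    """All rooms whose guests pay the guest in room k under the first scheme."""
    return [r for r in range(1, 10 * k + 1) if payee_tens(r) == k]


# After the shift, the first ten guests occupy rooms 2 through 11 and room 1 is free.
print([new_room(n) for n in range(1, 11)])  # [2, 3, 4, 5, 6, 7, 8, 9, 10, 11]
# Exactly ten guests pay the guest in room 3.
print(payers_tens(3))  # [21, 22, 23, 24, 25, 26, 27, 28, 29, 30]
```

Each room k has exactly ten payers, while each guest pays only once, which is the source of the nine-dollar profit described above.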
6.1 SETS
By a set we mean a collection of objects, thought of as a whole. The objects which make up the set are called the elements of the set. We do not pretend that the above comprises a definition, at least in the formal sense that previous definitions have been given in this book. Any attempt to define a term requires the introduction of other more primitive terms, and so it is easy to see that some terms must be taken as absolutely primitive-their meaning assumed clear. However, it is helpful in such cases to attempt to give synonyms and examples. For the term "set," which we take as such a primitive term, we can supply a large list of synonyms. Some of these could be "class," "collection," "aggregate," "conglomerate," or just "bunch." We will soon give several examples, and ask you to use your powers of abstraction. One final note: In the next section we shall encounter the same problem with the definition of the term "function." It is not necessary that the elements of a set have any particular property in common. For example, it makes sense to talk of the set whose four elements
are the first two words on this page, the number 6, and the moon. But we do ask two things: First, that of any object potentially the member of a set under consideration, it is at least theoretically possible to determine whether or not that object belongs to the given set. In the above example, it is possible for you to determine whether or not the word "however," or the positive square root of 36, or the eighth largest body in the Solar System, belongs to the set mentioned. Second, we also require that we can at least theoretically tell apart any two objects which do belong to a given set. One reason for this is that we will follow the convention that objects of a given set are not to be listed in that set more than once; otherwise, in the material which follows, we would run into difficulties in counting the number of elements of a set. For example, if W is to be the set whose elements are the first president of the United States and George Washington, we want the list of elements of W to contain only one entry, so that we can say that W contains one element. These two considerations, that a set should be sufficiently well defined to tell what its objects are and to tell them apart, are probably quite in accord with your intuitive notion of what a set is, but they are also important considerations for those mathematicians who study logic and the foundations of mathematics. There are basically two ways to specify a set. One method might be called the listing method. One simply writes down the elements of the set. This method is quite useful for sets with only a small number of objects or elements, but if the number of elements is large (or if the set is infinite) we must resort to the ellipsis:
A = {1, 2, 3, 4},
B = {1, 2, 3, 4, ..., 100},
C = {1, 2, 3, 4, ...}.
When the elements of a set are listed, the list is enclosed in braces for clarity.
The set A above consists of the first four positive whole numbers; B consists of the first one hundred positive whole numbers; C consists of all the positive whole numbers. It is certainly possible to tell whether or not any given object is a positive whole number, and given two positive whole numbers it can be decided whether or not they are equal. Thus these three examples satisfy the two criteria previously mentioned. As an alternative to the listing method for specifying a set, we can use the descriptive method.
D = {x | x is a positive whole number}.
We read this notation as follows. As soon as you see the first brace on the left, you realize that you are about to encounter a set, and you should think to yourself "The set of," or "The set consisting of ...." The next word is
quite important, and it is surprising that it is omitted in the above notation; the word is "all." The letter x is simply a dummy variable, to be used in the sentence that will follow the vertical bar, so up to the vertical bar you can translate as follows: "The set of all objects ... " or "The set consisting of all elements x ...." The vertical bar is merely translated "such that" or "with the property that." Thus what follows that vertical bar must be a declarative sentence. It gives the exact condition an object must satisfy in order that it belong to the set being described. The brace at the right-hand side tells that the description has been completed. Thus, the entire sentence
D = {x | x is a positive whole number}
may be correctly translated as
"D is the set consisting of all objects x such that x is a positive whole number,"
or
"D is the set of all x such that x is a positive whole number,"
or
"D is the set of all positive whole numbers,"
or, finally,
"An object is an element of the set D if and only if the object is a positive whole number."
Of course, you see that the set D above contains exactly the same objects as the set C = {1, 2, 3, 4, ...}. Since it is the aggregate of objects itself which makes up a set, not the particular method of describing the set, we have here an example of what it means for two sets to be the same. We say that C and D are equal sets, and we write simply C = D. Using the symbol ∈ we can also supply a convenient shorthand for the idea of an object's belonging to a set. If x is an element of the set C, then we write x ∈ C; if not, we write x ∉ C. These two examples may be translated as follows:
"x ∈ C" translates as "The object x is an element of the set C" or "x is an element of C."
"x ∉ C" translates as "The object x is not an element of the set C" or "x is not an element of C."
In the example of the set C given above, two true statements using this notation are
2 ∈ C and -3 ∉ C.
Two false but meaningful statements are
0 ∈ C and 17 ∉ C.
Finally, using this symbol ∈ and the set C, we can use the descriptive notation for the sets A = {1, 2, 3, 4} and B = {1, 2, 3, 4, ..., 100} by writing the equivalent statements
A = {x ∈ C | 1 ≤ x ≤ 4}
and
B = {x ∈ C | 1 ≤ x ≤ 100}.
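The descriptive notation corresponds closely to what programmers call a set comprehension. The following Python sketch is an illustration only; since a computer cannot hold all of C, the bound 1000 is an arbitrary assumption standing in for "all positive whole numbers."

```python
# A finite window standing in for C = {1, 2, 3, 4, ...}; the bound 1000 is arbitrary.
C = set(range(1, 1001))

# Descriptive method: all x in C such that 1 <= x <= 4, and likewise up to 100.
A = {x for x in C if 1 <= x <= 4}
B = {x for x in C if 1 <= x <= 100}

print(A == {1, 2, 3, 4})  # True
print(len(B))             # 100
print(2 in C, -3 in C)    # True False
```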
This quantification of the dummy variable x has the property of removing ambiguity, for without knowing what sorts of objects the values of x were limited to, we could not tell whether the set A, as described above, contained but four elements, or infinitely many (in case x were allowed to take on real number values). There are important relations between sets, and one of the most important is the idea of inclusion. The set A is said to be a subset of the set B provided that each element of A is also an element of B. Symbolically, we write A ⊂ B as a shorthand for "A is a subset of B," and it is customary to say, if A ⊂ B, that A is contained in B, that B contains A, that A is a subset of B, or that B is a superset of A. For A and B as in the examples above, it is true that A ⊂ B and false that B ⊂ A; in this case we write B ⊄ A. For a formal definition, we can state the following: Let A and B be sets. Then A ⊂ B if, and only if, for each x ∈ A also x ∈ B. Moreover, using the relation of set-inclusion, we can also define set-equality: Let A and B be sets. Then A = B if and only if both A ⊂ B and B ⊂ A. New sets can also be manufactured from old ones. The union of the two sets S and T is the set consisting of all elements which belong either to S or to T (or both), and is denoted by S ∪ T. In symbols, then,
S ∪ T = {x | x ∈ S or x ∈ T}.
We can also form the intersection of S and T, consisting of all elements common to S and T; that is,
S ∩ T = {x | x ∈ S and x ∈ T}.
Since it may happen that S and T have no elements in common, it turns out to be convenient to accept the notion of the so-called empty set, denoted by ∅, which contains no elements. Thus, for example, if
S = {x | x is an even integer}
and
T = {x | x is an odd integer},
then
S ∩ T = ∅.
In the examples below, let
A = {1, 2, 3, 4},
B = {1, 2, 3, 4, ..., 100},
C = {1, 2, 3, 4, ...},
S = {..., -4, -2, 0, 2, 4, 6, ...},
and
T = {..., -3, -1, 1, 3, 5, 7, ...}.
Example 6.1 The following are true statements: 5 ∉ A, 5 ∉ S, 5 ∈ B, 5 ∈ C, 5 ∈ T, 5 ∉ ∅.
Example 6.2 The following are true statements: A ⊂ B, B ⊂ C, S ⊄ T, ∅ ⊂ S, C ⊄ S, T ⊄ C.
Example 6.3 The following are true statements: A ∪ B = B, A ∪ C = C, S ∪ T = W, where W is the set of all whole numbers.
Fig. 6.1 Venn diagrams illustrating A ∩ B and A ∪ B.
Example 6.4 The following are true statements: A ∩ B = A, A ∩ S = {2, 4}, C ∩ T = {1, 3, 5, 7, ...}.
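The statements in Examples 6.1 through 6.4 can be tested mechanically once the infinite sets are cut down to a finite window. The Python sketch below makes that assumption explicit: the window size N is an arbitrary choice, and the infinite sets C, S, T, and W are represented only by the part of each that lies inside the window.

```python
N = 200  # size of the finite window; an arbitrary choice, at least 100 so B fits in C
A = {1, 2, 3, 4}
B = set(range(1, 101))
C = set(range(1, N + 1))                          # stands in for {1, 2, 3, 4, ...}
S = {x for x in range(-N, N + 1) if x % 2 == 0}   # even integers in the window
T = {x for x in range(-N, N + 1) if x % 2 != 0}   # odd integers in the window
W = set(range(-N, N + 1))                         # all whole numbers in the window

# In Python, `in` tests membership, `<=` tests inclusion,
# `|` forms the union, and `&` forms the intersection.
example_6_1 = [5 not in A, 5 not in S, 5 in B, 5 in C, 5 in T, 5 not in set()]
example_6_2 = [A <= B, B <= C, not S <= T, set() <= S, not C <= S, not T <= C]
example_6_3 = [A | B == B, A | C == C, S | T == W]
example_6_4 = [A & B == A, A & S == {2, 4}, C & T == {x for x in C if x % 2 == 1}]

print(all(example_6_1 + example_6_2 + example_6_3 + example_6_4))  # True
```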
For some people, Venn diagrams as shown in Fig. 6.1 are very useful in visualizing the unions and intersections of sets. The rectangle U symbolizes some fixed universe of elements under consideration, and the area within the circle A is meant to symbolize the elements of the set A. The set A ∩ B is shown on the left, shaded; A ∪ B is shaded in the right figure.
Exercises
6.1 Why are the sets {1, 2, 4} and {2, 4, 1} equal?
6.2 Let A = {1, 2, 3}, B = {2, 3, 5}, and C = {4, 5, 6}. Express the sets below by listing their elements between braces, or by using ∅ if necessary.
a) A ∪ B
b) A ∩ B
c) A ∪ C
d) A ∩ C
e) (A ∪ B) ∪ C
f) A ∪ (B ∪ C)
g) (A ∩ B) ∩ C
h) A ∩ (B ∩ C)
6.3 Using A, B, and C as in the previous exercise, compare (A ∪ B) ∩ C and A ∪ (B ∩ C) to see if they are equal.
Fig. 6.2 The correct way to draw three sets in a Venn diagram.
6.4 In drawing Venn diagrams involving three sets A, B, and C, they should overlap as shown in Fig. 6.2 so as to allow for all possibilities of the various intersections of these sets. Of course, it is possible that no elements lie in some region, such as B ∩ C, but it does no harm to let B and C overlap. Shade in the following sets in two copies of Fig. 6.2:
A ∩ (B ∪ C), and (A ∩ B) ∪ (A ∩ C).
6.5 Your result on the previous exercise should suggest that the formula
A ∩ (B ∪ C) = (A ∩ B) ∪ (A ∩ C)
holds for all sets A, B, and C. The picture is, of course, no substitute for a formal proof, but it can serve as a guide for construction of such a proof. One way to invent such a proof is to choose an arbitrary element, say x, from the set A ∩ (B ∪ C) and show that x must necessarily belong to the set (A ∩ B) ∪ (A ∩ C). Then one would choose an arbitrary element x (there is no harm in using the same symbol as before) from the set (A ∩ B) ∪ (A ∩ C) and show that x must also necessarily belong to the set A ∩ (B ∪ C). These two proofs would show that, in order,
A ∩ (B ∪ C) ⊂ (A ∩ B) ∪ (A ∩ C),
and
(A ∩ B) ∪ (A ∩ C) ⊂ A ∩ (B ∪ C).
Hence by the definition of equality of sets,
A ∩ (B ∪ C) = (A ∩ B) ∪ (A ∩ C).
Provide the necessary details of this proof.
6.6 The previous exercise should remind you of the distributive law of multiplication over addition, which holds in the real number system. If a, b, and c are real numbers, then
a · (b + c) = (a · b) + (a · c).
Addition does not distribute over multiplication in the real number system; that is, it is generally false that
a + (b · c) = (a + b) · (a + c).
However, union of sets does distribute over intersection, for given sets A, B, and C, it is true that
A ∪ (B ∩ C) = (A ∪ B) ∩ (A ∪ C).
Please prove this.
6.7 The previous two exercises suggest that there is a formal "algebra" of sets much as there is an algebra involving addition and multiplication of real numbers. You may verify as many of the set-algebraic properties listed below as you wish. Capital letters stand for sets.
A = A.
If A = B, then B = A.
If A = B and B = C, then A = C.
If A ⊂ B and B ⊂ C, then A ⊂ C.
A ⊂ A.
∅ ⊂ A.
A ∪ A = A = A ∩ A.
A ∪ B = B ∪ A, and A ∩ B = B ∩ A.
(A ∪ B) ∪ C = A ∪ (B ∪ C).
(A ∩ B) ∩ C = A ∩ (B ∩ C).
∅ ∪ A = A and ∅ ∩ A = ∅.
A ∪ (B ∩ C) = (A ∪ B) ∩ (A ∪ C).
A ∩ (B ∪ C) = (A ∩ B) ∪ (A ∩ C).
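Before attempting proofs of these properties, it can be reassuring to test them on many randomly chosen sets. The Python sketch below does this; the function names, the ten-element universe, and the number of trials are all arbitrary assumptions, and passing such a test is evidence rather than a proof.

```python
import random


def random_subset(universe, rng):
    """Choose a subset of the universe by tossing a fair coin for each element."""
    return {x for x in universe if rng.random() < 0.5}


def check_set_algebra(trials=200, seed=0):
    """Test the listed set-algebraic properties on randomly chosen sets A, B, C."""
    rng = random.Random(seed)
    universe = range(10)
    empty = set()
    for _ in range(trials):
        A, B, C = (random_subset(universe, rng) for _ in range(3))
        assert empty <= A <= A                     # inclusion: empty set and A itself
        assert A | A == A == A & A
        assert A | B == B | A and A & B == B & A
        assert (A | B) | C == A | (B | C)
        assert (A & B) & C == A & (B & C)
        assert empty | A == A and empty & A == empty
        assert A | (B & C) == (A | B) & (A | C)    # union distributes over intersection
        assert A & (B | C) == (A & B) | (A & C)    # intersection distributes over union
    return True


print(check_set_algebra())  # True
```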
6.8 In the above list of properties of the algebra of sets, one can make an analogy with the algebra of real numbers, interpreting A, B, and C as real numbers, intersection as multiplication, and union as addition. What role does ∅ play in this analogy? How good is the analogy? How do you interpret the relation of set-inclusion?
6.9 We define A - B as
A - B = {x ∈ A | x ∉ B}.
Draw a Venn diagram, similar to the ones shown in Fig. 6.1, to illustrate the set A - B.
6.10 Let A, B, and C be sets. Prove that
A - (B ∩ C) = (A - B) ∪ (A - C).
See Exercise 6.9.
6.11 Continuing the ideas in the previous exercise, discover and prove valid a formula resembling the one above for A - (B ∪ C).
6.12 How would you define
A_1 ∪ A_2 ∪ A_3 ∪ A_4 ∪ ...,
and
A_1 ∩ A_2 ∩ A_3 ∩ A_4 ∩ ...,
where, for each natural number n, A_n is a set?
6.13 Prove the formula
A - (B_1 ∩ B_2 ∩ B_3 ∩ B_4 ∩ ...) = (A - B_1) ∪ (A - B_2) ∪ (A - B_3) ∪ (A - B_4) ∪ ....
Is a similar formula with unions and intersections interchanged also valid?
6.14 Let m and n be natural numbers, and let A be a set containing m elements and B be a set containing n elements. What can be said about the number of elements of A ∪ B? What can be said about the number of elements of A ∩ B?
6.15 In the previous exercise, if you know that the number of elements of A ∩ B is k, what can be said about the number of elements in A ∪ B?
6.2 FUNCTIONS
Just as the concept of "set" is of first magnitude in the galaxy of mathematics, so is the concept of a "function." Let A and B be sets. A function f from A to B is a rule which assigns to each element of A one and only one element of B. We write f: A → B, call A the domain of f, and B the range of f. If x is an element of A, and y is that
element of B assigned by f to x, we call y the value of f at x, and write y = f(x).
In this definition we encounter the same problem as in our definition of set. We have simply substituted for the word "function" another primitive term, "rule," as a sort of synonym. In Exercise 6.16 we shall discuss another method of defining the term "function," using a somewhat clearer and more natural primitive term. If f: A → B is a function, the phrase "f assigns to each element of A one and only one element of B" is not meant to be interpreted as meaning that each element of B is used exactly once as a value of f. It is permissible to use some elements of B more than once, and others not at all. For example, let R denote the set of all real numbers, and let f: R → R according to the rule f(x) = x^2. Then f is just the function which assigns to each real number its square. Not every number in the range is used; -4, for example, is not a value of f. On the other hand, some elements in the range are used more than once: f(2) = 4 and also f(-2) = 4, so that the number 4 in the range is used twice. For another example, suppose you are given a large collection of gummed labels, say at least a hundred thousand with the phrase "five feet, two inches" printed on them, another hundred thousand with the phrase "six feet, eleven inches" printed on them, and so on, a hundred thousand for each of the possible heights of students you might encounter at your school. Suppose next that you walk around campus until you have met each student; upon first meeting each, you paste on his forehead that label most nearly indicating his height. (If a person claims to be exactly five feet, eleven and one-half inches tall, just round this off to six feet.) In this example the domain is the set of all students on campus, the range is a set of heights, and you are the rule, assigning to each element (student) in the domain one and only one element (his height) in the range.
The "one and only one" part of the definition comes in because no student has two different labels pasted on him and because we presume you continue this campaign sufficiently long so that each student receives a label. You, as the paster, are operating as the rule part of this function. The function itself consists of three things, a domain, a range, and a rule; in this case, the function consists of the set of students on campus, the set of heights, and yourself. Note again that it is permissible for some heights to be used more than once (you will likely encounter at least two students of almost exactly the same height), and some labels, such as the "eight feet, ten inches" label, may never be used at all, so that the possible height "eight feet, ten inches" would be an unused element of the range. It is sometimes convenient to consider the set of all values in its range that a function does use; this is called the image of the function, and we can define it using set-theoretic notation as follows.
Let $f: A \to B$ be a function. Then the image of $f$ is the set
$$\mathrm{Im}(f) = \{\, y \in B \mid y = f(x) \text{ for some } x \in A \,\}.$$
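For a function whose domain is a finite set, the image can be computed directly from this definition. The following Python sketch does so for the gummed-label example. The student names and heights are made-up illustrative data, and `label_for` and `image` are hypothetical helper names, not notation from the text.

```python
import math

# Hypothetical data: each student's height in inches (illustration only).
students = {"Ann": 62.0, "Bob": 71.5, "Cal": 62.3, "Dee": 83.0}

# The range: one label for every whole-inch height from 4 ft to 8 ft 10 in.
labels = set(range(48, 107))

def label_for(height):
    """The rule: round a height to the nearest whole inch (halves round up)."""
    return math.floor(height + 0.5)

def image(rule, domain_values):
    """Im(f): the set of values in the range that the rule actually uses."""
    return {rule(v) for v in domain_values}

used = image(label_for, students.values())
print(sorted(used))                # [62, 72, 83]
print(106 in labels, 106 in used)  # True False
```

The label 62 is used twice and the label 106 ("eight feet, ten inches") is never used, so the image is a proper subset of the range.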
In the study of infinite sets, there are two properties of functions which will be of particular interest to us. First, it may happen that a function does indeed use each value in its range no more than once. Such a function is said to be one-to-one. It may also happen that a function has its image equal to its range; such a function is said to be onto. Again, we provide a formal definition.

Let $f: A \to B$ be a function. The function $f$ is said to be one-to-one provided that, if $x_1$ and $x_2$ are elements of $A$ and $x_1 \neq x_2$, then $f(x_1) \neq f(x_2)$. If $\mathrm{Im}(f) = B$, then $f$ is said to be onto. Moreover, if $f$ is both one-to-one and onto, then $f$ is said to be a one-to-one correspondence from $A$ to $B$.

Example 6.5 Let $f: R \to R$ by $f(x) = x^2$. Then $f$ is a function, because
a) each real number has a square, so that $R$ is the domain and $f$ can operate on $R$;
b) the square of each real number is again a real number, so that the "output" of $f$ does lie in $R$, and hence $R$ is the range of $f$; and
c) if $a \in R$, $b \in R$, and $a = b$, then $a^2 = b^2$, and hence $f(a) = f(b)$. Hence even if a real number is known by two different names $a$ and $b$, the output of the function is fixed for that number.
So $f$ has the "one and only one" property. However, $f$ is not one-to-one, and neither is it onto, as we have already observed.

Example 6.6 Let $f: R \to R$ according to the rule $f(x) = 2^x$. Then $f$ is a function, for reasons similar to those above. In addition, $f$ is one-to-one, since if $a$ and $b$ are real numbers and $a \neq b$, then $2^a \neq 2^b$, and hence $f(a) \neq f(b)$. But $f$ is not onto, because for each real number $x$, $2^x$ is positive. Although $-1 \in R$, the range of $f$, $-1 \notin \mathrm{Im}(f)$. The graph of the function $f$ of this example is shown in Fig. 6.3.
Example 6.7 Let $f: R \to R$ by $f(x) = x^3 - x$. The graph of $f$ is shown in Fig. 6.4. Again, $f$ is a function, but $f$ is not one-to-one, since 1 and $-1$ are both numbers in the domain of $f$, but although $1 \neq -1$, $f(1) = 0 = f(-1)$. However, $f$ is onto, since given a number $y$ in the range of $f$, there does exist at least one value of $x$ in the domain of $f$ for which $y = x^3 - x$; that is, $y = f(x)$.
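The two definitions can be tested mechanically when the domain and range are finite. The Python sketch below uses small finite stand-ins for the preceding examples. The sets and the helper names `is_one_to_one` and `is_onto` are assumptions made for illustration. Functions on all of $R$ cannot be checked by enumeration in this way.

```python
def is_one_to_one(rule, domain):
    """True if different elements of the domain always get different values."""
    values = [rule(x) for x in domain]
    return len(values) == len(set(values))

def is_onto(rule, domain, target):
    """True if the image of the rule equals the whole target set (the range)."""
    return {rule(x) for x in domain} == set(target)

A = [-2, -1, 0, 1, 2]

def square(x):
    return x * x

def cubic(x):
    return x**3 - x

def shift(x):
    return x + 1

# x -> x^2 from A to {0, 1, 2, 3, 4}: neither one-to-one nor onto.
print(is_one_to_one(square, A), is_onto(square, A, range(0, 5)))
# x -> x^3 - x from A to {-6, 0, 6}: onto but not one-to-one.
print(is_one_to_one(cubic, A), is_onto(cubic, A, [-6, 0, 6]))
# x -> x + 1 from A to {-1, 0, 1, 2, 3}: a one-to-one correspondence.
print(is_one_to_one(shift, A), is_onto(shift, A, range(-1, 4)))
```

Note that whether a function is onto depends on the stated range as well as on the rule: the same rule `shift` would fail to be onto if the target set were enlarged.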
Example 6.8 This is an example of a non-function. Let the domain and range each be the set $R$ of all real numbers, as in the previous example, and let $f$ assign to each number its reciprocal. We might write $f(x) = 1/x$.
Fig. 6.3 The graph of $f(x) = 2^x$, a one-to-one function from $R$ to $R$.
Fig. 6.4 The graph of $f(x) = x^3 - x$, an onto function from $R$ to $R$.
Then $f$ is not a function, for there is at least one number, namely 0, in the domain of $f$ to which no value in the range is assigned by $f$.

Example 6.9 This is another example of a non-function. Again, let the domain and range each be $R$. Let $f$ assign to each irrational number the value 17. Given a rational number, express it as a fraction $a/b$. Then $f$ assigns to this rational number the value $a + b$. This is not a function, because there are equal numbers in the domain to which are assigned unequal numbers in the range. For example, $1/2 = 2/4$, but $f(1/2) = 3 \neq 6 = f(2/4)$.

Example 6.10 This is our last example of a non-function. Let the domain be the set $R$ of all real numbers, and the range be the set $[0, 1]$ of all real numbers between 0 and 1 (including 0 and 1). Let $f$ have the rule $f(x) = 2x + 3$. Then $f$ is not a function, for although the number 12 is in its domain, $f(12)$ is technically undefined; in any case, the only "value" $f(12)$ could have, according to the rule of $f$, is 27, and 27 does not belong to the range of $f$. Hence $f$ does not assign to the number 12 in its domain any value in its range.
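The failure in Example 6.9 can be seen concretely: the rule acts on the way a rational number is written, not on the number itself. A minimal Python sketch follows, in which `rule` is a hypothetical name for the "numerator plus denominator" recipe of that example.

```python
from fractions import Fraction

def rule(a, b):
    """The recipe of Example 6.9, applied to the written form a/b."""
    return a + b

# 1/2 and 2/4 are the same rational number ...
print(Fraction(1, 2) == Fraction(2, 4))  # True
# ... but the recipe gives different outputs for the two written forms.
print(rule(1, 2), rule(2, 4))            # 3 6
```

Since one element of the domain would receive two different values, the "one and only one" requirement fails.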
Example 6.11 Let $f: R \to R$ according to the rule $f(x) = 2x + 3$. Then not only is $f$ a function, but in fact $f$ is a one-to-one correspondence from $R$ to itself. To show the latter, we need only show that $f$ is both one-to-one and onto. First, suppose that $x_1$ and $x_2$ are two numbers in the domain of $f$ such that $f(x_1) = f(x_2)$. Then
$$2x_1 + 3 = 2x_2 + 3,$$
so that
$$2x_1 = 2x_2,$$
and hence
$$x_1 = x_2.$$
Consequently, if $x_1 \neq x_2$, then $f(x_1) \neq f(x_2)$. Hence $f$ is one-to-one. Now suppose that $y$ belongs to the range $R$ of $f$. Let
$$x = \frac{y - 3}{2}.$$
Then
$$f(x) = 2x + 3 = 2 \cdot \frac{y - 3}{2} + 3 = (y - 3) + 3 = y.$$
Hence given the number $y$ in the range of $f$, there does exist a value of $x$, namely $x = (y - 3)/2$, such that $f(x) = y$ and $x$ does lie in the domain $R$ of $f$. Hence $f$ is also onto, and thus by definition is a one-to-one correspondence from $R$ to itself.

If $f: A \to B$ is a one-to-one correspondence from the set $A$ to the set $B$, there is then naturally associated with this function another function $g: B \to A$ according to the rule: $g(y)$ is that element $x$ of $A$ such that $f(x) = y$.
The function $g$ is called the inverse of the function $f$, and is sometimes denoted by $f^{-1}$.

Example 6.12 Let $f: R \to R$ according to the rule $f(x) = 2x + 3$. We saw in Example 6.11 that $f$ is a one-to-one correspondence from $R$ to $R$. So $f$ must have an inverse $g$. In the proof that $f$ is onto, we saw that the element $x$ of the domain of $f$ to which $f$ assigns the element $y$ of the range has the value $x = (y - 3)/2$. Hence the rule of $g$ is given by
$$g(y) = \frac{y - 3}{2},$$
or, if you prefer,
$$g(x) = \frac{x - 3}{2}.$$
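As a quick check of Example 6.12, the following Python sketch verifies on a sample of rational numbers that $g$ undoes $f$ and $f$ undoes $g$. The sample set is an arbitrary choice made for illustration, and exact fractions are used to avoid rounding error. A finite check of this kind illustrates the inverse relationship but is not a proof of it.

```python
from fractions import Fraction

def f(x):
    """The rule of Example 6.11: f(x) = 2x + 3."""
    return 2 * x + 3

def g(y):
    """The inverse rule found in Example 6.12: g(y) = (y - 3)/2."""
    return (y - 3) / 2

# Arbitrary sample of rational numbers from -5 to 5 in steps of 1/4.
samples = [Fraction(n, 4) for n in range(-20, 21)]

print(all(g(f(x)) == x for x in samples))  # True
print(all(f(g(y)) == y for y in samples))  # True
```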
Note that we can use any reasonable symbol we please to specify how the rule of a function acts, for in the above example, $g$ is simply the function which assigns to each real number that number obtained by halving the given number diminished by three. Note also how much clearer it is, at least in this case, to use symbols rather than words to describe the rule part of the function. Finally, as you may have noticed, there is no reason why we should restrict our attention to functions which have domains and ranges subsets of $R$, and rules that look like algebraic formulas. The example with the stickers giving a student's height shows that other sets and rules may be used. And the rule does not have to have any special regularity about it; as a final example, you could simply paste the stickers on the students' foreheads more or less at random, so long as each student received one and only one sticker, and this would be a different function from the height function previously mentioned.

Exercises
6.16 We give here another primitive term, that of ordered pair, by which an equivalent definition of "function" can be given. The ordered pair $(a, b)$ consists of two objects $a$ and $b$ together with the idea that $a$ is the first, and $b$ the second, in the pairing. The Cartesian product $A \times B$ of two sets is defined as follows:
$$A \times B = \{\, (a, b) \mid a \in A \text{ and } b \in B \,\}.$$
A familiar example of an application of the above concept is the ordinary coordinate system for the two-dimensional plane used in analytic geometry. One forms the Cartesian product $R \times R$, and then $a$ is the so-called $x$-coordinate, and $b$ the $y$-coordinate, of the point $(a, b) \in R \times R$. We can provide an alternate but equivalent definition of the term "function." Let $A$ and $B$ be sets. A function $f$ from $A$ to $B$ is a subset of $A \times B$ such that
a) for each $x \in A$, there exists $y \in B$ such that $(x, y) \in f$; and
b) if $(x, y) \in f$ and $(x, z) \in f$, then $y = z$.
If $f$ is a function in the sense of this definition, and $(a, b) \in f$, does our old notation $f(a)$ make sense? What is the value of $f(a)$?
6.17 Give an example unlike Example 6.6 of a function $f: R \to R$ that is one-to-one but not onto.
6.18 Give an example unlike Example 6.7 of a function $f: R \to R$ that is onto but not one-to-one.
6.19 Give an example unlike Example 6.11 of a function $f: R \to R$ that is a one-to-one correspondence from $R$ to $R$.
6.20 Find $f^{-1}$ for your example above.
6.21 The open interval $(-\pi/2, \pi/2)$ of all real numbers between $-\pi/2$ and $\pi/2$ can be thought of as a set of angles in radian measure, and with this domain and with range $R$, the rule $f(x) = \tan x$ gives a function that is a one-to-one correspondence from $(-\pi/2, \pi/2)$ to $R$. The graph of $f$ is shown in Fig. 6.5. Sketch the graph of $f^{-1}$.
6.22 Let $A$ and $B$ be sets, and $f$ a one-to-one correspondence from $A$ to $B$. Prove that not only is $f^{-1}$ a function with domain $B$ and range $A$, but also that $f^{-1}$ is a one-to-one correspondence from $B$ to $A$.
6.23 Let $A$ be a set. Construct a one-to-one correspondence from $A$ to itself. Hint: This problem is very easy.
6.24 Let $A$, $B$, and $C$ be sets, and let $f: A \to B$ and $g: B \to C$ be functions. As indicated in Fig. 6.6, we can go directly from $A$ to $C$ by a new function, called the composition of $f$ and $g$, which we construct as follows. We denote this function by $\{g(f)\}$, and define $\{g(f)\}: A \to C$ according to the rule $\{g(f)\}(x) = g(f(x))$. Verify that $\{g(f)\}: A \to C$ is indeed a function.
Fig. 6.5 The graph of $f(x) = \tan x$, $-\pi/2 < x < \pi/2$.
Fig. 6.6 The composition $\{g(f)\}$ of the functions $f$ and $g$.
6.25 Using the notation and definitions of the previous exercise, verify that if each of $f$ and $g$ is one-to-one, then so is $\{g(f)\}$.
6.26 Using the notation and definitions of Exercise 6.24, prove that if each of $f$ and $g$ is a one-to-one correspondence from its domain to its range, then $\{g(f)\}$ is a one-to-one correspondence from $A$ to $C$.
6.27 Continuing the previous exercise, find a formula for $\{g(f)\}^{-1}$ in terms of $f^{-1}$ and $g^{-1}$.
6.28 See Exercise 6.24. Let $f: R \to R$ by $f(x) = x^2$, and let $g: R \to R$ by $g(x) = x + 1$. To be equal, two functions must have the same domain, the same range, and the effects of their rules must be the same, in that if $x$ is in their common domain, then $f(x) = g(x)$. Are the functions $f$ and $g$ given in this exercise equal functions? Why?
6.29 Using the functions $f$ and $g$ of the previous exercise, find $\{g(f)\}$ and $\{f(g)\}$. Are the latter two equal functions? Explain your answer.
6.30 Let $f: A \to B$ be a one-to-one correspondence from the set $A$ to the set $B$, and let $g = f^{-1}$. Show that $\{g(f)\}(x) = x$ for all $x \in A$ and that $\{f(g)\}(x) = x$ for all $x \in B$.
6.3 MORE ON ONE-TO-ONE CORRESPONDENCES
Recall that we say there is a one-to-one correspondence from the set $A$ to the set $B$ provided that there is a one-to-one and onto function $f: A \to B$. If so, we shall use the notation $A \sim B$, and say that the sets $A$ and $B$ can be put into one-to-one correspondence. By using the notion of one-to-one correspondence between two sets, we shall be able to define what we mean by a finite set, an infinite set, and what it means for two sets to have the same number of elements even if this number should be infinite. First, however, we examine the properties of one-to-one correspondences; specifically, we want first to show that this relation is an equivalence relation (see Exercise 1.6). That is, we want to show that if $A$, $B$, and $C$ are any sets, then
a) $A \sim A$.
b) If $A \sim B$, then $B \sim A$.
c) If $A \sim B$ and $B \sim C$, then $A \sim C$.
But part (a) has been taken care of in Exercise 6.23, part (b) has been taken care of in Exercise 6.22, and part (c) has been proved in your work for Exercise 6.26. So from this point on, we can eliminate a large number of tedious constructions from many proofs and exercises. For example, if you know that each of the two sets $S$ and $T$ can be put into one-to-one correspondence with a third set $U$, then you know that $S$ and $T$ can also be put into one-to-one correspondence with each other; knowing this, you know in addition that there exists some one-to-one and onto function $f: S \to T$, so you can simply "let" $f$ be such a function, and use $f$ where necessary in your proofs.

Now we can give a precise definition of what it means for a set to be infinite. We are going to say that a set is infinite provided that it is not finite, so first we define what we mean by a finite set. The set $S$ is said to be finite provided that either $S = \emptyset$ or, for some natural number $n$, $S \sim \{1, 2, 3, \ldots, n\}$. The set $S$ is said to be infinite provided that $S$ is not finite.

The example of an infinite set that comes most easily to mind is the set $N = \{1, 2, 3, 4, \ldots\}$, whose elements are all the positive whole numbers. But in order to show that $N$ is in fact an infinite set, it is necessary to show that $N \neq \emptyset$ (which is easy) and that there can exist no one-to-one correspondence between $N$ and any set of the form $\{1, 2, 3, \ldots, n\}$. This is likely to be a formidable task, for one must show for infinitely many different values of $n$ that no such one-to-one correspondence can exist. Moreover, it is also quite difficult to prove at this point the obvious, necessary fact that the two sets $\{1, 2, 3, \ldots, n\}$ and $\{1, 2, 3, \ldots, m\}$ can be put into one-to-one correspondence if and only if $m = n$. In the next section we shall provide the Cantor-Schroeder-Bernstein Theorem, an almost indispensable tool in dealing with problems of this sort.

Exercises
6.31 Supply, in addition to $N$, two more examples of infinite sets.
6.32 Show that the set $N$ of positive whole numbers can be put into one-to-one correspondence with its proper subset $E$ of even positive whole numbers. Note: In order to do this, you must show the existence, presumably by construction, of a one-to-one correspondence between $N$ and $E$; that is, you must construct a one-to-one and onto function $f: N \to E$ (or $f: E \to N$).
6.33 Show that the set $N$ of positive whole numbers can be put into one-to-one correspondence with its proper subset $M$ of all squares of positive whole numbers.
6.34 Show that the set $N$ of positive whole numbers can be put into one-to-one correspondence with its subset $T$ of all such numbers with two or more digits.
6.35 Show that the set $N$ of positive whole numbers can be put into one-to-one correspondence with the set $X = \{-1, -2, -3, -4, \ldots\}$.
6.36 Give three different functions each of which is a one-to-one correspondence from the set $N$ to itself.
6.37 Give two different functions each of which is a one-to-one correspondence from the set $N$ to its proper subset $L = \{1, 3, 5, 7, 9, \ldots\}$.
6.38 Show that the set $N$ of positive whole numbers can be put into one-to-one correspondence with the set $W$ of all whole numbers.
6.39 Show that the set of points on a line one unit long can be put into one-to-one correspondence with the set of points on a line two units long.
6.40 Show that the set $(0, 1)$ of real numbers between 0 and 1 (not including 0 or 1) can be put into one-to-one correspondence with the set $R$ of all real numbers. Hint: In Exercise 6.21, an example was given of a one-to-one correspondence between the set $(-\pi/2, \pi/2)$ and the set $R$. If you can show that $(-\pi/2, \pi/2)$ and $(0, 1)$ can also be put into one-to-one correspondence, then it will follow (why?) that $(0, 1)$ and $R$ can also be put into one-to-one correspondence. Assuming that you have established a one-to-one correspondence, say $f$, from $(0, 1)$ to $R$, can you now show the existence of a one-to-one correspondence from $[0, 1]$ to $R$? ($[0, 1] = \{\, x \in R \mid 0 \leq x \leq 1 \,\}$.) How?
6.4 THE CANTOR-SCHROEDER-BERNSTEIN THEOREM
There are many cases in which one desires to show that two sets can be put into one-to-one correspondence, but all that can easily be accomplished in practice is something such as this: $S$ and $T$ are two given sets, and it turns out to be possible to show that $S$ can be put into one-to-one correspondence with some subset $B$ of $T$, and that $T$ can be put into one-to-one correspondence with some subset $A$ of $S$. In Fig. 6.7 this situation is diagrammed, with $\Phi$ the one-to-one function from $S$ onto $B$ and $\Psi$ the one-to-one function from $T$ onto $A$. This situation seems to suggest that since $T$ has at least as "many" elements as $S$, and $S$ as many as $T$, then $S$ and $T$ themselves could be put into one-to-one correspondence. However, though this is true, finding such a correspondence in actual practice can sometimes be rather complicated. The Cantor-Schroeder-Bernstein Theorem guarantees under these conditions the existence of a one-to-one correspondence between $S$ and $T$.
Theorem 6.1 (Cantor-Schroeder-Bernstein) Let $S$ and $T$ be two sets, and suppose that $S \sim B$, where $B \subset T$, and that $T \sim A$, where $A \subset S$. Then $S \sim T$.

Proof Let $\Phi: S \to B$ and $\Psi: T \to A$ each be one-to-one and onto functions. Both $\Phi$ and $\Psi$ exist because of our hypotheses that $S \sim B$ and $T \sim A$. We shall produce a function $\Theta$ that is a one-to-one correspondence from $S$ to $T$. The proof of the existence of $\Theta$ will actually be constructive, so that a formula for $\Theta$ could actually be written in a specific case; however, in such cases the formula could be so complicated that we will generally be content with the knowledge of the existence of $\Theta$.

Fig. 6.7 The hypotheses of the Cantor-Schroeder-Bernstein Theorem: The functions are one-to-one.

Consider an element $x \in S$. If there exists an element $y \in T$ such that $\Psi(y) = x$, then we will call $y$ a parent of $x$. If such an element $y \in T$ in addition has a parent $z \in S$, such that $\Phi(z) = y$, we will also call $z$ a parent of $x$, and in this case $\{\Psi(\Phi)\}(z) = x$. Tracing parents back in this way as far as possible, we divide $S$ into three mutually exclusive subsets:
$$S_\infty = \{\, x \in S \mid x \text{ has infinitely many parents} \,\},$$
$$S_\sigma = \{\, x \in S \mid \text{the ancestry of } x \text{ begins in } S \,\},$$
$$S_\tau = \{\, x \in S \mid \text{the ancestry of } x \text{ begins in } T \,\}.$$
Note that we have chosen the subscripts so as to help us remember which elements of $S$ belong to each of these subsets of $S$. Of course, $S - A \subset S_\sigma$, but in fact $S_\sigma$ contains those elements of $S$ with an even number of parents (recall that 0 is an even number) and $S_\tau$ contains those elements of $S$ with an odd number of parents. See Fig. 6.8.

Fig. 6.8 The element $a \in S$ has infinitely many parents; the element $b \in S$ has ancestry beginning in $S$; the element $c \in S$ has ancestry beginning in $T$.

Similarly, we divide $T$ up into three mutually exclusive subsets as follows:
$$T_\infty = \{\, y \in T \mid y \text{ has infinitely many parents} \,\},$$
$$T_\sigma = \{\, y \in T \mid \text{the ancestry of } y \text{ begins in } S \,\},$$
$$T_\tau = \{\, y \in T \mid \text{the ancestry of } y \text{ begins in } T \,\}.$$
We emphasize that $T_\infty$, $T_\sigma$, and $T_\tau$ are mutually exclusive (the intersection of any two is the empty set) and that $T = T_\infty \cup T_\sigma \cup T_\tau$. Hence each element of $T$ belongs to exactly one of the sets described above. Similar remarks hold for $S$.
We will establish that $\Phi$ is a one-to-one correspondence from $S_\infty$ to $T_\infty$, that $\Phi$ is also a one-to-one correspondence from $S_\sigma$ to $T_\sigma$, and that $\Psi$ is a one-to-one correspondence from $T_\tau$ to $S_\tau$. This will prove that $S_\infty \sim T_\infty$, $S_\sigma \sim T_\sigma$, and $S_\tau \sim T_\tau$. Then we can "glue together" the functions $\Phi$ on $S_\infty \cup S_\sigma$ and $\Psi^{-1}$ on $S_\tau$ to obtain the one-to-one correspondence $\Theta$ from $S$ to $T$, as desired.

Now $S_\infty \subset S$, so $\Phi$ is defined on $S_\infty$. Moreover, if $x \in S_\infty$, then $x$ has infinitely many parents, so that $\Phi(x)$ is an element of $T$ with infinitely many parents as well. Hence $\Phi$ is a function with domain $S_\infty$ and range $T_\infty$. (Actually, this is an improper use of terminology, since by restricting the domain of $\Phi$ from $S$ to $S_\infty$ we are actually considering a new function; perhaps we should indicate this by calling this new function $\Phi_\infty$ instead of just $\Phi$, but the additional notation hardly seems justified in this case.) We need to show that $\Phi: S_\infty \to T_\infty$ is both one-to-one and onto. But $\Phi$ is clearly one-to-one, since it is one-to-one on all of $S$. And if $y \in T_\infty$, then $y$ has infinitely many parents, so in particular it has an immediate parent $x \in S$, for which $\Phi(x) = y$. But this element $x$ must actually belong to $S_\infty$, for if $x$ had but finitely many parents then so would $y$. Hence there does indeed exist an element $x \in S_\infty$ such that $\Phi(x) = y$. Thus $\Phi: S_\infty \to T_\infty$ is both one-to-one and onto, and thus is indeed a one-to-one correspondence from $S_\infty$ to $T_\infty$.

By the symmetry of the remaining two cases, it is sufficient to prove only one of them, for example that $\Psi$ is a one-to-one correspondence from $T_\tau$ to $S_\tau$. This is left for you, in the exercises at the end of this section. So we may assume that we know that each of the following is a one-to-one correspondence:
$$\Phi: S_\infty \to T_\infty, \qquad \Phi: S_\sigma \to T_\sigma, \qquad \Psi: T_\tau \to S_\tau.$$
See Fig. 6.9. By Exercise 6.22, $\Psi^{-1}$ is also a one-to-one correspondence from $S_\tau$ to $T_\tau$, so that we have the following situation. The three functions shown below are each one-to-one and onto:
$$\Phi: S_\infty \to T_\infty, \qquad \Phi: S_\sigma \to T_\sigma, \qquad \Psi^{-1}: S_\tau \to T_\tau.$$
The three domains shown above are disjoint and their union is all of $S$; the three ranges shown above are also disjoint and their union is all of $T$.
Fig. 6.9 Each function is a one-to-one correspondence on the indicated subset.
So we define $\Theta: S \to T$ as follows:
If $x \in S_\infty$, then $\Theta(x) = \Phi(x)$.
If $x \in S_\sigma$, then $\Theta(x) = \Phi(x)$.
If $x \in S_\tau$, then $\Theta(x) = \Psi^{-1}(x)$.
It should be clear not only that $\Theta$ is a function, but in fact that it is a one-to-one correspondence from $S$ to $T$. Thus $S \sim T$, and this concludes the proof of the Cantor-Schroeder-Bernstein Theorem.
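To see the construction of the proof at work, here is a minimal Python sketch for one specific case that is not taken from the text: it assumes $S = T = N$, with both one-to-one functions given by doubling, so that each maps $N$ onto the even numbers. In this case an element's parents are found by halving repeatedly, no element has infinitely many parents, and the function names are hypothetical.

```python
def phi(n):
    """Assumed one-to-one function from S = N into T = N."""
    return 2 * n

def psi(n):
    """Assumed one-to-one function from T = N into S = N."""
    return 2 * n

def psi_inverse(x):
    """Inverse of psi, defined only on the image of psi (the even numbers)."""
    assert x % 2 == 0
    return x // 2

def number_of_parents(x):
    """Trace the ancestry of x: each halving step finds one more parent."""
    count = 0
    while x % 2 == 0:
        x //= 2
        count += 1
    return count

def theta(x):
    """Glue phi (even number of parents) to psi_inverse (odd number)."""
    if number_of_parents(x) % 2 == 0:
        return phi(x)        # ancestry of x begins in S
    return psi_inverse(x)    # ancestry of x begins in T

print([theta(x) for x in range(1, 11)])
# [2, 1, 6, 8, 10, 3, 14, 4, 18, 5]
```

In this particular case the glued function happens to undo itself: applying it twice returns the starting number, which gives a quick way to see that it is both one-to-one and onto.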
As an application, we show how this theorem can be used to show the existence of a one-to-one correspondence between the set
$$W = \{\ldots, -2, -1, 0, 1, 2, 3, \ldots\}$$
of all whole numbers, and the set $X = W \times W$, or
$$X = \{\, (m, n) \mid m \in W \text{ and } n \in W \,\},$$
the Cartesian product of $W$ with itself. (This may appear to be a surprising result, since $X$ appears to be much "larger" than $W$.)

Let $\Phi: W \to X$ by the rule $\Phi(n) = (n, 0)$. Then $\Phi$ is clearly a one-to-one function from $W$ to $X$. Now all we need is a one-to-one function $\Psi$ from $X$ to $W$. Among other things, if $(m, n)$ is an ordered pair of whole numbers (a typical element of $X$), we must so design $\Psi$ that $\Psi(m, n)$ is a whole number, and thus an element of $W$. One approach that almost works is to let $\Psi(m, n) = mn$. Then $\Psi$ is a function from $X$ to $W$, but unfortunately is not one-to-one, since
$$\Psi(1, 6) = 6 = \Psi(2, 3) \quad \text{and} \quad (1, 6) \neq (2, 3).$$
Another approach that almost works is to let $\Psi(m, n) = 2^m \cdot 3^n$. Then $\Psi$ would be one-to-one, but unfortunately not a function from $X$ to $W$, since $(1, -1) \in X$ but $\Psi(1, -1) = 2/3$ and $2/3 \notin W$. But a modification of the last approach does work. In the last approach it is only the fact that $m$ or $n$ might be negative that produces fractions, rather than whole numbers, in the output of $\Psi$, so we define $\Psi$ as follows:
If $m \geq 0$ and $n \geq 0$, then $\Psi(m, n) = 2^m \cdot 3^n$.
If $m \geq 0$ and $n < 0$, then $\Psi(m, n) = 2^m \cdot 5^{-n}$.
If $m < 0$ and $n \geq 0$, then $\Psi(m, n) = 7^{-m} \cdot 3^n$.
If $m < 0$ and $n < 0$, then $\Psi(m, n) = 7^{-m} \cdot 5^{-n}$.
In each case, the output of $\Psi$ is a positive whole number, so that $\Psi: X \to W$ is a function. The fact that each positive whole number has a unique factorization into a product of primes means that the output of $\Psi$ uniquely determines its input; or, in other words, $\Psi$ is one-to-one. Neither $\Phi$ nor $\Psi$ is onto, but this does not matter. We have exactly the situation given in the hypotheses of the Cantor-Schroeder-Bernstein Theorem. $W$ and $X$ are sets, $\Phi: W \to X$ is one-to-one, so that $W \sim B = \mathrm{Im}(\Phi) \subset X$, and $\Psi: X \to W$ is one-to-one, so that $X \sim A = \mathrm{Im}(\Psi) \subset W$. Hence by the Cantor-Schroeder-Bernstein Theorem, $W \sim X$.
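The four-case rule for the function from $X$ to $W$ can be written out and spot-checked on a finite grid of pairs. The Python sketch below assumes that the cases are read so that zero counts as nonnegative, which is needed for every pair to be covered; the grid size is an arbitrary choice, and a check of this kind illustrates the one-to-one property but does not prove it.

```python
def psi(m, n):
    """Send a pair of whole numbers to a positive whole number.

    The primes 2 and 7 record m, the primes 3 and 5 record n, and the
    choice of prime records the sign (zero is treated as nonnegative).
    """
    if m >= 0 and n >= 0:
        return 2**m * 3**n
    if m >= 0 and n < 0:
        return 2**m * 5**(-n)
    if m < 0 and n >= 0:
        return 7**(-m) * 3**n
    return 7**(-m) * 5**(-n)

# Spot-check on a finite grid: no two pairs receive the same value.
grid = [(m, n) for m in range(-6, 7) for n in range(-6, 7)]
values = [psi(m, n) for (m, n) in grid]
print(len(values) == len(set(values)))  # True

# The first attempt, the product mn, fails: (1, 6) and (2, 3) collide.
print(1 * 6 == 2 * 3)  # True
```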
Exercises
6.41 Show, in the proof of Theorem 6.1, that $\Psi$ actually is a one-to-one correspondence from $T_\tau$ to $S_\tau$.
6.42 In the example following Theorem 6.1, in which it is shown that $W \sim X$, show that the function
6.5 PROPERTIES OF FINITE AND INFINITE SETS
First we should show that both finite sets and infinite sets exist. It is clear that the sets of the form $\{1, 2, 3, \ldots, n\}$ are finite for each positive whole number value of $n$. To show the existence of an infinite set, we prove that the set $N$ of positive whole numbers is in fact infinite.

Theorem 6.2 The set $N = \{1, 2, 3, \ldots\}$ is infinite.

Proof The proof is by contradiction; we suppose that the set $N$ is finite. Then, by definition, either $N = \emptyset$ or $N$ can be put into one-to-one correspondence with a set of the form $\{1, 2, 3, \ldots, n\}$, for some positive whole number $n$. Since $17 \in N$, $N \neq \emptyset$, and hence there is a one-to-one and onto function $f$ from $\{1, 2, 3, \ldots, n\}$ to $N$ for some positive whole number $n$. We can think of $f$ as a one-to-one correspondence from the set $\{1, 2, 3, \ldots, n\}$ to the set $\{f(1), f(2), f(3), \ldots, f(n)\}$, and it would not be difficult to think of a method of selecting from the latter set its largest element, a method which would require no more than $n$ steps. So the set $\{f(1), f(2), f(3), \ldots, f(n)\}$ contains a largest element, say $m$, and $m = f(j)$ for some $j$ such that $1 \leq j \leq n$. But $m + 1 \in N$, and since $f$ is onto, $m + 1 = f(k)$ for some $k$ such that $1 \leq k \leq n$; this is impossible, because $m$ is the largest of the values of $f$ and $m + 1 > m$. Hence our original supposition, that $N$ is finite, leads to a contradiction, and consequently $N$ is infinite. This completes the proof of the theorem. Note that it was not necessary to use the fact that $f$ is one-to-one.

Our next result depends on Exercise 6.46, at the end of this section: that if $S$ is any infinite set whatsoever and $x \in S$, then the set $S - \{x\}$ is also infinite, and therefore nonempty. Moreover, our next theorem also implies that the set $N$ is in some sense a "smallest" infinite set; the exact meaning of this statement will be discussed in Exercise 6.47.

Theorem 6.3 If $S$ is an infinite set, then $S$ contains a subset $M$ such that $M \sim N$.
Proof. Choose $x_1 \in S$. Then by Exercise 6.46, $S - \{x_1\}$ is infinite, and thus nonempty. So we may choose $x_2 \in S - \{x_1\}$. Since $S - \{x_1\}$ is infinite, it follows again by Exercise 6.46 that $S - \{x_1, x_2\}$ is infinite and nonempty. So we next choose $x_3 \in S - \{x_1, x_2\}$. We continue this process. It will not terminate, for if $x_1, x_2, x_3, \ldots, x_k$ have been chosen, then still
$$S - \{x_1, x_2, x_3, \ldots, x_k\}$$
is infinite, thus nonempty, and we may choose $x_{k+1}$ from it. Then
$$S - \{x_1, x_2, x_3, \ldots, x_k, x_{k+1}\}$$
is still infinite and nonempty, and the process can be continued. Thus for each positive whole number $m$ we can produce an element $x_m \in S$, and by our construction, if $j \neq m$ then $x_j \neq x_m$. So the set $M = \{x_1, x_2, x_3, \ldots\}$ is a set of distinct elements of $S$, one for each positive whole number. Let $f: N \to M$ according to the rule $f(n) = x_n$. It is clear that $f$ is a one-to-one correspondence, and hence that $M \sim N$. This establishes the theorem.

The next theorem seems "obvious," but it is not easy to prove, at least without using the Cantor-Schroeder-Bernstein Theorem.

Theorem 6.4 Every subset of a finite set is finite.

Proof. Suppose that $F$ is a finite set. The theorem is clearly true if $F = \emptyset$, so we consider only the case in which $F \sim \{1, 2, 3, \ldots, n\}$ for some positive whole number $n$. Suppose by way of contradiction that $F$ contains an infinite subset $S$. By Theorem 6.3, $S$ contains a subset $M$ such that $N \sim M$. Let $f: N \to M$ be a one-to-one correspondence from $N$ to $M$. Now $\{1, 2, 3, \ldots, n\} \subset N$, so that, letting $f(j) = m_j$ for each $j \in N$, the set
$$K = \{m_1, m_2, m_3, \ldots, m_n\}$$
is a subset of $M$ such that $K \sim \{1, 2, 3, \ldots, n\}$. So $K \sim F$. Also, since $K \subset M$ and $M \subset S$, then $K \subset S$.
So we have the following situation: $K \subset S \subset F$ and $K \sim F$. By Exercise 6.43, which is an easy consequence of the Cantor-Schroeder-Bernstein Theorem, it follows that $S \sim F$. But $F \sim \{1, 2, 3, \ldots, n\}$, and hence $S \sim \{1, 2, 3, \ldots, n\}$ as well. This is impossible because $S$ was supposed to be infinite; if $S \sim \{1, 2, 3, \ldots, n\}$ were true, then $S$ would be finite by definition, and again by definition "infinite" means "not finite." This contradiction shows that the finite set $F$ can contain no infinite subsets, and establishes the theorem.

We now prove a theorem sometimes known as the "Dedekind Box Principle," which together with the succeeding theorem will enable us to give an alternate, equivalent, and sometimes more useful definition of what it means for a set to be infinite.

Theorem 6.5 No finite set can be put into one-to-one correspondence with one of its proper subsets.

Proof Let $F$ be a finite set and $G$ a proper subset of $F$. Suppose by way of contradiction that $G \sim F$; then there exists a one-to-one correspondence $\varphi$ from $F$ to $G$. Since $G \subset F$, then $G$ must be finite by Theorem 6.4. Hence either $G = \emptyset$ or else $G \sim \{1, 2, 3, \ldots, n\}$ for some positive whole number $n$. If $G = \emptyset$, then $F \neq \emptyset$ since $G$ is a proper subset of $F$; however, for $F \neq \emptyset$ and $G = \emptyset$, $\varphi: F \to G$ cannot be a function. So we can eliminate the possibility that $G = \emptyset$.

In several previous exercises, it was seen that when a one-to-one correspondence exists between two sets such as $F$ and $G$, there are generally several such correspondences. If we have one, such as $\varphi$, we can let
$$M_\varphi = \{\, x \in F \mid \varphi(x) \neq x \,\}$$
be that subset of $F$ of points of $F$ moved under the action of $\varphi$. By Theorem 6.4, for each such correspondence $\varphi$ the set $M_\varphi$ must be finite because it is a subset of the finite set $F$. So the number of elements in the set $M_\varphi$ is a positive whole number or zero. Choose a one-to-one correspondence $\varphi$ from $F$ to $G$ so that $M_\varphi$ contains the smallest possible number of elements. With this choice of $\varphi$ we will seek to establish a contradiction.

Now $M_\varphi \neq \emptyset$. For if $M_\varphi = \emptyset$, then $\varphi(x) = x$ for all $x \in F$. Then $G = \mathrm{Im}(\varphi) = F$, so that $G$ would not be a proper subset of $F$. Therefore, since $M_\varphi \neq \emptyset$, we may choose an element $x \in M_\varphi \cap G$ (how?). Now $\varphi(x) \neq x$, so $\varphi(x) = y$ for some $y \in F$ such that $y \neq x$. Moreover, $\varphi$ is onto, so there exists $w \in F$ such that $\varphi(w) = x$. And if $w = x$, then $\varphi(w) = \varphi(x)$ since $\varphi$ is a function; but since $\varphi(w) = x$, this would imply that $x = \varphi(x)$, which is false by the choice of $x \in M_\varphi$. Hence $w \neq x$.
Fig. 6.10 $\varphi$ moves $w$ to $x$ and $x$ to $y$.
Next, $y \in M_\varphi$. For if not, then $\varphi(y) = y$. But also $\varphi(x) = y$, and $\varphi$ is one-to-one. This implies that $x = y$, which we have shown false. Hence $y \in M_\varphi$.

Also, $w \in M_\varphi$. For if not, then $\varphi(w) = w$. But $\varphi(w) = x$ by choice of $w$, so that $w = x$ since $\varphi$ is a function. But then $\varphi(x) = x$, again contrary to our choice of $x$. So $w \in M_\varphi$ as well. Hence we have the situation shown in Fig. 6.10. All three of $w$, $x$, and $y$ belong to $M_\varphi$. Also $w \neq x$ and $y \neq x$ (although it is possible that $w = y$; however, for clarity we have shown in Fig. 6.10 the more general case in which $w \neq y$). Finally, $\varphi(w) = x$ and $\varphi(x) = y$. We modify the function $\varphi$ and obtain a slightly different function $\psi$ as follows:
Let $\psi(w) = y$.
Let $\psi(x) = x$.
Let $\psi(z) = \varphi(z)$ if $z \neq w$ and $z \neq x$.
Now $\psi$ is almost the same function as $\varphi$; all that has been done to $\varphi$ in order to obtain $\psi$ is to switch the values of $\varphi$ at $w$ and $x$, as indicated in Fig. 6.11. Hence it is easy to see that $\psi$ is also a one-to-one correspondence from $F$ to $G$. But let us now consider the set $M_\psi$, given by
$$M_\psi = \{\, x \in F \mid \psi(x) \neq x \,\}.$$
Fig. 6.11 $\varphi$ is modified to become $\psi$, which moves $w$ to $y$ and leaves $x$ fixed.
The particular element x, previously chosen from $M_\varphi$, has the property that $\varphi(x) \neq x$; but for this element x, $\psi(x) = x$. So $x \in M_\varphi$ but $x \notin M_\psi$. Moreover, if $v \notin M_\varphi$, then $\varphi(v) = v$ and, in addition, $v \neq w$ and $v \neq x$, since x and w are elements of $M_\varphi$. So, by definition of $\psi$, also $\psi(v) = v$. Thus if $v \notin M_\varphi$, then $v \notin M_\psi$. Hence $M_\psi$ can contain no more elements of F than $M_\varphi$, and in fact $M_\varphi$ actually contains more elements of F than $M_\psi$, because $x \in M_\varphi$ but $x \notin M_\psi$. But $\varphi$ was chosen so that its corresponding set $M_\varphi$ would contain the minimum possible number of elements of F, and here we have constructed $\psi$, another one-to-one correspondence from F to G whose corresponding set $M_\psi$ contains fewer elements than $M_\varphi$. This is in contradiction to the choice of $\varphi$, and this contradiction establishes that no such function as $\varphi$ can exist. Therefore no finite set can be put into one-to-one correspondence with one of its proper subsets, and the theorem is proved.

The mathematician Richard Dedekind suggested an alternative, but equivalent, definition of what it means for a set to be infinite; the term used is that the set is Dedekind infinite, and this means that the set can be put into one-to-one correspondence with one of its proper subsets. Theorem 6.6 shows that this is the same property as being infinite.

Theorem 6.6 The set S is infinite if and only if S is Dedekind infinite.

Proof. Suppose first that S is Dedekind infinite. Then, by definition, S can be put into one-to-one correspondence with one of its proper subsets. By the previous theorem, S cannot be finite; hence by definition, S must be infinite.

Next, suppose that S is an infinite set. By Theorem 6.3, S contains a subset M such that $M \sim N$. Let $\theta$ be a one-to-one correspondence from N to M, and let $x = \theta(1)$. Then $x \in M \subset S$, so that x is also an element of S; let $T = S - \{x\}$. T is certainly a proper subset of S. We will show that S can be put into one-to-one correspondence with T.

We define a function f from S to T as follows. If $s \in S - M$, just let $f(s) = s$. If $s \in M$, then $s = \theta(n)$ for some positive whole number n since $\theta: N \to M$ is onto; in fact, n is uniquely determined by s since $\theta$ is also one-to-one. Let $f(s) = \theta(n + 1)$. Of course, what we are doing here is shoving each element of M up one notch, sending $x = \theta(1)$ to $\theta(2)$, sending $\theta(2)$ to $\theta(3)$, and so on, while leaving the elements of S not in M alone. This action of f is indicated in Fig. 6.12. You may easily verify for yourself that f is indeed a one-to-one correspondence from S to T. So we have shown that the infinite set S can be put into one-to-one correspondence with one of its proper subsets, and thus that S is Dedekind infinite. This completes the proof of Theorem 6.6.

Fig. 6.12 The function f is a one-to-one correspondence from S to $S - \{x\}$.

The Dedekind Box Principle (Theorem 6.5) can be phrased very informally as follows: If a postman has n letters to put into m mailboxes and m < n, then at least one box must get at least two letters.
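The "shove up one notch" map f from the proof of Theorem 6.6 can be tried out on a finite window of a concrete example. The sketch below is only an illustration under assumed choices that are not in the text: S is taken to be N, M is taken to be the even numbers, and `theta(n) = 2n` plays the role of $\theta$.

```python
def theta(n):
    """Assumed one-to-one correspondence from N onto M = even numbers."""
    return 2 * n

def f(s):
    """Shift each element of M up one notch; leave elements outside M alone."""
    if s % 2 == 0:          # s is in M, so s = theta(n) with n = s // 2
        n = s // 2
        return theta(n + 1)
    return s                # s is in S - M

x = theta(1)                # the element removed from S to form T = S - {x}
window = range(1, 21)       # a finite window of S = N
images = [f(s) for s in window]

print(images[:6])           # [1, 4, 3, 6, 5, 8]
```

On this window no two elements share an image, and x = 2 is never an image, which is what the proof asserts for the whole of S. A finite check of this kind is of course not a substitute for the verification the text asks you to carry out.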
Exercises
6.46 Prove that if S is an infinite set and $x \in S$, then $S - \{x\}$ is infinite. Note: This exercise is used to establish Theorem 6.3, so that theorem and its successors may not be used in working this exercise. Hint: First suppose that $S - \{x\}$ is finite, in order to reach an eventual contradiction. Set up a one-to-one correspondence f from $S - \{x\}$ to the set $\{1, 2, 3, \ldots, n\}$. Use f to produce a one-to-one correspondence g from $(S - \{x\}) \cup \{x\}$ to the set $\{1, 2, 3, \ldots, n, n + 1\}$. Explain why this is a contradiction. Draw the desired conclusion.

6.47 Show that if K is a subset of N, then either K is finite or $K \sim N$. Use this fact to show that if S is a set such that $S \sim N$, and T is a subset of S, then either T is finite or $T \sim N$. This is the sense in which N can be considered a "smallest" infinite set, bearing in mind the next exercise as well.

6.48 Prove that if S is a set containing the set N of positive whole numbers as a subset, then S must be infinite. Use this fact to show that if S is a set containing a subset M such that $M \sim N$, then S must be infinite.

6.49 A set S is said to be denumerable provided that $S \sim N$. Prove that $N \times N$ is denumerable. Hint: Use the Cantor-Schroeder-Bernstein Theorem (Theorem 6.1) and the techniques used in the application immediately after its proof. See also Exercise 6.45.

6.50 Use Exercise 6.49 to prove that if A and B are denumerable sets, then so is $A \times B$.

6.51 Show that the set Q of all rational numbers is denumerable. Hint: Use Exercises 6.47 and 6.50.

6.52 Prove that if A and B are denumerable sets, then so is $A \cup B$.

6.53 For each prime $p \in N$, let $A_p$ be the set of all positive integral powers of p; that is, $A_p = \{p, p^2, p^3, p^4, \ldots\}$. Show that for each prime p, $A_p$ is a denumerable set. Prove that the collection

$$\{A_p \mid p \text{ is prime}\} = \{A_2, A_3, A_5, A_7, A_{11}, \ldots\}$$

is denumerable. Note: There are infinitely many primes.

6.54 Use the previous exercise to show that the set N of positive whole numbers contains infinitely many infinite sets no two of which have any element in common.
6.55 Use the ideas developed in the previous two exercises to show that if each of $B_1, B_2, B_3, \ldots$ is a denumerable set, then so is the set

$$B = B_1 \cup B_2 \cup B_3 \cup \cdots.$$

6.56 An equilateral triangle of side length 2 is drawn in the plane, and five points are selected within this triangle. Prove that some two of these points must lie within distance 1 of each other. Hint: Use the Dedekind Box Principle.

6.57 Prove that at least one pair of people in Atlanta, Georgia have the same number of hairs on their heads.

6.58 Show that the sets $\{1, 2, 3, \ldots, n\}$ and $\{1, 2, 3, \ldots, m\}$ can be put into one-to-one correspondence if and only if m = n.

6.59 Prove that if F is a finite set and S is an infinite set, then $S \sim S \cup F$.

6.60 So far, every infinite set has turned out to be denumerable. Do you believe that every infinite set is denumerable? For example, let P be the set of all polynomials with whole number coefficients. Is P denumerable? Hint: See Exercise 6.55.

6.6 NONDENUMERABLE INFINITE SETS
We have shown N infinite, and N is denumerable by definition. In Exercise 6.51, you were asked to show that the set Q of all rational numbers is also denumerable, and in Exercise 6.60 that the set P of all polynomials with coefficients in W is also denumerable. At this point you might wonder, quite justifiably, if there might be only two kinds of sets, finite and denumerable, and thus whether any two infinite sets could be put into one-to-one correspondence. Our next theorem shows that this is not the case; the familiar set R of all real numbers is certainly infinite, but it cannot be put into one-to-one correspondence with the set N, and is thus nondenumerable. Hence, in a very precise sense, the set R contains "more" elements than the set N. This was proved by the mathematician Georg Cantor in 1873; he later discovered a simpler proof which we next present. This proof depends on the fact that each real number has a decimal expansion, and that two real numbers are equal if and only if their decimal expansions are identical. (There is a minor point here to be discussed in the exercises; 0.99999... and 1.00000... are different decimal expansions for the same real number; we assume in the proof to follow that if there should be such ambiguity, the latter form is to be used.)
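$$\begin{array}{c|l}
1 & A_1 \,.\, a_{11}\, a_{12}\, a_{13}\, a_{14}\, a_{15} \cdots \\
2 & A_2 \,.\, a_{21}\, a_{22}\, a_{23}\, a_{24}\, a_{25} \cdots \\
3 & A_3 \,.\, a_{31}\, a_{32}\, a_{33}\, a_{34}\, a_{35} \cdots \\
4 & A_4 \,.\, a_{41}\, a_{42}\, a_{43}\, a_{44}\, a_{45} \cdots \\
5 & A_5 \,.\, a_{51}\, a_{52}\, a_{53}\, a_{54}\, a_{55} \cdots \\
\vdots & \qquad \vdots
\end{array}$$

Fig. 6.13 The hypothetical one-to-one correspondence from N to R.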
Theorem 6.7 The set R of all real numbers is nondenumerable.

Proof. We suppose, by way of contradiction, that R is denumerable. Then there must exist a one-to-one correspondence f from N to R. We arrange a "diagram" of the action of f as shown in Fig. 6.13. The left column lists the positive whole numbers, the domain of f. To the right of each is its value under the action of f. Now we have supposed that the one-to-one correspondence f exists, but we do not know what it is; we do not know whether f(1) is 1/4, -7, or $\pi$. So we must represent the values of f according to their decimal expansion. Since it turns out that we will not have to consider the whole number part of f(n), we have called this $A_n$; thus f(n) is shown in the form

$$f(n) = A_n \,.\, a_{n1}\, a_{n2}\, a_{n3}\, a_{n4}\, a_{n5} \cdots$$

so that $a_{nj}$ is the jth digit in the decimal expansion of f(n). Because f is onto, each real number must appear somewhere in the right-hand column. We proceed to reach a contradiction by producing a real number that does not appear anywhere in the right-hand column.

We construct this real number $\alpha$ by going down the diagonal $a_{11}, a_{22}, a_{33}, \ldots$, and changing each of these digits. Specifically, we construct $\alpha$ as follows: The decimal expansion of $\alpha$ is to be $\alpha = 0.b_1 b_2 b_3 b_4 b_5 \cdots$, where $b_i$ is obtained from $a_{ii}$ as follows. If $a_{ii}$ is less than 8, let $b_i = a_{ii} + 1$. If $a_{ii}$ is 8 or 9, let $b_i = 0$.
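The rule for building the digits $b_i$ can be tried on a finite table. The following is a minimal sketch; the five rows of digits are made up for illustration (nothing in the text specifies them), and only the digits after the decimal point are stored, since the whole number parts $A_n$ play no role.

```python
def diagonal_digit(a):
    """Rule from the proof: add 1 to digits below 8, send 8 and 9 to 0."""
    return a + 1 if a < 8 else 0

def diagonal_number(rows):
    """Given k rows of k digits a_n1 ... a_nk, return the digits b_1 ... b_k."""
    return [diagonal_digit(rows[i][i]) for i in range(len(rows))]

# Hypothetical first five rows of a listing (digits after the decimal point).
rows = [
    [2, 5, 0, 0, 0],
    [1, 4, 1, 5, 9],
    [3, 3, 3, 3, 3],
    [9, 9, 8, 7, 6],
    [0, 0, 0, 0, 9],
]
b = diagonal_number(rows)
print(b)   # [3, 5, 4, 8, 0]
```

Each $b_i$ differs from the diagonal digit $a_{ii}$, and the rule never produces the digit 9, so the number built this way cannot end in an unbroken string of 9's.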
Now $\alpha$ does not appear anywhere in the right-hand column, for the decimal expansion of $\alpha$ differs in at least one place from any of the numbers listed in the right-hand column. But $\alpha$ has a decimal expansion, and thus is indeed a real number. This contradicts the fact that f is onto. Hence R is nondenumerable.

As you have seen, we count by means of one-to-one correspondence. For example, if the set A can be put into one-to-one correspondence with the set $\{1, 2, 3, \ldots, n\}$, then we say that the set A contains n elements. If $A \sim \emptyset$, then we say that A contains no, or zero, elements. The numbers 0, 1, 2, ..., that we use in counting the finite sets are called finite cardinal numbers. In fact, if we consider the collection of all sets that can be put into one-to-one correspondence with the set {1, 2}, there is but one significant property common to all sets in this collection, "twoness" or the property of containing two elements, and to be precise it is exactly this common property we mean when we speak of the cardinal number 2.

There is nothing to prevent us from giving names to infinite cardinal numbers as well. By tradition, the name $\aleph_0$ has been used for the cardinal number of the set N of positive whole numbers, and thus is the cardinal number of all denumerable sets. ($\aleph$ is the first letter of the Hebrew alphabet, and $\aleph_0$ is usually pronounced "aleph-null.") The German letter $\mathfrak{c}$ is customarily used for the cardinal number of the set R of all real numbers, because c is the first letter of the German word for continuum, used sometimes as a synonym for the real number line.

We shall not go into the arithmetic of cardinal numbers, but only mention that it is possible to consider a natural order relation between them. Here is how this is done. First, if A is a set, then it has a cardinal number, finite or infinite, and we denote this number sometimes by [A]. Next, given two sets A and B, it may be possible to find a one-to-one function $f: A \to B$. If so, then we say that $[A] \leq [B]$. If we can also show that there can be no one-to-one and onto function from A to B, then we can say in fact that $[A] < [B]$. This order relation obeys the expected properties (with one exception: If m and n are cardinal numbers, it does not follow without an additional powerful axiom of set theory that either m < n, m = n, or n < m). In particular, the Cantor-Schroeder-Bernstein Theorem may be translated into the language of cardinal numbers as follows: If m and n are cardinal numbers and both $m \leq n$ and $n \leq m$ are true, then m = n. Finally, Theorem 6.7 may be translated very simply:

$$\aleph_0 < \mathfrak{c}.$$

But so far, the only two infinite cardinal numbers we have seen are just $\aleph_0$ and $\mathfrak{c}$. Our final theorem says, in effect, that there are infinitely many.
Fig. 6.14 Either $z \in K_z$ or $z \notin K_z$, but either possibility leads to a contradiction.
Theorem 6.8 Let S be any set and let $\mathcal{S}$ be the collection of all subsets of S. Then there exists a one-to-one function from S to $\mathcal{S}$ but no such function can be onto. Hence, in the terminology of cardinal numbers, $[S] < [\mathcal{S}]$.

Proof. There clearly does exist a one-to-one function f from S to $\mathcal{S}$; just let $f(x) = \{x\}$ for each $x \in S$. Suppose, by way of contradiction, that there exists a function g that is a one-to-one correspondence from S to $\mathcal{S}$. What g must then do is assign to each element $s \in S$ a subset of S, which we will call $K_s$; thus $K_s \in \mathcal{S}$, for $K_s \subset S$, and $K_s$ is just another name for g(s). Consider $s \in S$. Since $K_s \subset S$ there are just two possibilities: either $s \in K_s$ or $s \notin K_s$. We are particularly interested in the latter case; in fact, we let

$$\mathcal{K} = \{s \in S \mid s \notin K_s\}.$$

Now $\mathcal{K} \subset S$, so that $\mathcal{K} \in \mathcal{S}$. Since $g: S \to \mathcal{S}$ is onto, there must exist an element $z \in S$ such that $g(z) = \mathcal{K}$; indeed, since g is one-to-one, this element z is uniquely determined by $\mathcal{K}$. Since $g(z) = \mathcal{K}$, then $\mathcal{K} = K_z$. But where is z? Since $z \in S$ and $K_z \subset S$, either $z \in K_z$ or $z \notin K_z$. Let us consider each of the two possibilities. Fig. 6.14 may be helpful.

If $z \in K_z$, then $z \in \mathcal{K}$ since $\mathcal{K} = K_z$. But $\mathcal{K}$ is the set of all $s \in S$ such that $s \notin K_s$. Since $z \in K_z$, it follows by definition of $\mathcal{K}$ that $z \notin \mathcal{K}$. This is contrary to the fact that $z \in \mathcal{K}$.

On the other hand, if $z \notin K_z$, then $z \notin \mathcal{K}$ since $\mathcal{K} = K_z$. But by definition of $\mathcal{K}$, $\mathcal{K}$ consists of all those elements s of S such that $s \notin K_s$. Since $z \notin K_z$, then $z \in \mathcal{K}$ by definition of $\mathcal{K}$. This is contrary to the fact that $z \notin \mathcal{K}$.
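For a small finite set the two cases above can be checked exhaustively. The sketch below is an illustration only, with an assumed three-element set S: it runs through every function g from S to the collection of subsets of S, forms the set $\mathcal{K}$ of all s with $s \notin g(s)$, and counts how often $\mathcal{K}$ fails to be a value of g.

```python
from itertools import combinations, product

S = [0, 1, 2]   # a small illustrative set

# All subsets of S, as frozensets (there are 2**3 = 8 of them).
subsets = [frozenset(c) for r in range(len(S) + 1) for c in combinations(S, r)]

def diagonal_set(g):
    """Return K = {s in S : s not in g(s)} for a function g given as a dict."""
    return frozenset(s for s in S if s not in g[s])

total = 0
missed = 0
for values in product(subsets, repeat=len(S)):
    g = dict(zip(S, values))            # one function g from S to the subsets
    total += 1
    if diagonal_set(g) not in values:   # K is not g(z) for any z in S
        missed += 1

print(total, missed)   # 512 512
```

All 512 functions miss their own set $\mathcal{K}$, so none of them is onto. The proof in the text shows that the same thing happens for every set S, finite or infinite.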
Either way, the assumption that there exists a one-to-one correspondence g from S to $\mathcal{S}$ leads to a contradiction. Thus there can be no such function, and this establishes the theorem.

Exercises
6.61 Prove that irrational real numbers exist, using results of this chapter.

6.62 It was mentioned just before the proof of Theorem 6.7 that the same real number may have two different decimal expansions; the example given was 0.99999... and 1.00000... . You can prove these two are equal by using some of the techniques of infinite geometric series (see Exercise 1.45). Of course, you use the fact that the correct interpretation of the decimal expansion 0.99999... is the sum of the series

$$9/10 + 9/100 + 9/1000 + \cdots.$$

6.63 Modify the proof of Theorem 6.7 to show that the set of real numbers between 0 and 1 is nondenumerable.

6.64 Use the previous exercise and other results to show that the cardinal number of the set of real numbers between 0 and 1 is $\mathfrak{c}$.

6.65 Show that the cardinal number of the set of points in the unit square S in the plane is also $\mathfrak{c}$ by using the previous exercise and setting up a one-to-one function from [0, 1] into S and a one-to-one function from S into [0, 1], then applying the Cantor-Schroeder-Bernstein Theorem. Note:

$$S = \{(a, b) \in R \times R \mid 0 \leq a, b \leq 1\}.$$

Hint: If $(a, b) \in S$, then a and b have decimal expansions of the form

$$a_0 \,.\, a_1 a_2 a_3 a_4 \cdots \quad \text{and} \quad b_0 \,.\, b_1 b_2 b_3 b_4 \cdots$$

where $a_0$ and $b_0$ are each either 0 or 1. Where is the number

$$0 \,.\, a_0 b_0 a_1 b_1 a_2 b_2 a_3 b_3 \cdots ?$$

6.66 Use the definition of "Dedekind infinite" to prove that the set R of all real numbers is infinite.

6.67 Let E be the set of all even positive whole numbers. Translate the statement "E is denumerable" into the language of cardinal numbers.

6.68 If $\mathcal{R}$ is the collection of all subsets of R and $[\mathcal{R}] = n$, what is the relation between $\mathfrak{c}$ and n?

6.69 See Exercise 6.38; translate its statement into the language of cardinal numbers.
6.70 Translate the statement of Theorem 6.3 into the language of cardinal numbers.

6.71 Translate the statement of Theorem 6.4 into the language of cardinal numbers.

6.72 Let $\mathcal{F}$ denote the collection of all finite subsets of N. Prove that $[\mathcal{F}] = \aleph_0$. Hint: See Exercise 6.55.

6.73 If the collection of all sets were a set $\mathcal{S}$, then every subset of $\mathcal{S}$ would be an element of $\mathcal{S}$. This is a contradiction, but to what? Hint: See Theorem 6.8.

6.74 What is the cardinal number of the collection of all (unbounded) straight lines in the plane?

6.75 Suppose that an urn is filled with $\aleph_0$ balls, numbered 1, 2, 3, .... Let us call a "stage" in the following experiment the act of removing three balls from the urn and then replacing two of the balls then outside the urn. Imagine performing an experiment in which $\aleph_0$ such stages are performed. It is clear that after stage 1, one ball is outside the urn; after stage 2, two balls are outside the urn; and, in general, after stage n, n balls are outside the urn. The question is this: After $\aleph_0$ stages (that is, after stage 1, stage 2, stage 3, ...), how many balls are in the urn? Be careful. This is a trick question.

NOTES AND REFERENCES
Two good references are

Set Theory and Logic, by Robert R. Stoll (Freeman, 1963).
Set Theory, by Felix Hausdorff, translated by John R. Aumann (Chelsea, 1957).

Recall that we have scarcely discussed the following two questions: Does there exist a cardinal number n such that $\aleph_0 < n < \mathfrak{c}$? Given two cardinal numbers m and n, must one of the three relations

$$m < n, \qquad m = n, \qquad n < m$$
be true? See Paul J. Cohen's excellent book Set Theory and the Continuum Hypothesis (Benjamin, 1966) for a discussion of these matters. Georg Cantor was born in 1845 in Russia; his father was a Dane; he grew up in Germany, so he is somewhat international. He studied at Zurich and Berlin, and gave promise of becoming a talented and conventional mathematician. But in 1874 his first revolutionary paper was published-a paper in which he was the first to attack the previously avoided problem of
the "infinite." In fact, most of the material of this chapter can be found in Cantor's published works. These innovations shocked mathematicians of the day, and stimulated violent attacks on Cantor by his colleagues. Cantor was sensitive and unable to weather the criticisms thrown his way; he had attacks of irrational anger or overwhelming depression, beginning when he was about forty, and he died in a mental institution in 1918. By this time he had been belatedly recognized as the genius he was. The German mathematician Richard Dedekind was one of Cantor's few allies during the above troubles. Dedekind had a long life-he lived from 1831 until 1916-and a very productive one; he suffered a mild form of the same sort of attack directed against Cantor because of his own researches. He is well known for putting the concept of an irrational number into a logically sound structure, and his brilliant ideas were not at first universally well received. However, he did live long enough to become recognized as one of Germany's greatest mathematicians.
CHAPTER 7 NUMBER THEORY
The theory of numbers is meant principally to answer questions about the set

$$N = \{1, 2, 3, 4, \ldots\}$$

of positive whole numbers, or natural numbers, and sometimes about the set

$$Z = \{\ldots, -2, -1, 0, 1, 2, 3, \ldots\}$$

of integers, or whole numbers. However, the techniques used to answer such questions frequently involve the rational numbers, complex numbers, or even calculus. The questions themselves have fascinated people for centuries (indeed, number theory is one of the oldest branches of mathematics), perhaps because the questions themselves are easy to pose, and the privilege of searching for the answers is available to almost everyone. Indeed, for his research in number theory Pierre de Fermat is known as the "Prince of Amateurs"; he was a French jurist of the seventeenth century who made many of the most important contributions to number theory.

We assume that you are familiar with the arithmetic properties of the sets N and Z; that is, that addition and multiplication are both commutative and associative, that multiplication distributes over addition, and so on. With this background you may plunge immediately into number theory. One of the cornerstone concepts is that of divisibility, for upon this concept are built many of the other important definitions and theorems of number theory, and so it is with divisibility that we begin.

7.1 DIVISIBILITY
The integer m is said to be divisible by the integer d provided that there exists an integer k such that m = dk. If so, then d is said to be a divisor of m and m is said to be a multiple of d. If d is a divisor of m, we write the shorthand expression $d \mid m$ and read this expression as "d divides m." For example, the number 6 has exactly four natural number divisors; namely, 1, 2, 3, and 6 itself. If our context is the set Z of integers, then the number 6 would have twice as many integral divisors: the ones listed above together with their negatives. Because there is such a close relationship between the natural number divisors of an integer m and its integral divisors, we will in the future always mean (unless otherwise stipulated) by a divisor d of the integer m a natural number divisor.

The integers 0, -1, and 1 are rather special so far as divisibility is concerned. First, 0 is divisible by all whole numbers, for given a number d it is always true that $d \mid 0$. To see this, just take m = 0 and k = 0 in the definition of divisibility; then, since $0 = 0 \cdot d$, it follows by definition that d is a divisor of 0. Moreover, the only integer of which 0 is a divisor is 0 itself, for the equation $m = k \cdot 0$ has a solution k only for m = 0. On the other hand, 1 and -1 are divisors of every integer, while they themselves are their only divisors.

Note that each natural number other than 1 has at least two divisors, itself and 1. Some natural numbers, such as 5, have no other divisors; others, such as 6, do. Divisibility may be studied for its own sake. It should be easy to see that if d and m are natural numbers and $d \mid m$, then $1 \leq d \leq m$. With this in mind we can establish our first theorem.

Theorem 7.1 If a, b, and c are natural numbers, then:
a) $a \mid a$.
b) If $a \mid b$ and $b \mid a$, then $a = b$.
c) If $a \mid b$ and $b \mid c$, then $a \mid c$.
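The definition of divisibility and the three parts of Theorem 7.1 lend themselves to a quick numerical check. The sketch below tests them only for natural numbers up to 30; the helper names `divides` and `divisors` and the bound 30 are choices made here for illustration, and a check on a finite range is not a proof.

```python
def divides(d, m):
    """True when d | m, that is, m = d*k for some integer k (d is nonzero here)."""
    return m % d == 0

def divisors(m):
    """Natural number divisors of the natural number m."""
    return [d for d in range(1, m + 1) if divides(d, m)]

print(divisors(6))   # [1, 2, 3, 6]

N = range(1, 31)
reflexive = all(divides(a, a) for a in N)
antisymmetric = all(a == b for a in N for b in N
                    if divides(a, b) and divides(b, a))
transitive = all(divides(a, c) for a in N for b in N for c in N
                 if divides(a, b) and divides(b, c))
print(reflexive, antisymmetric, transitive)   # True True True
```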
To prove this theorem it is necessary only to look at the definition of divisibility. The details of the proof are outlined in the exercises for you to complete. Note the close similarity between the behavior of the symbols $\mid$ and $\leq$; the above theorem is true if the former symbol is replaced by the latter.

The natural number 1 has but one divisor (remember, by "divisor" we mean only natural number divisors); 1 is the only such natural number, so we divide the others into two classes, as indicated in the next definition.

The natural number p is said to be prime provided that p has exactly two divisors. If the natural number m has three or more divisors, then m is said to be composite.

Note that 1 is neither prime nor composite. If you need a name for such a situation, you may refer to 1 as a unit, as it is called in some branches of mathematics. It is easy to see that there are infinitely many composite numbers (why?). On the other hand, prime numbers do not seem so plentiful. The first twenty, as you may easily verify, are as follows:

 2   3   5   7  11  13  17  19  23  29
31  37  41  43  47  53  59  61  67  71
There are twenty-five primes between 1 and 100, sixteen between 1000 and 1100, eleven between 10,000 and 10,100, and only six between 100,000 and 100,100. Since primes tend to become less plentiful among the larger numbers, you might suspect that there are only finitely many primes. However, it has been known for several thousand years that the opposite is true-there are in fact infinitely many primes, just as there are infinitely many composite numbers. An outline of the simple proof of this fact appears as one of the exercises at the end of this section; in order to supply the details of this proof all that is needed is our next result.
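The four counts quoted above are easy to reproduce by machine. The following sketch uses plain trial division, which is more than fast enough for numbers of this size; the function names are chosen here for illustration.

```python
def is_prime(n):
    """Trial division: n is prime when it exceeds 1 and has no divisor d with 1 < d < n."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True

def count_primes(lo, hi):
    """Number of primes p with lo <= p <= hi."""
    return sum(1 for n in range(lo, hi + 1) if is_prime(n))

for lo, hi in [(1, 100), (1000, 1100), (10_000, 10_100), (100_000, 100_100)]:
    print(lo, hi, count_primes(lo, hi))
```

The printed counts should be 25, 16, 11, and 6, in agreement with the figures in the text. None of the interval endpoints is prime, so it does not matter whether "between" is read as including them.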
Theorem 7.2 If m is a composite natural number, then m has a prime factor; that is, m = kp where k and p are natural numbers and p is prime.

Proof. Suppose that m is a composite natural number. Then m does have divisors between 1 and m, so that the set

$$D = \{d \in N \mid 1 < d < m \text{ and } d \mid m\}$$

is a nonempty set of natural numbers. Moreover, this set is finite, since it can contain no more than m - 2 numbers. Hence it is possible to select from the set D its least element, which we will call p. It now suffices to show that p must be prime.

Suppose by way of contradiction that p is not prime. Then, since $p \in D$, also p > 1; hence p must be composite. Thus p = ab, where a and b are natural numbers such that

$$1 < a < p \quad \text{and} \quad 1 < b < p.$$

We need consider only the number a in order to reach a contradiction. Since $a \mid p$ and $p \mid m$, it follows from our previous theorem that also $a \mid m$. Moreover, 1 < a < p < m. Hence $a \in D$. But p was chosen as the least element of D, and a < p. This is a contradiction, and hence p cannot be composite. Thus p is prime. This establishes the theorem, for p is thus a prime factor of m.

As we have already mentioned, our next theorem follows easily now that Theorem 7.2 is established, and the proof is outlined in the exercises.
Theorem 7.3 There are infinitely many primes. It is quite true that the primes do tend to become more and more sparsely distributed among the larger numbers. This tendency is well illustrated by the next theorem, whose proof is also outlined in the exercises.
Theorem 7.4 Given a natural number n, it is possible to find a sequence of n consecutive natural numbers each of which is composite.
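Theorem 7.4 can be checked numerically for small n by using the sequence built from $(n + 1)!$ that appears in Exercise 7.17. The sketch below is a check for n from 1 to 7 only, not a proof; the function names are chosen here for illustration.

```python
from math import factorial, isqrt

def is_composite(m):
    """True when m has a divisor d with 1 < d < m."""
    return any(m % d == 0 for d in range(2, isqrt(m) + 1))

def composite_run(n):
    """The n consecutive numbers (n+1)! + 2, ..., (n+1)! + (n+1)."""
    base = factorial(n + 1)
    return [base + j for j in range(2, n + 2)]

for n in range(1, 8):
    run = composite_run(n)
    assert len(run) == n and all(is_composite(m) for m in run)

print(composite_run(4))   # [122, 123, 124, 125]
```

For n = 4 the run is 122, 123, 124, 125, and each term $(n + 1)! + j$ is divisible by j. Runs found this way are far from the earliest ones; for example, 24, 25, 26, 27 is already a run of four consecutive composite numbers.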
Theorem 7.4 says, in other words, that given a natural number n, there do exist two consecutive primes whose difference is at least n.

Exercises
7.1 List the divisors of 60, 100, 117, and 119. Circle the least such that exceeds 1 in each case, and go on to the next exercise.

7.2 Is the least divisor of 60 (other than 1) prime? Does this also hold for 100, 117, and 119?

7.3 The proof of Theorem 7.2 actually "proves" a little more than is needed. Rephrase Theorem 7.2 with the stronger conclusion that follows from the proof given.

7.4 If the natural number n has d divisors, how many integral divisors has n?

7.5 Suppose that a and b are integers such that both $a \mid b$ and $b \mid a$ are true. What can you say about the relationship between a and b? In particular, must it be true that a = b? Why?

7.6 Let p and q be primes such that $p \mid q$. What can you say about the relationship between p and q?

7.7 List all the even primes. How many odd primes are there?

7.8 The text stated that if d and m are natural numbers such that $d \mid m$, then $1 \leq d \leq m$. Fill in the details of the following possible proof of this fact. First, it is clear that $1 \leq d$ (why?). Let us suppose by way of contradiction that m < d. Then the difference d - m is a positive whole number; say,

$$d - m = a.$$

Since $d \mid m$, m = dk for some natural number k (why?). Moreover, $k \neq 1$ (why?). Hence $k \geq 2$. But then

$$k = \frac{m}{d} = \frac{m + a - a}{d} = \frac{d - a}{d} = \frac{d}{d} - \frac{a}{d} = 1 - \frac{a}{d}.$$

(Justify each equality.) But a/d is positive, and hence

$$k = 1 - \frac{a}{d} < 1.$$

(Why?) This is a contradiction (why?), and hence $d \leq m$ (why?). Therefore $1 \leq d \leq m$.

7.9 Suppose that a, b, and c are natural numbers such that both $a \mid c$ and $b \mid c$ are true. Does it follow that $ab \mid c$?

7.10 Suppose that a, b, and c are natural numbers such that $a \mid b$ and $a \mid c$. Does it follow that $a \mid bc$?

7.11 Suppose that a, b, and c are natural numbers such that $a \mid bc$. Does it follow that either $a \mid b$ or $a \mid c$ must be true?

7.12 Is it possible to have two composite natural numbers a and b such that neither $a \mid b$ nor $b \mid a$ is true?

7.13 Here is the outline of the proof of Theorem 7.1. First, to show in part (a) that $a \mid a$ is true, it suffices to find an integer k such that a = ak. What value do you choose for k? Next, suppose that $a \mid b$ and $b \mid a$. Apply Exercises 7.5 and 7.8 in order to conclude that a = b; or, if you prefer, use the following approach: Since $a \mid b$, there exists an integer k such that b = ak. Since $b \mid a$, there exists an integer j such that a = bj. Substitute b = ak into the latter equation. What conclusion can be drawn about the number kj? What must be the values of k and j? Why does this imply that a = b? Finally, for part (c), suppose that both $a \mid b$ and $b \mid c$ are true. Apply the definition of divisibility to obtain two equations similar to the two above. As above, substitute one into the other. Does the desired conclusion $a \mid c$ follow?

7.14 The text asked why there are infinitely many composite numbers. Supply the reason.

7.15 Show that there are infinitely many odd composite natural numbers.

7.16 Here is the outline of the proof of Theorem 7.3, which states that there are infinitely many primes. Supply the details. First, suppose by way of contradiction that there are only finitely many primes. If so, the primes could be listed in increasing order, as follows:

$$p_1, p_2, p_3, \ldots, p_n,$$

where $p_i$ stands for the ith prime in the complete list of all n primes above. Form the number

$$q = p_1 p_2 p_3 \cdots p_n + 1.$$
Now q > 1 (why?), so q must be either prime or composite. But q cannot be prime (why?). So q must be composite. Hence q has a prime factor p (why?). None of the primes in the complete list is a divisor of q (why?). Hence since $p \mid q$, p cannot appear in this list. This is a contradiction (why?). Hence there are indeed infinitely many primes, and this establishes Theorem 7.3.

7.17 Give the reasons, where needed, in the following outline of a proof of Theorem 7.4: Given a natural number n, it is possible to find a sequence of n consecutive natural numbers each of which is composite. Let the natural number n be given. By n! we mean the product of all the natural numbers from 1 to n; that is,

$$n! = n \cdot (n - 1) \cdot (n - 2) \cdot (n - 3) \cdots 3 \cdot 2 \cdot 1.$$

Show that there are n consecutive composite natural numbers in the sequence

$$(n + 1)!,\ (n + 1)! + 1,\ (n + 1)! + 2,\ (n + 1)! + 3,\ \ldots,\ (n + 1)! + (n - 1),\ (n + 1)! + n,\ (n + 1)! + (n + 1).$$

7.18 In the proof of Theorem 7.3 outlined in Exercise 7.16, a finite number of primes are multiplied together, the number 1 is added to the product, and the fact that this number is composite leads to a contradiction. It would seem to follow that if the first n primes were multiplied together and the number 1 added to the product, thus obtaining

$$1 + p_1 p_2 p_3 \cdots p_n,$$

then this number would be prime. Is this always so? If not, does this mean that the proof given for Theorem 7.3 is invalid? Explain your answer.

7.19 It follows from Exercise 7.8 that no natural number can have infinitely many divisors. Why not?

7.20 A surprising formula is that if a natural number n has k divisors, then their product is

$$\sqrt{n^k}.$$

Verify this formula for three different two-digit values of n.

7.2 WELL-ORDERING
One property of the set N of natural numbers is quite important in establishing many results in number theory. It is strange that this property has nothing to do with the algebraic structure of the natural number system; at least on the surface, it contains no mention of either addition or multiplication. We state it next.

Well-Ordering Axiom Each nonempty subset of the set N of natural numbers contains a least element.

For example, there is a least prime, a least composite natural number, a least number which is the sum of two squares (of natural numbers), a least number which can be expressed as the sum of two squares in two different ways, a least common multiple of 11 and 13, and a least natural number whose decimal representation uses all the odd digits. The existence of each of these numbers is shown by establishing that the set in question is nonempty in each case; finding the least element of that set may be more difficult in practice. In the last case, the number 971,513 is a natural number whose decimal representation uses all the odd digits; hence the set of all such natural numbers is nonempty. The Well-Ordering Axiom guarantees that this set contains a least element, and thus there does exist a least natural number whose decimal representation uses all the odd digits. In contrast, the Well-Ordering Axiom does not hold for some other number systems. For example, you should have no difficulty in finding a nonempty set of integers which does not contain a least element.

The Well-Ordering Axiom is logically equivalent to the very useful Induction Principle for the set N. Induction was first discussed in Exercise 1.17; we next state the Induction Principle, and then prove its equivalence to the Well-Ordering Axiom in our following two theorems.

Induction Principle for N Suppose that $\mathcal{S}$ is a statement meaningful for each natural number, and suppose moreover that both
a) $\mathcal{S}$ is true of the number 1, and
b) whenever $\mathcal{S}$ is true of the number n, then $\mathcal{S}$ is also true of the number n + 1.
Then $\mathcal{S}$ is true for each natural number.

It is amusing and sometimes helpful to visualize the Induction Principle by the device shown in Fig. 7.1. A row of dominoes is depicted, one for each natural number, and the dominoes are so arranged that whenever the domino numbered n falls over, it knocks over the domino numbered n + 1. This is in analogy to the second hypothesis of the Induction Principle. We knock over the domino numbered 1 (this is in analogy to the first hypothesis of the Induction Principle) and we obtain the analogous conclusion: All the dominoes fall over.

Before showing the equivalence of the Induction Principle and the Well-Ordering Axiom, we provide an example of how the Induction Principle may be used to prove a theorem in number theory.
Fig. 7.1 Visualizing the induction principle as falling dominoes.
Example 7.1 In Exercise 7.8 we provided an outline of a proof that if d and m are natural numbers such that d | m, then 1 ≤ d ≤ m. Here is a more elegant proof, using the Induction Principle for N.
Let d and m be natural numbers such that d | m. Since d is a natural number, d ≥ 1. Moreover, since d | m, then by definition m = dk for some natural number k. Suppose by way of contradiction that d > m. Then let 𝒮 be the statement, "For each natural number n, m < dn." Then 𝒮 is certainly a statement meaningful for each natural number value of n, for given n, it can be determined whether it is true or false that m < dn. Moreover, part (a) of the hypotheses of the Induction Principle is satisfied, for the statement 𝒮 is true for the value n = 1. (Recall that we have supposed, by way of contradiction, that m < d = d·1.) Suppose that 𝒮 is true for the natural number n. Then m < dn. But n < n + 1, and so dn < d·(n + 1). Consequently, we have also that m < d·(n + 1). So part (b) of the hypotheses of the Induction Principle is also satisfied, for if 𝒮 is true for n then 𝒮 is true for n + 1. Therefore by the conclusion of the Induction Principle, it must be the case that 𝒮 is true for all values of n. But recall that m = dk for some
natural number k. The fact that 𝒮 is true for all values of n and the fact that k is a natural number together imply that m < dk. But we know that m = dk. This is a contradiction, and hence the assumption that m < d must be false. Hence if d | m, then 1 ≤ d ≤ m.
Other examples of applications of this principle to prove various theorems may be found elsewhere in this book, in the next set of exercises, and in George Polya's book Induction and Analogy in Mathematics (Princeton University Press, 1954). We proceed to show the logical equivalence of the Well-Ordering Axiom and the Induction Principle for N.
Theorem 7.5 The Well-Ordering Axiom implies the Induction Principle for N.
Proof We assume that the Well-Ordering Axiom is true; that is, every nonempty set of natural numbers contains a least element. We assume also the hypotheses of the Induction Principle: that 𝒮 is a statement meaningful for each natural number, that 𝒮 is true of the number 1, and that whenever 𝒮 is true of the natural number n, it follows that 𝒮 is also true for the natural number n + 1. We wish to show the conclusion of the Induction Principle, that 𝒮 must be true for every natural number. Suppose, by way of contradiction, that 𝒮 is not true for every natural number. Let F be the set of natural numbers for which 𝒮 is false. Then F is nonempty because of the above assumption, and hence contains a least element k because of the Well-Ordering Axiom. Moreover, k ≠ 1 since 𝒮 is true of the number 1 by hypothesis. So k > 1. Hence k − 1 is also a natural number, and moreover, 𝒮 is true of the number k − 1 because k is the least natural number for which 𝒮 is false. But we have one more hypothesis to use: That whenever 𝒮 is true of a given natural number n, then also 𝒮 must be true for the number n + 1. In particular, since k − 1 is a natural number for which 𝒮 is true, then also 𝒮 must be true for (k − 1) + 1 = k. But 𝒮 is false for k, and so we have reached a contradiction.
Consequently, 𝒮 must be true for each natural number. This establishes the Induction Principle for N, and completes the proof of Theorem 7.5. Now for the converse.
Theorem 7.6 The Induction Principle for N implies the Well-Ordering Axiom.
Proof To establish the Well-Ordering Axiom, we must begin with a nonempty subset of N and use the Induction Principle to show the existence of a least element of that set. So let S be a nonempty set of natural numbers. We suppose by way of contradiction that S contains no least element. Then, in particular, the number 1 is not an element of S, for if it were then it would be the least element of S.
Now let 𝒮 be the statement, "If 1 ≤ k ≤ n, then k is not an element of S." Then 𝒮 is clearly a statement meaningful for each natural number value of n, for given a natural number n we can certainly establish whether or not it is true that each natural number between 1 and n fails to belong to S. We now apply the Induction Principle to the statement 𝒮. First, 𝒮 is certainly true for the value n = 1, for in that case the only value of k such that 1 ≤ k ≤ n is k = 1 as well, and we have already seen that 1 is not an element of S. We suppose that 𝒮 is true of the natural number n; in order to obtain the conclusion of the Induction Principle we need only show that it follows that 𝒮 is necessarily also true of n + 1. Having supposed that 𝒮 is true of n, we know that if 1 ≤ k ≤ n, then k is not an element of S; in particular, n itself is not an element of S. If 𝒮 were false for the number n + 1, then for some natural number k between 1 and n + 1, k would be an element of S. But we have seen that k cannot lie between 1 and n, so such a value of k could only be n + 1 itself. Thus n + 1 would have to be an element of S. But then n + 1 would in fact be the least element of S, for no number between 1 and n belongs to S. But since we have supposed that S contains no least element, this situation is impossible, and hence the statement 𝒮 must be true for n + 1. So we have shown that 𝒮 is true for 1, and that whenever 𝒮 is true for n it must also be true for n + 1. Hence by the Induction Principle, 𝒮 must be true for each natural number n. Now 𝒮 was chosen to be the statement, "If 1 ≤ k ≤ n, then k is not an element of S." In particular, since 𝒮 is known to be true for all values of n, it follows that no natural number is an element of S. Since S is a set of natural numbers and natural numbers only, S must be the empty set. This contradicts our assumption that S is nonempty, and this contradiction establishes that S must contain a least element after all.
This concludes the proof of Theorem 7.6.
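The search for a least element can be imitated on a computer, at least for sets whose membership can be tested. The following is a minimal sketch in Python, not part of the original text; the names least_element, is_prime, and uses_all_odd_digits are chosen here purely for illustration. It examines 1, 2, 3, ... in turn, much as the statement used in the proof of Theorem 7.6 examines all k with 1 ≤ k ≤ n, and it stops at the first member found. The search ends only when the set is nonempty, which is exactly the hypothesis of the Well-Ordering Axiom.

```python
from itertools import count
from math import isqrt


def least_element(belongs):
    """Return the least natural number n for which belongs(n) is true.

    Assumption: the set described by `belongs` is nonempty; otherwise
    this search never terminates.
    """
    for n in count(1):  # n = 1, 2, 3, ...
        if belongs(n):
            return n


def is_prime(n):
    """True if n is a prime natural number (trial division)."""
    return n > 1 and all(n % d != 0 for d in range(2, isqrt(n) + 1))


def uses_all_odd_digits(n):
    """True if the decimal representation of n contains 1, 3, 5, 7 and 9."""
    return set("13579") <= set(str(n))


print(least_element(is_prime))                               # least prime
print(least_element(lambda n: n > 1 and not is_prime(n)))    # least composite
print(least_element(lambda n: n % 11 == 0 and n % 13 == 0))  # l.c.m. of 11 and 13
print(least_element(uses_all_odd_digits))                    # uses all odd digits
```

Note that the program finds each least element only because we already know the corresponding set is nonempty; the axiom asserts existence, while the search is one way of exhibiting the number.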
Exercises
7.21 Use the Induction Principle to prove that the sum of the first n natural numbers is n(n + 1)/2; that is, that
1 + 2 + 3 + ··· + n = n·(n + 1)/2.
7.22 Use the Induction Principle to prove that the following formula holds for all natural numbers n:
1/(1·2) + 1/(2·3) + 1/(3·4) + ··· + 1/(n·(n + 1)) = n/(n + 1).
7.23 In the proof of Theorem 7.6, what goes wrong if we replace the statement 𝒮 used there by the far simpler statement, "If n is a natural number then n is not an element of S"?
7.24 It is frequently easier to apply the Induction Principle when it is stated in the following form: Suppose that 𝒮 is a statement meaningful for natural numbers, that 𝒮 is true of the natural number 1, and that whenever 𝒮 is true of every natural number less than the number n, then 𝒮 is true for n as well. Then 𝒮 is true for every natural number. Show that this form of the Induction Principle follows from the form previously stated. Note: Since the original form of the Induction Principle has been shown equivalent to the Well-Ordering Axiom, it may be easier (and is certainly sufficient) to show that the above form of the Induction Principle follows from the Well-Ordering Axiom.
7.25 Let S be the set of all positive real numbers. Does S contain a least element? Does the Well-Ordering Axiom hold for the real number system?
7.26 Suppose that 𝒯 is a statement meaningful for natural numbers, that 𝒯 is true of the natural number 5, and that whenever 𝒯 is true for the natural number n, then also 𝒯 is true for the natural number n + 1. For what natural numbers need 𝒯 be true? Can you prove this?
7.27 Compute the values of the successive sums
1, 1 + 3, 1 + 3 + 5, 1 + 3 + 5 + 7, ....
Guess a formula and prove it by induction.
7.28 Does the Well-Ordering Axiom hold for the set E = {2, 4, 6, 8, 10, ...} of all even natural numbers? Can you prove this?
7.29 Does the Well-Ordering Axiom hold for the set of all rational numbers? Give a reason for your answers.
7.30 Compute the values of the successive sums
1, 1 + 1/2, 1 + 1/2 + 1/4, 1 + 1/2 + 1/4 + 1/8, 1 + 1/2 + 1/4 + 1/8 + 1/16, ....
Guess a formula and prove it by induction.
7.31 The number 125 can be expressed as the sum of two squares in two different ways:
125 = 10^2 + 5^2 = 11^2 + 2^2.
Knowing this, could you prove that there is a least natural number which can be expressed as the sum of two squares in two different ways? If so, what is it?
7.32 Prove by induction: If S is a finite set containing n elements, then S has 2^n subsets. Hint: Having assumed this proposition true for n, suppose that S is a set containing n + 1 elements. Choose a ∈ S and let T = S − {a}. Then T contains n elements, and hence has 2^n subsets. Note that every subset of S is either a subset of T or a subset of T with the element a adjoined.
7.33 Let the sequence
s_1, s_2, s_3, ...
of real numbers be defined as follows:
s_1 = 1/1 = 1, s_2 = 1 + 1/s_1 = 2/1, s_3 = 1 + 1/s_2 = 3/2, s_4 = 1 + 1/s_3 = 5/3,
and, in general, for each natural number n,
s_(n+1) = 1 + 1/s_n.
If you write a few more terms of this sequence you will notice that
s_1 < s_3 < s_5 < ··· < s_6 < s_4 < s_2.
Prove this by induction. Note: It suffices to prove that s_n < s_(n+2) if n is odd; s_(n+2) < s_n if n is even; and s_m < s_n if m is odd and n is even. It may help if you first establish that if s_n = a/b, then s_(n+1) = (a + b)/a.
7.34 Prove by induction: If x and y are positive real numbers such that x < y, then x^n < y^n for each natural number n.
7.35 Prove by induction: For each natural number n, 3 | (n^3 − n).
7.3 THE FUNDAMENTAL THEOREM OF ARITHMETIC
If you factor a given composite number as much as possible, you will find that it can be factored into the product of primes. Moreover, no matter how you go about this factorization, you will always obtain, for a given composite number, the same answer; that is, the prime factorization you obtain will be unique except possibly for the order in which you write the primes down. For example, you could factor 144 as follows:
144 = 2·72 = 2·2·36 = 2·2·6·6 = 2·2·2·3·2·3 = 2^4 · 3^2.
We establish this result as follows: We first show that each composite number can be factored into the product of primes. Then we establish the so-called Euclidean Algorithm. Finally, using the Euclidean Algorithm, we show that the factorization that exists must be unique. Note the use of the Well-Ordering Axiom in each of these three theorems.
Theorem 7.7 If m is a composite natural number, then m is the product of primes; that is, m = p_1 p_2 p_3 ··· p_n, where p_i is prime for 1 ≤ i ≤ n.
Proof Suppose by way of contradiction that the theorem is false. Then, by the Well-Ordering Axiom, there exists a least composite natural number not expressible as the product of primes; we denote that number by m. Since m is composite, m = ab, where 1 < a < m and 1 < b < m. Since each of a and b exceeds 1, each is either prime or composite. We shall consider only one case, the one in which a is composite and b is prime; the other cases are handled similarly. Since m is the least composite number not expressible as the product of primes and a is a composite number less than m, then a can be expressed as the product of primes; thus
a = p_1 p_2 p_3 ··· p_k, where p_i is prime for 1 ≤ i ≤ k. Hence
m = ab = p_1 p_2 p_3 ··· p_k b,
and, since each number in the last expression is prime, we have expressed m as the product of primes. This contradiction establishes the theorem.
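The idea behind Theorem 7.7 can be turned into a short recursive procedure: split m as ab with both factors smaller than m, then factor a and b in the same way. The Python sketch below is an illustration only, not part of the original text; the function names split and prime_factors are hypothetical, and the way a factor is located (trial division) is simply one convenient choice.

```python
def split(m):
    """Return (a, b) with m == a*b and 1 < a <= b < m, or None if m is prime."""
    a = 2
    while a * a <= m:
        if m % a == 0:
            return a, m // a
        a += 1
    return None


def prime_factors(m):
    """Return a list of primes whose product is m, for a natural number m >= 2."""
    if m < 2:
        raise ValueError("m must be a natural number other than 1")
    pair = split(m)
    if pair is None:  # m has no proper factorization, so m is prime
        return [m]
    a, b = pair
    # Both a and b are smaller than m, so each recursive call eventually stops.
    return prime_factors(a) + prime_factors(b)


print(prime_factors(144))
```

The recursion terminates for the same reason the proof works: every factor produced is a natural number smaller than the one being factored.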
Fig. 7.2 Long division of 5 into 23, giving 23 = (4)·5 + 3, and of d into m, giving m = dq + r.
Perhaps you have begun to notice a pattern in most of our proofs that use the Well-Ordering Axiom. We first suppose that the theorem is false; then the set of natural numbers for which the theorem does not hold is nonempty, and thus contains a least element. Using this smallest exception to the theorem, we produce smaller numbers for which the theorem must then be true. We use the truth of the theorem about the smaller numbers (in the above theorem, factors of the original number) to establish the truth of the theorem about the original number, the number about which the theorem was supposed to be false. This contradiction establishes the theorem.
We next use the Well-Ordering Axiom to establish the Euclidean Algorithm, which says, quite simply, that it is possible to divide one integer into another so as to obtain not only a quotient but also a nonnegative "small" remainder. We shall treat only the case in which the divisor is positive; the case in which it is negative can be handled similarly.
Theorem 7.8 Let m be an integer and d a natural number. Then there exist integers q and r such that m = dq + r and 0 ≤ r < d.
Proof In spite of the fact that the statement of this theorem contains no mention of divisibility, it really is a theorem about dividing one number into another. See Fig. 7.2.
We take as an example the case where d = 5 and m = 23. You recall from the chronological vicinity of the third grade the process by which one divides 5 into 23, as shown in Fig. 7.2. The quotient is 4 and the remainder is 3. Moreover, we hope you also recall the method by which your third-grade teacher asked you to check your work. You were to multiply the divisor by the quotient and add the remainder to this product; if you obtained the dividend, your arithmetic was correct. Finally, you should also recall that your answer was "wrong" unless your remainder was a nonnegative integer less than the divisor; in this example, 0 ≤ 3 < 5 is true, and the arithmetic check looks like (5·4) + 3 = 23.
Fig. 7.3 Arithmetically correct ways of dividing 5 into 23: 23 = (2)·5 + 13 and 23 = (6)·5 + (−7).
The general case is also shown in Fig. 7.2. The divisor d is divided into the dividend m, obtaining the quotient q and the remainder r. The equivalent check would be to verify that 0 ≤ r < d and that dq + r = m.
But this is precisely the statement of this theorem. So the theorem really does say that it is possible to divide one whole number by another and obtain a quotient and a remainder, the remainder nonnegative and less than the divisor. Without the requirement that the remainder r satisfy the inequality 0 ≤ r < d, there would be infinitely many ways to divide d into m in which the arithmetic would check. For example, we could divide 5 into 23 with a quotient of 2 and a remainder of 13, as shown in Fig. 7.3; in this case we would be informed by the third-grade teacher that our quotient is not large enough because "5 goes into 23 more than twice." Or if you prefer, you could obtain a quotient of 6 and a "remainder" of −7; again, the arithmetic checks, as indicated in Fig. 7.3. Here the third-grade teacher would inform us that the remainder is supposed to be nonnegative. The restriction that the remainder be nonnegative and less than the divisor provides not only the answer referred to as "correct" by the third-grade teacher, but also provides a unique choice of q and r, as we shall prove.
Our method of proof will be to look at all possible such divisions of d into m, without the restriction that 0 ≤ r < d, and by use of the Well-Ordering Axiom select the division that gives the least nonnegative value of r. We will then show that 0 ≤ r < d; the quotient q will automatically "work" in that m = dq + r. Finally, we show that this choice of q and r is indeed unique. Now for the proof of Theorem 7.8.
Fig. 7.4 Possible remainders upon division of m by d: r_i = m − d·i for each integer i.
Recall that we are given the divisor d, which is a natural number, and the dividend m, which is an integer. For each integer i, we let
r_i = m − di.
In other words, as indicated in Fig. 7.4, we look at all possible ways of "dividing" d into m, and associate with each possible quotient i the arithmetically correct remainder; since r_i = m − di, it follows that our arithmetic check m = di + r_i will be automatically true. We have denoted the remainder r_i with the subscript i because the value of the remainder depends on the choice of the quotient i. We first show the existence of a remainder r_i such that 0 ≤ r_i < d. We do this by considering the set of all nonnegative remainders appearing as in Fig. 7.4; if we can show this set is nonempty, then we can choose its least element, which is certainly a reasonable candidate for the inequality r_i < d.
If m ≥ 0, we let i = −1. Then the corresponding remainder r_(−1) = m + d is also nonnegative. On the other hand, if m is negative, we let i = m. Then
r_m = m − dm = m(1 − d) = (−m)(d − 1),
and since −m > 0 and d − 1 ≥ 0, r_m ≥ 0. In either case, there is at least one nonnegative remainder. Hence the set R of all nonnegative remainders is a nonempty subset of the nonnegative whole numbers. It should be clear that the Well-Ordering Axiom holds for the set N ∪ {0} as well as N, so we can select from R its least element, which we denote simply by r, and we let the corresponding quotient be denoted by q. We know that r = m − dq, and that
0 ≤ r;
hence m = dq + r, and 0 ≤ r.
If we can show that also r < d, this will complete the proof of the existence of the desired numbers q and r.
Keep in mind that r is the least possible nonnegative remainder upon dividing d into m. Suppose by way of contradiction that d ≤ r. Then the number s = r − d is nonnegative, and s < r because 0 < d. Now
m = dq + r = dq + d + r − d = d(q + 1) + r − d = d(q + 1) + s.
But since m = d(q + 1) + s, we have expressed the division of d into m with a new quotient q + 1 and a remainder s such that 0 ≤ s < r. This is in contradiction to the fact that r is the least possible nonnegative remainder. Hence our supposition that d ≤ r leads to a contradiction, and therefore the desired inequality r < d must be true. Thus we have shown the existence of integers q and r such that both m = dq + r and 0 ≤ r < d are true.
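The existence argument is constructive enough to be followed step by step. The Python sketch below is offered only as an illustration, not as part of the original text; the function name divide is hypothetical. It starts from the quotient used in the proof to obtain a nonnegative remainder (i = −1 when m ≥ 0, and i = m when m is negative), and then repeatedly replaces the remainder r by the smaller nonnegative remainder r − d for as long as d ≤ r.

```python
def divide(m, d):
    """Return (q, r) with m == d*q + r and 0 <= r < d.

    Assumptions: m is an integer and d is a natural number (d >= 1).
    """
    if d < 1:
        raise ValueError("the divisor d must be a natural number")
    # A first quotient whose remainder m - d*q is certainly nonnegative.
    q = -1 if m >= 0 else m
    r = m - d * q
    # If d <= r, then r - d is a smaller nonnegative remainder, with quotient q + 1.
    while d <= r:
        q, r = q + 1, r - d
    return q, r


print(divide(23, 5))
print(divide(-23, 5))
```

For a positive divisor, Python's built-in divmod(m, d) returns a pair satisfying the same two conditions, so it can serve as an independent check of the sketch.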
Technically, this completes the proof of Theorem 7.8 as stated, but we want to show one more useful fact while all this machinery is set up: that the quotient q and the remainder r chosen above are unique as well; that is, if x and y are two numbers such that both m = dx + y and 0 ≤ y < d are true, then x = q and y = r. Suppose that x and y are integers satisfying the above two relations. Now r was chosen to be the least nonnegative integer such that m = dq + r for some integer q; since y is such an integer, it must be true that r ≤ y. Suppose that y = r. Then r = m − dq, and y = m − dx, hence m − dq = m − dx, and thus dq = dx, so that x = q. In this case, y = r and x = q, as we wish to show. On the other hand, suppose that r < y. Then we have the inequality
0 ≤ r < y < d.
Now m = dq + r, and m = dx + y, hence dq + r = dx + y. Thus dq − dx = y − r, or d(q − x) = y − r. Therefore d is a divisor of y − r. But since
0 ≤ r < y < d,
it follows that
0 < y − r < d − r ≤ d,
so that
0 < y − r < d.
By Exercise 7.8, no natural number (such as d) can divide evenly into a smaller natural number (such as y − r). So the fact that d | (y − r) leads to a contradiction, and shows that the inequality r < y is impossible. Only the previous case considered, in which y = r, can occur; and we have seen that in this case also x = q. This establishes the uniqueness of q and r, as desired.
For example, while there are many ways to "divide" 5 into 23 to obtain a "quotient" q and a "remainder" r such that 23 = 5q + r, there is one and only one choice of q and r (namely, q = 4 and r = 3) such that both 23 = 5q + r and 0 ≤ r < 5.
Theorem 7.9 (The Fundamental Theorem of Arithmetic) If m is a natural number other than 1, then m can be factored into the product of primes, and this factorization is unique (apart from the order of the prime factors).
Proof If m is already prime, then we understand the theorem to mean that m is its own unique factorization; certainly m cannot be equal to two different primes, nor can m be factored any further. So we suppose that m is composite; moreover, we suppose by way of contradiction that the theorem is false. Since Theorem 7.7 guarantees that each composite natural number has at least one prime factorization, it must then be the case that there exists a composite natural number with a nonunique prime factorization. By the Well-Ordering Axiom, we select the least composite natural number with a nonunique prime factorization, and denote this number by m.
Hence
m = p_1 p_2 p_3 ··· p_i
and
m = q_1 q_2 q_3 ··· q_j,
where all the p's and q's are primes, and these two factorizations are different in that the p's and q's differ in number or in kind, or both. If p_1 = q_1, then
n = p_2 p_3 ··· p_i = q_2 q_3 ··· q_j,
and since n < m, these two prime factorizations of n must be the same. Hence by properly rearranging subscripts, we have p_2 = q_2, p_3 = q_3, ..., p_i = q_j, and i = j. But then the two factorizations of m shown above are the same, for also p_1 = q_1. This contradicts the fact that the two given factorizations of m are different. And by repeating this argument for other p's and q's, we see that none of the p's can equal any of the q's.
Since p_1 ≠ q_1, we can suppose that the p's and q's are so named that p_1 < q_1. By Theorem 7.8, there exist integers w and r such that
q_1 = p_1 w + r and 0 ≤ r < p_1.
We substitute p_1 w + r for q_1 in the equation
m = q_1 q_2 q_3 ··· q_j
and obtain
m = (p_1 w + r) q_2 q_3 ··· q_j = p_1 w q_2 q_3 ··· q_j + r q_2 q_3 ··· q_j.
But m is also equal to p_1 p_2 p_3 ··· p_i, so we have the equation
p_1 p_2 p_3 ··· p_i = p_1 w q_2 q_3 ··· q_j + r q_2 q_3 ··· q_j.
Hence
r q_2 q_3 ··· q_j = p_1 p_2 p_3 ··· p_i − p_1 w q_2 q_3 ··· q_j = p_1 (p_2 p_3 ··· p_i − w q_2 q_3 ··· q_j).
So p_1 is a divisor of r q_2 q_3 ··· q_j. Now r < p_1 and p_1 < q_1, so r < q_1. Hence the natural number
r q_2 q_3 ··· q_j
is less than m, since m = q_1 q_2 q_3 ··· q_j. Because m is the least natural number with a nonunique prime factorization, r q_2 q_3 ··· q_j has a unique prime factorization. Moreover, since r q_2 q_3 ··· q_j is divisible by p_1, p_1 appears in some factorization of r q_2 q_3 ··· q_j. Since p_1 cannot equal any of the q's, p_1 must appear in the prime factorization of r, and hence p_1 | r. But 0 ≤ r < p_1, so the only way that p_1 | r can be true is if r = 0. But if so, then the previous equation becomes
q_1 = p_1 w,
and hence p_1 | q_1. This, too, is a contradiction, as we have shown that none of the p's can equal any of the q's; but since p_1 and q_1 are primes and p_1 | q_1, the only way this can be true is if p_1 = q_1. Thus our supposition that there is a composite number with a nonunique prime factorization leads to a contradiction, and this establishes the theorem.
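The uniqueness half of the theorem says that it does not matter how one goes about the factoring. This can be tested experimentally, though of course an experiment proves nothing. In the Python sketch below, which is an illustration under our own assumptions and not part of the original text, the hypothetical function factor_with splits m at a proper divisor chosen by a rule supplied by the caller, and three quite different rules are tried on 144.

```python
def factor_with(m, choose_divisor):
    """Factor m >= 2 into primes, splitting m = a * (m // a) at each stage,
    where a is picked from the proper divisors of m by choose_divisor."""
    divisors = [a for a in range(2, m) if m % a == 0]
    if not divisors:  # no proper divisor exceeds 1, so m is prime
        return [m]
    a = choose_divisor(divisors)
    return factor_with(a, choose_divisor) + factor_with(m // a, choose_divisor)


smallest_first = factor_with(144, min)
largest_first = factor_with(144, max)
middle_first = factor_with(144, lambda ds: ds[len(ds) // 2])

# The primes may come out in different orders, but sorting shows
# that the same primes occur the same number of times.
print(sorted(smallest_first))
print(sorted(largest_first))
print(sorted(middle_first))
```

All three runs yield four 2's and two 3's, in agreement with 144 = 2^4 · 3^2.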
An alternate version of stating the Fundamental Theorem of Arithmetic is this: If m is a natural number other than 1, then there is one and only one way of expressing m as the product
m = p_1^(n_1) p_2^(n_2) ··· p_k^(n_k),
where the exponents n_1, n_2, ..., n_k are natural numbers, and p_1, p_2, ..., p_k are primes arranged so that p_1 < p_2 < ··· < p_k.
Exercises
7.36 In the proof of Theorem 7.7, only one of four cases was considered: the one in which m = ab, where a was composite and b was prime. Show how to handle the other three cases.
7.37 Let m = −23 and d = −5. For what values of q and r is it true that m = dq + r and 0 ≤ r < −d? (One handles the statement of Theorem 7.8 for the case of a negative divisor by changing the sign of d to obtain a positive number.)
7.38 Factor 6300 into the product of primes. In how many essentially different ways can this be done?
7.39 If we interpret the product of no numbers to be the number 1, and the product of one number to be itself, can we then state that "Every natural number has a prime factorization"? Can we state that "Every natural number has a unique prime factorization"?
7.40 Let a, b, and c be integers. Prove that if a | b and a | c, then a | (b + c) and a | (b − c).
7.41 Suppose that a and b are natural numbers and p is a prime such that p | ab. Prove that either p | a or p | b.
7.42 Is it true that if n is a natural number such that 12 | n^2 then 12 | n?
7.43 Prove that every divisor of both m and n is also a divisor of both 3m + n and m + 2n.
7.44 Suppose that p is a prime and m and n are natural numbers such that p | (m + n). Does it follow that one of the relations p | m or p | n is true?
7.45 Wilson's Theorem states that the natural number n is prime if and only if
n | (n − 1)! + 1.
Use Wilson's Theorem to prove that 7 is prime.
7.46 See Exercise 7.45. Use Wilson's Theorem to prove that 20 is not prime.
7.47 See Exercise 7.45. Use Wilson's Theorem to prove that 100! + 1 is not prime.
7.48 Suppose that p and q are two different primes and n is a natural number such that both p | n and q | n are true. Prove that pq | n.
7.49 How could you use the prime factorization of 144 in order to write down all the divisors of 144?
7.50 Suppose that n is a natural number with the "canonical" prime factorization
n = p_1^(α_1) p_2^(α_2) ··· p_k^(α_k).
By the "canonical" factorization we mean the one in which the exponents α_1, α_2, ..., α_k are natural numbers and the numbers p_1, p_2, ..., p_k are primes so arranged that p_1 < p_2 < ··· < p_k. Use this factorization to find a formula giving the number of divisors of n. Hint: See the previous exercise.
7.4 THE GREATEST COMMON DIVISOR
If m and n are two integers not both zero, then by their greatest common divisor g we mean the largest integer g such that both g | m and g | n are true. The greatest common divisor of m and n will be denoted by (m, n). It should be clear that if m and n are not both zero, then (m, n) exists, for m and n do have common divisors (such as 1), and only finitely many integers can divide evenly into both m and n. So there must be a largest such; in fact, since 1 is a common divisor of m and n, then 1 ≤ (m, n), and hence (m, n) is a positive integer. For example, (24, 28) = 4 and (8, 9) = 1. The Euclidean Algorithm provides a method for computing the greatest common divisor of two integers. The method depends on the next theorem.
Theorem 7.10 Let m and n be natural numbers, and let q and r be integers such that m = qn + r and 0 ≤ r < n. Then (n, m) = (r, n).
The proof of this theorem is outlined in the next set of exercises. We illustrate with an example the method of using this theorem to find the greatest common divisor of 21 and 78. As indicated in Fig. 7.5, we first divide 21 into 78, obtaining a quotient of 3 and a remainder of 15. We ignore the quotient; Theorem 7.10 tells us that (21,78) = (15,21). We continue successive divisions, and obtain the chain of equalities (21, 78) = (15, 21) = (6, 15) = (3, 6) = (0, 3).
Fig. 7.5 Successive divisions to find (21, 78).
Fig. 7.6 Finding (67, 193). Note that each remainder is next divided into the corresponding divisor.
Since the remainders must decrease with each division, we must eventually reach the remainder zero after a finite number of steps. So in any problem of this sort, our last entry in a chain such as the one above will have the form (0, r), where r is the next-to-last remainder. But r ≠ 0; in fact, r > 0, and it is easy to see that (0, r) = r. So the greatest common divisor of m and n will be the last nonzero remainder in the sequence of successive divisions of previous remainder into previous divisor. For another example, we show the calculations to find (67, 193) in Fig. 7.6. We obtain the chain of equalities
(67, 193) = (59, 67) = (8, 59) = (3, 8) = (2, 3) = (1, 2) = (0, 1) = 1.
Hence the greatest common divisor (67, 193) of 67 and 193 is 1.
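The chains of equalities above are easy to generate mechanically. The following Python sketch is an illustration only, not part of the original text; the function name gcd_chain is hypothetical, and the sketch assumes that both arguments are natural numbers. It applies Theorem 7.10 repeatedly, replacing the pair (n, m) by (r, n), where r is the remainder upon dividing n into m, until the first entry is 0.

```python
def gcd_chain(m, n):
    """Return the list of pairs (a, b) produced by repeated use of Theorem 7.10.

    Assumption: m and n are natural numbers. The chain begins with the pair
    written smaller number first and ends with (0, g), where g = (m, n).
    """
    a, b = sorted((m, n))  # arrange so that a <= b
    chain = [(a, b)]
    while a != 0:
        a, b = b % a, a    # b = q*a + r with 0 <= r < a; pass to (r, a)
        chain.append((a, b))
    return chain


for pair in gcd_chain(21, 78):
    print(pair)
print("greatest common divisor:", gcd_chain(67, 193)[-1][1])
```

The second entry of the last pair is the last nonzero remainder, and hence the greatest common divisor.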
Moreover, a sort of reverse of this process can be carried out once the above computations are made. We take the example (21, 78) = 3. The divisions shown in Fig. 7.5 can be expressed in the form below:
78 = (3)·21 + 15,
21 = (1)·15 + 6,
15 = (2)·6 + 3,
6 = (2)·3 + 0.
Recall that (21, 78) is 3, the last nonzero remainder above. We ignore the last equation, and solve the one in which the remainder 3 appears for 3 itself, thus obtaining 3 = (1)·15 + (−2)·6. We solve the second equation in the above list for its remainder, 6, and substitute 6 = (1)·21 + (−1)·15 in the previous equation to obtain
3 = (1)·15 + (−2)·[(1)·21 + (−1)·15] = (3)·15 + (−2)·21.
The remainder previous to 6 is 15; we solve the equation 78 = (3)·21 + 15 for 15, and substitute the result for the 15 in the above equation:
15 = (1)·78 + (−3)·21,
so that
3 = (3)·[(1)·78 + (−3)·21] + (−2)·21 = (3)·78 + (−11)·21.
What we have accomplished is the expression of 3, the greatest common divisor of 78 and 21, in terms of 78 and 21 themselves. It should be clear that this process of back-substitution of remainders can always be carried out so as to express (m, n) in terms of m and n. We give one more example, using 67 and 193. From the data shown in Fig. 7.6, we obtain the following equations:
193 = (2)·67 + 59,
67 = (1)·59 + 8,
59 = (7)·8 + 3,
8 = (2)·3 + 2,
3 = (1)·2 + 1,
2 = (2)·1 + 0.
The last equation is unnecessary; we discard it. Our sequence of solutions and substitutions goes as follows:
1 = (1)·3 + (−1)·2
2 = (1)·8 + (−2)·3
1 = (1)·3 + (−1)·[(1)·8 + (−2)·3] = (3)·3 + (−1)·8
3 = (1)·59 + (−7)·8
1 = (3)·[(1)·59 + (−7)·8] + (−1)·8 = (3)·59 + (−22)·8
8 = (1)·67 + (−1)·59
1 = (3)·59 + (−22)·[(1)·67 + (−1)·59] = (25)·59 + (−22)·67
59 = (1)·193 + (−2)·67
1 = (25)·[(1)·193 + (−2)·67] + (−22)·67 = (25)·193 + (−72)·67
Hence we have expressed the greatest common divisor of 67 and 193 in terms of 67 and 193, as follows:
(67, 193) = (−72)·67 + (25)·193.
If you see that this process will always work, you have in effect seen the proof of our next theorem.
Theorem 7.11 Let g = (m, n). Then there exist integers x and y such that xm + yn = g.
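The bookkeeping of Theorem 7.11 can also be done while the divisions are being carried out, rather than by back-substitution afterward. The Python sketch below uses this forward variant, which is a different arrangement of the same arithmetic and not the procedure described in the text; the function name bezout is hypothetical, and the sketch assumes that m and n are natural numbers. At every stage each remainder is kept together with a pair of coefficients expressing it in terms of m and n.

```python
def bezout(m, n):
    """Return (g, x, y) with x*m + y*n == g, where g = (m, n).

    Assumption: m and n are natural numbers.
    Invariants maintained throughout the loop:
        old_r == old_x*m + old_y*n   and   r == x*m + y*n.
    """
    old_r, r = m, n
    old_x, x = 1, 0
    old_y, y = 0, 1
    while r != 0:
        q = old_r // r                  # quotient of the current division
        old_r, r = r, old_r - q * r     # the new remainder
        old_x, x = x, old_x - q * x     # coefficients of the new remainder
        old_y, y = y, old_y - q * y
    return old_r, old_x, old_y          # old_r is the last nonzero remainder


g, x, y = bezout(67, 193)
print(g, x, y, x * 67 + y * 193)
g, x, y = bezout(21, 78)
print(g, x, y, x * 21 + y * 78)
```

For the two worked examples this arrangement happens to produce the same coefficients as the back-substitution above: −72 and 25 for 67 and 193, and −11 and 3 for 21 and 78. In general the integers x and y are not unique.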
Exercises
7.51 Evaluate (8334, 9612).
7.52 Find integers x and y such that 43x + 91y = 1.
7.53 One can use the Well-Ordering Axiom to prove Theorem 7.11; however, this proof will merely show the existence of integers x and y such that
xm + yn = (m, n).
One proceeds as follows: Let
S = {xm + yn | x and y are integers and xm + yn > 0}.
First show that S is nonempty; then let g be the least element of S. Prove that g = (m, n). Please fill in the details of this proof.
7.54 Fill in the details of the following proof of Theorem 7.10: If m and n are natural numbers and q and r are integers such that m = qn + r and 0 ≤ r < n, then (n, m) = (r, n). Let d be any common divisor of n and m. Show that d must be a divisor of r and n. Then show that every common divisor of r and n is also a common divisor of n and m. Since the common divisors of n and m are thus the same set of natural numbers as the common divisors of r and n, it follows immediately that (n, m) = (r, n).
7.55 Show that (36, 64) = 4, and find integers x and y such that 36x + 64y = 4.
7.56 Show that (11, 27) = 1, and find integers x and y such that 11x + 27y = 1.
7.57 Using the Fundamental Theorem of Arithmetic, we can factor 4200 into the prime product 2^3 · 3 · 5^2 · 7 and 4500 into the prime product 2^2 · 3^2 · 5^3. Show how these factorizations can be used to find the greatest common divisor (4200, 4500) of 4200 and 4500.
7.58 Is it possible to find integers m and n such that 12m + 13n = 1? Explain.
7.59 Is it possible to find integers m and n such that 12m + 16n = 6?
7.60 Find integers m and n such that 31m + 231n = 1.
7.61 Let n be a natural number. Prove that (n, n + 3) ≠ 2.
7.62 Let n be an integer. Prove that (n, n + 2) ≤ 2.
7.63 Prove that 8 | (n^2 − 1) if n is an odd integer.
7.64 Let m and n be natural numbers. Prove that (m, m + n) | n.
7.65 We have seen that the set W of all whole numbers, together with the two operations of ordinary multiplication and addition, satisfies the Fundamental Theorem of Arithmetic, and it seems that Well-Ordering is essential for establishing this theorem. This is indeed the case, for consider the mathematical system K defined as follows:
K = {a + bξ | a and b are integers, and ξ^2 = −5}.
We define addition and multiplication on the set K as follows:
(a + bξ) + (c + dξ) = (a + c) + (b + d)ξ,
and
(a + bξ)·(c + dξ) = ac + adξ + bcξ + bdξ^2 = (ac − 5bd) + (ad + bc)ξ.
With these two operations K becomes an algebraic system satisfying the same axioms as the system of integers; specifically, if α, β, and γ are elements of K, then
α + β and αβ belong to K;
α + β = β + α and αβ = βα;
(α + β) + γ = α + (β + γ) and (αβ)γ = α(βγ);
α(β + γ) = αβ + αγ;
θ = 0 + 0ξ has the property that θ + α = α;
ι = 1 + 0ξ has the property that ια = α; and
if α = a + bξ, then −α = (−a) + (−b)ξ has the property that −α belongs to K and (−α) + α = θ.
However, no ordering of K compatible with addition and multiplication can be a well-ordering; thus it is not surprising that the Fundamental Theorem of Arithmetic does not hold for K. To see this, one defines an element of K to be "prime" if its only factorizations are the obvious ones; that is, the element α ∈ K is prime if it can only be expressed as a product of two elements of K in the two forms α = ια = (−ι)(−α). It is possible to find a "composite" element of K with two different prime factorizations. Please do so.
7.5 APPLICATIONS
The equation ax + by = c, where a, b, and c are whole numbers, has of course infinitely many solutions in the real number system (unless, for example, a = 0 = b and c ≠ 0). But in number theory we are interested in only those whole numbers r and s which are solutions to the equation ax + by = c. The ability to find such solutions opens the way to solving a wide variety of fascinating problems. Let us consider the equation ax + by = c, where a, b, and c are given fixed whole numbers. The only case giving difficulty occurs when none of a, b, and c is zero. We begin by letting g = (a, b). If g is not a divisor of c, then the equation ax + by = c has no solution in whole numbers, so we further suppose that g | c.
We use the methods of Theorem 7.11 to solve the equation ax + by = g. Let m and n be a pair of whole numbers that solve the equation, so that
am + bn = g.
Since g | c, there is a whole number k such that c = gk. Let r = km and s = kn. Then r and s are solutions of the original equation ax + by = c, for
ar + bs = a(km) + b(kn) = k(am + bn) = kg = c.
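The construction just described can be sketched in Python. This is only an illustration under stated assumptions: we assume that the method of Theorem 7.11 amounts to the extended Euclidean Algorithm, that a and b are positive, and the names extended_gcd and particular_solution are hypothetical labels of our own choosing.

```python
def extended_gcd(a, b):
    """Return (g, m, n) with a*m + b*n == g, where g = (a, b).

    Assumes a and b are positive integers. The loop is the Euclidean
    Algorithm, carrying along coefficients m and n at each step.
    """
    old_r, r = a, b
    old_m, m = 1, 0
    old_n, n = 0, 1
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_m, m = m, old_m - q * m
        old_n, n = n, old_n - q * n
    return old_r, old_m, old_n


def particular_solution(a, b, c):
    """Return one pair (r, s) with a*r + b*s == c, or None if g does not divide c."""
    g, m, n = extended_gcd(a, b)
    if c % g != 0:
        return None          # no whole number solution exists
    k = c // g               # c = g*k
    return k * m, k * n      # r = k*m, s = k*n


if __name__ == "__main__":
    print(particular_solution(8, 9, 10))
    print(particular_solution(3, 12, 100))
```

The second call returns None because (3, 12) = 3 is not a divisor of 100.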
So there is no difficulty in finding one whole number solution to the equation ax + by = c. However, we seek all possible solutions. Using the particular solution ar + bs = c that we have found, it turns out to be possible to express all possible solutions in terms of the numbers r and s. For suppose that the whole numbers p and q give another solution, so that
ap + bq = c.
Then
ap + bq = ar + bs,
so that
a(p - r) = b(s - q).
Now g = (a, b), so g | a and g | b. We divide our last equation by g, and obtain
(a/g)(p - r) = (b/g)(s - q).
Note that all quantities in this last equation are whole numbers. Moreover, the greatest common divisor of a/g and b/g is 1, so that
(a/g) | (s - q) and (b/g) | (p - r).
Let t be a whole number such that (a/g)·t = s - q. If we substitute (a/g)·t for s - q in our previous equation, we obtain
(a/g)(p - r) = t·(a/g)(b/g).
We just cancel a/g from both sides, and thus
p - r = (b/g)·t.
From the relations (a/g)·t = s - q and (b/g)·t = p - r we have obtained, we can by solving for p and q also derive that
p = r + (b/g)·t,
q = s - (a/g)·t.
Now r, s, a, b, and g are all known, so that we have expressed the other solution pair p, q in terms of known quantities and the "variable" t-which is an integer. Since it is not hard to verify also that each integer choice of t does indeed produce a solution pair p, q to the equation ax + by = c, we have in effect established our next theorem.
Theorem 7.12 Let a, b, and c be nonzero whole numbers and let g = (a, b). Consider the equation ax + by = c. If g does not divide c, there is no solution to the equation. If g | c, then all possible integral solutions are given by x = p and y = q, where
p = r + (b/g)·t,
q = s - (a/g)·t,
where r, s is one solution pair for the original equation and t is allowed to range through all whole number values.
Example 7.1 Find all solutions in whole numbers of the equation 8x + 9y = 10.
Now (8, 9) = 1 and 1 | 10, so solutions do exist. We first solve 8x + 9y = 1. In this simple case one can see a solution immediately: x = -1 and y = 1. Hence we multiply this solution pair by 10 to obtain a solution to the original equation:
8·(-10) + 9·(10) = 10.
So we have a = 8, b = 9, c = 10, r = -10, s = 10, and g = 1. By Theorem 7.12, all solutions are given by the formulas
x = -10 + 9t,
y = 10 - 8t,
where t is a whole number. It frequently happens in such problems that we are interested only in the positive solutions. If so, we must have both
-10 + 9t > 0 and 10 - 8t > 0,
so that
8t < 10 < 9t,
and thus, since t is an integer, that
t ≤ 1 and 2 ≤ t.
But there is no integer t satisfying both these inequalities, and so we can conclude that the equation 8x + 9y = 10 has no positive whole number solutions.
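The formulas of Theorem 7.12 lend themselves to a short numerical check. The following Python sketch is illustrative only: the function names solution_family and positive_solutions are hypothetical, a and b are assumed positive, and scanning a finite range of t is merely a convenience, since the argument above already settles the question for every integer t.

```python
from math import gcd


def solution_family(a, b, r, s, t_values):
    """Yield (p, q) = (r + (b/g)t, s - (a/g)t) for each t in t_values.

    (r, s) must already be one solution of a*x + b*y = c; g = (a, b).
    """
    g = gcd(a, b)
    for t in t_values:
        yield r + (b // g) * t, s - (a // g) * t


def positive_solutions(a, b, r, s, t_values):
    """Keep only those members of the family with both coordinates positive."""
    return [(x, y) for x, y in solution_family(a, b, r, s, t_values)
            if x > 0 and y > 0]


if __name__ == "__main__":
    # Example 7.1: 8x + 9y = 10 with particular solution r = -10, s = 10.
    family = list(solution_family(8, 9, -10, 10, range(-3, 4)))
    assert all(8 * x + 9 * y == 10 for x, y in family)
    # No value of t in the scanned range gives a positive pair.
    print(positive_solutions(8, 9, -10, 10, range(-100, 101)))
```

The printed list is empty, in agreement with the conclusion of Example 7.1.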
Example 7.2 Find all integral solutions of the equation 3x + 12y = 100.
There is no solution, since (3, 12) = 3 but 3 is not a divisor of 100.
Example 7.3 Find all positive integral solutions of the equation 5x + 15y = 100.
A simplifying procedure is to divide each term by (5, 15) = 5, and the resulting equation will have the same solutions as the original equation. Thus we consider instead the equation
x + 3y = 20.
Now (1, 3) = 1, so we first solve instead
x + 3y = 1.
The obvious solution x = 1, y = 0 will of course do. We multiply each of x and y by 20 to obtain the solution r = 20, s = 0 to
x + 3y = 20.
By Theorem 7.12, we can obtain all possible solutions from the equations
x = 20 + 3t,
y = 0 - t,
where t is an integer. For positive solutions, it is further necessary that
20 + 3t > 0 and 0 - t > 0.
These relations lead to the inequalities
-20 < 3t and t < 0,
and thus
-6 ≤ t ≤ -1.
So only the values t = -6, -5, -4, -3, -2, and -1 can produce positive solutions. In this case, those solutions are as follows:
x = 2, y = 6,
x = 5, y = 5,
x = 8, y = 4,
x = 11, y = 3,
x = 14, y = 2,
x = 17, y = 1.
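When the numbers are as small as these, the list of positive solutions can also be cross-checked by direct search, without Theorem 7.12 at all. The sketch below is such a check and nothing more; the function name positive_solutions_by_search is hypothetical, and a, b, and c are assumed to be positive.

```python
def positive_solutions_by_search(a, b, c):
    """Return every pair (x, y) of positive integers with a*x + b*y == c.

    Tries each y from 1 upward and keeps those for which the remainder
    c - b*y is a positive multiple of a. Assumes a, b, c are positive.
    """
    found = []
    for y in range(1, c // b + 1):
        remainder = c - b * y
        if remainder > 0 and remainder % a == 0:
            found.append((remainder // a, y))
    return found


if __name__ == "__main__":
    # Example 7.3: 5x + 15y = 100.
    for x, y in positive_solutions_by_search(5, 15, 100):
        print("x =", x, " y =", y)
```

The search finds the same six pairs as the example, listed in order of increasing y.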
Example 7.4 If a man has ninety-five cents in dimes and quarters, how many of each type of coin might he have? If we let d denote the number of dimes and q the number of quarters he has, then we need to find the nonnegative solutions of the equation
10d + 25q = 95.
As in the previous example, we solve instead the simpler equation
2d + 5q = 19.
Now (2, 5) = 1 and 1 | 19, so solutions do exist. We first solve
2d + 5q = 1.
One solution, by inspection, is d = 3 and q = -1. We multiply by 19 to obtain the particular solution d = 57, q = -19 of the original equation. Then all solutions have the form
d = 57 + 5t,
q = -19 - 2t
for t an integer. For q and d both to be nonnegative, we must also have
-57 ≤ 5t and 2t ≤ -19,
which lead to only the two values -11 and -10 for t. These lead to the two possibilities
d = 2 and q = 3,
or
d = 7 and q = 1.
So the solution to the original problem is this: The man has either two dimes and three quarters, or else seven dimes and one quarter.
Exercises
7.66 Find all positive whole number solutions of 4x + 6y = 100.
7.67 Find three different solutions of 10x - 7y = 23.
7.68 A man cashed a check at a bank for two hundred forty-five dollars, and asked the teller for some ones, ten times as many twos, and the balance in fives. In how many different ways can the teller oblige him? What are these ways?
7.69 How would you go about solving the equation
ax + by = 0,
where a and b are whole numbers? Are the formulas given in Theorem 7.12 also valid in this case?
7.70 What can be said about the whole-number solutions of the equation
ax + by = c,
where a, b, and c are integers, in case one of a and b is zero? What if both a and b are zero?
7.71 Let a, b, and c be whole numbers with neither a nor b equal to zero, and let g = (a, b). Prove that if the equation ax + by = c has a solution in integers, then g | c.
7.72 To prove Theorem 7.12, it is necessary to show that if a and b are nonzero whole numbers and g = (a, b), then the greatest common divisor of a/g and b/g is 1. Please prove this.
7.73 Here is the method for solving the equation
ax + by + cz = d
in whole numbers, where a, b, c, and d are integers, none of a, b, and c is zero, and all solutions are desired. The method is much more complicated than the case of only two unknowns, so we give only the method for finding the solutions and omit the proof. First, there is no solution unless g, the greatest common divisor of a, b, and c, also divides d, so we suppose that g | d. Let
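$$\beta = \frac{c}{(b, c)} \qquad \text{and} \qquad \delta = \frac{-b}{(b, c)}.$$
Then $(\beta, \delta) = 1$, hence we can find integers $\alpha$ and $\gamma$ such that $\alpha\delta - \beta\gamma = 1$. Let
$$\zeta = (a,\; b\alpha + c\gamma),$$
and find integers $\mu$ and $\nu$ such that
$$\mu a + \nu(b\alpha + c\gamma) = \zeta.$$
Then all possible solutions of
$$ax + by + cz = d$$
are given by the three equations below, where t and u are allowed to assume all possible whole number values (and every solution can be obtained from the equations below):
$$x = \frac{d\mu}{\zeta} - \frac{(b\alpha + c\gamma)\,t}{\zeta},$$
$$y = \frac{\alpha d\nu}{\zeta} + \frac{\alpha a t}{\zeta} + \beta u,$$
$$z = \frac{\gamma d\nu}{\zeta} + \frac{\gamma a t}{\zeta} + \delta u.$$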
Use this result to find all positive solutions of the equation
x + 2y + 3z = 12.
7.74 If four marks are worth one dollar, five zlotys are worth one dollar, and eight pesos are worth one dollar, how may a dollar be exchanged fairly for marks, zlotys, and pesos so that at least one unit in each of the foreign currencies is obtained?
7.75 A man paid two dollars for 100 eggs, including some new-laid eggs at ten cents each, some fresh eggs at two cents each, and some old eggs at one cent each. He found that he had the same number of two kinds of these eggs. How many of each did he buy?
7.76 A man bought pomegranates at 16 cents each, oranges at two cents each, and tangerines at one cent each. If he bought twenty pieces of fruit, including at least one of each kind, and spent 80 cents in all, how many of each did he buy?
7.77 A man cashed a check for less than one hundred dollars at a bank. The teller confused the number of dollars on the check for the number of cents, and paid the man forty-three dollars and fifty-six cents more than he deserved. In how many different amounts could the check have been written?
7.78 Find three different solutions of 8x + 9y + 10z = 12.
7.79 Verify that the solutions x and y given in Example 7.1 actually work for all integral values of t.
7.80 Verify that the solutions p and q given in Theorem 7.12 actually work for all integral values of t.
7.81 If 11 brass balls (of equal weight) weigh exactly 15 pounds, 11 copper balls weigh 16 pounds, and 11 silver balls weigh 17 pounds, how many of each are required to weigh exactly 11 pounds?
7.82 Prove that
$$\frac{1}{2} + \frac{1}{3} + \frac{1}{4} + \cdots + \frac{1}{n}$$
is never a whole number.
7.83 Each odd prime has the form 4n + 1 or else the form 4n + 3, where n is a nonnegative integer. Prove that no prime of the latter form is the sum of two squares of whole numbers.
7.84 Prove that if n is a cube of a natural number, then the product of the three consecutive integers n - 1, n, and n + 1 is divisible by 504.
7.85 Find a natural number half of which is a square (of a natural number), one-third of which is a cube, and one-fifth of which is a fifth power.
7.86 Show that if n is a natural number, then $10 \mid (n^5 - n)$.
7.87 What is the last digit of $7^{355}$?
7.88 We consider the equation $x^2 + y^2 = z^2$, and seek natural number solutions. It turns out that any solution has the form
$$x = m^2 - n^2, \qquad y = 2mn, \qquad z = m^2 + n^2$$
(provided that, since one of x and y must be even, we let that be y) where m and n are natural numbers with m > n. Verify that the formulas given above actually do provide a solution for the equation $x^2 + y^2 = z^2$.
7.89 Find three different solutions to the equation $x^2 + y^2 = z^2$.
7.90 In connection with Exercise 7.88, we are actually finding Pythagorean right triangles-those right triangles with whole number sides. Find two Pythagorean right triangles with the same hypotenuse. 7.91 Find two different Pythagorean right triangles with the same perimeter. 7.92 How many Pythagorean right triangles can have a hypotenuse of length 10? 7.93 How many Pythagorean right triangles can have one side of length 10? 7.94 Find three different Pythagorean right triangles whose legs differ by the number 1.
7.95 Prove that one leg of a Pythagorean right triangle always has length divisible by 3 and that one side always has length divisible by 5.
7.96 Find three different Pythagorean right triangles such that the length of the hypotenuse and the length of one leg differ by the number 1.
7.97 Prove that every natural number n can be expressed in the form
$$n = a^2 + b^2 - c^2,$$
where a, b, and c are integers.
7.98 Is $2^{100} - 1$ prime? Why or why not?
7.99 Let n be a natural number and p a prime not a factor of n. Fermat's Theorem states that $p \mid (n^{p-1} - 1)$. Use Fermat's Theorem to prove that 7 | 999,999.
7.100 Show that every prime other than 2 and 5 divides evenly into some number of the form 999 ... 99 (k digits, all nines).
NOTES AND REFERENCES
The following books on number theory may be of interest:
Beiler, A., Recreations in the Theory of Numbers (Dover, 1964).
Dudley, U., Elementary Number Theory (Freeman, 1969).
Gelfond, A., The Solution of Equations in Integers, translated by J. B. Roberts (Freeman, 1961).
Griffin, H., Elementary Theory of Numbers (McGraw-Hill, 1954).
Mordell, L., Diophantine Equations (Academic Press, 1969).
Niven, I. and H. Zuckerman, An Introduction to the Theory of Numbers (Wiley, 1960).
Rademacher, H., Lectures on Elementary Number Theory (Blaisdell, 1964).
Please do not consider this chapter as containing more than a minute fraction of what is known about the theory of numbers. We have not touched on Farey series, the distribution of primes, twin primes, Fermat's "Last Theorem," perfect numbers, or even Gauss' Law of Quadratic Reciprocity. Since the latter is considered by many to be one of the most beautiful results in the field, we state it below, and invite you to verify it for yourself in a number of special cases.
Suppose that p and q are two different odd primes. The problem is in finding a natural number n such that
$$p \mid (n^2 - q).$$
The Law of Quadratic Reciprocity states that such a natural number n can be found if and only if there exists a natural number m such that
$$q \mid (m^2 - p),$$
with an exception: If p and q are both of the form 4k + 3, then the first relation above has a solution if and only if the second does not. The Euclidean Algorithm of Section 7.3 is, of course, named for the Greek mathematician Euclid, about whom very little is known other than that he lived in the third century B.C. His Elements formed such a complete systematization of geometry that his work has been used as a geometry text even in this century. On the other hand, a great deal is known about the life of Carl Friedrich Gauss. He was born in 1777 of very poor parents in Braunschweig, Germany, and only a succession of fortunate coincidences made it possible for him to become a mathematician. He lived seventy-eight years, and though not so prolific a writer as Euler, unquestionably made far greater contributions to mathematics: His work in number theory is particularly impressive, but he also laid the foundations for differential geometry, complex analysis, and modern topology. He desired perfection in his publications, and thus it is that he anticipated results of many later mathematicians: his journal contains a large number of valuable results which he never published, and for this reason many of these were credited to others. In any case, the consensus seems to be that the world has produced three truly outstanding men of genius: Archimedes, Newton, and Gauss.
CHAPTER 8 ANIMAL POPULATIONS
We shall consider an important branch of ecology, the branch which deals with the growth of animal populations, but we feel very strongly that two important cautions should be issued at once. First, we shall deal with only the simplest possible cases; we shall be assuming, for example, that no more than two or three species are involved, that the reproduction rate of each species is constant, that there are no changes in the populations in question caused by migration, and numerous other simplifying conditions that will become apparent as we proceed. Second, there is the philosophical consideration that mathematics really cannot prove anything about the real world, but only about mathematics itself. We shall be passing from a physical reality to a mathematical abstraction; such a procedure always merits cautious interpretation because of simplifying assumptions such as those mentioned above. In addition, we shall be making essentially unprovable (in the mathematical sense) assumptions about the way in which animals behave; here the usefulness of the mathematical model lies in the fact that it can offer predictions about the way in which animal populations ought to fluctuate; and if such fluctuations are indeed observed by biologists and ecologists in the field and laboratory, this can be considered evidence in favor of the validity of such assumptions. To make the last idea clear, mathematicians have been accused in the past of "proving" things which are patently absurd; for example, there is a rumor that a mathematician once "proved" that a bumblebee cannot fly. It was likely the case that the prover in question made the assumption that insect flight muscle could not metabolize its fuel any faster than mammalian muscle, as well as a host of other such assumptions. Of course, some of these assumptions must then have been false, provided that the "proof" itself was mathematically valid. In this case, rather than being a futile exercise, perhaps
such a proof could give to biologists clues as to which such assumptions should be experimentally verified. As another example, it is said that a mathematician once "proved" that it was impossible for a drag racer to turn a quarter mile in less than 9.0 seconds. But dragsters commonly beat this time these days; in this case, the erroneous assumption may well have been that the coefficient of friction between rubber and strip did not increase as the rubber temperature increased, an assumption now known to be false. In summary, then, one cannot do ecology with paper and pencil. What mathematics can contribute to ecology, as it has contributed to the other sciences, is the prediction of the behavior of a system after certain assumptions (usually simplifying ones) have been made. If the predictions correlate with field and laboratory research, this is evidence in favor of the validity of the assumptions. If not, the assumptions should be examined; again, such examination can be done only experimentally, not mathematically. With these cautions in mind, we proceed to the simplest case (with the simplest assumptions).
8.1 UNRESTRICTED GROWTH OF A SINGLE SPECIES
One simple assumption about the growth of an animal population is that the growth rate is proportional to the number of individuals present. This assumption is contingent upon the further assumption that there is nothing to impede growth of the population; for example, there must be effectively unlimited living space and food supply. It would seem reasonable that such should be the case if the population were small relative to the amount of living space and food supply available. For example, with a reasonably small population of bacteria in a large culture medium, one would expect that if the bacteria population were 1000 individuals increasing at the rate of 3 individuals per second, then a population of 2000 individuals would increase at the rate of 6 individuals per second and a population of 10,000 individuals would increase at the rate of 30 individuals per second. Here is the reason why this should be the case. Suppose we have such a situation, say of a small number of bacteria in a large culture, and we denote the number of bacteria at a given time t by N(t), or more simply merely by N. It seems very reasonable that the rate of increase of this population, which we will denote by N'(t) or simply N', depends on the value of N itself at the time t in question. This is why we have chosen a notation N' for the rate of increase, in order to suggest that the value of N itself has something to do with the rate N'. After all, if all extraneous factors could be removed, it should be the case that the value of N' depends only on the value of N, whether or not they happen to be proportional.
We can try to make plausible the assumption that N' is proportional to N in this simple case. We can by virtue of the above discussion at least write the equation N' = f(N) to indicate that there is some function f that gives the manner in which N' behaves with respect to the value of N. For example, it might be that
$$N' = N^3,$$
or
$$N' = 2N + N^2 + \frac{N}{N + 1}.$$
Suppose our bacteria culture were divided into two equal populations, each thus containing N/2 individuals. This division could be done either physically, by removing half the bacteria to another culture medium, or by drawing an imaginary line down the middle of the culture. Since the actual rate of increase of each half of the population should be the same in either case, but in the former case it should be f(N/2) and in the latter case just N'/2, we could therefore write the equation N'/2 = f(N/2).
Similar considerations with respect to dividing the population into thirds, or multiplying it by four, lead us to the equation
a·N' = f(aN)
for all meaningful values of a. But since
N' = f(N),
we can substitute this into the former equation, and obtain
a·f(N) = f(aN).
This means that the function f has the property that f(ax) = af(x) for all values of a and x. It turns out that the only such function with a smooth graph must be of the form f(x) = kx, where k is a constant, and hence
N' = f(N) = k·N.
In other words, it would seem reasonable that the form of the function f is extremely simple, and that N' is indeed proportional to N. Note that the constant k must be positive in the case of bacterial growth, since we are assuming N to be positive, and N' is also positive because the population is increasing rather than decreasing.
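One way to see why the scaling property forces proportionality is the following one-line computation. It is offered only as a sketch under a stronger assumption than the text makes: we suppose that f(ax) = af(x) holds for every positive real number a, not merely for the "meaningful" values of a, and we give the name k to the number f(1).

```latex
f(a) \;=\; f(a \cdot 1) \;=\; a \cdot f(1) \;=\; k\,a,
\qquad \text{where } k = f(1).
```

Under the weaker assumption that the property holds only for rational values of a, the same computation gives f(a) = ka for rational a, and the requirement of a smooth graph is what extends the formula to all other values.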
Let us rewrite our equation as an equation about functions, by reinserting the time variable t. Then N'(t) = k· N(t).
By the end of a course in differential calculus, most students will have learned how to "solve" this equation in order to find explicitly the form of the function N(t), the real unknown in the above equation. Calculus enters the picture here because differential calculus is itself the study of rates of change of functions; although we will next present the formal manipulations used to solve the above so-called differential equation, it is not necessary that you understand these procedures. The reasons are these: First, many of our subsequent equations representing growth rates of animal populations are too complicated to be solved with pencil and paper alone-they are usually solved by approximation methods involving use of high-speed electronic computers. Second, in our later studies we shall not be interested so much in the actual form of the function N(t), but rather in the steady-state or limiting behavior of the animal population, and we can discover this steady-state behavior without the necessity of solving any differential equations. All that will be needed is some elementary work with inequalities. However, given the differential equation N'(t) = k· N(t),
the calculus student first divides by N(t), since we may assume that N(t) is never zero:
$$\frac{N'(t)}{N(t)} = k.$$
Then
$$\int \frac{N'(t)}{N(t)}\,dt = \int k\,dt,$$
and hence
$$\log N(t) = kt + C,$$
where C is a constant and the logarithm function is to the base e (the approximate value of e is 2.71828) rather than to the base 10. From this follows
$$N(t) = e^{kt + C} = e^{C} \cdot e^{kt}.$$
Since C is constant, so is $e^{C}$, and we evaluate it by introducing the further experimental assumption that the population N(t) is known when t = 0; say, $N(0) = N_0$. Then
$$N_0 = N(0) = e^{C} \cdot e^{0} = e^{C},$$
and hence
$$N(t) = N_0 \cdot e^{kt}.$$
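The approximation methods mentioned above can be illustrated on this same equation, where the exact answer is available for comparison. The Python sketch below uses Euler's method, which is one simple approximation scheme and not necessarily the one used in practice; the function names and the numerical values (an initial population of 1000 with k = 0.003 per second, followed for 100 seconds) are assumptions chosen purely for illustration.

```python
import math


def euler_growth(n0, k, t_end, steps):
    """Approximate N(t_end) for N' = k*N with N(0) = n0 by Euler's method.

    The interval from 0 to t_end is cut into `steps` equal pieces; on each
    piece the population is advanced by (rate of increase) * (time step).
    """
    dt = t_end / steps
    n = n0
    for _ in range(steps):
        n += k * n * dt
    return n


def exact_growth(n0, k, t):
    """The calculus solution N(t) = N0 * e^(k*t)."""
    return n0 * math.exp(k * t)


if __name__ == "__main__":
    n0, k, t_end = 1000.0, 0.003, 100.0   # illustrative values only
    exact = exact_growth(n0, k, t_end)
    for steps in (10, 100, 1000, 10000):
        approx = euler_growth(n0, k, t_end, steps)
        print(steps, approx, exact - approx)
```

As the number of steps grows, the difference between the approximation and the exact value shrinks, which is the behavior one hopes for when no exact formula is available.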
Fig. 8.1 The graph of $N(t) = N_0 \cdot e^{kt}$ (k > 0), which is increasing at an increasing rate. (Horizontal axis: t; vertical axis: N.)
The graph of the function N(t) is shown in Fig. 8.1. In this graph, we have assumed that the constant of proportionality k is positive, as we have already mentioned. The graph increases more and more rapidly with increasing values of t, indicating that the rate of population increase is itself increasing. Of course, the value of k itself as well as the value of $N_0$ must be determined individually for each experimental case; k, for example, should depend not only on the species of animal involved, but also on the concentration of the food supply, the effectiveness of the food in promoting reproduction, the temperature of the environment, and numerous other experimental factors. The information that we want to derive from our initial population equation
N' = k·N
is available to us without use of the manipulations of calculus shown earlier. In this chapter we shall be mostly concerned with the eventual or long-term behavior of the population in question. In this example, we can reason very simply from the preceding equation in the following manner. At the beginning of the experiment, we assume N to be positive, and N' positive as well since the value of N is assumed to be increasing, at least initially. As we have seen, k too must then be positive. But then, so long as N remains positive, N' itself must be positive; that is, the population will continue to increase. Thus N will indeed remain positive, and the population will in fact always be increasing. In addition, as N increases, the value of N' must also increase, so the population will be increasing at a faster rate as time goes on. Thus the population will not tend toward any steady state, but behave as the graph in Fig. 8.1 indicates. This is no proof that E. coli, or any other living species, will eventually take over the whole world, for we have made one very important simplifying assumption: that the increase in the population in no way impedes future growth of the population. In actual practice (say, in a culture of E. coli in a test tube) available space and food supply are both strictly limited, and we shall see in the next section how certain very simple additional assumptions will lead us to a better model of growth of a single species. On the other hand, it is important that we mention that the form of the function N(t) that we derived using calculus,
$$N(t) = N_0 \cdot e^{kt},$$
has been experimentally verified in a large number of cases. That is, for a small population with a large amount of available food and space, the graph of the actual population compares well with the graph of N(t) in Fig. 8.1, so long as the value of N itself is small. In other words, so long as there are not too many individuals, the rate of growth of the population does behave as if it were proportional to the population. This will be an important assumption in much of our later work. The graphs shown in Fig. 8.2 show a typical population curve, shown as a dashed line, compared with the graph of $N(t) = N_0 \cdot e^{kt}$. The two curves match quite well for small values of N. This should be considered as experimental justification for our assumption that the rate of growth of animal population is proportional to the population itself, so long as other factors are kept equal and so long as there is no inhibition of the growth rate by a too-large population.
Fig. 8.2 The population curve is well approximated by $N(t) = N_0 \cdot e^{kt}$ for small populations. (Horizontal axis: t; vertical axis: N.)
Exercises
8.1 Suppose for some reason we had been led to the equation
$$N' = \frac{k}{N},$$
where k is a constant. Assuming that the population N and its rate of increase N' are both positive at some initial time t = 0, sketch the graph showing-very roughly-the behavior of the function N(t). 8.2 If a population N is initially positive, and its rate of increase N' is constant, what will be the behavior of the function N(t)? Treat three cases: when the constant is positive, when it is zero, and when it is negative. 8.3 For reasons given in Section 8.1, it would be plausible to suppose that an animal population of N individuals would have a constant birth rate b, resulting in a rate of increase in the population B = bN proportional to the value of N ; it is equally reasonable that the population would have a constant death rate d, resulting in a rate of decrease D = dN in the population also proportional to the value of N. The net rate of increase in the population, which we have denoted by N', should then have the form N'
= B - D.
Show how the equation N' = kN can be derived from the above assumptions. What is the value of k? What interpretation can be given to the constant k?
Fig. 8.3 For negative k, the graph of $N(t) = N_0 \cdot e^{kt}$ (k < 0) is decreasing. (Horizontal axis: t; vertical axis: N.)
8.4 A quantity of a radioactive substance can be thought of as a population, say in terms of the mass present or the number of atoms present. It follows from the assumption that radioactive decay is equally likely for any two atoms of one substance that the decrease of the amount of the radioactive substance is proportional to the number of atoms present. We again obtain the differential equation N' = k· N,
where N is the amount present and N' is the rate of increase of the amount. Here, though, k is a negative constant: the rate of increase must be negative since the amount of radioactive substance is actually decreasing. Nevertheless, we obtain the same solution
$$N(t) = N_0 \cdot e^{kt},$$
and the graph of this function, for negative k, is shown in Fig. 8.3. The half-life of a radioactive substance is the time it takes for one-half of the substance to decay; for example, in the case of Iodine-131, half will be gone after eight days. Show that the "half-life" is a meaningful concept; that is, that for a given radioactive substance, the half-life is independent of the initial amount present. Hint: Let T be the half-life. Find some way of "solving" the equation $N(t) = N_0 \cdot e^{kt}$ for T, and show that the solution is independent of the initial quantity $N_0$.
8.5 Here is one method by which the graph of a function N(t) satisfying a differential equation such as
N'(t) = k· N(t)
can be found approximately, without the necessity of actually obtaining an explicit formula for the function N(t). We shall illustrate how the procedure works with the equation above, assuming k > 0, and leave it to you to fill in the details. Imagine a point in the first quadrant, as shown emphasized in Fig. 8.4. Here both N and t are nonnegative, and since N' = k·N, N' is also nonnegative. Also, the larger the value of N, the larger the corresponding value of N'. So the point indicated on the graph not only represents a certain possible population at a certain time, but also can be thought of as lying on the graph of a solution to the above equation. As t increases the value of N must also increase, as indicated by the arrow. For larger values of N the rate of increase will be proportionately greater, as indicated by the equation N' = k·N. Thus arrows of steeper slope are indicated for the larger values of N. Note that smaller values of N have associated with them arrows of smaller slope
as well, but the slope is always positive because N' is positive in the first quadrant (except when N = 0). If you select an initial population value $N_0$, and sketch in a smooth curve following the trend of the arrows, you will obtain a rough idea of the shape of the graph of N(t). You can repeat this process with the first two exercises, as well as with other differential equations such as
$$N' = \frac{N}{N + 1},$$
and other such examples of your own invention.
8.2 GROWTH OF A SINGLE SPECIES UNDER LIMITING CONDITIONS
Since the graph of N(t) = N0·e^(kt) does not give an accurate picture of the growth of an animal population (at least, for large values of N), we should try to improve the assumptions that led to the original differential equation N' = k·N.
One very simple way to do this is to introduce into the equation a term which will, in effect, decrease the value of the rate constant k as the value of N increases. A very natural way of doing this is to assume that the population is living in an environment that will support a certain maximum population M of individuals, and the closer the value of N gets to the number M, the smaller the value of k (and thus the smaller the value of N'). However, the solution of the previous section turns out to be quite accurate for small values of N, and hence we do not wish to modify the value of k when N is small. What we need is a term that has the value 1 for N very small, and which decreases to 0 as N increases; in fact, it should become 0 when N = M and become negative for N > M. The reason for the latter consideration is that we can imagine an animal population in an environment of insufficient resources to support that population. In such a case, it seems reasonable that the number of deaths would exceed the number of births, and thus that the value of N' would be negative, indicating a net decrease in the population. Hence we need to multiply the constant k by a term that is nearly 1 for N close to zero, a term that decreases to zero as N gets closer and closer to the value M, and that becomes negative if N exceeds M. One of the simplest ways of inventing a formula for such a term is to use
(M - N)/M.
For N = 0, the value of the above term is 1; for N between 0 and M, its value is between 0 and 1; it becomes 0 when N = M and negative for N > M. Now the above term can be regarded as a sort of degree to which the potential increase of population is realized; indeed, when these concepts were first introduced into ecology, the word equation
{actual rate of increase} = {potential rate of increase} · {degree of realization of potential}
was used. This translates very naturally into the differential equation
N' = k·N·(M - N)/M,
and this is the equation that we shall examine. There is only moderate difficulty in actually finding the form of the function N(t), but we shall spare you the details, and use instead the sort of analysis that will be appropriate in later sections. Remember that, as before, k is a positive constant, and M too is a positive constant, indicating a maximum population that can be supported by the environment. We can use the same sort of analysis as was used in Exercise 8.5. We sketch in Fig. 8.5 the arrows indicating the direction of movement of the value of the population. For small values of N (small, that is, relative to M) the value of (M - N)/M is very close to the number 1, so that its effect on the equation can be neglected. Hence, for small N, N' is approximately proportional to N, and so for larger values of N the arrows become steeper. However, somewhere along the line the effect of the term (M - N)/M begins to make itself felt, and the arrows become no steeper; indeed, as N increases, the term (M - N)/M becomes quite small, effectively reducing the value of k and thus the value of N'. Hence the arrows begin to flatten out. For N = M, in fact, the arrows must be horizontal, indicating no change in the population, for when N = M, the equation
N' = kN·(M - N)/M
becomes N' = 0.
That is, there is no change in the population. Finally, for N > M the value of (M - N)/M is negative, and the arrows must slope downward; as N continues to increase, the downward slope of the arrows must increase since (M - N)/M is taking on values such as -1, -2, -3, ....
[Fig. 8.5 Approximating the graph of a solution to N' = kN·(M - N)/M. Horizontal axis: t; vertical axis: N; the line N = M is marked.]
[Fig. 8.6 The graphs of three typical solutions to the equation N' = kN·(M - N)/M. Horizontal axis: t; vertical axis: N.]
In Fig. 8.6 we show three different population curves, each depending on the initial choice N0 of the population at time t = 0. These curves are obtained by choosing a value of N0 and then sketching in the graph of N(t) using the arrows as a guide. The hardest experimental test of this solution is in the case of the lowest curve, which exhibits the most complicated behavior. But ultimately this so-called sigmoid curve (because it is shaped like the letter "S") gives a surprisingly good fit to actual experimental evidence. Thus the hypothesis that the potential rate of increase of a population is appropriately modified by multiplication by the degree of realization of that potential is justified, as well as the form
(M - N)/M
for the degree of realization. Actually, some researchers in the field believe that various modifications of the equation
N' = kN·(M - N)/M
would give a solution N(t) that would more accurately fit the experimental data in more cases; for example, one might wish to consider instead the equation
or even ask, in general, for the best possible exponent α in the equation
However, for reasonable choices of α, the steady-state behavior of the system will generally be unchanged. We shall consider at present only the case of our original equation, where α = 1. Suppose, then, we inquire into the behavior of the function N(t) that solves
N' = kN·(M - N)/M
as t increases without bound. The answer is already before us. The fact that all the arrows in Fig. 8.6 are directed toward the horizontal line where N = M indicates that, regardless of the initial population, its size will tend toward the value M as t increases; unless, of course, N0 = 0. This simple analysis will not be much complicated in the more complex systems we take up in the next sections.
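The arrow-following procedure of this section can also be carried out by machine. The sketch below uses Euler's method (take a small step in the direction of the arrow at the current point, then repeat), which is one simple way of doing so and is not a method the text itself prescribes. The values of k, M, the step size, and the starting populations are assumptions chosen only for illustration.

```python
def logistic_rate(n, k, m):
    """Right-hand side of N' = k*N*(M - N)/M."""
    return k * n * (m - n) / m

def follow_arrows(n0, k, m, dt=0.01, t_end=50.0):
    """Euler's method: repeatedly step along the arrow at the current point."""
    n = n0
    for _ in range(int(round(t_end / dt))):
        n += dt * logistic_rate(n, k, m)
    return n

# Illustrative values only; k, M and the starting populations are assumptions.
k, M = 0.5, 1000.0
finals = {n0: follow_arrows(n0, k, M) for n0 in (0.0, 10.0, 600.0, 1500.0)}
for n0, n_end in finals.items():
    print(f"N0 = {n0:7.1f}  ->  N(50) is about {n_end:.3f}")
```

Starting populations below M rise toward M, the one above M falls toward it, and N0 = 0 stays at 0, in line with the three curves described for Fig. 8.6 and the exceptional case noted above.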
Exercises
8.6 Repeat the analysis of this section for the differential equation
N' = kN·((M - N)/M)^(1/2)
and show that the population tends toward one of the two values 0 or M. For what initial value of the population will it tend toward the value 0?
8.7 Examine the behavior of the population equation
N' = k·N^(1/2)·(M - N)/M.
This equation has been observed to give accurate fits to experimental data under certain conditions.
8.8 Construct, as an alternative to (M - N)/M, a formula for the so-called degree of realization of potential population increase. Examine the corresponding differential equation and see if it has the right sort of general behavior as t increases without bound.
8.9 Suppose that a cylindrical tank with vertical axis has a small hole in its bottom, is filled with water, and the water leaves the tank at a rate proportional to the water pressure. Make the necessary assumptions about constants (height of tank, density of water, and so on) and write down the differential equation whose solution V(t) is the volume of water in the tank at time t.
8.10 Suppose that the increase in a certain animal population due to births is proportional to the number of individuals present, but that the decrease in the population due to deaths is constant. Write down the differential equation describing the behavior of the population N as a function of time t, and include the term (M - N)/M for degree of realization of potential. What is the behavior of the solution of this differential equation as t increases without bound?
8.3 THE CASE OF TWO COMPETING SPECIES
We now turn our attention to the case of two different species of animals with a coexistence problem. We assume that they live in the same space and are competing for much the same food supply. However, please remember that we are still making a large number of simplifying assumptions; for example, we assume that there is under such circumstances a maximum population which the environment will support with respect to each species, and that this maximum is constant; in practice, of course, this maximum is likely to vary because the available food supply suffers seasonal variations, and the like. In order to make the ideas of this section more concrete, we shall consider the case of two reasonably similar species of fish (bluegill and redear) living in a pond free from predators and with a constant but limited food supply. Since adults of these species are not predatory upon one another, we shall ignore the effects of predation on immature fish, so that the most important aspect of the interspecies competition is the sharing of food and living space. The important thing here is that bluegill and redear have similar, though not identical, food preferences. To make the significance of the notation easy to remember, we shall use the following symbols:
B(t) or B will denote the number of bluegill present at time t.
R(t) or R will denote the number of redear present at time t.
B'(t) or B' will denote the rate of increase of the bluegill population; as usual, if B' is negative this means that the population is actually decreasing.
R'(t) or R' will denote the rate of increase of the redear population.
With redear absent, we suppose that the pond will support a certain maximum population of bluegill, and we denote this maximum by C (since the letters B and C are alphabetically adjacent). Similarly, we let S denote the maximum population of redear the pond would support in the absence of bluegill. With redear absent, we have seen in Section 8.2 that a reasonable differential equation describing the population B(t) of bluegill would be
B' = kB·(C - B)/C,
where k is a positive constant having to do with the birth and death rate of bluegill. Similarly, with bluegill absent, we can describe the behavior of the redear population by
R' = vR·(S - R)/S.
Again, v is a positive constant. With both species present, neither of the above equations will still be appropriate, for the very existence of redear in the pond impinges upon the available food and space for bluegill, and in effect decreases the value of C, the maximum possible bluegill population. In fact, it seems reasonable to suppose that the value of C would be decreased in direct proportion to the number of redear present, and hence our degree of realization of potential term would become
(C - B - αR)/C,
where α is a constant that can be thought of as representing the degree to which redear interfere with the bluegills in the latter's quest for food and space. If a wide variety of foods were available to the two species and their food preferences overlapped only partially, one would expect the constant α to be somewhere between 0 and 1; however, if there were only one type of
food available and the redear were more efficient in obtaining it than bluegill, then it would be reasonable to suppose that α > 1. We shall treat all possibilities, but remember that there are also numerous other such factors of comparison between the two species that are concealed in the little constant α. To continue our analysis, we can write a reasonable (though oversimplified) equation for the population of bluegill as
B' = kB·(C - B - αR)/C
and, similarly, the redear equation becomes
R' = vR·(S - R - βB)/S.
Of course, β plays a role for redear analogous to the role of α with respect to bluegill. What we now ask is this: Given initial populations of the two species of fish, and values of the various constants, what will be the eventual population of the pond? Will one species inevitably dominate the other, so that the latter becomes extinct and the former reaches its maximum population? Or can the two species coexist, each at a certain percentage of its maximum population? In Fig. 8.7, rather than plotting either of the functions B(t) and R(t) against the time variable t, we plot instead values of B(t) on the x-axis and values of R(t) on the y-axis. The reason for this will shortly become clear. We now ask for what values of B and R will B be increasing; that is, we solve the inequality B' > 0. That is done by recalling that
B' = kB·(C - B - αR)/C.
Now if B' > 0, we have
kB·(C - B - αR)/C > 0.
Since we may assume each of k, B, and C positive (they are certainly not negative, and if B = 0 there is no problem) each may be canceled from the above inequality, and we see that C - B - αR > 0, or
αR < C - B, or R < (C - B)/α.
[Fig. 8.7 B' = 0 on the straight line R = (C - B)/α. Horizontal axis: B; vertical axis: R. The line meets the R-axis at R = C/α and the B-axis at B = C; B' > 0 below the line and B' < 0 above it.]
Thus the population of bluegill will be increasing when R < (C - B)/α; this is the condition that will make B' > 0. If we plot the graph of R = (C - B)/α, as we have done in Fig. 8.7, this will be a straight line; it must cross the vertical axis where B = 0 (that is, where R = C/α) and it must cross the horizontal axis where R = 0; that is, when
(C - B)/α = 0,
which occurs exactly when B = C. Below this line, we have
R < (C - B)/α,
and hence in the triangular region below this line, where each point represents a possible population of bluegill and a possible population of redear, the bluegill population will be increasing. We have indicated this in Fig. 8.7 by an arrow directed to the right, which indicates an increase in the value of B; we are temporarily silent as to the behavior of R. Similarly, in the region above the line R = (C - B)/α, we must have B' < 0, so that the bluegill population will be decreasing in that region. This is indicated by an arrow pointing to the left. If we perform a similar analysis on the equation
R' = vR·(S - R - βB)/S,
we can expect to obtain another straight line, with redear population increasing on one side and decreasing on the other; moreover, this line must cross the vertical axis at S and the horizontal axis at S/β. There are, however, four possibilities, as shown in Fig. 8.8, where the "redear line" is shown as a dashed line. The redear line can lie entirely over the bluegill line, entirely beneath it, or cross it in either of two ways. (We neglect the unlikely possibilities that the two lines coincide or cross at a point exactly on one of the two axes, for reasons to be discussed in the exercises to come.)
[Fig. 8.8 The four important cases (Case 1 through Case 4) for the positions of the lines on which B' = 0 and R' = 0.]
[Fig. 8.9 The case in which the redear line lies entirely above the bluegill line. Horizontal axis: B, with C and S/β marked; vertical axis: R.]
Which of the cases shown in Fig. 8.8 actually occurs depends on the experimentally obtained values of four of the constants we have introduced; the first case shown in Fig. 8.8 corresponds to the case when
C/α < S and C < S/β.
We begin our analysis with this case. Under the solid line (or bluegill line) we have indicated with arrows pointing to the right the fact that B is increasing, and above that line the arrows pointing to the left mean that in that region B is decreasing. At the beginning of each of these arrows we have attached another arrow, vertically upward if R is increasing and vertically downward if R is decreasing. All this is shown in Fig. 8.9, and is just a convenient way of indicating that in the small triangle, both populations are increasing; in the middle region, the redear population is increasing while the bluegill population is decreasing, and above both lines both populations are decreasing. Moreover, note that on the bluegill line the values of B and R are such that B' = 0; that is, the bluegill population is steady. Similarly, the redear population is steady on the redear line.
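The arrows just described are nothing more than the signs of B' and R' at each point, so they can be checked by direct computation. The following sketch does this at one sample point in each of the three regions. The constants C = S = 100, α = 1.5, β = 0.5 and the sample points are assumptions, chosen so that C/α < S and C < S/β; the function names are hypothetical.

```python
def trends(b, r, c, s, alpha, beta, k=1.0, v=1.0):
    """Return (B', R') for the two-species competition equations."""
    b_rate = k * b * (c - b - alpha * r) / c
    r_rate = v * r * (s - r - beta * b) / s
    return b_rate, r_rate

def arrows(b, r, c, s, alpha, beta):
    """Describe the horizontal (bluegill) and vertical (redear) arrows at a point."""
    b_rate, r_rate = trends(b, r, c, s, alpha, beta)
    horizontal = "right" if b_rate > 0 else "left" if b_rate < 0 else "none"
    vertical = "up" if r_rate > 0 else "down" if r_rate < 0 else "none"
    return horizontal, vertical

# Assumed constants with C/alpha < S and C < S/beta (the first case).
C, S, ALPHA, BETA = 100.0, 100.0, 1.5, 0.5
samples = {
    "small triangle": (10.0, 10.0),      # below both lines
    "middle region": (40.0, 60.0),       # between the two lines
    "above both lines": (100.0, 100.0),
}
arrow_table = {name: arrows(b, r, C, S, ALPHA, BETA)
               for name, (b, r) in samples.items()}
for name, result in arrow_table.items():
    print(name, result)
```

Only the signs matter for the arrows, which is why the positive constants k and v can be left at any convenient value here.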
[Fig. 8.10 Three typical curves showing population trends.]
Each point on the graph where B and R are nonnegative represents a possible initial population of bluegill and redear. We have selected three such points in Fig. 8.10, one in each of the three essentially different regions determined by the two lines, and then imagined the value of t increasing from its initial value of 0. The curves drawn represent, as one moves along each curve from its initial point, the manner in which each of the two populations must change, as indicated by the arrows in Fig. 8.9. It is easy to see what must happen in each of the three cases; for any initially positive populations of bluegill and redear, the curves lead inexorably with increasing values of t to the point (0, S), where the redear line crosses the vertical axis. When the curves reach that point, there they stay, for both B' and R' are zero at that point. At this point, the population of bluegill is zero and the population of redear is the maximum the pond can support, the number S. Hence, in this case, the redear will eventually take over the pond.
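The claim that every curve ends at (0, S) can be tested numerically. The sketch below follows the arrows with Euler's method from one starting point in each region; this is a numerical illustration under assumed constants (C = S = 100, α = 1.5, β = 0.5, k = v = 0.5), not the graphical argument of the text, and the helper names are hypothetical.

```python
def step(b, r, dt, c, s, alpha, beta, k, v):
    """One Euler step of B' = kB(C - B - alpha*R)/C, R' = vR(S - R - beta*B)/S."""
    b_rate = k * b * (c - b - alpha * r) / c
    r_rate = v * r * (s - r - beta * b) / s
    return b + dt * b_rate, r + dt * r_rate

def simulate(b0, r0, t_end=200.0, dt=0.01,
             c=100.0, s=100.0, alpha=1.5, beta=0.5, k=0.5, v=0.5):
    """Follow the population pair (B, R) from (b0, r0) up to time t_end."""
    b, r = b0, r0
    for _ in range(int(round(t_end / dt))):
        b, r = step(b, r, dt, c, s, alpha, beta, k, v)
    return b, r

# One assumed starting point in each of the three regions.
starts = [(10.0, 10.0), (40.0, 60.0), (100.0, 100.0)]
ends = [simulate(b0, r0) for b0, r0 in starts]
for (b0, r0), (b, r) in zip(starts, ends):
    print(f"start ({b0}, {r0}) -> B is about {b:.6f}, R is about {r:.6f}")
```

In each run the bluegill count shrinks toward 0 while the redear count settles at S = 100, whichever region the pond starts in.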
One concept we have just encountered will be subsequently quite useful; that is the concept of a critical point. The point (0, S) is a critical point for the system of differential equations
B' = kB·(C - B - αR)/C,
R' = vR·(S - R - βB)/S,
because when B = 0 and R = S, both B' and R' are zero, and the populations should not be expected to change without outside influence. In general, we refer to any such situation in which all rates of change involved are zero as a critical point, and the location and nature of these critical points will usually give us some idea as to the eventual or limiting value toward which the populations are tending. In the case above, the point (0, S) will be called a stable critical point because, if the values of B and R are changed slightly from the values B = 0 and R = S, the tendency will be for the values of B and R to return to the values 0 and S, respectively. However, the above system has two other critical points as well, which can be found by inspection of the diagram shown in Fig. 8.9, or simply by solving the differential equations so as to find when both B' and R' are zero. In the latter choice of procedures, we have
kB·(C - B - αR)/C = 0
and
vR·(S - R - βB)/S = 0.
The former equation holds when k = 0 (a solution which we ignore), or when B = 0, or when C - B - αR = 0; that is, when
R = (C - B)/α.
Similarly, R' = 0 when (ignoring v = 0) R = 0 or when R = S - βB.
To find when both B' and R' are zero, we have the four following cases:
1) B = 0 and R = 0. In this case the pond holds no fish. This critical point is not stable because a slight change in the "population" will result in the movement of the population away from this critical point, rather than back to it.
2) B = 0 and R = S - βB. Since B = 0, the latter simplifies to R = S. We have already seen that this is a stable critical point; the pond contains only redear.
3) R = (C - B)/α and R = 0. Since R = 0, the former equation simplifies to just B = C. The pond contains only bluegill; again, this critical point is unstable, since introduction of a small number of redear will upset the balance.
4) R = (C - B)/α and R = S - βB. A solution would be a point where the "bluegill line" and the "redear line" cross; in the present case they do not cross in the first quadrant, and thus this case gives us no critical point.
To summarize, the population of the pond will be constant if it contains no fish, only bluegill, or only redear, and only the last of these is stable. Recall our earlier statement that some physical interpretation could be attached to the numbers α and β. Let us examine that interpretation in the situation just considered. We can see from Fig. 8.9 that since the redear line lies entirely above the bluegill line, we must have the inequalities
C/α < S and C < S/β.
For simplicity, and because the two species are fairly similar in this example, we suppose that C = S. Then
S/α < S and S < S/β,
or
1 < α and β < 1.
Remember that α can be thought of, very roughly, as measuring the degree to which the redears interfere with the success of bluegills, and β the degree to which bluegills interfere with redears. That α > 1 and β < 1 thus could be interpreted, very simply, as meaning that in this situation, redear is a "more successful" species.
If we pass to the second case shown in Fig. 8.8, then the situation would be exactly reversed. If it turned out, by experimental measurement, that C and S were approximately equal and that α < 1 and β > 1, then the stable critical point would be located at (C, 0), indicating that one would expect that the population of the pond would tend toward the maximum of bluegill and no redear.
[Fig. 8.11 Directions of population trends in the case of an unstable critical point in the first quadrant. Horizontal axis: B, with S/β and C marked; vertical axis: R, with C/α and S marked.]
The third case shown in Fig. 8.8 is the most complicated. If you fill in the arrows indicating the direction of population trend as in Fig. 8.9, you will obtain a diagram much like that shown in Fig. 8.11. Here, for C and S approximately equal, it turns out that both α and β can be expected to exceed the number 1, which should indicate that each species competes more successfully with the other than with itself. This sounds like a peculiar situation, and some typical lines of population flow are shown in Fig. 8.12. We have here two stable critical points: one where B = 0 and R = S (all redear) and one where B = C and R = 0 (all bluegill). The expected critical point at (0, 0) is unstable and also rather uninteresting. But the bluegill line and redear line cross at a point in the first quadrant, specifically at
B = (αS - C)/(αβ - 1) and R = (βC - S)/(αβ - 1).
At this point, both B' and R' are zero, indicating a steady-state population, but this state is certainly unstable since a small change will generally force a population movement toward one of the two stable critical points.
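The formula for the crossing point, and the claim that it is unstable, can both be checked with a short computation. The sketch below evaluates the crossing point for assumed constants C = S = 100 and α = β = 2 (so that both α and β exceed 1), confirms that both rates vanish there, and then nudges the population by a single fish in each direction and follows it with Euler's method. All names and numerical values are illustrative assumptions.

```python
def rates(b, r, c, s, alpha, beta, k=1.0, v=1.0):
    """Return (B', R') for the competition equations."""
    b_rate = k * b * (c - b - alpha * r) / c
    r_rate = v * r * (s - r - beta * b) / s
    return b_rate, r_rate

def crossing_point(c, s, alpha, beta):
    """Intersection of the bluegill and redear lines (requires alpha*beta != 1)."""
    denom = alpha * beta - 1.0
    return (alpha * s - c) / denom, (beta * c - s) / denom

def run(b, r, c, s, alpha, beta, t_end=100.0, dt=0.01):
    """Follow (B, R) with Euler's method up to time t_end."""
    for _ in range(int(round(t_end / dt))):
        b_rate, r_rate = rates(b, r, c, s, alpha, beta)
        b, r = b + dt * b_rate, r + dt * r_rate
    return b, r

# Assumed constants for the third case: both alpha and beta exceed 1.
C, S, ALPHA, BETA = 100.0, 100.0, 2.0, 2.0
b_star, r_star = crossing_point(C, S, ALPHA, BETA)
rates_at_crossing = rates(b_star, r_star, C, S, ALPHA, BETA)
more_bluegill = run(b_star + 1.0, r_star, C, S, ALPHA, BETA)  # one extra bluegill
more_redear = run(b_star, r_star + 1.0, C, S, ALPHA, BETA)    # one extra redear
print((b_star, r_star), rates_at_crossing)
print(more_bluegill, more_redear)
```

With one extra bluegill the pond drifts to the all-bluegill state (C, 0); with one extra redear it drifts to the all-redear state (0, S). A balance that a single fish can tip is what the text means by an unstable critical point.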
[Fig. 8.12 Typical curves of population trend in the case of an unstable critical point in the first quadrant.]
The curves in Fig. 8.12 also tell us how to predict the eventual fish population. If the pond starts with relatively few redear, the curves flow toward the point of eventual extinction of redear; with relatively few bluegill, the redear will eventually dominate the pond. The most interesting case of all, shown as the fourth case in Fig. 8.8, has been reserved for your enjoyment as the very next exercise.
Exercises
8.11 Consider the bluegill-redear competition in the fourth case shown in Fig. 8.8; that is, when
C/α > S and C < S/β.
Draw figures analogous to Figs. 8.11 and 8.12. Find the stable and unstable critical points and discuss the physical interpretation of your solution.
8.12 Give a reason why we can ignore such cases as those in which the redear and bluegill lines coincide, or intersect on one of the coordinate axes.
8.13 What sort of critical points are obtained in case the redear and bluegill lines do coincide?
8.14 See Exercise 8.11. Does it make sense to draw the arrows on the coordinate axes? Does it make sense to draw curves of population movement that intersect the axes? What interpretation would you give to such a curve?
8.15 Let us consider the case examined in Exercise 8.11, in which a critical point is found where both B and R are positive. Assume it is plausible that a fisherman fishing for bluegill and only bluegill has the effect of decreasing the value of C, the maximum population of bluegill supportable by the pond. What effect will his fishing have on the relative populations of bluegill and redear in case the populations have previously stabilized at the critical point mentioned above? What if the fisherman is so expert that he reduces the value of C so much that C/α < S?
8.16 In analogy to the case of two competing species considered in this section, write down reasonable differential equations describing the case of three competing species.
8.17 Why must the "bluegill line" and the "redear line" each cross the coordinate axes at a positive value?
8.18 In Section 8.2, the differential equation
N' = kN·(M - N)/M
was discussed. Suppose N represents a population of one species of fish in a pond, and M the maximum population that the pond will support. Find the stable and unstable critical points for this system.
8.19 Continuing the previous exercise, what will be the effect on the stable critical point if a fisherman catches fish from the pond in proportion to their population?
8.20 Continuing the previous two exercises, what will be the effect on the stable critical point if a fisherman catches fish from the pond at a constant rate? In particular, what is a reasonable differential equation that describes this system?
8.4 THE PREDATOR-PREY CASE
Let us now examine the interesting case in which there are again two species involved, one of which preys on the other. Again, we shall suppose that this situation exists in a pond containing two species of fish, bass and redear, the former the predator, the latter the prey; for while redear account for a very substantial portion of the food supply for bass in such a situation, redear will not prey upon bass above a certain minimum size. We will make a large number of simplifying assumptions here, including the following:
• We assume that neither population is so great as to necessitate the introduction of the degree of realization factor. Consequently, the rate of increase of redear population will be the increase due to births, minus those redear consumed by bass; here we further assume that such consumption completely accounts for attrition of redear.
• Moreover, we assume that the increase in the bass population due to births is wholly dependent on the available food supply (redear and nothing else). Thus we must introduce a term to account for the decrease in bass population due to deaths.
• Finally, we assume that redear are consumed at a rate proportional to the number of encounters between bass and redear. Since the number of such encounters will thus be proportional to the number of bass as well as to the number of redear, it is not hard to see that the number of encounters will be proportional to the product of the number of bass and the number of redear. For if the number of bass were doubled, the number of encounters would be doubled. If then, in addition, the number of redear were doubled, the number of encounters would again be doubled, resulting in four times as many encounters in all.
Hence we assume that the following very simple differential equations describe the predator-prey situation:
B' = kBR - dB,
R' = vR - wBR.
Here, B is the number of bass present, B' is as usual the rate of increase of the bass population, R is the number of redear, and R' their rate of increase. As we have said, we assume that the rate of increase of the bass population is proportional to the number of encounters between bass and redear, so that this rate of increase is kBR, where k is a positive constant. The positive constant d represents the death rate of bass, presumably from old age; since the number of deaths of bass is proportional to the bass population, the attrition of the bass population is thus dB. Since this term represents a decrease in the bass population, we subtract the number of deaths from the number of births to obtain the net population increase of bass, and thus we obtain the first equation B' = kBR - dB. We have also assumed that the increase of the redear population due to births is proportional to the population of redear, so that this term becomes vR, where v is a positive constant. Redear vanish from the redear population
at a rate proportional to the number of bass-redear encounters, and thus by the amount wBR, where w is another positive constant. Again subtracting deaths from births, we find that
R' = vR - wBR.
[Fig. 8.13 Directions of trends of bass population in the predator-prey example. Horizontal axis: B; vertical axis: R; B' > 0 above the line R = d/k and B' < 0 below it.]
[Fig. 8.14 Trends of bass and redear population. Horizontal axis: B; vertical axis: R; the horizontal line R = d/k and a vertical line divide the quadrant into four regions according to the signs of B' and R'.]
We now solve the inequality B' > 0, in order to find for what values of B and R the bass population is increasing. We have
kBR - dB > 0.
Since we may assume that B > 0, the term B may be canceled from this inequality, and thus we see that
kR > d, or R > d/k,
since k, too, is positive. We plot the straight line R = d/k in the graph shown in Fig. 8.13, where values of B lie on the x-axis and values of R lie on the y-axis. Thus the line R = d/k is a horizontal line passing through the point d/k on the y-axis, and for values of R in excess of d/k (that is, above this straight line) the bass population is increasing, since B' > 0 when R > d/k. This makes good sense; when there are many redear, the bass population should be on the increase, and when there are few redear the number of bass should be decreasing. This information is indicated by the usual arrows in Fig. 8.13. We next ask for what values of B and R is the redear population increasing; that is, we solve R' > 0. We have
vR - wBR > 0,
so that, canceling the positive factor R,
v - wB > 0,
and hence
B < v/w.
We plot the vertical line B = v/w in the graph shown in Fig. 8.14; for values of B less than v/w (that is, to the left of this line) R' > 0, and so the redear population is increasing. And if B > v/w, the redear population is decreasing. The usual arrows have been drawn in Fig. 8.14, summarizing our findings about the population changes of the two species. Now on the line R = d/k, the horizontal line shown in Fig. 8.14, B' = 0. This can be seen by substitution of R = d/k in the bass equation
B' = kBR - dB.
[Fig. 8.15 Curves of population trend in the predator-prey case. Horizontal axis: B, with v/w marked; vertical axis: R, with d/k marked.]
Similarly, R' = 0 on the vertical line B = v/w. Hence the point (v/w, d/k) where the two lines intersect is a critical point. Its stability will be discussed shortly. Another critical point is (0, 0), the case in which the two species are absent from the pond. In practice this should probably be considered a stable critical point, since if a few of each species are introduced the bass could be expected to eliminate the redear and then die of starvation themselves; remember, we are assuming that the bass have no other food supply. There is no other critical point where R = 0, for again, in the absence of redear, the bass population will decrease to zero. One can expect a critical point where B = 0, somewhere high up on the vertical axis, where the population of redear will stabilize at a point much higher than if bass were present. This is not indicated by our differential equations, since for simplicity we omitted the degree of realization term from each equation. In Fig. 8.15 we have indicated the apparent behavior of the curves of population trend. The behavior of the arrows in the previous figure very strongly suggests that the curves are closed, and thus that the population
of each species will undergo periodic variations. First bass and redear increase, then the larger number of bass cause the redear population to decrease; next, the diminished food supply causes a decrease in the bass population, and finally this last decrease permits an increase in redear until there are sufficiently many to permit an increase in the number of bass. It might actually be that the curves are slowly spiraling in to the point (v/w, d/k), in which case the latter would be a stable critical point; or, possibly, the curves could be spiraling slowly outward, and the point (v/w, d/k) would then be an unstable critical point. For these particularly simple differential equations, it turns out that the point (v/w, d/k) is neither. The curves actually are closed, and hence a perturbation in a population initially with values B = v/w, R = d/k will result in a new population which can be thought of as having values circling about the point (v/w, d/k), and thus not really moving either away from or closer to the point (v/w, d/k). Perhaps some term could be invented here, such as calling this the case of a quasi-stable critical point. In addition, each closed curve of the sort shown in Fig. 8.15 can also be thought of as a critical path ("path" seems a better term than "point" here), since deviations from one curve simply place the population on a nearby curve. It also is sensible to ask about the duration of such a path; that is, how long does it take the population to go through one complete cycle? For some species of fish and many mammals, the duration of a cycle is usually measured in terms of a few years; curiously enough, even the much "longer" paths do not seem to have much greater duration than the very short ones.
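The statement that the curves are closed can be made plausible by computation. A standard fact about equations of this form, not derived in the text, is that the quantity kR - d·ln R + wB - v·ln B does not change along a solution: differentiating it and substituting the two equations gives (kR - d)(v - wB) + (wB - v)(kR - d) = 0. A population that keeps this quantity fixed can neither spiral in nor spiral out. The sketch below follows one solution with the classical fourth-order Runge-Kutta method (chosen here for accuracy, as an assumption of this example) and records how far the quantity drifts. The constants k = d = v = w = 1 and the starting point are purely illustrative, with populations in arbitrary units.

```python
import math

def rates(b, r, k, d, v, w):
    """B' = kBR - dB (bass), R' = vR - wBR (redear)."""
    return k * b * r - d * b, v * r - w * b * r

def rk4_step(b, r, dt, k, d, v, w):
    """One classical fourth-order Runge-Kutta step."""
    b1, r1 = rates(b, r, k, d, v, w)
    b2, r2 = rates(b + 0.5 * dt * b1, r + 0.5 * dt * r1, k, d, v, w)
    b3, r3 = rates(b + 0.5 * dt * b2, r + 0.5 * dt * r2, k, d, v, w)
    b4, r4 = rates(b + dt * b3, r + dt * r3, k, d, v, w)
    return (b + dt * (b1 + 2 * b2 + 2 * b3 + b4) / 6.0,
            r + dt * (r1 + 2 * r2 + 2 * r3 + r4) / 6.0)

def invariant(b, r, k, d, v, w):
    """Quantity whose time derivative along solutions works out to zero."""
    return k * r - d * math.log(r) + w * b - v * math.log(b)

K, D, V, W = 1.0, 1.0, 1.0, 1.0       # assumed constants, purely illustrative
b, r = 1.5, 1.0                       # assumed starting point, arbitrary units
h_start = invariant(b, r, K, D, V, W)
drift = 0.0
for _ in range(2000):                 # 20 time units with dt = 0.01
    b, r = rk4_step(b, r, 0.01, K, D, V, W)
    drift = max(drift, abs(invariant(b, r, K, D, V, W) - h_start))
rates_at_center = rates(V / W, D / K, K, D, V, W)
print(drift, rates_at_center)
```

The recorded drift stays at the level of numerical error, and both rates vanish at (v/w, d/k), which is consistent with closed curves circling that point.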
Of course, these very simple differential equations cannot pretend to be an accurate representation of what actually takes place in nature, but such cycles have been observed with sufficient frequency so that they deserve some explanation, even a weak one. In summary, the mathematics predicts the existence of periodic cycles in populations of predators and their prey, such cycles have been observed, and there is apparently a good reason why such cycles should occur.
Exercises
8.21 In the predator-prey case involving bass and redear, it would seem to the fisherman's advantage to fish so as to move the populations of each species close to the quasi-stable critical point (v/w, d/k). Why?
8.22 Continuing the previous exercise, during which parts of the population cycle should the fisherman fish for bass? For redear? Can he ever fish for both? Should he ever fish for neither?
8.23 Suppose that there are two species of fish in a pond, say A and B, and the members of species A eat the eggs of species B at a rate proportional to the population of species A. Suppose also that the members of species B eat the eggs of species A at a rate proportional to the population of species B. Suppose, finally, that the two species are not in competition for food or space, so that other than egg-eating the presence of each species will not inhibit the growth of the other. Under such circumstances the degree of realization term (M - N)/M should probably be introduced into each resulting differential equation. Write down a plausible pair of differential equations describing this situation. It should be the case that your equations are sufficiently simple to make the next exercise feasible.
8.24 Use the equations invented in the previous exercise to go through a process like that in the last two sections, finding stable and unstable critical points and sketching curves representing population trends.
8.25 List at least five deficiencies in the oversimplified treatment of the predator-prey situation given in the previous section. For example, do bass reproduce continuously?
8.26 Suppose that species A is parasitic on species B, which it has no difficulty in finding, and this takes place in a region of ample space and food supply. It would then be reasonable to write
$$A' = \alpha A \cdot \frac{KB - A}{KB}$$
and
$$B' = \beta B \cdot \frac{M - B}{M} - kA.$$
Explain the origin of these equations.
8.27 Find critical points and sketch curves of population trends for the equations in the previous exercise.
8.28 One can try to improve the predator-prey equations given in the last section by introducing a degree of realization term for the redear. We write
$$B' = kBR - dB, \qquad R' = vR \cdot \frac{S - R}{S} - wBR,$$
where S is the maximum population of redear the pond can support, and the other constants and terms have the same meaning as in the last section. Making reasonable assumptions about the relative sizes of these constants where necessary, find critical points and sketch curves of population trend for this system of differential equations.
8.29 Suppose the two species B and R are symbiotic, though neither is completely dependent on the other, and there is no competition for food or space. Explain the origin of the descriptive equations
$$B' = kB \cdot \frac{C - B}{C} + \alpha R, \qquad R' = vR \cdot \frac{S - R}{S} + pB,$$
where $\alpha$ and $p$ are positive constants. Hint: See Section 8.3.
8.30 Assuming that the constants $\alpha$ and $p$ of the previous exercise are relatively small, sketch curves of population trend for the equations of the previous exercise and find any critical points.
8.31 If the human race were considered as two species (male and female) competing for food and space, what sort of differential equations would describe this system? You should assume that the birth rate of males and females is the same. Analyze the consequences of your differential equations, and see if the results seem to match up with reality.
8.32 In the predator-prey equations
$$B' = kBR - dB, \qquad R' = vR - wBR$$
for bass and redear, suppose a poison is introduced into the pond which kills members of each species at a rate proportional to the population of that species. We could then write the modified equations
$$B' = kBR - dB - pB, \qquad R' = vR - wBR - qR,$$
where p and q are the positive "poison" coefficients. Without the poison terms, the most interesting stable critical point is (v/w, d/k), as we have seen. Where is the corresponding critical point for the new system of equations? In particular, will the number of prey rise or fall?
8.33 Ladybugs prey on aphids, and aphids are hard on many vegetables and ornamentals. If you observe a stable population of ladybugs and aphids in your garden, and you have an insecticide as effective in killing ladybugs as in killing aphids, should you spray?
8.34 Suppose that a city of fixed area contains N(t) automobiles at time t. Suppose also that new cars are being introduced into the city at a constant rate, and that cars are being permanently removed by total destruction due to two types of accidents: single-vehicle accidents and two-vehicle accidents.
Should the rate of attrition of cars due to single-vehicle accidents be proportional to N? To what should the rate of attrition due to two-vehicle accidents be proportional? What differential equation would describe the behavior of N' under these circumstances?
8.35 In Section 8.1, we discussed the case of unrestricted growth of a single organism. Under such circumstances, will there be a fixed time $t_0$ such that the population will double in every interval of duration $t_0$?
8.36 Suppose that a cylindrical tank with vertical axis has a small hole in the bottom of such size that if the tank were full of water, then water would pour out of the hole at the rate of 100 gallons per minute. Suppose also that in any case, water pours out of the hole at a rate proportional to the water pressure, and that the tank is very large compared to the other amounts used in this problem. If water is running into the tank from an overhead pipe at the rate of 10 gallons per minute, toward what limiting volume is the amount of water in the tank tending? (Use appropriate constants where necessary.)
8.37 How would one test the growth of the population of the United States in the last hundred years to see if it fits the case of the unrestricted growth of a single species treated in Section 8.1?
8.38 How would one test the growth of the population of the United States in the last hundred years to see if it fits the case of the growth of a single species under limiting conditions, as treated in Section 8.2?
8.39 Return to the predator-prey case of Section 8.4, and assume that the death rate d of bass is so small that we might as well suppose that d = 0. What happens to the bass-redear system in this case?
8.40 Continuing the previous exercise, a more realistic assumption might be the following: If the death rate of bass is ignored, then a degree of realization term should be introduced into the equation giving the rate of population increase of the bass.
In this case we might have for the bass equation
$$B' = kBR \cdot \frac{M - B}{M},$$
where M is some theoretical maximum population of bass the pond can support. Should M be proportional to R, the number of redear present? If so, how does the bass-redear system behave? What if M is assumed to be constant?
NOTES AND REFERENCES
Three interesting books concerned with animal populations are:
Gause, G. F., The Struggle for Existence (Williams and Wilkins, 1934).
Lack, D. L., The Natural Regulation of Animal Numbers (Oxford Clarendon Press, 1954).
Slobodkin, L. B., Growth and Regulation of Animal Populations (Holt, Rinehart, and Winston, 1961).
Eugene Odum's well-known textbook Ecology (Holt, Rinehart, and Winston, 1963) is an excellent general introduction to the whole field of ecology. If you have worked a number of the exercises, do not feel as if you have been doing ecology. Field work and laboratory work are essential; ecology cannot be done solely at the desk. The most one can hope for is the formulation of new hypotheses susceptible to experimental verification or invalidation.
CHAPTER 9 THE ART GALLERY THEOREM
The main purpose of this chapter is to develop enough of the theory of convex sets to prove Krasnoselskii's Theorem, also known as the "Art Gallery" Theorem. However, there will be considerable development of side topics, particularly in the exercises. Although the theory of convex sets has applications in game theory, linear programming, and other branches of mathematics much used in the social and managerial sciences, convex sets are quite interesting in their own right. Moreover, numerous pictures can be drawn illustrating definitions and theorems in this chapter; we recommend that you draw such pictures frequently, for they will help greatly in constructing proofs and will not usually mislead you. A small amount of notation from the elementary theory of sets will be used throughout this chapter, so if this notation is unfamiliar to you, it would be very helpful to read over the first section of Chapter 6. On the other hand, though the study of convex sets will be presented from a very geometric point of view, plane geometry itself is not a prerequisite for this chapter. We will usually restrict our attention to subsets of the ordinary two-dimensional plane; although generalizations to higher dimensions are frequently possible, such material will usually be reserved for the exercises.
9.1 CONVEX SETS
We assume that you are familiar with the properties of straight lines and straight-line segments in the plane. We will denote the ordinary two-dimensional plane by $E^2$, just an abbreviation for Euclidean space of two dimensions. If a and b are points of $E^2$, the straight-line segment joining a and b will be denoted by [a, b], and consists of a and b together with all points between a and b on the straight line in $E^2$ through a and b. By [a, a] we mean the set {a}, consisting of the point a alone.
Example 9.1 Let a = (1, 0) and b = (2, 2). Then [a, b] consists of all points (x, y) of $E^2$ such that $1 \le x \le 2$ and y = 2x - 2. The set [a, b] of this example is shown in Fig. 9.1.
Fig. 9.1 The segment [a, b] from the point a to the point b.
It should be clear from the definition that [a, b] = [b, a]. You may use anything you know about plane geometry to prove our first theorem.
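The definition of a segment can be made concrete with coordinates. The sketch below is only an illustration under the assumption that points are stored as (x, y) pairs; the function names and the tolerance are hypothetical. The first function produces the point a + t(b - a) of [a, b] for a parameter t between 0 and 1, and the second tests whether a given point lies on [a, b], including the degenerate case [a, a].

```python
def point_on_segment(a, b, t):
    """Return a + t*(b - a); as t runs from 0 to 1 this sweeps [a, b] from a to b."""
    if not 0.0 <= t <= 1.0:
        raise ValueError("t must lie between 0 and 1")
    return (a[0] + t * (b[0] - a[0]), a[1] + t * (b[1] - a[1]))

def on_segment(a, b, p, tol=1e-9):
    """Return True if the point p lies on the segment [a, b]."""
    abx, aby = b[0] - a[0], b[1] - a[1]
    apx, apy = p[0] - a[0], p[1] - a[1]
    length_sq = abx * abx + aby * aby
    if length_sq == 0.0:
        # [a, a] is the set {a}: p must coincide with a.
        return apx * apx + apy * apy <= tol * tol
    cross = abx * apy - aby * apx   # zero exactly when a, b, p are collinear
    dot = abx * apx + aby * apy     # measures how far along the line p lies
    return abs(cross) <= tol and -tol <= dot <= length_sq + tol
```

For the segment from (1, 0) to (2, 2), every point returned by `point_on_segment` satisfies y = 2x - 2 with x between 1 and 2, and `on_segment(b, a, p)` gives the same answer as `on_segment(a, b, p)`, reflecting the fact that [a, b] = [b, a].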
Theorem 9.1 If a and b are points of $E^2$ and $p \in [a, b]$, then
$$[a, p] \cup [p, b] = [a, b].$$
Of course, to prove that the above two sets are equal, one shows that each point of $[a, p] \cup [p, b]$ is a point of [a, b], and conversely. The following is our most important definition. Let C be a subset of $E^2$. We say that C is convex if [a, b] is a subset of C whenever a and b are points of C. That is, for every possible pair of points a and b of C, it must
be true that the segment [a, b] belongs wholly to C. And thus, to prove a set is convex, a usually fruitful approach is to select two arbitrary points a and b of the set, and then prove that [a, b] is a subset of the given set. For some examples, the plane sets shown in Fig. 9.2 are convex; those shown in Fig. 9.3 are not. Some special cases of convex sets are these:
$E^2$ itself is convex.
Each segment [a, b] is convex.
A set consisting of just one point in $E^2$ is convex.
A circular disk in $E^2$, containing all, part, or none of its boundary, is convex.
If C consists of a triangle together with all points within the triangle, then C is convex. (We will call such a set a triangular region.)
The empty set $\emptyset$ is convex.
The last assertion above is a consequence of the definition of convexity. For if a set S is not convex, it must contain at least one pair of points a and b such that [a, b] is not wholly contained in S. The empty set contains no such pair of points, for it contains no points at all. On the other hand, while it is at least intuitively clear that a triangular region is convex, a formal proof of this is difficult and depends on a careful definition of "triangle." You should assume whenever necessary that a triangular region is convex, even though we will not supply a proof of this fact.
Exercises
9.1 Professor Aardvark told his class that a convex set was one "each two points of which could see each other." What did he mean?
9.2 Let a and b be points of $E^2$. Under what circumstances is the set {a, b} convex?
9.3 How would you define convexity for subsets of three-dimensional space $E^3$?
9.4 How would you define convexity for subsets of one-dimensional space $E^1$?
9.5 Prove Theorem 9.1; that is, show that if a and b are points of $E^2$ and $p \in [a, b]$, then
$$[a, p] \cup [p, b] = [a, b].$$
9.6 Draw three convex sets in $E^2$ such that any two of them have at least one point in common, but such that there is no point common to all three of the sets.
9.7 Prove that each segment [a, b] is convex.
Fig. 9.2 Two convex sets in $E^2$.
Fig. 9.3 Two nonconvex sets in $E^2$.
Fig. 9.4 Proving that $C \cap D$ is convex if C and D are.
9.8 Draw two overlapping convex sets C and D. In your example, is their intersection $C \cap D$ also convex?
9.9 Prove that if C and D are any two convex sets whatsoever in $E^2$, then $C \cap D$ must be convex.
9.10 Must the union of two convex sets be convex? Explain the reason for your answer.
9.2 INTERSECTIONS OF CONVEX SETS
One might solve Exercise 9.9 (to show that the intersection of two convex sets is convex) as follows. Let C and D be two convex sets in $E^2$. We can dispose of the simplest cases first. If $C \cap D$ is empty, then it is convex; if $C \cap D$ contains only one point, it is convex. So we might as well suppose that $C \cap D$ contains at least two points, say p and q. To help us proceed with the argument, we draw a picture like that shown in Fig. 9.4. What do we know about the two points p and q? Only that each belongs to $C \cap D$. Put another way, this means that p and q belong to C, and also that p and q belong to D. But since p and q belong to C, and C is given as a convex set (here is where that hypothesis is used), it follows by the definition of convex set that [p, q] is a subset of C. For exactly the same sorts of reasons, [p, q] is also a subset of D. But since [p, q] is a subset of both C and D, it follows that [p, q] is a subset of $C \cap D$. What has happened? Two arbitrary points p and q were chosen in $C \cap D$, and it turned out that the segment [p, q] was consequently a subset of $C \cap D$. This means that $C \cap D$ is convex, by definition. The above argument thus establishes our next theorem.
Theorem 9.2 The intersection of two convex sets is convex.
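A numerical experiment can illustrate Theorem 9.2, although of course it proves nothing. In the sketch below a set is represented by a membership test, two overlapping disks play the roles of C and D, and random pairs of points p, q of a set are checked to see whether sampled points of [p, q] stay in the set. The particular disks, the sampling box, the sample counts, and all function names are assumptions chosen for illustration.

```python
import random

def in_disk(center, radius):
    """Membership test for the closed disk with the given center and radius."""
    cx, cy = center
    return lambda p: (p[0] - cx) ** 2 + (p[1] - cy) ** 2 <= radius ** 2

def intersection(*members):
    """Membership test for the intersection of the given sets."""
    return lambda p: all(m(p) for m in members)

def union(*members):
    """Membership test for the union of the given sets."""
    return lambda p: any(m(p) for m in members)

def segment_violations(member, trials=300, samples=25, seed=0):
    """Pick random pairs p, q in the set; count sampled points of [p, q] outside it."""
    rng = random.Random(seed)
    points = []
    while len(points) < 2 * trials:
        p = (rng.uniform(-2.0, 2.0), rng.uniform(-2.0, 2.0))
        if member(p):
            points.append(p)
    bad = 0
    for p, q in zip(points[::2], points[1::2]):
        for i in range(samples + 1):
            t = i / samples
            x = ((1 - t) * p[0] + t * q[0], (1 - t) * p[1] + t * q[1])
            if not member(x):
                bad += 1
    return bad
```

With C the unit disk about (0, 0) and D the unit disk about (1, 0), `segment_violations(intersection(C, D))` should find no violations. Applying the same test to the union of two small disjoint disks does find violations, since segments joining the two disks leave the union.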
By mimicking this proof, you can show without difficulty that the intersection of three convex sets must be convex. For a start, let A, B, and C be convex sets, and let $D = A \cap B \cap C$. Stop here and try to prove that D must be convex. If you worked Exercise 9.10, you probably saw that the union of two convex sets need not be convex. In fact, if p and q are two points of $E^2$, then each of the sets {p} and {q} is convex, but their union {p, q} is not, since it does not contain all the points of the segment [p, q]. Even if two convex sets have points in common, their union need not be convex, a fact that you can establish by means of innumerable simple examples. However, there is one situation in which the union of convex sets is convex; this situation will be discussed in the next group of exercises. Though you may have thought of one proof that the intersection of three convex sets must be convex, there are two, one proof involving segments (probably the one you thought of), and one not; the latter is the one given for our next result.
Theorem 9.3 The intersection of three convex sets is convex.
Proof. Let A, B, and C be convex sets, and let $D = A \cap B \cap C$. Then
$$D = (A \cap B) \cap C,$$
since set-intersection obeys an associative law. But since A and B are convex, so is $A \cap B$, by the previous theorem. Hence we have written D as the intersection of two convex sets, namely $A \cap B$ and C. Again using the last theorem, it follows that D is convex. Hence the intersection of any three convex sets must be convex.
Exercises
9.11 Use the technique of the proof of Theorem 9.3 to show that the intersection of four convex sets must be convex.
9.12 Suppose that n is a natural number at least 2, and it is known that the intersection of any collection of n convex sets is convex. Use this fact and the technique of the proof of Theorem 9.3 to show that the intersection of any collection of n + 1 convex sets is a convex set.
9.13 Does it follow from Theorem 9.2 and the previous exercise that the intersection of any finite collection of convex sets is convex? Explain carefully.
9.14 Does it follow from Theorem 9.2 and Exercise 9.12 that the intersection of infinitely many convex sets is convex? Explain your answer.
9.15 In the previous section it was mentioned that there is one special situation in which the union of convex sets must be convex. This occurs when the collection of convex sets forms what is called a tower. That is, the collection $\mathcal{C}$ of sets (convex or not) is said to be a tower if, given any two sets A and B in the collection $\mathcal{C}$, it is true that either
$$A \subset B \quad \text{or} \quad B \subset A.$$
Prove that if $\mathcal{C}$ is a tower of convex sets, then $\bigcup \mathcal{C}$ (the union of all sets in the collection $\mathcal{C}$) is convex. Hint: If p and q are two points of $\bigcup \mathcal{C}$, then $p \in A$ and $q \in B$ for some sets A and B in $\mathcal{C}$. How can you use the fact that $\mathcal{C}$ is a tower?
9.16 Does Theorem 9.2 hold for convex sets in $E^3$? (The notation $E^3$ of course stands for three-dimensional Euclidean space.) Explain.
9.17 Do Theorem 9.3 and the technique of its proof remain valid for convex sets in $E^3$? Why?
9.18 Does Exercise 9.15 hold for convex sets in $E^3$? Give your reasons.
9.19 Let $\mathcal{C}$ be a collection of sets each finite subcollection of which is a tower. Does it follow that $\mathcal{C}$ itself must be a tower? Explain your answer.
9.20 If $\mathcal{C}$ is a tower of sets in $E^2$, need there be a "largest" element of $\mathcal{C}$; that is, need $\mathcal{C}$ include a set L such that L contains all the other sets in $\mathcal{C}$? Explain carefully.
9.3 HULLS AND KERNELS
Although we can use our method of proving Theorem 9.3 to show that the intersection of any finite collection of convex sets is convex, an alternate proof can be given that shows that the intersection of any collection of convex sets is convex. This is one of the more important results about convex sets, and we present it as our next theorem.
Theorem 9.4 If $\mathcal{C}$ is any collection of convex sets, then the intersection of all the sets in $\mathcal{C}$, denoted by $\bigcap \mathcal{C}$, is convex.
Proof. If p and q are two points of $\bigcap \mathcal{C}$, then both p and q belong to every set in the collection $\mathcal{C}$. Since each such set is convex, the segment [p, q] also belongs to every set in $\mathcal{C}$. Hence [p, q] is a subset of $\bigcap \mathcal{C}$. Therefore by definition, $\bigcap \mathcal{C}$ is convex.
It is fortunate that such an important theorem has such an easy proof. Note also that Theorems 9.2 and 9.3 are superseded by this theorem, for they are special cases of it. The proof seems to work even in the special cases where $\mathcal{C}$ contains only one convex set, or no sets at all; the latter somewhat puzzling situation will be discussed in the next set of exercises.
Fig. 9.5 Forming the convex hull of the two sets S and T.
In addition, if A is any subset of $E^2$ whatsoever, then it is meaningful to consider the collection $\mathcal{C}$ of all convex subsets of $E^2$ containing A. The intersection of all these sets is called the convex hull of A, and is abbreviated by Hul (A). It has the following properties, which we list in the form of a theorem.
Theorem 9.5 Let A be a subset of $E^2$. Then
a) $A \subset \operatorname{Hul}(A)$.
b) Hul (A) is convex.
c) If C is any convex set containing A, then C also contains Hul (A).
d) A is convex if and only if A = Hul (A).
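When A is a finite set of points, Hul (A) is a convex polygonal region (possibly degenerate), and its corner points can be computed directly. The sketch below uses Andrew's monotone chain method, a standard algorithm that is not part of the text and is offered only as an illustration; the function names and the representation of points as (x, y) pairs are assumptions.

```python
def cross(o, a, b):
    """Twice the signed area of triangle o, a, b; positive for a left turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def hull_vertices(points):
    """Corner points of the convex hull of a finite point set, counterclockwise."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower, upper = [], []
    for p in pts:
        # Discard points that would make the lower boundary turn right or go straight.
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        # Same rule, traversing right to left, builds the upper boundary.
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # The last point of each chain is the first point of the other.
    return lower[:-1] + upper[:-1]
```

For the six points (0, 0), (2, 0), (2, 2), (0, 2), (1, 1), (1, 0), the function keeps only the four corners of the square; the points (1, 1) and (1, 0) already lie in the hull of the corners. Applying the function a second time to its own output changes nothing, which echoes the statement that a convex set is its own hull.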
The proofs of each of these facts are quite easy, and are left for the exercises. We should remark at this point that, as a consequence of this theorem, there is a unique "smallest" convex set, namely Hul (A), containing each subset A of $E^2$. Some examples of hull formation are shown in Fig. 9.5. Associated with each subset A of $E^2$, in addition to its convex hull, is another set called its convex kernel. The kernel can be defined as follows:
$$\operatorname{Ker}(A) = \{\, x \in A \mid \text{if } y \in A, \text{ then } [x, y] \subset A \,\}.$$
Fig. 9.6 Forming the convex kernels of the two sets S and T.
Some examples of the formation of Ker (A) from A are shown in Fig. 9.6. One convenient way to remember the definition of Ker (A) is to say that Ker (A) is the set of all points of A that can "see" all the other points of A. And, in analogy to our theorem about the convex hull, we have our next result.
Theorem 9.6 Let A be a subset of $E^2$. Then
a) $\operatorname{Ker}(A) \subset A$.
b) Ker (A) is convex.
c) A is convex if and only if A = Ker (A).
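When A is a region bounded by a simple polygon, membership in Ker (A) can be tested directly. The sketch below relies on a standard fact that the text does not prove: a point of such a region can "see" every other point of the region exactly when it lies on the inner side of every edge line. The function name, the convention that vertices are listed counterclockwise, and the L-shaped example are assumptions made for illustration.

```python
def in_kernel(polygon, x, tol=1e-12):
    """Return True if x lies on the inner (left) side of every edge line.

    polygon: vertices of a simple polygon, listed counterclockwise.
    """
    n = len(polygon)
    for i in range(n):
        (px, py), (qx, qy) = polygon[i], polygon[(i + 1) % n]
        # Negative cross product: x is strictly to the right of edge p -> q.
        if (qx - px) * (x[1] - py) - (qy - py) * (x[0] - px) < -tol:
            return False
    return True

# An L-shaped region (assumed example), vertices in counterclockwise order.
L_SHAPE = [(0, 0), (2, 0), (2, 1), (1, 1), (1, 2), (0, 2)]
UNIT_SQUARE = [(0, 0), (1, 0), (1, 1), (0, 1)]
```

For the L-shaped region the test accepts (0.5, 0.5), which sees the whole region, and rejects (1.5, 0.5), which cannot see points in the upper arm such as (0.5, 1.5). For the unit square, which is convex, every point of the square is accepted, in line with part c) of Theorem 9.6.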
Again, the easy proof is left for the exercises. The analogy between kernels and hulls is not perfect; while there is a unique "smallest" convex set containing the given set A, the kernel need be neither the largest nor the smallest convex subset of A. However, there is a connection between Ker (A) and the "largest" convex subsets of A, and we will make this connection explicit in our next theorem. But, before we even state that theorem, some preliminaries are needed. Let $\mathcal{C}$ be a collection of sets, and let $\Pi$ be a property meaningful for each set $A \in \mathcal{C}$. (That is, for each set A in $\mathcal{C}$ it is either true or false that A has property $\Pi$.) The set M in $\mathcal{C}$ is said to be maximal with respect to property $\Pi$ if
a) M has property $\Pi$, and
b) M is not a proper subset of any other set in $\mathcal{C}$ having property $\Pi$.
It is important that you do not confuse the property of being maximal with the property of being maximum. A maximum set having property $\Pi$ would be a set having property $\Pi$ and also containing every other set having property $\Pi$. To illustrate the distinction, we give an example. Let $\mathcal{C}$ be the collection of all subsets of $E^2$. Let us say that a subset of the plane is nonlinear if no three of its points lie on a straight line. Let $\Pi$ be the property of "being nonlinear." Then $\Pi$ is certainly meaningful for sets in the collection $\mathcal{C}$. And $\mathcal{C}$ contains sets maximal with respect to property $\Pi$; for example, let M be a circle. Then M is maximal with respect to being nonlinear, because
a) M itself is nonlinear, and
b) if M is a proper subset of L (where $L \in \mathcal{C}$) then L cannot have property $\Pi$; for since M is a proper subset of L, L contains at least one point $x \notin M$. A straight line through x and the center of the circle M contains x and two points of M, thus three points of L, since $M \subset L$.
This example shows that a set that is maximal with respect to property $\Pi$ need not be unique, for there are numerous different circles in the plane, and each is maximal with respect to being nonlinear. However, there is no subset of $E^2$ that is a maximum with respect to being nonlinear, for the only subset of $E^2$ containing all circles is $E^2$ itself, and $E^2$ is not nonlinear. The following axiom is one form of the so-called Zermelo Axiom, which is usually assumed true by most mathematicians. We will need this axiom to prove our next theorem.
Axiom Let $\mathcal{C}$ be a collection of sets and let $\Pi$ be the property of "being a tower." (Then $\Pi$ is meaningful for each subcollection $\mathcal{A} \subset \mathcal{C}$, since each such subcollection either is or is not a tower.) Then $\mathcal{C}$ contains a subcollection $\mathcal{M}$ maximal with respect to property $\Pi$.
As an example of an application of this axiom, let $\mathcal{C}$ be the collection of all circular disks in $E^2$ each of which contains its boundary. The above axiom guarantees the existence of a maximal tower $\mathcal{M}$ of such disks; that is, $\mathcal{M}$ is a collection of circular disks, each disk in $\mathcal{M}$ either contains or is contained in each other disk in $\mathcal{M}$, and if some circular disk D is either contained in or contains each disk in $\mathcal{M}$ then $D \in \mathcal{M}$. (In this example it is not actually necessary to use the axiom to find the maximal tower $\mathcal{M}$ of circular disks, since it is not difficult to show that the collection of all disks centered at the origin is a maximal tower. Note also that this latter collection is maximal, but not a maximum.) We mentioned that there was a connection between Ker (A) and the convex subsets of A. We now proceed to demonstrate that connection, first by proving the following lemma, which shows the existence of maximal convex subsets of a given set A, and then immediately proceeding to the theorem in question, which says that Ker (A) is the intersection of the maximal convex subsets of A. The axiom just given is necessary to establish the lemma.
Lemma Let A be a subset of $E^2$ and let p be a point of A. Then there exists a subset C of $E^2$ such that
a) $p \in C$;
b) $C \subset A$;
c) C is convex; and
d) C is maximal with respect to the above three properties.
Proof. Let $\mathcal{C}$ be the collection of all convex subsets of A containing p. Then $\mathcal{C}$ is not the empty collection, since $\{p\} \in \mathcal{C}$. By the axiom, $\mathcal{C}$ contains a maximal tower $\mathcal{M}$. By Exercise 9.15, $\bigcup \mathcal{M}$ is convex. Let $C = \bigcup \mathcal{M}$. It is not difficult to show that C has the four desired properties listed in the statement of the lemma.
Theorem 9.7 Let A be a subset of $E^2$. Let $\mathcal{B}$ be the collection of all maximal convex subsets of A. Then $\operatorname{Ker}(A) = \bigcap \mathcal{B}$.
The proof is outlined in one of the next exercises.
Exercises
9.21 This and the next three exercises provide a proof of Theorem 9.5. Let A be a subset of $E^2$. Show that $A \subset \operatorname{Hul}(A)$.
9.22 Let A be a subset of $E^2$. Show that Hul (A) is convex.
9.23 Let A be a subset of $E^2$ and let C be a convex subset of $E^2$ such that $A \subset C$. Prove that $\operatorname{Hul}(A) \subset C$.
9.24 Let A be a subset of $E^2$. Prove that A is convex if and only if A = Hul (A).
9.25 Let A be a subset of $E^2$, and let $\Delta(A)$ be the set obtained by adjoining to A all points belonging to segments [p, q], where p and q are points of A. For example, if A is a circle, this process produces a circular disk for $\Delta(A)$. Need the set $\Delta(A)$ formed in this way be convex? Explain.
9.26 Continuing the previous exercise, show that $A \subset \Delta(A)$ and that $\Delta(A) \subset \operatorname{Hul}(A)$.
9.27 See the previous two exercises. Is it true that, given $A \subset E^2$, the application of $\Delta$ a finite number of times will produce Hul (A)? That is, is it true that one of the sets
$$A,\ \Delta(A),\ \Delta(\Delta(A)),\ \Delta(\Delta(\Delta(A))),\ \ldots$$
must be the convex hull of A? Is there a maximum number of applications of $\Delta$ that will always suffice for the production of Hul (A)? What is this number?
9.28 Repeat the previous three exercises for subsets of $E^3$ rather than $E^2$.
9.29 Repeat Exercises 9.25, 9.26, and 9.27 for subsets of $E^1$ rather than $E^2$.
9.30 Let A be a subset of $E^2$ and let $\mathcal{T}$ be the collection of all triangular regions whose vertices lie in A. Need $\operatorname{Hul}(A) = \bigcup \mathcal{T}$? Why?
9.31 Let A be a subset of $E^3$ and let $\mathcal{T}$ be the collection of all triangular regions whose vertices lie in A. Need $\operatorname{Hul}(A) = \bigcup \mathcal{T}$? Explain.
9.32 Give an example of a tower $\mathcal{C}$ of convex subsets of $E^2$ such that $\bigcap \mathcal{C}$ is nonempty. Then give an example such that $\bigcap \mathcal{C}$ is empty.
9.33 In the previous section we provided an example, being nonlinear, as a property meaningful for subsets of $E^2$, and a set maximal with respect to this property. Provide a different example of such a property and, if possible, find a subset maximal with respect to that property.
9.34 Let $\mathcal{C}$ consist of all convex subsets of $E^2$ no one of which contains the origin (0, 0). Show that there exists a maximal convex subset of $E^2$ not containing the origin. Is it possible to use the axiom of the last section? Is it necessary?
9.35 If $\mathcal{C}$ is the empty collection of subsets of $E^2$, what is $\bigcap \mathcal{C}$? Hint: If $p \notin \bigcap \mathcal{C}$, then p must fail to belong to some set in the collection $\mathcal{C}$. What points p have this property?
9.36 If $\mathcal{C}$ is the collection of subsets of $E^2$ consisting of just one set, say A (that is, $\mathcal{C} = \{A\}$), then what is $\bigcap \mathcal{C}$?
9.37 Let A be a subset of $E^2$. Show that the point p belongs to Hul (A) if and only if p belongs to the convex hull of a set consisting of three or fewer points of A. Hint: Use Exercise 9.30 if you wish.
9.38 The last line of the proof of the Lemma of the preceding section leaves verification of the four properties of the Lemma for the reader. Please verify them.
9.39 Prove Theorem 9.6.
9.40 Here is an outline of the proof of Theorem 9.7; please fill in the details. We have given that A is a subset of $E^2$ and that $\mathcal{B}$ is the collection of all maximal convex subsets of A. Let $B = \bigcap \mathcal{B}$; we need to show that Ker (A) = B. First, we show that $\operatorname{Ker}(A) \subset B$. There is no problem if $\operatorname{Ker}(A) = \emptyset$ (why?), so let $p \in \operatorname{Ker}(A)$. It suffices to show that if M is a maximal convex subset of A, then $p \in M$ (why is this sufficient?). So let M be a maximal convex subset of A. Since $p \in \operatorname{Ker}(A)$, $[p, q] \subset A$ for each point $q \in M$ (why?). Let C consist of all points belonging to all such segments [p, q], where $q \in M$. Then C is convex (why?), C is a subset of A (why?), and $M \subset C$. Since M is a maximal convex subset of A, M = C (why?). Since $p \in C$ (why?), it follows that $p \in M$. As we mentioned previously, this is sufficient to show that $p \in B$. Hence, since p is an arbitrary point of Ker (A), it follows that $\operatorname{Ker}(A) \subset B$, and we are half done. Next, there remains only the problem of showing that $B \subset \operatorname{Ker}(A)$. If $B = \emptyset$, there is no problem, so let x be a point of B. To show that $x \in \operatorname{Ker}(A)$, it is sufficient (why?) to show that $[x, y] \subset A$ for each $y \in A$. So let y be an arbitrary point of A. By the Lemma of the last section, y is contained in a maximal convex subset M of A (exactly how is the Lemma applied?). But since $x \in B$, it follows (why?) that $x \in M$ too. Thus $[x, y] \subset M$ (why?). Hence $[x, y] \subset A$ (why?). It follows that $x \in \operatorname{Ker}(A)$, and thus that $B \subset \operatorname{Ker}(A)$. We have shown that both
$$\operatorname{Ker}(A) \subset B \quad \text{and} \quad B \subset \operatorname{Ker}(A)$$
are true, and consequently Ker (A) = B. This establishes Theorem 9.7.
9.4 HELLY'S THEOREM
Helly's Theorem will be the principal tool we use to prove the Art Gallery Theorem. First, suppose that $\mathcal{C}$ is a nonempty collection of convex subsets of $E^2$ such that each two sets in $\mathcal{C}$ have nonempty intersection. Does it follow that $\bigcap \mathcal{C}$ must be nonempty? Can we even conclude that the intersection of each three sets in $\mathcal{C}$ is nonempty? Try to answer these questions before proceeding. We can answer the first question very easily. In the coordinatized plane, let the set $C_n$ consist of all points on, or to the right of, the vertical line through the point (n, 0) on the x-axis, where n is allowed to assume all whole number values. Let $\mathcal{C}$ consist of all the sets $C_n$. Then $\mathcal{C}$ is a collection of plane convex sets, and the intersection of each two sets in $\mathcal{C}$ is clearly nonempty. However, no point belongs to $\bigcap \mathcal{C}$, for in order that $p \in \bigcap \mathcal{C}$, it would be necessary that the x-coordinate of p be greater than every integer, which is impossible. So the answer to the first question above is a very emphatic "No"; indeed, any finite subcollection of $\mathcal{C}$ has nonempty intersection, but still $\bigcap \mathcal{C} = \emptyset$. This example sheds little light on the second question, though, for in this case the intersection of each three sets from $\mathcal{C}$ is indeed nonempty. On the other hand, the unbounded straight lines drawn as shown in Fig. 9.7 do form a collection of convex subsets of $E^2$, and each pair of these lines intersect since no two of the lines are parallel. With the help of a little analytic geometry one can show that no three of these lines have a common point. So the answer to the second question raised above is also in the negative. But Helly's Theorem tells us that the answer must be affirmative if we increase the numbers of sets mentioned in both the hypothesis and the conclusion. The above examples serve to show why Helly's Theorem may be a somewhat unexpected result.
Fig. 9.7 The line passing through (n, 0) has slope n.
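The "little analytic geometry" mentioned above for the lines of Fig. 9.7 can be sketched as follows. The only assumptions are the ones suggested by the figure caption: the line $L_n$ passes through (n, 0) with slope n, and m, n, k denote whole numbers; the names $L_n$, m, and k are introduced here for illustration.

```latex
% The line through (n, 0) with slope n:
\[ L_n:\quad y = n(x - n). \]
% Intersection of L_m and L_n when m \neq n:
\begin{align*}
m(x - m) &= n(x - n) \\
(m - n)\,x &= m^2 - n^2 \\
x &= m + n, \qquad y = n\bigl((m + n) - n\bigr) = mn .
\end{align*}
% A third line L_k passes through the point (m + n, mn) exactly when
\[
mn = k(m + n - k)
\iff k^2 - (m + n)k + mn = 0
\iff (k - m)(k - n) = 0 .
\]
```

So a line $L_k$ through the intersection point of $L_m$ and $L_n$ must have k = m or k = n; that is, it must be one of the two lines already considered. Hence each two of the lines meet, at the point (m + n, mn), but no three distinct lines of the family share a point.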
Fig. 9.8 One case in the proof of Helly's Theorem.
Theorem 9.8 (Helly's Theorem) Suppose that $\mathcal{C}$ is a collection of convex subsets of $E^2$ such that any three sets in $\mathcal{C}$ have a point in common. Then for each $n \ge 4$, each n sets in $\mathcal{C}$ have a point in common.
This is just one version of Helly's Theorem; there are forms of Helly's Theorem for three- and higher-dimensional Euclidean space, and forms that guarantee even that $\bigcap \mathcal{C}$ itself is nonempty. However, the version stated above will be quite sufficient for the proof of the Art Gallery Theorem, and it is the easiest version to prove.

Proof We attack the simplest case first, the case in which the number n of the theorem is 4. That is, suppose that $\mathcal{C}$ is a collection of plane convex sets each three of which have a point in common; we desire to show that each four of the sets in $\mathcal{C}$ must also have a common point. So let A, B, C, and D be four sets in $\mathcal{C}$. Since each three sets in $\mathcal{C}$ must have a point in common, there must in particular be a point common to the three sets A, B, and C. For convenience, we will denote this point by ABC, for this notation serves to remind us, among other things, that the point ABC belongs to each of the three sets A, B, and C. Similarly, there are points ABD, ACD, and BCD common to the other possible combinations of three of the four sets in question. In the most general case, these points are four distinct points in $E^2$, and one possibility is that they form the vertices of a convex quadrilateral, as shown in Fig. 9.8.
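As a concrete illustration of this configuration, the sketch below takes four sample points playing the roles of ABC, ABD, BCD, and ACD and computes where the two diagonals of the quadrilateral cross; the role of that crossing point is explained in the argument that follows. The coordinates and the function names are assumptions chosen for illustration, and exact arithmetic requires integer coordinates.

```python
from fractions import Fraction

def cross(o, a, b):
    """Twice the signed area of triangle o, a, b (positive if counterclockwise)."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def diagonal_crossing(p1, p2, q1, q2):
    """Point where segment [p1, p2] properly crosses segment [q1, q2], else None."""
    d1 = cross(q1, q2, p1)
    d2 = cross(q1, q2, p2)
    d3 = cross(p1, p2, q1)
    d4 = cross(p1, p2, q2)
    # Proper crossing: the end points of each segment lie strictly on
    # opposite sides of the line through the other segment.
    if d1 * d2 < 0 and d3 * d4 < 0:
        t = Fraction(d1, d1 - d2)  # parameter of the crossing along [p1, p2]
        return (p1[0] + t * (p2[0] - p1[0]), p1[1] + t * (p2[1] - p1[1]))
    return None

# Hypothetical positions, listed in order around a convex quadrilateral.
ABC, ABD, BCD, ACD = (0, 0), (4, 0), (4, 4), (0, 2)
p = diagonal_crossing(ABC, BCD, ABD, ACD)
print(p)  # prints (Fraction(4, 3), Fraction(4, 3))
```

When the four points do not form a convex quadrilateral, the function returns None; those are the remaining cases of the proof.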
In this case, the diagonals of the quadrilateral must intersect in a point we have called p. In Fig. 9.8, the diagonal from ABC to BCD has its end points in the convex sets B and C, so this diagonal must be a subset of $B \cap C$. Similarly, the other diagonal must be a subset of $A \cap D$. Hence the point p must belong to all four of the sets A, B, C, and D. This shows that any four sets in $\mathcal{C}$ must have nonempty intersection in this case, the one in which the four points form the vertices of a convex quadrilateral. Fortunately, there are not many other cases, and all of the others are even simpler, so that we have left their discussion for the exercises at the end of this section. Assuming, then, that the other cases can similarly be disposed of, we have shown that if each three sets in $\mathcal{C}$ have a point in common, then also each four sets in $\mathcal{C}$ must have a point in common.

You may well guess what comes next. Knowing more of $\mathcal{C}$ than we did before (that each four sets in $\mathcal{C}$ must have a common point), we proceed to show that each five sets in $\mathcal{C}$ must have a point in common. By continuing this process, we may conclude that each finite subcollection of sets from $\mathcal{C}$ must have nonempty intersection, since each finite value of $n \ge 4$ must eventually be reached by this method.

However, the "obvious" way to show that each five sets in $\mathcal{C}$ have a point in common is not the best way. For suppose we are given five sets A, B, C, D, and E in $\mathcal{C}$. What we should not do is label the point known to lie in the four sets A, B, C, and D by the symbol ABCD, and consider the possibilities for the five points we would thus obtain. There are too many possibilities, and for larger values of n the situation becomes far more complicated.

Instead, we use a trick like that in the proof of Theorem 9.3 and Exercises 9.11 and 9.12. We consider the collection $\mathcal{B}$ consisting of the five sets A, B, C, D, and E. Since these sets come from $\mathcal{C}$, we know that any three of them have a point in common, and thus, by what we have already proved, also that any four of them must have a point in common. Consider next the new collection

$$\mathcal{A} = \{A, B, C, D \cap E\}.$$

In $\mathcal{A}$, we have four convex sets (the intersection $D \cap E$ of two convex sets is again convex), and each three of them have a point in common. For the only possible combinations of intersections of three of them are

$$A \cap B \cap C, \quad A \cap B \cap (D \cap E), \quad A \cap C \cap (D \cap E), \quad B \cap C \cap (D \cap E).$$

Because of what we have observed about $\mathcal{B}$, each of the above sets is nonempty. Hence each three sets in $\mathcal{A}$ have nonempty intersection. Since each set in $\mathcal{A}$ is convex, it follows by what we have already proved that any four sets in $\mathcal{A}$ have nonempty intersection. There are only four sets in $\mathcal{A}$, so we now know that

$$A \cap B \cap C \cap (D \cap E)$$
is nonempty. Thus the five sets A, B, C, D, and E have nonempty intersection. Since these are five sets arbitrarily chosen from $\mathcal{C}$, this shows that each five sets from $\mathcal{C}$ have nonempty intersection.

You can see how we continue application of this method. For example, now that we know each five sets in $\mathcal{C}$ have nonempty intersection, we let A, B, C, D, E, and F be six sets arbitrarily chosen from $\mathcal{C}$, and let

$$\mathcal{D} = \{A, B, C, D, E, F\}$$

and

$$\mathcal{E} = \{A, B, C, D, E \cap F\}.$$

Now $\mathcal{D}$ is a subcollection of $\mathcal{C}$, and so by what we have already shown, each five sets in $\mathcal{D}$ have nonempty intersection. Moreover, $\mathcal{E}$ is a collection of convex sets in $E^2$ and each four sets in $\mathcal{E}$ have nonempty intersection; some of the possibilities for intersections of four sets from $\mathcal{E}$ are listed below:

$$A \cap B \cap C \cap D, \quad A \cap B \cap D \cap (E \cap F), \quad B \cap C \cap D \cap (E \cap F).$$

In every case, any intersection of four sets from $\mathcal{E}$ can be thought of as the intersection of five or fewer sets from $\mathcal{D}$, and hence any intersection of four sets from $\mathcal{E}$ is nonempty. Hence, by what we have already shown, the intersection of any five sets from $\mathcal{E}$ must also be nonempty. The only possibility for such an intersection is

$$A \cap B \cap C \cap D \cap (E \cap F),$$

but this is the same as the intersection of all six of the sets in $\mathcal{D}$. So the intersection of any six sets from $\mathcal{C}$ itself is nonempty. Thus, given $n \ge 4$, we will eventually reach the value of the integer n after a number of repetitions of this idea, and thus we can conclude that the intersection of any finite subcollection of sets from $\mathcal{C}$ is nonempty. This establishes our version of Helly's Theorem.

Exercises
9.41 Does the following statement describe what was actually shown in our proof of Helly's Theorem? Suppose that $\mathcal{C}$ is a collection of convex sets in the plane, n is a whole number at least 3, and any n sets in $\mathcal{C}$ have nonempty intersection. Then any n + 1 sets in $\mathcal{C}$ have nonempty intersection.

9.42 Suppose that $\mathcal{C}$ is a collection of 5000 unbounded straight lines in the plane, any three of which have one point in common. What can you conclude?

9.43 Suppose that $\mathcal{C}$ is a collection of 10,000 circles in the plane (here, by a "circle" we mean the boundary of a circular disk, rather than the disk itself), and any three circles in $\mathcal{C}$ have at least one point in common. What can you conclude?

9.44 Suppose that $\mathcal{C}$ is a collection of 64 solid balls in three-dimensional space, each three of which have a point in common. What can you conclude?

9.45 Suppose that n is an integer at least 4, and $\mathcal{C}$ is a collection of n points in $E^2$ such that each three of these points lie within a circle of radius 1. Can you show that all n points must lie within a circle of radius 1?

9.46 What do you think is the correct version of Helly's Theorem for $E^3$?

9.47 What do you think is the correct version of Helly's Theorem for $E^1$?

9.48 In the proof of Helly's Theorem, we began by letting $\mathcal{C}$ be a collection of convex sets in the plane each three of which are known to have a common point, and we sought first to prove that each four sets in $\mathcal{C}$ had a common point. We chose four sets A, B, C, and D from $\mathcal{C}$, and denoted the point common to A, B, and C by ABC, and so on. There were several cases for the location of the four points ABC, ABD, ACD, and BCD in $E^2$, and we considered only the case in which these four points lay on the vertices of a convex quadrilateral. List the other possibilities.

9.49 Continuing the previous exercise, show how in each of the cases listed one can conclude that there is a point common to all four of the sets A, B, C, and D.

9.50 Continue one step further the argument used in the proof of Helly's Theorem; that is, knowing that $\mathcal{C}$ is a collection of convex sets in $E^2$ each six of which have a common point, show that each seven sets from $\mathcal{C}$ must also have a common point.
9.5 KRASNOSELSKII'S THEOREM
We will now prove the Art Gallery Theorem itself. Watch for the point in the proof at which Helly's Theorem is used.

Theorem 9.9 (Krasnoselskii's Theorem) If, for each three paintings hung in an art gallery, there is a spot from which those three can be viewed simultaneously, then there is a spot in the gallery from which all the paintings can be viewed.

Of course, this statement is a little imprecise; we can phrase the theorem as M. A. Krasnoselskii did, in the language of plane convex sets.

Theorem 9.9 Rephrased Let P be a plane polygon. Suppose that, given any three points a, b, and c on the boundary of P, there exists a point $x \in P$ such that $[x, a] \cup [x, b] \cup [x, c] \subset P$. Then $\mathrm{Ker}(P) \ne \emptyset$.
Fig. 9.9 Construction of the squares in the Art Gallery proof.
From the second form of the theorem, you can see that the art gallery must be polygonal in shape and with vertical walls. There are certain other differences in the statements given above for the Art Gallery Theorem, but these differences can be resolved by a careful study of the following proof. We prove the second form, of course.

Proof As shown in Fig. 9.9, we first give the sides of P a counterclockwise orientation. For each side $\sigma$ of P thus oriented, there is an unbounded straight line $\lambda$ containing $\sigma$, and the orientation of $\sigma$ induces a like orientation on $\lambda$. This orientation of $\lambda$ makes it possible to distinguish the points on the "left side" of $\lambda$ from those on the right; just imagine yourself standing on $\lambda$, like a tightrope walker, facing in the direction of the orientation; the "left side" of $\lambda$ consists of those points of $E^2$ to your left.
The plane set consisting of $\lambda$ together with all points to the left of $\lambda$ is called the closed left half-plane determined by $\lambda$. In this half-plane use a segment of $\lambda$ to construct a square S with the following properties:

a) S has one side on the line $\lambda$;
b) S is a subset of the closed left half-plane determined by $\lambda$; and
c) S is so large as to contain all the points of P lying to the left of $\lambda$.

We construct such a square S for each side $\sigma$ of P in exactly the above fashion. We want to show that each three of these squares have a common point. Let $S_1$, $S_2$, and $S_3$ be three such squares, and let a, b, and c be points lying, respectively, on the corresponding sides $\sigma_1$, $\sigma_2$, and $\sigma_3$ of the polygon P. By our hypotheses, there then exists a point $x \in P$ such that

$$[x, a] \cup [x, b] \cup [x, c] \subset P.$$

But x must lie on the left side of each of the corresponding three lines $\lambda_1$, $\lambda_2$, and $\lambda_3$. For if x were to lie to the right of (for example) $\lambda_1$, then [x, a] would also lie to the right of $\lambda_1$, and thus there would be points of P arbitrarily close to $\sigma_1$ but to the right of $\sigma_1$. This cannot happen, for the counterclockwise orientation of the sides of P guarantees that no points of P lie immediately to the right of any side of P. Hence x must lie to the left of each of the lines $\lambda_1$, $\lambda_2$, and $\lambda_3$, and $x \in P$ too. Since each of the three squares $S_1$, $S_2$, and $S_3$ is constructed so as to contain all points of P to the left of the lines used to construct these squares, it follows that x belongs to all three of the squares $S_1$, $S_2$, and $S_3$. So each of the possible triples of squares we have constructed has nonempty intersection, and in fact, given three such squares, there is a point of P belonging to all three of them.

Each square is a convex subset of $E^2$ (here we understand that a "square" consists of the boundary together with its interior). So we have the following situation: We have finitely many convex sets in $E^2$, namely the squares, and each three of these sets has nonempty intersection. By Helly's Theorem, there must be a point common to all the squares. We call this point q, and now we wish to show that $q \in P$.

If q were not a point of P, we could draw a straight line segment from q to some interior point r of P, and so arrange matters that [q, r] does not intersect a vertex of P. Then, as one moves along [q, r] in the direction from q to r, one must first encounter a point of some side $\sigma$ of P, as shown in Fig. 9.10. Call that point t. Since [q, t] meets P only at the point t, each point of [q, t] other than t must lie to the right of the side $\sigma$ and thus to the right of the line $\lambda$ drawn through $\sigma$. In particular, q itself would have to lie to the right of $\lambda$, and thus q could not be a point of the square S built on the line $\lambda$.
Fig. 9.10 The situation if $q \notin P$.
This is impossible, since q belongs to every square, and hence we may conclude that q is, after all, a point of P. If we can show that $q \in \mathrm{Ker}(P)$, this will establish the theorem, for it will show that $\mathrm{Ker}(P) \ne \emptyset$.

But suppose that $q \notin \mathrm{Ker}(P)$. Then there exists some point $z \in P$ that q cannot "see"; that is, such that the segment [q, z] does not lie entirely within P. Let y be a point of [q, z] not contained in P, as shown in Fig. 9.11. Then, as one travels along the segment [y, z] from y to z, one first meets the boundary of P at some point w on a side $\sigma$ of P. Hence [q, w] lies to the right of the side $\sigma$, as there are points of [q, w] arbitrarily close to $\sigma$ but not in P. But then, q must lie to the right of the straight line $\lambda$ through $\sigma$, and hence q cannot belong to the square S built on $\lambda$. This is in contradiction to the fact that q lies in each of the squares, and this contradiction establishes that $q \in \mathrm{Ker}(P)$. Therefore $\mathrm{Ker}(P) \ne \emptyset$, and this establishes Krasnoselskii's Theorem.

Actually, Krasnoselskii's Theorem is true for any plane figure bounded by a closed curve, and the method of proof is quite similar. However, this version of the theorem requires the use of a form of Helly's Theorem for the case of the intersection of infinitely many convex sets.
Fig. 9.11 Showing that $q \in \mathrm{Ker}(P)$.
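The last two steps of the proof use only one property of q: that it lies on or to the left of every oriented line through a side of P. A minimal sketch of that test follows, assuming the polygon is given as a list of vertices in counterclockwise order; the L-shaped gallery and the function names are hypothetical examples, not taken from the text.

```python
def left_or_on(a, b, q):
    """True if q lies on or to the left of the directed line from a to b."""
    return (b[0] - a[0]) * (q[1] - a[1]) - (b[1] - a[1]) * (q[0] - a[0]) >= 0

def in_every_left_half_plane(polygon, q):
    """polygon: vertices in counterclockwise order; q: candidate viewing spot."""
    n = len(polygon)
    return all(left_or_on(polygon[i], polygon[(i + 1) % n], q) for i in range(n))

# Hypothetical L-shaped gallery, vertices listed counterclockwise.
L_SHAPE = [(0, 0), (2, 0), (2, 1), (1, 1), (1, 2), (0, 2)]

print(in_every_left_half_plane(L_SHAPE, (0.5, 0.5)))  # prints True
print(in_every_left_half_plane(L_SHAPE, (1.5, 0.5)))  # prints False
```

The point (1.5, 0.5) is in the gallery but lies to the right of the line through the side from (1, 1) to (1, 2), and indeed it cannot see the point (0.5, 1.5).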
Exercises
9.51 The proof of the Art Gallery Theorem is so long that it is difficult to understand without a summary. Try summarizing the proof of the theorem; make your summary as condensed as possible, omitting most reasons. For example, you might begin like this: "First orient the sides of P in a counterclockwise direction, then draw an unbounded straight line through each side. On each such line, construct a square such that ... "

9.52 How could you phrase the Art Gallery Theorem for a three-dimensional gallery, with pictures hung in all sorts of directions from an observer? Should the word "three" in the statement of the theorem then be replaced by the word "four"?

9.53 Is it possible that although S is a nonempty subset of $E^2$, $\mathrm{Ker}(S) = \emptyset$? Give your reasons.

9.54 An unbounded straight line $\lambda$ in $E^2$ divides $E^2$ into three sets: $\lambda$ itself, the points of $E^2$ on one side of $\lambda$, and the points of $E^2$ on the other side of $\lambda$ (the latter two sets are to contain no points of $\lambda$ itself). The last two sets are called the open half-planes determined by $\lambda$. Prove that if S is a subset of $E^2$, then Hul (S) is the intersection of all open half-planes containing S. (Remember that the intersection of an empty collection of sets is all of $E^2$.)

9.55 Is it possible to divide $E^2$ into two disjoint convex sets whose union is $E^2$?

9.56 Let $\mathcal{C}$ be a finite collection of rectangles (including boundary and interior) in $E^2$ each of which has sides parallel to the coordinate axes. There does exist a natural number k such that, in such a situation, if each k rectangles from $\mathcal{C}$ have a point in common, then there must be a point common to all the rectangles in $\mathcal{C}$. What is the least value of k for which this is so? Hint: It is clear that the answer is either k = 2 or k = 3, but why is this clear?

9.57 In Section 9.4, there was shown in Fig. 9.7 a collection of unbounded straight lines in $E^2$ such that any two intersected, and no three had a point in common. The figure shows lines passing through the points (n, 0) on the x-axis, where n is a natural number, and the slope of the line through the point (n, 0) was to be n itself. If you know some analytic geometry, this will help in constructing a proof that no three of the lines have a point in common. If not, see if you can find an alternative proof of this fact.

9.58 See Exercise 9.46. Give an example of four convex sets in $E^3$ such that each three have a point in common but such that there is no point common to all four.

9.59 There is a version of Helly's Theorem for $E^2$ in which the word "three" can be replaced by the word "two," but proving the following version is not easy. Try it anyway. Let $\mathcal{C}$ be a finite collection of convex sets in $E^2$ such that each two sets in $\mathcal{C}$ have a point in common. Then, for each point $p \in E^2$, there exists an unbounded straight line $\lambda$ passing through p and through all the sets in $\mathcal{C}$.

9.60 Here is another difficult problem, in which Helly's Theorem can be used to give the solution. Suppose that the set S consists of n points in $E^2$, where n is a natural number. Then there exists a point p in $E^2$ such that, for each unbounded straight line $\lambda$ passing through p, at least n/3 of the points of S lie in each of the closed half-planes determined by $\lambda$.
9.6 L-CONVEXITY
One frequently fruitful approach to mathematics is the generalization of previous results. Sometimes such generalizations actually simplify the theory by removing extraneous details, and they sometimes show connections between branches of mathematics that were thought to be unrelated. One possible generalization of the idea of convexity is L-convexity. We still restrict our attention to subsets of $E^2$. Let us say that the subset K of $E^2$ is L-convex provided that, for each two points x and y of K, there exists a point z of K such that

$$[x, z] \cup [z, y] \subset K.$$
The name "L-convex" comes, of course, from the idea that each two points of K can be joined by a vaguely L-shaped figure lying entirely within K. Every convex set is L-convex, but the converse is easily shown to be false. So we do have here a generalization of the idea of convexity, and a number of our previous theorems still hold after appropriate modifications have been made. For example, even without modification, one can prove the following two theorems.

Theorem 9.10 The union of a tower of L-convex sets is L-convex.

Theorem 9.11 Let K be a subset of $E^2$ and let p be a point of K. Then there is an L-convex subset of K maximal with respect to containing the point p.

The L-convex kernel of a set has a very natural definition: If K is a subset of $E^2$, then the point x of K is said to belong to the L-convex kernel of K provided that, for each $y \in K$, there exists $z \in K$ such that $[x, z] \cup [z, y] \subset K$. We abbreviate the L-convex kernel of K by L-Ker (K). At this point, the professional mathematician would raise questions such as the following: Need $\mathrm{Ker}(K) \subset \text{L-Ker}(K)$? Need L-Ker (K) be L-convex? Need L-Ker (K) equal the intersection of the maximal L-convex subsets of K?

Unfortunately, it is not so easy to define an L-convex hull for the set K. We would want the L-hull to be L-convex, by analogy with properties of the convex hull. But the "obvious" approach of forming the L-hull by intersecting all L-convex sets containing K does not work, as you will see if you let K consist of three sides of a square in the plane. The real difficulty seems to be that the intersection of L-convex sets need not be L-convex.

On the other hand, there are some constructions with L-convex sets that do not work for convex sets. Given a subset S of $E^2$ and a point $p \in E^2$, we can form the join of p and S, denoted by p # S, as follows:

$$p \,\#\, S = \bigcup \{\, [p, s] \mid s \in S \,\}.$$

It would seem plausible that if S is any subset of $E^2$, then p # S would be convex. Unfortunately, this is not the case. However, it is true that p # S is L-convex. We leave further development of these ideas to the exercises.
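For a finite set S the join p # S is just a finite union of segments, so membership in it can be tested directly. The sketch below is an illustration under that assumption (finite S, integer coordinates so that the arithmetic is exact); the names on_segment and in_join and the sample points are hypothetical.

```python
def on_segment(a, b, r):
    """True if r lies on the closed segment [a, b] (exact for integer coordinates)."""
    if a == b:
        return r == a
    cross = (b[0] - a[0]) * (r[1] - a[1]) - (b[1] - a[1]) * (r[0] - a[0])
    if cross != 0:
        return False  # r is not even on the line through a and b
    dot = (r[0] - a[0]) * (b[0] - a[0]) + (r[1] - a[1]) * (b[1] - a[1])
    return 0 <= dot <= (b[0] - a[0]) ** 2 + (b[1] - a[1]) ** 2

def in_join(p, S, r):
    """True if r belongs to p # S, the union of the segments [p, s] for s in S."""
    return any(on_segment(p, s, r) for s in S)

p = (0, 0)
S = [(2, 0), (0, 2)]
x, y = (2, 0), (0, 2)

print(in_join(p, S, x), in_join(p, S, y))  # prints True True
print(in_join(p, S, (1, 1)))               # prints False
```

Here (1, 1) is the midpoint of [x, y], so the segment [x, y] leaves the join, while the two segments [x, p] and [p, y] lie in it by construction.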
Exercises
9.61 Give an example of an L-convex subset of $E^2$ which is not convex.

9.62 Prove that every convex set in $E^2$ is L-convex.

9.63 Prove Theorem 9.10: That the union of a tower of L-convex sets is L-convex.

9.64 Prove Theorem 9.11: That if K is a subset of $E^2$ and p is a point of K, then there exists an L-convex subset of K maximal with respect to containing p.

9.65 Is it true that for every subset K of $E^2$, $\mathrm{Ker}(K) \subset \text{L-Ker}(K)$?

9.66 Is it true that for every subset K of $E^2$, $\mathrm{Ker}(K) = \text{L-Ker}(K)$?

9.67 Suppose that K is a polygonal region in $E^2$. Can you show that L-Ker (K) must be L-convex? (So far as the author knows, this problem is unsolved.)

9.68 Suppose that K is an arbitrary subset of $E^2$. Need it be true that L-Ker (K) is L-convex? (Hint: See the previous exercise.) Explain.

9.69 Suppose that K is a subset of $E^2$. Need L-Ker (K) be equal to the intersection of the maximal L-convex subsets of K? Give your reasons.

9.70 Need L-Ker (K) contain the intersection of the maximal L-convex subsets of K for every subset K of $E^2$? Why?

9.71 Let K consist of three sides of a square in $E^2$. Show that K is not L-convex, and that the intersection of all L-convex sets containing K is in fact equal to K.

9.72 Give an example of two L-convex sets in $E^2$ whose intersection is not L-convex.

9.73 Give an example of a subset S of $E^2$ and a point $p \in E^2$ such that p # S is not convex.

9.74 Prove that if S is a subset of $E^2$ and p is a point of $E^2$, then p # S is L-convex.

9.75 Show that if C is a circle in the plane, then the closed circular disk with boundary C is not a minimal L-convex set containing C.

9.76 Formulate an alternative generalization of the idea of convexity in the plane.

9.77 Is the complement of a circular disk (containing its boundary) in $E^2$ L-convex?

9.78 Let P be a polygonal region in $E^2$. Suppose that each two points x and y of the boundary of P can be joined by the segment [x, y] with $[x, y] \subset P$. Does it follow that P must be convex?

9.79 Let P be a polygonal region in $E^2$. Suppose that for each two points x and y on the boundary of P, there exists a point z of P such that $[x, z] \cup [z, y] \subset P$. Does it follow that P must be L-convex?

9.80 Is it true that the subset S of $E^2$ is L-convex if and only if $\mathrm{Ker}(S) \ne \emptyset$?

NOTES AND REFERENCES
The books by Hadwiger, Debrunner, and Klee and by Yaglom and Boltyanskii listed below are collections of problems on convexity and related topics, with discussion material liberally interspersed, and with the level of the material not excessively high (with certain exceptions). The books by Valentine and Grünbaum are rather advanced. The first study of L-convexity known to the author can be found in the paper, "Some properties of L sets in the plane," by Alfred Horn and F. A. Valentine, published in Volume 16 (1949) of the Duke Mathematical Journal.

Benson, R., Euclidean Geometry and Convexity (McGraw-Hill, 1966).
Grünbaum, B., Convex Polytopes (Interscience, 1967).
Hadwiger, H., Debrunner, H., and Klee, V., Combinatorial Geometry in the Plane (Holt, Rinehart, and Winston, 1964).
Lyusternik, L., Convex Figures and Polyhedra, translated by T. Jefferson Smith (Dover Publications, 1963).
Valentine, F., Convex Sets (McGraw-Hill, 1964).
Yaglom, I. and Boltyanskii, V., Convex Figures, translated by Kelly and Walton (Holt, Rinehart, and Winston, 1961).
CHAPTER 10 THE REAL NUMBER SYSTEM
We shall begin with the assumption that you already clearly understand the system $(Q, \cdot, +, <)$ of rational numbers with ordinary multiplication and addition and the usual order relation. We shall examine four main topics:

1) the inadequacies of the rational number system,
2) remedying such inadequacies by construction of the real number system,
3) the significance of decimal expansions for real numbers, and
4) some unusual and important properties of the real number system.
10.1 THE RATIONAL NUMBERS
The system of rational numbers is quite adequate, of course, for solving many kinds of equations, such as

$$2x + 3 = 7 - 5x,$$

or even systems of simultaneous equations in more than one unknown; for example,

$$5x + 3y - 2z = 0, \qquad 2x = 3z, \qquad \frac{y}{2} + 9x = 1001.$$

Moreover, it does appear that every "quantity" encountered can, in many circumstances, be approximated numerically to within any desired degree of accuracy by rational numbers; we have in mind "quantities" such as length, weight, volume, velocity, and even the famous rational approximation 22/7 for $\pi$. However, $\pi$ is not equal to 22/7, and such a simple equation as

$$x^2 = 2$$

cannot be solved using only rational numbers. But this equation certainly ought to have an exact solution, for the positive "number" x such that $x^2 = 2$ can easily be visualized as the exact length of the hypotenuse of an isosceles right triangle of leg length 1, as in Fig. 10.1. We see as a result of our first theorem that this length cannot be a rational number.

Fig. 10.1 The number $\sqrt{2}$ measures a length.

Theorem 10.1 No rational number is a solution of the equation $x^2 = 2$.
Proof We will use Theorem 7.9, the Fundamental Theorem of Arithmetic, to give the simplest (but by no means the only) proof. Suppose by way of contradiction that there were a rational number r such that $r^2 = 2$. Whether or not r is positive, since it is rational there must (by definition of "rational number") exist integers m and n with $n \ne 0$ such that

$$r = \frac{m}{n}.$$

Hence

$$r^2 = \frac{m^2}{n^2}.$$

But $r^2 = 2$, and so

$$2 = \frac{m^2}{n^2},$$

and hence

$$2n^2 = m^2.$$

Now m and n are integers, but here we are concerned only with their squares, so we may suppose that both m and n are positive. By the Fundamental Theorem of Arithmetic, each of m and n has a unique prime factorization, and so do $m^2$ and $n^2$. The prime factorization of m has some number (possibly zero) of 2's in it, and hence $m^2$ has one factorization in which twice as many 2's appear. That is, $m^2$ has a factorization with an even number of 2's. Since this factorization is unique, we see that the prime factorization of $m^2$ must contain an even number of 2's. Similarly, the prime factorization of $n^2$ contains an even number of 2's. So one factorization of $2n^2$ contains an odd number of 2's; again, this prime factorization is unique, so the prime factorization of $2n^2$ must contain an odd number of 2's. But we have the equation

$$2n^2 = m^2.$$
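The counting of 2's in this argument can be spot-checked numerically. The sketch below is illustrative only: the helper twos_in is a hypothetical name, and a finite search over small m and n is of course no substitute for the proof.

```python
def twos_in(k):
    """Number of factors of 2 in the prime factorization of the positive integer k."""
    count = 0
    while k % 2 == 0:
        k //= 2
        count += 1
    return count

for m in range(1, 200):
    for n in range(1, 200):
        assert twos_in(m * m) % 2 == 0      # m^2 contains an even number of 2's
        assert twos_in(2 * n * n) % 2 == 1  # 2n^2 contains an odd number of 2's
        assert m * m != 2 * n * n           # so the two can never be equal

print(twos_in(12 * 12), twos_in(2 * 12 * 12))  # prints 4 5
```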
We have seen that if the term $2n^2$ is factored into primes an odd number of 2's must appear in that factorization, while if $m^2$ is factored into primes this factorization must contain an even number of 2's. Hence the natural number with the two names $2n^2$ and $m^2$ has two different prime factorizations, one with an even number of 2's and one with an odd number of 2's. This is in contradiction to the Fundamental Theorem of Arithmetic, since that theorem guarantees a unique prime factorization of each natural number. This contradiction means that our original supposition, that the equation has a rational solution, must be false, and hence no rational number can solve the above equation.

In other words, the square root of 2 is irrational, our term for a real number (whatever that is) that is not rational. Since there then exist lengths which are not rational numbers (or, if you prefer, there are polynomials such as $p(x) = x^2 - 2$ which have no rational zeroes) it thus seems reasonable that there do exist numbers in addition to the rational numbers, numbers yet to be discovered. (If you prefer, you may say that such numbers have yet to be constructed. It depends on whether you think a mathematician is an explorer or an inventor.)

An alternative formulation of the above situation is this: If S is the set of all positive rational numbers with squares no less than 2; that is,

$$S = \{\, r \in Q^+ \mid r^2 \ge 2 \,\}$$

(where $Q^+$ denotes the set of all positive rationals), then the set S contains no smallest element. The method of showing this will be outlined in one of the exercises at the end of this section.

But you have undoubtedly approximated $\sqrt{2}$ by a sequence of numbers such as

1.4, 1.41, 1.414, 1.4142, 1.41421, ....

And you have likely carried out the above decimal approximations sufficiently far to obtain the necessary accuracy in the context of the problem you are working. It is not difficult to find what such a sequence is tending to, if there is some regularity or periodicity in the terms of the sequence. For example, the sequence

0.3, 0.33, 0.333, 0.3333, 0.33333, ...

can easily be seen to be tending toward the number whose decimal expansion is

0.333 333 333 ... ,

and the "value" of this decimal can be calculated using some elementary knowledge about geometric series. The above nonterminating decimal is actually just an abbreviation for

$$\frac{3}{10} + \frac{3}{100} + \frac{3}{1000} + \cdots,$$

which is a geometric series with ratio 1/10 (since each term is 1/10 the previous) and first term 3/10. If the ratio is between -1 and 1, the sum of such a series is given by the formula

$$\frac{a}{1 - r},$$

where a is the first term of the series and r is its ratio; in the case of the series for the decimal 0.333 333 333 ... , we find its value to be

$$\frac{3/10}{1 - 1/10} = \frac{1}{3}.$$
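The formula a/(1 - r) can be compared with the partial sums of the series using exact rational arithmetic. This is a minimal sketch; the function name partial_sum is an assumption made for illustration.

```python
from fractions import Fraction

def partial_sum(a, r, terms):
    """Sum of the first `terms` terms of the geometric series a + ar + ar^2 + ..."""
    total = Fraction(0)
    term = Fraction(a)
    for _ in range(terms):
        total += term
        term *= r
    return total

a, r = Fraction(3, 10), Fraction(1, 10)
limit = a / (1 - r)

print(limit)                      # prints 1/3
print(partial_sum(a, r, 3))       # prints 333/1000, the decimal 0.333
print(limit - partial_sum(a, r, 3))  # prints 1/3000
```

Each additional term divides the remaining gap by 10, which is why the partial sums 0.3, 0.33, 0.333, ... tend toward 1/3 without ever reaching it.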
Since no periodicity in the decimal expansion

1.414 213 562 ...

is apparent, such techniques cannot be applied, so the question naturally arises at this point as to what the sequence

1.4, 1.41, 1.414, 1.4142, 1.41421, ...

has as its "value" toward which it is tending; that is, what is the "limit" of the above sequence of rational numbers. Thus, the construction of the real numbers can be thought of as giving a meaning to every possible decimal expansion, and interpreting such a decimal expansion as measuring some length.

Our next theorem may give some insight into the structure of the rational number system; it almost says that the rational numbers are sufficient for approximation of any quantity to any desired degree of accuracy.

Theorem 10.2 Let r and s be rational numbers with r < s. Then there are infinitely many rational numbers between r and s.

Proof Choose integers m and n such that 0 < m < n. Then

$$0 < \frac{m}{n} < 1.$$

Since r < s, s - r is positive. Also s - r is rational, as you will be asked to establish in the exercises, and so we have

$$0 < \frac{m}{n}(s - r) < s - r,$$

and hence

$$r < r + \frac{m}{n}(s - r) < s.$$

Since m/n and s - r are rational, so is their product, and so is the sum

$$r + \frac{m}{n}(s - r).$$

Again, the details are left for the exercises. But there are infinitely many choices of integers m and n such that 0 < m < n, and m and n can be chosen so as to give infinitely many different values of m/n, and thus infinitely many different values of

$$r + \frac{m}{n}(s - r),$$

all of which lie between r and s. This establishes the theorem.
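The construction in the proof is easy to carry out with exact rational arithmetic. In this sketch the function name between and the sample values of r and s are assumptions chosen for illustration; taking m = 1 and letting n vary already gives as many distinct rationals as desired.

```python
from fractions import Fraction

def between(r, s, m, n):
    """The rational r + (m/n)(s - r), for integers 0 < m < n and rationals r < s."""
    return r + Fraction(m, n) * (s - r)

r, s = Fraction(7, 5), Fraction(3, 2)

# m = 1 and n = 2, 3, ..., 11 give ten different values of m/n.
points = {between(r, s, 1, n) for n in range(2, 12)}

print(between(r, s, 1, 2))             # prints 29/20
print(len(points))                     # prints 10
print(all(r < p < s for p in points))  # prints True
```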
[Fig. 10.2: a number line; surviving labels are -(3/2), 7/4, and 3.]
By virtue of this theorem, it would appear that if the rational numbers were indicated as points on an unbounded straight line, located according to their values as indicated in Fig. 10.2, there could be no gaps in this line. But we have seen that $\sqrt{2}$ is not a rational number, yet it could be located on this line by a method such as that shown in Fig. 10.2. So the rational number system does, after all, contain gaps, which we intend to fill with numbers we will eventually call the irrational real numbers.
Exercises
10.1 Let a, b, c, and d be rational numbers. Show that the equation

$$ax + b = cx + d$$

has either no solution or a rational solution.

10.2 Let a and b be rational numbers. Show that the numbers

$$a + b, \qquad a - b, \qquad a \cdot b$$

are rational, and that a/b is rational if $b \ne 0$.

10.3 The number $\pi$ represents a length. What length? It also represents an area. What area?

10.4 The value of $\pi$ accurate to 30 decimal places is

$$\pi = 3.14159\ 26535\ 89793\ 23846\ 26433\ 83279.$$

Can this information be used to show that $\pi$ is not a rational number? How?
10.5 Can the decimal expansion of $\pi$ be used to show that
$$\pi \ne \frac{22}{7}?$$
10.6 Give a better rational approximation to $\pi$ than 22/7.
10.7 For those who have studied Chapter 3. Use continued fractions and the information in Exercise 10.4 to find the "next better" rational approximation to $\pi$.
10.8 Show that $\sqrt{3}$ is irrational; that is, that the equation $x^2 = 3$ has no rational solution x. Hint: Use the technique of Theorem 10.1.
10.9 Give the reasons for each step in the following alternative proof that $\sqrt{2}$ is not rational.
Suppose that $\sqrt{2}$ is rational. Then $1 < \sqrt{2} < 2$ (why?). So there exist natural numbers m and n, with $n \ne 1$, such that
$$\sqrt{2} = \frac{m}{n}$$
(why?) and the fraction m/n is in lowest terms. Hence
$$2 = \frac{m^2}{n^2},$$
and so
$$2n^2 = m^2.$$
Hence $m^2$ is even (why?), so m is even (why?). So m = 2r, where r is a natural number (why?). So $2n^2 = 4r^2$ (why?), thus
$$n^2 = 2r^2$$
(why?). So $n^2$ is even (why?), and so n is even (why?). This is a contradiction (to what?). Therefore $\sqrt{2}$ is not rational.
10.10 For those who have studied Chapter 3. The continued fraction expansion of $\sqrt{2}$ is given by
$$\sqrt{2} = (1; 2, 2, 2, 2, \ldots),$$
as seen in Section 3.3. How can this fact be used to prove that $\sqrt{2}$ is irrational?
10.11 If the techniques of Theorem 10.1 are used in a (vain) attempt to prove that $\sqrt{4}$ is not rational, no contradiction is reached. Why not?
10.12 Prove that if p is prime, then $\sqrt{p}$ is not rational.
10.13 Exactly which natural numbers n have the property that $\sqrt{n}$ is not rational?
10.14 Here is an outline of how to show that the set
$$S = \{r \in Q^+ \mid r^2 > 2\}$$
contains no smallest element. Fill in the details. First, if $r \in S$ then $r^2 > 2$ (why?). Suppose, then, that r is the smallest element of S. Let
$$s = r - \frac{r^2 - 2}{2r}.$$
Then 0 < s (why?), and $s^2 > 2$ (why?). Hence $s \in S$, since s is rational (why?). But s < r (why?). This is a contradiction (to what?), and hence S contains no smallest element (why?).
10.15 Express 0.888 888 888 ... as a rational number; that is, in the form m/n, where m and n are integers and $n \ne 0$.
10.16 Express 0.327 327 327 ... as a rational number.
10.17 Express 0.999 999 999 ... as a rational number.
10.18 Prove that $\sqrt{6}$ is not a rational number.
10.19 Prove that $\sqrt{2} + \sqrt{3}$ is not a rational number. Hint: Begin much as in the proof of Theorem 10.1. Then use Exercise 10.18.
10.20 Prove that $\sqrt{2} + \sqrt{3}$ is not a solution of any equation of the form
$$ax^2 + bx + c = 0,$$
where a, b, and c are rational numbers. Hint: There is no loss of generality in supposing that a = 1. Why not? This exercise together with the previous one shows that there are irrational numbers that are not solutions to any quadratic equation with rational coefficients.
10.21 Prove that no rational number is a solution of the equation
$$x^3 = 2.$$
10.22 Prove that the equation
has no rational solution x.
10.23 Prove that if $\alpha$ is not a rational number and r is a rational number, then $\alpha + r$ is not a rational number.
10.24 Prove that if $\alpha$ is not a rational number and r is a rational number other than 0, then $\alpha \cdot r$ is not a rational number.
10.25 If a, b, and c are integers with $a \ne 0$, and the equation $ax^2 + bx + c = 0$ has two rational solutions, what can be said about the relationship between the numbers a, b, and c?
10.2 NESTED INTERVALS OF RATIONAL NUMBERS
Let a and b be rational numbers with a < b. By the interval [a, b] we mean the set
$$[a, b] = \{x \in Q \mid a \le x \le b\}.$$
That is, [a, b] consists of all rational numbers between a and b, including a and b. As we have seen in Theorem 10.2, each such set contains infinitely many rational numbers. If [a, b] is given, then we denote its length by $\lambda([a, b])$, which is given by the formula
$$\lambda([a, b]) = b - a.$$
Thus if I is an interval of rational numbers, then $\lambda(I)$ is a positive rational number, and if $J \subset I$ and J is an interval of rational numbers, then
$$\lambda(J) \le \lambda(I).$$
In fact, if J is a proper subset of I, then $\lambda(J) < \lambda(I)$, a fact whose proof is outlined in the next set of exercises. We shall be concerned with sequences of such intervals of the form
$$I_1, I_2, I_3, \ldots,$$
where each interval contains the one immediately after it in the above sequence, and where the sequence of numbers
$$\lambda(I_1), \lambda(I_2), \lambda(I_3), \ldots$$
is approaching zero. We need a precise definition of this last concept, preceded only by a definition for convenience.
If r is a rational number, then we denote by |r| the absolute value of r, and |r| has the value
$$|r| = r \ \text{ if } r \ge 0, \qquad |r| = -r \ \text{ if } r < 0.$$
Thus if r is rational, |r| is nonnegative, and just measures the distance from r to 0, when r is thought of as located in its natural position on an unbounded straight line. Let
$$s_1, s_2, s_3, \ldots$$
be a sequence of rational numbers, one for each natural number n. The sequence $\{s_n\}$ is said to approach 0, or have limit 0, provided that given any positive number $\varepsilon$, no matter how small, there exists a natural number k such that, for all natural numbers n > k,
$$|s_n| < \varepsilon.$$
This definition does not mean merely that as n increases, the numbers get closer and closer to zero.

Example 10.1 For each n, let $s_n = 1 + (1/n)$. Thus we have
$$2,\ 3/2,\ 4/3,\ 5/4,\ 6/5,\ 7/6,\ \ldots$$
as indicated in Fig. 10.3. As n increases, the numbers $s_n$ are indeed getting closer and closer to zero. But the sequence $\{s_n\}$ is not approaching zero. To see this, let $\varepsilon = 1/4$. No matter what value of k is chosen, if n > k then
$$|s_n| = 1 + \frac{1}{n} > 1,$$
and hence $|s_n|$ is not less than $1/4 = \varepsilon$. Hence it does not have zero for a limit, although this sequence is getting closer and closer to zero.

The definition of the limit of a sequence also does not mean that the terms of the sequence must get steadily closer and closer to the limit, but only that they tend to get closer "on the average." For consider the next example:

Example 10.2 For each n, let
$$s_n = \frac{1}{n} \ \text{ if n is odd}, \qquad s_n = \frac{1}{2^n} \ \text{ if n is even}.$$
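The two example sequences can be tabulated and tested against the definition of limit zero with exact fractions. This is a sketch under stated assumptions: the helper `stays_below` and its `horizon` parameter are hypothetical names, and checking finitely many terms can only illustrate the definition, never prove that a limit is zero.

```python
from fractions import Fraction

def s_example_1(n):
    """Example 10.1: s_n = 1 + 1/n."""
    return 1 + Fraction(1, n)

def s_example_2(n):
    """Example 10.2: s_n = 1/n for odd n, 1/2^n for even n."""
    return Fraction(1, n) if n % 2 == 1 else Fraction(1, 2 ** n)

def stays_below(seq, eps, k, horizon):
    """Check |s_n| < eps for k < n <= horizon (a finite spot check only)."""
    return all(abs(seq(n)) < eps for n in range(k + 1, horizon + 1))

eps = Fraction(1, 4)
print([str(s_example_1(n)) for n in range(1, 7)])
print([str(s_example_2(n)) for n in range(1, 8)])
# Example 10.1 never gets below 1/4, however large k is taken.
print(stays_below(s_example_1, eps, k=1000, horizon=2000))
# Example 10.2 stays below 1/4 once n > 5.
print(stays_below(s_example_2, eps, k=5, horizon=2000))
```

The second printed list also shows that the terms of Example 10.2 rise and fall rather than decreasing steadily.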
Fig. 10.3 The sequence gets closer to 0 but does not have limit 0.
Fig. 10.4 The sequence has limit 0 though it does not steadily get closer to 0.
Thus we have the sequence
$$1,\ \frac{1}{4},\ \frac{1}{3},\ \frac{1}{16},\ \frac{1}{5},\ \frac{1}{64},\ \frac{1}{7},\ \ldots$$
as indicated in Fig. 10.4. The terms of this sequence do not steadily get closer and closer to zero; however, the sequence does have limit zero. Here is one proof of this fact; note how we follow exactly the pattern of the definition. Let $\varepsilon > 0$ be given. Since to this point, only rational numbers "exist," $\varepsilon$ itself is rational, and thus has the form
$$\varepsilon = \frac{a}{b},$$
where a and b are natural numbers. Let k = b + 1. Suppose that n > k. For n odd,
$$|s_n| = \frac{1}{n},$$
and for n even,
$$|s_n| = \frac{1}{2^n} < \frac{1}{n}.$$
Hence, if n > k, then
$$|s_n| \le \frac{1}{n} < \frac{1}{k} < \frac{1}{b} \le \frac{a}{b} = \varepsilon.$$
That is, given $\varepsilon > 0$, we have shown the existence of a natural number k such that for each n > k,
$$|s_n| < \varepsilon.$$
Therefore, by definition, the given sequence $\{s_n\}$ has limit zero. An alternative way of putting the definition of the sequence $\{s_n\}$ having limit zero is this: No matter how small an interval of the form $(-\varepsilon, \varepsilon)$ is chosen around zero, from some point on all the terms of the sequence $\{s_n\}$ lie within that interval. This phenomenon is illustrated in Fig. 10.5.

We shall, in essence, construct the real numbers by locating their position using sequences of closed intervals of rational numbers. A typical sequence for locating $\sqrt{2}$ would be
$$I_1 = [1, 2],\quad I_2 = [1.4, 1.5],\quad I_3 = [1.41, 1.42],\quad I_4 = [1.414, 1.415],\quad I_5 = [1.4142, 1.4143],\ \ldots$$
Fig. 10.5 All terms of the sequence from $s_{24}$ on lie in the interval $(-\varepsilon, \varepsilon)$.
Fig. 10.6 Locating $\sqrt{2}$ with a sequence of closed intervals of rational numbers.
These intervals are shown in Fig. 10.6. The behavior of these intervals is such that only the number $\sqrt{2}$ can belong to all of them. However, since $\sqrt{2}$ has not yet been "constructed," we cannot speak of the one number belonging to all of these intervals; indeed, in our context of the rational numbers and only the rational numbers, there is no number which belongs to all the above intervals. Hence what we will do is simply say that the number $\sqrt{2}$ is the sequence $\{I_n\}$. This is a subtle and ingenious idea, and the way it succeeds is aesthetically very pleasing. What is necessary to make this idea succeed is the behavior of the above sequence of intervals. The significant characteristics we need are those given below, which you can check in the above example.
a) For each n, $I_{n+1}$ is a proper subset of $I_n$.
b) The sequence $\{\lambda(I_n)\}$ has limit zero.
We abstract such behavior to obtain our next definition. A nested sequence of closed intervals of rational numbers is a sequence
$$I_1, I_2, I_3, \ldots$$
of closed intervals of rational numbers such that
a) for each n, $I_{n+1}$ is a proper subset of $I_n$; and
b) the sequence $\{\lambda(I_n)\}$ has limit zero.

Here is an outline of the construction of the real number system; the details will be forthcoming. First, we define what it means for two sequences of nested intervals to be the same. (It should not be surprising that two different-looking such sequences could be considered the same; after all, the rational number 1/2 has many different symbols, such as 2/4, 3/6, 4/8, .... In the same way, different-looking sequences of nested intervals will be considered as different names for the same real number.) We then define a real number to be a sequence of nested intervals of rational numbers. (Technically, here, a real number will be the set of all "same" sequences of nested intervals of rational numbers.) We show that every rational number is, in this sense, a real number, so that $Q \subset R$, where R denotes the set of all real numbers. Next, we define addition, multiplication, and the order relation for R, and show that these are the same as those already extant in Q. We also show that all the familiar axioms concerning these operations hold true in R. Finally, we show that when the elements of R are thought of as points on an unbounded straight line, located according to their value, then there are no gaps in this line; in particular, $\sqrt{2} \in R$. And we show that if we repeat the above process, using sequences of nested intervals of real numbers, no new numbers are obtained.
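Conditions (a) and (b) can be spot-checked on intervals of the kind used above to locate $\sqrt{2}$, obtained by truncating its decimal expansion. The sketch below is illustrative only: the function names are assumptions, and the integer square root is used merely as a convenient way to find the decimal digits without ever forming $\sqrt{2}$ itself.

```python
from fractions import Fraction
import math

def length(interval):
    """Length b - a of the closed interval [a, b]."""
    a, b = interval
    return b - a

def is_proper_subinterval(inner, outer):
    """True if inner = [c, d] is a proper subset of outer = [a, b]."""
    (c, d), (a, b) = inner, outer
    return a <= c and d <= b and (a, b) != (c, d)

def sqrt2_decimal_interval(n):
    """Interval [k/10^(n-1), (k+1)/10^(n-1)] whose endpoints have squares
    on either side of 2; n = 1 gives [1, 2], n = 2 gives [1.4, 1.5]."""
    scale = 10 ** (n - 1)
    # isqrt(2 * scale^2) is the largest integer k with (k/scale)^2 <= 2.
    k = math.isqrt(2 * scale * scale)
    return (Fraction(k, scale), Fraction(k + 1, scale))

intervals = [sqrt2_decimal_interval(n) for n in range(1, 9)]
nested = all(is_proper_subinterval(intervals[i + 1], intervals[i])
             for i in range(len(intervals) - 1))
print(nested)
print([str(length(iv)) for iv in intervals])
```

The printed lengths are 1, 1/10, 1/100, and so on, which is the behavior condition (b) asks for; a finite list can of course only suggest, not establish, that the lengths have limit zero.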
Exercises
10.26 If r is a rational number and $\{r_n\}$ is a sequence of rational numbers, how would you define the statement "The sequence $\{r_n\}$ has limit r"?
10.27 Let $\{s_n\}$ be the sequence 1, -(1/2), 1/3, -(1/4), 1/5, .... Prove that $\{s_n\}$ has limit zero.
10.28 Suppose that I and J are closed intervals of rational numbers, and J is a proper subset of I. Fill in the details of the following outline of a proof that $\lambda(J) < \lambda(I)$. Let I = [a, b] and J = [c, d]. Since $J \subset I$, $a \le c$ and $d \le b$ (why?). Suppose that a = c. Then we must have d < b (why?). Similarly, if d = b, then a < c. Hence either a < c or d < b. Now $\lambda(J) = d - c$ and $\lambda(I) = b - a$ (why?). If $\lambda(J) \ge \lambda(I)$, we obtain a contradiction (how, and to what?). Therefore $\lambda(J) < \lambda(I)$.
10.29 Remove as many absolute value signs as possible from each of the expressions below, without changing the value of each expression.
a) |19|
b) |-7|
c) |0|
d) |x|
e) |-x|
f) ||x||
10.30 Prove that the sequence 1, 1/4, 1/9, 1/16, 1/25, 1/36, ... has limit zero.
10.31 What is the limit of the sequence in Example 10.1? Use your answer to Exercise 10.26 to prove that your answer to this exercise is correct.
10.32 What is the limit of the sequence 1, 0, 1, 0, 1, 0, 1, 0, ... ? Prove that your answer is correct.
10.33 Let r be a rational number such that |r| < 1. Prove that the sequence
$$r,\ r^2,\ r^3,\ r^4,\ \ldots$$
has limit zero.
10.34 Let a and r be rational numbers with $r \ne 1$, and n a natural number. Show that
$$a + ar + ar^2 + ar^3 + \cdots + ar^n = \frac{a(1 - r^{n+1})}{1 - r}.$$
10.35 If the sum of the infinite geometric series
$$a + ar + ar^2 + ar^3 + \cdots$$
is defined to be the limit of the sequence
$$a,\quad a + ar,\quad a + ar + ar^2,\quad a + ar + ar^2 + ar^3,\quad a + ar + ar^2 + ar^3 + ar^4,\ \ldots,$$
prove that the sum of the infinite series above is then given by the formula
$$\frac{a}{1 - r},$$
if |r| < 1. Hint: Use the two previous exercises. Now evaluate the sum of the infinite series
$$1/2 + 1/4 + 1/8 + 1/16 + 1/32 + \cdots.$$
10.3 CONSTRUCTION OF THE REAL NUMBERS
Let $\zeta$ and $\eta$ each be a sequence of nested intervals of rational numbers. Then
$$\zeta = \{I_1, I_2, I_3, \ldots\} \quad\text{and}\quad \eta = \{J_1, J_2, J_3, \ldots\},$$
where each $I_n$ and each $J_n$ is a closed interval of rational numbers. We say that $\zeta$ and $\eta$ are equivalent, and write $\zeta = \eta$, provided that, for every combination of natural numbers m and n, there is a point common to $I_m$ and $J_n$. In other words, each interval in $\zeta$ overlaps each interval in $\eta$. This is just a way of saying that the two sequences $\zeta$ and $\eta$ are "zeroing in" on the same point. This point will be the eventual location of the real number for which, by virtue of the above definition, $\zeta$ and $\eta$ are two names. For if some interval $I_m$ is disjoint from some interval $J_n$, then all the intervals after $I_m$ in $\zeta$ will be disjoint from all intervals after $J_n$ in $\eta$, and there will be a positive distance between the point "defined" by $\zeta$ and the point "defined" by $\eta$.

As an illustration of the above definition, suppose that
$$I_n = \left[1 - \frac{1}{n},\ 1 + \frac{1}{n}\right] \quad\text{and}\quad J_n = \left[1 - \frac{1}{n},\ 1\right]$$
for each natural number n. If $\zeta = \{I_n\}$ and $\eta = \{J_n\}$, then $\zeta = \eta$; the reason is that every interval $I_n$ as well as every interval $J_n$ contains the number 1, and hence for every combination of natural numbers m and n, $I_m$ overlaps $J_n$ in at least the number 1. Incidentally, we also have in the above example an illustration of the way in which a rational number can be thought of as a real number; namely, the number 1 is the only number common to all the intervals $I_n$ and the only number common to all the intervals $J_n$. So in this case both $\zeta$ and $\eta$ are new names for the familiar rational number 1.

Fig. 10.7 A contradiction is reached because all the intervals $J_t$ have length at least s - r.

We should now show that this "equality" (which we have temporarily called an equivalence in the above definition) has the properties that equality ought to have.
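Before turning to those properties, the illustration with the two sequences shrinking to 1 can be spot-checked by machine. This is a minimal sketch: `equivalent_up_to` and the third sequence `K` are hypothetical, introduced only for contrast, and testing finitely many pairs (m, n) can refute equivalence but cannot establish it.

```python
from fractions import Fraction

def overlap(i, j):
    """True if the closed intervals i = [a, b] and j = [c, d] share a point."""
    (a, b), (c, d) = i, j
    return max(a, c) <= min(b, d)

def I(n):
    """I_n = [1 - 1/n, 1 + 1/n], as in the illustration."""
    return (1 - Fraction(1, n), 1 + Fraction(1, n))

def J(n):
    """J_n = [1 - 1/n, 1], as in the illustration."""
    return (1 - Fraction(1, n), Fraction(1))

def K(n):
    """Hypothetical contrast: K_n = [2 - 1/n, 2 + 1/n], shrinking to 2."""
    return (2 - Fraction(1, n), 2 + Fraction(1, n))

def equivalent_up_to(seq1, seq2, limit):
    """Finite spot check: seq1(m) meets seq2(n) for all 1 <= m, n <= limit."""
    return all(overlap(seq1(m), seq2(n))
               for m in range(1, limit + 1)
               for n in range(1, limit + 1))

print(equivalent_up_to(I, J, 50))   # every pair tested overlaps
print(equivalent_up_to(I, K, 50))   # I_3 and K_3 are already disjoint
```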
Theorem 10.3 Let $\zeta$, $\eta$, and $\theta$ be sequences of nested intervals of rational numbers. Then
a) $\zeta = \zeta$;
b) if $\zeta = \eta$, then $\eta = \zeta$;
c) if $\zeta = \eta$ and $\eta = \theta$, then $\zeta = \theta$.
Proof Parts (a) and (b) are obvious. To establish part (c), suppose by way of contradiction that $\zeta \ne \theta$. Then some interval $I_m$ in $\zeta$ is disjoint from some interval $K_n$ in $\theta$. Since $I_m$ and $K_n$ are intervals, we can suppose that each number in $I_m$ is less than each number in $K_n$, as indicated in Fig. 10.7. In particular, the right-hand endpoint r of $I_m$ is less than the left-hand endpoint s of $K_n$. Since $\zeta = \eta$ and $\eta = \theta$, it is not hard to see that each interval $J_t$ in the sequence $\eta$ must contain both r and s, since each interval $J_t$ must intersect both $I_m$ and $K_n$. Since r < s, s - r is a positive number, and hence $\lambda(J_t) \ge s - r$ for all natural numbers t, as the interval [r, s] is a subset of $J_t$ for each natural number t. Hence the sequence $\{\lambda(J_t)\}$ cannot have limit zero. This contradicts the fact that $\eta$ is a sequence of nested intervals of rational numbers, as by definition the corresponding sequence of lengths must have limit zero. This contradiction shows that $\zeta = \theta$, and establishes the theorem.

We define a real number to be a collection consisting of all the sequences of nested intervals of rational numbers that are equivalent to a given one. This justifies the use of the symbol of equality in the definition of equivalent sequences, for then two real numbers are equal if and only if they are represented by equivalent sequences of nested intervals. We will use the symbol R to stand for the set of all real numbers.

In order to define addition on the set R, we need first to define addition for sequences of nested intervals of rational numbers. Let $\zeta$ and $\eta$ be two such sequences. Then
$$\zeta = \{I_1, I_2, I_3, \ldots\} \quad\text{and}\quad \eta = \{J_1, J_2, J_3, \ldots\}.$$
We define $\zeta + \eta$ to be the sequence
$$\{I_1 + J_1,\ I_2 + J_2,\ I_3 + J_3,\ \ldots\},$$
where, for each natural number n,
$$I_n + J_n = \{a + b \mid a \in I_n \text{ and } b \in J_n\}.$$
Thus in order to "add" two real numbers, we choose any sequences of nested intervals representing those two real numbers; we add these sequences by adding corresponding intervals, and intervals are added by adding each number in one to each number in the other. This makes sense, since in the latter case we are just adding rational numbers together, an operation already taken for granted as defined. There is no difficulty in calculating the result when two closed intervals of rational numbers are added in the above way. For example, if
$$I = [2, 3] \quad\text{and}\quad J = [4, 7]$$
are two such intervals, then
$$I + J = \{a + b \mid a \in I \text{ and } b \in J\},$$
and this must be the interval [6, 10]. For if $a \in I$ and $b \in J$, then
$$2 \le a \le 3 \quad\text{and}\quad 4 \le b \le 7,$$
and hence
$$6 \le a + b \le 10.$$
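The worked example can be mirrored in a few lines of exact arithmetic. The sketch below assumes that the sum of [a, b] and [c, d] is the interval [a + c, b + d], which is what the example suggests; the helper `split`, which looks for one way of writing a given number as a sum of a member of each interval, is a hypothetical name introduced here.

```python
from fractions import Fraction

def add_intervals(i, j):
    """Sum of closed intervals [a, b] + [c, d], taken to be [a + c, b + d]."""
    (a, b), (c, d) = i, j
    return (a + c, b + d)

def split(c, i, j):
    """Write c as a + b with a in i and b in j; return None if impossible."""
    (a_lo, a_hi), (b_lo, b_hi) = i, j
    a = min(a_hi, c - b_lo)   # make b as small as allowed, then cap a
    b = c - a
    if a_lo <= a <= a_hi and b_lo <= b <= b_hi:
        return a, b
    return None

I = (Fraction(2), Fraction(3))
J = (Fraction(4), Fraction(7))
print(add_intervals(I, J))
print(split(Fraction(13, 2), I, J))   # a number between 6 and 7
print(split(Fraction(11), I, J))      # a number outside [6, 10]
```

For a number c between 6 and 7 the helper picks a = c - 4 and b = 4; numbers outside [6, 10] admit no such decomposition.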
Moreover, if $c \in [6, 10]$, then
$$6 \le c \le 10.$$
In order to show that there exist numbers $a \in I$ and $b \in J$ such that c = a + b, we must deal with cases. If $6 \le c \le 7$, we let
$$a = c - 4 \quad\text{and}\quad b = c - a.$$
Then a and b are rational numbers, a + b = c, and moreover, since $6 \le c \le 7$, then
$$2 \le a \le 3,$$
so that $a \in I$. And
$$b = c - a = c - (c - 4) = 4,$$
so that $b \in J$. The other cases are handled similarly. Thus, indeed, [2, 3] + [4, 7] = [6, 10]. In general, one can prove the following theorem:
Theorem 10.4 If I and J are closed intervals of rational numbers, then so is I + J.

But curiously enough, our definition of the sum $\zeta + \eta$ of two real numbers needs a justification theorem before it becomes a valid definition. The reason is that in order to add the real numbers $\zeta$ and $\eta$, one selects just one of many possible sequences of nested intervals to represent $\zeta$ and just one of many possible sequences of nested intervals to represent $\eta$. There are other choices; we may have $\zeta$ represented by both
$$\{I_1, I_2, I_3, \ldots\} \quad\text{and}\quad \{X_1, X_2, X_3, \ldots\},$$
and $\eta$ too may be represented by both
$$\{J_1, J_2, J_3, \ldots\} \quad\text{and}\quad \{Y_1, Y_2, Y_3, \ldots\}.$$
Although $\{I_n\}$ and $\{X_n\}$ are equivalent, they need not be identical; similar remarks hold for $\{J_n\}$ and $\{Y_n\}$. Thus you should not expect the two sequences
$$\{I_n + J_n\} \quad\text{and}\quad \{X_n + Y_n\}$$
to be identical; the problem is, however, that they may not even be equivalent. Since they are both supposed to determine the same sum $\zeta + \eta$, the two sequence sums above should be equivalent, or else there is an unacceptable ambiguity in our definition of real number addition.

This problem can best be illustrated by an attempt to define a method of combining rational numbers other than addition or multiplication. Suppose, for each two rational numbers r and s, we define r # s as follows: Represent each of r and s as quotients of whole numbers (with nonzero denominator); thus
$$r = \frac{m}{n} \quad\text{and}\quad s = \frac{a}{b},$$
where m, n, a, and b are integers, and neither n nor b is zero. Then r # s is to have the value
$$\frac{m + a}{n + b}.$$
This fails to be a "valid" operation, in that the definition of r # s is ambiguous: If r = 1/2 and s = 2/5, then
$$r \mathbin{\#} s = (1/2) \mathbin{\#} (2/5) = 3/7.$$
But there are alternative representations of r and s as fractions; for example, we could write r as 2/4 and s as 6/15. Then r # s = (2/4) # (6/15) = 8/19. Since $3/7 \ne 8/19$, we see that the result of combining rational numbers with the operation # gives a result dependent on the numeral used to represent each rational number, rather than on the actual number itself. This ambiguity is just what we need to show cannot happen in the case of our definition of addition for real numbers. To do so, it suffices to establish the next theorem.

Theorem 10.5 Let $\zeta$ and $\eta$ be real numbers, and let $\zeta$ be represented by the two sequences of nested intervals $\{I_n\}$ and $\{X_n\}$ and let $\eta$ be represented by the two sequences of nested intervals $\{J_n\}$ and $\{Y_n\}$. Then the two sequences of nested intervals $\{I_n + J_n\}$ and $\{X_n + Y_n\}$ are equivalent, and hence give rise to the same real number $\zeta + \eta$.

Proof Suppose that we have the conditions as given in the hypotheses of this theorem. And also suppose, by way of contradiction, that the sequence $\{I_n + J_n\}$ represents the real number $\gamma$, the sequence $\{X_n + Y_n\}$ represents the real number $\delta$, and that $\gamma \ne \delta$.

Because of the last condition, the two sequences $\{I_n + J_n\}$ and $\{X_n + Y_n\}$ are not equivalent, so that there must be some interval of the form $I_k + J_k$ disjoint from some interval of the form $X_m + Y_m$. We suppose without loss of generality that m > k, and then, since $I_m + J_m \subset I_k + J_k$, it follows that $I_m + J_m$ and $X_m + Y_m$ are also disjoint. But there does exist at least one rational number a belonging to both $I_m$ and $X_m$, since $\{I_n\}$ and $\{X_n\}$ are equivalent sequences of nested intervals. Similarly, there does exist a rational number b belonging to both $J_m$ and $Y_m$. Hence the rational number a + b belongs to both $I_m + J_m$ and $X_m + Y_m$. This contradiction establishes the theorem.

Of course, we have not yet established that if each of $\{I_n\}$ and $\{J_n\}$ is a nested sequence of closed intervals of rational numbers, then so is the sequence $\{I_n + J_n\}$; but the proof of this is very straightforward and is outlined in the next set of exercises.

For convenience, we next establish that each rational number "is" a real number; that is, that each rational number can be thought of as a collection of equivalent sequences of nested intervals of rational numbers.

Theorem 10.6 Let r be a rational number. Then r is a real number.

Proof For each natural number n, let
$$I_n = [r - (1/n),\ r + (1/n)].$$
Then, clearly, $\{I_n\}$ is a nested sequence of closed intervals of rational numbers. It is easy to see that if $\{J_n\}$ is another such sequence, then the latter is equivalent to the sequence $\{I_n\}$ if and only if every interval $J_n$ contains the number r. Moreover, if so, then r is the only number common to all the intervals in the sequence $\{J_n\}$ because of the condition that the sequence $\{\lambda(J_n)\}$ has limit zero. Hence the rational number r is represented by a sequence of nested intervals of the necessary sort, and consequently is a real number.

We have two ways of adding rational numbers. If r and s are rational, we can add them with our already given method of rational number addition, or we can think of r and s as real numbers, and add them using the sequences of nested intervals. The two results turn out to be the same, and so our new definition of real number addition turns out to be the same as the old rational number addition when both methods apply to a pair of numbers. Also, addition is commutative and associative, the rational number 0 (when thought of as a real number) is the identity for this operation, and each real number $\zeta$ has an additive inverse, which we of course denote by $-\zeta$. The details of these assertions can be found in the exercises.

An exactly similar procedure can be used to develop the idea of a multiplication for real numbers, and exactly analogous results follow: multiplication is commutative and associative and distributes over addition; the rational number 1 is the multiplicative identity; each nonzero real number has a multiplicative inverse. Since the multiplication defined for R is the same as that given on Q when both can be applied, we have thus constructed a natural extension of the rational number system $(Q, \cdot, +)$ to the real number system $(R, \cdot, +)$; and we can think of the former as a subsystem of the latter, for not only is it true that $Q \subset R$, but the operations are the same on Q in either case. There remains only the question of how the order relation on Q is to be extended to R; we take up this topic in the next section.

Exercises
10.36 To prove Theorem 10.3, it is necessary to know that if $\{s_n\}$ is a sequence of positive rational numbers each larger than the fixed positive rational number a, then $\{s_n\}$ cannot have limit zero. Please prove this. Hint: Let $\varepsilon = a/2$.
10.37 Supply the details of the proof that if I and J each are closed intervals of rational numbers, then so is I + J. Hint: If I = [a, b] and J = [c, d], what are the end points of the interval I + J? What if some of the numbers involved are negative?
10.38 Suppose that each of $\{I_n\}$ and $\{J_n\}$ is a sequence of nested intervals of rational numbers. By the previous exercise, the sequence $\{I_n + J_n\}$ is indeed a sequence of closed intervals of rational numbers, but is it a nested sequence? Fill in the details of the following outline of a proof that $\{I_n + J_n\}$ is nested. First, for each natural number n,
$$I_{n+1} + J_{n+1} \subset I_n + J_n$$
(Why?). Second, for each natural number n,
$$\lambda(I_n + J_n) = \lambda(I_n) + \lambda(J_n)$$
(Why?). It follows that for each natural number n, not only is $I_{n+1} + J_{n+1}$ a proper subset of $I_n + J_n$, but also that the sequence $\{\lambda(I_n + J_n)\}$ has limit zero. (Why, to both.) Therefore $\{I_n + J_n\}$ is a nested sequence of closed intervals of rational numbers.
10.39 Given that $1 < \sqrt{2} < 2$, how would you construct a sequence of nested intervals of rational numbers representing the real (and irrational) number $\sqrt{2}$ without using decimals?
10.40 Prove that if $\zeta$ and $\eta$ are real numbers, then $\zeta + \eta = \eta + \zeta$. Hint: Let $\zeta$ be represented by the nested sequence $\{I_n\}$ and $\eta$ by the nested sequence $\{J_n\}$. Show that $I_n + J_n = J_n + I_n$ for each natural number n. It follows (why?) that $\zeta + \eta = \eta + \zeta$.
10.41 Prove that if $\zeta$, $\eta$, and $\theta$ are real numbers, then $\zeta + (\eta + \theta) = (\zeta + \eta) + \theta$. Hint: Do this much like the previous exercise.
10.42 Prove that the real number 0 is an additive identity; that is, that $0 + \zeta = \zeta$ for each real number $\zeta$.
10.43 Suppose that the real number $\zeta$ is represented by the sequence $\{I_n\}$ of nested intervals of rational numbers. In terms of this sequence, what is a sequence representing $-\zeta$? Show that $-\zeta + \zeta = 0$ for all real numbers $\zeta$.
10.44 Show that if $\zeta$ is a real number such that some representation of $\zeta$ by a sequence $\{I_n\}$ contains an interval $I_k$ of positive rational numbers only, then there exists a representation of $\zeta$ by a sequence $\{J_n\}$ in which every interval contains only positive rational numbers. Show that a similar result holds in the "negative" case.
10.45 Show how to define the product of two real numbers $\zeta$ and $\eta$ in terms of products of sequences of nested closed intervals of rational numbers.
10.46 Prove that if $\zeta$ is a real number, then
$$0 \cdot \zeta = 0 \quad\text{and}\quad 1 \cdot \zeta = \zeta.$$
10.47 Show that multiplication of real numbers is both commutative and associative.
10.48 Show that multiplication of real numbers distributes over addition; that is, for any three real numbers $\zeta$, $\eta$, and $\theta$,
$$\zeta \cdot (\eta + \theta) = \zeta \cdot \eta + \zeta \cdot \theta.$$
10.49 Show that each real number other than zero has a multiplicative inverse. Hint: Use Exercise 10.44.
10.50 Show that the multiplication for real numbers that you defined in Exercise 10.45 coincides with ordinary rational number multiplication when both can be applied to two numbers.

10.4 THE ORDER RELATION ON R
Let $\zeta$ and $\eta$ be real numbers, with representations $\{I_n\}$ and $\{J_n\}$, respectively, by sequences of nested closed intervals of rational numbers. If there exist a rational number r and a natural number n such that each number in $I_n$ is less than r and each number in $J_n$ exceeds r, then we say that $\zeta < \eta$.
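The definition asks for a witness: an index n and a rational r separating $I_n$ from $J_n$. A search for such a witness can be sketched as follows. The two sample sequences (one shrinking to 1, one built from decimal truncations of $\sqrt{2}$) and all function names are assumptions chosen for illustration; failing to find a witness within the search limit proves nothing.

```python
from fractions import Fraction
import math

def around_one(n):
    """Hypothetical representative of 1: the interval [1 - 1/n, 1 + 1/n]."""
    return (1 - Fraction(1, n), 1 + Fraction(1, n))

def around_sqrt2(n):
    """Hypothetical representative of sqrt(2): decimal truncation intervals."""
    scale = 10 ** (n - 1)
    k = math.isqrt(2 * scale * scale)   # largest k with (k/scale)^2 <= 2
    return (Fraction(k, scale), Fraction(k + 1, scale))

def less_than_witness(seq1, seq2, limit):
    """Search n <= limit for a rational r with every number of seq1(n)
    below r and every number of seq2(n) above r; return (n, r) or None."""
    for n in range(1, limit + 1):
        (_, b), (c, _) = seq1(n), seq2(n)
        if b < c:
            return n, (b + c) / 2   # midpoint of the gap between them
    return None

print(less_than_witness(around_one, around_sqrt2, 20))
print(less_than_witness(around_sqrt2, around_one, 20))
```

With these sample sequences a witness appears at n = 3, where [2/3, 4/3] lies entirely to the left of [1.41, 1.42]; the search in the opposite direction finds nothing.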
Theorem 10.7 If a and b are rational numbers and a < b in the order relation already given on Q, then a < b in the order relation on R defined above, and conversely.

Theorem 10.8 If $\zeta$ and $\eta$ are real numbers, then exactly one of the three relations
$$\zeta < \eta, \qquad \zeta = \eta, \qquad \eta < \zeta$$
is true.

Theorem 10.9 If $\zeta$, $\eta$, and $\theta$ are real numbers such that $\zeta < \eta$ and $\eta < \theta$, then $\zeta < \theta$.
Note: The notation $\zeta \le \eta$ means that either $\zeta < \eta$ or $\zeta = \eta$ is true. (By virtue of Theorem 10.8, at most one can be true.)

Theorem 10.10 If $\zeta$ is a real number and $0 < \zeta$, then there exists a natural number n such that
$$\frac{1}{n} < \zeta.$$

Theorem 10.11 (Archimedean Property of R) If $\varepsilon$ is a real number with $0 < \varepsilon$, and $\gamma$ is a real number with $0 < \gamma$, then no matter how small $\varepsilon$ is and no matter how large $\gamma$ is, there exists a natural number n such that
$$\gamma < n \cdot \varepsilon.$$

Let S be a nonempty set of real numbers. The number b is said to be an upper bound for S provided that $x \le b$ for all $x \in S$. If c is an upper bound for S such that $c \le b$ whenever b is also an upper bound for S, then c is called a least upper bound for S. The next theorem really says that the real number line contains no "gaps."

Theorem 10.12 (Least Upper Bound Theorem) If S is a nonempty set of real numbers with an upper bound, then S has a least upper bound.
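For rational inputs, the natural numbers promised by Theorems 10.10 and 10.11 can be computed outright. The sketch below handles only positive rationals, which is an assumption narrower than the theorems (they concern arbitrary positive real numbers); the function names are hypothetical.

```python
from fractions import Fraction
import math

def archimedean_n(eps, gamma):
    """Smallest natural number n with gamma < n * eps (positive rationals)."""
    eps, gamma = Fraction(eps), Fraction(gamma)
    if eps <= 0 or gamma <= 0:
        raise ValueError("both numbers must be positive")
    # floor(gamma/eps) * eps <= gamma, so one more copy of eps exceeds gamma.
    return math.floor(gamma / eps) + 1

def reciprocal_below(zeta):
    """Smallest natural number n with 1/n < zeta, via 1 < n * zeta."""
    return archimedean_n(zeta, 1)

print(archimedean_n(Fraction(1, 1000), 50))
print(reciprocal_below(Fraction(3, 7)))
```

The second function shows how the statement of Theorem 10.10 for a rational number follows from the Archimedean property with $\gamma = 1$.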
The proofs of the above six theorems are outlined in the next set of exercises. Using Theorem 10.12, we can prove that $\sqrt{2}$ is a real number; that is, that there exists a real number $\gamma$ such that $\gamma^2 = 2$. Here is an alternative method. The method is to construct two sequences of positive rational numbers of the form
$$a_1, a_2, a_3, \ldots \quad\text{and}\quad b_1, b_2, b_3, \ldots$$
such that, for every natural number n, $(a_n)^2 < 2$, $(b_n)^2 > 2$, $a_n < a_{n+1}$, $b_{n+1} < b_n$, and such that the sequence $\{|b_n - a_n|\}$ is approaching zero. This sequence looks much like that illustrated in Fig. 10.8, arranged so that
$$a_1 < a_2 < a_3 < \cdots < \sqrt{2} < \cdots < b_3 < b_2 < b_1$$
in actuality, although we have not yet shown the existence of $\sqrt{2}$. Since $\{|b_n - a_n|\}$ has limit zero, the sequence of intervals $\{I_n\} = \{[a_n, b_n]\}$ will be a nested sequence of closed intervals of rational numbers, thus representing some real number $\gamma$. It is then natural to expect that $\gamma^2 = 2$, and this will be proved.
Fig. 10.8 Illustrating the proof that $\sqrt{2}$ is a real number.
The first problem is the construction of the two sequences $\{a_n\}$ and $\{b_n\}$. We begin by letting $a_1 = 1$ and $b_1 = 2$. Then we average:
$$\frac{a_1 + b_1}{2} = \frac{3}{2}.$$
This average is a positive rational number whose square is either less than 2, or greater than 2. If less, we call it $a_2$; if greater, we call it $b_2$. In this case, it turns out to be $b_2$. Now that we have $a_1$, $b_1$, and $b_2$, we average the last $a_n$ constructed with the last $b_n$ constructed, and obtain
$$\frac{a_1 + b_2}{2} = \frac{5}{4}.$$
Again this average is a positive rational number; if its square is less than 2, we call it $a_2$; if greater, we call it $b_3$. We continue this process. In general, if $a_n$ and $b_m$ are the last numbers constructed in each sequence, we form the average
$$\frac{a_n + b_m}{2}$$
and call this number $a_{n+1}$ if its square is less than 2; we call it $b_{m+1}$ if its square is greater than 2.
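The averaging process just described is easy to carry out with exact fractions. The following is a minimal sketch; the function name `construct` and the choice to return the two lists are assumptions made for illustration.

```python
from fractions import Fraction

def construct(steps):
    """Run the averaging construction `steps` times, starting from
    a_1 = 1 and b_1 = 2; return the list of a's and the list of b's."""
    a_list, b_list = [Fraction(1)], [Fraction(2)]
    for _ in range(steps):
        mid = (a_list[-1] + b_list[-1]) / 2
        if mid * mid < 2:
            a_list.append(mid)      # square below 2: a new a
        else:
            # square above 2: a new b (the square of a rational is never 2)
            b_list.append(mid)
    return a_list, b_list

a_list, b_list = construct(4)
print([str(a) for a in a_list])
print([str(b) for b in b_list])

# After 20 steps the last a and the last b are very close together.
a20, b20 = construct(20)
print(b20[-1] - a20[-1])
```

The first four steps produce 3/2, 5/4, 11/8, and 23/16 in turn, and each step halves the distance between the last a and the last b.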
It is not hard to see that $\{a_n\}$ and $\{b_n\}$ are sequences of positive rational numbers so arranged that
$$a_1 < a_2 < a_3 < \cdots < b_3 < b_2 < b_1$$
and so that, for each natural number n,
$$(a_n)^2 < 2 < (b_n)^2.$$
The only real problem is in showing that there are actually infinitely many a's and infinitely many b's. But suppose, for example, that there were but finitely many b's. Then there must be infinitely many a's, and we would have
$$a_1 < a_2 < a_3 < \cdots < b_k < \cdots < b_2 < b_1,$$
where $b_k$ is the last $b_n$ constructible by this process. At each stage of the construction, the distance between the last a and the last b constructed is half what it was at the previous stage; for example, we have
$$|b_1 - a_1| = 1, \quad |b_2 - a_1| = 1/2, \quad |b_2 - a_2| = 1/4,$$
and so on. Hence the sequence $\{b_k - a_n\}$ is approaching zero (n is the variable in this sequence; k is fixed). But then, the only rational number belonging to all the intervals of the form $[a_n, b_k]$ is $b_k$ itself. However, by Exercise 10.14, $b_k$ cannot be the smallest rational number whose square exceeds 2. So there is a smaller rational number, say c, whose square exceeds 2. By construction, the square of each $a_n$ is less than 2, and the square of $b_k$ exceeds 2. Hence the number c must belong to each interval of the form $[a_n, b_k]$. This is a contradiction, since the only number in all such intervals is $b_k$ itself and $c \ne b_k$. The case in which there are but finitely many a's is handled similarly, and again, with this contradiction, we can conclude that there must be infinitely many terms in both the sequences $\{a_n\}$ and $\{b_n\}$.

Consequently, $\{[a_n, b_n]\}$ is a sequence of nested closed intervals of rational numbers, and thus represents some real number which we denote by $\gamma$. The object now is to prove that $\gamma^2 = 2$. But this is quite easy, for the sequence of nested intervals
$$\{[(a_n)^2, (b_n)^2]\}$$
not only represents the real number $\gamma^2$, but also has the property, again by construction, that it represents the number 2; for $(a_n)^2 < 2 < (b_n)^2$ for each natural number n. Therefore $\gamma^2 = 2$, and hence $\sqrt{2}$ exists.
Exercises
10.51 Part of the proof of Theorem 10.7 is presented below; please supply the details. Suppose that $a$ and $b$ are rational numbers and $a < b$ in the order relation on $Q$. First construct a rational number $r$ such that $a < r < b$. Represent $a$ by the sequence $\{A_n\}$ and $b$ by the sequence $\{B_n\}$ of nested intervals of rational numbers. Show that, for some sufficiently large value of $n$, each number in $A_n$ is less than $r$ and each number in $B_n$ exceeds $r$. It may be helpful to do this last step by contradiction. Remember that $\{\lambda(A_n)\}$ and $\{\lambda(B_n)\}$ are sequences approaching zero. The conclusion is then that $a < b$ in the order relation on $R$.
10.52 The other part of the proof of Theorem 10.7 is presented below; please supply the details. Suppose that $a$ and $b$ are rational numbers such that $a < b$ in the order relation on $R$. Show why there must exist a rational number $r$ such that $a < r$ and $r < b$ in the order on $Q$. Conclude that $a < b$ in the order on $Q$.
10.53 An outline of the proof of Theorem 10.8 is presented below; please supply the details. Suppose first that $\xi < \eta$ and $\xi = \eta$ are both true. Represent $\xi$ by a sequence of nested intervals $\{I_n\}$ and $\eta$ by a sequence of nested intervals $\{J_n\}$. Since $\xi = \eta$, each $I_n$ must intersect each $J_m$. But since $\xi < \eta$, there exists a natural number $n$ and a rational number $r$ such that each number in $I_n$ is less than $r$ and each number in $J_n$ is greater than $r$. Why does this lead to a contradiction? By similar treatment of other cases, conclude that at most one of the three relations
$$\xi < \eta, \qquad \xi = \eta, \qquad \eta < \xi$$
can be true. Suppose that neither of the last two relations above is true, and show that the first one must be true; since $\xi \neq \eta$, there must be a natural number $n$ so large that $I_n$ and $J_n$ are disjoint. Show how to construct a rational number $r$ such that each number in $I_n$ is less than $r$ and each number in $J_n$ exceeds $r$. This shows that $\xi < \eta$, and thus establishes Theorem 10.8.
10.54 Prove Theorem 10.9. Hint: Use techniques similar to those in the above exercise, and use the fact that Theorem 10.9 does hold for rational numbers.
10.55 Prove Theorem 10.10. Hint: Use Exercise 10.44.
10.56 Prove Theorem 10.11. Hint: Use Theorem 10.10.
10.57 Prove Theorem 10.12. Hint: Let $S$ be a nonempty set of real numbers with an upper bound. Let
$$A = \{x \in R \mid x > s \text{ for all } s \in S\}$$
and
$$B = \{x \in R \mid x \leq s \text{ for some } s \in S\}.$$
Then $A$ and $B$ are nonempty, $A \cup B = R$, and $S \subset B$. Moreover, each number in $B$ is less than each number in $A$. Follow the technique used after the statement of Theorem 10.12, in proving that $\sqrt{2}$ exists, to construct two sequences, one drawn from $A$ and one from $B$. Construct a sequence of nested intervals using the terms of these sequences for endpoints. If your construction is much like that in showing that $\sqrt{2}$ exists, you should obtain a sequence of nested intervals representing a real number $\xi$ which can be shown to be the least upper bound of $B$. Then, with a little care, $\xi$ can also be shown to be the least upper bound of $S$.
10.58 From this point on, we consider intervals to be intervals of real numbers. That is,
$$[a, b] = \{x \in R \mid a \leq x \leq b\},$$
$$(a, b] = \{x \in R \mid a < x \leq b\},$$
$$[a, b) = \{x \in R \mid a \leq x < b\},$$
and
$$(a, b) = \{x \in R \mid a < x < b\}.$$
Give three different upper bounds for the set $[1, 2)$.
10.59 Prove that if a set of real numbers has a least upper bound, it has only one.
10.60 What is the least upper bound of the set $[1, 2)$? What is the least upper bound of the set $(1, 2]$?
10.5 ARE THERE MORE NUMBERS?
First let us try to answer the question above by showing how every length can be represented by a real number. Such a length can be represented as a length measured from 0 to some point on the unbounded straight line on which the rational numbers can be thought of as already located according to their values. If for some reason the length should be thought of as negative (perhaps as representing a velocity, charge, or other signed physical quantity) we shall nevertheless suppose that it has been measured off in the positive direction; for if we can show that some real number $\xi$ measures the positive length, then $-\xi$ will measure the negative length. So, in essence, the problem is to show that each point to the right of 0 on the line is the location of some real number $\xi$ already constructed. Let $P$
be such a point. There is certainly at least one rational number $r$ to the right of $P$, and $-1$ is a rational number to the left of $P$. Hence the set
$$S = \{x \in Q \mid x \text{ is to the left of } P\}$$
is a nonempty set of real numbers with an upper bound. Let $\xi$ be the least upper bound of $S$. If $\xi$ is to the left of $P$, then by the construction of $\xi$ there must exist some closed interval $I$ of rational numbers with $\xi \in I$ and $P$ to the right of each number in $I$. In particular, the right-hand endpoint $b$ of $I$ is a rational number to the left of $P$. So $b \in S$. But $\xi < b$; this is a contradiction, since $\xi$ is the least upper bound of $S$. Similarly, if $\xi$ is to the right of $P$, a contradiction is obtained. Hence the point $P$ is the location of the real number $\xi$. This establishes that each point on the line is the location of some real number $\xi$; moreover, this is the natural location of $\xi$ because this point is to the right of every rational number less than $\xi$ and to the left of every rational number greater than $\xi$.
We now show how each real number can be located in a natural position on the unbounded straight line of rational numbers. This will be easiest to see with an example; we show how to locate the number
$$\pi = 3.14159\ 26535\ \ldots.$$
We will not actually use this decimal expansion in order to locate the position of $\pi$, but only to make it clear what intervals of rational numbers are to be constructed. Given $\pi$, let $n$ be the greatest integer such that $n \leq \pi$, and let $I_1 = [n, n + 1]$. In the case of $\pi$, we obtain the interval $I_1 = [3, 4]$. Next, let $n_1$ be the greatest integer such that
$$n + \frac{n_1}{10} \leq \pi,$$
and let $I_2$ be the interval
$$\left[\, n + \frac{n_1}{10},\ n + \frac{n_1 + 1}{10} \,\right].$$
In this case, using $\pi$, we obtain $I_2 = [3.1, 3.2]$. We continue this process, obtaining the sequence $[3, 4]$, $[3.1, 3.2]$, $[3.14, 3.15]$, $[3.141, 3.142]$, .... The above process will work without the necessity of knowing the decimal expansion of $\pi$ in advance; in fact, this process is just what we use to define the decimal expansion of each real number; the decimal expansion is just the
limit of the sequence of left-hand endpoints of the above intervals (with a minor exception to be dealt with in the exercises). In any case, we think of the closed intervals we have just obtained as closed intervals of rational numbers. Since each has length 1/10 that of the previous interval and each is contained in the previous interval, we have a sequence of nested intervals of rational numbers, which "is" the real number $\pi$. The location of $\pi$ is then the one point on the number line lying in all the above intervals; it is clear that there can be at most one such point, and it can be shown using some techniques of topology that at least one such point must exist. This is a natural location for $\pi$, since we locate $\pi$ to the right of every rational number less than $\pi$ and to the left of every rational number greater than $\pi$.
This construction provides us with a bonus, as we have noted. We have developed a decimal expansion for each real number, and the rules of arithmetic by which these decimals can be added, multiplied, and so on, can be developed as well. This construction partly answers the question of the title of this section. There are not more real numbers; that is, since we have established a one-to-one correspondence between $R$ and all the points on an unbounded straight line, any physical quantity which can be interpreted as a length can be measured exactly by one and only one real number. On the other hand, even using real numbers the simple equation
$$x^2 + 1 = 0$$
has no solution $x \in R$; this problem is discussed in Exercise 10.65.
Exercises
10.61 Suppose that $\zeta$ is a real number in the interval $[n, n + 1]$ where $n$ is an integer. Prove that there exists a nonnegative integer $m$ such that
$$\zeta \in \left[\, n + \frac{m}{10},\ n + \frac{m + 1}{10} \,\right].$$
What is the maximum possible value of $m$?
10.62 Suppose in the construction of a sequence of nested intervals of rational numbers, as we did for the number $\pi$, the number $\zeta$ for which the sequence is constructed lies at the right-hand end of each interval. For example, suppose that $\zeta$ is the number 1. Then
$$1 \in [0, 1], \quad 1 \in [0.9, 1], \quad 1 \in [0.99, 1], \quad 1 \in [0.999, 1], \quad \ldots$$
Has the above construction been carried out in the same way as was indicated for the number $\pi$? Since 1 lies in each of the above intervals, it would seem reasonable that a decimal expansion for 1 might be given by 0.999 999.... Is this correct? See Exercise 10.17.
10.63 What real numbers have two different decimal expansions? Hint: See Exercise 10.62.
10.64 A question of some theoretical interest is this: If we were to repeat the construction of this chapter, beginning with $R$ rather than with $Q$, would any new numbers be obtained? The answer is that none would, and the reason is the truth of Theorem 10.12, the Least Upper Bound Theorem. Let $\{I_n\}$ be a nested sequence of closed intervals of real numbers. Each interval $I_n$ is of the form $[a_n, b_n]$, where $a_n$ and $b_n$ are real numbers and $a_n < b_n$. Let $\xi$ be the least upper bound of the set $\{a_n \mid n \in N\}$, and show that $\xi$ is the only real number that belongs to all the intervals $I_n$. Since in this alternative development, $\xi$ is represented by the sequence $\{I_n\}$, this sequence produces only a number that is already a real number. Fill in the details of this argument.
10.65 In order to construct a solution to the equation
$$x^2 + 1 = 0,$$
a procedure can be used much simpler than the construction of $R$. Let
$$C = \{(a, b) \mid a \in R \text{ and } b \in R\}.$$
For $(a, b)$ and $(c, d)$ in $C$, let
$$(a, b) + (c, d) = (a + c,\ b + d)$$
and
$$(a, b) \cdot (c, d) = (ac - bd,\ ad + bc).$$
Show that addition and multiplication are closed, commutative, and associative; that $(0, 0)$ is an additive identity and that $(1, 0)$ is a multiplicative identity; and that $(a, b)$ has an additive inverse and, if $(a, b) \neq (0, 0)$, then $(a, b)$ has a multiplicative inverse. Then show that if the real number $a$ is identified with the number $(a, 0) \in C$, the operations of addition and multiplication with respect to $C$ are the same as in $R$. This shows that $C$, the complex number system, is a natural extension of the real number system, and that $R$ can be thought of as a subsystem of $C$. Finally, show that the equation
$$x^2 + 1 = 0$$
has a solution in $C$.
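The pair arithmetic of Exercise 10.65 is easy to experiment with before attempting the proofs. The sketch below uses ordinary Python integers in place of real numbers, and the function names `add` and `mul` are our own; it offers a few spot checks only and is no substitute for the general arguments the exercise asks for.

```python
def add(p, q):
    """Sum of two pairs: (a, b) + (c, d) = (a + c, b + d)."""
    (a, b), (c, d) = p, q
    return (a + c, b + d)

def mul(p, q):
    """Product of two pairs: (a, b) . (c, d) = (ac - bd, ad + bc)."""
    (a, b), (c, d) = p, q
    return (a * c - b * d, a * d + b * c)

# Pairs of the form (a, 0) behave like the real numbers a.
print(mul((2, 0), (3, 0)))

# One candidate to test in the equation x^2 + 1 = 0 (an illustrative choice).
x = (0, 1)
print(add(mul(x, x), (1, 0)))
```

Trying other pairs in place of `x` shows how rarely the second printed value is the additive identity.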
10.6 AN UNUSUAL SET OF REAL NUMBERS
If we used only the digits 0, 1, and 2 for counting, we would be counting in the so-called ternary system, as shown below:
Decimal Numeral    Ternary Numeral
0                  0
1                  1
2                  2
3                  10
4                  11
5                  12
6                  20
7                  21
8                  22
9                  100
10                 101
11                 102
12                 110
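The table can be regenerated mechanically by repeated division by 3, collecting remainders. This is a minimal sketch; the function name `to_ternary` is ours.

```python
def to_ternary(n):
    """Return the ternary numeral (as a string) for a nonnegative integer n."""
    if n == 0:
        return "0"
    digits = []
    while n > 0:
        n, r = divmod(n, 3)   # r is the next digit, read from right to left
        digits.append(str(r))
    return "".join(reversed(digits))

# Reproduce the two columns of the table.
for n in range(13):
    print(n, to_ternary(n))
```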
A fraction written with the numeral 1/4 in the decimal system would then be written 1/11 in the ternary system, and so on. The development of the real number system from the rationals could be carried out exactly as before, and only the "decimal" expansions would look any different. The ternary "decimal" for the above fraction could be computed by division:
        0.02020202...
   11 ) 1.00000000...
          22
           100
            22
             100
              22
               100
                22
                 1
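The long division above amounts to multiplying the remainder by 3 at each step and recording the whole-number part. The following sketch does this with exact fractions; the function name `expansion_digits` and its parameters are our own choices, and it assumes $0 \leq x < 1$.

```python
from fractions import Fraction

def expansion_digits(x, base, count):
    """First `count` digits after the point of x (0 <= x < 1) in the given base."""
    digits = []
    for _ in range(count):
        x *= base
        d = int(x)      # whole-number part is the next digit
        digits.append(d)
        x -= d          # keep only the fractional part
    return digits

print(expansion_digits(Fraction(1, 4), 3, 8))  # [0, 2, 0, 2, 0, 2, 0, 2]
```

Note that this procedure always produces the terminating expansion when there are two; for 1/3 in base 3 it yields the digits 1, 0, 0, ... rather than 0, 2, 2, ....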
As in the case of ordinary decimal representations of real numbers, some ternary "decimals" may differ yet represent the same real number. For example, 0.022222... can be "evaluated" using the formula for the sum of a geometric series (Exercise 10.35), as follows:
$$0.022222\ldots = 0 + \frac{0}{3} + \frac{2}{9} + \frac{2}{27} + \frac{2}{81} + \cdots.$$
This is a geometric series with "first" term 2/9 and ratio 1/3, hence its sum is
$$\frac{2/9}{1 - 1/3} = \frac{1}{3},$$
or, in ternary notation, 0.022222... can be written as 1/10 or 0.100000....
Here is an example of a very unusual set of real numbers, known as the Cantor Ternary Set. Let $K$ be those real numbers in the interval $[0, 1]$ not requiring use of the digit 1 in their ternary expansion. Thus the number 1/3 belongs to $K$, since although 1/3 does have a ternary expansion 0.100000... in which the digit 1 is used, it also has a ternary expansion 0.022222... in which the digit 1 is not required. If $1/3 < x < 2/3$, $x$ does require the use of the digit 1 in its ternary expansion, since the ternary expansion of such a number must use the digit 1 in the first place after the decimal point. So $K \subset [0, 1/3] \cup [2/3, 1]$. Also, if $1/9 < x < 2/9$ or if $7/9 < x < 8/9$, it is necessary to use the digit 1 in the second place after the decimal point in representing $x$ as a ternary decimal, and hence
$$K \subset [0, 1/9] \cup [2/9, 1/3] \cup [2/3, 7/9] \cup [8/9, 1].$$
If this process of elimination is continued, it can be seen that $K$ is that subset of $[0, 1]$ that remains after the middle third (except for endpoints) of $[0, 1]$ is deleted, then the middle third of each of the two resulting intervals is deleted, then the middle third of each of the resulting four intervals is deleted, and so on. This process is shown in Fig. 10.9. In this deletion process, once a point becomes an "endpoint" of $K$, it must remain in $K$ in spite of all subsequent deletions; thus, for example, $K$ contains the infinite set $\{1, 1/3, 1/9, 1/27, 1/81, \ldots\}$.
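The deletion process can be carried out stage by stage with exact fractions. In the sketch below the names `cantor_stage` and `survives` are ours; a point belongs to $K$ only if it survives every stage, so a finite check like this one is suggestive rather than conclusive.

```python
from fractions import Fraction

def cantor_stage(k):
    """Closed intervals remaining after k rounds of middle-third deletion."""
    intervals = [(Fraction(0), Fraction(1))]
    for _ in range(k):
        next_intervals = []
        for left, right in intervals:
            third = (right - left) / 3
            next_intervals.append((left, left + third))    # keep the left third
            next_intervals.append((right - third, right))  # keep the right third
        intervals = next_intervals
    return intervals

def survives(x, k):
    """True if x lies in one of the intervals remaining after k rounds."""
    return any(left <= x <= right for left, right in cantor_stage(k))

# Print the intervals that remain after two rounds of deletion.
print(cantor_stage(2))
print(survives(Fraction(1, 4), 8), survives(Fraction(1, 2), 1))
```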
Fig. 10.9 First four stages in the construction of the Cantor Ternary Set.
Clearly, each endpoint of $K$ has denominator a power of 3; however, $K$ contains other points as well, such as 1/4, since 1/4 has the ternary decimal 0.0202020.... Now let us calculate the "length" of the set $K$. This should be $1 - \mu$, where $\mu$ is the total length of all the deleted intervals. But then
$$\mu = \frac{1}{3} + \frac{2}{9} + \frac{4}{27} + \frac{8}{81} + \cdots.$$
This series is geometric, with first term 1/3 and ratio 2/3, so its sum gives us the value of $\mu$:
$$\mu = \frac{1/3}{1 - 2/3} = 1.$$
Since the length of $K$ is $1 - \mu$, $K$ has length zero.
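The partial sums of this series can be computed exactly, which makes the approach to 1 visible. This is a small sketch; the function name `deleted_length` is ours.

```python
from fractions import Fraction

def deleted_length(stages):
    """Total length removed after the given number of deletion stages."""
    total = Fraction(0)
    for k in range(stages):
        # Stage k + 1 deletes 2**k intervals, each of length 1 / 3**(k + 1).
        total += Fraction(2 ** k, 3 ** (k + 1))
    return total

# Print the removed length and the length still remaining.
for n in (1, 2, 3, 4, 10):
    removed = deleted_length(n)
    print(n, removed, 1 - removed)
```

After $n$ stages the remaining length is exactly $(2/3)^n$, which approaches zero as $n$ grows.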
Let $f$ be a function defined on $K$ and real-valued, operating according to the following rule: Given $x$ in $K$, express $x$ in ternary decimal form without using the digit 1. Replace each 2 in this ternary expansion by the digit 1. Interpret the resulting numeral as the binary decimal numeral for a real number $r$. Then $f(x) = r$. For example, given $1/4 \in K$, we proceed as follows to find $f(1/4)$. The ternary decimal 0.020202... represents 1/4. Convert this to 0.010101.... The latter is the binary decimal expansion for
$$\frac{0}{2} + \frac{1}{4} + \frac{0}{8} + \frac{1}{16} + \cdots.$$
This series is geometric, and its sum is 1/3. Hence $f(1/4) = 1/3$. For another example, given $1 \in K$, we find $f(1)$ as follows. The ternary decimal 0.22222... represents 1. Convert this to 0.11111.... Sum the series $1/2 + 1/4 + 1/8 + 1/16 + \cdots$. The sum is 1. Hence $f(1) = 1$.
Now each number in $[0, 1]$ is a value of the function $f$. For, given $z \in [0, 1]$, $z$ has a binary decimal representation; each digit in this decimal can be doubled, obtaining a numeral which can be thought of as the ternary decimal of some real number. This ternary decimal contains no 1's, and has the form 0.?????..., hence it represents a number $x \in K$. It should be clear that $f(x) = z$. Thus every number in $[0, 1]$ is a value of $f$. Since $f$ is a function, it cannot have more values than the number of elements of $K$; but since $K \subset [0, 1]$, $K$ cannot contain more numbers than are in the set $[0, 1]$. Hence $K$ and $[0, 1]$ contain the same number of points. But $K$ has length zero. This is what is unusual about the set $K$.
Exercises
10.66 How do you count in the binary system, using only the digits 0 and 1?
10.67 Give the binary and ternary decimals for the numbers 1/2, 2/9, and 3/7.
10.68 Show that the Cantor Ternary Set contains infinitely many points not endpoints.
10.69 Evaluate $f(1/3)$ and $f(2/3)$ for the function $f$ constructed in this section.
10.70 Suppose instead of the middle third, the middle fifth is deleted from $[0, 1]$, the middle fifth next deleted from each of the two resulting intervals, and so on. What is the length of the resulting set?
NOTES AND REFERENCES
W. Rudin's Principles of Mathematical Analysis (second edition, McGraw-Hill, 1964) and M. J. Mansfield's Intermediate Real Analysis (Prindle, Weber, and Schmidt, 1969) develop the real numbers from the rationals using Dedekind cuts. R. L. Wilder's Introduction to the Foundations of Mathematics (Wiley, 1952) gives, in addition, the development of the integers from the natural numbers, then the development of the rational numbers from the integers. B. Kripke's Introduction to Analysis (Freeman, 1968) gives some further topics in the study of the real number system, and contains a valuable first chapter on the approach to the study of mathematics. C. Goffman's Real Functions (Rinehart, 1953), D. A. Sprecher's Elements of Real Analysis (Academic Press, 1970), and R. R. Stoll's Set Theory and Logic (Freeman, 1963) develop the real numbers from the rationals using equivalent convergent sequences of rational numbers.
EPILOGUE
Mathematics can be thought of as being divided into several branches. The branches are listed below, and we have indicated which chapters of this book fall into each branch. In addition, the listing gives supplementary readings on related topics. Some references duplicate those previously given at the ends of the chapters. The level of difficulty of these books is quite variable, but most of them would be appropriate for students who have the equivalent of an undergraduate major in mathematics, while some of the books are suitable for college freshmen.
Algebra. A very special kind of modern algebra is commonly taught in high school. Chapter 4 (on group theory) and some of the material in Chapter 2 belong in this category. Birkhoff and MacLane's A Survey of Modern Algebra (revised edition, Macmillan, 1953) covers many of the topics commonly thought of as "modern algebra," and has been used as a junior-level textbook.
Number Theory. Perhaps only geometry antedates this very old branch of mathematics. Of course, it is closely allied to algebra, but since about 1900 techniques of analysis have been very fruitful in producing advances in number theory. Niven and Zuckerman's An Introduction to the Theory of Numbers (Wiley, 1960) is frequently used as a junior- or senior-level textbook. Beiler's Recreations in the Theory of Numbers (Dover, 1964) is written for the layman with some familiarity with elementary mathematics, and is a delightful book. Chapter 7 belongs in this category.
Analysis. Freshman calculus, calculus of several variables, and differential equations form the backbone of the mathematical education of students majoring in the physical sciences. These topics form the beginnings of analysis, which, together with its daughter, applied mathematics, has produced most of the visible effects of mathematics in our culture. (For example, almost all the mathematical problems involved in the flight plans of space explorations belong in this category.)
The use of continued fractions in Chapter 3 is an example of an application of a topic in analysis; of course, the differential equations of Chapter 8 are solved using techniques of analysis. The material on convex sets in Chapter 9 is really geometry, but convex sets have found their widest applications in analysis; and of course, Chapter 10 might best be described as an introduction to the foundations of analysis.
If you wish to study more mathematics of this sort, H. S. Wall's Creative Mathematics (Texas, 1963) is an unusual book: a bright freshman with a great deal of persistence can learn a great deal of calculus on his own with the aid of this book. Bers' Calculus (Holt, Rinehart, and Winston, 1969) is one excellent recent text, as is Spivak's Calculus (Benjamin, 1967).
Geometry. Chapter 1 on the Bolyai-Gerwin Theorem, Chapter 5 on polyhedra, and Chapter 9 on convex sets are all highly geometric in content. It is apparent that there is a great deal more to modern geometry than Euclid's Elements. Some interesting references are Hadwiger, Debrunner, and Klee's Combinatorial Geometry in the Plane (Holt, Rinehart, and Winston, 1964), Coxeter's Introduction to Geometry (Wiley, 1961), and Hilbert and Cohn-Vossen's Geometry and the Imagination (Chelsea, 1952).
Logic and Foundations. Chapter 6, about infinite sets, deals with part of set theory and the latter is usually included as a part of foundations. Many people would classify the material in Chapter 10 as belonging in foundations rather than in analysis. Stoll's Set Theory and Logic (Freeman, 1963) can be used as a senior-level textbook.
Topology. Some of the material in Chapter 5 belongs in this branch of mathematics, but only the second chapter, on Brunnian links, is really mostly topological. Hocking and Young's Topology (Addison-Wesley, 1961) and Alexandroff's Elementary Concepts of Topology (Dover, 1960) are introductory but not elementary.
Applied Mathematics. Perhaps probability and statistics belong in this category; perhaps they should receive separate listings. In any case, Chapter 8 on animal populations is an example of an application of mathematics. So are the topics treated in most physics books.
Somewhere in the category of applied mathematics belong such new branches of mathematics as game theory, queueing theory, and others, each of which is quite likely to have a profound effect on our lives and cultures, perhaps in the very near future. With respect to game theory, Williams' The Compleat Strategyst (McGraw-Hill, 1954) is written for the layman, and is a very entertaining book.
Two references dealing with problems in mathematics, mostly unsolved, are given below:
Dorrie, H., 100 Great Problems of Elementary Mathematics (Dover, 1965, translated by David Antin).
Tietze, H., Famous Problems of Mathematics (Graylock, 1965).
That mathematics has far-reaching and surprisingly diverse applications can be seen merely by examination of the two titles below:
Jakobson, R., Structure of Language and its Mathematical Aspects (Proceedings of the Twelfth Symposium in Applied Mathematics, American Mathematical Society, 1961).
Lomont, J. S., Applications of Finite Groups (Academic Press, 1959).
Finally, here is a list of more general books, some with intent similar to this one, some not even intended as textbooks, but all appropriate for the educated layman:
Aleksandrov, A. D., Kolmogorov, A. N., and Lavrent'ev, M. A., Mathematics: Its Content, Methods, and Meaning (M.I.T. Press, 1963, translated by Gould, Bartha, and Hirsch).
Beck, A., Bleicher, M., and Crowe, D., Excursions into Mathematics (Worth, 1969).
Crowdis, D. G. and Wheeler, B. W., Introduction to Mathematical Ideas (McGraw-Hill, 1969).
Fraleigh, J. B., Mainstreams of Mathematics (Addison-Wesley, 1969).
Polya, G., Mathematics and Plausible Reasoning (Princeton, 1954).
Stein, S., Mathematics: The Man-Made Universe (second edition, Freeman, 1969).
Wilder, R. L., Evolution of Mathematical Concepts (Wiley, 1968).
Wilder, R. L., Introduction to the Foundations of Mathematics (Wiley, 1952).
Of course, many fine books have been omitted from the above listing; but the bibliographies that appear in many of those listed will serve as a guide to additional reading.
GREEK ALPHABET
Capital  Small  Name       Pronunciation
Α        α      Alpha      ăl'fə
Β        β      Beta       bā'tə, bē'tə
Γ        γ      Gamma      găm'ə
Δ        δ      Delta      dĕl'tə
Ε        ε      Epsilon    ĕp'sə-lŏn'
Ζ        ζ      Zeta       zā'tə, zē'tə
Η        η      Eta        ā'tə, ē'tə
Θ        θ      Theta      thā'tə, thē'tə
Ι        ι      Iota       ī-ō'tə
Κ        κ      Kappa      kăp'ə
Λ        λ      Lambda     lăm'də
Μ        μ      Mu         myo͞o, mo͞o
Ν        ν      Nu         nyo͞o, no͞o
Ξ        ξ      Xi         zī, sī
Ο        ο      Omicron    ŏm'ə-krŏn', ō'mə-krŏn'
Π        π      Pi         pī
Ρ        ρ      Rho        rō
Σ        σ      Sigma      sĭg'mə
Τ        τ      Tau        tou, tô
Υ        υ      Upsilon    ŭp'sə-lŏn'
Φ        φ      Phi        fī
Χ        χ      Chi, Khi   kī
Ψ        ψ      Psi        psī, sī
Ω        ω      Omega      ō-mĕg'ə, ō-mē'gə, ō-mā'gə
Pronunciation key: ă pat, ə about, ā pay, ĕ pet, ŏ pot, th thin, ī pie, ō toe, o͞o boot, ĭ pit, ou out, ô paw, ŭ cut
ANSWERS AND HINTS
CHAPTER 1
1.1 The only polygon is (d). 1.2 Suppose that a polygon had fewer than three vertices, and reach a contradiction. 1.3 It can be done. 1.4 Yes. Choose a point p on one edge but not a vertex. A sufficiently small circular disk centered at p will be bisected by this edge, and part of this disk must lie within the polygon. But a semicircular region has positive area. 1.5 See a plane geometry textbook for a formula concerning the sum of the interior angles of a polygon. 1.6 See Section 1.8 for one approach.
1.7 In either case, since only finitely many parallelograms may be used, there must be one with a vertex coinciding with a vertex of the triangle and with one of the two sides incident at that vertex lying on one side of the triangle.
1.8 See Fig. 1.2.
1.9 For each integer $n \geq 0$, let $R_n$ be the rectangle with vertices at the plane coordinates
$$(n, 0), \quad (n + 1, 0), \quad (n + 1, 2^{-n}), \quad (n, 2^{-n}),$$
and let $P$ be the union of all these rectangles. Since $R_n$ has area $2^{-n}$, the total area of $P$ is
$$1 + 1/2 + 1/4 + 1/8 + \cdots = 2.$$
1.10 Try a figure with a square boundary.
1.11 Make $R$ into two squares with one cut. Cut each resulting square along a diagonal. Reassemble.
1.12 Cut the strip into squares each of which has side length the same as the width of the strip. Reassemble the strip
1 2 3 4 5 6 ...
in the pattern
1 3 5 7 ...
2 4 6 8 ...
1.13 Use the same squares as in Exercise 1.12. Start at the origin and work "circularly" outward.
1.14 Drop a perpendicular from the largest angle to the opposite side.
1.15 It can be done; you should really try to find the solution on your own.
1.18 One reasonable analogue to "polygon" in the one-dimensional case might be a figure which can be expressed as the union of finitely many line segments, each of which contains both its end points.
1.19 Test your theorem on the two line segments $S$ and $T$, both of length 1, where $S$ consists of all numbers $x$ such that $0 \leq x < 1$ and $T$ consists of all numbers $y$ such that $2 \leq y < 3$.
1.20 The relation is an equivalence relation.
1.21 There should be $n - 3$ cuts, resulting in $n - 2$ triangles.
1.22 First rephrase the induction principle as follows: If a statement meaningful for natural numbers is true for $n = 3$, and whenever it is true for all integers $k$ with $3 \leq k < n$ then it is also true of $n$, then the statement is true for all natural numbers $n \geq 3$.
1.23 If you have assumed that
$$1 + 2 + 3 + \cdots + k = \frac{k(k + 1)}{2},$$
then
$$1 + 2 + 3 + \cdots + k + (k + 1) = \frac{k(k + 1)}{2} + (k + 1).$$
1.24 Suppose that if a polygon has $k$ vertices and $3 \leq k < n$, then the polygon can be triangulated. Let $P$ be a polygon with $n$ vertices. Use the proof in Section 1.3 to divide $P$ into two polygons each of which has fewer than $n$ vertices. Apply the above assumption, and then the answer to Exercise 1.22.
1.25 It cannot always be done. Try polyhedra of the sort that have a hole running all the way through. This is not an easy problem.
1.26 You may wish to use the fact that if $b < c$ and $a > 0$, then $ab < ac$.
1.27 This is easy; you may wish to split the proof into two cases.
1.28 If $4^n - 1$ is divisible by 3, then there exists a natural number $k$ such that $4^n - 1 = 3k$. Hence $4^n$ is always of the form $3k + 1$. What do you do to $4^n$ to turn it into $4^{n+1}$?
1.29 This is not only a difficult question, it is also a trick question. In his book Mathematics: The Man-Made Universe (second edition, Freeman, 1969), in Theorem 3 of Chapter 8, Sherman Stein shows that, for example, the rectangle with sides of length 1 and $\sqrt{2}$ cannot be cut up into finitely many squares in any way whatsoever!
1.30 Try the rectangle of sides of length 1 and $\sqrt{2}$, and suppose that, by way of contradiction, it can be cut into squares of side length $a > 0$.
1.31 A 180° rotation of one line does not change the fact that it is parallel to another line.
1.32 A right triangle, if cut the proper way.
1.33 There are really only two choices for which perpendicular to use. Show that at least one choice must produce a segment that lies within the parallelogram.
1.34 A motion without rotation will not change the fact that two lines are parallel.
1.35 The answer is best couched in terms of the ratio of the length of the altitude of the parallelogram to the length of its base.
1.36 In Fig. 1.4, the lines from $q$ to the end points of the diameter are the hypotenuses of two similar right triangles, whose sides are thus in proportion.
1.37 A little experience with inequalities of real numbers will be helpful here.
1.38 This is a complicated but straightforward problem involving inequalities.
1.39 Note that no rotations are used, and that $x^2 = ab$.
1.40 First show that the four right triangles in the "corners" of square $A$ are congruent to each other.
1.41 Again, this is a long but not difficult problem.
1.42 In Fig. 1.6, let the square $S$ have side length $c$. Then show that $a^2 + b^2 = c^2$.
1.43 Read the summary immediately preceding this exercise.
1.44 Take $a = 1$ and $b = 5$ in the proof given in Section 1.6.
1.45 First, $(4/3)^3 > 2$, so the numbers $(4/3)^3$, $(4/3)^6$, $(4/3)^9$, $(4/3)^{12}$, ...
are, respectively, larger than 2, 4, 8, 16, .... For the question about area, note that at each stage after the first, four times as many triangles are added as in the previous stage, and each new triangle has one-ninth the area of each triangle added in the previous stage. So the ratio of the resulting geometric series will be 4/9.
1.46 Use, axiomatically, that equidecomposable figures have the same "area." A detailed formal development of the area function A is quite long. 1.47 Cut the square into vertical line segments. Hold the one at the left fixed, and move each other segment to the right so that it ends up twice as far from the leftmost segment. 1.48 Go directly to a rectangle by using just one cut.
1.49 Try a construction similar to the one shown in Fig. 1.5. 1.50 Yes; one stops with congruent rectangles rather than going all the way to two equal squares. Details are given in the book of Boltyanskii mentioned at the end of Chapter 1. 1.51 If the ratio of the length of one side of one of the parallelepipeds to the length of one side of the other is a rational number, the construction is easy. Otherwise, the answer is still in the affirmative, but the author knows of no easy proof. That it is possible follows from a theorem mentioned in the answer to Exercise 1.29.
1.52 All four can be cut up and reassembled into congruent squares. All the cuts could then be superimposed onto a single such square. 1.53 It is possible; this follows from the answer to Exercise 1.51, but again the author knows of no easy proof. Try Exercise 1.54 instead.
1.54 If there are $n$ cubes along each edge of the smaller cubes, and $m$ along each edge of the larger cube, one must solve
$$2n^3 = m^3.$$
It follows from the Fundamental Theorem of Arithmetic (Section 7.3) that this equation has no solution in positive integers.
1.55 Yes; the ingenious method of solving this problem is given in Chapter 7 of the text by Sherman Stein mentioned in the answer to Exercise 1.29.
CHAPTER 2
2.1 A circle together with one of its diameters is a candidate for an answer to the first question. For other combinations, try finding figures that satisfy two of the properties but not the other two. 2.2 Certainly not the first. 2.3 Certainly not the last. 2.4 See the answer to Exercise 2.1.
2.5 A disk certainly does have property (a). 2.6 There are infinitely many different correct answers to this problem.
2.7 This is easy. 2.8 Circumscribe a circle about the square, and then move the points of the square outward along radii of the circle until the result is that the square has been deformed onto the circle. 2.9 Deform the curve until it lies in a plane. It then bounds a circular disk (after some possible additional deformation in the plane). The disk can be deformed to a hemisphere, and the other hemisphere then supplied. The curve has thus been deformed so as to become the "equator" of the sphere. 2.10 Procure an old inner tube, and draw a curve on it that goes around twice the long way while going around three times the short way. Do you think that every knotted simple closed curve can be deformed so as to lie on the surface of a torus? 2.11 Yes; but why? 2.12 No; but why not? (Give an example.)
2.13 Can you link each curve with every other curve? 2.14 In the first construction, first draw a completely splittable (n - 1)-link.
2.15 One solution might have the curves arranged so as to look like an infinite chain.
2.17 Since one, and thus any, curve representing ab can be deformed continuously to a curve representing ba. The right convincing drawing can be found after a little effort. 2.18 Yes; since ab = ba, then also xy = yx if x and y are any two expressions whatsoever in the algebra. 2.19 Both $(y^{-1}x^{-1})(xy)$ and $(xy)^{-1}(xy)$ are equal to 1. 2.20 These two expressions do reduce to identical ones. 2.21 The answer is found in Section 2.5. 2.22 See Fig. 2.11. 2.23 Begin by copying Fig. 2.12. 2.24 Draw four separated circles, and then follow the formula found in Section 2.5. 2.25 No shorter formula is known to the author. 2.28 See Section 2.6. 2.29 Yes: Let $x = aba^{-1}b^{-1}$ and $y = c$. 2.30 Here is one solution. Draw three circles A, B, and C forming the Borromean Rings, and then so arrange matters that D links A and B in the Borromean manner as well as linking A and C in the same way. One possible formula for D is thus $aba^{-1}b^{-1}aca^{-1}c^{-1}$. 2.31 Expand and simplify (x, y)(y, x). 2.33 Use a case argument. 2.35 Show that an (n + 1, 2)-Brunnian link can be constructed from an (n, 2)-Brunnian link by the methods of Section 2.6. 2.36 The two expressions represent the same curve if the point p is allowed to move. This is not permissible in the algebra, but is allowable for the purposes of constructing the various links of this chapter. This is the one way in which the geometry and the algebra are not in perfect correspondence. 2.37 Try ((a, b), (c, d)).
2.38 Use the form of the induction principle given in the answer to Exercise 1.22. 2.39 Again, see Exercise 1.22.
2.40 Knottedness may be removed by reversing some of the crossings of a given curve over itself-how? 2.41 Look up "Pascal's Triangle" in any college algebra textbook. 2.42 There are many correct answers to this problem; however, certain arrangements are impossible. 2.43 To show the strip has only one side, start coloring at one spot with a crayon, and keep expanding the colored region. The whole strip will eventually be colored on "both" sides. 2.44 Note that the cut down the middle has the drawn line on both sides at all times, and never crosses it.
2.45 Try the experimental approach. 2.48 The surface would be one-sided; it is called a Klein bottle. If you find the surface difficult to visualize, this is probably because this figure cannot be placed in ordinary three-dimensional space without an artificial self-intersection, the same sort of artificial self-intersection you see when you try to draw a knotted simple closed curve on a (two-dimensional) piece of paper. 2.49 You can add any desired number of edges to a preexisting figure by removing the interior of as many small circular disks as you need.
CHAPTER 3
3.1 Try solving the equation $1^x = 2$.
3.2 The graph looks much like the one in Fig. 3.2. 3.3 Note that the number 1/2 must be taken to a negative power to give a value larger than 1. 3.4 First establish this for $b = 1/c$, where $c > 1$. 3.5 What does $y = \log_b x$ mean in terms of exponents?
3.7 Note that $\log_2(4^3) = 3 \log_2(4)$.
3.8 Let $\log_b xy = p$, $\log_b x = q$, and $\log_b y = r$. Then $b^p = xy$, $b^q = x$, and $b^r = y$.
3.9 Use the techniques of the previous exercise. 3.10 For $b > 0$, $b \neq 1$, and x any real number, define $b^x$ to be that real number a such that $a > 0$ and $x = \log_b a$. Show first that such a value of a exists and is unique.
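The hints for Exercises 3.7 and 3.8 can be spot-checked numerically. The following is only a sketch: the helper name `log_base` and the sample values of b, x, and y are assumptions chosen for illustration, and floating-point agreement is evidence, not a proof.

```python
import math

def log_base(b, x):
    """Logarithm of x to base b, using the change-of-base identity."""
    return math.log(x) / math.log(b)

# Exercise 3.7: log_2(4^3) should equal 3 * log_2(4).
lhs = log_base(2, 4 ** 3)
rhs = 3 * log_base(2, 4)

# Exercise 3.8: with b^q = x and b^r = y, the exponent p with b^p = xy
# should be q + r. The values below are arbitrary sample inputs.
b, x, y = 3.0, 5.0, 7.0
q = log_base(b, x)
r = log_base(b, y)
p = log_base(b, x * y)

print(lhs, rhs)
print(p, q + r)
```

Trying other positive values of b (with b different from 1), x, and y is a quick way to build confidence before writing the proof in terms of exponents.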
3.12 The given expressions may be simplified, in order, to $1$, $1 + \log_b c$, $c$, $-1$, and $1$.
3.13 Use Exercise 3.11. 3.14 Let $\log_b x = p$ and $\log_b y = q$. 3.15 If $2^x = 3$, then $x \log 2 = \log 3$; the logarithms may be to any fixed base, such as 10. Why is this so? In any case, the answer correct to nine places is 1.584 962 501. 3.17 $1 + 5/7$.
3.18 The correct answer is not $1 + \frac{\log (5/2)}{\log 2}$, since the fraction there exceeds 1. Why? 3.19 Note that $\log 4 = 2 \log 2$. 3.20 Attack the numerator: $\log 2 = \log (3/2)(4/3) = \log (3/2) + \log (4/3)$. 3.21 34/25. 3.22 You should obtain the equation $x = 1 + \frac{1}{2 + x}$.
3.23 If $x^2 = 5$, then $x^2 - 4 = 1$.
3.24 The answer is $-1 + \frac{4\sqrt{3}}{3}$. If you got this unlikely-looking monster (which we have simplified, by the way) you almost certainly worked the problem correctly. 3.25 If $\sqrt{3} = (1; a, a, a, \ldots)$, then you can obtain the equation $\sqrt{3} = 1 + \frac{1}{a - 1 + \sqrt{3}}$, and this should lead to a contradiction.
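A numerical experiment may make the hints for Exercises 3.22 and 3.25 more concrete. The sketch below assumes the equation in 3.22 reads x = 1 + 1/(2 + x), which would correspond to the expansion (1; 3, 3, 3, ...); the function name `periodic_cf` and the truncation depth are illustrative choices, not part of the text.

```python
import math

def periodic_cf(first, repeated, depth=40):
    """Approximate (first; repeated, repeated, ...) by truncating the
    repeating tail after `depth` steps."""
    tail = float(repeated)
    for _ in range(depth):
        tail = repeated + 1.0 / tail
    return first + 1.0 / tail

# (1; 3, 3, 3, ...) should satisfy x = 1 + 1/(2 + x) up to rounding.
x = periodic_cf(1, 3)
residual = x - (1 + 1 / (2 + x))

# Exercise 3.25: compare (1; a, a, a, ...) with sqrt(3) for small a.
candidates = {a: periodic_cf(1, a) for a in range(1, 5)}
for a, value in candidates.items():
    print(a, value, value - math.sqrt(3))
```

None of the candidate values lands on the square root of 3, which is consistent with the contradiction the hint asks for; the proof itself still has to come from the equation.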
3.26 If you are in utter despair, see Exercise 3.50. 3.30 There is a connection between this exercise and the previous two, and you should have discovered it. 3.31 Can a sequence of positive numbers have a negative limit? 3.33 Increasing the denominator of a fraction with positive entries decreases its value, and conversely. 3.35 You should obtain (3; 6, 6, 6, ...), and this is the correct answer. 3.42 First, of course, you must (correctly) guess that the limit is zero. For a proof: Let $\varepsilon > 0$ be given. Then $1/\varepsilon$ is a positive number, so we may choose a whole number N such that $N > 1/\varepsilon$. Suppose that n is an arbitrary whole number such that $n > N$. Then
$|1/n - 0| = 1/n < 1/N < \varepsilon$
and hence the sequence has limit 0. 3.43 The most natural definition of the sum, and the one used in mathematics, is the limit of the sequence
$1,\ 1 + 1/2,\ 1 + 1/2 + 1/4,\ \ldots$
obtained by "adding up more and more terms of the series." 3.44 See Exercise 10.35 if you wish. 3.45 By use of the definition in Exercise 3.32. 3.46 The results may surprise you. 3.48 This problem can be reduced to showing that the equation
$3^m = 2^n$
has no solution if m and n are positive whole numbers. 3.49 Note that $k^{12} = 2$. Why is this so?
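The two hints above can be explored with a few lines of arithmetic. This is a sketch under stated assumptions: the search bounds are arbitrary, and reading k as the frequency ratio of one equal-tempered semitone (so that seven semitones approximate a fifth of ratio 3/2) is an interpretation that goes beyond what the hint itself says.

```python
# Exercise 3.48: search a finite range for positive whole numbers m, n
# with 3^m = 2^n. A power of 3 is odd and a power of 2 is even, so the
# search is expected to come back empty.
matches = [(m, n)
           for m in range(1, 40)
           for n in range(1, 64)
           if 3 ** m == 2 ** n]

# Exercise 3.49: the number k with k^12 = 2.
k = 2 ** (1 / 12)

# Assumed interpretation: seven steps of ratio k versus the ratio 3/2.
fifth_tempered = k ** 7
fifth_pure = 3 / 2

print(matches)
print(k ** 12, fifth_tempered, fifth_pure)
```

A finite search proves nothing by itself, but the parity remark in the comment is the seed of a complete argument.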
3.52 A fourth is an "inverted" fifth. 3.60 Well, theoretically, yes.
CHAPTER 4
4.3 No matter how the elements of (say) the group of Example 4.4 are renamed, one cannot obtain the multiplication of Example 4.5, since in the latter example the square of each element is the identity, and the former example does not have this property. This observation enables one to avoid
consideration of the twenty-four cases corresponding to the twenty-four different ways of renaming the four elements of the first group with the names of the four elements of the second group. 4.4 Be sure to establish associativity the "easy way." 4.5 Consider the "value" of the product ef. 4.6 Examine the element yxz. 4.7 Use the fact that x has an inverse y in G. 4.8 What is the value of $(x^{-1})^{-1} \cdot x^{-1}$? 4.10 Note that either 0 or 1 can stand for the identity of M. 4.11 There is only one way to fill in the table, if 0 is to be the identity. Why? 4.12 One can establish associativity by "realizing" the group of Example 4.5 as a group of (not necessarily all) motions of some geometric figure. One figure that will work is a rectangular parallelepiped. What are the appropriate motions to consider? 4.14 Yes. How? 4.16 If the element g appears twice in the row to the right of x, and in the columns headed by y and z, then xy = g = xz. 4.17 If the table were associative then it would have to be a group table. 4.18 Some other associative operations are these:
x # y = x + y + 17,
x # y = x + y - xy,
x # y = y.
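The three operations listed for Exercise 4.18 can be tested for associativity by brute force on a sample of integers. The sketch below is only a sanity check; the helper `is_associative` and the sample range are assumptions made for illustration, and subtraction is included purely as a contrasting operation that fails the test.

```python
from itertools import product

def is_associative(op, samples):
    """Check (x # y) # z == x # (y # z) for every triple drawn from samples."""
    return all(op(op(x, y), z) == op(x, op(y, z))
               for x, y, z in product(samples, repeat=3))

operations = {
    "x + y + 17": lambda x, y: x + y + 17,
    "x + y - xy": lambda x, y: x + y - x * y,
    "y": lambda x, y: y,
}

samples = range(-4, 5)
results = {name: is_associative(op, samples) for name, op in operations.items()}

# Contrast: ordinary subtraction is not associative.
subtraction_ok = is_associative(lambda x, y: x - y, samples)

print(results, subtraction_ok)
```

Passing on a finite sample does not establish associativity in general; each case still needs the short algebraic verification.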
4.20 Try a group that is not commutative.
4.21 (W, +) contains just one subgroup of finite order; which one? 4.24 Yes; which one? Or is there more than one? 4.26 The identity has order 1. To answer the second question, what if $x^1 = e$?
4.27 The group L contains infinitely many elements of finite order and infinitely many elements of infinite order. 4.28 A rotation of the disk one-seventh of the way around generates a subgroup of order 7. Generalize. The group L does contain many subgroups of infinite order, but this may not be so obvious. 4.30 First show that $(x^{-1}gx)^n = x^{-1}g^n x$.
4.31 See the next exercise. 4.32 Note that $(gh)^2 = ghgh$. 4.33 To show that xy = yx, consider $(xy)^2$.
4.35 If g has order 3, what is the order of $g^2$? Can g and $g^2$ be the same? 4.36 Note that $(ab)^n = a(ba)^{n-1}b$. 4.37 Note that $b = a^{-1}b^2a$. Substitute the b on the left-hand side for each b on the right-hand side. Continue this process, and eventually use the fact that $a^5 = e$. 4.38 Since G is commutative, $(xy)^2 = x^2y^2$ for all x and y in G (why?).
4.39 See the answer to Exercise 4.36. 4.45 See the answer to Exercise 2.19. 4.46 Suppose by way of contradiction that g were an element of infinite order in the finite group G. 4.49 Is there a fixed value of k such that k is a multiple of each element's order? 4.50 One method is to show that the identity must appear at least twice on the main diagonal of the group table for G. Suppose that it does not. 4.51 Yes; but why? 4.56 Use Exercise 4.54. 4.57 Try this first for n = 3. Generalize your proof to the case of arbitrary natural number values of n. 4.59 Here is one way to show that Z is closed: Suppose that y and z are elements of Z. Then yg = gy and zg = gz for all elements g of G. Hence (yz)g = y(zg) = y(gz) = (yg)z = (gy)z = g(yz). Hence (yz)g = g(yz) for all elements g of G. Hence yz is an element of Z by definition of Z. Hence the operation is closed in Z. 4.60 This proof is similar to the previous exercise. Do you need to know that G is finite? 4.62 Let n be the order of G. Among the orders of subgroups of G are only the two natural numbers 1 and n. However, this does not show that among all natural numbers, n has only the two divisors 1 and n. How would you show that a group of order 12 has a proper subgroup? Generalize. 4.63 If x and y are elements of $g^{-1}Hg$, then for some h and k in H, $x = g^{-1}hg$ and $y = g^{-1}kg$.
4.64 Note that this is an "if and only if" proof. 4.66 If this is too easy, why don't you try a really tough problem: Let G be a group containing elements x and y such that, for some fixed natural number n, and Prove that x = e = y. 4.71 Even though A is not a subgroup, it has "cosets" such as gA, where g is an element of G. See the proof of LaGrange's Theorem to see what properties gA must have even though A is not a subgroup of G.
CHAPTER 5
5.2 Yes. 5.3 Yes. 5.4 Yes. 5.5 A man walking around a vertex passes through an even number of countries. Will this fact help in showing that a two-color "checkerboard" coloring pattern will work? 5.9 To prevent the boundary and exterior of (say) a cube from being a polyhedron. 5.10 The answer to the second question is "no." 5.11 The maximum possible is 12. Finding a map that requires all 12 colors is difficult; proving that 12 is sufficient for all such maps is very difficult. 5.13 It can be done. Curiously enough, it does not matter whether countries "go all the way through" the strip or whether one has different countries and non-coincident boundaries on the two "sides" of the strip. Can you see why not? 5.14 Yes; you can construct such "maps" requiring any given number of colors for a proper coloring. 5.15 There is no upper limit. 5.16 The proof might involve consideration of what happens to the value of V - E + F when one hole is "plugged up." 5.17 If the lines do not intersect, you have a net in the plane for which V - E + F = 1. What is the value of F? Reach a contradiction by considering the possible number of boundary edges of each country.
5.19 First show that 3F = 2E.
5.21 See Exercise 5.5. 5.22 Use the fact that each vertex lies on four edges to show first that 4V = 2E. Then, since each country is to have six sides, you can show that 6F = 2E. Since you also know that V - E + F = 2, you may be able to reach a contradiction using these three formulas. 5.24 Name the countries, and then proceed with a coloring scheme chosen so as to avoid cases. 5.25 A boundary edge must be a segment rather than a closed curve. So one should just introduce two (artificial) vertices onto the equator. 5.27 The points in the shaded region are exactly those satisfying the inequality of Steinitz's Theorem. 5.28 Try a few simple examples with small values of V and F. 5.30 Since E = 20, V + F = 22. Solve for F and use the inequality in Steinitz's Theorem to find the desired conditions on V. 5.35 Curiously enough, the answer is always F - 4. Why? 5.36 First establish that 3V = 2E, and that 5F = 2E. Then use Euler's Theorem. 5.42 Use a case argument. 5.44 Use the techniques of the solution to Exercise 5.40.
CHAPTER 6
6.13 The other formula is also valid. These formulas are known as DeMorgan's laws.
6.14 If k is the larger of the two numbers m and n, then the answer may be given in terms of an inequality involving k. 6.15 The answer is a formula involving k, m, and n. 6.16 Yes, the notation f(a) makes sense: If $(a, b) \in f$, then $f(a) = b$.
6.19 The function $f: R \to R$ according to the rule $f(x) = x^3$ is sufficiently different from Example 6.11. 6.21 If you hold the page on which Fig. 6.5 is printed up to the light, and look at the other side of the page so that the x-axis is vertical and the positive y-axis is to the right, this has the effect of interchanging these two axes; thus, what you see is the graph of $f^{-1}$.
6.22 Here is one way to show that $f^{-1}$ is one-to-one: Suppose that x and y are elements of B, and that $f^{-1}(x) = f^{-1}(y)$. Since f is a function, $f(f^{-1}(x)) = f(f^{-1}(y))$. Hence x = y. Therefore, if $x \neq y$, then $f^{-1}(x) \neq f^{-1}(y)$. Therefore $f^{-1}$ is one-to-one. 6.27 The appropriate notation would be $\{f^{-1}(g^{-1})\}$. 6.29 Note that $(x + 1)^2$ is not always equal to $x^2 + 1$.
6.30 Is the converse true? 6.32 Try $f: N \to E$ according to the rule $f(x) = 2x$.
6.34 What about the function $g: N \to T$ by $g(x) = x + 9$?
6.36 One possibility is f(1) = 2, f(2) = 1, f(x) = x if $x \geq 3$.
Now find two more. 6.38 First devise the correspondence; then, if you have been sufficiently systematic, you can devise a formula for the appropriate function. 6.39 Draw a triangle with base two units long. It has a parallel median which must be one unit long. How do you correspond the points of the base with the points of the median? 6.40 Modify the answer to the first question to answer the second. 6.43 Since you want to put B and C into one-to-one correspondence, the trick is to draw two circles and label them B and C. Inside of each draw a smaller circle, and determine what sets these two should represent in order to be able to apply the Cantor-Schroeder-Bernstein Theorem. 6.45 One can let $f: W \to N$ according to the rule
$f(x) = 2^x$ if $x \geq 1$,
$f(x) = 3^{-x}$ if $x \leq 0$.
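The rule suggested for Exercise 6.45 can be tried out on a finite window. The sketch below reads the rule as 2 to the power x for x at least 1 and 3 to the power -x for x at most 0, and it assumes that W denotes the integers; both readings, and the sample window, are assumptions made for illustration.

```python
def f(x):
    """Assumed reading of the rule in 6.45: 2^x for x >= 1, 3^(-x) for x <= 0."""
    return 2 ** x if x >= 1 else 3 ** (-x)

# Check a finite window of integers for repeated values.
domain = range(-30, 31)
values = [f(x) for x in domain]

is_one_to_one_on_sample = len(set(values)) == len(values)
all_natural = all(isinstance(v, int) and v >= 1 for v in values)

print(is_one_to_one_on_sample, all_natural)
```

The reason no value repeats is visible in the output: the powers of 2 used are even, while the powers of 3 are odd.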
6.52 You may wish to show this first in the special case in which $A \cap B = \emptyset$, and then apply some of the previous exercises or theorems. 6.59 Remember that S must contain a denumerable subset.
6.72 How many three-element subsets has N? How many five-element subsets has N?
6.75 Compare the following two versions of the method of performing the experiment: First Method: At each stage, remove from the urn the three lowest-numbered balls not previously moved, then replace in the urn the two lowest-numbered balls outside the urn. Second Method: At each stage, remove the three lowest-numbered balls in the urn, then replace the two highest-numbered balls then outside the urn. Do you see a way to perform the experiment so that, at its conclusion, exactly thirty-seven balls are in the urn? It can be done!
CHAPTER 7
7.2 Yes; see the next exercise. 7.4 Check your answer with a few experiments. 7.5 Remember that integers may be negative as well as positive. 7.6 Remember that each prime is positive. 7.9 This is true under certain conditions, but not always true. 7.10 Yes; supply a proof. 7.12 Yes; supply an example. 7.17 The last n numbers in the sequence are composite. 7.18 The answer to each question is "no." 7.21 See Exercise 1.23 and the answer to it. 7.25 If a were the least positive real number, what about a/2? 7.28 E is a subset of N. 7.29 What is the last positive rational number? 7.31 The number happens to be 65, but you should prove its existence using the Well-Ordering Axiom. 7.33 This is a moderately long problem. If you have studied Chapter 3, compare this exercise with Exercise 3.33. 7.34 If a < b, x < y, and all numbers involved are positive, then ax < by. Why?
7.35 Simplify the expression $(n + 1)^3 - (n + 1)$.
7.38 The second question is much easier than the first! 7.41 Use the Fundamental Theorem of Arithmetic. 7.42 No; but why not? 7.43 Handle 3m + n and m + 2n separately if you wish.
7.44 No; but why not? 7.45 Take n = 7, of course. 7.46 Use Wilson's Theorem to show that 20 has a proper divisor. 7.47 Note that 101 is prime. 7.48 Compare this with Exercise 7.9. 7.50 The formula is
$(\alpha_1 + 1)(\alpha_2 + 1)(\alpha_3 + 1) \cdots (\alpha_k + 1)$.
7.53 This one is not easy. 7.58 The number 4 is always a divisor of the left-hand side and never a divisor of the right-hand side. 7.59 First find integers m and n such that 12m + 13n = 1. 7.61 If (n, n + 3) = 2, then 2 would be a divisor of (n + 3) - n. But why?
7.63 If n is odd, then n has the form n = 2k + 1 for some integer k.
7.64 Show that every common divisor of m and m + n is a divisor of n. It will then follow (why?) that $(m, m + n) \mid n$.
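The hints for Exercises 7.59, 7.61, and 7.64 all rest on greatest common divisors, and they can be checked mechanically. The sketch below uses a standard recursive form of the extended Euclidean algorithm; the function name `extended_gcd` and the test ranges are illustrative assumptions, and the text's own presentation of the algorithm may differ.

```python
import math

def extended_gcd(a, b):
    """Return (g, m, n) with g = gcd(a, b) and a*m + b*n == g."""
    if b == 0:
        return a, 1, 0
    g, m, n = extended_gcd(b, a % b)
    return g, n, m - (a // b) * n

# Exercise 7.59: integers m and n with 12m + 13n = 1.
g, m, n = extended_gcd(12, 13)
print(g, m, n, 12 * m + 13 * n)

# Exercise 7.61: gcd(n, n + 3) divides (n + 3) - n = 3, so it is never 2.
gcds_761 = {math.gcd(t, t + 3) for t in range(1, 200)}

# Exercise 7.64: gcd(a, a + b) divides b for sample positive a and b.
divides_764 = all(b % math.gcd(a, a + b) == 0
                  for a in range(1, 40) for b in range(1, 40))
print(gcds_761, divides_764)
```

The pair (m, n) found this way is one solution among infinitely many; adding 13 to m and subtracting 12 from n gives another.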
7.73 The author obtained the following solutions, but could persuade no one else to check his answer to this laborious problem: The triple (x, y, z) may be any of the following and no others. (7, 1, 1)
(3, 3, 1)
(5, 2, 1)
(2, 2, 2)
(4, 1, 2)
(1, 1, 3)
(1, 4, 1)
7.74 No way. It cannot be done.
7.75 If x is the number of new eggs, y the number of fresh eggs, and z the number of old eggs, then
x + y + z = 100,
and
10x + 2y + z = 200.
If you solve the first equation for z and substitute the result in the second equation, you obtain
9x + y = 100,
which is no problem to solve. Of the ten positive solutions, only x = 10, y = 10, z = 80 satisfy the last condition of the problem. 7.76 Use the same sort of simplification as in the previous problem. There is only one solution. 7.77 There is no need to list the solutions in order to count them; however, there turn out to be 45 ways in which the check could have been written. 7.81 If there are b brass balls, c copper balls, and s silver balls, then one obtains the equation
15b + 16c + 17s = 121.
This problem is quite long, and the author is reasonably sure that there is only one solution. Hint: In that solution, no balls of one type were used. This is not ruled out as a possible solution. 7.82 You can get an easy proof if you know the following fact: If m > 2 is a whole number, then there exists at least one prime p such that m < p < 2m. Observe that if
1/2 + 1/3 + ... + 1/n = k,
where k is a whole number, then n must be much larger than k. However, there is a proof, hard to find, that does not use the above rather advanced result of number theory. 7.83 If the prime p is of the form 4n + 3, and is the sum of two squares, then one of the squares must be odd and the other even. 7.84 A small number of cases should be considered. 7.86 Use Exercise 7.48. 7.90 One solution is
7.91 This is not so easy. The author believes that the smallest perimeter that solves the problem is 480. 7.92 This is easy, and can be answered without much trouble for any whole number, not just 10. 7.93 There is only one solution. 7.94 This is quite difficult, unless you have found a short cut unknown to the author. The smallest solutions he obtained are
$3^2 + 4^2 = 5^2$,
$(696)^2 + (697)^2 = (985)^2$,
$(23{,}660)^2 + (23{,}661)^2 = (33{,}461)^2$.
Might this problem have infinitely many solutions? 7.95 This is just a matter of checking a few cases. 7.96 Unlike Exercise 7.94, this is quite easy. 7.100 Use Fermat's Theorem.
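The three identities quoted in the answer to Exercise 7.94 are easy to confirm with exact integer arithmetic. The sketch below only verifies the arithmetic and the fact that the two legs differ by one; it makes no claim about the other conditions of the exercise, which are not restated in the answer.

```python
# The triples (a, b, c) as read from the answer to Exercise 7.94.
triples = [(3, 4, 5), (696, 697, 985), (23660, 23661, 33461)]

# Each triple should satisfy a^2 + b^2 = c^2 exactly.
checks = [a * a + b * b == c * c for a, b, c in triples]

# In each triple the two smaller numbers are consecutive.
consecutive = [b == a + 1 for a, b, _ in triples]

print(checks, consecutive)
```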
CHAPTER 8
8.1 The conditions of the problem indicate that k is positive. Hence N' is always positive, but as N increases, the value of N' gets closer to zero. 8.2 Since N' is constant, the graph of N(t) will be a straight line in each case.
8.3 The equation N' = B - D becomes
N' = bN - dN,
so that
N' = (b - d)N.
Since b - d is constant, set k = b - d.
8.4 Note that $N_0/2 = N_0 e^{kT}$ by definition of the half-life T. 8.6 The graph of N(t) will look very much like one of those shown in Fig. 8.6; that is, taking the square root of the degree of realization term has little effect on the long-term behavior of N.
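The relation in the answer to Exercise 8.4 can be solved for the half-life and checked numerically. In the sketch below the decay constant k and the initial amount are hypothetical values chosen only to exercise the formula; the relation is read as one half of the initial amount equal to the initial amount times e to the power kT.

```python
import math

def half_life(k):
    """Solve N0/2 = N0 * exp(k*T) for T. Assumes k < 0 (decay)."""
    return math.log(0.5) / k

k = -0.03          # hypothetical decay constant
N0 = 1000.0        # hypothetical initial amount
T = half_life(k)

# After one half-life the amount should be half the initial amount.
N_T = N0 * math.exp(k * T)
print(T, N_T)
```

Note that the initial amount cancels from the relation, so T depends on k alone.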
8.9 Since the water flows out at the rate V'(t), and V'(t) is proportional to the pressure at time t, and the pressure then is proportional to V(t) itself because of the shape of the tank, we obtain the differential equation
V'(t) = kV(t),
where k is a negative constant. Now see Exercise 8.4. 8.10 An equation giving a satisfactory interpretation of the conditions of the problem would be
$N' = bN \, \frac{M - N - d}{M}$,
where b and d are positive constants.
8.11 You should obtain a stable critical point where the bluegill and redear lines cross; that is, the two species can coexist.
8.14 Arrows should definitely be drawn on the coordinate axes; it can happen that some curve of population trend does meet an axis, indicating the disappearance of one species. 8.16 If the three populations are A, B, and C, then one of the three equations would be
$A' = kA \, \frac{M - A - \alpha B - \beta C}{M}$,
where M is the maximum population of A that the pond will support and $\alpha$ and $\beta$ are positive constants. The behavior of the system could be examined by means of a graph in the first octant in three-dimensional space; instead of lines where A' = 0, there would be planes, with A' positive beneath the plane corresponding to it and negative above. There are a large number of possibilities for the eventual behavior of the system; for example, one species might disappear, followed by the coexistence of the remaining two.
8.18 The system is critical for N = M and for N = 0. The former is stable, the latter unstable.
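The stability claims in the answer to Exercise 8.18 can be illustrated by simulation. The sketch below assumes that the exercise concerns a logistic equation of the form N' = kN(M - N)/M, which is a guess based on the critical points named in the answer; the constants, the step size, and the function name are all hypothetical, and Euler's method is used only because it is the simplest scheme.

```python
def simulate_logistic(n0, k=0.5, M=1000.0, dt=0.01, steps=5000):
    """Euler steps for the assumed model N' = k*N*(M - N)/M."""
    n = n0
    for _ in range(steps):
        n += dt * k * n * (M - n) / M
    return n

# Starting slightly above 0, the population moves away from 0 toward M.
from_small = simulate_logistic(1.0)

# Starting above M, the population falls back toward M.
from_large = simulate_logistic(1500.0)

# Starting exactly at 0, the population stays at 0.
from_zero = simulate_logistic(0.0)

print(from_small, from_large, from_zero)
```

Under these assumptions, nearby starting values are drawn toward M and pushed away from 0, which is what "stable" and "unstable" mean here.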
8.20 Compare this exercise with Exercise 8.10. 8.21 Wide variations in population may result in the elimination of one species from the pond. 8.23 One reasonable set of equations is
$A' = (kA - \beta B) \, \frac{M - A}{M}$,
$B' = (vB - \alpha A) \, \frac{C - B}{C}$,
where all constants are positive, M and C representing the maximum populations of A and B, respectively, that can be supported by the pond. 8.28 The qualitative behavior of the solution is unchanged. 8.29 The fact that $\alpha$ and $\beta$ are positive describes the condition that each of the two species contributes to the success of the other. A culture of yeast and slime mold on nutrient agar in a large dish will exhibit this sort of behavior. 8.33 You should not spray unless you can spray enough to completely eliminate aphids. 8.35 Yes; what is the value of $t_0$ in terms of the given constants? 8.37 One could obtain the solution of the differential equation, substitute enough values of the population at various times in order to evaluate all unknown constants, and then check the resulting solution with values of the population at other times. 8.38 See the answer to the previous exercise.
CHAPTER 9
9.2 If and only if a = b. 9.3 Use the same definition.
9.5 Since $p \in [a, b]$, if p is different from a and different from b then the same straight line contains {a, p} as contains {p, b}.
9.7 Use the previous two exercises if you wish. 9.8 See the next exercise. 9.9 This is Theorem 9.2. 9.10 No; give an example. 9.13 See the material on "proof by induction" in Chapter 7, or see Exercise 1.45. 9.14 No; proof by induction can only show something true "for each natural number n." However, although it does not follow from Theorem 9.2 and Exercise 9.12 that the intersection of infinitely many convex sets is convex, this is nevertheless true. 9.15 To answer the last question of the exercise, you know that either $A \subset B$ or $B \subset A$; there is no harm in supposing that the sets have been so named that $A \subset B$. 9.16 Yes, and the proof is the same.
9.17 Yes; try proving it. 9.18 Again, yes. 9.19 Of course; the proof is very simple. 9.20 Consider the collection of all circular regions (including boundary points) centered at the origin in $E^2$. 9.25 What if A consists of three noncollinear points in $E^2$?
9.27 The first three questions of the exercise have affirmative answers. 9.28 The number of applications of A must be increased by one.
9.30 This exercise turns out to be very useful in some later problems. 9.31 No. Instead of triangular regions, what sort of sets should $\mathcal{C}$ consist of? 9.32 The answer to the second question is too easy if $\emptyset \in \mathcal{C}$, so try to give an example in which $\emptyset$ does not belong to $\mathcal{C}$.
9.33 Some candidates for the property might be "being linear" and "containing no straight line segment of length exceeding one." 9.34 It is possible but not necessary; try finding a proof in which the axiom is used, just for practice. 9.35 It turns out that the only consistent interpretation of $\bigcap \mathcal{C}$ is $E^2$. No points p of $E^2$ have the property indicated in the exercise, since there are no sets (and thus no such sets) in $\mathcal{C}$. Try showing that if $\mathcal{C}$ is the empty collection of subsets of $E^2$, then $\bigcup \mathcal{C} = \emptyset$. 9.41 Yes. Hence our proof is really a disguised proof by induction. See the Index for more on induction. 9.42 Helly's Theorem does apply here.
9.43 Although Helly's Theorem does not immediately apply, it is still possible to reach some conclusion by considering the circular regions bounded by the circles in $\mathcal{C}$. 9.44 Nothing, for the analogue of Helly's Theorem in $E^3$ requires that each four sets have a point in common. There is a connection with Exercise 9.28 and the version of Exercise 9.30 for $E^3$, a connection which explains why the number in the theorem must be increased by one. 9.45 This is not easy. See Exercise 9.43.
9.46 See Exercise 9.44 and the answer to it. 9.48 The easiest way to consider cases is to consider how many of the points are collinear.
9.52 In order to apply Helly's Theorem in $E^3$, the number of sets that intersect must be increased to four. Hence it would be necessary for each set of four pictures to be visible from some point in the gallery. 9.53 Yes; give an example. 9.55 Yes; one way is to divide $E^2$ into the disjoint convex sets $E^2$ and $\emptyset$. There are other solutions; find them.
9.56 By Helly's Theorem, it is clear that k < 3. 9.57 Note that the equation of the straight line through (n, 0) with slope n is $y = nx - n^2$.
9.58 Examine a regular tetrahedron. 9.64 Use the axiom of Section 9.3. 9.66 No; give an example. 9.68 No; give an example. 9.72 This is possible even if the intersection is "connected"-see if you can find such an example. 9.80 One of the implications is true, the other false.
CHAPTER 10
10.2 Use the fact that each rational number can be expressed in the form m/n, where m and n are integers and $n \neq 0$. 10.4 No. Why not? 10.5 The decimal expansion given in the text can be so used; how? 10.6 How about 3.14159? 10.10 Every rational number has a finite continued fraction expansion. This follows from the Euclidean Algorithm (Theorem 7.8). 10.12 Use the technique of Theorem 10.1. 10.15 8/9. 10.16 327/999. Examine the previous exercise; do you see a pattern? 10.17 Note that if 0.999 999 ... were less than 1, then there would be a number r such that 0.999 999 ... < r < 1.
10.20 The techniques of Chapter 7 can be used to show that there are irrational numbers not solutions to any equation of the form
p(x) = 0,
where p(x) is a polynomial with rational coefficients. These are called transcendental numbers.
10.22 There is a connection with Exercise 3.48. 10.23 Suppose by way of contradiction that $\alpha + r$ is a rational number. 10.25 You can conclude that $b^2 - 4ac$ is the square of an integer.
10.27 The proof is similar to that given in the answer to Exercise 3.42. 10.29 Nothing can be done with either (d) or (e). Why not? 10.32 The sequence has no limit. 10.33 This is not easy. 10.34 Expand the product
$(1 - r)(1 + r + r^2 + r^3 + \cdots + r^n)$.
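The product in the hint for Exercise 10.34 can be expanded with exact rational arithmetic to see what it collapses to. The sketch below is only an experiment: the choice r = 1/10 and n = 12 is arbitrary, and the link to the repeating decimal 0.888... (answer 8/9 in Exercise 10.15) relies on the standard sum of a geometric series, which is assumed here and not derived.

```python
from fractions import Fraction

def geometric_partial_sum(r, n):
    """Return 1 + r + r^2 + ... + r^n as an exact fraction."""
    return sum(r ** i for i in range(n + 1))

r = Fraction(1, 10)
n = 12

# The product telescopes: (1 - r)(1 + r + ... + r^n) = 1 - r^(n + 1).
lhs = (1 - r) * geometric_partial_sum(r, n)
rhs = 1 - r ** (n + 1)

# 0.888... = 8/10 + 8/100 + ..., with assumed limit (8/10) / (1 - 1/10).
limit_eights = Fraction(8, 10) / (1 - r)

print(lhs == rhs, limit_eights)
```

Using `Fraction` avoids rounding, so the equality test is exact and not approximate.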
10.39 See Section 10.5. 10.44 Let $J_1 = I_k$, $J_2 = I_{k+1}$, and so on.
10.45 Treat the case when both real numbers are positive first. Define the product in the other cases by using absolute values. 10.59 If a set had two least upper bounds, one would have to exceed the other. 10.61 Clearly, $0 \leq m \leq 9$. Or is it clear?
10.62 The construction differs in the choice of the first interval. 10.63 What about zero? 10.65 Both (0, 1) and (0, -1) are solutions of the equation $x^2 + 1 = 0$.
10.68 For example, show that 1/4 E K. 10.69 The function has the same value at the two numbers.
INDEX
Absolute value, 289 Aleksandrov, A. D., 318 Alexandroff, Paul, 317 American Mathematical Monthly, 107 Antoine, L., 51 Approximations by continued fractions, 63-65 Archimedean property, 303 Area, 21 Art Gallery Theorem, 254 Associative law, 82 operations in real number system, 301 Bach, J. S., 52, 76 Banach, S., 21-23 Batting average, 66 Beck, Anatole, 318 Beet virus molecule, 142 Beiler, A. H., 218, 316 Benade, Arthur H., 81 Benson, R. V., 279 Bers, Lipman, 317 Birkhoff, Garrett, 106, 316 Bleicher, Michael N., 318 Boltyanskii, V. G., 23, 279
Bolyai Farkas, 23-24 Janos, 24 Bolyai-Gerwin Theorem, 2, 6 Borromean Rings, 25 generalizations, 29 formula, 41 see also Brunnian links Brunn, H., 29, 51 Brunnian links, 29 (4, 2)-links, 45-46 (n, k)-links, 43, 49 Cancellative operations, 95 Cantor, Georg, 177, 182 Cantor-Schroeder-Bernstein Theorem, 164 applications, 169-172, 176,181 Cantor Ternary Set, 312 length, 313 number of elements, 314 Cardinal numbers, 179 existence, 180 Center of group, 104 Cohen, Paul J., 182 Cohn-Vossen, S., 143, 317 Commutative law, 34 345
Commutative-continued operation in group, 87, 94 see also Group, Abelian Commutator, 45 k-commutator,48 Complex number system, 310 Composite numbers, 185 consecutive, 187 prime factors, 186 Congruence motions of disk, 90 product, 84-85 of square, 93 of tetrahedron, 90 of triangle, 83-87 Constructions with straightedge and compass, 6, 19, 23 Continued fractions, 59-61 and batting averages, 66-67 construction, 75-76 evaluation, 62-63 and grade distributions, 67-69 and irrational numbers, 286 Continuum, 179 Convex hull, 261 kernel, 261 polyhedron in Steinitz's Theorem, 116 sets, intersection of, 258-260 sets, tower of, 260 Convexity, 255 generalizations, 276-279 Coset of subgroup, 96 Coxeter, H. S. M., 143, 317 Critical point, 240 Cross-cancellative operation, 95 semigroup, 105 Crowdis, David G., 318 Crowe, Donald W., 318 Crowell, R. H., 51 Curve, knotted, 30 on torus, 30 polygonal, 28 simple closed, 27-30 tame, 28 wild, 28 Cycles in population, 249
Debrunner, H., 23, 51, 279, 317 Decimal expansion, 308-309 Dedekind, Richard, 183 Dedekind Box Principle, 172, 175 applications, 177 Dedekind infinite, 174 Degree of realization, 230, 232 in competing populations, 234 Dehn, M., 18 Differential equations, 223 for competing populations, 234, 245 in population growth, 224-226 in radioactive decay, 227 Divisibility, 184 Divisors, 184 product, 189 number, 205 see also Greatest common divisor Dorrie, Heinrich, 317 Dudley, Underwood, 218 Duplication of cube, 18 Dynkin, E. B., 143 Equidecomposable figures, 3, 22, 23 Equivalence relation, 5, 162 between nested sequences, 293, 295 Euclidean algorithm, 92, 196, 197, 219 proof, 197-202 Euler, L., 118, 144 Euler's formula, 108, 118 proof, 118-122 Exponent of group element, 98 Factorization into primes, 196, 202, 209-210 Fermat, Pierre de, 184 Fermat's Theorem, 218 Fifths, 73 improving, 77 Five-Color Theorem, 144 Fort, M. K., 51 Fox, R. H., 51 Fraleigh, John B., 318 Frequencies, 70 of notes on piano, 73 Functions, 154-162
Fundamental Theorem of Arithmetic, 202, 204 applications, 281 proof, 202-203 Gause, G. F., 252 Gauss, C. F., 218-219 Gelfond, A., 218 Generator of group, 102 Geodesic dome, 142 Geometric series, 283 Goffman, Caspar, 315 Golden Mean, 62 Greatest common divisor, 205 computation, 205-206 Greek alphabet, 319 Griffin, Harriett, 218 Group, 92 Abelian, 102 associated with curves in space, 38 center, 104 cyclic, 102 examples, 82-92 of prime order, 102 uniqueness of identity, 93 Grtinbaum, Branko, 143, 279 Hadwiger, H., 18, 23, 279, 317 Half-plane, 273, 275 Hall, M., 106 Harmonics, 70 Hausdorff, Felix, 182 Helly's Theorem, 266, 268 applications, 270-271, 273, 276 Herstein, I. N., 106 Hilbert, D., 143, 317 Hocking, John G., 317 Homomorphism, 105 Horn, Alfred, 279 Image of homomorphism, 105 of function, 155-156 Induction Principle, 10, 190 and well-ordering, 190-193 applications, 10-11, 49, 193-195, 259 Infinite series, 69-70, 295
Integers, 184 Intermediate fractions, 64 Intervals, 288 length, 288 nested sequences of, 293 sum of, 297 Inverse of group element, 83 uniqueness of, 93 Jacobson, Nathan, 106 Jakobson, Roman, 317 Jeans, Sir James, 81 Join of point and set, 277 Kernel of homomorphism, 106 see also Convex Khinchine, A. Ya., 81 Klee, Victor, 23, 279, 317 Kolmogorov, A. N., 318 Krasnoselskii, M. A., 271 Krasnoselskii's Theorem, 254, 271 Kripke, Bernard, 315 Kurosh, A. G., 106 Lack, D. L., 253 LaGrange, J. L., 107 LaGrange's Theorem, 96 Lavrent'ev, M. A., 318 Law of Quadratic Reciprocity, 218 Law of Small Whole Numbers, 72 L-convexity, 277 Least upper bound, 303 Limit of sequence, 65, 69, 284, 289-291, 294-295 Links, 30, 32 splittable, 30 see also Brunnian links Lobachevsky, N. I., 24 Logarithms, 53-56 Lomont, J. S., 317 Lyusternik, L. A., 143, 279 MacLane, Saunders, 106, 316 Mansfield, M. J., 314 Map on Möbius strip, 117 on sphere, 116 on torus, 116
Möbius strip, 50, 117 Moise, E. E., 23 Mordell, L., 218 n-link, 30 completely splittable, 32 splittable, 30 sublink of, 32 Natural numbers, 184 composite, 185 prime, 185 Nested sequences of intervals, 293 equivalence of, 295 Net of polyhedron, 118 Niven, Ivan, 218, 316 Nonmeasurable set, 21 Odum, Eugene, 253 One-to-one correspondence, 160-162 Order of group, 96 infinite, 96, 99 of group element, 99 Ordered pair, 159 Ore, Oystein, 143
156,
Parallelogram formed from triangle, 11 reassembled into rectangle, 11 Partial quotients, 62 Passman, D. S., 106 Pólya, George, 192, 318 Polygon, 2, 109 connected, 109 edge, 2 equidecomposable, 3 vertex, 2 Polyhedron, 113 edge, 115 face, 115 2-connected, 116 vertex, 115 Primes, 102, 185 Pythagorean right triangles, 217-218 Pythagorean Theorem, 16
Rademacher, Hans, 218 Radioactive decay, 227 Rational numbers, 280 Real numbers, 297 Archimedean property, 303 as nondenumerable set, 178 decimal expansions, 308-309 order relation, 302 Rectangle formed from parallelogram, 11 reassembled into square, 12-14 Regular solid, 123 Rudin, Walter, 314 Schoenflies Theorem, 110 Semigroup, 104 Sets, 146 algebra, 153 and Venn diagrams, 151 Cartesian product, 160 convex, 255 denumerable, 176 descriptive definition, 147 difference, 154 distributive laws, 152-153 element, 146 empty, 150 equality, 148 finite, 163 inclusion, 149 infinite, 163 intersection, 150 listing, 147 maximal convex, 264 maximal tower, 263-264 maximal with respect to a property, 263 nondenumerable, 177 nonlinear, 263 notation, 147-148 number of elements, 154 subset, 149 tower, 260 union, 149 Sierpinski, W., 23 Sigmoid curve, 232 Singer, I. M., 51
Slobodkin, Lawrence B., 253 Snowflake curve, 19-21 Spivak, Michael, 317 Sprecher, David A., 315 Square formed from rectangle, 12-14 formed from several squares, 14-16 reassembled into given polygon, 16-17 Squaring the circle, 18 Stein, Sherman, 144, 318 Steinitz, E., 127, 143 Steinitz's Theorem, 127 proof, 128-133 Stoll, Robert R., 182, 315, 317 Straight line segment, 255 Subgroup, 95 improper, 96 normal, 104 proper, 96 Tarski, A., 19, 22 Ternary system, 311 Texan rectangle, 7 Thirds, 78 improving, 81 Thomas, J. M., 144 Thorpe, John A., 51
Tietze, Heinrich, 143, 317 Torus, 30, 116 Transposition, 74 Triangle reassembled into parallelogram, 11 Trisection of angle, 18 Unbounded figure, 6 Uspenskii, V. A., 143 Valentine, F. A., 279 Venn diagrams, 151-152 Wall, H. S., 81, 317 Well-Ordering Axiom, 190 Well-tempering, 76-77 Wheeler, Brandon W., 318 Wilder, R. L., 315, 318 Williams, J. D., 317 Wilson's Theorem, 204, 205 Wolf Interval, 74 Yaglom, Y. A., 279 Young, G. S., 317 Zermelo Axiom, 263 Zuckerman, H. S., 218, 316