AMM April 2014

Periodicity Domains and the Transit of Venus Author(s): Andrew J. Simoson Source: The American Mathematical Monthly, Vol

Views 101 Downloads 8 File size 4MB

Report DMCA / Copyright

DOWNLOAD FILE

Recommend stories

ImagineFX - April 2014

70 0 21MB Read more

Cook Vegetarian - April 2014

OFC_apr_v2 Qx_Layout 1 14/02/2014 16:01 Page 2 COOK VEGETARIAN MEAT-FREE FOOD MADE EASY ITALIAN RICOTTA EASTER TRIFLE

8 0 29MB Read more

ImagineFX - April 2014.pdf

photoshop hours of FREE VIDEO tutorials Workshops animate your digital art in photoshop how to draw a charismatic male

123 2 63MB Read more

Guitar Fair Magazine April 2014

8 0 21MB Read more

AMM 442174e

TO: HOLDERS OF AIRCRAFT MAINTENANCE MANUAL, ATA NUMBER 44-21-74, REVISION 3, DATED OCT 15/07, FOR THE LAN 767 TOPSERIES

18 1 14MB Read more

a320 amm

HDG. AMM SROS May 94 00 01 02 03 04 05 06 07 08 09 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31

34 1 204KB Read more

A320 AMM

__________ HIGHLIGHTS REVISION NO. 48 Feb 01/10 Pages which have been revised are outlined below, together with the High

42 2 22MB Read more

The Artist s Magazine - April 2014

13 0 58MB Read more

Structure - April 2014 - Grupo G Anexo 1d

1 0 79KB Read more

Orifice Plate Fabrication Details - April 2014

REAFFIRMATION April 2014 Process Industry Practices Process Control PIP PCFFL000 Orifice Plate Fabrication Details P

7 0 383KB Read more

Author / Uploaded
Tamash Ionut

Citation preview

Periodicity Domains and the Transit of Venus Author(s): Andrew J. Simoson Source: The American Mathematical Monthly, Vol. 121, No. 4 (April), pp. 283-298 Published by: Mathematical Association of America Stable URL: http://www.jstor.org/stable/10.4169/amer.math.monthly.121.04.283 . Accessed: 30/03/2014 17:28 Your use of the JSTOR archive indicates your acceptance of the Terms & Conditions of Use, available at . http://www.jstor.org/page/info/about/policies/terms.jsp

. JSTOR is a not-for-profit service that helps scholars, researchers, and students discover, use, and build upon a wide range of content in a trusted digital archive. We use information technology and tools to increase productivity and facilitate new forms of scholarship. For more information about JSTOR, please contact [email protected].

.

Mathematical Association of America is collaborating with JSTOR to digitize, preserve and extend access to The American Mathematical Monthly.

http://www.jstor.org

This content downloaded from 92.82.8.190 on Sun, 30 Mar 2014 17:28:30 PM All use subject to JSTOR Terms and Conditions

Periodicity Domains and the Transit of Venus Andrew J. Simoson Abstract. A transit of Venus occurs when it passes directly between the Earth and the Sun. A straightforward linear algebraic model for the orbits of Earth and Venus—essentially using one parameter, namely, the relative angular velocity σ of Venus—is powerful enough to generate respectable transit year predictions. We generalize, allowing σ to vary; uncover an algebraic analog for predicting transits; and show that time cycles for transits are what they are because each σ is sufficiently close to a suitably simple rational number, which for Venus is 13 , and 8 which in turn induces a modulo 8 shuffling of successive transit years by a factor of 3.

1. INTRODUCTION. At least once each year, Venus passes between the Earth and the Sun. Because the orbital planes of Earth and Venus intersect one another at an angle µ, only rarely does it come directly between the Earth and the Sun. On these occasions, the profile of Venus—a transit of Venus across the Sun—can be viewed from Earth. The last transit was in June 2012 and the next one will be in December 2117. Ascertaining the periodicity of the transits is a delicate problem. In particular, relative to Earth’s angular frequency of one rotation per year, Venus makes σ0 ≈ 1.62555 rotations per year. From this value, how can we deduce the 105-year transit lapse between, say, 2012 and 2117? And in general, as we allow angular velocity σ to vary, how does the time lapse between transits change? The answer is surprisingly chaotic. Before showing this, we first give some transit history.

Figure 1. A Venus transit viewed against a spire of the Taj Mahal, June 2012, courtesy of AP Photo/Kevin Frayer

2. A LITTLE HISTORY. In 1629, Johannes Kepler predicted a 1631 transit of Venus and estimated the period between transits as 120 years. The first recorded transit observation was in 1639 by Jeremiah Horox and William Crabtree. In 1663, James Gregory realized that careful observations of these transits would enable the scientific community to determine the distance a of one astronomical unit (AU)—the distance http://dx.doi.org/10.4169/amer.math.monthly.121.04.283 MSC: Primary 11A07; 70F15

April 2014]

PERIODICITY DOMAINS AND THE TRANSIT OF VENUS

This content downloaded from 92.82.8.190 on Sun, 30 Mar 2014 17:28:30 PM All use subject to JSTOR Terms and Conditions

283

between the Sun and the Earth—in miles. Up until Kepler’s day, the reigning guess for a had been about five million miles; Kepler, after studying the geocentric parallax of Mars, bumped the value of a to at least 15 million miles. With the advent of the telescope, the guesses improved. In 1716, Edmund Halley predicted that a was “14,000 semi-diameters of the Earth” or about 56 million miles, and championed Gregory’s plan to test his guess [4].

Figure 2. William Crabtree observing a transit; mural at Manchester Town Hall by Ford Madox Brown (1821– 1893)

But for Halley, the next transit for Venus was 45 years in the future. Therefore he charged the astronomers of two generations hence to do what he could not. As a recent biographer of these events has written, “even on his death-bed whilst holding a glass of wine in his hand, Halley said, ‘I wish that many observations of this phenomenon might be taken by different persons at separate places”’ [11, p. xxiv]. Astronomers of the eighteenth century had two chances to observe, June 1761 and 1769. Many of the colorful adventures of these astronomers as they answered Halley’s call are chronicled in [10] and [11]. As reviewed recently in detail by Teets [9], James Short analyzed transit data from sites as far afield as South Africa and northern Finland, and published his conclusions in the December 1761 issue of the Philosophical Transactions of the Royal Society that a was 93,726,000 miles. The standard reference for transit dates is Jean Meeus’s tables, spanning 6000 years [5]. Espenak [3], who compiled NASA’s website on transits, names Meeus’s work “an indispensable reference for anyone wishing to do transit calculations.” DanlouxDumesnils [2] calls Meeus’s original tables [6] “une belle e´ tude.” Much of Meeus’s number crunching is based on “the modern planetary theory VSOP87 of the Bureau des Longitudes of Paris,” [5, p. 1]. Against this standard, we contrast our results. 3. THE MODEL. We assume that the orbits of Earth E and Venus V are circles, with periods of τe ≈ 365.26 days and τv ≈ 224.70 days, respectively. By Kepler’s third law of planetary motion, with time t in years and distance in astronomical units (AU), a 3 = τ 2 where a is the semi-major axis of a planet’s elliptical orbit and τ is its period. Thus, Venus is λ ≈ 0.723 AU from the Sun S. We further assume that E’s orbit lies in the x y-plane with S at the origin O and that V ’s orbit lies in a plane through O inclined at angle µ ≈ 3.39◦ to the x y-plane. We call the line between these orbital planes the nexus line or, according to Meeus [5], 284

c THE MATHEMATICAL ASSOCIATION OF AMERICA [Monthly 121

This content downloaded from 92.82.8.190 on Sun, 30 Mar 2014 17:28:30 PM All use subject to JSTOR Terms and Conditions

the line of nodes. The nexus line in Figure 3 is labeled BC. A nexus point or node for Venus—F and G in the figure—or for Earth—B and C in the figure—is where the orbit of V or E pierces the orbital plane of E or V , respectively. Transits will only occur when E and V are both near B and F, respectively, or both near C and G. The former transit is called a fall transit because in modern times E is at B in early December; it is also called, according to Meeus, an ascending transit, because as V ’s profile moves across S from left to right its trajectory rises. The latter transit is called a spring transit because E is at C in early June; it is also called a descending transit, because the corresponding trajectory decreases. E’s and V ’s position at any time is given respectively by E(t) and V (t):  cos(2π t) E(t) =  sin(2π t)  and 0 

  1 0 0 cos(2π σ t) cos µ sin µ   sin(2π σ t)  , V (t) = λ  0 0 − sin µ cos µ 0 

(1)

where σ is the relative angular velocity of V with respect to E. For simplicity, we initially position V and E at their spring nexus points. The value of σ for the actual V and E is σ0 = τe /τv ≈ 1.62555. The 3 × 3 matrix in (1) corresponds with a clockwise rotation by µ about the x-axis, so as to be consistent with a descending (spring) transit occurring near nodes (nexus points) C and G, where C = (1, 0, 0). A line parametrized by u from E through V at time t is P(u, t) = V (t) − E(t) u + E(t).

(2)

To find the projection of V ’s shadow on S as viewed from E(t)—an ideal geocentric point in space at E’s center—we imagine that S resides within a rotating plane or screen S(t) ever perpendicular to E(t). Figure 3 shows the two orbital planes and V ’s projection on the screen as viewed from E. The plane S(t) of S can be written as X · E(t) = 0

(3)

where X is a general point (x, y, z) on the screen. When E and V are on opposite sides of the screen at time t—which happens if and only if E(t) · V (t) < 0—we take the projection point of V onto the screen as that screen point between the planets. een

scr E’s orbit

C G

had

s V ’s

ow

V’s orbit V(t)

Sun O

F

nexus pt for V nexus pt for E

E(t)

B axis between the planetary planes Figure 3. The screen of the Sun

April 2014]

PERIODICITY DOMAINS AND THE TRANSIT OF VENUS

This content downloaded from 92.82.8.190 on Sun, 30 Mar 2014 17:28:30 PM All use subject to JSTOR Terms and Conditions

285

We combine (2) and (3) so as to find the point X (t) where the line intersects the plane. That is, the equations P(u, t) = X and (3) are the following system of four equations with four unknowns x, y, z, u, as well as the time variable t:   x = (λ cos(2πσ t) − cos(2πt))u + cos(2π t)    y = (λ cos µ sin(2πσ t) − sin(2πt))u + sin(2πt) (4)  z = −λ sin µ sin(2πσ t)u    0 = x cos(2π t) + y sin(2π t). b Writing (4) as a matrix equation gives A b X (t) = E(t), where 1 0 0 1  A= 0 0 cos(2π t) sin(2π t) 

 0 cos(2π t) − λ cos(2π σ t) 0 sin(2πt) − λ cos µ sin(2π σ t)   1 λ sin µ sin(2π σ t) 0 0

(5)

b being the respective vectors (x, y, z, u) and (cos(2π t), sin(2πt), with b X (t) and E(t) 0, 0). For this transformation, det(A) = −1 + λ cos(2πσ t) cos(2π t) + cos µ sin(2π σ t) sin(2πt) λ (1 + cos µ) cos(2π(σ − 1)t) + (1 − cos µ) cos(2π(σ + 1)t) 2 λ ≤ −1 + |1 + cos µ| + |1 − cos µ| = −1 + λ < 0. 2

= −1 +

b Since it would Because the determinant of A is never zero, then b X (t) = A−1 E(t). be convenient to see these points of intersection on a stationary screen rather than the dynamic plane S(t), we clockwise rotate the first two components of b X (t) about the z-axis by 2πt radians. The result of such a transformation is a set of points whose first three components trace V ’s projection onto the screen of S. Finally, since the first component of such points will always be 0, and we are disinterested in u, we project this set of points so as to obtain their second and third components as ordered pairs, which we index as W (t) = (W1 (t), W2 (t)),  cos(2π t) sin(2πt) 0 0 0 1 0 0  − sin(2π t) cos(2πt) 0 0  −1 b W (t) = A E(t). 0 0 1 0  0 0 1 0 0 0 0 1 

(6)

0.005 −1

0.1

distances in AU

(a) A wide screen

T113.5

1 T117.5

0.005 T121.5

(b) Zooming in near the Sun

Figure 4. Trajectories of V ’s shadow on the screen of S

286

c THE MATHEMATICAL ASSOCIATION OF AMERICA [Monthly 121

This content downloaded from 92.82.8.190 on Sun, 30 Mar 2014 17:28:30 PM All use subject to JSTOR Terms and Conditions

Figure 4(a) shows the path of V ’s projection on the screen over 1.5 years. In our model, spring transits only occur near integer years, n, and fall transits only occur near half-years, n + 21 . Figure 4(b) is a close up of the screen near S over a period of about ten years, displaying three arcs of V ’s projection. The arc labeled T 113.5 corresponds with a fall transit near t = 113.5 years. The arc T117.5 corresponds with V and E being on opposite sides of S near t = 117.5; as such, we display the disk of S in front of this arc. The arc T121.5 misses the disk of S. 4. CONDITIONS FOR A TRANSIT TO OCCUR. In order to find how far from its nexus V may wander and yet be part of a transit across S, we project the disk of S through V out to E’s orbit, forming a cone as illustrated in Figure 5(a), which displays the situation where the base of the truncated cone is tangent to E’s orbit. base of truncated cone

C V

D

e plan

C

’s of V

disk of the B Sun

E’s orbit

(a) A cone of possible shadows

h

γ λ

it orb V’s − λ) k (1

V

kλ

S S

t

orbi

µ 1−λ

D ρ

E’s orbit

B

(b) A linear approximation of orbits

Figure 5. Maximum separation from the nexus for a transit

Let ρ be the radius of this base with center point D. To approximate where this extreme position for V occurs, we linearize the orbits of V and E, and imagine that they proceed along lines perpendicular to the nexus line BC, as illustrated in Figure 5(b). In this figure, we take the distance SB as 1 AU. The distances SV and SD are kλ and k, where k is a marginally-larger-than-1 deformation factor due to linearization. With s ≈ 0.00465 AU as the radius of S, from similar triangles, we see that ρ s = , kλ k(1 − λ)

(7)

which gives ρ ≈ 0.0178 AU. Furthermore, sin µ =

ρ h

and

tan γ = h,

(8)

where µ is the angle between the two orbital planes, γ is the angle between the nexus line and the line between S and V , and h is distance BD. By (7) and (8), s(1 − λ s(1 − λ) γ = tan−1 ≈ ≈ 0.0301, (9) λ sin µ λµ since the arguments of the inverse tangent and sine are so small. Thus, in order to be part of a transit, V may wander no further than about λγ ≈ 0.0218 AU from the nexus. By (9), the lapse of time L v for V to travel this far from its nexus is Lv ≈ April 2014]

s(1 − λ) ≈ 1.08 days. 2πλσ0 µ

PERIODICITY DOMAINS AND THE TRANSIT OF VENUS

This content downloaded from 92.82.8.190 on Sun, 30 Mar 2014 17:28:30 PM All use subject to JSTOR Terms and Conditions

(10) 287

The corresponding maximal time L e that E may stray from its nexus points and yet take part in a transit is Le =

γ ≈ 42 hours < 2 days. 2π

(11)

speed in AU/yr 10

transit occurs here V and E on opposite sides of S

8 6 4 2 −0.5

0.0

0.5

1.0

1.5

2.0

time t, in years Figure 6. Speed, ||W 0 (t)||, of V ’s shadow across the screen of S

Since the speed at which a transit is traced across S is bounded by 10.34 AU/year as indicated by the graph of ||W 0 (t)|| in Figure 6, then ||W 0 (t)|| < 10.34 AU/year ≈ 0.0284 AU/day

(12)

for all t. Let t0 be a medial transit time, a time of a spring transit near integer time n or of a fall transit near half-year time n + 21 where W1 (t0 ) = 0. Since the time between t0 and either n or n + 21 must be at most about 42 hours by (11), then the most that ||W (n)|| or ||W (n + 21 )|| can differ from ||W (t0 )|| is approximately (0.0280 AU/day)(42 hours) ≈ 0.0496AU by (12). Since |W2 (t0 )| < s, then 0.05 AU is about the most that ||W (n)|| or ||W (n + 1 )|| can be. Therefore, our litmus test to determine if integer year n or half-year n + 21 2 is a promising one for a transit is for V and E to be on the same side of S and for 1 ||W (n)|| < 0.05 or W n + < 0.05. (13) 2 Applying (13) to the integers 0 to 2000 with σ = σ0 , we find the promising years of Table 1. Table 1. Years at which the spring and fall transits occur

288

0

113.5

227

1029.5

(1143, 1151)

(1256.5, 1264.5)

(340.5, 348.5)

(454, 462)

575.5

689

802.5

916

1378

1491.5

1605

1718.5

1832

(1945.5, 1953.5)

c THE MATHEMATICAL ASSOCIATION OF AMERICA [Monthly 121

This content downloaded from 92.82.8.190 on Sun, 30 Mar 2014 17:28:30 PM All use subject to JSTOR Terms and Conditions

Double-checking the dates in Table 1 by graphing the arc W (t) against the disk of S verifies that each of the years or half-years corresponds with a spring or fall transit, respectively, and are the only transits during this 2000-year period in our model. As can be seen, the familiar differences 8, 105.5, and 113.5 between successive transit times appear—good news for our model. The entries in the table eight years apart have been grouped as ordered pairs; their associated transits are called twins or doubles. For example, spring transits occur in our model in both year 454 and year 462. For a twin transit, we say the transit member whose path across S comes closer to S’s center is the dominant transit of the two. If a transit has no twin, it is a singleton transit. As can be seen in Figure 7 of the twin transit T454 and T462 , T462 is the dominant member. T227 is a singleton. In section 7, we show how to modify our model to simulate actual transit dates.

dominan

t twin

0.005 T462 0.005 T454

Figure 7. A twin pair of descending spring transits

Meanwhile, in looking for a pattern with respect to the data of Table 1, the reader may notice that W (0), W (802.5), and W (1605) are all almost (0, 0). Have we stumbled across a characteristic time period for which the data repeats? To answer, we b of this data as T b = 1605 years and argue that a more define the practical period T natural period exists for three reasons. b is related to σ0 . • It is unclear how T b explains the time lapse between successive transits. • It is unclear how a period of T b should • Since the time lapse between twin transits is 8 years, it seems likely that T somehow be related to 8, but how? In the next section, we find a natural period and demonstrate that the practical and natural periods are related. 5. RECOGNIZING THE PATTERN. To find a more natural transit period, we focus on spring transits for a season; from Table 1, we drop the fall transit dates, and are left with Table 2. When we refer to the spring transit year n j from the table, where Table 2. Spring transits

0

j transit year n j

1

2

3

4

5

6

7

8

0 227 (454, 462) 689 916 (1143, 1151) 1378 1605 1832

n j mod 8

0

3

6

1

4

7

2

5

0

3 j mod 8

0

3

6

1

4

7

2

5

0

April 2014]

PERIODICITY DOMAINS AND THE TRANSIT OF VENUS

This content downloaded from 92.82.8.190 on Sun, 30 Mar 2014 17:28:30 PM All use subject to JSTOR Terms and Conditions

289

j ≥ 0, we mean term- j in row 2 or the dominant transit year if the term is a twin. For example, n 2 = 462, as evidenced by Figure 7. Observe that the first eight spring transits comprise a complete residue set modulo 8. Furthermore, n j mod 8 just happens to be 3 j mod 8, which suggests that the relative motion of the planets induces a linear shuffling of the transit year residues modulo 8. We thus refer to 3 as a shuffling factor. To help understand this 8-fold dynamic, observe that every eight years both E and V pass each other not far from where they had passed each other eight years before, with V a bit further ahead of E each time. We say that the arc given by W (n years ± 1 week) is rung-n in a ladder of arcs. As the years go by, these rungs step monotonically upward (or downward) to a climax before reversing their progression, with rung-8n being more or less either above or below rung-8(n + 1) for all integers n. Near the spring transit years, neighboring rungs are separated by a distance somewhat more than the radius of S, as illustrated in Figures 4(b), 7, and 8; the dots in Figure 8 represent V ’s projection at t = −16, −8, 0, 8, 16 years. With p = 8, the approximate distance d( p) between neighboring rungs near transit years is the distance between W ( p) and its projection onto W (0+ ), where we take 0+ as one hour, is W ( p) · W (0+ ) + W (0 ) (14) d( p) = W ( p) − ≈ 0.00672 AU. W (0+ ) · W (0+ ) Since s < d( p) < 2s, then a sequence of at most two successive rungs may cross the face of S, whereas if a rung crosses near the center of S, then only one rung in that succession of rungs may correspond to a transit.

Sun −16

−8

0

8

16

Figure 8. V ’s projection as given by W (t) near t = −16, −8, 0, 8, 16

When we extend the data as given in Table 2, the data seems to sort itself. That is, plotting {(n, W1 (n))}n≥0 corresponding to the times when E is at its spring nexus point shows a hodge-podge of dots across 100 years in Figure 9(a). Yet, when we look at a longer period of time, the trend is clear. Figure 9(b) displays the data across 2000 years. It appears as if V ’s projection when sampled at E’s spring nexus point lies on one of eight branches through the data, each of which appear to be uniformly spaced translates of one another. AU 1.0

AU 1.0

0.5

0.5 20

40

60

80

100

years

500

−0.5

−0.5

−1.0

−1.0 (a) A hodge-podge of dots

1000

1500

2000

years

(b) A better perspective

Figure 9. Horizontal component of V ’s projection at E’s spring nexus over time

290

c THE MATHEMATICAL ASSOCIATION OF AMERICA [Monthly 121

This content downloaded from 92.82.8.190 on Sun, 30 Mar 2014 17:28:30 PM All use subject to JSTOR Terms and Conditions

By (5) and (6), finding the periodicity present within {(n, W1 (n))}n≥0 is equivalent to finding the periodicity present within D(σ ) = {(n, sin(2π σ n))}n≥0 , as n ranges over integer values. Figure 10 shows that when restricted to the years 8n where n is an integer—and when adjacent points are connected by line segments—both curves display the same periodicity for σ = σ0 . The curves appear to have a root near t ≈ 917, but no spring transit occurs at either 912 = 8(114) or 920 = 8(115) years, because in our model V and E are on opposite sides of S at both times. However, near the next root t ≈ 1834, a transit occurs at n = 1832 = 8(229) years, but not at 1840, because V ’s projection falls just outside S’s disk in that year. AU 1.0

(8n, W1 (8n)) (8n, sin(2π σ (8n))

0.5 500

1000

1500

years

−0.5 −1.0 Figure 10. Paths through W1 (t) and sin(2πσ t) when t = 8n years, σ = σ0

Can we find curves y j = sin(α(t − β j)), where α and β are real numbers and j is an integer, 0 ≤ j ≤ 7, which characterize D(σ0 )? That is, we seek a period T , with T = 2π and β = T8 , for which T is near 1834 and where y j passes through all points on α branch- j of D(σ0 ). Observe that the values of sin(2π(σ )8n) and sin(2π(σ − m8 )8n) agree for all integers m. In particular, for the integer m for which m8 is nearest σ , namely, m = 13, we see that defining α and T so that α 13 365.26 13 1 = =σ− ≈ − ≈ 0.000545171 T 2π 8 224.70 8

(15)

indeed gives the natural period of D(σ0 ) as T =

2π 2π ≈ ≈ 1834.29 years, α 0.00342541

(16)

β=

T T 1834.29 = ≈ ≈ 229.286 years. p 8 8

(17)

which means that

b = 1605 years by 7, When we divide the practical period T b T ≈ 229.286 ≈ β. 7 That is, T ≈ April 2014]

8b T. 7

PERIODICITY DOMAINS AND THE TRANSIT OF VENUS

This content downloaded from 92.82.8.190 on Sun, 30 Mar 2014 17:28:30 PM All use subject to JSTOR Terms and Conditions

291

Hence, the practical period of 1605 just happens to be a lucky seven integer multiple of the phase shift in the branches of the natural period. To verify the fourth row of Table 2, that 3 is the shuffling factor, observe by Figure 9(b) that n j is the first component in that point belonging to branch- j of D(σ ), which is nearest the first nonnegative root of y = sin(α(t − β j)), with 0 ≤ j ≤ 7. Hence, for a given j, we wish to find the residue r j of n j modulo p, where 0 ≤ j ≤ p − 1 and 0 ≤ r j ≤ p − 1, so that sin(2πσ t) = sin(α(t − jβ))

(18)

for all times t = pn + r j , for all integers n, with p = 8. By the pigeonhole principle, since there are eight branches and eight primitive residues, r j is unique for each j. Furthermore, by the affine nature of the arguments of sine in (18), it is sufficient to show that (18) has a solution for j = 1, which means that we must solve sin 2πσ ( pn + r ) = sin α( pn + r − β) (19) for r , where r = r1 and p = 8. By (15) through (17), (19) becomes 2π (13r )(2π ) = sin α(8n + r ) − . sin α(8n + r ) + 26π n + 8 8 Therefore, solving 13r ≡ −1 mod 8

(20)

gives the unique solution r = 3 for (19). Furthermore, generalizing the above argument demonstrates that the shuffling factor r in (19) remains at r = 3 for all σ 6 = 138 , for which σ − 13 < 1 = 1 , 8 32 4p a range of angular velocities called the periodicity domain of 138 . By an interval punctured by x, we mean a disconnected set of real numbers J whose union with {x} is an interval. Thus, the periodicity domain of 13 is an interval punctured by 138 . The reason 8 for excluding 13 from its periodicity domain is that its corresponding α and β would 8 be 0 and ∞, respectively. To account for arbitrary relative positions of E and V in their orbits about S, we imagine that at time t = 0, V is δ years ahead of its last rendezvous with its spring nexus, while E is at its spring nexus. Each of the branches characterizing V ’s projection undergo a phase shift , where sin(2πσ (8n + δ)) must equal sin(α(8n + )); by ) = α, which means that (15), one way for this to occur is when (2πδ)( 13 8 =

qδT 13δT = , p 8

where p = 8 and q = 13. Therefore, we have an algorithm for characterizing all spring singleton transits and all dominant members of spring twin transits, where δ is an orbital phase angle shift between V and E, p = 8 is the apparent periodicity of D(σ ), r = 3 is the shuffling factor among the year residues modulo p as given by (20), and q is the rational number close to σ as given by (15). p 292

c THE MATHEMATICAL ASSOCIATION OF AMERICA [Monthly 121

This content downloaded from 92.82.8.190 on Sun, 30 Mar 2014 17:28:30 PM All use subject to JSTOR Terms and Conditions

The Transit Rule. Let k, n, and j be integers, 0 ≤ j < p. A spring transit occurs at integer year m near time (k − qp δ)T + β j if and only if m = pn + ( jr mod p) and m is no further from (k − qp δ)T + β j than either m − p or m + p. If either m − p or m + p is a transit year as well, then m is the dominant member of the twin. To ascertain whether m ± p is also a spring transit, simply utilize the decision rule (13). Example 1. To illustrate the transit rule, let δ = 0, k = 3, and j = 5. Since 3 j mod 8 = 7, we want to find the transit year m = 8n + 7 closest to kT + jβ ≈ 6649.3. Then m = 8(830) + 7 = 6647, while m + 8 = 8(831) + 7 = 6655. That is, year 6647 is a singleton transit, while year 6655 is a near-miss, as shown in Figure 11(a).

T4754

T6655

T4746

T6647 (a) Spring transit near 3T + 5β

(b) Spring transit near (2 −

13 )T 80

Figure 11. Checking the transit algorithm

+ 6β

Example 2. This time, let δ = 0.1, k = 2, and j = 6. Since 3 j mod 8 = 2, we want to find the transit year m = 8n + 2 closest to (2 − 0.1( 138 ))T + 6β ≈ 4746.2. Then m = 8(593) + 2 = 4746, while m + 8 = 8(594) + 2 = 4754. That is, year 4746 is a singleton transit, while year 4754 is far from being a transit, as shown in Figure 11(b). As for fall transits, a similar rule applies, except that the eight branches through the data corresponding to time n + 21 are 1 y j = sin α t − β j + + . 2 6. VARYING VENUS’S ANGULAR VELOCITY. The key behind the transit rule is recognizing that D(σ0 ) consists of eight components or branches. Thus we say that the periodicity of D(σ ) is the integer p if D(σ ) appears to fall into p branches. To formalize what is meant by appears, for each positive integer η, we define N (η) as the maximal integer n for which {sin(2πσ η j)}nj=0 is monotonic. Intuitively, N (η) counts the number of rungs from a transit to a climax. We further define the periodicity quotient Q(σ, η) as N (η) , Q(σ, η) = η which gives a measure of normalization among the values of N (η). We say that the apparent periodicity of D(σ ) is p, if Q(σ, p) appears to approach the maximum of {Q(σ, η)|η ∈ Z + }. April 2014]

PERIODICITY DOMAINS AND THE TRANSIT OF VENUS

This content downloaded from 92.82.8.190 on Sun, 30 Mar 2014 17:28:30 PM All use subject to JSTOR Terms and Conditions

293

Table 3. The periodicity of D(σ0 ) appears to be 8.

η

Q(σ0 , η)

η

Q(σ0 , η)

η

Q(σ0 , η)

1 2 3 4 5 6 7 8 9 10

1 0 1 0 0 0 0 7 0 0

11 12 13 14 15 16 17 18 19 20

0 0 0 0 0 1 0 0 0 0

21 22 23 24 25 26 27 28 29 30

0 0 0 0 0 0 0 0 0 0

The first few values of Q(σ0 , η) are given in Table 3, with the nonzero periodicity quotients in boldface. When extending this table indefinitely as far as a typical CAS allows, it appears as if Q(σ, η) = 0 for all η > 16. From such evidence, and since the maximum quotient among this η range is 7 and corresponds to η = 8, we conclude that D(σ0 ) has apparent periodicity 8. Let ν be a number between 0 and 0.5. Observe that Q(ν, η) = Q(n + ν, η) for all integers n. Because sine is an odd function, Q(ν, η) = Q(1 − ν, η). Therefore, the only σ values for which we need to evaluate Q(σ, η) are those in the range 0 ≤ σ ≤ 12 , or, equivalently, the range 1.5 ≤ σ ≤ 2, the reference interval containing σ0 . Armed with the use of the measure Q we ask, how far may we perturb σ from σ0 and yet have apparent periodicity remain invariant? Q(σ, 8) 12

10 8 6 4 2 1.615

1.620

1.625

1.630

1.635

σ

Figure 12. The range of 8-fold apparent periodicity

If Q(σ, η) ≥ 3, we say that D(σ ) displays significant apparent periodicity η. From Figure 12, we see that D(σ ) displays significant apparent periodicity 8 on the interval . Plots of D(1.6237) and D(1.6263) are much (1.6237, 1.6263) punctured by σ = 13 8 like Figure 13(a), in which an 8-fold periodicity is less pronounced than in Figure 9(b). As σ approaches 138 = 1.625, Q(σ, 8) goes to ∞. For example, Q(1.6251, 8) = 39; this strong apparent periodicity 8 is illustrated in the graph of D(1.6251) in Figure 13(b). Of course, when σ = 13/8, the eight √ branches collapse into five parallel lines corresponding to the sine values 0, ±1, ± 2/2, which means that Q( 13 , 8) = 0. We 8 therefore say that the domain of significant periodicity for 13/8 is an interval of angular velocities ν punctured by 13 , for which Q(ν, 8) is at least 3. 8 294

c THE MATHEMATICAL ASSOCIATION OF AMERICA [Monthly 121

This content downloaded from 92.82.8.190 on Sun, 30 Mar 2014 17:28:30 PM All use subject to JSTOR Terms and Conditions

1.0

1.0

500

1000

1500

2000

−1.0

500

1000

1500

2000

−1.0 (a) D(1.6263)

(b) D(1.6251)

Figure 13. A weak and a strong apparent periodicity 8

What about periodicity domains for other σ values, such as σ = 35 , 74 , 12 , 17 , and 7 10 as shown in Figure 14? It should come as no surprise that for σ values taken within the significant periodicity domains of these numbers, D(σ ) will exhibit apparent periodicity of 3, 4, 7, 10, and 11, respectively. For example, the data set D(1.714) in Figure 15(a) shows apparent periodicity 7, and is well within the significant periodicity domain of 127 ≈ 1.71429. 19 , 11

13/8 8

5/3

17/10 12/7 19/11 7/4

6

4

2

1.65

1.70

1.75

1.80

σ

Figure 14. Domains of periodicity

1.0

1.0

500

1000

1500

2000 4000 6000 8000 10000 12000

2000

−1.0

−1.0 (a) D(1.714)

(b) {n, W1 (t)}n≥0 where σ =

√ 11 2 10

Figure 15. Apparent periodicities 7 and 9

April 2014]

PERIODICITY DOMAINS AND THE TRANSIT OF VENUS

This content downloaded from 92.82.8.190 on Sun, 30 Mar 2014 17:28:30 PM All use subject to JSTOR Terms and Conditions

295

The next example is an application of the transit rule corresponding to an apparent periodicity other than 8. √

Example 3. Let σ = 1110 2 ≈ 1.55563. The plot of D(σ ), Figure 15(b), shows that its apparent periodicity is p = 9. Since 149 is that fraction of integers with denominator 9 nearest σ , the analog of (15) is σ−

14 α 1 = = , 9 2π T

which gives T ≈ 12,600.3 years. Solving (19) gives the shuffling factor r = 7 rather than 3. Now let δ = 0, k = 0, and j = 5, which means that we are looking for a transit year with residue jr mod 9 ≡ 8 near time 5β = 5T /9 ≈ 7000.17. Thus, m = (777)(9) + 8 = 7001 is a transit year. With this new value of σ , V has receded from S, so the distance d(9) between the rungs has changed to d(9) ≈ 0.0014 by (14), which means that we have more than twin transits; in fact we have septuplets, as shown in Figure 16(a).

Y T7028 T7019 T7010

linear model approximation of the June 2012 transit

Z

actual June 2012 transit path

T7001 T6992

T6983 T6974

√ 11 2 (a) A transit family of septuplets, σ = 10

(b) Hunting for a phase angle δ

Figure 16. Transits with σ other than σ0

7. A REALITY CHECK. How does our model contrast with reality? A phenomenon omitted thus far from our transit model is the tendency of objects to rotate—including the orbital planes of V and E, a feature called precession. The values τe and τv used to define σ0 are the periods of the two planets with respect to the background of the fixed stars. To adapt our model appropriately, we must incorporate slightly different periods, namely, the time it takes for a planet to return to its aphelion. Since E precesses faster than V , as time goes on the nexus line rotates and hence spring and fall transits occur later in the year. Because precession rates are tiny compared to σ0 , we arbitrarily take σ0 ≈ 1.625550000. Meeus [5, p. 13] predicts that “an almost exactly central transit will take place on 11 July 5900”—a transit through S’s center. Thus from 2012 to 5900, the spring transit has now become a summer transit, having slipped forward by about 35 days during a lapse of 3888 years, which means that the change in the relative orbital speeds of V and E with respect to the nexus line is 1σ ≈ 35σ0 ≈ 0.0000397559, which means that we might try the new angular velocity σ1 = 3888τe σ0 − 1σ ≈ 1.625510244. Next, we need a phase shift δ to start our model. From [5, p. 48], the transit of 6 June 2012 crossed S’s boundary at Y ≈ 39.45◦ and at Z ≈ 291.4◦ measured counterclock296

c THE MATHEMATICAL ASSOCIATION OF AMERICA [Monthly 121

This content downloaded from 92.82.8.190 on Sun, 30 Mar 2014 17:28:30 PM All use subject to JSTOR Terms and Conditions

wise from the top of S, shown as a dotted line in Figure 16(b). Adjusting (1) and (5) so that the trigonometric arguments 2πσ t are replaced by 2π σ (t + δ), where δ is an indeterminate phase shift, and using a search method to find δ by dynamically plotting W (t − 2012) near t = 2012, yields the solid-line transit in Figure 16(b), suggesting that δ ≈ 0.00102 is a good match. The reason that the two transit lines are non-parallel is because E’s and V ’s actual orbits have positive eccentricity. When we apply (13) in this adjusted model for the years from 700 to 3000 AD, we find the promising spring transit Gregorian year possibilities of Table 4. The underlined years indicate a match between our results and Meeus’s. Not bad for a linear model. But can we do better? Table 4. The linear model versus Meeus’s Model

This linear model

(781, 789) (2004, 2012)

(1024, 1032) 2255)

1275 2498

1518 2741

(1761, 1769) (2984, 2992)

Meeus’s model

(789, 797) (2004, 2012)

(1032, 1040) (2247, 2255)

(1275, 1283) (2490, 2498)

(1518, 1526) (2733, 2741)

(1761, 1769) (2976, 2984)

To do so, we work backward through the transit rule and find a magic angular velocity. Since σ1 is within the periodicity domain of 138 , the corresponding shuffling factor is r = 3. We make use of a second unusual spring transit year, 183 BC, whose corresponding transit Meeus describes as “almost central.” The difference between 5900 AD and 183 BC is 6083 years. Identify t = 0 with year 5900. Thus, year 183 BC is referenced by t = −6083 = 8(−761) + 5, which means that 5 ≡ 3 j mod 8, whose solution is j = 7. Using the angular velocity σ1 with (15), the associated period is T1 ≈ 1959.85. We then solve kT1 + 7T81 = −6083, getting k ≈ −3.98. Next, reset k as k = −4, and solve (k + 87 )T2 = −6083, getting T2 = 48664 . By (15), 25 σ2 =

1 13 25 13 9888 + = + = ≈ 1.6255137267795495644. T 8 48664 8 6083

When we generate transits by the transit rule using angular velocity σ2 across the years 2000 BC to 4000 AD, we get an exact match with actual spring transits from Meeus’s results. Table 5. Spring transit years, generated by the transit rule

1884 BC 1641 BC 1398 BC 1155 BC 912 BC 669 BC 426 BC 183 BC 546 2984

∗

60

303

789

1032

1275

1518

1761

2004

2247

2490 2733

3227

3470

3713

3956

4199

4442

4685

4928 5171

As can be seen, the difference between successive entries in Table 5 is 243 years, except when passing from 2733 to 2984, the year marked with an asterisk. The match between these two approaches with respect to the recessive partner in twin transits is less spectacular. 8. SOME PARTING OBSERVATIONS AND QUESTIONS. What we have shown is that the cycle of transits is the way it is because V ’s angular velocity σ0 is enmeshed within the periodicity domain of 138 . This in turn induces a modulo 8 shuffling of successive transit years by a factor of 3, a phenomenon reflected in the 6000-year standard April 2014]

PERIODICITY DOMAINS AND THE TRANSIT OF VENUS

This content downloaded from 92.82.8.190 on Sun, 30 Mar 2014 17:28:30 PM All use subject to JSTOR Terms and Conditions

297

tables of transits generated by Meeus [5], provided we partition transits into two families: spring transits and fall transits, and discard one of the years from each twin transit. With respect to the notion of periodicity domains, some natural questions arise. Does every D(σ ) have a well-defined apparent periodicity? For a challenge, try σ = √ 3. What happens when σ wanders into overlapping periodicity domains? The reality is a war-torn fractal-like dominance landscape foreshadowed in part by Figure 14. As 35 ≈ 0.673077 exerts its 52-ness dominance over its immediate a simple example, 52 neighbors. Yet, it is well within the dominance of 32 ; an examination of D( 35 ) shows a 52 clear 3-fold periodicity, and the periodicity quotient Q( 35 , 3) = 4 supports this result. 52 35 However, Q( 52 − 0.000001, 52) = 92 and a plot of its corresponding data set suggests periodicity 52. With respect to permanence, in the life cycle of S, S slowly loses mass and swells to giant status and so the orbits of the planets recede from S, which means that the transit cycle for V may change dramatically. The rational numbers with small integer denominator near 138 in increasing order are

3 11 8 29 21 13 31 18 23 28 33 5 7 , , , , , , , , , , , , . 2 7 5 18 13 8 19 11 14 17 20 3 4

A billion or two years from now, the natural periodicity of the Venus transit may change from 8 to 13 or 19. Hopefully, humans will yet be here to see. For an application of the ideas of this paper to the phases of the Moon, see [8]. Just as the transit of Venus involves the periodicity domain of 138 , so too the phases of the . Moon involve the periodicity domain of another fraction, this time 235 19 ACKNOWLEDGMENT. Thanks to Osmo Pekonen for asking me to write a review [7] of [11] which in turn sparked this project.

REFERENCES 1. G. K. Chesterton, Heretics. Reprint of the 1905 edition, Books for Libraries Press, Freeport, NY, 1970. 2. M. Danlous-Dumesnils, P´eriodicit´e des passages de V´enus, L’Astronomie 91 (1977) 117–127. 3. F. Espenak, Six millenium catalog of Venus transits, NASA, 2013, available at http://eclipse.gsfc. nasa.gov/transit/catalog/VenusCatalog.html. 4. E. Halley, A new method of determining the parallax of the Sun, or his distance from the Earth, in The Abridged Transactions of the Royal Society 6 (1809) 243–249. 5. J. Meeus, Transits. William-Bell Press, Richmond, VA, 1989. 6. , The transits of Venus, 3000 BC to AD 3000, Journal of the British Astronomical Association 68 (1958) 98–108. 7. A. Simoson, A review of [11], Math. Intel. 35 (2013) 84–85. 8. , Bilbo and the last moon of autumn, to appear in Math Horizons. 9. D. A. Teets, Transits of Venus and the astronomical unit, Math. Mag. 76 (2003) 225–348. 10. H. Woolf, The Transits of Venus: A Study of Eighteenth Century Science. Princeton University Press, Princeton, NJ, 1959. 11. A. Wulf, Chasing Venus: the Race to Measure the Heavens. Alfred Knopf Press, New York, 2012. ANDREW J. SIMOSON is a long time professor of mathematics at King University. Recently he stumbled upon a pertinent Chesterton quote, “Men take thought and ponder rationalistically touching remote things— things that only theoretically matter, such as the transit of Venus” [1, p. 141]. King University, 1350 King College Road, Bristol, TN 37620 [email protected]

298

c THE MATHEMATICAL ASSOCIATION OF AMERICA [Monthly 121

This content downloaded from 92.82.8.190 on Sun, 30 Mar 2014 17:28:30 PM All use subject to JSTOR Terms and Conditions

A Drug-Induced Random Walk Author(s): Daniel J. Velleman Source: The American Mathematical Monthly, Vol. 121, No. 4 (April), pp. 299-317 Published by: Mathematical Association of America Stable URL: http://www.jstor.org/stable/10.4169/amer.math.monthly.121.04.299 . Accessed: 30/03/2014 17:28 Your use of the JSTOR archive indicates your acceptance of the Terms & Conditions of Use, available at . http://www.jstor.org/page/info/about/policies/terms.jsp

. JSTOR is a not-for-profit service that helps scholars, researchers, and students discover, use, and build upon a wide range of content in a trusted digital archive. We use information technology and tools to increase productivity and facilitate new forms of scholarship. For more information about JSTOR, please contact [email protected].

.

Mathematical Association of America is collaborating with JSTOR to digitize, preserve and extend access to The American Mathematical Monthly.

http://www.jstor.org

This content downloaded from 92.82.8.190 on Sun, 30 Mar 2014 17:28:51 PM All use subject to JSTOR Terms and Conditions

A Drug-Induced Random Walk Daniel J. Velleman

Abstract. The label on a bottle of pills says “Take one half pill daily.” A natural way to proceed is as follows: Every day, remove a pill from the bottle at random. If it is a whole pill, break it in half, take one half, and return the other half to the bottle; if it is a half pill, take it. We analyze the history of such a pill bottle.

1. INTRODUCTION. A few years ago our cat Natasha (see Figure 1) began losing weight. We took her to the vet, who did some tests and determined that she had a thyroid condition. He gave us a bottle of pills and told us to give her half a pill every day.

Figure 1. Natasha

The next day we shook a pill out of the bottle, broke it in half, gave her half of the pill, and put the other half back in the bottle. We repeated that procedure for several more days. Eventually, a day came when the pill we shook out of the bottle was one of the half pills we had put back in on one of the previous days. Of course, we just gave her the half pill that day. We continued to follow this procedure until the bottle was empty, and then we started on a new bottle. The pills solved Natasha’s medical problem; she regained the weight she had lost, and she’s doing fine now. But they created an interesting mathematical problem. The state of the pill bottle on any day can be described by a pair of numbers (w, h), where w is the number of whole pills in the bottle and h is the number of half pills. We will assume that every day a pill is removed from the bottle at random, with each pill being equally likely to be chosen. When a whole pill is removed, it is cut in half and half of it is returned to the bottle; when a half pill is removed, nothing is returned to the bottle. Thus, if the state of the pill bottle on a particular day is (w, h), then with probability w/(w + h) the state on the next day will be (w − 1, h + 1), and with http://dx.doi.org/10.4169/amer.math.monthly.121.04.299 MSC: Primary 60G50, Secondary 65L05

April 2014]

A DRUG-INDUCED RANDOM WALK

This content downloaded from 92.82.8.190 on Sun, 30 Mar 2014 17:28:51 PM All use subject to JSTOR Terms and Conditions

299

probability h/(w + h) it will be (w, h − 1). This means that the state of the pill bottle executes a random walk in the plane, starting at the point (w, h) = (n, 0), where n is the initial number of pills in the bottle, and ending at (0, 0). Since the bottle contains 2n doses of medicine, the walk takes 2n steps. For example, Figure 2 shows a computer simulation of a pill-bottle walk starting with n = 20 pills. On the first three days, whole pills are removed from the bottle, and the state of the bottle goes from (20, 0) to (19, 1), (18, 2), and (17, 3). The next day, a half pill is removed, and the state goes to (17, 2). And the walk continues for 36 more steps until it ends at (0, 0). h 8 7 6 5 4 3 2 1 5

10

15

20

w

Figure 2. A pill-bottle walk with n = 20

Figure 3 shows simulated walks with n = 100, n = 1000, and n = 10000. It appears that although the walks are random, the overall shapes of the walks are similar, with the shape becoming smoother as n increases. Notice that the scales of the three walks in Figure 3 are different; the first starts at (100, 0), the second at (1000, 0), and the third at (10000, 0). It is only when they are drawn the same size that they look similar. This suggests that we should rescale the walks to a uniform size, independent of n. We will therefore switch to a new coordinate system. If we let x = w/n and y = h/n, then x represents the fraction of the original n pills that are still whole, and y represents the fraction that have become half pills. Notice that these fractions may add up to less than 1, since some fraction of the pills may have been used up completely. h 30 20 10

300 200 100 25

50

75

100

w

250

500

750

w 1000

3000 2000 1000 w 2500 5000 7500 10 000 Figure 3. Walks with n = 100 (top left), n = 1000 (bottom), and n = 10000 (top right)

300

c THE MATHEMATICAL ASSOCIATION OF AMERICA [Monthly 121

This content downloaded from 92.82.8.190 on Sun, 30 Mar 2014 17:28:51 PM All use subject to JSTOR Terms and Conditions

Using the coordinates (x, y) to represent the state of the pill bottle, we get a random walk that starts at (1, 0), ends at (0, 0), and stays in the triangle x + y ≤ 1, x ≥ 0, y ≥ 0. When the state is (x, y), it changes as follows: • with probability x , the state changes to x − 1 , y + 1 ; x+y n n • with probability y , the state changes to x, y − 1 . x+y n We will call such a walk an n-walk. Increasing n does not make the walk larger, but it makes the steps smaller. Figure 3 suggests that as n increases, the walk approaches a smooth curve. What is this curve? The limit curve we seek is an example of a scaling limit of a discrete process. Perhaps the best-known example of a scaling limit is Brownian motion, which can also be thought of as the scaling limit of a random walk. For more on Brownian motion and scaling limits, see [5]. We first give an intuitive argument that suggests a possible answer to our question. We will find it helpful to introduce a third variable t, standing for time. We set t = 0 at the beginning of the walk, and to keep the scales of the variables comparable we will assume that t increases by 1/n for each step of the walk. Since the walk consists of 2n steps, this means that t will run from 0 to 2. We think of the limit curve as being given by parametric equations x = f x (t),

y = f y (t),

0 ≤ t ≤ 2,

or, in vector notation, (x, y) = ( f x (t), f y (t)) = f(t),

0 ≤ t ≤ 2.

When the state of an n-walk is (x, y), the displacement to the next state is either the vector (−1/n, 1/n), with probability x/(x + y), or (0, −1/n), with probability y/(x + y). Thus, the expected value of the displacement is x x+y

x−y 1 1 y 1 1 x , − , + 0, − = − . n n x+y n n x+y x+y

Since t increases by 1/n during the step, this suggests that the parametric form of the limit curve might be a solution to the system of differential equations x dx =− , dt x+y

dy x−y = . dt x+y

(1)

To solve this system of equations, we first note that dy dy/dt x−y y = =− = −1 + . dx d x/dt x x We will let you check that the curve y = −x ln x satisfies this equation for 0 < x ≤ 1 and passes through the point (1, 0). The graph of this curve is shown in Figure 4, and the similarity to the walks in Figure 3 is striking. Notice that although ln 0 is undefined, limx→0+ (x ln x) = 0. From now on we consider 0 ln 0 to be equal to 0, so that the curve y = −x ln x includes the point (0, 0). April 2014]

A DRUG-INDUCED RANDOM WALK

This content downloaded from 92.82.8.190 on Sun, 30 Mar 2014 17:28:51 PM All use subject to JSTOR Terms and Conditions

301

y 0.3 0.2 0.1 0.2

0.4

0.6

0.8

1

x

Figure 4. The graph of y = −x ln x

Substituting y = −x ln x in the first equation in (1), we get x 1 dx =− = . dt x − x ln x ln x − 1 Separation of variables gives Z t = (ln x − 1) d x = x ln x − 2x + C. Since x = 1 when t = 0, we must have C = 2, and therefore t = x ln x − 2x + 2.

(2)

Let g(x) = x ln x − 2x + 2 for 0 ≤ x ≤ 1. (Notice that by our convention that 0 ln 0 = 0, we have g(0) = 2.) Then g maps [0, 1] onto [0, 2] and is strictly decreasing, so it has an inverse. We define f x to be the inverse of g, which is a strictly decreasing function mapping [0, 2] to [0, 1]. Thus, if 0 ≤ t ≤ 2 and x = f x (t), then x and t satisfy equation (2).1 Using y = −x ln x, we can rewrite equation (2) as t = −y − 2x + 2, or equivalently y = 2 − 2x − t. We therefore define f y (t) = 2 − 2 f x (t) − t.

(3)

We leave it to you to verify that the equation (x, y) = ( f x (t), f y (t)) = f(t),

0≤t ≤2

(4)

parametrizes the curve y = −x ln x shown in Figure 4, and it satisfies the differential equations (1) for 0 ≤ t < 2, where we interpret the derivatives at t = 0 as one-sided derivatives. (At t = 2, we have x = y = 0, and therefore the right-hand sides of the equations in (1) are undefined.) The graphs of f x and f y are shown in Figure 5. It turns out that an n-walk does, indeed, approach the curve (4) as n approaches ∞, but the sense in which this is true must be stated carefully. Our main theorem is the following. 1 Using

the Lambert W function W−1 (see [1]), we can express f x (t) explicitly by the equation f x (t) =

t −2 . W−1 ((t − 2)/e2 )

However, we will not have any use for this expression.

302

c THE MATHEMATICAL ASSOCIATION OF AMERICA [Monthly 121

This content downloaded from 92.82.8.190 on Sun, 30 Mar 2014 17:28:51 PM All use subject to JSTOR Terms and Conditions

1

x

y

0.8

0.3

0.6

0.2

0.4 0.1

0.2 0.5

1

1.5

2

t

0.5

1

1.5

2

t

Figure 5. The graphs of x = f x (t) (left) and y = f y (t) (right)

Theorem 1. Suppose that > 0. Let the points on an n-walk be p0 = (1, 0), p1 , . . . , p2n = (0, 0), and for 0 ≤ i ≤ 2n let ti = i/n. Then the probability that for every i, kpi − f(ti )k < approaches 1 as n → ∞. In other words, the n-walk converges uniformly in probability to the limit curve. Two notable features of the limit curve are that the tangent line at (1, 0) has slope −1, and the tangent line at the origin is vertical. The first feature makes intuitive sense: early in the walk, almost all of the pills in the bottle are whole pills, so it is likely that several whole pills will be removed before the first half pill is removed. For example, in the walk in Figure 2, three whole pills were removed before the first half pill was removed. When these initial whole pills are removed, the walk will move along the line y = 1 − x, which is the tangent line at (1, 0). The second feature seems more surprising: it appears that near the end of the walk, almost all of the pills are half pills, and the walk ends by moving along the line x = 0 toward the origin. This suggests two questions. Question 1. For a bottle of n pills, what is the expected number of whole pills that are removed from the bottle before the first half pill is removed? Question 2. For a bottle of n pills, what is the expected number of half pills that are removed from the bottle after the last whole pill is removed? Versions of Question 1 have appeared in the literature before (see, for example, [3, 4, 6, 8]). In the case n = 365, it is equivalent to the following version of the birthday problem: If people are chosen at random, one by one, what is the expected number of people with distinct birthdays who will be chosen before the first person who has the same birthday as a previously chosen person? We will give an elementary derivation of the answer to Question 1. In our next theorem, we express the answer in terms of the incomplete gamma function, which is defined as follows, 0(a, x) =

∞

Z

t a−1 e−t dt. x

Theorem 2. For a bottle of n pills, the expected number of whole pills that are removed from the bottle before the first half pill is removed is en 0(n, n). n n−1 April 2014]

A DRUG-INDUCED RANDOM WALK

This content downloaded from 92.82.8.190 on Sun, 30 Mar 2014 17:28:51 PM All use subject to JSTOR Terms and Conditions

303

As n → ∞, this expected value is asymptotic to r πn . 2 The answer to Question 2 was found by Richard Stong. Theorem 3 (Stong). For a bottle of n pills, the expected number of half pills that are removed from the bottle after the last whole pill is removed is the nth harmonic number, Hn = 1 +

1 1 1 + + ··· + . 2 3 n

For example, for a bottle of 100 pills, the expected number of whole pills before the first half pill is e100 0(100, 100) ≈ 12.21, 10099 and the asymptotic approximation in Theorem 2 is r

100π ≈ 12.53. 2

The expected number of half pills after the last whole pill is H100 ≈ 5.19. The rest of this paper is devoted to the proofs of Theorems 1–3. We prove Theorem 1 in Section 3, and Theorems 2 and 3 in Section 4. We consider variations on these theorems in Section 5. 2. BACKGROUND FOR PROOF OF THEOREM 1. In preparation for the proof of Theorem 1, we simplify the problem by eliminating one variable. According to definition (3), f y (t) = 2 − 2 f x (t) − t, so f(t) = ( f x (t), 2 − 2 f x (t) − t) = f x (t)(1, −2) + (0, 2 − t). A similar equation holds for the points on any n-walk. Suppose that after i steps, the n-walk is at the point pi = (xi , yi ), and let ti = i/n. This means that there are wi = nxi whole pills and h i = nyi half pills in the bottle. These pills are enough for 2wi + h i doses of medicine. Since there were 2n doses in the bottle originally, and i of those doses have been used up, there must be 2n − i doses left. Therefore, 2wi + h i = 2n − i, or equivalently, h i = 2n − 2wi − i. Dividing through by n, we find that yi = 2 − 2xi − ti ,

(5)

and therefore pi = (xi , 2 − 2xi − ti ) = xi (1, −2) + (0, 2 − ti ). 304

c THE MATHEMATICAL ASSOCIATION OF AMERICA [Monthly 121

This content downloaded from 92.82.8.190 on Sun, 30 Mar 2014 17:28:51 PM All use subject to JSTOR Terms and Conditions

It follows that √ kpi − f(ti )k = k(xi − f x (ti ))(1, −2)k = |xi − f x (ti )| 5. Thus, to ensure that pi is close to f(ti ), it will suffice to ensure that xi is close to f x (ti ); we can ignore the y-coordinates of pi and f(ti ). In other words, to prove Theorem 1 it will suffice to prove the following lemma. Lemma 4. Suppose that > 0. Let the x-coordinates of the points on an n-walk be x0 = 1, x1 , . . . , x2n = 0, and for 0 ≤ i ≤ 2n let ti = i/n. Then the probability that for every i, |xi − f x (ti )| < approaches 1 as n → ∞. In fact, using equations (3) and (5), we can completely eliminate the variable y from the problem. We can describe the x-coordinates of the points on an n-walk by saying that xi+1 is equal to either xi − 1/n or xi , with the first possibility occurring with probability xi xi xi = = . xi + yi xi + 2 − 2xi − ti 2 − xi − ti

(6)

Similarly, if x = f x (t) and y = f y (t), then for 0 ≤ t < 2, f x0 (t) =

dx x x f x (t) =− =− =− . dt x+y 2−x −t 2 − f x (t) − t

(7)

Thus, we can work entirely with the points (ti , xi ) and the curve x = f x (t), both of which lie in the t x-plane. The idea behind our proof of Lemma 4 is straightforward. Let m be a large positive integer, and let n be an integer much larger than m. Now consider an n-walk, and break the 2n steps of the walk into m large blocks of steps. We view the n-walk in the t x-plane, ignoring the y-coordinates. The individual steps of the n-walk are random and unpredictable, but the net change in x that results from a large block of steps is more predictable: by the law of large numbers, this net change is likely to be close to its expected value. It will follow that if a block of steps starts at a point (t, x), then the net result of this block of steps is likely to be a small displacement in the t x-plane whose slope is close to −x/(2 − x − t). Since x = f x (t) is a solution to the differential equation d x/dt = −x/(2 − x − t), this means that the steps of the n-walk should stay close to the graph of f x . This proof sketch suggests that our proof will involve ideas related to Euler’s method. Recall that Euler’s method is a numerical method for solving a differential equation of the form f 0 (t) = F(t, f (t)) for a ≤ t ≤ b, with an initial condition f (a) = x0 . Here the function F and the numbers a, b, and x0 are given, and we want to compute values of f . To apply Euler’s method, we choose a positive integer n and a positive step size h ≤ (b − a)/n, let t j = a + j h for 0 ≤ j ≤ n, and then define x j recursively by the equation x j+1 = x j + h F(t j , x j ),

0 ≤ j < n.

Thus, the displacement from (t j , x j ) to (t j+1 , x j+1 ) has slope F(t j , x j ). If h is small and F is sufficiently well-behaved, then the points (t j , x j ) will be close to the graph of f . April 2014]

A DRUG-INDUCED RANDOM WALK

This content downloaded from 92.82.8.190 on Sun, 30 Mar 2014 17:28:51 PM All use subject to JSTOR Terms and Conditions

305

We will need to modify Euler’s method slightly, because according to our proof sketch for Lemma 4, the slope of the displacement caused by a block of steps in the n-walk starting at (t, x) is likely to be close to −x/(2 − x − t), but not exactly equal to it. We will therefore need a version of Euler’s method in which the slope of the displacement at step j is only approximately equal to F(t j , x j ). To make this precise, suppose that a < b, g1 and g2 are functions from [a, b] to R, and for all t ∈ [a, b], g1 (t) < g2 (t). Let D = {(t, x) ∈ R2 : a ≤ t ≤ b and g1 (t) ≤ x ≤ g2 (t)}. Now suppose that F : D → R and f : [a, b] → R, and for all t ∈ [a, b], (t, f (t)) ∈ D and f 0 (t) = F(t, f (t)), where we interpret f 0 (t) as a one-sided derivative when t = a or t = b. Let x0 = f (a). We want to use a version of Euler’s method to locate points (t j , x j ) near the graph of f . As before, we will use a positive step size h ≤ (b − a)/n, so for 0 ≤ j ≤ n we let t j = a + j h. We will assume that for 0 ≤ j < n, the slope of the displacement from (t j , x j ) to (t j+1 , x j+1 ) deviates from F(t j , x j ) by some amount δ j . Thus, we recursively define x j+1 = x j + h(F(t j , x j ) + δ j ). To ensure that this formula is defined, we assume that for every j, g1 (t j ) ≤ x j ≤ g2 (t j ), so that (t j , x j ) ∈ D. Lemma 5. In the modified Euler’s method described above, assume that for 0 ≤ j < n, |δ j | ≤ δ. We also assume that ∂ F/∂ x and f 00 are defined and bounded. Thus, we assume that there are positive constants C1 and C2 such that for all (t, x) ∈ D, ∂ F 00 ∂ x (t, x) ≤ C1 , | f (t)| ≤ C2 . Then for 0 ≤ j ≤ n, |x j − f (t j )| ≤

δ hC2 + 2C1 C1

(1 + C1 h) j − 1 .

(8)

Proof. We proceed by induction on j. Clearly, inequality (8) holds when j = 0, since both sides are 0. Now suppose that the inequality holds for some j < n. By Taylor’s theorem, we can write f (t j+1 ) = f (t j ) + h f 0 (t j ) +

h 2 00 f (c j ) 2

for some number c j between t j and t j+1 . And by the mean value theorem, we have F(t j , x j ) = F(t j , f (t j )) + 306

∂F ∂F (t j , d j )(x j − f (t j )) = f 0 (t j ) + (t j , d j )(x j − f (t j )) ∂x ∂x

c THE MATHEMATICAL ASSOCIATION OF AMERICA [Monthly 121

This content downloaded from 92.82.8.190 on Sun, 30 Mar 2014 17:28:51 PM All use subject to JSTOR Terms and Conditions

for some d j between x j and f (t j ). Thus, x j+1 − f (t j+1 ) = x j + h(F(t j , x j ) + δ j ) − f (t j+1 ) ∂F = x j + h f 0 (t j ) + (t j , d j )(x j − f (t j )) + δ j ∂x h 2 00 0 − f (t j ) + h f (t j ) + f (c j ) 2 h 2 00 ∂F (t j , d j ) + hδ j − f (c j ). = (x j − f (t j )) 1 + h ∂x 2 Next, we take absolute values and apply the bounds given in the statement of the lemma: |x j+1 − f (t j+1 )| ≤ |x j − f (t j )|(1 + C1 h) + hδ +

C2 h 2 . 2

Finally, we apply the inductive hypothesis to conclude that |x j+1 − f (t j+1 )| ≤

hC2 δ + 2C1 C1

C2 h 2 (1 + C1 h) j − 1 (1 + C1 h) + hδ + 2

hC2 δ + 2C1 C1

(1 + C1 h) j+1 − 1 ,

= as required.

3. PROOF OF THEOREM 1. To complete the proof of Theorem 1, we return to our proof sketch for Lemma 4. Unfortunately, nailing down the details of this proof sketch is not easy. Nevertheless, in this section we show that, with some care, a proof based on these ideas can be carried out. Fix > 0. We will refer to the region f x (t) − < x < f x (t) + in the t x-plane as the -corridor. To prove Lemma 4, we must show that for large n, an n-walk is likely to stay entirely inside the -corridor. We first determine simple bounds on any n-walk. At step i of the walk, by (5) we have xi ≥ 0,

2 − 2xi − ti = yi ≥ 0,

and therefore 0 ≤ xi ≤

2 − ti . 2

(9)

Similar bounds apply to the graph of f x : for 0 ≤ t ≤ 2, 0 ≤ f x (t) ≤ 1,

2 − 2 f x (t) − t = f y (t) = − f x (t) ln( f x (t)) ≥ 0,

so 0 ≤ f x (t) ≤ April 2014]

2−t . 2

A DRUG-INDUCED RANDOM WALK

This content downloaded from 92.82.8.190 on Sun, 30 Mar 2014 17:28:51 PM All use subject to JSTOR Terms and Conditions

(10) 307

These simple bounds already imply that the end of the n-walk stays inside the corridor: if ti > 2 − 2, then 0 ≤ xi , f x (ti ) ≤

2 − ti < , 2

and therefore |xi − f x (ti )| < . Thus, we only need to worry about ti in the interval [0, 2 − 2]. In particular, if > 1, then there is nothing more to prove, so we can assume now that ≤ 1. By stopping short of t = 2, we avoid having to deal with the point (t, x, y) = (2, 0, 0) on the limit curve, where the right-hand sides of the equations in (1) are undefined. We will find it convenient to go a bit beyond t = 2 − 2, so we define 2−t , D = (t, x) ∈ R : 0 ≤ t ≤ 2 − and 0 ≤ x ≤ 2

2

and for (t, x) ∈ D we let F(t, x) = −

x . 2−x −t

Notice that for (t, x) ∈ D, 2−x −t ≥2−

2−t 2−t −t = > 0, 2 2

(11)

so F(t, x) is defined. By (9) and (10), any n-walk and the curve x = f x (t) both stay in the region D up to time t = 2 − , and by (7), if 0 ≤ t ≤ 2 − , then f x0 (t) = F(t, f x (t)). Thus, it makes sense to apply Lemma 5 to the functions F and f x on the region D. In preparation for this, we make some observations about these functions. We first note that by (11) and the definition of D, for (t, x) ∈ D we have 2−x −t ≥

2−t ≥ x ≥ 0. 2

Since F(t, x) = −x/(2 − x − t), it follows that − 1 ≤ F(t, x) ≤ 0,

(12)

| f x0 (t)| = |F(t, f x (t))| ≤ 1.

(13)

and therefore

Next, we compute ∂F 2−t (t, x) = − , ∂x (2 − x − t)2 308

f x00 (t) =

f x (t)2 (F(t, f x (t)))2 = . (2 − f x (t) − t)3 2 − f x (t) − t

c THE MATHEMATICAL ASSOCIATION OF AMERICA [Monthly 121

This content downloaded from 92.82.8.190 on Sun, 30 Mar 2014 17:28:51 PM All use subject to JSTOR Terms and Conditions

Thus, if (t, x) ∈ D, then by (11), ∂ F 2−t 2−t 4 4 ∂ x (t, x) = (2 − x − t)2 ≤ ((2 − t)/2)2 = 2 − t ≤ . Similarly, if 0 ≤ t ≤ 2 − , then | f x00 (t)| =

(F(t, f x (t)))2 1 1 2 2 ≤ ≤ = ≤ . 2 − f x (t) − t 2 − f x (t) − t (2 − t)/2 2−t

We can therefore use C1 = 4/ and C2 = 2/ in Lemma 5. For reasons that will become clear later, the value we will use for δ in Lemma 5 is δ=

C1 . 6(e2C1 − 1)

(14)

Since the function F(t, x) is uniformly continuous on D, we can choose some ζ > 0 such that for any two points (t1 , x1 ), (t2 , x2 ) ∈ D, if |t1 − t2 | < ζ and |x1 − x2 | < ζ , then |F(t1 , x1 ) − F(t2 , x2 )|
n 2n mq + r m(q + 1)

2 2 >2− > 2 − > 2 − 2. q +1 2m 6

Thus, we can let k be the least index such that Tk > 2 − 2. Then for all j < k, T j ≤ 2 − 2, and therefore, by assumption, |δ j | ≤ δ. And since Tk−1 ≤ 2 − 2, by (17) we have Tk < Tk−1 +

≤ 2 − 2 + < 2 − . 3 3

We can therefore apply Lemma 5 to the points (T j , X j ) for 0 ≤ j ≤ k and the functions F and f x on the region D to conclude that for all such j, hC2 δ + (1 + C1 h) j − 1 . |X j − f x (T j )| ≤ 2C1 C1 Since j ≤ k ≤ m and h ≤ 2/m, 2C1 m (1 + C1 h) j ≤ 1 + < e2C1 , m where the last inequality is well known (see, for example, inequality 4.5.13 in [7]). Therefore, (2/m)(2/) δ e2C1 − 1 δ(e2C1 − 1) |X j − f x (T j )| < + (e2C1 − 1) = + . 2(4/) C1 2m C1 310

c THE MATHEMATICAL ASSOCIATION OF AMERICA [Monthly 121

This content downloaded from 92.82.8.190 on Sun, 30 Mar 2014 17:28:51 PM All use subject to JSTOR Terms and Conditions

By (16) and (14), the last two fractions are both at most /6. Thus, we have shown that |X j − f x (T j )|
2 − 2, as we observed after (10), all points on the n-walk beyond (Tk , X k ) are also in the -corridor. We still need to worry about points on the n-walk in the interiors of the first k blocks. If (t, x) is such a point, then (t, x) occurs between (T j , X j ) and (T j+1 , X j+1 ), for some j < k. To see that (t, x) is in the -corridor, we compute |x − f x (t)| ≤ |x − X j | + |X j − f x (T j )| + | f x (T j ) − f x (t)|. We now bound each of the terms on the right-hand side. We already know, by (17) and (18), that |x − X j | ≤ |X j+1 − X j | < /3 and |X j − f x (T j )| < /3. For the third term we apply the mean value theorem: f x (T j ) − f x (t) = f x0 (c)(T j − t), for some c between t and T j . By (13) and (17), we conclude that | f x (T j ) − f x (t)| = | f x0 (c)| · |T j − t| ≤ | f x0 (c)| · |T j+1 − T j | < 1 ·

= . 3 3

Putting it all together, we get |x − f x (t)| ≤ |x − X j | + |X j − f x (T j )| + | f x (T j ) − f x (t)|
δ. To complete the proof, we will show that this is unlikely to happen. Partition {(t, x) ∈ D : t ≤ 2 − 2} into finitely many disjoint regions R1 , R2 , . . . , R K , each with diameter less than ζ . By (12) and (15), for each k with 1 ≤ k ≤ K we can choose a number rk such that −1 ≤ rk ≤ 0 and for every (t, x) ∈ Rk , |F(t, x) − rk |
δ). Thus, it will suffice to show that for each j and k, limn→∞ p j,k (n) = 0. Fix j and k with 0 ≤ j < m and 1 ≤ k ≤ K . The value of δ j is determined by the block of steps taken by the n-walk in going from (T j , X j ) to (T j+1 , X j+1 ). The points on this part of the walk are (t jq+i , x jq+i ) for 0 ≤ i ≤ q. We will refer to the step from (t jq+i , x jq+i ) to (t jq+i+1 , x jq+i+1 ) as step i of this block of the n-walk. Notice that there are q steps in the block, and since q is the quotient when n is divided by m and m is fixed, q → ∞ when n → ∞. Let a be the number of steps in the block in which x decreases by 1/n. In the remaining q − a steps, the value of x does not change, so X j − X j+1 = a/n. Therefore, by definition, δj =

a/n a X j+1 − X j − F(T j , X j ) = − − F(T j , X j ) = − − F(T j , X j ). h q/n q

Although the value of p j,k (n) does not depend on the precise method by which the steps in this block of the walk are chosen, it will be helpful to specify a method. We will assume that for 0 ≤ i < q, random numbers si are chosen, independently and uniformly in [0, 1], and then in step i, x decreases by 1/n if si
δ) + Prn ((T j , X j ) ∈ Rk and δ j < −δ). 312

c THE MATHEMATICAL ASSOCIATION OF AMERICA [Monthly 121

This content downloaded from 92.82.8.190 on Sun, 30 Mar 2014 17:28:51 PM All use subject to JSTOR Terms and Conditions

We will show that both of the probabilities on the right-hand side approach 0 as n → ∞. For the first, suppose that (T j , X j ) ∈ Rk and δ j > δ. Since δ j = −a/q − F(T j , X j ), by (20) this implies that a 3δ < −F(T j , X j ) − δ < −rk − . q 4 Now let a 0 be the number of values of i for which si ≤ −rk − δ/2. By conclusion (a) above, a 0 ≤ a, and therefore 0≤

a 3δ δ a0 ≤ < −rk − < −rk − < 1. q q 4 2

This is very unlikely to happen. To see why, notice first that for 0 ≤ i < q, since si is chosen uniformly in [0, 1] and 0 < −rk − δ/2 < 1, the probability that si ≤ −rk − δ/2 is −rk − δ/2. And since the si are chosen independently, this means that a 0 /q, which is the fraction of values of i for which si ≤ −rk − δ/2, should be close to −rk − δ/2. More precisely, by the law of large numbers (see [2, Section VI.4, p. 152]), for any α > 0, the probability that |a 0 /q − (−rk − δ/2)| > α must approach 0 as q → ∞. And since q → ∞ as n → ∞, taking α = δ/4 we can conclude that 0 a 3δ lim Prn < −rk − = 0. n→∞ q 4 It follows that lim Prn ((T j , X j ) ∈ Rk and δ j > δ) = 0.

n→∞

The second probability is similar. If (T j , X j ) ∈ Rk and δ j < −δ, then 3δ a > −F(T j , X j ) + δ > −rk + . q 4 Now let a 0 be the number of values of i for which si < −rk + δ/2. This time we use fact (b) above to conclude that a 0 ≥ a, so 1≥

a0 a 3δ δ ≥ > −rk + > −rk + > 0. q q 4 2

Once again, the law of large numbers says that the probability of this event goes to 0 as n → ∞, which completes the proof of Lemma 4 and, therefore, Theorem 1. 4. PROOFS OF THEOREMS 2 AND 3. To prove Theorem 2, fix n > 0, and let A denote the number of whole pills removed from the bottle before the first half pill. Of course, the first pill removed from the bottle must be a whole pill, and there are n whole pills altogether, so 1 ≤ A ≤ n. For 1 ≤ k ≤ n, let X k = 1 if the first k pills removed from the bottle are all whole pills, and X k = 0 otherwise. Then we have A = X 1 + X 2 + · · · + X n , and therefore E(A) = E(X 1 + X 2 + · · · + X n ) = E(X 1 ) + E(X 2 ) + · · · + E(X n ). April 2014]

A DRUG-INDUCED RANDOM WALK

This content downloaded from 92.82.8.190 on Sun, 30 Mar 2014 17:28:51 PM All use subject to JSTOR Terms and Conditions

313

The probability that the first pill removed is a whole pill is 1. Once the first whole pill has been removed, the bottle contains n − 1 whole pills and 1 half pill, so the probability that the second pill is also a whole pill is (n − 1)/n. Similarly, if the first two pills are whole pills, then the probability that the third pill is a whole pill is (n − 2)/n. Continuing in this way, we see that for 1 ≤ k ≤ n, E(X k ) = Pr(X k = 1) n−k+1 n−1 n−2 · ··· n n n n! = k . n (n − k)! =1·

Thus, E(A) =

n X

E(X k ) =

n X

k=1

k=1

n! . n k (n − k)!

Reindexing by j = n − k, we get E(A) =

n X k=1

n−1 n−1 X n! X n j n! n! = = n . n k (n − k)! n n− j j! n j=0 j! j=0

(21)

To relate this formula to the incomplete gamma function, we first evaluate the integral in the definition of the incomplete gamma function. Applying integration by parts k times leads to the formula in the following lemma. Lemma 6. For every integer k ≥ 0, Z

k

t k e−t dt = −

k! X t j + C. et j=0 j!

Using this lemma, we find that 0(n, n) =

∞

Z

t n−1 e−t dt n

N n−1 j X (n − 1)! t  = lim − N →∞ et j! j=0 

n

=

(n − 1)! en

n−1 X j=0

j

n . j!

(22)

Thus, n−1 X nj j=0

314

j!

=

en 0(n, n). (n − 1)!

c THE MATHEMATICAL ASSOCIATION OF AMERICA [Monthly 121

This content downloaded from 92.82.8.190 on Sun, 30 Mar 2014 17:28:51 PM All use subject to JSTOR Terms and Conditions

Substituting into (21), we get E(A) =

n−1 n! X n j n! en en = · 0(n, n) = 0(n, n). n n j=0 j! n n (n − 1)! n n−1

This proves the first statement in Theorem 2. To prove the second statement, about the asymptotic value as n → ∞, we need the following fact. Lemma 7. 1 0(n, n) = . n→∞ (n − 1)! 2 lim

Proof. According to inequality 8.10.13 of [7], 1 0(n + 1, n) 0(n, n) < < . (n − 1)! 2 n!

(23)

By Lemma 6 and equation (22), 0(n + 1, n) =

∞

Z

t n e−t dt = n

n n−1 n! X n j (n − 1)! X n j nn nn = n + = n0(n, n) + . en j=0 j! en j! en en j=0

Substituting into the second half of inequality (23), we get 1 0(n, n) nn < + n , 2 (n − 1)! e n! and therefore √ 0(n, n) 1 n n 2π n 1 1 < − ·√ < . 2 en n! (n − 1)! 2 2πn √ By Stirling’s formula, limn→∞ n n 2π n/(en n!) = 1, and the lemma now follows by the squeeze theorem. This lemma allows us to determine the asymptotic rate of growth of the expected value of A. The expected length of the initial run of whole pills can be rewritten in the form r √ √ πn en n! 0(n, n) 1 en · E(A) = n−1 0(n, n) = 2π n · √ ∼ 2πn · 1 · = , n 2 2 n n 2π n (n − 1)! which completes the proof of Theorem 2. Finally, we give Stong’s proof of Theorem 3. For 1 ≤ k ≤ n, consider the kth whole pill that is removed from the bottle. This pill is cut in half, and half of it is returned to the bottle; we will refer to this half pill as the kth half pill. Let X k = 1 if the kth April 2014]

A DRUG-INDUCED RANDOM WALK

This content downloaded from 92.82.8.190 on Sun, 30 Mar 2014 17:28:51 PM All use subject to JSTOR Terms and Conditions

315

half pill is removed from the bottle after the last whole pill is removed, and X k = 0 otherwise. Then the expected value we seek is E(X 1 + X 2 + · · · + X n ) = E(X 1 ) + E(X 2 ) + · · · + E(X n ). After the kth half pill has been returned to the bottle, there are n − k whole pills still in the bottle, and we have X k = 1 if and only if among the set of pills consisting of these n − k remaining whole pills and the kth half pill, the half pill is the last one to be removed from the bottle. Since each pill in this set is equally likely to be chosen at each step, we have E(X k ) = Prn (X k = 1) =

1 . n−k+1

Therefore the expected number of half pills removed from the bottle after the last whole pill is E(X 1 ) + E(X 2 ) + · · · + E(X n ) =

1 1 + + · · · + 1 = Hn . n n−1

5. VARIATIONS. In all of our calculations, we have assumed that when a pill is removed from the bottle, all pills in the bottle are equally likely to be chosen. But since the whole pills are twice as big as the half pills, another natural assumption would be that whole pills are twice as likely to be chosen as half pills. In this section we summarize the results of redoing our calculations with this alternative assumption, leaving the details to the reader. If whole pills are twice as likely to be chosen as half pills, then the differential equations (1) must be replaced by dx 2x =− , dt 2x + y

dy 2x − y = . dt 2x + y

The solution to this system of equations that passes through the point (1, 0) is √ y = 2( x − x),

x=

(2 − t)2 , 4

y=

t (2 − t) . 2

Once again, the random walk converges uniformly in probability to this curve as n → ∞. Surprisingly, in this case the expected number of whole pills removed before the first half pill turns out to be exactly the same as the expected number of half pills removed after the last whole pill. Calculations similar to those in the last section show that both expected values are 22n − 1. 2n n

There is a simple explanation for why these two expected values are equal. The explanation is based on an alternative procedure we could follow to decide which pill to remove from the bottle each day. First, number the pills in a full bottle from 1 to n. Then make a deck of 2n cards numbered from 1 to n, with each number appearing on 316

c THE MATHEMATICAL ASSOCIATION OF AMERICA [Monthly 121

This content downloaded from 92.82.8.190 on Sun, 30 Mar 2014 17:28:51 PM All use subject to JSTOR Terms and Conditions

two cards, and shuffle the deck. Every day, deal a card from the top of the deck, and if the card has the number k on it, then remove pill number k from the bottle. As usual, if the pill is whole, then cut it in half and return half to the bottle. On any day, if pill number k is still whole, then there will be two cards numbered k in the deck; if half of pill number k has already been taken, then there will be only one card numbered k in the deck; and if pill number k has been used up completely, then there will be no cards numbered k left in the deck. It follows that whole pills will be twice as likely to be chosen as half pills, as required. If we follow this procedure, then the number of whole pills removed from the bottle before the first half pill is removed will be the same as the number of distinct cards dealt from the top of the deck before the first duplicate card. Similarly, we could determine how many half pills will be removed from the bottle after the last whole pill by dealing cards from the bottom of the deck and counting the number of distinct cards dealt before the first duplicate. It should now be clear by symmetry that the expected values of these two numbers are equal. Indeed, the problem of computing this common expected value is equivalent to the third question addressed in [9], and the answer follows from Theorem 5 of [9]. ACKNOWLEDGMENTS. I would like to thank Richard Stong, Greg Warrington, Rob Benedetto, Tanya Leise, Amy Wagaman, and the anonymous referees for helpful conversations and suggestions. Natasha would like to thank Dr. Michael Katz, D.V.M.

REFERENCES 1. R. M. Corless, G. H. Gonnet, D. E. G. Hare, D. J. Jeffrey, D. E. Knuth, On the Lambert W function, Adv. Comput. Math. 5 (1996) 329–359. 2. W. Feller, An Introduction to Probability Theory and its Applications. Vol. I. Third edition, Wiley, New York, 1968. 3. L. Holst, On birthday, collectors’, occupancy and other classical urn problems, Int. Stat. Review 54 (1986) 15–27. 4. M. S. Klamkin, D. J. Newman, Extensions of the birthday surprise, J. Comb. Theory 3 (1967) 279–282. 5. G. F. Lawler, V. Limic, Random Walk: A Modern Introduction. Cambridge Studies in Advanced Mathematics. Vol. 123. Cambridge University Press, Cambridge, 2010. 6. B. McCabe, Matching balls drawn from an urn, Problem E 2263, Solutions by B. C. Arnold and R. J. Dickson, Amer. Math. Monthly 78 (1971) 1022–1024. 7. National Institute of Standards and Technology, Digital Library of Mathematical Functions, March 23, 2012, available at http://dlmf.nist.gov/. 8. P. N. Rathie, P. Z¨ornig, On the birthday problem: Some generalizations and applications, Int. J. Math. Math. Sci. 2003 (2003) 3827–3840. 9. D. J. Velleman, G. S. Warrington, What to expect in a game of memory, Amer. Math. Monthly, 120 (2013) 787–805. DANIEL J. VELLEMAN received his B.A. from Dartmouth College in 1976 and his Ph.D. from the University of Wisconsin–Madison in 1980. He taught at the University of Texas before joining the faculty of Amherst College in 1983. He was the editor of the American Mathematical Monthly from 2007 to 2011. In his spare time he enjoys singing, bicycling, and playing volleyball. Department of Mathematics, Amherst College, Amherst, MA 01002 [email protected]

April 2014]

A DRUG-INDUCED RANDOM WALK

This content downloaded from 92.82.8.190 on Sun, 30 Mar 2014 17:28:51 PM All use subject to JSTOR Terms and Conditions

317

Analytical Solution for the Generalized Fermat–Torricelli Problem Author(s): Alexei Yu. Uteshev Source: The American Mathematical Monthly, Vol. 121, No. 4 (April), pp. 318-331 Published by: Mathematical Association of America Stable URL: http://www.jstor.org/stable/10.4169/amer.math.monthly.121.04.318 . Accessed: 30/03/2014 17:29 Your use of the JSTOR archive indicates your acceptance of the Terms & Conditions of Use, available at . http://www.jstor.org/page/info/about/policies/terms.jsp

. JSTOR is a not-for-profit service that helps scholars, researchers, and students discover, use, and build upon a wide range of content in a trusted digital archive. We use information technology and tools to increase productivity and facilitate new forms of scholarship. For more information about JSTOR, please contact [email protected].

.

Mathematical Association of America is collaborating with JSTOR to digitize, preserve and extend access to The American Mathematical Monthly.

http://www.jstor.org

This content downloaded from 92.82.8.190 on Sun, 30 Mar 2014 17:29:14 PM All use subject to JSTOR Terms and Conditions

Analytical Solution for the Generalized Fermat–Torricelli Problem Alexei Yu. Uteshev

Abstract. We present an explicit analytical solution for the problem of minimization of the function F(x, y) =

3 X

q m j (x − x j )2 + (y − y j )2 ,

j=1

i.e., we find the coordinates of the stationary point and the corresponding critical value as functions of {m j , x j , y j }3j=1 . In addition, we also discuss the inverse problem of finding such values for m 1 , m 2 , and m 3 for which the corresponding function F possesses a prescribed position of stationary point.

1. INTRODUCTION. Consider the following problem. Given the coordinates of three noncollinear points P1 = (x1 , y1 ), P2 = (x2 , y2 ), and P3 = (x3 , y3 ) in the plane, find the coordinates of the point P∗ = (x∗ , y∗ ) that gives a solution to the optimization problem min F(x, y)

(x,y)∈R2

for

F(x, y) =

3 X

q m j (x − x j )2 + (y − y j )2 .

(1)

j=1

Here m 1 , m 2 , and m 3 are assumed to be real positive numbers and will be subsequently referred to as weights. The stated problem, in its particular case of equal weights m 1 = m 2 = m 3 = 1, has been known since 1643 as the (classical) Fermat–Torricelli problem. It has a unique solution that coincides either with one of the points P1 , P2 , P3 or with the so-called Fermat or Fermat–Torricelli point [2, 4] of the triangle P1 P2 P3 ; this point makes an angle of 2π/3 with any two vertices of the triangle. Generalization of the problem to the case of unequal weights has been investigated since the 19th century. This generalization is known under different names: the Steiner problem, the Weber problem, the problem of railway junction ((Germ.) Problem des Knotenpunktes) [3, 8], the three factory problem [6]. The last two names were inspired by a facility location problem such as the following. Let the cities P1 , P2 , and P3 be the sources of iron ore, coal, and water, respectively. To produce one ton of steel, the steel works needs m 1 tons of iron, m 2 tons of coal, and m 3 tons of water. Assuming that the freight charge for a ton-kilometer is independent of the nature of the cargo, find the optimal position for the steel works connected with P1 , P2 , and P3 via straight roads so as to minimize the transportation costs. In the rest of the paper, this problem will be referred to as the generalized Fermat– Torricelli problem. Existence and uniqueness of its solution is guaranteed by the following result [4]. http://dx.doi.org/10.4169/amer.math.monthly.121.04.318 MSC: Primary 51N20

318

c THE MATHEMATICAL ASSOCIATION OF AMERICA [Monthly 121

This content downloaded from 92.82.8.190 on Sun, 30 Mar 2014 17:29:14 PM All use subject to JSTOR Terms and Conditions

Theorem 1. Denote by α1 , α2 , and α3 the corner angles of the triangle P1 P2 P3 . If the conditions  2 m < m 22 + m 23 + 2m 2 m 3 cos α1 ,    1 (2) m 22 < m 21 + m 23 + 2m 1 m 3 cos α2 ,    2 m 3 < m 21 + m 22 + 2m 1 m 2 cos α3 are fulfilled, then there exists a unique solution P∗ = (x∗ , y∗ ) ∈ R2 for the generalized Fermat–Torricelli problem lying inside the triangle P1 P2 P3 . This point is a stationary point for the function F(x, y), i.e., a real solution of the system 3 X j=1 3 X j=1

m j (x − x j ) p = 0, (x − x j )2 + (y − y j )2 m j (y − y j ) p = 0. (x − x j )2 + (y − y j )2

(3)

If any of the conditions (2) are violated, then F(x, y) attains its minimum value at the corresponding vertex of the triangle. Let us overview some approaches for finding the point P∗ . Historically, the first approach is geometrical: The point is found as the intersection point of a special construction of lines or circles. For the equal weighted case, Torricelli proved that the circles circumscribing the equilateral triangles constructed on the sides of and outside the triangle P1 P2 P3 intersect at the point P∗ ; for an alternative Simpson construction of P∗ , see [5]. For the general, i.e., unequal weighted case, see [3, 8]. The second approach is based on the mechanical model (sometimes incorrectly called P´olya’s mechanical model): A horizontal board is drilled with holes at the points P1 , P2 , and P3 (or at the vertices of a triangle similar to P1 P2 P3 ). Three strings are tied together in a knot at one end, the loose ends are passed through the holes, and are attached to physical weights proportional to m 1 , m 2 , and m 3 , respectively, below the board. The equilibrium position of the knot yields the solution [3]. The third approach, based on the gradient descent method, originated in the paper [11]; further developments and comments can be found in [7, 9]. The present paper is devoted to the fourth approach, the analytical one. We look for explicit expressions for the coordinates of the stationary point P∗ as functions of {m j , x j , y j }3j=1 . Although the existence of such a solution by radicals, i.e., in a finite number of operations like standard arithmetic ones and extraction of (positive integer) roots, is not questioned in any review article on the problem, we failed to find in the literature the constructive and universal version of an algorithm even for the classical, i.e., equal weighted, case. 2. ALGEBRA. Theorem 2. Under the conditions (2), the coordinates of the stationary point (x∗ , y∗ ) of the function F(x, y) are as follows: K 1 K 2 K 3 x1 x2 x3 K 1 K 2 K 3 y1 y2 y3 x∗ = + + , y∗ = + + (4) 4|S|σ d K1 K2 K3 4|S|σ d K1 K2 K3 April 2014]

SOLUTION FOR THE FERMAT–TORRICELLI PROBLEM

This content downloaded from 92.82.8.190 on Sun, 30 Mar 2014 17:29:14 PM All use subject to JSTOR Terms and Conditions

319

with F(x∗ , y∗ ) = min F(x, y) =

√

(x,y)∈R2

d.

Here 1 (m 2 K 1 + m 22 K 2 + m 23 K 3 ) , or alternatively, (5) 2σ 1 1 2 2 2 2 2 2 2 2 2 d = 2|S|σ + m 21 (r12 +r13 − r23 )+m 22 (r23 +r12 −r13 )+m 23 (r13 +r23 −r12 ) , (6) 2 q r j` = |P j P` | = (x j − x` )2 + (y j − y` )2 for { j, `} ⊂ {1, 2, 3},

d=

σ =

S = x1 y2 + x2 y3 + x3 y1 − x1 y3 − x3 y2 − x2 y1 ,

(7)

q 1 −m 41 − m 42 − m 43 + 2m 21 m 22 + 2m 21 m 23 + 2m 22 m 23 , 2

(8)

and  2 2 2 K 1 = (r12 + r13 − r23 )σ + (m 22 + m 23 − m 21 )|S|,    2 2 2 )σ + (m 21 + m 23 − m 22 )|S|, − r13 + r12 K 2 = (r23    2 2 2 )σ + (m 21 + m 22 − m 23 )|S|. − r12 + r23 K 3 = (r13

(9)

Proof. First, we establish the validity of the equality K 1 K 2 + K 1 K 3 + K 2 K 3 = 4σ |S|d,

(10)

2 2 2 r23 K 1 + r13 K 2 + r12 K 3 = 2|S|d

(11)

and the dual equality

for (5). Second, let us deduce the following relationships q mjKj (x∗ − x j )2 + (y∗ − y j )2 = √ 2σ d

for j ∈ {1, 2, 3}.

(12)

Here is the proof for the case j = 1: (x∗ − x1 )2 + (y∗ − y1 )2 " # K1 K2 K3 2 x3 x1 x1 2 y2 y3 y1 y1 2 x2 (10) = + − − + + − − 4σ |S|d K2 K3 K2 K3 K2 K3 K2 K3 (x3 − x1 )2 + (y3 − y1 )2 K 1 K 2 K 3 2 (x2 − x1 )2 + (y2 − y1 )2 + = 2 4σ |S|d K2 K 32 (x2 − x1 )(x3 − x1 ) + (y2 − y1 )(y3 − y1 ) +2 K2 K3 320

c THE MATHEMATICAL ASSOCIATION OF AMERICA [Monthly 121

This content downloaded from 92.82.8.190 on Sun, 30 Mar 2014 17:29:14 PM All use subject to JSTOR Terms and Conditions

=

K1 K2 K3 4σ |S|d

2

2 2 2 2 2 r12 r13 1/2(r12 + r13 − r23 ) + + 2 2 2 K2 K3 K2 K3

=

2 2 K 12 2 2 2 2 r12 K 3 + r13 K 22 + (r12 + r13 − r23 )K 2 K 3 2 (4σ |S|d)

=

2 K 12 2 2 (r12 K 3 + r13 K 2 )(K 2 + K 3 ) − r23 K2 K3 2 (4σ |S|d)

=

K 12 2 2 (2|S|d − r23 K 1 )(K 2 + K 3 ) − r23 K2 K3 2 (4σ |S|d)

=

K 12 2 2|S|d(K + K ) − r (K K + K K + K K ) 2 3 1 2 1 3 2 3 23 (4σ |S|d)2

(11)

=

K 12 2 2|S|d(K + K ) − 4r σ |S|d 2 3 23 (4σ |S|d)2

=

2|S|d K 12 2 K 2 + K 3 − 2r23 σ (4σ |S|d)2

(10)

=

K 12 2 2m 1 |S| 8|S|dσ 2

=

m 21 K 12 . 4σ 2 d

(9)

Similar arguments hold for j ∈ {2, 3} in (12). To complete the proof of these equalities, it should be additionally verified that the values K 1 , K 2 , and K 3 are nonnegative. This will be done in the next section. To prove the first statement of the theorem, we will utilize the following alternative representation for x∗ and y∗ : 1

(10)

x∗ =

1 K1

+

+

1 K3

1

(10)

y∗ =

1 K2

1 K1

+

1 K2

+

1 K3

x1 x2 x3 + + K1 K2 K3

y2 y3 y1 + + K1 K2 K3

,

and

.

(13)

We substitute (4) into the left-hand side of the first equation of (3). The resulting expression can be reduced with the aid of (12) to x∗ −x1 x∗ −x2 x∗ −x3 + + = x∗ K1 K2 K3

1 1 1 x1 x2 x3 (13) + + − + + = 0. K1 K2 K3 K1 K2 K3

Similar arguments are valid for the second equation from (3). Finally, we compute F(x∗ , y∗ ):

F(x∗ , y∗ ) =

April 2014]

3 X

3 q X √ m 2j K j (5) 2σ d (12) m j (x∗ − x j )2 + (y∗ − y j )2 = √ = √ = d. 2σ d 2σ d j=1 j=1

SOLUTION FOR THE FERMAT–TORRICELLI PROBLEM

This content downloaded from 92.82.8.190 on Sun, 30 Mar 2014 17:29:14 PM All use subject to JSTOR Terms and Conditions

321

Some test values are provided in Table 1. P1 m1

P2 m2

P3 m3

1. (2, 6) (1, 1) 2

P∗ √ d

(5, 1)

3

4

√ √ 4103+1833 15 29523−4481 15 , 2866 8598

≈ (3.9086, 1.4152) p √ d = 2 79 + 15 15 ≈ 23.4174 751 647 , ≈ (1.5484, 1.3340) 485 485 √ √ d = 970 ≈ 31.1448 3 3 1 − √12 − √110 , √12 − √355 − √110 √

2. (2, 6) (1, 1) 3

5

3/2

2

(5, 1)

4 √ √ 3. (0, 0) (2, 0) (− 2, 2) 2

√

d=

≈ (0.0068, 0.0165) r q 32 +

23 √ 2

+3

55 2

≈ 7.9997

Table 1.

3. GEOMETRY. Let us give an interpretation for some constants that appeared in Theorem 2. First, on rewriting (7) in determinantal form 1 1 1 S = x1 x2 x3 , y1 y2 y3 we recognize that |S| = 2S4P1 P2 P3 , where S4P1 P2 P3 stands for the area of triangle P1 P2 P3 . As for the constant (8), factorization of the radicand on the right-hand side leads to the form m1 + m2 + m3 m1 + m2 + m3 m1 + m2 + m3 σ =2 − m1 − m2 2 2 2 1/2 m1 + m2 + m3 , − m3 × 2 which can be treated as the Heron formula for twice the area of a triangle formed by the triple of weights m 1 , m 2 , and m 3 . Under the restrictions (2), such a triangle exists. Construct this triangle and denote its angles, as shown in Figure 1.

m3

m3

α3 α2

m1

α1

m2

β1

β2

m1

m2 β3

Figure 1. Two triangles generated by the problem

322

c THE MATHEMATICAL ASSOCIATION OF AMERICA [Monthly 121

This content downloaded from 92.82.8.190 on Sun, 30 Mar 2014 17:29:14 PM All use subject to JSTOR Terms and Conditions

The first formula from (9) can thus be represented with the aid of the law of cosines as 2 2 2 − r23 + r13 m 2 + m 23 − m 21 r12 + 2 |S| σ 2r12r13 cos α1 2m 2 m 3 cos β1 = σ |S| + |S| σ

K 1 = σ |S|

= 2σ |S|(cot α1 + cot β1 ). Rewriting the first condition from (2) in the form cos α1 + cos β1 > 0, we can conclude that cot α1 + cot β1 > 0 and, thus, K 1 > 0. In a similar way, the expressions for K 2 and K 3 can be deduced, and we can establish that, under the restrictions (2), they are both positive. This completes the proof of Theorem 2. Remark 1. We set the dual generalized Fermat–Torricelli problem. Let the triangle be composed of the sides with the lengths equal to m 1 , m 2 , and m 3 ; let the weights r12 , r23 , and r13 be placed in its vertices, as shown in Figure 2. r 23

m3

m2

r 13 m1

r 12

Figure 2. Dual problem

The minimum value for the objective function will be the same as in the direct problem, since (6) is equivalent to 2|S|σ +

1 2 2 2 2 (m 22 + m 23 − m 21 ) . (m 21 + m 23 − m 22 ) + r23 r12 (m 1 + m 22 − m 23 ) + r13 2

4. CLASSICAL FERMAT–TORRICELLI PROBLEM. Consider now the equal weighted case m 1 = m 2 = m 3 = 1. Theorem 3. Let all the angles of the triangle P1 P2 P3 be less than 2π/3, or, equivalently, 2 2 2 r12 + r13 + r12r13 − r23 > 0, 2 2 2 r23 + r12 + r12r23 − r13 > 0, 2 2 2 r13 + r23 + r13r23 − r12 > 0.

The coordinates of the Fermat–Torricelli point for this triangle are as follows: k1 k2 k3 x1 x2 x3 k1 k2 k3 y1 y2 y3 x∗ = √ + + , y∗ = √ + + , k2 k3 k2 k3 2 3|S|d k1 2 3|S|d k1 April 2014]

SOLUTION FOR THE FERMAT–TORRICELLI PROBLEM

This content downloaded from 92.82.8.190 on Sun, 30 Mar 2014 17:29:14 PM All use subject to JSTOR Terms and Conditions

(14) 323

with the corresponding minimum value of the objective function F(x∗ , y∗ ) = min

(x,y)∈R2

3 q X √ (x − x j )2 + (y − y j )2 = d. j=1

Here, 2 2 √ + r23 1 r 2 + r13 d = √ (k1 + k2 + k3 ) = 12 + 3 |S| 2 3

(15)

and √ 3 2 2 2 k1 = (r + r13 − r23 ) + |S|, 2 12 √ 3 2 2 2 k2 = (r + r12 − r13 ) + |S|, 2 23 √ 3 2 2 2 k3 = (r + r23 − r12 ) + |S|, 2 13 with the rest of the parameters coinciding with those from Theorem 2. It turns out that the right-hand sides of the expressions (14), being represented as rational fractions with respect to {x j , y j }3j=1 , can be reduced further to the form where denominators become “area free.” Corollary. Under conditions of Theorem 3, the coordinates of the Fermat–Torricelli point are as follows:  √ 1  2 2 2 (x1 + x2 + x3 )|S| + 3 x1r23 + x2r13 + x3r12 x∗ = √ (16) 2 3d  1 1 1 , y1 y2 y3 + 3 sgn(S) x2 x3 + y2 y3 x1 x3 + y1 y3 x1 x2 + y1 y2  √ 1  2 2 2 (y1 + y2 + y3 )|S| + 3 y1r23 + y2r13 + y3r12 (17) y∗ = √ 2 3d  1 1 1 . x1 x2 x3 − 3 sgn(S) x2 x3 + y2 y3 x1 x3 + y1 y3 x1 x2 + y1 y2 Remark 2. The result of the last corollary can be extended to the generalized Fermat– Torricelli problem. Numerators and denominators in the right-hand sides of the formulas (4) can be reduced by the common factor |S|. We do not present the resulting expressions here, since they are inelegantly cumbersome. Remark 3. One of the referees of the present paper suggested that the author “provide some motivation or insight of how he found the explicit expressions in Theorem 2.” 324

c THE MATHEMATICAL ASSOCIATION OF AMERICA [Monthly 121

This content downloaded from 92.82.8.190 on Sun, 30 Mar 2014 17:29:14 PM All use subject to JSTOR Terms and Conditions

Frankly speaking, the historical development of the investigation went in the direction opposite to what has been presented up to this point. First, the formulas (16)–(17) were obtained as the solution of a linear system of equations arising from the feature of the Fermat–Torricelli point to make an angle of 2π/3 with any two vertices of the triangle. Next, in a similar way, the formulas mentioned in Remark 2 were obtained for the generalized Fermat–Torricelli problem, i.e., for the coordinates x∗ , y∗ . Although these formulas looked awful, they permitted us to deduce the explicit expression (6) for the value of minimal distance. Moreover, we noticed the appearance of this value in the expressions for denominators of the formulas for x∗ and y∗ . Next, we intended to perform an additional verification of the obtained results via direct substitution into the equations (3). At this moment, the following lucky guess came to mind: the radicand of q (x∗ − x j )2 + (y∗ − y j )2 should be a perfect square! The only remaining trick was to discover the values (9). 5. INVERSE PROBLEM. Given the coordinates of the point P∗ = (x∗ , y∗ ), we wish to find the values for the weights m 1 , m 2 , and m 3 with the aim for the corresponding objective function (1) to posses a minimum point precisely at P∗ . Theorem 4. Let the vertices of the triangle P1 P2 P3 be counted counterclockwise. Then for the choice 1 1 1 m ∗1 = |P∗ P1 | · x∗ x2 x3 , y∗ y2 y3 1 1 1 m ∗2 = |P∗ P2 | · x1 x∗ x3 , and y1 y∗ y3 1 1 1 m ∗3 = |P∗ P3 | · x1 x2 x∗ (18) y1 y2 y∗ the function F(x, y) =

3 X

q m ∗j (x − x j )2 + (y − y j )2

j=1

has its stationary point at P∗ . Provided that the latter is chosen inside the triangle P1 P2 P3 , the values (18) are all positive, and 1 x∗ F(x∗ , y∗ ) = min F(x, y) = 2 y∗ (x,y)∈R x 2 + y2 ∗

∗

1 x1 y1 x12 + y12

1 x2 y2 x22 + y22

1 x3 . y3 2 2 x3 + y3

(19)

Proof. Substitute x = x∗ , y = y∗ and the values (18) into the left-hand side of the first equation from (3) as follows: April 2014]

SOLUTION FOR THE FERMAT–TORRICELLI PROBLEM

This content downloaded from 92.82.8.190 on Sun, 30 Mar 2014 17:29:14 PM All use subject to JSTOR Terms and Conditions

325

1 1 (x∗ −x1 ) x∗ x2 y∗ y2

1 1 1 x3 +(x∗ −x2 ) x1 x∗ y1 y∗ y3

1 1 1 x3 +(x∗ −x3 ) x1 x2 y1 y2 y3

1 x∗ . y∗

(20)

Represent this combination of the third-order determinants in the form of the fourthorder determinant, namely 1 x∗ y ∗ 0

1 x1 y1 x∗ − x1

1 x2 y2 x∗ − x2

1 x3 y3 x∗ − x3

(expansion by its last row coincides with (20)). Now add the second row to the last to obtain the following: 1 1 x∗ x1 y∗ y1 x x ∗ ∗

1 1 x2 x3 . y2 y3 x∗ x∗

In this determinant, the first row is proportional to the last one; therefore, the determinant equals just zero. The second equality from (3) can be verified in a similar manner. Let us evaluate F(x∗ , y∗ ): 1 1 2 2 F(x∗ , y∗ ) = (x∗ − x1 ) + (y∗ − y1 ) x∗ x2 y∗ y2 1 2 2 + (x∗ − x2 ) + (y∗ − y2 ) x1 y1 1 + (x∗ − x3 )2 + (y∗ − y3 )2 x1 y1

1 x3 y3 1 x∗ y∗ 1 x2 y2

1 x3 y3 1 x∗ . y∗

To prove the equality (19), let us split it into the x-part and the y-part. First, keep the x-terms in brackets of the previous formula: 1 1 1 1 1 1 1 1 1 (x∗ − x1 )2 x∗ x2 x3 + (x∗ − x2 )2 x1 x∗ x3 + (x∗ − x3 )2 x1 x2 x∗ . y∗ y2 y3 y1 y∗ y3 y1 y2 y∗ Similar to the proof of the first part of the theorem, represent this linear combination as the determinant of the fourth order: 1 1 x1 x∗ y y1 ∗ 0 (x − x )2 ∗ 1 326

1 x2 y2 (x∗ − x2 )2

1 x3 . y3 2 (x∗ − x3 )

c THE MATHEMATICAL ASSOCIATION OF AMERICA [Monthly 121

This content downloaded from 92.82.8.190 on Sun, 30 Mar 2014 17:29:14 PM All use subject to JSTOR Terms and Conditions

Multiply the first row by (−x∗2 ), the second one by 2x∗ and add the obtained rows to the last one: 1 x∗ y ∗ x2 ∗

1 x1 y1 x12

1 x2 y2 x22

1 x3 . y3 x32

(21)

The y-part of the equality (19) can be proven in exactly the same manner with the resulting determinant differing from (21) only in its last row. The linear property of determinant with respect to its rows completes the proof of (19). Remark 4. The solution of the inverse problem is determined up to a common positive multiplier, i.e., the solution triple (m 1 , m 2 , m 3 ) is defined by the value of the ratio m 1 : m 2 : m 3 . (In the language of the facility location problem mentioned in the Introduction, this statement is equivalent to the fact that the optimal position of the steel works is independent of the currency of the state.) Up to this remark, the solution of the inverse problem is unique. We have proven this statement via direct computations starting from formulas (4). Example 1. Let P1 = (2, 6), P2 = (1, 1), P3 = (5, 1), and P∗ =

√ √ 1 1 4103 + 1833 15 , 29523 − 4481 15 . 2866 8598

Find the values for the weights m ∗1 , m ∗2 , and m ∗3 from Theorem 4. Solution. Formulas (18) give: √ q √ 2(20925 − 4481 15) 316380606 + 35999826 15, = 18481401 √ q √ 2(15105 − 2342 15) ∗ m2 = 75400161 − 9169767 15, 6160467

m ∗1

and m ∗3

√ q √ 8(−1185 + 15988 15) = 8335761 − 2050623 15, 18481401

with F(x∗ , y∗ ) =

√ 1 (−333980 + 193436 15). 4299

Now, compare the obtained result with the one represented in test 1 from Section 2. According to Remark 4, we might expect that m ∗1 : m ∗2 : m ∗3 = 2 : 3 : 4. We leave the verification of this fact as an exercise for the inquisitive reader. April 2014]

SOLUTION FOR THE FERMAT–TORRICELLI PROBLEM

This content downloaded from 92.82.8.190 on Sun, 30 Mar 2014 17:29:14 PM All use subject to JSTOR Terms and Conditions

327

The next example originated from the question posed by one of the referees of the present paper: What will happen to the result of Theorem 4 if we take P∗ = P j ? Example 2. Show how to choose the values for the weights m 1 , m 2 , and m 3 in order for the point P∗ to coincide with the given point on a side of the triangle from Example 1. Solution. If we take P∗ = P2 , the formulas (18) give zero values for all the weights; however, the “weights” of these zeros are different. To explain this causistry, take P∗ = P2 + (µ, µ) for the infinitely small µ > 0. For this case, formulas (18) give: p √ m ∗1 (µ) = 4µ 26 − 12µ + 2µ2 = 4 26µ + o(µ), √ m ∗2 (µ) = 4 2µ(5 − 2µ), p m ∗3 (µ) = 4µ 16 − 8µ + 2µ2 = 16µ + o(µ). The weight m ∗2 (µ) dominates over m ∗1 (µ) and m ∗3 (µ) when µ → +0. As a matter of fact, the true values of these weights do not influence the position of the point P∗ ; the latter depends only on√ the value of the ratio m ∗1 (µ) : m ∗2 (µ) : m ∗3 (µ). Thus, the choice √ m ∗1 = 4 26, m ∗2 = 20 2, m ∗3 = 16 provides us with P∗ = P2 . Let us now manipulate the weights with the aim of extruding the point P∗ to an internal point of the side P2 P3 . This manipulation is not trivial, as in the previous case. First, we utilize formulas (18) and then simplify the obtained result with the aid of formulas (4). Finally, the variable weights m ∗1 (µ) = tµ, m ∗2 (µ) = 1 + µ, m ∗3 (µ) = 1 − µ with a fixed t >

√

104, provide the following asymptotics as µ → +0: 10 P∗ −→ 2 − √ ,1 . t2 − 4

Thus, the two “essential” weights m ∗2 (µ) and m ∗3 (µ) guarantee delivery of P∗ to the side P2 P3 , while the negligible weight m ∗1 (µ) ensures the fine-tuning of this delivery to the particular point within the open line segment P2 P⊥ . Here P⊥ = (2, 1) is the foot of the altitude of the triangle P1 P2 P3 through the point P1 . Let us discuss the geometrical meaning of the constants from Theorem 4. The value m ∗1 equals twice the product of the distance |P1 P∗ | by the area of the triangle P∗ P2 P3 . The first statement of the theorem is equivalent to the geometrical equality − → −−→ −−→ −−→ P∗ P1 · S4P∗ P2 P3 + P∗ P2 · S4P∗ P3 P1 + P∗ P3 · S4P∗ P1 P2 = O . Finally, the constant (19) is connected with 1 1 x∗ h=− S y∗ x 2 + y2 ∗

∗

1 x1 y1 x12 + y12

1 x2 y2 x22 + y22

1 x3 , y3 x32 + y32

which is known [10, pp. 251–252] as the power of the point P∗ with respect to the circle through the points P1 , P2 , and P3 (the circumscribed circle of the triangle). If we denote the circumcenter of the triangle P1 P2 P3 by C, then 328

c THE MATHEMATICAL ASSOCIATION OF AMERICA [Monthly 121

This content downloaded from 92.82.8.190 on Sun, 30 Mar 2014 17:29:14 PM All use subject to JSTOR Terms and Conditions

h = |CP∗ |2 − |CP j |2

for j ∈ {1, 2, 3},

(22)

and, provided that P∗ lies inside this triangle, the value h is negative. Results of the present section can evidently be extended to the case of three (and more) dimensions. Theorem 5. Let the points {P j = (x j , y j , z j )}4j=1 be noncoplanar, and be counted in such a manner that the value of the determinant 1 x V = 1 y1 z

1 x4 y4 z4

(23)

∗ 4 m j = |P∗ P j | · V j j=1 ,

(24)

1

1 x2 y2 z2

1 x3 y3 z3

is positive. Then for the choice

where V j equals the determinant obtained on replacing the jth column of (23) by the column [1, x∗ , y∗ , z ∗ ]> (here > denotes transposition), the function F(x, y, z) =

4 X

q m ∗j (x − x j )2 + (y − y j )2 + (z − z j )2

j=1

has its stationary point at P∗ = (x∗ , y∗ , z ∗ ). If P∗ lies inside the tetrahedron P1 P2 P3 P4 , then the values (24) are all positive, and F(x∗ , y∗ , z ∗ ) =

min

(x,y,z)∈R3

F(x, y, z)

1 x ∗ y∗ = − z∗ x 2 + y2 + z2 ∗

∗

∗

1 x1 y1 z1 x12 + y12 + z 12

1 x2 y2 z2 x22 + y22 + z 22

1 x3 y3 z3 x32 + y32 + z 32

1 x4 . y4 z4 x42 + y42 + z 42

(25) Geometrical meanings of the values appearing in the last theorem are similar to their counterparts from Theorem 4. For instance, the value (23) equals six times the volume of tetrahedron P1 P2 P3 P4 , while the value (25) divided by V is known [10, p. 255] as the power of the point P∗ with respect to a sphere circumscribed to that tetrahedron; it is equivalent to (22), where C this time stands for the circumcenter of the tetrahedron while j ∈ {1, 2, 3, 4}. 6. CONCLUSIONS. An analytical solution for the generalized Fermat–Torricelli problem and its inversion is presented. The three-point case is completely solved using “extended radicals”: In addition to elementary and extraction of roots operations, the sign function is utilized in the formulas. The treatment of the multidimensional n > 3 point case requires further investigation, although some theoretical results like [1] give little reason to hope for a nice, e.g., extended radicals, solution. April 2014]

SOLUTION FOR THE FERMAT–TORRICELLI PROBLEM

This content downloaded from 92.82.8.190 on Sun, 30 Mar 2014 17:29:14 PM All use subject to JSTOR Terms and Conditions

329

7. APPENDIX. We prove here the equalities (10) and (11). We have K1 K2 + K1 K3 + K2 K3 =

1 [K 1 (K 2 + K 3 ) + K 2 (K 1 + K 3 ) + K 3 (K 1 + K 2 )] 2

(9)

2 2 2 = K 1 (r23 σ + m 21 |S|) + K 2 (r13 σ + m 22 |S|) + K 3 (r12 σ + m 23 |S|) 2 2 2 2 2 2 2 2 2 2 2 2 + r13 − r23 )r23 + (r23 + r12 − r13 )r13 + (r13 + r23 − r12 )r12 = σ 2 (r12 + S 2 m 21 (m 22 + m 23 − m 21 ) + m 22 (m 21 + m 23 − m 22 ) + m 23 (m 21 + m 22 − m 23 ) 2 2 2 2 2 2 2 2 2 + σ |S| m 21 (r12 + r13 − r23 ) + m 22 (r12 + r23 − r13 ) + m 23 (r13 + r23 − r12 ) 2 2 2 (m 21 + m 22 − m 23 ) (m 21 + m 23 − m 22 ) + r12 (m 22 + m 23 − m 21 ) + r13 + r23

= 4σ 2 S 2 + 4σ 2 S 2 2 2 2 2 2 2 2 2 2 + 2σ |S| m 21 (r12 + r13 − r23 ) + m 22 (r13 + r23 − r12 ) + m 23 (r13 + r23 − r12 ) . Here we have utilized (8) and the equality 2 2 2 2 2 2 2 2 2 2 2 2 4S 2 = (r12 + r13 − r23 )r23 + (r13 + r23 − r12 )r12 + (r23 + r12 − r13 )r13 ,

(26)

which can be verified either directly or with the aid of the Heron formula for the area of a triangle (see Section 3). Reference to the definition (6) of the constant d completes the proof of (10). We now deduce formula (11): 2 2 2 r23 K 1 + r13 K 2 + r12 K3 2 2 2 2 2 2 2 2 2 2 2 2 = σ (r12 + r13 − r23 )r23 + (r13 + r23 − r12 )r12 + (r23 + r12 − r13 )r13 2 2 2 2 + |S| r23 (m 2 + m 23 − m 21 ) + r13 (m 21 + m 23 − m 22 ) + r12 (m 21 + m 22 − m 23 ) (26) 2 2 2 2 2 2 2 2 2 = 4σ S 2 + |S| m 21 (r12 +r13 −r23 ) + m 22 (r23 +r12 −r13 ) + m 23 (r13 +r23 −r12 ) (6)

= 2|S|d.

ACKNOWLEDGMENTS. The author thanks the referees and the editor for valuable suggestions that helped to improve the quality of the paper.

REFERENCES 1. C. Bajaj, The algebraic degree of geometric optimization problems, Discrete Comput. Geom. 3 (1988) 177–191. 2. R. Courant, H. Robbins, What is Mathematics? Oxford University Press, London, 1941. 3. F. Dingeldey, Sammlung von Aufgaben zur Anwendung der Differenzial- und Integralrechnung. Erster Teil. Aufgaben zur Anwendung der Differenzialrechnung. Teubner, Leipzig, 1910. 4. Encyclopedia of Mathematics contributors, Fermat–Torricelli problem. Encyclopedia of Mathematics, available at http://www.encyclopediaofmath.org/index.php?title=Fermat-Torricelli_ problem&oldid=22419 5. D. Gonzalez Martinez, The Fermat point, available at http://jwilson.coe.uga.edu/ EMAT6680Fa10/Gonzalez/Assignment6/THEFERMATPOINT.htm

330

c THE MATHEMATICAL ASSOCIATION OF AMERICA [Monthly 121

This content downloaded from 92.82.8.190 on Sun, 30 Mar 2014 17:29:14 PM All use subject to JSTOR Terms and Conditions

6. I. Greenberg, R. A. Robertello, The three factory problem, Math. Mag. 38 (1965) 67–72. 7. H. W. Kuhn, A note on Fermat’s problem, Math. Program 4 (1973) 98–107. 8. W. Launhardt, Kommercielle Tracirung der Verkehrswege, Z. f. Architekten u. Ingenieur-Vereinis im Konigreich Hannover, 18 (1872) 516–534. 9. L. M. Ostresh, On the convergence of a class of iterative methods for solving the Weber location problem, Oper. Res. 26 (1978) 597–609. 10. J. V. Uspensky, Theory of Equations. McGraw-Hill, New York, 1948. 11. E. Weiszfeld, Sur le point pour lequel la somme des distances de n points donn´es est minimum. Tohoku Math. J. 43 (1937) 355–386. ALEXEI UTESHEV received his Ph.D. from the Leningrad (St. Petersburg) State University in 1988. His mathematical interests lie in computational algebra and geometry; he also carries on personal educational on-line resources in these areas. He is also interested in history and enjoys cross-country skiing. Faculty of Applied Mathematics, St. Petersburg State University Universitetskij pr. 35, 198504, Petrodvorets, St. Petersburg, Russia [email protected]

A One-Sentence Line-of-Sight Proof of the Extreme Value Theorem The maximum value of a continuous real-valued function f on [a, b] is attained at its largest “lookout point.” We call x in [a, b] a lookout point if, whenever t lies in [a, x), we have f (t) ≤ f (x). The set L of lookout points is closed. Indeed, let xn → x, with xn in L. If t is in [a, x), then eventually t lies in [a, xn ), so f (t) ≤ f (xn ). By continuity, f (t) ≤ f (x), as desired. We use the fact that a closed, bounded, and nonempty set has a maximum and a minimum. Thus, max(L) exists. Extreme Value Theorem. If f is a real-valued continuous function on [a, b] then f has a maximum value on [a, b]. In other words, for some c in [a, b], no value attained by f exceeds f (c). Proof. Letting L = {x in [a, b] such that t in [a, x) implies f (t) ≤ f (x)} and c = max(L), it suffices to show that, given k > f (c), the closed, bounded set Sk = {t in [a, b] such that f (t) ≥ k} is empty, and this follows since, if some d satisfies f (d) ≥ k, then d > c, whence d is not in L, so there exists a t < d for which f (t) > f (d) ≥ k, proving that Sk has no minimum. —Submitted by Samuel J. Ferguson, University of Iowa http://dx.doi.org/10.4169/amer.math.monthly.121.04.331 MSC: Primary 26A03 Secondary: 26A15

April 2014]

SOLUTION FOR THE FERMAT–TORRICELLI PROBLEM

This content downloaded from 92.82.8.190 on Sun, 30 Mar 2014 17:29:14 PM All use subject to JSTOR Terms and Conditions

331

A One-Sentence Line-of-Sight Proof of the Extreme Value Theorem Author(s): Samuel J. Ferguson Source: The American Mathematical Monthly, Vol. 121, No. 4 (April), p. 331 Published by: Mathematical Association of America Stable URL: http://www.jstor.org/stable/10.4169/amer.math.monthly.121.04.331 . Accessed: 30/03/2014 17:29 Your use of the JSTOR archive indicates your acceptance of the Terms & Conditions of Use, available at . http://www.jstor.org/page/info/about/policies/terms.jsp

. JSTOR is a not-for-profit service that helps scholars, researchers, and students discover, use, and build upon a wide range of content in a trusted digital archive. We use information technology and tools to increase productivity and facilitate new forms of scholarship. For more information about JSTOR, please contact [email protected].

.

Mathematical Association of America is collaborating with JSTOR to digitize, preserve and extend access to The American Mathematical Monthly.

http://www.jstor.org

This content downloaded from 92.82.8.190 on Sun, 30 Mar 2014 17:29:25 PM All use subject to JSTOR Terms and Conditions

6. I. Greenberg, R. A. Robertello, The three factory problem, Math. Mag. 38 (1965) 67–72. 7. H. W. Kuhn, A note on Fermat’s problem, Math. Program 4 (1973) 98–107. 8. W. Launhardt, Kommercielle Tracirung der Verkehrswege, Z. f. Architekten u. Ingenieur-Vereinis im Konigreich Hannover, 18 (1872) 516–534. 9. L. M. Ostresh, On the convergence of a class of iterative methods for solving the Weber location problem, Oper. Res. 26 (1978) 597–609. 10. J. V. Uspensky, Theory of Equations. McGraw-Hill, New York, 1948. 11. E. Weiszfeld, Sur le point pour lequel la somme des distances de n points donn´es est minimum. Tohoku Math. J. 43 (1937) 355–386. ALEXEI UTESHEV received his Ph.D. from the Leningrad (St. Petersburg) State University in 1988. His mathematical interests lie in computational algebra and geometry; he also carries on personal educational on-line resources in these areas. He is also interested in history and enjoys cross-country skiing. Faculty of Applied Mathematics, St. Petersburg State University Universitetskij pr. 35, 198504, Petrodvorets, St. Petersburg, Russia [email protected]

A One-Sentence Line-of-Sight Proof of the Extreme Value Theorem The maximum value of a continuous real-valued function f on [a, b] is attained at its largest “lookout point.” We call x in [a, b] a lookout point if, whenever t lies in [a, x), we have f (t) ≤ f (x). The set L of lookout points is closed. Indeed, let xn → x, with xn in L. If t is in [a, x), then eventually t lies in [a, xn ), so f (t) ≤ f (xn ). By continuity, f (t) ≤ f (x), as desired. We use the fact that a closed, bounded, and nonempty set has a maximum and a minimum. Thus, max(L) exists. Extreme Value Theorem. If f is a real-valued continuous function on [a, b] then f has a maximum value on [a, b]. In other words, for some c in [a, b], no value attained by f exceeds f (c). Proof. Letting L = {x in [a, b] such that t in [a, x) implies f (t) ≤ f (x)} and c = max(L), it suffices to show that, given k > f (c), the closed, bounded set Sk = {t in [a, b] such that f (t) ≥ k} is empty, and this follows since, if some d satisfies f (d) ≥ k, then d > c, whence d is not in L, so there exists a t < d for which f (t) > f (d) ≥ k, proving that Sk has no minimum. —Submitted by Samuel J. Ferguson, University of Iowa http://dx.doi.org/10.4169/amer.math.monthly.121.04.331 MSC: Primary 26A03 Secondary: 26A15

April 2014]

SOLUTION FOR THE FERMAT–TORRICELLI PROBLEM

This content downloaded from 92.82.8.190 on Sun, 30 Mar 2014 17:29:25 PM All use subject to JSTOR Terms and Conditions

331

On the Proof of the Existence of Undominated Strategies in Normal Form Games Author(s): Martin Kovár, Alena Chernikava Source: The American Mathematical Monthly, Vol. 121, No. 4 (April), pp. 332-337 Published by: Mathematical Association of America Stable URL: http://www.jstor.org/stable/10.4169/amer.math.monthly.121.04.332 . Accessed: 30/03/2014 17:29 Your use of the JSTOR archive indicates your acceptance of the Terms & Conditions of Use, available at . http://www.jstor.org/page/info/about/policies/terms.jsp

. JSTOR is a not-for-profit service that helps scholars, researchers, and students discover, use, and build upon a wide range of content in a trusted digital archive. We use information technology and tools to increase productivity and facilitate new forms of scholarship. For more information about JSTOR, please contact [email protected].

.

Mathematical Association of America is collaborating with JSTOR to digitize, preserve and extend access to The American Mathematical Monthly.

http://www.jstor.org

This content downloaded from 92.82.8.190 on Sun, 30 Mar 2014 17:29:39 PM All use subject to JSTOR Terms and Conditions

On the Proof of the Existence of Undominated Strategies in Normal Form Games Martin Kov´ar and Alena Chernikava

Abstract. In the game theory literature, there are two versions of the proof of the well-known fact that in a normal form game of n persons with compact spaces of strategies and continuous utility functions, the sets of undominated strategies are nonempty. The older one, stated in the first edition of the well-known book by Herve Moulin, depends on certain, relatively nontrivial results from measure theory, metric topology, and mathematical analysis. The proof is valid only for metrizable topological spaces. The second, revised edition of the same book contains a simplified proof, which is, however, incorrect. The author implicitly assumes that any linearly ordered set contains a cofinal subsequence, which is certainly not true. In this paper we correct, simplify, and generalize the second proof of Moulin by its reformulation in terms of topological convergence of nets. This modified technique also yields a slightly better result than is stated in the original. The assertion now holds for almost compact spaces. The argument used is elementary and easily understandable to non-experts.

1. INTRODUCTION. In a normal form game, assume that the set of all strategies of a player is compact and its associated utility function is continuous. In this paper, we present a slightly improved modification of the well-known result, which ensures the existence of an undominated strategy. Moreover, our result has a new and simpler proof. The standard and best-known version of the proof is in the first edition of the comprehensive textbook on game theory by Herve Moulin [8]. It is dependent on a combination of relatively nontrivial results from measure theory, metric topology, and mathematical analysis. In the second, revised edition [9] of the same book, there is a newer, simplified proof using some topological arguments together with Zorn’s Lemma. Unfortunately, the second Moulin’s proof is incorrect, since it implicitly uses a non-valid argument that every chain (that is, a linearly ordered set) contains a cofinal subsequence. The first uncountable ordinal ω1 is a proper counterexample, which demonstrates that in general it is not true. The mistake itself is not very critical for game theory, since in metric spaces, for which the classical results are usually formulated, the topology is first countable and hence the sequences are still sufficient to fully describe the topology by means of convergence. Nevertheless, the mentioned fact itself, noticed by the second author during her study of [9], constitutes an opportunity for a revision of Moulin’s original proof using somewhat finer and a little bit more general topological arguments. We will correct, simplify, and somewhat strengthen the proof of Moulin’s result to be applicable for slightly more general topological assumptions than is stated in its original version. Although the central notion that we use to express the phenomena of convergence—nets—lies rather aside from the main stream of the current general topology, it has an important advantage. It is understood also by non-topologists because of its similarity to the widely-spread and well-known convergence theory of http://dx.doi.org/10.4169/amer.math.monthly.121.04.332 MSC: Primary 91A10, Secondary 54D30

332

c THE MATHEMATICAL ASSOCIATION OF AMERICA [Monthly 121

This content downloaded from 92.82.8.190 on Sun, 30 Mar 2014 17:29:39 PM All use subject to JSTOR Terms and Conditions

sequences in metric spaces. We also keep the simplified formulation of Moulin’s theorem as in [8] and exclude the part regarding the prudent strategies, which is new in [9], but irrelevant with respect to the discussed correctness of Moulin’s proof. A short description of our modification of the proof now follows. To reach more clarity of the sketch, we use compactness instead of almost compactness (in a contrast to the full version presented in Section 3). We modify the relation of dominance to induce a preorder on the strategy set of the ith player. The identity map, restricted to any chain (meaning now a linearly or totally preordered subset) forms a net in a compact topological space, which, hence, has a cluster point. The continuous utility function maps the net as well as its cluster point to R, equipped with the Euclidean topology. Thus the cluster point is an upper bound of the chain, and Zorn’s Lemma finally completes the proof. 2. DEFINITIONS AND NOTATIONS. Before demonstrating the complete proof, we will need to recapitulate some necessary notions from game theory and topology. Recall that an n-person game G in a normal or strategic form is denoted by the 2ntuple G = (X 1 , X 2 , . . . , X n , u 1 , u 2 , . . . , u n ), where for each i ∈ {1, 2, . . . , n}, X i is a nonempty set of strategies of the ith player and u i : 5nj=1 X j → R is his real-valued utility, or pay-off function. Let i ∈ {1, 2, . . . , n} and let xi , yi ∈ X i be some strategies of the ith player. We say that the strategy yi dominates the strategy xi , if the following conditions hold. (1) For any selection of strategies sk ∈ X k , where k ∈ {1, 2, . . . , n}, k 6 = i, u i (s1 , s2 . . . , si−1 , xi , si+1 , . . . , sn ) ≤ u i (s1 , s2 . . . , si−1 , yi , si+1 , . . . , sn ). (2) For each k ∈ {1, 2, . . . , n}, k 6 = i, there exists some strategy tk ∈ X k such that u i (t1 , t2 . . . , ti−1 , xi , ti+1 , . . . , tn ) < u i (t1 , t2 . . . , ti−1 , yi , ti+1 , . . . , tn ). The strategy xi ∈ X i of the ith player is said to be undominated if there is no strategy yi ∈ X i that dominates xi . It should be noted that this kind of dominance is sometimes referred to as a weak dominance, in opposition to strict dominance, which differs from the above-defined notion at the condition (1) by the strict form < of the inequality. Two strategies xi , yi ∈ X i are called equivalent, if for any selection of strategies sk ∈ X k , where k ∈ {1, 2, . . . , n}, k 6 = i, it follows that u i (s1 , s2 . . . , si−1 , xi , si+1 , . . . , sn ) = u i (s1 , s2 . . . , si−1 , yi , si+1 , . . . , sn ). (For more detail, see, for example, [2, 11].) A binary relation on a set is called a preorder, if it is reflexive and transitive (and not necessarily antisymmetric). Let A be a nonempty set and 4 be a preorder on A such that for every x, y ∈ A there exists z ∈ A with x 4 z and y 4 z. Then we say that (A, 4) is a directed set. Let X be a topological space. A net in X is an arbitrary mapping from a directed set to the space X . A family 8 of nonempty sets is called a filter base if any intersection of two sets belonging to 8 contains a subset from 8. We say that p ∈ X is a θ -cluster point of a filter base 8 in X , if for every closed neighborhood H of p and every F ∈ 8, the intersection H ∩ F is nonempty. Similarly, p is a θ-cluster point of a net ϕ(A, 4), if for each closed neighborhood H of p and for each a ∈ A, there exists b ∈ A, b < a, such that ϕ(b) ∈ H . Taking the ϕ-images of the principal upper sets ↑a = {b| b ∈ A, b < a}, we can easily convert the net ϕ(A, 4) into a filter base, while the corresponding convergence and θ -convergence notions will remain preserved. April 2014]

UNDOMINATED STRATEGIES IN NORMAL FORM GAMES

This content downloaded from 92.82.8.190 on Sun, 30 Mar 2014 17:29:39 PM All use subject to JSTOR Terms and Conditions

333

A topological space X is said to be compact, if every filter base (or equivalently, every net) in X has a cluster point. For more detail and other equivalent characterizations of compactness, especially in terms of open covers, we refer the reader to [3]. We also remark that in a modern approach to compactness, motivated by the growing interest of theoretical computer scientists in topology, the Hausdorff separation axiom is no longer assumed as a part of the definition of compactness (see, for example, [15]). A topological space is called almost compact [1] if every open filter base in X has a cluster point. It is clear that every compact topological space is almost compact (but not vice-versa). The real line R we consider is a topological space equipped with the natural, Euclidean topology, generated by all open intervals. 3. MAIN RESULTS. We will start with the following simple example. It illustrates some of the limitations of Moulin’s classical result. The undominated strategies may exist even if the spaces of strategies are not compact. Example 3.1. Consider a normal form game of two players with the same sets of strategies X 1 = X 2 = [0, 1) × {0} ∪ {1} × {0, 1, . . .}. Let the corresponding utility functions of the players be u1 =

x1 · f (y2 ), x1 + x2

and

u2 =

x2 · g(y1 ), x1 + x2

where f, g are arbitrary real-valued functions defined on {0} ∪ N. It is easy to see that the pairs (1, n) ∈ X i , where n ∈ {0, 1, . . .} and i = 1, 2, are equivalent, maximal and undominated strategies of the ith player. However, although the utility functions u i are continuous, the topology of X i , induced from the real plane is not compact. For instance, the sequence {(1, n)| n = 0, 1, 2, . . .} has no cluster point in X i . Hence, the existence of undominated strategies of the ith player is not a consequence of Moulin’s theorem. There are many possible interpretations of the previous example, but probably one of the most important is that there could be a duopolistic competition over market share with patent wars. The first component xi of the strategy (xi , yi ) of the ith player may represent the market share, while the second component yi can be interpreted as obstructions extracting the profit of the player’s opponent, in particular, litigation over patent rights. For our main theorem and also for a better understanding of some other aspects of the previous example, we will need the following lemma. The contents of the lemma are already known—it is essentially contained in (but rather split between) the book [1] and the paper [14]. Also useful are comments in [5]. We present the result here with a proof, in order to repeat and concentrate some ideas of these resources in one place for the reader’s convenience. Lemma 3.1. Let (X, τ ) be a topological space. The following conditions are equivalent: (i) (ii) (iii) (iv) 334

(X, τ ) is almost compact, every filter base in X has a θ -cluster point, every net in X has a θ -cluster point, every open cover of X has a finite subfamily whose union is dense in X . c THE MATHEMATICAL ASSOCIATION OF AMERICA [Monthly 121

This content downloaded from 92.82.8.190 on Sun, 30 Mar 2014 17:29:39 PM All use subject to JSTOR Terms and Conditions

Proof. Suppose (i), and let 8 be a filter base in X . The family 9 = {U | U ∈ τ , there exists F ∈ 8 with F ⊆ U } is an open filter base, and so it has a cluster point, say p ∈ X . Let H be a closed neighborhood of p. We will show that H ∩ F 6 = ∅ for every F ∈ 8. Suppose conversely that F ⊆ X \ H for some R F ∈ 8. RThen X \ H ∈ 9 and so p ∈ cl(X \ H ). But this is not possible, since p ∈ H and ( H ) ∩ (X \ H ) = ∅. Hence, (ii) follows. Consider (ii) and take a net ϕ(A, 4) in X . The family 8 = {ϕ(↑a)| a ∈ A} is a filter base with a θ-cluster point, say p ∈ X . Let H be a closed neighborhood of p and let a ∈ A. Then H ∩ ϕ(↑a) 6 = ∅, so there is some b ∈ A, b < a, with ϕ(b) ∈ H . It means that p is a θ -cluster point of ϕ(A, 4) and (iii) holds. Assume (iii), and take an open cover of X . Let F be the family of all finite unions of elements of . The family F is directed by the set inclusion. Suppose that for every U ∈ F , the set X \ cl U is nonempty, so it contains some element ϕ(U ). The net ϕ( F , ⊆) has a θ-cluster point, say p ∈ X . Since F is also a cover, there is some V ∈ F containing p. By the definition of the θ -cluster point, there exists W ∈ F , W ⊇ V , such that ϕ(W ) ∈ cl V . But it also holds that ϕ(W ) ∈ X \ cl W , so ∅ 6= (X \ cl W ) ∩ V ⊆ (X \ W ) ∩ W , which is not possible. Then some element of F must be dense in X . T Finally, suppose (iv). Let 9 be an open filter base in X with no cluster point. Then {cl U | U ∈ 9} = ∅, so = {X \ cl U | U ∈ 9} is an open cover of X ; and since 9 is a filter base, it is directed by the inclusion. By (iv), there exists U ∈ 9, such that X = cl(X \ cl U ). Since X \ U is a closed set containing (X \ cl U ), it also contains its closure, and so X \ U = X . But this is not possible according to the fact that a filter base contains only nonempty elements. Therefore, 9 has a cluster point and (i) now follows. From the previous lemma also follows the well-known fact that for regular spaces, the compactness and almost compactness coincide. On the other hand, there exists a Hausdorff almost compact space that is not compact, as the reader may check in [1]. Hausdorff almost compact spaces are also known as H -closed spaces (also in [1], or [14]). Theorem 3.1. Let G = (X 1 , X 2 , . . . , X n , u 1 , u 2 , . . . , u n ) be a normal form game of n players. Suppose that for some i ∈ {1, 2, . . . , n}, X i is almost compact and the utility function u i is a continuous, real valued function of the argument xi ∈ X i . Then the ith player has an undominated strategy. Proof. For two strategies xi , yi ∈ X i we put xi 4 yi (sometimes we will write this relation as yi < xi ) if they satisfy the condition (1) of the definition of dominance in Section 2. It is easy to see that 4 is a preorder on X i . Let L ⊆ X i be an arbitrary linearly preordered subset of X i (that is, for every a, b ∈ L, it holds a 4 b or b 4 a). Let l be the identity mapping on X i , restricted to L. Then l is a net in an almost compact topological space X i , so l has a θ -cluster point p ∈ X i . By the definition, it means that for every closed neighborhood H of p and every t ∈ L, there exists some s ∈ L, s < t, with l(s) ∈ H . Now, suppose that the strategies sk ∈ X k of the other players, k 6 = i, are arbitrarily chosen, but fixed in this paragraph. We denote u i0 (xi ) = u i (s1 , s2 . . . , si−1 , xi , si+1 , . . . , sn ). Suppose, for a moment, that there exist some t ∈ L with u i0 ( p) < u i0 (t). Take c ∈ R such that u i0 ( p) < c < u i0 (t). Because of continuity of u i0 , H = u i0−1 ((−∞, c]) is a closed set in X i whose interior contains p. Since p is a θ cluster point of l, there exists s ∈ L, with s < t, such that s = l(s) ∈ H . But April 2014]

UNDOMINATED STRATEGIES IN NORMAL FORM GAMES

This content downloaded from 92.82.8.190 on Sun, 30 Mar 2014 17:29:39 PM All use subject to JSTOR Terms and Conditions

335

then u i0 (s) ∈ (−∞, c], which is not possible, because the relation s < t means that c < u i0 (t) ≤ u i0 (s). Consequently, p is an upper bound of L. By Zorn’s Lemma, there is a maximal element m in the preordered set (X i , 4). This completes the proof, since the strategy, maximal with respect to 4, cannot be dominated. Note that Zorn’s Lema is usually formulated for partially ordered sets. Using preordered sets, its appropriate formulation can be found in [7]. Hence, the maximality of m, which is claimed in our proof, is maximality up to the equivalence of strategies. It means that there may exist another strategy m 0 ∈ X i , different from m (and also “maximal”), on which the utility function u i has the same values. Since every compact topological space is almost compact, the classical version of Moulin’s theorem now follows as a corollary. The reader can also compare Theorem 3.1 with other interesting results and techniques known from the game theory literature. For instance, H. Salonen in [12] replaced the continuity of the utility function by its upper semi-continuity. He essentially used a characterization of compactness by the centered collections of sets (in other words, having the finite intersection property, [10]), or filters and filter bases, which are topologically equivalent to nets. A similar technique was also used in [11] for iteratively undominated strategies with the continuous utility function. Now, let us check the advantage of Theorem 3.1 over its original, classical version. Example 3.2. Consider the game already described in Example 3.1. Let us define another topology on X i , where i = 1, 2, by the local base of a general point (x, y) ∈ X i : (i) the point (0, 0) has neighborhoods of the form [0, ε) × {0}, 0 < ε < 1, (ii) for every x ∈ (0, 1), the point (x, 0) has neighborhoods of the form (x − ε, x + ε) × {0}, where 0 < ε < min{x, 1 − x}, (iii) for every n = 0, 1, . . . , the point (1, n) has neighborhoods having the form (1 − ε, 1) × {0} ∪ {(1, n)}, where 0 < ε < 1. The new topology on X i is now similar to the Euclidean topology on the unit segment [0, 1], but with one important difference—the right end point of the “segment” is present infinitely many times. The space X i is T1 , but certainly non-Hausdorff and noncompact. Indeed, denoting Yn = [0, 1) × {0} ∪ {(1, n)}, the family {Yn | n = 0, 1, . . .} is an open cover of X i , having no finite subcover. However, we can show that the new topology is almost compact. Let be an open cover of X i . The subspace Y0 = [0, 1] × {0} ⊆ X i is compact since it is homeomorphic to the unit segment S [0, 1], so there exists a finite subfamily {U1 , U2 , . . . , Uk } ⊆ with Y0 ⊆ kj=1 U j . Then there is r ∈ {1, 2, . . . , k} such that (1, 0) ∈ Ur . But for every n = 1, 2, . . . , it follows that (1, n) ∈ cl Ur , so the closures of {U1 , U2 , . . . , Uk } cover X i . By condition (iv) of Lemma 3.1, X i is almost compact. The utility functions u i are continuous functions of the argument (xi , yi ), since they are continuous on the open subspaces Yn = [0, 1) × {0} ∪ {(1, n)} of X i , n = 0, 1, . . . , homeomorphic to [0, 1]. Hence, the existence of the undominated strategies now follows from Theorem 3.1. Note that similar spaces as X i are also known as examples of non-Hausdorff manifolds with some motivation in sheaf theory and mathematical physics (see, for example, [6] or [4]). ACKNOWLEDGMENTS. The authors are very grateful to both anonymous referees for many valuable suggestions and comments, in particular related to the game-theoretical part of the content of our paper, and to the editor for his assistance with preparation of the final form of the manuscript. The authors are also thankful to Professor V. A. Gorelik from the Dorodnitsyn Computing Center of the Russian Academy of Sciences for his advice on finding an appropriate game-theoretical literature at the initial stage of their work.

336

c THE MATHEMATICAL ASSOCIATION OF AMERICA [Monthly 121

This content downloaded from 92.82.8.190 on Sun, 30 Mar 2014 17:29:39 PM All use subject to JSTOR Terms and Conditions

This work is supported by a specific research grant FEKT-S-11-2/921 of the Faculty of Electrical Engineering and Communication, Brno University of Technology.

REFERENCES 1. 2. 3. 4. 5. 6. 7. 8. 9. 10. 11. 12. 13. 14. 15.

´ Cs´asz´ar, General Topology. Akademiai Kiad´o, Budapest, 1978. A. D. Fudenberg, J. Tirole, Game Theory. MIT Press, Cambridge, 1991. R. Engelking, General Topology. Heldermann Verlag, Berlin, 1989. M. Heller, L. Pysiak, W. Sasin, Geometry of non-Hausdorff spaces and its significance for physics, J. Math. Phys. 52 (2011) 1–7, available at http://dx.doi.org/10.1063/1.3574352. D. S. Jankovi´c, θ-regular spaces, Internat. J. Math. Sci. 8 (1985) 615–619, available at http://dx.doi. org/10.1155/S0161171285000667. S. L. Kent, R. A. Mimna, J. K. Tartir, A note on topological properties of non-Hausdorff manifolds, Internat. J. Math. Sci. (2009) 1–4, available at http://dx.doi.org/10.1155/2009/891785. R. E. Meggison, An Introduction to Banach Space Theory. Springer-Verlag, Berlin, 1998. H. Moulin, Theorie des Jeux pour l’Economie et la Politique. Hermann Paris—Collection Methodes, Paris, 1981. , Game Theory for the Social Sciences. Second and revised edition. New York University Press, New York, 1986. J. Nagata, Modern General Topology. North-Holland, Amsterdam, 1974. K. Ritzberger, Foundations of Non-Cooperative Game Theory. Oxford University Press, Oxford, 2002. H. Salonen, On the existence of undominated Nash equilibria in normal form games, Games and Economic Behavior 14 (1996) 208–219, available at http://dx.doi.org/10.1006/game.1996.0049. W. J. Thron, Topological Structures. Holt, Rinehart and Winston, New York, 1966. N. V. Veliˇcko, H -closed topological spaces, Mat. Sb. 70(112) (1966) 98–102 (Russian). S. Vickers, Topology Via Logic. Cambridge University Press, Cambridge, 1989.

´ is an associate professor of mathematics at Brno University of Technology (Brno, Czech MARTIN KOVAR Republic). He obtained his Ph.D. (1994) from Masaryk University in Brno and his habilitation degree (2006) from Charles University in Prague. He started to study physics at Masaryk University in 1985. Even as a student, he was fascinated by the elegance and beauty of classical topological results. His growing interest in topology finally led to a change in his area of specialization. His main fields of interest are general and applied topology motivated by problems from computer science and physics. However, theoretical and mathematical physics still remain in the extended range of his scientific interests. To date, he has published approximately 30 research articles. He strongly believes that topology is a fascinating mathematical discipline of the future, with an excellent, but so far underused, potential for many applications, including computer science, physics, and modern technologies. He also believes that science is fun and that the liberty of scientific research is one the greatest values and achievements of humanity and should be carefully protected. Department of Mathematics, Faculty of Electrical Engineering and Communication, Brno University of Technology, Technick´a 8, Brno, 616 00, Czech Republic [email protected]

ALENA CHERNIKAVA received her master’s degree (2009) at the State University of Minsk in Belarus. She excelled at her studies, and in 2010 she began Ph.D. study at Brno University of Technology in Czech Republic. Her principal fields of interest are applied topology, formal concept analysis, and game theory. She is an author or co-author of several research papers. Department of Mathematics, Faculty of Electrical Engineering and Communication, Brno University of Technology, Technick´a 8, Brno, 616 00, Czech Republic [email protected]

April 2014]

UNDOMINATED STRATEGIES IN NORMAL FORM GAMES

This content downloaded from 92.82.8.190 on Sun, 30 Mar 2014 17:29:39 PM All use subject to JSTOR Terms and Conditions

337

An Asymptotic Formula for (1 + 1/ x ) x Based on the Partition Function Author(s): Chao-Ping Chen, Junesang Choi Source: The American Mathematical Monthly, Vol. 121, No. 4 (April), pp. 338-343 Published by: Mathematical Association of America Stable URL: http://www.jstor.org/stable/10.4169/amer.math.monthly.121.04.338 . Accessed: 30/03/2014 17:29 Your use of the JSTOR archive indicates your acceptance of the Terms & Conditions of Use, available at . http://www.jstor.org/page/info/about/policies/terms.jsp

. JSTOR is a not-for-profit service that helps scholars, researchers, and students discover, use, and build upon a wide range of content in a trusted digital archive. We use information technology and tools to increase productivity and facilitate new forms of scholarship. For more information about JSTOR, please contact [email protected].

.

Mathematical Association of America is collaborating with JSTOR to digitize, preserve and extend access to The American Mathematical Monthly.

http://www.jstor.org

This content downloaded from 92.82.8.190 on Sun, 30 Mar 2014 17:29:52 PM All use subject to JSTOR Terms and Conditions

An Asymptotic Formula for (1 + 1/x) x Based on the Partition Function Chao-Ping Chen and Junesang Choi

Abstract. We present a method to produce estimations of the natural logarithmic constant e, accurate to as many decimal places as we desire. The method is based on an asymptotic formula for (1 + 1/x)x , which uses the partition function.

In contrast with the continuing fascination with finding as many digits as possible of the decimal approximation of π, few mathematicians seem interested in computing the base e of the natural logarithms to a comparable precision (see [2, 6]). Joost B¨urgi seems to have formulated the first approximation to e around 1620, obtaining threedecimal-place accuracy (see [3, p. 31], [5], and [6, pp. 26–27]). One classical computation of e depends upon the well-known limit 1 x e = lim 1 + . x→∞ x

(1)

Indeed, it is easy to see that the function f (x):= (1 + 1/x)x increases and is bounded above by 3 on the interval [1, ∞); thus, a larger value of x gives a more accurate approximation to e. For example, f 105 = 2.7182 6823 . . . approximates e to four decimal places. Another classical computation of e uses Isaac Newton’s first version (1620) of what is now known as the Maclaurin series expansion for e x [2]: ex =

∞ X x2 x3 xj =1+x + + + ··· j! 2! 3! j=0

for all x ∈ R.

(2)

Setting x = 1 in (2) and choosing a large value of n, we obtain the partial sum n X 1 1 1 1 = 1 + 1 + + + ··· + , j! 2! 3! n! j=0

which gives a simple, direct approximation to e that is the best way of calculating e to high accuracy [1, 2]. Present numerical values of e are derived using either optimized versions of this Maclaurin series (2) or the continued-fraction expansion approach initiated by Euler [2]. A further classical approach to approximating e uses the Maclaurin series expansion of ln(1 + x): ln(1 + x) =

∞ X (−1) j−1 j=1

j

xj

for − 1 < x ≤ 1.

(3)

http://dx.doi.org/10.4169/amer.math.monthly.121.04.338 MSC: Primary 05A17, Secondary 11P81

338

c THE MATHEMATICAL ASSOCIATION OF AMERICA [Monthly 121

This content downloaded from 92.82.8.190 on Sun, 30 Mar 2014 17:29:52 PM All use subject to JSTOR Terms and Conditions

The only example of this alternative approach (that the authors of [2] have found in the literature) is given by replacing x with 1/x in (3) and then multiplying the resulting series by x to get 1 1 1 1 1 1 1 =1− + 2 − 3 + 4 − 5 + 6 − ··· x ln 1 + x 2x 3x 4x 5x 6x 7x

(4)

for x < −1 or x ≥ 1. By exponentiating each side of (4) and collecting the same powers of 1/x with the help of the Maclaurin series (2) for e x , we find an approximation to e that has been known by mathematicians and bankers alike since the early seventeenth century (see [2, Eq. (4)]). For x < −1 or x ≥ 1, we obtain 1 11 7 2447 959 238043 1 x = e 1− + − + − + −· · · . 1+ x 2x 24x 2 16x 3 5760x 4 2304x 5 580608x 6 (5) Setting, for example, x = 100,000 in the left-hand side of (5) yields an approximation to e that is accurate to four decimal places. Motivated by this technique, Knox and Brothers [5] (see also Brothers and Knox [2]) present an interesting and useful method that yields a new and more accurate approximation to e by combining two good approximations. We choose to demonstrate one of their many results here (see [2] or [5]). Adding approximation (5) and the approximation obtained by replacing x by −x in (5), and multiplying the resulting identity by 1/2, they obtain the following better approximation to e than that given by (5): 1 2

1 x 1 −x 11 2447 238043 1+ + 1− = e 1+ + + +· · · . x x 24x 2 5760x 4 580608x 6

(6)

Even though we can obtain as many coefficients as we please in the right-hand side of (5) by using Mathematica, here we aim at giving a formula for determining these coefficients. Our formula is based mainly on the partition function (see, e.g., [7, 9]). For our later use, we introduce the following set of partitions of an integer n ∈ N = N0 \ {0} := {1, 2, 3, . . .}: An := (k1 , k2 , . . . , kn ) ∈ Nn0 : k1 + 2k2 + · · · + nkn = n .

(7)

In number theory, the partition function p(n) represents the number of possible partitions of n ∈ N (e.g., the number of distinct ways of representing n as a sum of natural numbers, regardless of order). By convention, p(0) = 1 and p(n) = 0 for n a negative integer. For more information on the partition function p(n), please refer to [7] and the references therein. The first several values of the partition function p(n) are (starting with p(0) = 1, see [9]): 1, 1, 2, 3, 5, 7, 11, 15, 22, 30, 42, . . . . It is easy to see that the cardinality of the set An is equal to the partition function p(n). Now we are ready to present a formula that determines the coefficients a j ’s in (8), with the help of the partition function asserted by the following theorem. April 2014]

AN ASYMPTOTIC FORMULA FOR (1 + 1/x)x

This content downloaded from 92.82.8.190 on Sun, 30 Mar 2014 17:29:52 PM All use subject to JSTOR Terms and Conditions

339

Theorem. The following approximation formula holds true: ∞ X 1 x aj 1+ =e x xj j=0

(8)

as x → ∞,

where the coefficients a j ( j ∈ N) are given by a0 := 1 and a j = (−1)

X

j

(k1 ,k2 ,...,k j )∈A j

1 k1 ! k2 ! · · · k j !

k1 k2 k j 1 1 1 ··· , 2 3 j +1

(9)

where the A j (for j ∈ N) are given in (7). Proof. In view of the Maclaurin series (2) of ln(1 + x), we can let   q X 1 aj  x ln 1 + = 1 + ln 1 + + O(x −q−1 ) for x → ∞ and q ∈ N, j x x j=1 where a1 , . . . , aq are real numbers to be determined. From the fundamental theorem of algebra, we see that there exist unique complex numbers x1 , . . . , xq such that 1+

a1 aq x1 xq + ··· + q = 1 + ··· 1 + . x x x x

(10)

By using the following series expansion: q z X (−1) j−1 z j ln 1 + = + O(x −q−1 ) for |z| < |x| and x → ∞, j x j x j=1

we obtain q a1 aq X (−1) j−1 S j ln 1 + + ··· + q = + O(x −q−1 ) j x x j x j=1

for x → ∞,

(11)

where j

S j = x1 + · · · + xqj Replacing x by

1 x

for j = 1, . . . , q.

in (3) and multiplying the resulting equation by x, we get

q X (−1) j−1 1 x ln 1 + =1− + O(x −q−1 ) j x ( j + 1)x j=1

for x → ∞.

(12)

We then find from (11) and (12) that Sj = − 340

j j +1

for j = 1, . . . , q,

(13)

c THE MATHEMATICAL ASSOCIATION OF AMERICA [Monthly 121

This content downloaded from 92.82.8.190 on Sun, 30 Mar 2014 17:29:52 PM All use subject to JSTOR Terms and Conditions

that is,  1   x1 + · · · + xq = − ,    2      2 2 x1 + · · · + xq2 = − , 3   ..   .    q  q q  . x1 + · · · + xq = − q +1

(14)

Let Pq (x) = x q + b1 x q−1 + · · · + bq−1 x + bq be a polynomial with zeros x1 , . . . , xq satisfying the system of equations (14). So we have Pq (x) = (x − x1 ) · · · (x − xq ).

(15)

The Newton formulas (see, e.g., [4] and references therein) give the connection between the coefficients b j and the power sums S j : for j = 1, . . . , q.

S j + S j−1 b1 + S j−2 b2 + · · · + S1 b j−1 + jb j = 0 It is known [4] that b j can be expressed in terms of S j : X

bj =

(k1 ,k2 ,...,k j )∈A j

(−1)k1 +k2 +···+k j k1 !k2 ! · · · k j !

S1 1

k1

S2 2

k2

···

Sj j

k j

,

(16)

where the A j (for j ∈ N) are given in (7). From (15), we obtain x1 xq (−1)q P (−x) = 1 + · · · 1 + . q xq x x We thus have 1+

(−1)b1 (−1)2 b2 (−1)q−1 bq−1 (−1)q bq + + · · · + + x x2 x q−1 xq x1 xq ··· 1 + . = 1+ x x

(17)

We see from (10) and (17) that the coefficients a j are given by a j = (−1) j b j = (−1) j

X (k1 ,k2 ,...,k j )∈A j

(−1)k1 +k2 +···+k j k1 !k2 ! · · · k j !

S1 1

k 1

S2 2

k2

···

Sj j

k j

,

(18)

where the S j are given in (13). Finally, substituting the expression (13) into (18) yields (9). This completes the proof. April 2014]

AN ASYMPTOTIC FORMULA FOR (1 + 1/x)x

This content downloaded from 92.82.8.190 on Sun, 30 Mar 2014 17:29:52 PM All use subject to JSTOR Terms and Conditions

341

Remark. Here we give explicit numerical values of some first terms of a j by using the partition set (7) and the formula (9). This shows how easily we can determine a j ’s in (9). Obviously, a1 = −

X 1 1 k1 1 =− . k ! 2 2 k =1 1 1

For k1 + 2k2 = 2, since p(2) = 2, the partition set A2 in (7) is seen to have two elements:

A2 = {(0, 1), (2, 0)} . From (9), we have X

a2 =

(k1 ,k2 )∈A2

k1 k2 1 1 11 = . 2 3 24

1 k1 !k2 !

For k1 + 2k2 + 3k3 = 3, since p(3) = 3, as above, the partition set A3 in (7) contains three elements:

A3 = {(0, 0, 1), (1, 1, 0), (3, 0, 0)} . We then find from (9) that a3 = −

X (k1 ,k2 ,k3 )∈A3

1 k1 !k2 !k3 !

k1 k2 k3 1 1 1 7 =− . 2 3 4 16

Likewise, the partition sets A4 and A5 have 5 = p(4) and 7 = p(5) elements, respectively, and so

A4 = {(0, 0, 0, 1), (1, 0, 1, 0), (0, 2, 0, 0), (2, 1, 0, 0), (4, 0, 0, 0)}

and

A5 = {(0, 0, 0, 0, 1), (1, 0, 0, 1, 0), (0, 1, 1, 0, 0), (2, 0, 1, 0, 0), (1, 2, 0, 0, 0), (3, 1, 0, 0, 0), (5, 0, 0, 0, 0)} , which yields a4 =

2447 5760

and a5 = −

959 . 2304

We note that the explicit numerical values of a j (for j = 1, 2, 3, 4, 5) here correspond with the coefficients of 1/x j (for j = 1, 2, 3, 4, 5) in (5), respectively. By using (8), we find that 1 2

∞ X 1 + (−1) j a j 1 x 1 −x 1+ + 1− =e x x 2x j j=0

for x → ∞,

(19)

where the a j (for j ∈ N0 ) are given in (9). 342

c THE MATHEMATICAL ASSOCIATION OF AMERICA [Monthly 121

This content downloaded from 92.82.8.190 on Sun, 30 Mar 2014 17:29:52 PM All use subject to JSTOR Terms and Conditions

ACKNOWLEDGMENTS. Thanks to the Editor, Professor Scott Chapman, for his several enduring encouragements to improve the exposition of this note and to the anonymous referees for their constructive comments. Thanks also to Professor Jack R. Quine and Professor Bettye Anne Case of Florida State University for their help in improving the exposition of this note. This research was supported by the Basic Science Research Program through the National Research Foundation of Korea funded by the Ministry of Education, Science and Technology of the Republic of Korea (2012-0002957).

REFERENCES 1. G. Arfken, Mathematical Methods for Physicists. Third edition. Academic Press, New York, 1985. 2. H. J. Brothers, J. A. Knox, New closed-form approximations to the logarithmic constant e, Math. Intelligencer 20 (1998) 25–29. 3. H. T. Davis, Tables of the Mathematical Functions. Vol. I, The Principia Press of Trinity University, San Antonio, Texas, 1963. 4. H. W. Gould, The Girard-Waring power sum formulas for symmetric functions and Fibonacci sequences, Fibonacci Quart. 37 (1999) 135–140. 5. J. A. Knox, H. J. Brothers, Novel series-based approximations to e, College Math. J. 30 (1999) 269–275. 6. E. Maor, e: The Story of a Number. Princeton University Press, Princeton, New Jersey, 1994. 7. Wikipedia contributors, Partition (number theory), Wikipedia, The Free Encyclopedia, available at http: //en.wikipedia.org/wiki/Partition_function_(number_theory)#Partition_function. 8. J. Sondow, E. W. Weisstein, “e.” From MathWorld–A Wolfram Web Resource, available at http: //mathworld.wolfram.com/e.html. 9. N. J. A. Sloane, a(n) = number of partitions of n (the partition numbers). Maintained by The OEIS Foundation, available at http://oeis.org/A000041.

CHAO-PING CHEN received his Bachelor of Science degree from Henan Normal University (China) in 1986 and his Master of Science Degrees from Southwest Jiaotong University (China) in 1995. He currently teaches at Henan Polytechnic University (Jiaozuo) in China. School of Mathematics and Informatics, Henan Polytechnic University, Jiaozuo City 454003, Henan Province, China [email protected]

JUNESANG CHOI received his B.A. from Gyeongsang National University (Republic of Korea) in 1981 and his Ph.D. from the Florida State University in 1991. He currently teaches at Dongguk University (Gyeongju) in the Republic of Korea. See http://wwwk.dongguk.ac.kr/~junesang/. Department of Mathematics, Dongguk University, Gyeongju 780-714, Republic of Korea [email protected]

April 2014]

AN ASYMPTOTIC FORMULA FOR (1 + 1/x)x

This content downloaded from 92.82.8.190 on Sun, 30 Mar 2014 17:29:52 PM All use subject to JSTOR Terms and Conditions

343

Stirling’s Approximation for Central Extended Binomial Coefficients Author(s): Steffen Eger Source: The American Mathematical Monthly, Vol. 121, No. 4 (April), pp. 344-349 Published by: Mathematical Association of America Stable URL: http://www.jstor.org/stable/10.4169/amer.math.monthly.121.04.344 . Accessed: 30/03/2014 17:30 Your use of the JSTOR archive indicates your acceptance of the Terms & Conditions of Use, available at . http://www.jstor.org/page/info/about/policies/terms.jsp

. JSTOR is a not-for-profit service that helps scholars, researchers, and students discover, use, and build upon a wide range of content in a trusted digital archive. We use information technology and tools to increase productivity and facilitate new forms of scholarship. For more information about JSTOR, please contact [email protected].

.

Mathematical Association of America is collaborating with JSTOR to digitize, preserve and extend access to The American Mathematical Monthly.

http://www.jstor.org

This content downloaded from 92.82.8.190 on Sun, 30 Mar 2014 17:30:05 PM All use subject to JSTOR Terms and Conditions

NOTES Edited by Sergei Tabachnikov

Stirling’s Approximation for Central Extended Binomial Coefficients Steffen Eger

Abstract. We derive asymptotic formulas for central extended binomial coefficients, which are generalizations of binomial coefficients, using the distribution of the sum of independent discrete uniform random variables with the Central Limit Theorem and a local limit variant.

1. STIRLING’S FORMULA AND CENTRAL BINOMIAL COEFFICIENTS. For a nonnegative integer k, Stirling’s formula, √ k! ∼

k k 2πk e

where number, yields an approximation of the central binomial coefficient e is Euler’s k k! using mk = m!(k−m)! as k/2 k 2k+1 ∼√ , k/2 2πk

where we write ak ∼ bk as short-hand for limk→∞ abkk = 1. In our current note, we derive asymptotic formulas for central extended binomial, or polynomial, coefficients (cf. [2, 3, 7]). These coefficients appear in the extended binomial triangles (which we also call (l + 1)-nomial, polynomial, or multinomial triangles [8]), which are generalizations of binomial, or Pascal, triangles, where entries in row k are defined as coefficients of the polynomial (1 + x + x 2 + · · · + x l )k for l ≥ 0. Our derivation is not based upon asymptotics of factorials, but upon the limiting distribution of the sum of discrete uniform random variables.1 2. EXTENDED BINOMIAL TRIANGLES. In generalization to binomial triangles, (l + 1)-nomial triangles, for l ≥ 0, are defined in the following way. Starting with a 1 in row zero, construct an entry in row k, for k ≥ 1, by adding the overlying (l + 1) entries in row (k − 1) (some of these entries are taken as zero if not defined); thereby, row k has (kl + 1) entries. For example, the binomial (l = 1), trinomial (l = 2), and quadrinomial triangles (l = 3) start as follows, http://dx.doi.org/10.4169/amer.math.monthly.121.04.344 MSC: Primary 11B65, Secondary 11N37; 60G50 1 Throughout, we assume that all fractional values such as x = kl are integral when used in the context of 2 extended binomial coefficients. If this is not the case, then replace respective quantities with their floor, bxc, the largest integer less than or equal to x.

344

c THE MATHEMATICAL ASSOCIATION OF AMERICA [Monthly 121

This content downloaded from 92.82.8.190 on Sun, 30 Mar 2014 17:30:05 PM All use subject to JSTOR Terms and Conditions

1 1 1 1 2 1 1 3 3 1

1 1 1 1 1 2 3 2 1 1 3 6 7 6 3 1

1 1 1 1 1 1 2 3 4 3 2 1 1 3 6 10 12 12 10 6 3 1

In the (l + 1)-nomial triangle, the nth entry, for 0 ≤ n ≤ kl in row k, which we denote by nk l+1 , has the following interpretation. It is the coefficient of x n in the expansion of kl X k

(1 + x + x + · · · + x ) = 2

l k

n=0

n

xn.

(1)

l+1

It has been shown that nk l+1 denotes the number of restricted integer compositions (for a definition, see, e.g., [9] and many others) of the nonnegative integer n with k parts π1 , . . . , πk , each from the set {0, 1, . . . , l} (cf. [5]), and allows the following representation, k = n l+1

X k0 ≥0,...,kl ≥0 k0 +···+kl =k 0·k0 +1·k1 +···+l·kl =n

k , k 0 , . . . , kl

(2)

k k! , for nonnegative integers where k0 ,...,k is a multinomial coefficient, defined as k0 !...k l! l k0 , . . . , kl . We can verify representation (2) by noting that for real numbers x0 , . . . , xl , the multinomial theorem (cf. [15]) states that X k k k k (x0 + x1 + · · · + xl ) = x 0 0 · · · · · xl l . k , . . . , k 0 l k ≥0,...,k ≥0 0

l

k0 +···+kl =k

Thus, setting xi = x i for i = 0, . . . , l, (1 + x + x 2 + · · · + x l )k =

X k0 ≥0,...,kl ≥0 k0 +···+kl =k

k x 0·k0 +···+l·kl , k 0 , . . . , kl

(3)

so that comparing coefficients of the right-hand sides of (1) and (3) leads to (2). 3. GENERALIZED STIRLING’S APPROXIMATION. Our strategy for deriving approximation formulas for central extended binomial coefficients is as follows. First, we determine the asymptotic distribution of the sum of discrete uniform variables, which we easily find to be a normal distribution by the Central Limit Theorem (CLT). Then, we determine the exact distribution, which turns out to yield the normalized extended binomial coefficients nk l+1 . By relating the density of the asymptotic distribution to the density of the exact distribution (e.g., via a ‘local limit’ argument), we obtain an extended binomial analogue of Stirling’s approximation to central binomial coefficients. 3.1. Step 1: Asymptotic distribution of the sum of discrete uniform variables. Let k be a positive integer and let l be a nonnegative integer. Let X j , for j = 1, . . . , k, April 2014]

NOTES

This content downloaded from 92.82.8.190 on Sun, 30 Mar 2014 17:30:05 PM All use subject to JSTOR Terms and Conditions

345

be identically and independently distributed random draws from the discrete uniform distribution on the set {0, . . . , l}, and let Sk be their sum, Sk =

k X

X j.

j=1

Obviously, by standard moments of the uniform distribution, the mean and variance of each X j are given by µ = E[X j ] =

l , 2

and σ 2 = Var[X j ] =

(l + 1)2 − 1 . 12

Hence, by independent and identical distribution of X 1 , . . . , X k , and application of √ the CLT, the random variable k( Skk − µ) converges, as k → ∞, in distribution to a normally N (0, σ 2 ) distributed random variable. Recall that convergence in distribu√ S k tion precisely means that the cumulative density function of k( k − µ) converges pointwise to the cumulative density function of the N (0, σ 2 ) distribution. 3.2. Step 2: Exact distribution of the sum of discrete uniform random variables. We now determine exactly the probability that Sk takes on the integer value n, for 0 ≤ n ≤ kl. To do so, we consider ‘isomorphic copies’ X˜ j of X j , which are independently 1 and identically multinomially distributed with probabilities p0 = · · · = pl = l+1 of types 0 to l. Each X˜ j = (A0 , . . . , Al ) is vector-valued, with P[ X˜ j = (a0 , . . . , al )] = 1 for nonnegative integers as , with a0 + · · · + al = 1, where As denotes the number l+1 of times an event of type s, for s = 0, . . . , l, occurs. Then, the sum S˜k = X˜ 1 + · · · + X˜ k has the interpretation of representing the event of drawing with replacement k balls of (l + 1) different types from a bag, where the probability of drawing type s = 0, . . . , l 1 is l+1 . Thus, by the standard interpretation of the multinomial distribution, S˜k has density P[ S˜k = (a0 , . . . , al )] = P[A0 = a0 , . . . , Al = al ] =

k a0 , . . . , al

1 l +1

k

,

where a0 +· · ·+al = k for nonnegative integers a0 , . . . , al . Then, if S˜k = (a0 , . . . , al ), Sk , the variable corresponding to S˜k , represents the integer 0 · a0 + · · · + l · al . Thus, for n such that 0 ≤ n ≤ kl, X P[Sk = n] = P[ S˜k = (a0 , . . . , al )] a0 ≥0,...,al ≥0 a0 +···+al =k 0·a0 +···+l·al =n

=

1 l +1

k

X a0 ≥0,...,al ≥0 a0 +···+al =k 0·a0 +···+l·al =n

k a0 , . . . , al

=

1 l +1

k k , n l+1

using representation (2). An arguably more straightforward derivation of the exact distribution of Sk , making use of probability generating functions (pgfs), can be given by noting that the pgf 346

c THE MATHEMATICAL ASSOCIATION OF AMERICA [Monthly 121

This content downloaded from 92.82.8.190 on Sun, 30 Mar 2014 17:30:05 PM All use subject to JSTOR Terms and Conditions

G X j (x) =

P

n≥0

P[X j = n]x n of each X j is given by l

1 X n G X j (x) = x . l + 1 n=0 Whence, the pgf of Sk is given as, by independence of X 1 , . . . , X k , G Sk (x) = G X 1 (x) · · · · · G X k (x) =

=

1 l +1

k X l

!k x

n

n=0

k X kl 1 k xn. l + 1 n=0 n l+1

Thus, G (n) Sk (0)

k 1 n! k n! l + 1 n! n l+1 k 1 k = , l +1 n l+1

P[Sk = n] =

=

where, by G (n) X (0), we denote the nth derivative of G X , evaluated at zero. 3.3. Step 3: Local limit theorem. To derive an asymptotic formula for nk l+1 , we would like to make use of the results derived in Steps 1 and 2 above. Ideally, we would like to equate the probability density function of the asymptotic normal dstribution of Sk with the exact distribution. However, as mentioned, convergence in distribution, as assured by the CLT, only guarantees pointwise convergence of cumulative density functions. On the contrary, ‘local limit theorems’ describe how the probability density function of a sum of random variables approaches the normal density function. For integer-valued random variables (also called lattice or arithmetical distributions), Gnedenko and Kolmogorov [10] provide the following result. Theorem 3.1 (see [10, p. 233]). If X 1 , X 2 , . . . are independent lattice random variables with identical distribution with finite mean µ and variance σ 2 , such that the greatest common divisor of the differences of all the values of X j taken with positive probability is 1, then (n−kµ)2 √ k σ P[Sk = n] − √1 e− 2σ 2 k → 0 2π uniformly in n as k → ∞. Since in our situation, the set of values of each X j taken with positive probability is {0, 1, . . . , l}, the greatest common divisor of the differences is clearly 1. Thus, all assumptions of Theorem 3.1 are satisfied in our case and, hence, also, the consequences hold. Therefore, the following approximation is suggested for large k:

April 2014]

√ 1 − (n−kµ)2 k σ P[Sk = n] ∼ √ e 2σ 2 k . 2π

(4)

NOTES

347

This content downloaded from 92.82.8.190 on Sun, 30 Mar 2014 17:30:05 PM All use subject to JSTOR Terms and Conditions

For n = kµ = kl/2, the argument to the exponential function is zero, and thus √

1 k σ P[Sk = kl/2] ∼ √ , 2π

1 P[Sk = kl/2] ∼ √ . 2π σ 2 k

or equivalently,

Using the exact form for P[Sk = n] from Step 2 above, we hence have, bringing the normalizing term (l + 1)k to the right-hand side, k kl 2

l+1

(l + 1)k ∼q . 2 −1 2πk (l+1) 12

(5)

For example, for l = 1, Pascal’s case, l = 2, l = 3, and l = 4, we therefore have the approximations k k 2

2k+1 , ∼√ 2πk

k 3k ∼q , k 3 4 πk 3

k 3 k 2

∼q 4

4k 5 πk 2

,

and

k 2k

5

5k ∼ √ . 2 πk

In Figure 1, we show for l = 4 the distributions P[Sk = n] for k = 5, 10, 20, and their respective normal approximations. There, we can see the local limit theorem ‘at work’: The exact density function apparently approaches, pointwise, the normal density function.

Figure 1. Distributions P[Sk = n] for k = 5, 10, 20 for l = 4 fixed, and normal approximations.

348

c THE MATHEMATICAL ASSOCIATION OF AMERICA [Monthly 121

This content downloaded from 92.82.8.190 on Sun, 30 Mar 2014 17:30:05 PM All use subject to JSTOR Terms and Conditions

4. DISCUSSION. Although extended binomial coefficients, together with their connection to the sum of discrete uniform random variables, go back at least to De Moivre’s Doctrine of Chances [4] and to Euler’s [6] analytical study of the coefficients of polynomial (1), the mathematics community has apparently more or less ignored their systematic study, except for a few recent publications such as [1, 2, 5, 7, 8]. Next, using the CLT (or a local limit variant) to deduce asymptotics of mathematical objects has been suggested, for example, by Walsh [14], who derives Stirling’s formula for factorials by equating the distribution of the sum of Poisson distributed random variables with the normal density. Finally, the asymptotics of both the central binomial (l = 1) as well as the central trinomial coefficients (l = 2) seem to be known (e.g. [7, 13]), while the general formula (5) is, to the best of our knowledge, novel. However, Ratsaby [12] derives our general result (4), as an estimate of the number of restricted integer compositions, by application of Cauchy’s coefficient formula to the polynomial (1) and computation of the resulting integral by Laplace’s method for evaluation of integrals. A historical perspective of local versus central limit theorem is provided by McDonald [11]. ACKNOWLEDGMENT. The author would like to thank the anonymous reviewers for helpful comments.

REFERENCES 1. R. C. Bollinger, C. L. Burchard, Lucas’s theorem and some related results for extended Pascal triangles, Amer. Math. Monthly 97 no. 3 (1990) 198–204. 2. C. C. S. Caiado, P. N. Rathie, Polynomial coefficients and distribution of the sum of discrete uniform variables, in Eighth Annual Conference of the Society of Special Functions and their Applications. Edited by A. M. Mathai, M. A. Pathan, K. K. Jose, and J. Jacob, Pala, India, 2007. 3. L. Comtet, Advanced Combinatorics: The Art of Finite and Infinite Expansions. D. Reidel Publishing Company, Dordrecht, 1974. 4. A. De Moivre, The Doctrine of Chances: Or, A Method of Calculating the Probabilities of Events in Play. Reprint of the third (1756) edition. Chelsea, New York, 1967. 5. S. Eger, Restricted weighted integer compositions and extended binomial coefficients, J. Integer Seq., 16 (2013). 6. L. Euler, De evolutione potestatis polynomialis cuiuscunque (1 + x + x 2 + · · · )n . Nova Acta Academiae Scientarum Imperialis Petropolitinae 12 (1801), available at http://math.dartmouth.edu/~euler/. 7. N.-E. Fahssi, The polynomial triangles revisited (2012), available at http://arxiv.org/abs/1202. 0228. 8. D. C. Fielder, C. O. Alford, Pascal’s triangle: Top gun or just one of the gang?, in Applications of Fibonacci Numbers. Edited by G. E. Bergum, A. N. Philippou, A. F. Horadam, Kluwer, Dordrecht, 1991. 9. P. Flajolet, R. Sedgewick, Analytic Combinatorics. Cambridge University Press, Cambridge, 2009. 10. B. V. Gnedenko, A. N. Kolmogorov, Limit Distributions for Sums of Independent Random Variables. Second edition. Addison-Wesley, Cambridge, MA, 1968. 11. D. R. McDonald, The local limit theorem: A historical perspective, JIRSS 4 (2005) 73–86. 12. J. Ratsaby, Estimate of the number of restricted integer-partitions, Appl. Anal. Discrete Math 2 (2008) 222-233. 13. The On-Line Encyclopedia of Integer Sequences, available at http://oeis.org, 2012, Sequence A002426. 14. D. P. Walsh, Equating Poisson and normal probability functions to derive Stirling’s formula, Amer. Statist. 49 (1995) 270–271. 15. E. Weisstein, Multinomial Series—From MathWorld, A Wolfram Web Resource, available at http: //mathworld.wolfram.com/MultinomialSeries.html. Economics Department, Goethe University Frankfurt am Main, Germany [email protected]

April 2014]

NOTES

This content downloaded from 92.82.8.190 on Sun, 30 Mar 2014 17:30:05 PM All use subject to JSTOR Terms and Conditions

349

A New Proof of Stirling’s Formula Author(s): Thorsten Neuschel Source: The American Mathematical Monthly, Vol. 121, No. 4 (April), pp. 350-352 Published by: Mathematical Association of America Stable URL: http://www.jstor.org/stable/10.4169/amer.math.monthly.121.04.350 . Accessed: 30/03/2014 17:30 Your use of the JSTOR archive indicates your acceptance of the Terms & Conditions of Use, available at . http://www.jstor.org/page/info/about/policies/terms.jsp

. JSTOR is a not-for-profit service that helps scholars, researchers, and students discover, use, and build upon a wide range of content in a trusted digital archive. We use information technology and tools to increase productivity and facilitate new forms of scholarship. For more information about JSTOR, please contact [email protected].

.

Mathematical Association of America is collaborating with JSTOR to digitize, preserve and extend access to The American Mathematical Monthly.

http://www.jstor.org

This content downloaded from 92.82.8.190 on Sun, 30 Mar 2014 17:30:13 PM All use subject to JSTOR Terms and Conditions

A New Proof of Stirling’s Formula Thorsten Neuschel

Abstract. A new, simple proof of Stirling’s formula via the partial fraction expansion for the tangent function is presented.

1. INTRODUCTION. Various proofs for Stirling’s formula √ n! ∼ n n e−n 2πn,

as n → ∞,

(1.1)

have been established in the literature since the days of de Moivre and Stirling in 1730 (for a historical exposition see, e.g., [1]). Many of these proofs show that the limit lim

n→∞

nn

n! √ e−n n

exists (for instance, via the Euler–Maclaurin formula) in order to identify this limit by using the asymptotical behavior of the Wallis product, which is the crucial step. We will show that this last, quite wily, step can be replaced by a simple straightforward computation of the limit using only the partial fraction expansion for the tangent function π tan π x =

∞ X

2x

ν=0

1 2 ) 2

(ν +

− x2

.

(1.2)

This expansion was probably found by Euler by the time Stirling determined his proof via Wallis’ formula, see, e.g., [6, p. 327]. For some alternative elementary proofs of Stirling’s formula see, e.g., [1, 2, 4, 5, 7]. 2. PROOF. An application of the well-known Euler–Maclaurin formula in its simplest form (see, e.g., [8, p. 37, (6.21)]) yields √ log n! = n log n − n + 1 + log n +

Z

n−1

0

x − [x] − 1+x

1 2

d x.

In order to prove (1.1), it is sufficient to show Z

∞

0

x − [x] − 1+x

1 2

√ d x = log

2π − 1.

(2.1)

To prove this, we will show directly the identity ∞

Z 0

x − [x] − 1+x

1 2

Z dx = 0

1/2

8x 2 − π x tan π x 1 − 4x 2

d x,

(2.2)

http://dx.doi.org/10.4169/amer.math.monthly.121.04.350 MSC: Primary 41A60

350

c THE MATHEMATICAL ASSOCIATION OF AMERICA [Monthly 121

This content downloaded from 92.82.8.190 on Sun, 30 Mar 2014 17:30:13 PM All use subject to JSTOR Terms and Conditions

where the integral on the right-hand side can be evaluated by elementary calculus. We start our computation with ) (Z Z ∞ Z ν+1 ∞ ν+1/2 X x − [x] − 21 x − ν − 12 x − ν − 21 dx = dx + dx 1+x 1+x 1+x ν+1/2 0 ν ν=0 ) (Z Z 1/2 ∞ 1/2 X x − 12 x = dx + dx . 3 1+ν+x +ν+x 0 0 2 ν=0 By an easy change of variables, we observe that Z

1/2

0

x − 12 dx = − 1+ν+x

1/2

Z 0

3 2

x d x, +ν−x

so that we obtain Z 0

∞

x − [x] − 1+x

1 2

dx =

∞ Z X ν=0

=

1/2

∞ Z X ν=0

Z = 0

3 2

0 1/2

0

∞ 1/2 X ν=1

x − +ν+x

3 2

x +ν−x

! dx

−2x 2 dx (ν + 32 )2 − x 2 −2x 2 d x, (ν + 12 )2 − x 2

(2.3)

where the interchange of summation and integration is allowed, due to the uniform convergence of the series in (2.3) on the interval [0, 12 ]. Applying (1.2), we immediately obtain (2.2). At this point of the proof, we have reduced the problem of determining the constant in Stirling’s formula to a simple matter of elementary calculus as the resulting integral in (2.2) can be evaluated easily. For convenience, we will give some details. For example, using the decomposition 8x 2 1 1 = + −2 1 − 4x 2 1 + 2x 1 − 2x it can be rewritten as Z 1/2 Z 1/2 √ 8x 2 1 2 − 1 + − π x tan π x d x = log − π x tan π x d x. 1 − 4x 2 1 − 2x 0 0 Now, by a standard argumentation involving integration by parts, we can observe for 0 < < 1/2 that Z 1 − π x tan π x d x 1 − 2x 0 Z 1 = − log(1 − 2) − π x tan π x d x 2 0 Z 1 1 cos(π ) = − log cos(π) + log − log cos(π x) d x. 2 2 1 − 2 0 April 2014]

NOTES

This content downloaded from 92.82.8.190 on Sun, 30 Mar 2014 17:30:13 PM All use subject to JSTOR Terms and Conditions

351

Letting tend to 1/2, we immediately obtain Z 0

1/2

1 − π x tan π x 1 − 2x

√ d x = log

π − log

√

Z 2−

1/2

log cos(π x) d x. 0

√ The remaining integral on the right-hand side can be easily evaluated to − log 2 as shown, e.g., in [2], [3]. This computation relies on the fact that its value, say c, remains unchanged if cos(π x) is replaced by sin(π x) so that we have (using the double angle formula) √

1/2

Z

log sin(2π x) d x = log 0

Z 2+2

1/2

log sin(π x) d x. 0

√ As both integrals in the last equation coincide, we obtain c = − log pletes the proof of (1.1).

2, which com-

REFERENCES 1. P. Diaconis, D. Freedman, An elementary proof of Stirling’s formula, Amer. Math. Monthly 93 (1986) 123–125. 2. W. Feller, A direct proof of Stirling’s formula, Amer. Math. Monthly 74 (1967) 1223–1225. 3. W. Feller, Correction to “A direct proof of Stirling’s formula”, Amer. Math. Monthly 75 (1968) 518. 4. R. Michel, On Stirling’s formula, Amer. Math. Monthly 109 (2002) 388–390. 5. J. Patin, A very short proof of Stirling’s formula, Amer. Math. Monthly 96 (1989) 41–42. 6. R. Remmert, Theory of Complex Functions. Springer, New York, 1991. 7. H. Robbins, A remark on Stirling’s formula, Amer. Math. Monthly 62 (1955) 26–29. 8. R. Wong, Asymptotic Approximation of Integrals. Society for Industrial and Applied Mathematics, Philadelphia, PA, 2001. Department of Mathematics, University of Trier, D-54286 Trier, Germany [email protected]

By 1914, the M ONTHLY had outgrown its financial arrangements, and it was Slaught who turned to the American Mathematical Society to adopt the M ONTHLY as an official journal. But American mathematics was growing as fast as the M ONTHLY, and the Society was already plagued by factional disputes between the Eastern establishment (the Ivy league schools) and the Midwest (led by Chicago). Slaught’s request became a controversy. Should an organization dedicated to promoting mathematical research support a journal like the M ONTHLY? Many, especially in the East (led by Osgood), thought is should not, and the AMS voted narrowly to give the M ONTHLY a pat on the back rather than money. A Century of Mathematics: Through the Eyes of the Monthly, Edited by John Ewing. Mathematical Association of America, Washington, DC, 1994, p. 4.

352

c THE MATHEMATICAL ASSOCIATION OF AMERICA [Monthly 121

This content downloaded from 92.82.8.190 on Sun, 30 Mar 2014 17:30:13 PM All use subject to JSTOR Terms and Conditions

Zeta(2) Once Again Author(s): Ralph M. Krause Source: The American Mathematical Monthly, Vol. 121, No. 4 (April), pp. 353-354 Published by: Mathematical Association of America Stable URL: http://www.jstor.org/stable/10.4169/amer.math.monthly.121.04.353 . Accessed: 30/03/2014 17:30 Your use of the JSTOR archive indicates your acceptance of the Terms & Conditions of Use, available at . http://www.jstor.org/page/info/about/policies/terms.jsp

. JSTOR is a not-for-profit service that helps scholars, researchers, and students discover, use, and build upon a wide range of content in a trusted digital archive. We use information technology and tools to increase productivity and facilitate new forms of scholarship. For more information about JSTOR, please contact [email protected].

.

Mathematical Association of America is collaborating with JSTOR to digitize, preserve and extend access to The American Mathematical Monthly.

http://www.jstor.org

This content downloaded from 92.82.8.190 on Sun, 30 Mar 2014 17:30:27 PM All use subject to JSTOR Terms and Conditions

Zeta(2) Once Again Ralph M. Krause

Abstract. This note provides a strikingly efficient evaluation of zeta(2).

An article in the January, 2012, M ONTHLY [1] proved, in a manner that might have apP 2 pealed to Euler, his famous result that ζ (2) = ∞ 1/k = π 2 /6. The argument there 1 suggested the following, which makes the same claim and resembles the sixth in [2]. The Taylor’s series for log(1 + z) converges on the unit circle z = eiθ for z 6 = −1. Thus ∞ X log(1 + z) = (−1)k−1 z k /k,

(1)

1

and log(1 + z −1 ) =

∞ X (−1)k−1 z −k /k.

(2)

1

To be convinced that equation (1) holds on the circle of convergence (z = −1 excepted) and not merely inside it, we argue thus. For z in the first quadrant, |1/(1 + z) − P N −1 P (−z)k | = |z N /(1 + z)| ≤ |z N |. Integrating 1/(1 + z) − 0N −1 (−z)k and |z N | 0 P along a radius from z = 0 to z = eiθ shows that | log(1 + z) − 1N (−1)k−1 z k /k| < 1/(N + 1), establishing (1) on the portion of the unit circle lying in the first quadrant. (This is all we need below, although the preceding argument may be modified easily to use ever larger bounds than 1/(N + 1) and prove convergence, though not uniform convergence, on the entire unit circle with the exception of the point z = −1.) (2) then follows, as the conjugate of (1). Subtracting (2) from (1), still for z = eiθ and z 6 = −1, iθ = log(z) =

∞ ∞ X X (−1)k−1 [z k − z −k ]/k = 2i (−1)k−1 sin(kθ )/k. 1

(3)

1

Nowadays, one might verify that this is a Fourier series and let Parseval’s formula finish the job. Although Euler was perhaps a century too early for Fourier analysis, he would have been willing to integrate the first and last expressions in (3) from 0 to π/2 after dividing by i. Making free use of the familiar observation that the even terms comprise 1/4 of the sum of the series of reciprocal squares, we arrive at ∞

3X 1 π2 = . 8 4 1 k2 http://dx.doi.org/10.4169/amer.math.monthly.121.04.353 MSC: Primary 11M06

April 2014]

NOTES

This content downloaded from 92.82.8.190 on Sun, 30 Mar 2014 17:30:27 PM All use subject to JSTOR Terms and Conditions

353

The brevity here legitimately ignores the fact that the sums in (1)–(3) converge only conditionally. What is essential is that they converge uniformly on the interval of integration [0, π/2], and this they do by the estimate made in paragraph 2 above. Thus Z " # N π/2 X k−1 θ −2 (−1) sin(kθ )/k dθ < (π/2)2/(N + 1). 0 1

The result now follows. REFERENCES 1. D. Kalman, M. McKinzie, Another way to sum a series: Generating functions, Euler and the dilog function, Amer. Math. Monthly 119 (2012) 42–51. 2. R. Chapman, Evaluating ζ (2), available at http://www.uam.es/personal_pdi/ciencias/ cillerue/Curso/zeta2.pdf. 3208 44th Street, N.W., Washington, DC 20016-3527 [email protected]

100 Years Ago in The American Mathematical Monthly Edited by Vadim Ponomarenko In recent years several German professors of mathematics have called public attention to the fact that the number of mathematical students at the various German universities is larger than the probable number of mathematical positions in the German schools. According to the Jahresbericht der Deutschen Mathematiker-Vereinigung, 22 (1913), page 369, the number of mathematical students is much smaller during the current year than it has been during recent years. The number of women students of mathematics is, however, still on the increase in the German universities. All copies of the M ONTHLY for January, 1913, have been exhausted. The demand for sample copies was so great for this particular number that the supply was entirely inadequate. Any one who may know of extra copies not belonging to sets will confer a great favor by informing the MANAGING EDITOR. —Excerpted from “Notes and News,” 21 (1914) 136–138.

354

c THE MATHEMATICAL ASSOCIATION OF AMERICA [Monthly 121

This content downloaded from 92.82.8.190 on Sun, 30 Mar 2014 17:30:27 PM All use subject to JSTOR Terms and Conditions

Polynomials ( x 3 –n )( x 2 + 3) Solvable Modulo Any Integer Author(s): Andrea M. Hyde, Paul D. Lee, Blair K. Spearman Source: The American Mathematical Monthly, Vol. 121, No. 4 (April), pp. 355-358 Published by: Mathematical Association of America Stable URL: http://www.jstor.org/stable/10.4169/amer.math.monthly.121.04.355 . Accessed: 30/03/2014 17:30 Your use of the JSTOR archive indicates your acceptance of the Terms & Conditions of Use, available at . http://www.jstor.org/page/info/about/policies/terms.jsp

. JSTOR is a not-for-profit service that helps scholars, researchers, and students discover, use, and build upon a wide range of content in a trusted digital archive. We use information technology and tools to increase productivity and facilitate new forms of scholarship. For more information about JSTOR, please contact [email protected].

.

Mathematical Association of America is collaborating with JSTOR to digitize, preserve and extend access to The American Mathematical Monthly.

http://www.jstor.org

This content downloaded from 92.82.8.190 on Sun, 30 Mar 2014 17:30:47 PM All use subject to JSTOR Terms and Conditions

Polynomials (x 3 − n)(x 2 + 3) Solvable Modulo Any Integer Andrea M. Hyde, Paul D. Lee, and Blair K. Spearman

Abstract. We give an infinite family of polynomials that are solvable modulo m for every integer m > 1, yet have no roots in the rational numbers. Such polynomials are called intersective. Our classification uses only techniques available in an undergraduate course in number theory.

1. INTRODUCTION. Let f (x) be a monic polynomial with integer coefficients. We are interested in those polynomials f (x) that have no root in the rational numbers Q, but do have a root modulo m for all positive integers m. Polynomials of this type are called intersective (see Sonn [6, 7]). These polynomials provide counterexamples to the local-global principle. Further information on the local-global principle is available in the book by Gouvˆea [3, pp. 75–83.]. It is known that f (x) cannot be irreducible over Q since in this case there exist prime numbers p for which f (x) ≡ 0 (mod p) is insolvable (see Brandl, Bubboloni, and Hupp [2]). Consequently, an intersective polynomial requires at least two irreducible factors over Q. Berend and Bilu prove a theorem that can be used to establish the intersective property for a given polynomial [1]. They provide the example f (x) = (x 3 − 19)(x 2 + x + 1). The verification of the intersective property, that is, of confirming that f (x) ≡ 0 (mod m) is solvable for every m > 1, may proceed by showing that for each prime p and positive integer j, the congruence f (x) ≡ 0 (mod p j ) is solvable. General solvability then follows from the Chinese Remainder Theorem. For a given prime p, one of the factors of f (x) is proven to be solvable mod p j for all positive integers j. In this note, we propose to use techniques from an undergraduate number theory course to investigate polynomials of the form f (x) = (x 3 − n)(x 2 + x + 1), or equivalently f (x) = (x 3 − n)(x 2 + 3), http://dx.doi.org/10.4169/amer.math.monthly.121.04.355 MSC: Primary 11R09

April 2014]

NOTES

This content downloaded from 92.82.8.190 on Sun, 30 Mar 2014 17:30:47 PM All use subject to JSTOR Terms and Conditions

355

classifying those that are intersective. The equivalence of these two families is due to the easily established fact that for a given prime p, the congruence x 2 + x + 1 ≡ 0 (mod p j ) is solvable for all positive integers j if and only if the congruence x 2 + 3 ≡ 0 (mod p j ) is solvable for all positive integers j. Berend and Bilu [1] state that these polynomials have the least possible degree to be intersective. The explanation of this involves Galois theory and algebraic number theory. Avoiding these more advanced theories, we make use of results from an undergraduate number theory course, particularly Hensel’s Lemma and a refined version of Hensel’s Lemma. Our method not only establishes the intersective property for the polynomials we study, but also enables a characterization of them, showing that they form an infinite set. Before we begin, we impose two simple restrictions on the value of n in our polynomials (x 3 − n)(x 2 + 3), the validity of which can be easily established by the reader. We assume that n is a positive integer and that n is cubefree. Our definition of intersective implies that n 6 = 1. Our main theorem is the following. Theorem. Let n be a cubefree positive integer, not equal to 1. Then f (x) = (x 3 − n)(x 2 + 3) is intersective if and only if the prime factors of n are of the form 3k + 1 and n ≡ 1 (mod 9). Before giving the proof of this theorem, we state the theory that we require. All of it is available in basic number theory textbooks such as Niven, Zuckerman, and Montgomery [4]. We begin with Hensel’s Lemma. Hensel’s Lemma (see [4, p. 87]). Suppose that f (x) is a polynomial with integral coefficients. If f (a) ≡ 0 (mod p j ) and f 0 (a) 6 ≡ 0 (mod p), then there is a unique t (mod p) such that f (a + t p j ) ≡ 0 (mod p j+1 ). If the condition f 0 (a) 6 ≡ 0 (mod p) holds, then the root a is called nonsingular. By repeated application of Hensel’s Lemma, a nonsingular root a of f (x) ≡ 0 (mod p) may be lifted to a root modulo p j , for j = 2, 3, . . . . We also require a refined version of Hensel’s Lemma which, in the case of a singular root, enables us to lift our solutions modulo arbitrarily high prime powers. Refined Hensel’s Lemma (see [4, p. 89]). Let f (x) be a polynomial with integral coefficients. Suppose that f (a) ≡ 0 (mod p j ), that p τ k f 0 (a), and that j ≥ 2τ + 1. If b ≡ a mod p j−τ , then f (b) ≡ f (a) (mod p j ) and p τ k f 0 (b). Moreover, there is a unique t (mod p) such that f (a + t p j−τ ) ≡ 0 (mod p j+1 ). As noted in [4, p. 89], since the hypotheses of the theorem apply with a replaced by a + t p j−τ and (mod p j ) replaced by (mod p j+1 ) but with τ unchanged, the lifting may be repeated and continues indefinitely. This means that if the polynomial congruence is solvable to a sufficiently high power of p (as defined in the Lemma), 356

c THE MATHEMATICAL ASSOCIATION OF AMERICA [Monthly 121

This content downloaded from 92.82.8.190 on Sun, 30 Mar 2014 17:30:47 PM All use subject to JSTOR Terms and Conditions

then it can be solved to all powers of p. We require one more lemma involving power congruences. Lemma (see [4, p. 101). If p is a prime and gcd(a, p) = 1, then the congruence x t ≡ a (mod p) has d = gcd(t, p − 1) solutions if a ( p−1)/d ≡ 1 (mod p), and no solutions otherwise. We are now ready to prove our theorem. Proof. Assume that the prime factors of n are of the form 3k + 1 and n ≡ 1 (mod 9). Let p be a prime. Case 1. Suppose that p ≡ 1 (mod 3). For these primes, −3 is a quadratic residue mod p (see [5, p. 440, Exercise 3]) so that the congruence x 2 + 3 ≡ 0 (mod p) is solvable for some integer u, which is clearly not divisible by p. Since the derivative of x 2 + 3 evaluated at u equals 2u, which is again not divisible by p, we may apply Lemma 1 to conclude that x 2 + 3 ≡ 0 (mod p j ) is solvable for all positive integers j. Case 2. Suppose next that p ≡ 2 (mod 3). Clearly, p - n, since n contains only prime factors of the form 3k + 1. The factor x 2 + 3 is insolvable mod p, since −3 is a quadratic nonresidue (mod p). Applying Lemma 3 to the factor x 3 − n, with t = 3, a = n, and noting that d = (3, p − 1) = 1, we see that x 3 − n ≡ 0 (mod p) is solvable if n p−1 ≡ 1 (mod p), which is true by Fermat’s Little Theorem. Thus, x 3 − n ≡ 0 (mod p) is solvable for some integer u. Since p - n, we see that p - u. We may apply Lemma 1 by observing that p - 3u 2 , the derivative of x 3 − n evaluated at u, and conclude that x 3 − n ≡ 0 (mod p j ) is solvable for all positive integers j. Case 3. Finally, suppose that p = 3. The factor x 2 + 3 has one solution mod 3 (namely x = 0), but has no solutions mod 32 . Therefore, we consider the factor x 3 − n. Since n ≡ 1 (mod 9), we have n ≡ 1, 10, 19 (mod 33 ) so that x 3 ≡ n (mod 33 ) is solvable. For solutions, we may choose x ≡ 1, 4, 7 (mod 33 ), respectively. At the same time, the derivative of x 3 − n evaluated at 1, 4, and 7 is exactly divisible by 3. Applying Lemma 2 with j = 3, τ = 1, and recalling the remark after Lemma 2, we conclude that x 3 − n ≡ 0 (mod 3 j ) is solvable for all positive integers j. Hence, by the Chinese Remainder Theorem, f (x) is solvable for all positive integers m. This establishes the intersective property. Conversely, suppose that (x 3 − n)(x 2 + 3) is intersective with n cubefree and not equal to 1. Let p be a prime. Case 1. First, suppose that p ≡ 2 (mod 3). For such a prime, x 2 + 3 ≡ 0 (mod p) is insolvable, since −3 is a quadratic nonresidue mod p as noted earlier. Therefore, we must have x 3 − n ≡ 0 (mod p j ) solvable for all positive integers j. It is easy to show that if p divides n, then p must also divide any solution x. We then see that the congruence x 3 ≡ n (mod p 3 ) is only solvable if n is divisible by p 3 , but this is a contradiction since n is cube free. Hence, p - n and n has no prime factors of the form 3k + 2. Case 2. Suppose now that p = 3. Since x 2 + 3 ≡ 0 (mod 9) is insolvable, we must have x 3 − n ≡ 0 (mod 3 j ) solvable for all positive integers j. By the same arguments as in Case 1, we require 3 - n, for otherwise x 3 − n ≡ 0 (mod 33 ) is insolvable as n is April 2014]

NOTES

This content downloaded from 92.82.8.190 on Sun, 30 Mar 2014 17:30:47 PM All use subject to JSTOR Terms and Conditions

357

cubefree. Therefore, 3 is not a prime factor of n, and thus the prime factors of n must have the form 3k + 1. To complete the classification part of our proof, we study the solvability of x 3 − n ≡ 0 (mod 32 ). The nonzero cubes modulo 9 are congruent to 1 and 8. Combining this with the previously established fact that the prime factors of the positive integer n are of the form 3k + 1, we deduce that n ≡ 1 (mod 9), as required. We close by giving some examples of intersective polynomials obtained from our theorem. Factorization of n

Intersective polynomial

37

(x 3 − 37)(x 2 + 3)

7 · 13

(x 3 − 91)(x 2 + 3)

163

(x 3 − 163)(x 2 + 3)

19 · 37

(x 3 − 703)(x 2 + 3)

72 · 61

(x 3 − 2989)(x 2 + 3)

REFERENCES 1. D. Behrend, Y. Bilu, Polynomials with roots modulo every integer, Proc. Amer. Math. Soc. 124 (1996) 1663–1671. 2. R. Brandl, D. Bubboloni, I. Hupp, Polynomials with roots mod p for all primes p, J. Group Theory 4 (2001) 233–239. 3. F. Q. Gouvˆea, P-adic Numbers, An Introduction. Springer Verlag, New York, 1993. 4. I. Niven, H. S. Zuckerman, H. L. Montgomery, An Introduction to the Theory of Numbers. Fifth edition. John Wiley and Sons, New York, 1995. 5. K. H. Rosen, Elementary Number Theory and its Applications. Sixth edition. Addison-Wesley, Reading, MA, 2011. 6. J. Sonn, Polynomials with roots in Q p for all p, Proc. Amer. Math. Soc. 136 (2008) 1955–1960. 7. J. Sonn, Two remarks on the inverse galois problem for intersective polynomials, J. Theor. Nombres Bordeaux, 21 (2009) 437–439. Department of Mathematics and Statistics, University of British Columbia’s Okanagan campus, Kelowna, BC, Canada, V1V 1V7 [email protected] Department of Mathematics and Statistics, University of British Columbia’s Okanagan campus, Kelowna, BC, Canada, V1V 1V7 [email protected] Department of Mathematics and Statistics, University of British Columbia’s Okanagan campus, Kelowna, BC, Canada, V1V 1V7 [email protected]

358

c THE MATHEMATICAL ASSOCIATION OF AMERICA [Monthly 121

This content downloaded from 92.82.8.190 on Sun, 30 Mar 2014 17:30:47 PM All use subject to JSTOR Terms and Conditions

Macaulay Expansion Author(s): B. Sury Source: The American Mathematical Monthly, Vol. 121, No. 4 (April), pp. 359-360 Published by: Mathematical Association of America Stable URL: http://www.jstor.org/stable/10.4169/amer.math.monthly.121.04.359 . Accessed: 30/03/2014 17:30 Your use of the JSTOR archive indicates your acceptance of the Terms & Conditions of Use, available at . http://www.jstor.org/page/info/about/policies/terms.jsp

. JSTOR is a not-for-profit service that helps scholars, researchers, and students discover, use, and build upon a wide range of content in a trusted digital archive. We use information technology and tools to increase productivity and facilitate new forms of scholarship. For more information about JSTOR, please contact [email protected].

.

Mathematical Association of America is collaborating with JSTOR to digitize, preserve and extend access to The American Mathematical Monthly.

http://www.jstor.org

This content downloaded from 92.82.8.190 on Sun, 30 Mar 2014 17:30:58 PM All use subject to JSTOR Terms and Conditions

Macaulay Expansion B. Sury

Abstract. Given natural numbers n and r , the “greedy” algorithm enables an us to−1 obtain expansion of the integer n as a sum of binomial coefficients in the form arr + arr−1 + ··· + a1 . We give an alternate interpretation of this expansion, which also proves its uniqueness in 1 an interesting manner.

The 1996 Iranian mathematical olympiad competition contained the following problem. For natural numbers n and r , there is a unique expansion ar −1 a1 ar + + ··· + n= r −1 1 r with each ai an integer and ar > ar −1 > · · · > a1 ≥ 0. The existence is fairly easy to prove using the “greedy” algorithm. This expansion is sometimes known as the Macaulay expansion. However, the following alternate interpretation does not seem to be well known; it gives uniqueness in an interesting manner. Inwhat follows, the following well-known convention is used: the binomial coefficient n is equated to 0 if n < r . r For each natural number r , denote by Sr the set of all r -digit numbers in some base b whose digits are in strictly decreasing order of size. Evidently, Sr is nonempty if and only if b ≥ r ; in this case, Sr has br elements. Let us now write the elements of Sr in increasing order. For instance, in base 10, the first few of the 120 members of S3 are: (2, 1, 0), (3, 1, 0), (3, 2, 0), (3, 2, 1), (4, 1, 0), (4, 2, 0), (4, 2, 1), (4, 3, 0), . . . . We will prove the following. Theorem. Given any positive integer n, and any base b such that br >n, the (n + 1) −1 th member of Sr is (ar , . . . , a2 , a1 ), where n = arr + arr−1 + · · · + a11 . In particular, −1 for each n, the Diophantine equation arr + arr−1 + · · · + a11 = n has a unique solution in positive integers ar > ar −1 > · · · > a1 ≥ 0. Here are a couple of examples to illustrate the theorem. (i) Let r = 3 and n = 12. We may take any base b so that b3 > 12. For example, b = 6 is allowed because 63 = 20. Among the 20 members in S3 , the 13th member is (5, 2, 1). Note that 5 2 1 + = 12. + 2 1 3 http://dx.doi.org/10.4169/amer.math.monthly.121.04.359 MSC: Primary 05A10

April 2014]

NOTES

This content downloaded from 92.82.8.190 on Sun, 30 Mar 2014 17:30:58 PM All use subject to JSTOR Terms and Conditions

359

(ii) Let r = 3, n = 74. We may take b = 10 as 103 = 120. The 75th member of S3 is (8, 6, 3). Note that 8 6 3 + + = 74. 3 2 2 Proof of theorem. First of all, we notice that the number of members in Sr that have first digit < m equals mr ; this is because we are choosing r numbers from {0, 1, . . . , m − 1} and arranging them in decreasing order. Now, suppose the (n + 1)th member of Sr is (ar , ar −1 , . . . , a1 ). The number of members of Sr with first digit < ar is arr . The number of members of Sr , whose first digit is ar and which occur before the above member, is the number of members of Sr −1 occurring prior to (ar −1 , . . . , a1 ). Inductively, it is clear that this equals

ar −1 a2 a1 + ··· + + . r −1 2 1

Therefore, the number of members of Sr occurring prior to the (n + 1)th member above (which must be n) is ar −1 a1 ar + + ··· + . r −1 1 r This proves our result. Remark. We may proceed in a slightly different direction, if we do not use the first observation in the proof. For any k, we can obtain by induction that the number of a elements in Sk starting with some a is k−1 . Indeed, to prove this by induction, we use the identity X n−1 n m = , r r −1 m=1 which is itself seen by induction on n. ACKNOWLEDGMENTS. We are indebted to the referee for a number of constructive suggestions. In particular, she/he drew attention to a simple way to count something for which we gave a roundabout argument as remarked above. The referee’s suggestions to add some illuminating examples and to make the uniqueness argument transparent are well appreciated.

Stat-Math Unit, Indian Statistical Institute, 8th Mile Mysore Road, Bangalore 560059, India [email protected]

360

c THE MATHEMATICAL ASSOCIATION OF AMERICA [Monthly 121

This content downloaded from 92.82.8.190 on Sun, 30 Mar 2014 17:30:58 PM All use subject to JSTOR Terms and Conditions

Evaluating Lebesgue Integrals Efficiently with the FTC Author(s): J. J. Koliha Source: The American Mathematical Monthly, Vol. 121, No. 4 (April), pp. 361-364 Published by: Mathematical Association of America Stable URL: http://www.jstor.org/stable/10.4169/amer.math.monthly.121.04.361 . Accessed: 30/03/2014 17:31 Your use of the JSTOR archive indicates your acceptance of the Terms & Conditions of Use, available at . http://www.jstor.org/page/info/about/policies/terms.jsp

. JSTOR is a not-for-profit service that helps scholars, researchers, and students discover, use, and build upon a wide range of content in a trusted digital archive. We use information technology and tools to increase productivity and facilitate new forms of scholarship. For more information about JSTOR, please contact [email protected].

.

Mathematical Association of America is collaborating with JSTOR to digitize, preserve and extend access to The American Mathematical Monthly.

http://www.jstor.org

This content downloaded from 92.82.8.190 on Sun, 30 Mar 2014 17:31:16 PM All use subject to JSTOR Terms and Conditions

Evaluating Lebesgue Integrals Efficiently with the FTC J. J. Koliha

Abstract. This note addresses evaluation of Lebesgue integrals on the real line using the Fundamental Theorem of Calculus, without having to verify that the primitive is absolutely continuous.

The Fundamental Theorem of Calculus (FTC) provides an efficient method for the evaluation of Lebesgue integrals on real intervals, but only if we can find an absolutely continuous primitive (antiderivative) to the integrand. However, checking absolute continuity can be quite difficult. In this note, we give examples of evaluation of integrals that require only continuity of the primitive. Here is a version of Lebesgue’s FTC extended to a possibly unbounded interval. Lebesgue’s FTC. Let ∞ ≤ a < b ≤ ∞. Let F : (a, b) → C be absolutely continuous on (a, b) and let F 0 = f almost everywhere on (a, b), where f : (a, b) → C is Lebesgue integrable on (a, b). If the one-sided limits F(a+) and F(b−) exist, then Z

b

f (t) dt = F(b−) − F(a+).

a

It may seem that with the absolute continuity of F, the hypothesis that f is Lebesgue integrable is redundant. Alas, no: The notorious function t

Z F(t) := Si(t) = 0

sin x d x, x

t > 0,

shows the error of our ways [2, Example 14.17]. The absolute continuity of F on (0, ∞) follows from the Mean Value Theorem; F(0+) = 0 is clear and F(∞−) = π/2 is well known. Yet the derivative F 0 (x) = f (x) = (sin x)/x is not Lebesgue integrable as Z t sin x d x = ∞. lim t→∞ 0 x It is well known that on a compact interval, the integrability of f is indeed redundant (see, for instance, [2, Theorem 14.7]). The problem with application of Lebesgue’s FTC can be seen in this situation. Suppose we know that F 0 (x) = f (x) everywhere in [a, b] and that f is Lebesgue integrable on [a, b]. Then we have a paradoxical situation of not being able to use Lebesgue’s R x FTC, since we do not know whether F is absolutely continuous. If we write G(x) = a f (t) dt, we know that G is absolutely continuous, and (F − G)0 (x) = 0 almost everywhere. However, we cannot conclude that F − G is constant. http://dx.doi.org/10.4169/amer.math.monthly.121.04.361 MSC: Primary 26A42

April 2014]

NOTES

This content downloaded from 92.82.8.190 on Sun, 30 Mar 2014 17:31:16 PM All use subject to JSTOR Terms and Conditions

361

In order to overcome this problem, we need to look at a different type of FTC, one which is usually proved by methods outside the theory of Lebesgue integration. A proof that stays strictly within the realm of the Lebesgue theory was given by the author in this M ONTHLY [1]. We recall three versions of this theorem, whose proofs can be found in [1] and [2, Chapter 14]. Theorem 1 (see [1]). Let ∞ ≤ a < b ≤ ∞. Let F : (a, b) → C be such that F 0 (x) = f (x) for all x ∈ (a, b), where f : (a, b) → C is Lebesgue integrable on (a, b). If the one-sided limits F(a+) and F(b−) exist, then b

Z

f (t) dt = F(b−) − F(a+).

a

Even if we tighten the hypotheses to assume that F has a derivative on a compact interval [a, b] (with one-sided derivatives at the end points), the integrability of f cannot be dropped due to a possible blowout of the positive and negative oscillation of f . To see this, define 1 if t 6 = 0 and F(0) = 0, F(t) = t cos t2 2

2

and f (t) = F 0 (t) if t 6 = 0 and f (0) = 0. But f is not Lebesgue integrable on [0, 1], since [2, Example 14.15] for details.)

R1 ε

| f (t)| dt → ∞ as ε → 0+. (See

Theorem 2 (see [1]). Let ∞ ≤ a < b ≤ ∞. Let F : (a, b) → C be continuous on (a, b) and let F 0 (x) = f (x) nearly everywhere on (a, b), where f : (a, b) → C is Lebesgue integrable on (a, b). If the one-sided limits F(a+) and F(b−) exist, then b

Z

f (t) dt = F(b−) − F(a+).

a

The expression nearly everywhere means ‘everywhere except for a countable set’. If F is continuous on (a, b), F 0 = f nearly everywhere on (a, b), and the one-sided limits F(b−) and F(a+) exist, then we say that f is Newton integrable on (a, b), and define its Newton integral by (N )

Z

b

f (t) dt := F(b−) − F(a+).

a

R1 Theorem 1 enables us to calculate the integral 0 t −1/2 dt by observing that F(t) = 2t 1/2 is a primitive for the integrand f (t) = t −1/2 everywhere in (0, 1), that F(0+) = 0 and F(1−) = 2, but we have to know that f is Lebesgue integrable on (0, 1). For this we can use, for instance, the Monotone Convergence Theorem applied to the truncations f n = min( f, n) of f . But this does not seem to be the most efficient way to do it—we would like to conclude the integrability of f directly from the existence of the Newton integral. For this we need to consider absolute Newton integrability. We say 362

c THE MATHEMATICAL ASSOCIATION OF AMERICA [Monthly 121

This content downloaded from 92.82.8.190 on Sun, 30 Mar 2014 17:31:16 PM All use subject to JSTOR Terms and Conditions

that a function f : (a, b) → C is absolutely Newton integrable if the Newton integral exists for both f and | f | (where | f | is real valued and nonnegative). Here is the desired theorem. Theorem 3 (see [1]). Let ∞ ≤ a < b ≤ ∞. Let f : (a, b) → C be absolutely Newton integrable on (a, b). Then f is Lebesgue integrable on (a, b), and Z

b

f (t) dt = (N )

a

Z

b

f (t) dt.

a

The readers can hone their skills by evaluating the following integrals using Theorems 1, 2, or 3. Example 1. Evaluate Lebesgue integrals efficiently: (i)

Z

1

1 t 3/4

0

+ i log t dt,

(ii)

Z

2

1

t − 3i dt, and (iii) t + 2i

∞

Z 0

dt . (2t + i)3

So far, the substantial power hidden in Theorems 2 and 3 has not been fully utilized, namely the fact that the derivative of the continuous function F may exist only nearly everywhere. We illustrate this in the following examples. Example 2. Let f : (0, 1) → C be the function defined by f (t) = 0 if t is rational, and f (t) = log t + it −4/5 otherwise. Let ϕ be the characteristic function of (0, 1) \ Q. Then F1 (t) = t log t − t and F2 (t) = 5t 1/5 are generalized primitives to f 1 (t) = ϕ(t) log t and f 2 (t) = ϕ(t) t −4/5 on (0, 1), respectively. Further, F1 (1) = −1, F1 (0+) = 0, F2 (1) = 5, and F2 (0) = 0. Both f 1 and f 2 are absolutely Newton integrable as they do not change sign on (0, 1). By Theorem 3, f is Lebesgue inteR1 R1 R1 grable with 0 f = 0 f 1 + i 0 f 2 = −1 + i5. (Note that by splitting the real and imaginary parts of f , we avoided the need for finding a generalized primitive for | f (t)| = ϕ(t)(log2 t + t −8/5 )1/2 . This is not always the most efficient maneuver—see Example 1 (iii).) Example 3. Let f be defined on the interval (0, 1) by f (x) = √

1 (n + 2){(n + 1)x − n}

if

xn < x ≤ xn+1 ,

n = 0, 1, 2, . . .

where xn = n/(n + 1), n = 0, 1, 2, . . . . First sketch a graph of f ; it reveals infinitely many vertical asymptotes at the points xn , n = 0, 1, 2, . . . , neatly clustering near x = 1. On each interval (xn , xn+1 ), a primitive to f is F(x) =

p 2 (n + 1)x − n + cn , √ (n + 1) n + 2

x ∈ (xn , xn+1 ).

The constants of integration cn must be chosen wisely to make F continuous on (0, 1). From F(xn −) = F(xn +), we obtain cn = 2/[(n(n + 1)] + cn−1 . Choosing c0 = 0, we get cn = 2

n X k=1

April 2014]

n X 1 1 1 1 2n =2 − =2 1− = , k(k + 1) k k+1 n+1 n+1 k=1 NOTES

This content downloaded from 92.82.8.190 on Sun, 30 Mar 2014 17:31:16 PM All use subject to JSTOR Terms and Conditions

363

n = 0, 1, 2, . . . Setting F(xn ) = F(xn −) = F(xn +) for n = 1, 2, 3, . . . , we make F continuous on (0, 1), but the derivatives F 0 (xn ) fail to exist for n = 1, 2, 3, . . . . As the integrand is nonnegative, its Newton integrability implies absolute Newton integrability. Clearly, F(0+) = 0. Further, F is increasing on (0, 1) being continuous there and having a positive derivative nearly everywhere in (0, 1) (see [2, Theorem B25]). Also, F is bounded on (0, 1) as on each interval (xn , xn+1 ] we have F(xn+1 ) = 2xn+1 ≤ 2. Hence, the limit F(1−) exists and is equal to R1 limn→∞ F(xn+1 ) = 2. Thus, 0 f (x) d x = F(1−) − F(0+) = 2. We note that f is not improperly Riemann integrable. Example 4. A striking example of a Lebesgue integrable function that is not improperly Riemann integrable and that has a vertical asymptote at each rational point of the interval [0, 1] is given by Richardson in [3, Example 5.44]: f (x) =

∞ X

2−k |x − qk |−1/2 ,

k=1

where (qk ) is a sequence containing all rational numbers in [0, 1]. Write f k (x) = 2−k |x − qk |−1/2 for x ∈ [0, 1] \ {qk }, k = 1, 2, . . . . Then f k is absolutely Newton integrable with a generalized primitive Fk (x) = 2−k+1 sgn(x − qk )|x − qk |1/2 in [0, 1], R1 1/2 and the integral (N ) 0 f k = Fk (1−) − Fk (0+) = 2−k+1 ((1 − qk )1/2 + qk ). By Theorem 3, this is also Lebesgue integral of f k . We have α :=

∞ Z X k=1

0

1

| f k (t)| dt =

∞ X

1/2

2−k+1 ((1 − qk )1/2 + qk ) < ∞.

k=1

P By the term-by-term integration of series [2, Theorem 13.35], f = k f k converges R1 almost everywhere in [0, 1], is Lebesgue integrable, and 0 f (t) dt = α. ACKNOWLEDGMENT. I would like to thank the referees for their comments, which led to improved presentation of this note.

REFERENCES 1. J. J. Koliha, A fundamental theorem of calculus for Lebesgue integration, Amer. Math. Monthly 113 (2006) 551–555. 2. , Metrics, Norms and Integrals: An Introduction to Contemporary Analysis. World Scientific Publishing, Singapore, 2008. 3. L. F. Richardson, Measure and Integration: A Concise Introduction to Real Analysis. John Wiley, New York, 2009. The University of Melbourne, Melbourne VIC 3010, Australia [email protected]

364

c THE MATHEMATICAL ASSOCIATION OF AMERICA [Monthly 121

This content downloaded from 92.82.8.190 on Sun, 30 Mar 2014 17:31:16 PM All use subject to JSTOR Terms and Conditions

Problems and Solutions Source: The American Mathematical Monthly, Vol. 121, No. 4 (April), pp. 365-372 Published by: Mathematical Association of America Stable URL: http://www.jstor.org/stable/10.4169/amer.math.monthly.121.04.365 . Accessed: 30/03/2014 17:31 Your use of the JSTOR archive indicates your acceptance of the Terms & Conditions of Use, available at . http://www.jstor.org/page/info/about/policies/terms.jsp

. JSTOR is a not-for-profit service that helps scholars, researchers, and students discover, use, and build upon a wide range of content in a trusted digital archive. We use information technology and tools to increase productivity and facilitate new forms of scholarship. For more information about JSTOR, please contact [email protected].

.

Mathematical Association of America is collaborating with JSTOR to digitize, preserve and extend access to The American Mathematical Monthly.

http://www.jstor.org

This content downloaded from 92.82.8.190 on Sun, 30 Mar 2014 17:31:09 PM All use subject to JSTOR Terms and Conditions

PROBLEMS AND SOLUTIONS Edited by Gerald A. Edgar, Doug Hensley, Douglas B. West with the collaboration of Itshak Borosh, Paul Bracken, Ezra A. Brown, Randall Dougherty, Tam´as Erd´elyi, Zachary Franco, Christian Friesen, Ira M. Gessel, L´aszl´o Lipt´ak, Frederick W. Luttmann, Vania Mascioni, Frank B. Miles, Richard Pfiefer, Dave Renfro, Cecil C. Rousseau, Leonard Smiley, Kenneth Stolarsky, Richard Stong, Walter Stromquist, Daniel Ullman, Charles Vanden Eynden, Sam Vandervelde, and Fuzhen Zhang. Proposed problems and solutions should be sent in duplicate to the MONTHLY problems address on the back of the title page. Proposed problems should never be under submission concurrently to more than one journal. Submitted solutions should arrive before August 31, 2014. Additional information, such as generalizations and references, is welcome. The problem number and the solver’s name and address should appear on each solution. An asterisk (*) after the number of a problem or a part of a problem indicates that no solution is currently available.

PROBLEMS 11768. Proposed by Ovidiu Furdui, Technical University of Cluj-Napoca, ClujNapoca, Romania. Let f be a bounded continuous function mapping [0, ∞) to itself. Find sZ sZ ! ∞ ∞ n n lim n f n+1 (x)e−x d x − f n (x)e−x d x . n→∞

0

0

11769. Proposed by P´al P´eter D´alyay, Szeged, Hungary. Let a1 , . . . , an and b1 , . . . , bn be positive real numbers. Show that 2  1/2  n n n n X X X X a j ak a j ak al am  aj   −2 . ≤ 2 2 b (b j + b j ) (b j + bk ) l,m=1 (bl + bm )3 j,k=1 j,k=1 j=1 j 11770. Proposed by Spiros P. Andriopoulos, Third High School of Amaliada, Eleia, Greece. Prove, for real numbers a, b, x, y with a > b > 1 and x > y > 1, that ax − by a + b (x+y)/2 a+b > log . x−y 2 2 11771. Proposed by D. M. B˘atinet¸u-Giurgiu, “Matei Basarab” National College, Bucharest, Romania, and Neculai Stanciu, “George Emil Palade” School, Buz˘au, Qb(n−1)/2c Romania. Let n!! = i=0 (n − 2i). Find √ p π n+1 (n + 1)! n lim (2n − 1)!! tan −1 . √ n→∞ 4 n n! http://dx.doi.org/10.4169/amer.math.monthly.121.04.365

April 2014]

PROBLEMS AND SOLUTIONS

This content downloaded from 92.82.8.190 on Sun, 30 Mar 2014 17:31:09 PM All use subject to JSTOR Terms and Conditions

365

11772. Proposed by Mircea Merca, University of Craiova, Craiova, Romania. Let n be a positive integer. Prove that the number of integer partitions of 2n + 1 that do not contain 1 as a part is less than or equal to the number of integer partitions of 2n that contain at least one odd part. 11773. Proposed by Moubinool Omarjee, P Lyc´ee Henri IV, Paris, France. Given a positive real number a0 , let an+1 = exp − nk=0 ak for n ≥ 0. For which values of b does P∞ b n=0 (an ) converge? 11774. Proposed by Yunus Tunc¸bilek, Ataturk High School of Science, Istanbul, Turkey and Danny Lee, Herkimer Senior High School, NY, NY. Let ω be the circumscribed circle of triangle ABC. The A-mixtilinear incircle of ABC and ω is the circle that is internally tangent to ω, AB, and AC, and similarly for B and C. Let A0 , PB , and PC be the points on ω, AB, and AC, respectively, at which the A-mixtilinear incircle touches. Define B 0 and C 0 in the same manner that A0 was defined. (See figure.) B0

A

C0

PC O PB

OA C

B

A0

Prove that triangles C 0 PB B and C PC B 0 are similar.

SOLUTIONS The Lenstra Constant of a Ring 11628 [2012, 162]. Proposed by Jeffrey C. Lagarias and Michael E. Zieve, University of Michigan, Ann Arbor, MI. Define the Lenstra constant L(R) of a commutative ring R to be the size of the largest subset A of R such that a − b is a unit (invertible element) in R for any distinct elements a, b ∈ A. Show that for each positive integer N , the Lenstra constant of the ring Z(1/N ) is the least prime that does not divide N . Solution by Mark D. Meyerson, United States Naval Academy, Annapolis, MD. The e elements of Z(1/N ) are the numbers of the form k/N r with k, r ∈ Z. Let p11 · · · pmem be the prime factorization of N ; each ei is a positive integer. The units in Z(1/N ) are d numbers of the form ± p1 1 · · · prdr with each di ∈ Z. Let p be the least prime that does not divide N . The set {1, . . . , p} has the property that any difference of two distinct elements is a unit, since any prime factor of such a difference is a prime factor of N . Hence, L(Z(1/N )) ≥ p. 366

c THE MATHEMATICAL ASSOCIATION OF AMERICA [Monthly 121

This content downloaded from 92.82.8.190 on Sun, 30 Mar 2014 17:31:09 PM All use subject to JSTOR Terms and Conditions

Now, let L be a subset of Z(1/N ) such that any nonzero difference is a unit, and suppose that |L| > p. By deleting extra elements, we may assume |L| = p + 1. If we multiply the p + 1 elements of L by a sufficiently high power of N to make all the elements integers, the nonzero differences will still be units. However, by the pigeonhole principle, two of the p + 1 elements are congruent mod p. Their difference is a multiple of p and hence is not a unit. It follows that L(Z(1/N )) ≤ p. The two inequalities prove that p is the Lenstra constant of this ring. Also solved by P. Budney, N. Caro (Brazil), R. Chapman (U. K.), W. Chengyuan (Singapore), P. P. D´alyay (Hungary), S. Dey (India), D. Fleischman, O. Geupel (Germany), Y. J. Ionin, B. Karaivanov, J. H. Lindsey II, O. Lossers (Netherlands), A. Magidin, G. Martin (Canada), M. A. Prasad (India), F. Richman, J. Riegsecker, K. Schilling, J. H. Smith, J. H. Steelman, R. Stong, M. Tetiva (Romania), Colgate University Problem Solving Group, NSA Problems Group, TCDmath Problems Group (Ireland), Texas State University Problem Solving Group, University of Louisiana at Lafayette Math Club, and the proposers.

Rotatable Quasigroups 11631 [2012, 247–248]. Proposed by P´al P´eter D´alyay, Szeged, Hungary. A quasigroup (Q, ∗) is a set Q together with a binary operation ∗ such that for each a, b ∈ Q there exist unique x and unique y (which may be equal) such that ax = b and ya = b. The Cayley table of a finite quasigroup is its ‘times table’. A quasigroup has property P if each row of the table is a rotation of the first row. Find all positive integers n for which there exists a quasigroup ({1, . . . , n}, ∗) with property P in which all elements are idempotent. (For instance, the Cayley table below defines a binary operation on {1, . . . , 5} with property P in which each element is idempotent.) * 1 2 3 4 5

12345 15432 32154 54321 21543 43215

Solution by Fred Richman, Florida Atlantic University, Boca Raton, FL. Such quasigroups exist if and only if n is odd. Cayley tables are just Latin squares; idempotence requires diagonal 1, . . . , n in order. The table is then determined by its first row and property P. The problem is thus to find a permutation of 1, . . . , n as the first row so that the entries in the first column are distinct, since property P then completes a Latin square for the table. We calculate the first entry in row 1 ∗ k. This row is a rotation of row 1, and it must have 1 ∗ k in column 1 ∗ k. Also row 1 has 1 ∗ k in column k, so row 1 is rotated leftward by k − (1 ∗ k) positions to become row 1 ∗ k. Thus, the first entry in row 1 ∗ k is 1 ∗ [k − (1 ∗ k) + 1]. For these values to be distinct, the values k − (1 ∗ k) must be distinct modulo n. When n is odd, 2 is invertible (modulo n). Setting 1 ∗ k ≡ 2 − k as in the proposer’s example yields k − (1 ∗ k) ≡ 2(k − 1), and these elements are distinct (modulo Pn n). When n is even, the values k − (1P ∗ k) cannot be distinct (modulo n) because i=1 i= (n + 1)n/2 ≡ n/2 (mod n) and nk=1 (k − (1 ∗ k)) = 0. Editorial comment. When n is odd, one can require even more: There are many idempotent commutative quasigroups on Zn , such as by putting (i + j)/2 in position (i, j), using the uniqueness of the multiplicative inverse of 2. This construction for April 2014]

PROBLEMS AND SOLUTIONS

This content downloaded from 92.82.8.190 on Sun, 30 Mar 2014 17:31:09 PM All use subject to JSTOR Terms and Conditions

367

n = 2k + 1 is used in the Bose construction of a Steiner triple system on 6k + 3 elements (R. C. Bose, On the construction of balanced incomplete block designs, Ann. Eugenics 9 (1939), 353–399). Also solved by D. Beckwith, R. Chapman (U. K.), S. M. Gagola Jr., O. Geupel (Germany), A. Habil (Syria), E. A. Herman, Y. J. Ionin, B. Karaivanov, J. H. Lindsey II, J. M. Lockhart, O. P. Lossers (Netherlands), C. R. Pranesachar (India), R. E. Prather, J. H. Steelman, R. Stong, J. Wojdylo, Colgate University Problem Solving Group, GCHQ Problem Solving Group (U. K.), TCDmath Problem Group (Ireland), and the proposer.

A Harmonic Identity 11633 [2012, 248]. Proposed P by Anthony Sofo, Victoria University, Melbourne, Australia. For real a, let Hn(a) = nj=1 j −a . Show that for integers a, b, and n with a ≥ 1, b ≥ 0, and n ≥ 1, n (a) X k(Hk2 + Hk(2) ) + 2(k + b)a Hk(1) Hk+b−1

(a) = Hn+b (Hn2 + Hn(2) ).

k(k + b)a

k=1

Solution by Subhadip Dey, Bangalore City, Karnataka, India. As in the problem, we use the notation Hn = Hn(1) and H0(a) = 0. Using the identities Hn2 + Hn(2) = 2

j n X n X X 1 Hj =2 i j j j=1 i=1 j=1

j n X X

and

ai b j =

n X n X

ai b j ,

i=1 j=i

j=1 i=1

the first term on the left side of the identity becomes n X n n n k X X X Hj X 1 Hk2 + Hk(2) Hj =2 =2 . a a (k + b) j (k + b) j (k + b)a k= j k=1 j=1 j=1 k=1

Therefore, we compute n n n n n (a) X X X X Hk Hk+b−1 1 Hk2 + Hk(2) Hj X H j (a) + 2 = 2 + 2 H a a (k + b) k j k= j (k + b) j j+b−1 k=1 k=1 j=1 j=1   n n n X X H j X 1 H j (a) (a) (a)  =2 + H = 2 Hn+b = Hn+b (Hn2 + Hn(2) ). j+b−1 a j (k + b) j k= j j=1 j=1

Editorial comment. Several solvers noted that the identity is valid for all real a. E. A. Herman generalized it to n p ( p) (a) X k(Hk + Hk ) + Z k, p (k + b)a Hk Hk+b−1 k=1

k(k + b)a

where p is a positive even integer and Z k, p =

P p−1 j=1

p j

(a) = Hn+b (Hnp + Hn( p) ),

j−1

Hk

− k1

p−1− j

.

Also solved by P. Bracken, R. Chapman (U. K.), P. P. D´alyay (Hungary), E. S. Eyeson, O. Geupel (Germany), E. A. Herman, B. Karaivanov, O. Kouba (Syria), O. P. Lossers (The Netherlands), M. Omarjee (France), C. R. Pranesachar (India), M. A. Prasad (India), J. H. Steelman, R. Stong, R. Tauraso (Italy), GCHQ Problem Solving Group (U. K.), and the proposer.

368

c THE MATHEMATICAL ASSOCIATION OF AMERICA [Monthly 121

This content downloaded from 92.82.8.190 on Sun, 30 Mar 2014 17:31:09 PM All use subject to JSTOR Terms and Conditions

A Fractional Integral 11637 [2012, 344]. Proposed by Ovidiu Furdui, Technical University of Cluj-Napoca, Cluj, Romania. Let m ≥ 1 be a nonnegative integer. Let {u} = u − buc; the quantity {u} is called the fractional part of u. Prove that Z 1 m m 1 X 1 xm dx = 1 − ζ (k + 1). x m + 1 k=1 0 (Here ζ denotes the Riemann zeta function.) Solution by Patrick J. Fitzsimmons, San Diego, CA. First note that 1 ≤ x < n1 . From this it follows that n+1 1

Z 0

1 x

=

1 x

− n if

m 1 ∞ Z 1 ∞ X X n 1 (1 − nx)m+1 n m m x dx = (1 − nx) d x = 1 x −n(m + 1) 1 n=1 n+1 n=1

n+1

m+1 ∞ 1 X1 1 = . m + 1 n=1 n n + 1 On the other hand, with Z =

Pm

k=1

ζ (k + 1), we have

m X ∞ ∞ X m X X 1 1 Z= = k+1 k+1 n n k=1 n=1 n=1 k=1

=m+

∞ X n=2

=m+

∞ X n=2

1 n2

1 n m+2 − n1

− 1

=m+

1 − n(n − 1)

∞ X n=2

∞ X n=2

1 1 1− m n(n − 1) n ∞

X 1 1 = m + 1 − . m+1 (n − 1)n n(n + 1)m+1 n=1

Thus both sides of the stated identity equal

1 m+1

P∞

1 n=1 n(n+1)m+1 .

Editorial comment. A similar problem appeared as Problem 1845, Math. Mag., 84 (April 2011), 155–156, and as Problem 11206, this M ONTHLY 114 (2007), 928–929. Eugene A. Herman showed for a > m − 1 that Z 1 m m+1 m 1 1 1 X m+1−k a ζ (k + 1 − m + 1) a+1 . x dx = − x a − m + 1 m + 1 k=1 0 m+1−k Also solved by T. Amdeberhan, P. J. Anderson (Canada), M. Bataille (France), D. Beckwith, K. N. Boyadzhiev, M. A. Carlton, N. Caro (Brazil), R. Chapman (U. K.), M. W. Coffey, C. Curtis, P. P. D´alyay (Hungary), E. S. Eyeson, D. Fleischman, O. Geupel (Germany), M. L. Glasser, M. Goldenberg & M. Kaplan, D. Gove, G. C. Greubel, J.-P. Grivaux (France), J. A. Grzesik, E. A. Herman, E. Hysnelaj (Australia) & E. Bojaxhiu (Germany), W. Janous (Austria), B. Karaivanov, D. R. Kim (Korea), O. Kouba (Syria), H. Kwong, J. B. Little, O. P. Lossers (Netherlands), I. Mez˝o (Hungary), U. Milutinovi´c (Slovenia), J. Minkus, R. Nandan, M. Omarjee (France), P. Perfetti (Italy) T. Perrson & M. P. Sundqvist (Sweden), C. R. Pranesachar (India), M. A. Prasad (India), R. Pratt, V. Sah, J. Schlosberg, N. C. Singer, A. Stenger, R. Stong, R. Tauraso (Italy), D. B. Tyler, J. Vinuesa (Spain), T. Viteam (Uruguay), M. Vowe (Switzerland), A. Witkowski (Poland), J. Zacharias, GCHQ Problem Solving Group (U. K.), Missouri State University Problem Solving Group, NSA Problems Group, TCDmath Problem Group (Ireland), and the proposer.

April 2014]

PROBLEMS AND SOLUTIONS

This content downloaded from 92.82.8.190 on Sun, 30 Mar 2014 17:31:09 PM All use subject to JSTOR Terms and Conditions

369

Independent Triples in a Discrete Probability Space 11643 [2012, 426]. Proposed by Eugen J. Ionascu, Columbus State University, Columbus, GA. Let r be a real number with 0 < r < 1, and define a discrete probability measure P on N by P(k) = (1 − r )r k−1 for k ≥ 1. Show that there are uncountably many triples (A1 , A2 , A3 ) of subsets of N that are mutually independent, that is, P(Ai ∩ A j ) = P(Ai )P(A j ) for i 6 = j and P(A1 ∩ A2 ∩ A3 ) = P(A1 )P(A2 )P(A3 ). S Solution bySOliver Geupel, Br¨uhl, NRW, Germany. Let A1 = m≥0 {4m + 1, 4m + 2} and S A2 = m≥0 {4m + 1, 4m + 3}. For any set B of nonnegative integers, let A3 = m∈B {4m + 1, 4m + 2, 4m + 3, 4m + 4}. Since B is arbitrary, there are uncountably many such triples. We show that the events A1 , A2 , A3 are mutually independent. We have P(A1 ) = (1 − r )

∞ X 1 1+r = , (r 4m + r 4m+1 ) = (1 − r ) 4 1−r 1 + r2 m=0

∞ X 1 + r2 1 P(A2 ) = (1 − r ) (r 4m + r 4m+2 ) = (1 − r ) = , and 4 1 − r 1 + r m=0 X X P(A3 ) = (1 − r ) (r 4m + r 4m+1 + r 4m+2 + r 4m+3 ) = (1 − r 4 ) r 4m . m∈B

m∈B

Furthermore, P(A1 ∩ A2 ) = (1 − r )

∞ X

(r 4m

1−r = P(A1 )P(A2 ), 1 − r4 X + r 4m+1 ) = (1 − r 2 ) r 4m = P(A1 )P(A3 ),

(r

+r

r 4m =

m=0

P(A1 ∩ A3 ) = (1 − r )

X m∈B

P(A2 ∩ A3 ) = (1 − r )

X

m∈B 4m

4m+2

) = P(A2 )P(A3 ),

m∈B

and P(A1 ∩ A2 ∩ A3 ) = (1 − r )

X

r 4m = P(A1 )P(A2 )P(A3 ).

m∈B

Editorial comment. Many solvers noted that there are trivial solutions, such as A1 = A2 = N and A3 arbitrary. The solution presented here demonstrates that the sets can be required to be nontrivial. Solved also by M. Carlton, J. H. Lindsey II, M. D. Meyerson, M. Rajeswari (India), K. Schilling, R. Stong, GCHQ Problem Solving Group (U. K.), and the proposer.

Factorable Polynomials 11645 [2012, 427]. Proposed by Christopher J. Hillar, University of California, Berkeley, CA, Lionel Levine, Cornell University, Ithaca, NY, and Darren Rhea, University of California, San Francisco, CA. Determine all positivePintegers n such that the polynomial g in two variables given by g(x, y) = 1 + y 2 nk=1 x 2k + y 4 x 2n+2 factors in C[x, y]. 370

c THE MATHEMATICAL ASSOCIATION OF AMERICA [Monthly 121

This content downloaded from 92.82.8.190 on Sun, 30 Mar 2014 17:31:09 PM All use subject to JSTOR Terms and Conditions

Solution by O. P. Lossers, Eindhoven University of Technology, Eindhoven, The Netherlands. For n = 1, g has x 2 y 2 + ξ as a factor, where ξ is a primitive cube root of unity in C. For n = 2, g has x 2 y 2 + 1 as a factor. We claim that g does not factor in C[x, y] when n ≥ 3. Equivalently, P we claim that h does not factor in C[x, y] when n ≥ 3, where h(x, y) = y 4 + y 2 nk=1 x 2k + x 2n+2 . First, we note that h is a polynomial in y 2 over C[x], so if h has a linear factor, necessarily of the form y + a(x), then y − a(x) is another linear factor and so y 2 − a(x)2 is a quadratic factor of h. If our claim is false, then h factors as a product of two quadratic polynomials in y over C[x], and such a factorization has the form h(x, y) = (y 2 + a(x)y + λx r )(y 2 − a(x)y + λ−1 x s ), where r and s are nonnegative integers such that r + s = 2n + 2 and a(x) ∈ C[x]. Inspecting the coefficient of y shows that λx r a(x) = λ−1 x s a(x). Now, a(x) = 0 is impossible if n ≥ 3, as the expression for h(x, y) would not have enough terms. Therefore, r = s = n + 1 and λ = ±1. P Let σ be the polynomial given by σ (x) = nk=1 x 2k . From the coefficient of y 2 in h, we see that σ = 2λx n+1 − a 2 (x), with a of degree n. Writing a = ib and b = c(x 2 ) + xd(x 2 ) gives σ = 2λx n+1 + c2 (x 2 ) + 2xc(x 2 )d(x 2 ) + x 2 d 2 (x 2 ).

(1)

If n is even, then equating odd parts in (1) gives 0 = 2λx n+1 + 2xc(x 2 )d(x 2 ), whence c and d must be monomials. But then the left side of (1) has n terms while the right side has just two. So n is odd, say n = 2m + 1. In this case, from (1) it follows that cd = 0, and since b(x) = c(x 2 ) + xd(x 2 ) has degree 2m + 1, it must be c that is 0 so that b can have odd degree. Writing z = x 2 , we thus have 2m+1 X

z k = zd 2 (z) + 2λz m+1 ,

(2)

k=1

P j m where d has the form d = 1 + m−1 j=1 d j z + z with ∈ {−1, 1}. The first m terms of d now coincide with those of (1 − z)−1/2 , so d j = (−1) j −1/2 for 0 ≤ j ≤ m − 1. j A similar calculation for the last m coefficients of d shows that dm− j = d j for 0 ≤ j ≤ m − 1. But that gives contradictory values for d1 when m ≥ 1, so there is no factorization if n ≥ 2, as claimed. Also solved by G. Apostolopoulos (Greece), R. Chapman (U. K.), P. P. Dalyay (Hungary), D. Fleischman, O. Geupel (Germany), M. Goldenberg & M. Kaplan, E. A. Herman, B. Karaivanov, O. Kouba (Syria), J. H. Lindsey II, A. Magidin, M. A. Prasad (India), N. Singer, R. Stong, E. Verriest, and the proposers.

A Geometric Inequality 11646 [2012, 427]. Proposed by P´al P´eter D´alyay, Szeged, Hungary. Let ABC be an acute triangle, and let A1 , B1 , C1 be the intersection points of the angle bisectors from A, B, C to the respective opposite sides. Let R and r be the circumradius and the inradius of ABC, and let R A , R B , RC be the circumradii of the triangles AC1 B1 , BA1 C1 , and CA1 B1 , respectively. Let H be the orthocenter of ABC, and let da , db , dc be the distances from H to sides BC, CA, and AB, respectively. Show that 2r (R A + R B + RC ) ≥ R(da + db + dc ). April 2014]

PROBLEMS AND SOLUTIONS

This content downloaded from 92.82.8.190 on Sun, 30 Mar 2014 17:31:09 PM All use subject to JSTOR Terms and Conditions

371

Solution by Peter N¨uesch, Switzerland. Our solution uses Problem 11552 (this M ONTHLY, October 2012, p. 702–703). We write a, b, c for the lengths of the sides of 4ABC, s for the semi-perimeter, and α, β, γ for the measures of the angles. From the definitions, we have AB1 =

bc , c+a

bc , a+b

AC1 =

B1 C1 = a1 = 2R A sin α.

Using a = 2R sin α, we get R A = Ra1 /a. Thus, a1 b1 c1 r R A + R B + RC = R + + ≥ R 1+ = R + r, a b c R where the inequality is Problem 11552. From da = 2R cos β cos γ , we have da + db + dc = 2R(cos β cos γ + cos γ cos α + cos α cos β) =

r 2 + s 2 − 4R 2 . 2R

Note that (r 2 + s 2 − 4R 2 )/2R ≤ (2r (R + r ))/R, since this is a rearrangement of a 2 2 2 2 2 Blundon inequality, √ s ≤ 4R + 4Rr + 3r . (This follows from s ≤ 2R + 10Rr − 2 r + 2(R − 2r ) R(R − 2r ), found in W. J. Blundon, Inequalities associated with the triangle, Canad. Math. Bull. 8 (1965) 615–626.) This proves 2r (R A + R B + RC ) ≥ 2r (R + r ) ≥ R(da + db + dc ). Also solved by B. Karaivanov, J. Zacharias, and the proposer.

A Subset That Is Not Closed 11648 [2012, 427]. Proposed by Moubinool Omarjee, Paris, France. Let R 1 E be the set of all continuous, differentiable functions from (0, 1] into R such that 0 t 1/2 f 2 (t) dt R1 R1 converges. Let F be the set of all f in E such that 0 t −3/2 f 2 (t) dt and 0 t 1/2 f 0 (t)2 dt converge. Equip E with the distance 1/2 Z 1 t 1/2 ( f − g)2 (t) dt d( f, g) = 0

to make it a metric space. Is F a closed subset of E? Solution by O. P. Lossers, Eindhoven University of Technology, Eindhoven, The Netherlands. No, F is not closed. Consider f (t) = t 1/4 , so that f ∈ E but f 6 ∈ F. Let φ be a differentiable function with minimum 0 and maximum 1, and such that φ(t) = 0 for 0 < t < 1 and φ(t) = 1 for t > 2. Define φ (t) = φ(t/) for > 0. Note that φ f ∈ F. Now Z 1 Z 2 2 d( f, φ f )2 = t 1/2 1 − φ(t/) f (t)2 dt ≤ t 1/2 f (t)2 dt, 0

0

which goes to 0 as goes to 0. Hence, f is in the closure of F. Editorial comment. If the problem statement had said “continuously differentiable” and not just “continuous, differentiable”, then the above argument would in fact show that F is dense in E. Also solved by P. P. D´alyay (Hungary), O. Kouba (Syria), J. H. Lindsey II, R. Stong, and GCHQ Problem Solving Group (U. K.).

372

c THE MATHEMATICAL ASSOCIATION OF AMERICA [Monthly 121

This content downloaded from 92.82.8.190 on Sun, 30 Mar 2014 17:31:09 PM All use subject to JSTOR Terms and Conditions

Review Encounters with Chaos and Fractals . 2nd edition. By Denny Gulick. Chapman and Hall/CRC Press, Boca Raton, 2012, xvi + 371 pp., ISBN 978-1-58488-517-7, $79.95. Review by: Jeffrey Nunemacher The American Mathematical Monthly, Vol. 121, No. 4 (April), pp. 373-376 Published by: Mathematical Association of America Stable URL: http://www.jstor.org/stable/10.4169/amer.math.monthly.121.04.373 . Accessed: 30/03/2014 17:31 Your use of the JSTOR archive indicates your acceptance of the Terms & Conditions of Use, available at . http://www.jstor.org/page/info/about/policies/terms.jsp

. JSTOR is a not-for-profit service that helps scholars, researchers, and students discover, use, and build upon a wide range of content in a trusted digital archive. We use information technology and tools to increase productivity and facilitate new forms of scholarship. For more information about JSTOR, please contact [email protected].

.

Mathematical Association of America is collaborating with JSTOR to digitize, preserve and extend access to The American Mathematical Monthly.

http://www.jstor.org

This content downloaded from 92.82.8.190 on Sun, 30 Mar 2014 17:31:47 PM All use subject to JSTOR Terms and Conditions

REVIEWS Edited by Jeffrey Nunemacher Mathematics and Computer Science, Ohio Wesleyan University, Delaware, OH 43015

Encounters with Chaos and Fractals, 2nd edition. By Denny Gulick. Chapman and Hall/CRC Press, Boca Raton, 2012, xvi + 371 pp., ISBN 978-1-58488-517-7, $79.95.

Reviewed by Jeffrey Nunemacher How can we convince undergraduates that mathematics is as modern and vibrant as physics or biology in these days of the Higgs boson and genome sequencing? Certainly ordinary calculus, although it is intellectually rich, does not do the trick. Since most promising mathematics and science students see it first in high school, the level of excitement that I still remember from seeing it presented relatively rigorously in college many years ago is simply not present today. My candidate for a teachable contemporary mathematical topic that can attract modern students is chaotic dynamical systems, or to give the subject a more enticing name, chaos and fractals. I have taught courses on this subject at a variety of levels from freshman honors to senior capstone. And the text that I have enjoyed using the most (at least for a lower-level version) is the Gulick book, which has recently appeared in a second edition. The new edition offers more material on fractals (three chapters rather than one) and gives expanded coverage of background material and attention to modern algorithms. This second edition is the subject of the current review. The subject of chaos was invented around the turn of the twentieth century by Poincar´e (but named much later by Yorke). He showed that a deterministic system of second-order differential equations modeling a particular three-body solar system can have solutions that display sensitive dependence on initial conditions. Thus, some trajectories simply cannot be predicted with any degree of accuracy over the long term. But the subject did not really take off until the development of software for experimentation and graphics. Once these tools were available and applications to subjects like weather prediction and chemical reactions were discovered, there was incentive to find the correct mathematical framework and to build an appropriate theory. Some chaotic trajectories display fractal behavior, so this modern geometric concept occurs naturally in the study of chaotic systems. Fractals also occur as the limit sets of simple discrete dynamical systems. Take, for instance, the iterated function system (IFS) defined by the three affine mappings of the plane: T1 (v) = 1/2v, T2 (v) = √ 1/2v + (1/2, 0), T3 (v) = 1/2v + (1/4, 3/4). If we start with the origin and iterate this IFS many times, the limit set is the famous Sierpi´nski gasket, and a good computer image is obtained by using the tenth iterate. It is possible to teach much of this material to motivated students who have a background of only first-semester calculus. Of course, the more mathematics a student knows the better, but a course can be taught with this very minimal prerequisite. Encountered during the course will be some topics from sophomore courses including iteration (discrete mathematics), matrices (linear algebra), the qualitative study of solutions of differential equations, and algorithms employing pseudorandom numbers http://dx.doi.org/10.4169/amer.math.monthly.121.04.373

April 2014]

REVIEWS

This content downloaded from 92.82.8.190 on Sun, 30 Mar 2014 17:31:47 PM All use subject to JSTOR Terms and Conditions

373

(programming and statistics). But it can be argued that seeing interesting topics from these courses as they arise naturally is the best possible motivation for then taking the standard courses. Ideas from topology and real analysis, both on the line and in abstract spaces, also come up naturally as the course proceeds. The subjects of chaos and fractals have been part of the undergraduate mathematical landscape ever since Devaney’s first edition of his attractive book [3] in 1985. A special issue of the College Mathematics Journal (Volume 22, No. 1, January 1991) was devoted to this new topic and discusses how it might fit into the undergraduate curriculum. Let me list particular aspects of the subject that I find particularly well emphasized in the Gulick book. 1. There are fundamental simple, yet fecund, examples to explore and generalize, e.g., the quadratic mapping in one variable, the Sierpi´nski gasket, Smale’s Horseshoe mapping in the plane, the Lorenz system of differential equations (which provided the first example of chaos in a real situation), the Mandelbrot set in the complex plane. Most students and some professors do not appreciate how crucial examples are for the development of a mathematical subject. Since most of the mathematics that we teach is quite old, motivating examples are often treated very briefly in the rush to get to theorems. The examples in chaos and fractals are rich and somewhat complicated, and it is natural to linger over them. Thus the subject is a good corrective to standard courses. Gulick does a good job analyzing these examples, starting with easy mathematics but getting to some depth. 2. A variety of tools and theory are useful in exploring these examples, e.g., derivatives and Jacobians, Lyapunov exponents, symbolic dynamics, conjugacy, bifurcation theory. 3. It is nontrivial to arrive at the best definitions on which to base the relevant theory, e.g., strong and weak chaos for function iteration, the Hausdorff metric on the space of compact sets in the plane, and various versions of dimension. Recall the Bourbaki point of view that definitions in mathematics should be carefully constructed (and perhaps difficult) in order to make the theorems easy. 4. There are surprising fundamental theorems, e.g., Sharkovsky’s Theorem about the occurrence of periods in one dimension based on a particular total ordering of the natural numbers, the Stable and Unstable Manifold Theorem, the connectedness of the Mandelbrot set. 5. It is natural to explore the phenomena of both chaos and fractals using computational resources, e.g., to draw bifurcation diagrams, to approximate fractal sets. Chaos is a wonderful subject for exploratory mathematics. 6. Finally, examples from a wide variety of applied areas are available to show the relevance of this subject, e.g., chaotic pendula in mechanics or fractal coastlines in geography. While there are many excellent texts about these subjects at various levels, I have not found a better book than Gulick’s for a serious course at the honors freshman or sophomore level. There are very elementary books that concentrate on intuitive understanding and visual images, and many others that require a greater depth of mathematical background. One requirement, which to me is important, is that the course (and thus the text) should treat both discrete and continuous dynamical systems. I feel that the richness and applications of the subject can only be seen by studying both types. Excellent books that fail to satisfy this criterion (and which also are too advanced for beginning students) are [1], [7], and [10]. A standard book on fractals and the algorithms to produce them on a computer is [2]. Typically, for my course I use a main 374

c THE MATHEMATICAL ASSOCIATION OF AMERICA [Monthly 121

This content downloaded from 92.82.8.190 on Sun, 30 Mar 2014 17:31:47 PM All use subject to JSTOR Terms and Conditions

text and then also a good expository book of broader scope. In the past I have selected popular books by Gleick [4], Peterson [6], Ruelle [8], or Stewart [9]. Of these, the one I’ve enjoyed using the most is [9]. I’ve also required each student to do an independent project, which can be experimental, computational, or mathematical. Next, I will briefly discuss the contents of Gulick’s book. The choice of topics is particularly well selected for the not particularly advanced but still seriously mathematical undergraduate course that I envision. The book begins with two chapters on discrete one-dimensional iteration, the first devoted to simple examples, fixed and periodic points, and bifurcation, and the second focusing on chaotic behavior. Some of the examples are explored in some detail; for example, the study of the logistic family Q µ (x) = µx(1 − x) requires ten pages. The two most common bifurcations, namely the period-doubling bifurcation and the tangent bifurcation, are studied and explored in examples and problems. The Li-Yorke Theorem, which asserts that if f is continuous on a closed interval J and maps J into itself, then if f has a period-3 point it also has points of all other periods, is proven in detail, while its generalization by Sharkovsky is simply stated and discussed. By the way, the Li-Yorke result first appeared in this M ONTHLY in 1975 [5] and is one of the early papers that made the subject of chaos popular. The tools needed in one dimension are the single-variable derivative and a computational system to explore examples of iteration. Chapter 3 generalizes these ideas to two dimensions using simple matrix theory and the Jacobian, and explores two classic examples of chaotic behavior: the H´enon quadratic mapping and Smale’s Horseshoe. Chapter 4 moves from the discrete setting to continuous dynamical systems, which are defined in terms of first-order differential equations. It generalizes the basic concepts to this setting and explores the pendulum system and the Lorenz system as two examples. No experience in solving differential equations is necessary. The basic idea of a differential equation defining a flow, together with some of the basic properties of the flow, is developed. Continuous dynamical systems require more machinery and sophistication to develop (which is mostly not done in this book). However, the most important applications of chaos to reality lie in this realm. There are also some philosophical points to make about the modeling process. For example, since chaos is a mathematical construct, it can apply to a given mathematical mode of reality but never to physical reality itself. Thus no phenomenon can ever be chaotic in the mathematical sense. The last three chapters of the book concentrate on fractals. Chapter 6 introduces the basic idea of a fractal and discusses self-similarity and various kinds of fractal dimension. It also presents some basic examples, such as the Cantor set, the Sierpi´nski gasket, and the H´enon attractor. Chapter 7 discusses Barnsley’s Iterated Function Systems using metric spaces and shows how they can be used to generate fractals on a computer. This chapter includes several elegant and useful results, such as the completeness of the collection of compact sets in the plane under the Hausdorff metric. Finally, Chapter 8 studies fractals in the complex plane and introduces Julia sets and the Mandelbrot set. The second edition of the book offers enhanced coverage of fractals beyond what was presented in the first edition. An appendix in the book presents MATLAB functions to allow the study of iteration empirically and to generate on the computer the classical images associated with chaos and fractals. For instance, there is MATLAB code to produce the bifurcation diagram of the logistic mapping Q µ (x) as µ varies over an interval, to draw the H´enon attractor, and to display Julia sets and the Mandelbrot set. It is a good choice to offer these experimental tools in a commonly available package, since the operation of the code can be understood with minimal effort. However, minor errors in some of the code April 2014]

REVIEWS

This content downloaded from 92.82.8.190 on Sun, 30 Mar 2014 17:31:47 PM All use subject to JSTOR Terms and Conditions

375

will cause problems for beginning users of MATLAB. For example, in Program 4 to produce a bifurcation diagram, I found four separate errors: The increments are 0.01, not 0.001 as promised (and 0.001 is necessary to obtain a good picture); a closing parenthesis is needed for the axis command; “hold on” should replace “holdon”; and “m” should replace “n” as an argument for the function Qm. The author intends to correct the errors on an Errata page. It is important for inexperienced users to be able to use the code, since the ability to experiment with examples is one of the most attractive features of this area of mathematics. I also found a few small errors in the text and exercises and some imprecise statements, such as Lemma 1 on page 227, which is stated for all increasing continuous functions but applies only to Cantor-like ones. Also, there is a loose statement on page 214 that asserts concepts pertinent to two-dimensional differential equations apply equally well in dimension three. The Poincar´e-Bendixson Theorem, which is used in the book, is a rather stark counterexample to this assertion. Despite these minor errors, I feel that this book is the best text available for a midlevel undergraduate course on chaos and fractals. The choice of topics, readable prose, and level of presentation make it a very attractive book. REFERENCES 1. K. T. Alligood, T. D. Sauer, J. A. Yorke, Chaos: An Introduction to Dynamical Systems. Springer-Verlag, New York, 1996. 2. M. F. Barnsley, Fractals Everywhere. Second edition, Academic Press, Boston, MA, 1993. 3. R. L. Devaney, An Introduction to Chaotic Dynamical Systems. Second edition, Addison-Wesley, Reading, MA, 1989. 4. J. Gleich, Chaos: The Making of a New Science. Viking, New York, 1988. 5. T. Y. Li, J .A. Yorke, Period three implies chaos, Amer. Math. Monthly 82 (1975) 985–992. 6. I. Peterson, Newton’s Clock: Chaos in the Solar System. W. H. Freeman, New York, 1995. 7. R. C. Robinson, An Introduction to Dynamical Systems. Prentice Hall, Englewood Cliffs, NJ, 2004. 8. D. Ruelle, Chance and Chaos. Princeton University Press, Princeton, NJ, 1993. 9. I. Stewart, Does God Play Dice? The New Mathematics of Chaos. Second edition, Blackwell, Malden, MA, 2002. 10. S. H. Strogatz, Nonlinear Dynamics and Chaos With Applications to Physics, Biology, Chemistry, and Engineering. Addison-Wesley, Reading, MA, 1994. Ohio Wesleyan University, Delaware, OH 43015 [email protected]

376

c THE MATHEMATICAL ASSOCIATION OF AMERICA [Monthly 121

This content downloaded from 92.82.8.190 on Sun, 30 Mar 2014 17:31:47 PM All use subject to JSTOR Terms and Conditions

Back Matter Source: The American Mathematical Monthly, Vol. 121, No. 4 (April) Published by: Mathematical Association of America Stable URL: http://www.jstor.org/stable/10.4169/amer.math.monthly.121.04.bm . Accessed: 30/03/2014 17:31 Your use of the JSTOR archive indicates your acceptance of the Terms & Conditions of Use, available at . http://www.jstor.org/page/info/about/policies/terms.jsp

. JSTOR is a not-for-profit service that helps scholars, researchers, and students discover, use, and build upon a wide range of content in a trusted digital archive. We use information technology and tools to increase productivity and facilitate new forms of scholarship. For more information about JSTOR, please contact [email protected].

.

Mathematical Association of America is collaborating with JSTOR to digitize, preserve and extend access to The American Mathematical Monthly.

http://www.jstor.org

This content downloaded from 92.82.8.190 on Sun, 30 Mar 2014 17:31:56 PM All use subject to JSTOR Terms and Conditions

New in the MAA eBooks Store Illustrated Special Relativity through Its Paradoxes: A Fusion of Linear Algebra, Graphics, and Reality Spectrum

B John dePillis and José Wudka By SSpectrum Series

T text illustrates and resolves several apparent The p paradoxes of Special Relativity including the twin p paradox and train-and-tunnel paradox. Assuming a minimum of technical prerequisites the authors iintroduce inertial frames and use them to explain a variety of phenomena: the nature of simultaneity, tthe proper way to add velocities, and why fasterthrough its tthan-light travel is impossible. Most of these eexplanations are contained in the resolution of aapparent paradoxes, including some lesser-known ones: the pea-shooter paradox, the bug-and-rivet o p paradox, and the accommodating universe paradox. The explanation of time and length contraction is T especially clear and illuminating.

IIllustrated ll Special Relativity

Paradoxes John dePillis & José Wudka Illustrations and animations by John dePillis

The roots of Einstein’s work in Maxwell’s lead the authors to devote several chapters to an exposition of Maxwell’s equations. The authors establish that those equations predict a frame-independent speed for the propagation of electromagnetic radiation, a speed that equals that of light. Several chapters are devoted to experiments of Roemer(SYMBOL!), Fizeau, and de Sitter to measure the speed of light and the Michelson-Morley experiment abolishing the aether. Throughout the exposition is thorough, but not overly technical, and often illustrated by cartoons. The volume might be suitable for a one-semester generaleducation introduction to Special Relativity. It is especially well-suited to self-study by interested laypersons or use as a supplement to a more traditional text. eISBN 978-1-61444-517-3 2013, 478 pp. Catalog Code: ISR PDF Price: $33.00

To order, visit www.maa.org/ebooks/ISR.

MATHEMATICAL ASSOCIATION OF AMERICA

This content downloaded from 92.82.8.190 on Sun, 30 Mar 2014 17:31:56 PM All use subject to JSTOR Terms and Conditions

MATHEMATICAL ASSOCIATION OF AMERICA 1529 Eighteenth St., NW • Washington, DC 20036

Recently Released from the MAA Distilling Ideas: An Introduction to Mathematical Thinking By Brian P. Katz and Michael Starbird MAA Textbooks Mathematics is not a spectator sport: successful students of mathematics grapple with ideas for themselves. Distilling Ideas presents a carefully designed sequence of exercises and theorem statements that challenge students to create proofs and concepts. As students meet these challenges, they discover strategies of proofs and strategies of thinking beyond mathematics. In other words, Distilling Ideas helps its users to develop the skills, attitudes, and habits of mind of a mathematician and to enjoy the process of distilling and exploring ideas. Catalog Code: DIMT List Price: $54.00 MAA Member: $45.00

ISBN: 978-1-93951-203-1 171 pp., Paperbound, 2013

To order, visit maa-store.hostedbywebstore.com or call 800-331-1622.

This content downloaded from 92.82.8.190 on Sun, 30 Mar 2014 17:31:56 PM All use subject to JSTOR Terms and Conditions

Front Matter Source: The American Mathematical Monthly, Vol. 121, No. 4 (April), pp. 281-282 Published by: Mathematical Association of America Stable URL: http://www.jstor.org/stable/10.4169/amer.math.monthly.121.04.fm . Accessed: 30/03/2014 17:27 Your use of the JSTOR archive indicates your acceptance of the Terms & Conditions of Use, available at . http://www.jstor.org/page/info/about/policies/terms.jsp

. JSTOR is a not-for-profit service that helps scholars, researchers, and students discover, use, and build upon a wide range of content in a trusted digital archive. We use information technology and tools to increase productivity and facilitate new forms of scholarship. For more information about JSTOR, please contact [email protected].

.

Mathematical Association of America is collaborating with JSTOR to digitize, preserve and extend access to The American Mathematical Monthly.

http://www.jstor.org

This content downloaded from 92.82.8.190 on Sun, 30 Mar 2014 17:27:55 PM All use subject to JSTOR Terms and Conditions

THE AMERICAN MATHEMATICAL

MONTHLY VOLUME 121, NO. 4

APRIL 2014

Periodicity Domains and the Transit of Venus

283

Andrew J. Simoson

A Drug-Induced Random Walk

299

Daniel J. Velleman

Analytical Solution for the Generalized Fermat–Torricelli Problem

318

Alexei Yu. Uteshev

On the Proof of the Existence of Undominated Strategies in Normal Form Games

332

Martin Kov´ ar and Alena Chernikava

An Asymptotic Formula for (1 + 1/x)x Based on the Partition Function

338

Chao-Ping Chen and Junesang Choi

NOTES Stirling’s Approximation for Central Extended Binomial Coefficients

344

Steffen Eger

A New Proof of Stirling’s Formula

350

Thorsten Neuschel

Zeta(2) Once Again

353

Ralph M. Krause

Polynomials (x 3 − n)(x 2 + 3) Solvable Modulo Any Integer

355

Andrea M. Hyde, Paul D. Lee, and Blair K. Spearman

Macaulay Expansion

359

B. Sury

Evaluating Lebesgue Integrals Efficiently with the FTC

361

J. J. Koliha

PROBLEMS AND SOLUTIONS

365

REVIEWS Encounters with Chaos and Fractals By Denny Gulick Jeffrey Nunemacher

MATHBITS 331, A One-Sentence Line-of-Sight Proof of the Extreme Value Theorem An Official Publication of the Mathematical Association of America

This content downloaded from 92.82.8.190 on Sun, 30 Mar 2014 17:27:55 PM All use subject to JSTOR Terms and Conditions

373

Latest in the MAA Notes Series Applications of Mathematics in Economics Warren Page, Editor Applications of Mathematics in Economics presents an overview of the (qualitative and graphical) methods and perspectives of economists. Its objectives are not intended to teach economics, but rather to give mathematicians a sense of what mathematics is used at the undergraduate level in various parts of economics, and to provide students with the opportunities to apply their mathematics in relevant economics contexts. The volume’s applications span a broad range of mathematical topics and levels of sophistication. Each article consists of self-contained, stand-alone, expository sections whose problems illustrate what mathematics is used, and how, in that subdiscipline of economics. The problems are intended to be richer and more informative about economics than the economics exercises in most mathematics texts. Since each section is self-contained, instructors can readily use the economics background and worked-out solutions to tailor (simplify or embellish) a section’s problems to their students’ needs. Overall, the volume’s 47 sections contain more than 100 multipart problems. Thus, instructors have ample material to select for classroom uses, homework assignments, and enrichment activities. eISBN: 9781614443179 Print ISBN: 9780883851920 ebook: $24.00 Print on demand (paperbound): $40.00

To order go to www.maa.org/ebooks/NTE82 MATHEMATICAL ASSOCIATION OF AMERICA

This content downloaded from 92.82.8.190 on Sun, 30 Mar 2014 17:27:55 PM All use subject to JSTOR Terms and Conditions

THE AMERICAN MATHEMATICAL

MONTHLY Volume 121, No. 4

April 2014

EDITOR Scott T. Chapman Sam Houston State University NOTES EDITOR Sergei Tabachnikov Pennsylvania State University

Douglas B. West University of Illinois

BOOK REVIEW EDITOR Jeffrey Nunemacher Ohio Wesleyan University

PROBLEM SECTION EDITORS Gerald Edgar Ohio State University

Doug Hensley Texas A&M University

ASSOCIATE EDITORS William Adkins Louisiana State University David Aldous University of California, Berkeley Elizabeth Allman University of Alaska, Fairbanks Jonathan M. Borwein University of Newcastle Jason Boynton North Dakota State University Edward B. Burger Southwestern University Minerva Cordero-Epperson University of Texas, Arlington Allan Donsig University of Nebraska, Lincoln Michael Dorff Brigham Young University Daniela Ferrero Texas State University Luis David Garcia-Puente Sam Houston State University Sidney Graham Central Michigan University Tara Holm Cornell University Roger A. Horn University of Utah Lea Jenkins Clemson University Daniel Krashen University of Georgia Ulrich Krause Universit¨ at Bremen

Jeffrey Lawson Western Carolina University C. Dwight Lahr Dartmouth College Susan Loepp Williams College Irina Mitrea Temple University Bruce P. Palka National Science Foundation Vadim Ponomarenko San Diego State University Catherine A. Roberts College of the Holy Cross Rachel Roberts Washington University, St. Louis Ivelisse M. Rubio Universidad de Puerto Rico, Rio Piedras Adriana Salerno Bates College Edward Scheinerman Johns Hopkins University Anne Shepler University of North Texas Susan G. Staples Texas Christian University Dennis Stowe Idaho State University Daniel Ullman George Washington University Daniel Velleman Amherst College

EDITORIAL ASSISTANT Bonnie K. Ponce

This content downloaded from 92.82.8.190 on Sun, 30 Mar 2014 17:27:55 PM All use subject to JSTOR Terms and Conditions

NOTICE TO AUTHORS

Proposed problems or solutions should be sent to:

The MONTHLY publishes articles, as well as notes and other features, about mathematics and the profession. Its readers span a broad spectrum of mathematical interests, and include professional mathematicians as well as students of mathematics at all collegiate levels. Authors are invited to submit articles and notes that bring interesting mathematical ideas to a wide audience of MONTHLY readers.

DOUG HENSLEY, MONTHLY Problems Department of Mathematics Texas A&M University 3368 TAMU College Station, TX 77843-3368.

The MONTHLY’s readers expect a high standard of exposition; they expect articles to inform, stimulate, challenge, enlighten, and even entertain. MONTHLY articles are meant to be read, enjoyed, and discussed, rather than just archived. Articles may be expositions of old or new results, historical or biographical essays, speculations or definitive treatments, broad developments, or explorations of a single application. Novelty and generality are far less important than clarity of exposition and broad appeal. Appropriate figures, diagrams, and photographs are encouraged.

In lieu of duplicate hardcopy, authors may submit pdfs to [email protected]. Advertising correspondence should be sent to: MAA Advertising 1529 Eighteenth St. NW Washington DC 20036. Phone: (877) 622-2373, E-mail: [email protected]. Further advertising information can be found online at www.maa.org.

Notes are short, sharply focused, and possibly informal. They are often gems that provide a new proof of an old theorem, a novel presentation of a familiar theme, or a lively discussion of a single issue.

Change of address, missing issue inquiries, and other subscription correspondence can be sent to:

Submission of articles, notes, and filler pieces is required via the MONTHLY’s Editorial Manager System. Initial submissions in pdf or LATEX form can be sent to the Editor Scott Chapman at

All of these are at the address:

http://www.editorialmanager.com/monthly The Editorial Manager System will cue the author for all required information concerning the paper. Questions concerning submission of papers can be addressed to the Editor at [email protected]. Authors who use LATEX can find our article/note template at http://www.shsu.edu/~bks006/Monthly. html. This template requires the style file maamonthly.sty, which can also be downloaded from the same webpage. A formatting document for MONTHLY references can be found at http://www.shsu.edu/ ~bks006/FormattingReferences.pdf. Follow the link to Electronic Publications Information for authors at http://www.maa.org/pubs/monthly. html for information about figures and files, as well as general editorial guidelines. Letters to the Editor on any topic are invited. Comments, criticisms, and suggestions for making the MONTHLY more lively, entertaining, and informative can be forwarded to the Editor at [email protected]. The online MONTHLY archive at www.jstor.org is a valuable resource for both authors and readers; it may be searched online in a variety of ways for any specified keyword(s). MAA members whose institutions do not provide JSTOR access may obtain individual access for a modest annual fee; call 800-3311622. See the MONTHLY section of MAA Online for current information such as contents of issues and descriptive summaries of forthcoming articles: http://www.maa.org/

MAA Service Center, [email protected].

The Mathematical Association of America 1529 Eighteenth Street, N.W. Washington, DC 20036. Recent copies of the MONTHLY are available for purchase through the MAA Service Center: [email protected], 1-800-331-1622. Microfilm Editions are available at: University Microfilms International, Serial Bid coordinator, 300 North Zeeb Road, Ann Arbor, MI 48106. The AMERICAN MATHEMATICAL MONTHLY (ISSN 0002-9890) is published monthly except bimonthly June-July and August-September by the Mathematical Association of America at 1529 Eighteenth Street, N.W., Washington, DC 20036 and Lancaster, PA, and copyrighted by the Mathematical Association of America (Incorporated), 2014, including rights to this journal issue as a whole and, except where otherwise noted, rights to each individual contribution. Permission to make copies of individual articles, in paper or electronic form, including posting on personal and class web pages, for educational and scientific use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear the following copyright notice: [Copyright the Mathematical Association of America 2014. All rights reserved.] Abstracting, with credit, is permitted. To copy otherwise, or to republish, requires specific permission of the MAA’s Director of Publications and possibly a fee. Periodicals postage paid at Washington, DC, and additional mailing offices. Postmaster: Send address changes to the American Mathematical Monthly, Membership/Subscription Department, MAA, 1529 Eighteenth Street, N.W., Washington, DC, 20036-1385.

This content downloaded from 92.82.8.190 on Sun, 30 Mar 2014 17:27:55 PM All use subject to JSTOR Terms and Conditions