Chapters 5, 6, 7: Solutions to Exercises
Niteesh Sahni ∗
1 Normal Distribution

A random variable $X$ following a Normal distribution with parameters $\mu$ and $\sigma$ has the probability density function given by

$$f(x) = \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}} \quad \text{for all } x \in \mathbb{R}. \tag{1}$$

Theorem 1.1.
$$\int_{-\infty}^{\infty} e^{-x^2}\,dx = \sqrt{\pi}, \quad \text{and} \quad \int_{0}^{\infty} e^{-x^2}\,dx = \frac{\sqrt{\pi}}{2}.$$
Theorem 1.2. The function $f$ given by equation (1) is a probability density function.

Proof. It is easy to see that $f(x) \ge 0$ for all $x \in \mathbb{R}$. So it remains to show that $\int_{-\infty}^{\infty} f(x)\,dx = 1$. Now by putting $t = \frac{x-\mu}{\sqrt{2}\,\sigma}$, we can write
$$\int_{-\infty}^{\infty} f(x)\,dx = \frac{1}{\sqrt{\pi}} \int_{-\infty}^{\infty} e^{-t^2}\,dt = \frac{1}{\sqrt{\pi}} \times \sqrt{\pi} = 1 \quad \text{(by Theorem 1.1)}.$$
Definition 1.3. Putting $\mu = 0$ and $\sigma = 1$ in equation (1) we get the probability density function for the standard normal distribution:
$$f(x) = \frac{1}{\sqrt{2\pi}}\, e^{-\frac{x^2}{2}}, \quad x \in \mathbb{R}.$$

∗ Asst. Professor, Deptt. of Maths, Shiv Nadar University
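Usually, a table of values is provided for the cumulative distribution function of the standard normal distribution. This suffices to calculate the probabilities in case of any Normal distribution with parameters $\mu$ and $\sigma$. We can see this through the following computation, in which we substitute $t = \frac{x-\mu}{\sigma}$:
$$\begin{aligned}
P(a \le X \le b) &= \frac{1}{\sqrt{2\pi}\,\sigma} \int_{a}^{b} e^{-\frac{(x-\mu)^2}{2\sigma^2}}\,dx \\
&= \frac{1}{\sqrt{2\pi}\,\sigma} \int_{-\infty}^{b} e^{-\frac{(x-\mu)^2}{2\sigma^2}}\,dx - \frac{1}{\sqrt{2\pi}\,\sigma} \int_{-\infty}^{a} e^{-\frac{(x-\mu)^2}{2\sigma^2}}\,dx \\
&= \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\frac{b-\mu}{\sigma}} e^{-\frac{t^2}{2}}\,dt - \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\frac{a-\mu}{\sigma}} e^{-\frac{t^2}{2}}\,dt \\
&= F\!\left(\frac{b-\mu}{\sigma}\right) - F\!\left(\frac{a-\mu}{\sigma}\right),
\end{aligned}$$
where $F$ denotes the cumulative distribution function of the standard normal distribution.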
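The standardization step above can be checked numerically. The following is a minimal sketch, assuming Python's standard `statistics.NormalDist` plays the role of the printed table; the function name `normal_interval_prob` and the parameter values are illustrative choices, not part of the original notes.

```python
from statistics import NormalDist


def normal_interval_prob(a, b, mu, sigma):
    """P(a <= X <= b) for X ~ N(mu, sigma), computed only from the
    standard normal cumulative distribution function F."""
    F = NormalDist().cdf  # standard normal: mean 0, standard deviation 1
    return F((b - mu) / sigma) - F((a - mu) / sigma)


# Illustrative (made-up) parameters: mu = 10, sigma = 2, interval [9, 13].
via_standardizing = normal_interval_prob(9, 13, mu=10, sigma=2)

# Same probability computed directly from the N(10, 2) distribution.
direct = NormalDist(10, 2).cdf(13) - NormalDist(10, 2).cdf(9)

print(via_standardizing, direct)
```

Both routes should agree up to floating-point error, which is the content of the computation above.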
Notation 1.4. Suppose a random variable $Z$ follows a standard normal distribution. Then for any $\alpha$ with $0 < \alpha < 1$, there exists a real number $z_\alpha$ such that $P(Z > z_\alpha) = \alpha$.

2 Examples
Example 2.1. Use the table given at the back of the textbook to find the probabilities that a random variable following the standard normal distribution will take on a value
(a) between 0.87 and 1.28;
(b) between −0.34 and 0.62;
(c) greater than 0.85;
(d) greater than −0.65.

Solution. The cumulative distribution function $F$ for the standard normal distribution is given by
$$F(z) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{z} e^{-\frac{t^2}{2}}\,dt.$$

(a) So the probability that $X$ lies between 0.87 and 1.28 is
$$F(1.28) - F(0.87) = 0.8997 - 0.8078 = 0.0919.$$

(b) $P(-0.34 \le X \le 0.62)$ is given by
$$F(0.62) - F(-0.34) = 0.7324 - 0.3669 = 0.3655.$$
(c) $P(X > 0.85)$ is given by
$$1 - P(X \le 0.85) = 1 - F(0.85) = 1 - 0.802337457 = 0.197662543.$$

(d) $P(X > -0.65)$ is given by
$$1 - P(X \le -0.65) = 1 - F(-0.65) = 1 - 0.2578 = 0.7422.$$
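The four table lookups in Example 2.1 can be reproduced with a few lines of Python. This is a sketch that assumes `statistics.NormalDist().cdf` as a stand-in for the table at the back of the textbook; the dictionary name `answers` is only illustrative, and values read from a four-decimal table may differ from these in the last digit.

```python
from statistics import NormalDist

F = NormalDist().cdf  # cumulative distribution function of the standard normal

answers = {
    "a": F(1.28) - F(0.87),   # between 0.87 and 1.28
    "b": F(0.62) - F(-0.34),  # between -0.34 and 0.62
    "c": 1 - F(0.85),         # greater than 0.85
    "d": 1 - F(-0.65),        # greater than -0.65
}

for part, prob in answers.items():
    print(part, round(prob, 4))
```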
Example 2.2. The actual amount of instant coffee that a filling machine puts into "4 ounce" jars can be thought of as a random variable having a normal distribution with $\sigma = 0.04$ ounce. If only 2% of the jars are to contain less than 4 ounces, what should be the average fill of these jars?

Solution. Let $X$ be the random variable that stands for the amount of coffee being put in a jar. It is given that $X \sim N(\mu, \sigma)$, $\sigma = 0.04$. We are required to find $\mu$ so that $P(X < 4) = 0.02$. Therefore,
$$0.02 = P\!\left(\frac{X-\mu}{\sigma} < \frac{4-\mu}{0.04}\right) = F\!\left(\frac{4-\mu}{0.04}\right).$$
By looking at the table, the value of $z$ for which $F(z)$ comes closest to 0.02 is $z = -2.05$. So $\frac{4-\mu}{0.04} = -2.05$. Hence $\mu = 4.082$.
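The reverse table lookup in Example 2.2 amounts to inverting $F$. A minimal sketch, assuming `statistics.NormalDist().inv_cdf` in place of the table; the helper name `required_mean` and its parameter names are hypothetical.

```python
from statistics import NormalDist


def required_mean(limit, sigma, tail_prob):
    """Mean mu of N(mu, sigma) for which P(X < limit) = tail_prob."""
    z = NormalDist().inv_cdf(tail_prob)  # solves F(z) = tail_prob
    return limit - z * sigma             # from (limit - mu) / sigma = z


mu = required_mean(limit=4.0, sigma=0.04, tail_prob=0.02)
print(round(mu, 3))
```

Because the inverse is computed without rounding $z$ to two decimals, the result may differ from the table-based answer in the fourth decimal place.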
3 Exercises 5.19-5.43
Q 5.19. Use the table given at the back of the textbook to find the probabilities that a random variable following the standard normal distribution will take on a value
(a) less than 1.75;
(b) less than −1.25;
(c) greater than 2.06;
(d) greater than −1.82.

Solution. (a) The required probability is $P(Z \le 1.75)$. Now,
$$P(Z \le 1.75) = F(1.75) = 0.959940843.$$

(b) The required probability is $P(Z < -1.25)$. Now,
$$P(Z < -1.25) = F(-1.25) = 0.105649774.$$

(c) The required probability is $P(Z > 2.06)$. Now,
$$P(Z > 2.06) = 1 - P(Z \le 2.06) = 1 - F(2.06) = 1 - 0.98030073 = 0.01969927.$$

(d) The required probability is $P(Z > -1.82)$. Now,
$$P(Z > -1.82) = 1 - P(Z \le -1.82) = 1 - F(-1.82) = 0.965620498.$$
Q 5.20. Use the table given at the back of the textbook to find the probabilities that a random variable following the standard normal distribution will take on a value
(a) between 0 and 2.3;
(b) between 1.22 and 2.43;
(c) between −1.45 and −0.45;
(d) between −1.70 and 1.35.

Solution. (a) The required probability, $P(0 < X < 2.3)$, is given by
$$P(0 < X < 2.3) = P(X \le 2.3) - P(X \le 0) = F(2.3) - 0.5 = 0.48927589.$$

(b) The required probability, $P(1.22 < X < 2.43)$, is given by
$$P(1.22 < X < 2.43) = P(X \le 2.43) - P(X \le 1.22) = F(2.43) - F(1.22) = 0.103683026.$$

(c) The required probability, $P(-1.45 < X < -0.45)$, is given by
$$P(-1.45 < X < -0.45) = P(X \le -0.45) - P(X \le -1.45) = F(-0.45) - F(-1.45) = 0.252825961.$$

(d) The required probability, $P(-1.70 < X < 1.35)$, is given by
$$P(-1.70 < X < 1.35) = P(X \le 1.35) - P(X \le -1.70) = F(1.35) - F(-1.70) = 0.866926546.$$
Q 5.21. Find $z$ if the probability that a random variable having the standard normal distribution will take on a value
(a) less than $z$ is 0.9911;
(b) greater than $z$ is 0.1093;
(c) greater than $z$ is 0.6443;
(d) less than $z$ is 0.0217;
(e) between $-z$ and $z$ is 0.9298.

Solution. (a) We have to find $z$ such that
$$F(z) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{z} e^{-\frac{t^2}{2}}\,dt = 0.9911.$$
For this, you have to look at the table and see for what value the answer comes closest to 0.9911. In this case, it turns out to be $z = 2.37$.

(b) We have to find $z$ such that $P(X > z) = 0.1093$, that is, $F(z) = 0.8907$. By looking at the table we find that the value for which the probability comes closest to 0.8907 is $z = 1.23$.

(c) We have to find $z$ such that $P(X > z) = 0.6443$, that is, $F(z) = 0.3557$. By looking at the table we find that the value for which the probability comes closest to 0.3557 is $z = -0.37$.

(d) We have to find $z$ such that $P(X < z) = 0.0217$, that is, $F(z) = 0.0217$. By looking at the table we find that the value for which the probability comes closest to 0.0217 is $z = -2.02$.

(e) Here we are required to find $z$ such that
$$\int_{-z}^{z} f(t)\,dt = \frac{1}{\sqrt{2\pi}} \int_{-z}^{z} e^{-\frac{t^2}{2}}\,dt = 0.9298.$$
Note that
$$\int_{-z}^{z} f(t)\,dt = \int_{-\infty}^{z} f(t)\,dt - \int_{-\infty}^{-z} f(t)\,dt = F(z) - F(-z).$$
But $F(-z) = 1 - F(z)$ (see solution 5.41). So $2F(z) - 1 = 0.9298$, that is, $F(z) = 0.9649$. By looking at the table we find that the value for which the probability comes closest to 0.9649 is $z = 1.81$.
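Each part of Q 5.21 reduces to solving $F(z) = c$ for a suitable $c$. The sketch below assumes `statistics.NormalDist().inv_cdf` as a substitute for searching the table; the variable names are illustrative.

```python
from statistics import NormalDist

inv_F = NormalDist().inv_cdf  # inverse of the standard normal CDF

z_a = inv_F(0.9911)            # P(Z < z) = 0.9911
z_b = inv_F(1 - 0.1093)        # P(Z > z) = 0.1093, so F(z) = 0.8907
z_c = inv_F(1 - 0.6443)        # P(Z > z) = 0.6443, so F(z) = 0.3557
z_d = inv_F(0.0217)            # P(Z < z) = 0.0217
z_e = inv_F((1 + 0.9298) / 2)  # P(-z < Z < z) = 2 F(z) - 1 = 0.9298

for name, z in [("a", z_a), ("b", z_b), ("c", z_c), ("d", z_d), ("e", z_e)]:
    print(name, round(z, 2))
```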
Q 5.22. If a random variable has a normal distribution, what are the probabilities that it will take on a value within
(a) 1 standard deviation of the mean;
(b) 2 standard deviations of the mean;
(c) 3 standard deviations of the mean;
(d) 4 standard deviations of the mean?

Solution. Note that if $X$ follows a normal distribution with mean $\mu$ and standard deviation $\sigma$, then the random variable $Z = \frac{X-\mu}{\sigma}$ follows the standard normal distribution.

(a) Here we are required to find $P(\mu - \sigma < X < \mu + \sigma)$.
$$P(\mu - \sigma < X < \mu + \sigma) = P(-1 < Z < 1) = F(1) - F(-1) = 0.6827.$$

(b) Here we are required to find $P(\mu - 2\sigma < X < \mu + 2\sigma)$.
$$P(\mu - 2\sigma < X < \mu + 2\sigma) = P(-2 < Z < 2) = F(2) - F(-2) = 0.9545.$$

(c) Here we are required to find $P(\mu - 3\sigma < X < \mu + 3\sigma)$.
$$P(\mu - 3\sigma < X < \mu + 3\sigma) = P(-3 < Z < 3) = F(3) - F(-3) = 0.9973.$$

(d) Here we are required to find $P(\mu - 4\sigma < X < \mu + 4\sigma)$.
$$P(\mu - 4\sigma < X < \mu + 4\sigma) = P(-4 < Z < 4) = F(4) - F(-4) = 0.9999.$$
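The probabilities of falling within $k$ standard deviations of the mean do not depend on $\mu$ or $\sigma$, so they can be tabulated once. A short sketch, again assuming `statistics.NormalDist` in place of the table; the name `within` is illustrative.

```python
from statistics import NormalDist

F = NormalDist().cdf

# P(mu - k*sigma < X < mu + k*sigma) = F(k) - F(-k) for k = 1, 2, 3, 4
within = {k: F(k) - F(-k) for k in (1, 2, 3, 4)}

for k, prob in within.items():
    print(k, round(prob, 4))
```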
Q 5.23. Verify that
(a) $z_{0.005} = 2.575$;
(b) $z_{0.025} = 1.96$.

Solution. (a) Here we are required to find $\lambda \in \mathbb{R}$ such that $P(Z > \lambda) = 0.005$. That is, $F(\lambda) = 0.995$. By looking at the table we find that $\lambda$ lies between 2.57 and 2.58. Using linear interpolation we find that for $\lambda = 2.575$, $F(\lambda) = 0.995$.

(b) Here we are required to find $\lambda \in \mathbb{R}$ such that $P(Z > \lambda) = 0.025$. That is, $F(\lambda) = 0.975$. By looking at the table we find that $\lambda = 1.96$.
Q 5.24. Given a random variable having the normal distribution with $\mu = 16.2$ and $\sigma^2 = 1.5625$, find the probabilities that it will take on a value
(a) greater than 16.8;
(b) less than 14.9;
(c) between 13.6 and 18.8;
(d) between 16.5 and 16.7.

Solution. Let $Z = \frac{X - 16.2}{1.25}$.

(a) $P(X > 16.8) = P(Z > 0.48) = 1 - F(0.48) = 0.3156$.

(b) $P(X < 14.9) = P(Z < -1.04) = F(-1.04) = 0.1491$.

(c) $P(13.6 < X < 18.8) = P(-2.08 < Z < 2.08) = F(2.08) - F(-2.08) = 0.9624$.

(d) $P(16.5 < X < 16.7) = P(0.24 < Z < 0.4) = F(0.4) - F(0.24) = 0.0606$.
Q 5.25. The time, $T$, for a super glue to set can be treated as a random variable having a normal distribution with mean 30 sec. Find its standard deviation if the probability is 0.20 that it will take on a value greater than 39.2 sec.

Solution. It is given that $T \sim N(30, \sigma)$. We want to find $\sigma$ so that $P(T > 39.2) = 0.20$. Therefore,
$$0.20 = P\!\left(\frac{T - 30}{\sigma} > \frac{39.2 - 30}{\sigma}\right) = 1 - F\!\left(\frac{9.2}{\sigma}\right).$$
So $F\!\left(\frac{9.2}{\sigma}\right) = 0.80$. By looking at the table we find that $\frac{9.2}{\sigma} = 0.8416$. Hence $\sigma = 10.93$.
Q 5.26. The time taken by a bag of popcorn to pop can be treated as a random variable having a normal distribution with standard deviation 10 sec. If the probability is 0.8212 that the bag will take less than 282.5 sec. to pop, find the probability that it will take longer than 258.3 sec. to pop.

Solution. Let the random variable $T$ stand for the time needed for a bag of popcorn to pop. We have $T \sim N(\mu, 10)$. In order to find the required probability, first we have to determine $\mu$. For this we make use of the given condition that $P(T < 282.5) = 0.8212$. Now,
$$0.8212 = P\!\left(\frac{T - \mu}{10} < \frac{282.5 - \mu}{10}\right) = F\!\left(\frac{282.5 - \mu}{10}\right).$$
By looking at the table we find that $\frac{282.5 - \mu}{10} = 0.9199$. So $\mu = 273.3$. Therefore,
$$P(T > 258.3) = P\!\left(\frac{T - 273.3}{10} > -1.5\right) = 1 - F(-1.5) = 0.9331.$$
Q 5.27. The time required to assemble a piece of machinery can be treated as a random variable having a normal distribution with $\mu = 12.9$ minutes and $\sigma = 2.0$ minutes. What is the probability that the assembly of a piece of machinery of this kind will take
(a) at least 11.5 minutes;
(b) between 11.0 and 14.8 minutes?

Solution. Let $T$ stand for the time required to assemble a piece of machinery. We have $T \sim N(12.9, 2.0)$.

(a) $$P(T \ge 11.5) = P\!\left(\frac{T - 12.9}{2.0} > -0.7\right) = 1 - F(-0.7) = 0.7580.$$

(b) $$P(11.0 < T < 14.8) = P\!\left(-0.95 < \frac{T - 12.9}{2.0} < 0.95\right) = 2F(0.95) - 1 = 0.6579.$$
Q 5.28. Find the quartiles $-z_{0.25}$, $z_{0.50}$, and $z_{0.25}$ of the standard normal distribution.

Solution. $z_{0.25}$ is a real number such that $P(Z > z_{0.25}) = 0.25$, which means that $F(z_{0.25}) = 1 - 0.25 = 0.75$. By looking at the table, the probability 0.75 is attained at 0.6745. Hence $z_{0.25} = 0.6745$. Similarly, $z_{0.50} = 0$. Therefore the first, second, and third quartiles are respectively −0.6745, 0, and 0.6745.
Q 5.29. In a photographic process, the time to process 8 × 10 prints from a memory card may be looked upon as a random variable having the normal distribution with $\mu = 10.28$ seconds and $\sigma = 0.12$ seconds. Find the probability that it will take
(a) anywhere from 10.00 to 10.50 seconds to process one of the prints;
(b) at least 10.20 seconds to process one of the prints;
(c) at most 10.35 seconds to process one of the prints.

Solution. Suppose the random variable $T$ measures the time to process 8 × 10 prints from a memory card. We have $T \sim N(10.28, 0.12)$.

(a) $$P(10.0 < T < 10.5) = P\!\left(-2.33 < \frac{T - 10.28}{0.12} < 1.83\right) = F(1.83) - F(-2.33) = 0.9568.$$

(b) $$P(T \ge 10.20) = 1 - P\!\left(\frac{T - 10.28}{0.12} < -0.66\right) = 1 - F(-0.66) = 0.7475.$$

(c) $$P(T \le 10.35) = P\!\left(\frac{T - 10.28}{0.12} \le 0.5833\right) = F(0.5833) = 0.7201.$$
Q 5.30. With reference to exercise 5.29, for which value is the probability 0.95 that it will be exceeded by the time it takes to process one of the prints?

Solution. Here we are supposed to find a cut-off point $z_0$ for which $P(T > z_0) = 0.95$. Now,
$$0.95 = P\!\left(\frac{T - 10.28}{0.12} > \frac{z_0 - 10.28}{0.12}\right) = 1 - F\!\left(\frac{z_0 - 10.28}{0.12}\right).$$
So $F\!\left(\frac{z_0 - 10.28}{0.12}\right) = 0.05$. By looking at the table, we see that $\frac{z_0 - 10.28}{0.12} = -1.645$. Hence $z_0 = 10.082$.
Q 5.31. The ideal specification for a certain kind of washers is that the inside diameter must be within 0.005 of 0.300 inch. If the inside diameters of the washers supplied by the manufacturer may be looked upon as a random variable having the normal distribution with $\mu = 0.302$ inch and $\sigma = 0.003$ inch, what percentage of these washers will meet the ideal specifications?

Solution. Let the random variable $D$ measure the diameter of a washer. We have $D \sim N(0.302, 0.003)$. We want to find the probability that $0.295 \le D \le 0.305$. Now,
$$P(0.295 \le D \le 0.305) = P\!\left(-2.33 \le \frac{D - 0.302}{0.003} \le 1.0\right) = F(1.0) - F(-2.33) = 0.8314.$$
That is, about 83.14% of the washers meet the ideal specifications.
Q 5.32. With reference to example 2.2, verify that if the variability of the filling machine is reduced to $\sigma = 0.025$ ounce, this will lower the required average amount of coffee to 4.05 ounces, yet keep 98% of the jars above 4 ounces.

Solution. Let $X$ be the random variable that measures the amount of coffee filled into a jar, with $X \sim N(\mu, 0.025)$. We have to determine $\mu$ if $P(X > 4) = 0.98$. Now,
$$0.98 = P\!\left(\frac{X - \mu}{0.025} > \frac{4 - \mu}{0.025}\right) = 1 - F\!\left(\frac{4 - \mu}{0.025}\right).$$
So $F\!\left(\frac{4 - \mu}{0.025}\right) = 0.02$. By looking at the table we find that $\frac{4 - \mu}{0.025} = -2.054$. Hence $\mu = 4.05$.
Q 5.33. A stamping machine produces can tops whose diameters are normally distributed with $\sigma = 0.01$ inch. At what mean diameter should the machine be set so that no more than 5% of the can tops produced have diameters exceeding 3 inches?

Solution. Suppose the random variable $D$ stands for the diameter of a can top, so that $D \sim N(\mu, 0.01)$. We want to find $\mu$ so that $P(D > 3) = 0.05$. This implies that
$$F\!\left(\frac{3 - \mu}{0.01}\right) = 0.95.$$
By looking at the table we see that $\frac{3 - \mu}{0.01} = 1.6449$. That is, $\mu = 2.9835$.
Q 5.34. Extruded plastic rods are automatically cut into nominal lengths of 6 inches. Actual lengths are normally distributed about a mean of 6 inches and their standard deviation is 0.06 inch.
(a) What proportion of the rods have lengths that are outside the tolerance limits of 5.9 inches to 6.1 inches?
(b) To what value does the standard deviation need to be reduced if 99% of the rods must be within tolerance?

Solution. The length, $L$, of a rod follows a normal distribution with parameters $\mu = 6$ and $\sigma = 0.06$.

(a) $$P(5.9 \le L \le 6.1) = P\!\left(-1.66 \le \frac{L - 6}{0.06} \le 1.66\right) = F(1.66) - F(-1.66) = 0.9030.$$
So the proportion of rods that are outside these limits is $1 - 0.9030 = 0.097$.

(b) We have to find $\sigma$ such that $L \sim N(6, \sigma)$ and $P(5.9 \le L \le 6.1) = 0.99$. Now,
$$0.99 = P\!\left(\frac{-0.1}{\sigma} \le \frac{L - 6}{\sigma} \le \frac{0.1}{\sigma}\right) = F\!\left(\frac{0.1}{\sigma}\right) - F\!\left(\frac{-0.1}{\sigma}\right) = 2F\!\left(\frac{0.1}{\sigma}\right) - 1.$$
So $F\!\left(\frac{0.1}{\sigma}\right) = 0.995$. By looking at the table we see that $\frac{0.1}{\sigma} = 2.576$. Hence $\sigma = 0.0388$.
Q 5.35. If a random variable has the binomial distribution with $n = 40$ and $p = 0.40$, use the normal approximation to determine the probabilities that it will take on
(a) the value 22;
(b) a value less than 8.

Solution. Here we are approximating a discrete model using a continuous model, so the continuity correction needs to be applied. This means that the cut-off values are to be thought of as intervals; for example, 22 will be thought of as the interval (21.5, 22.5). The approximation is guaranteed by the De Moivre-Laplace theorem: the given binomial variable, $X$, can be approximated through the standard normal variable
$$Z = \frac{X - np}{\sqrt{np(1-p)}}.$$
Here $np = 16$ and $\sqrt{np(1-p)} = \sqrt{9.6} \approx 3.098$.

(a) $$P(X = 22) = P(21.5 < X < 22.5) \approx P\!\left(\frac{21.5 - 16}{3.098} < Z < \frac{22.5 - 16}{3.098}\right) = F(2.098) - F(1.775) \approx 0.020.$$

(b) $$P(X < 8) = P(X < 7.5) \approx P\!\left(Z < \frac{7.5 - 16}{3.098}\right) = F(-2.7433) = 0.003.$$
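The quality of the normal approximation with continuity correction can be judged by comparing it with the exact binomial probabilities. The following sketch uses only the Python standard library; the helper names `binom_pmf` and `normal_approx` are hypothetical and introduced here for illustration.

```python
from math import comb, sqrt
from statistics import NormalDist


def binom_pmf(k, n, p):
    """Exact P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def normal_approx(lo, hi, n, p):
    """Approximate P(lo <= X <= hi), lo and hi integers, using the
    continuity correction: the interval becomes (lo - 0.5, hi + 0.5)."""
    mu, sd = n * p, sqrt(n * p * (1 - p))
    F = NormalDist().cdf
    return F((hi + 0.5 - mu) / sd) - F((lo - 0.5 - mu) / sd)


n, p = 40, 0.40

approx_22 = normal_approx(22, 22, n, p)   # part (a), approximation
exact_22 = binom_pmf(22, n, p)            # part (a), exact

# Part (b): P(X < 8) = P(X <= 7); the lower limit is left open at -infinity.
approx_lt8 = NormalDist().cdf((7.5 - n * p) / sqrt(n * p * (1 - p)))
exact_lt8 = sum(binom_pmf(k, n, p) for k in range(8))

print(approx_22, exact_22)
print(approx_lt8, exact_lt8)
```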
Q 5.36. A manufacturer knows that on an average 2% of the electric toasters that he makes will require repairs within 90 days after they are sold. Use the normal approximation to the binomial distribution to determine the probability that among 1200 of these toasters at least 30 will require repairs within the first 90 days after they are sold.

Solution. Let the random variable $X$ denote the number of toasters that require repairs in the first 90 days. Here $X$ follows a binomial distribution with $n = 1200$ and $p = 0.02$. We want to approximate $X$ through the standard normal variable $Z = \frac{X - np}{\sqrt{np(1-p)}}$, where $np = 24$ and $\sqrt{np(1-p)} \approx 4.85$.
$$P(X \ge 30) = 1 - P(X < 30) = 1 - P(X < 29.5) \approx 1 - P\!\left(Z < \frac{29.5 - 24}{4.85}\right) = 1 - F(1.134) = 0.1283.$$
Q 5.37. The probability that an electronic component will fail in less than 1000 hours of continuous use is 0.25. Use the normal approximation to the binomial distribution to find the probability that among 200 such components fewer than 45 will fail in less than 1000 hours of continuous use.

Solution. Let the random variable $X$ denote the number of electronic components that will fail in less than 1000 hours of continuous use. Here $X$ follows a binomial distribution with $n = 200$ and $p = 0.25$. We want to approximate $X$ through the standard normal variable $Z = \frac{X - np}{\sqrt{np(1-p)}}$, where $np = 50$ and $\sqrt{np(1-p)} \approx 6.12$.
$$P(X < 45) = P(X < 44.5) \approx P\!\left(Z < \frac{44.5 - 50}{6.12}\right) = F(-0.8981) = 0.1845.$$
Q 5.38. A safety engineer feels that 30% of all industrial accidents in her plant are caused by failure of employees to follow instructions. If this figure is correct, find approximately the probability that among 84 accidents in this plant between 20 and 30 (inclusive) accidents will be due to the failure of the employees.

Solution. Let the random variable $X$ denote the number of accidents, among the 84, that are due to failure of employees to follow instructions. Note that $X$ follows a binomial distribution with $n = 84$ and $p = 0.30$. We can approximate $X$ through the standard normal variable $Z = \frac{X - np}{\sqrt{np(1-p)}}$, where $np = 25.2$ and $\sqrt{np(1-p)} = 4.2$.
$$\begin{aligned}
P(20 \le X \le 30) &= P(X \le 30) - P(X < 20) \\
&= P(X \le 30.5) - P(X < 19.5) \\
&\approx P\!\left(Z \le \frac{30.5 - 25.2}{4.2}\right) - P\!\left(Z \le \frac{19.5 - 25.2}{4.2}\right) \\
&= F(1.262) - F(-1.357) \\
&= 0.8091.
\end{aligned}$$
Q 5.39. If 62% of all clouds seeded with silver iodide show spectacular growth, what is the probability that among 40 clouds seeded with silver iodide at most 20 will show spectacular growth?

Solution. Let the random variable $X$ denote the number of clouds that show spectacular growth. Note that $X$ follows a binomial distribution with $n = 40$ and $p = 0.62$. We can approximate $X$ through the standard normal variable $Z = \frac{X - np}{\sqrt{np(1-p)}}$, where $np = 24.8$ and $\sqrt{np(1-p)} \approx 3.069$.
$$P(X \le 20) = P(X \le 20.5) \approx P\!\left(Z \le \frac{20.5 - 24.8}{3.069}\right) = F(-1.40) = 0.081.$$
Q 5.40. Find the probabilities that the proportion of heads will be anywhere from 0.49 to 0.51 when a balanced coin is flipped
(a) 1,000 times;
(b) 10,000 times.

Solution. Let the random variable $X$ denote the number of heads.

(a) Here $X$ follows a binomial distribution with $n = 1000$ and $p = 0.5$, so $np = 500$ and $\sqrt{np(1-p)} \approx 15.81$. Now,
$$\begin{aligned}
P(0.49 \le X/n \le 0.51) &= P(490 \le X \le 510) \\
&= P(X \le 510) - P(X < 490) \\
&= P(X \le 510.5) - P(X < 489.5) \\
&\approx P\!\left(Z \le \frac{510.5 - 500}{15.81}\right) - P\!\left(Z \le \frac{489.5 - 500}{15.81}\right) \\
&= F(0.6640) - F(-0.6640) \\
&= 0.4933.
\end{aligned}$$

(b) Here $X$ follows a binomial distribution with $n = 10{,}000$ and $p = 0.5$, so $np = 5000$ and $\sqrt{np(1-p)} = 50$. Now,
$$\begin{aligned}
P(0.49 \le X/n \le 0.51) &= P(4900 \le X \le 5100) \\
&= P(X \le 5100) - P(X < 4900) \\
&= P(X \le 5100.5) - P(X < 4899.5) \\
&\approx P\!\left(Z \le \frac{5100.5 - 5000}{50}\right) - P\!\left(Z \le \frac{4899.5 - 5000}{50}\right) \\
&= F(2.01) - F(-2.01) \\
&= 0.9555.
\end{aligned}$$
Q 5.41. Let $F$ be the cumulative distribution function for the standard normal distribution. Prove that $F(-z) = 1 - F(z)$.

Solution. By substituting $x = -t$, we can show that
$$\int_{-\infty}^{-z} e^{-\frac{t^2}{2}}\,dt = \int_{z}^{\infty} e^{-\frac{t^2}{2}}\,dt.$$
Now,
$$1 = \int_{-\infty}^{\infty} f(t)\,dt = \int_{-\infty}^{z} f(t)\,dt + \int_{z}^{\infty} f(t)\,dt = F(z) + F(-z).$$
Hence $F(-z) = 1 - F(z)$.
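Q 5.42. Verify that the parameter $\mu$ in the formula for the normal density function is in fact the mean.

Solution. The density function for the normal distribution is given by
$$f(x) = \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}, \quad -\infty < x < \infty.$$
Now,
$$E(X) = \int_{-\infty}^{\infty} x f(x)\,dx = \frac{1}{\sqrt{2\pi}\,\sigma} \int_{-\infty}^{\infty} x\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}\,dx.$$
Put $t = \frac{x-\mu}{\sqrt{2}\,\sigma}$, so that $dx = \sqrt{2}\,\sigma\,dt$, and
$$\begin{aligned}
E(X) &= \frac{1}{\sqrt{\pi}} \int_{-\infty}^{\infty} (\sqrt{2}\,\sigma t + \mu)\, e^{-t^2}\,dt \\
&= \frac{1}{\sqrt{\pi}} \left( \sqrt{2}\,\sigma \int_{-\infty}^{\infty} t\, e^{-t^2}\,dt + \mu \int_{-\infty}^{\infty} e^{-t^2}\,dt \right) \\
&= \frac{1}{\sqrt{\pi}} \left( 0 + \mu \sqrt{\pi} \right) \\
&= \mu.
\end{aligned}$$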
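Q 5.43. Verify that the parameter $\sigma^2$ in the formula for the normal density function is in fact the variance.

Solution. The density function for the normal distribution is given by
$$f(x) = \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}, \quad -\infty < x < \infty.$$
The variance is given by
$$E((X-\mu)^2) = \int_{-\infty}^{\infty} (x-\mu)^2 f(x)\,dx = \frac{1}{\sqrt{2\pi}\,\sigma} \int_{-\infty}^{\infty} (x-\mu)^2\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}\,dx.$$
Put $t = \frac{x-\mu}{\sqrt{2}\,\sigma}$, so that $dx = \sqrt{2}\,\sigma\,dt$, and
$$\begin{aligned}
E((X-\mu)^2) &= \frac{1}{\sqrt{\pi}} \int_{-\infty}^{\infty} 2\sigma^2 t^2 e^{-t^2}\,dt \\
&= \frac{4\sigma^2}{\sqrt{\pi}} \int_{0}^{\infty} t\,(t e^{-t^2})\,dt \\
&= \frac{4\sigma^2}{\sqrt{\pi}} \left( \left[ -\frac{1}{2}\, t e^{-t^2} \right]_{0}^{\infty} + \frac{1}{2} \int_{0}^{\infty} e^{-t^2}\,dt \right) \\
&= \frac{4\sigma^2}{\sqrt{\pi}} \left( 0 + \frac{1}{2} \cdot \frac{\sqrt{\pi}}{2} \right) \\
&= \sigma^2.
\end{aligned}$$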
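The two results just proved (mean $\mu$ and variance $\sigma^2$) can also be checked numerically. The sketch below uses a simple midpoint rule rather than any library integrator; the parameter values and the helper names `normal_pdf` and `integrate` are arbitrary illustrative choices, and the integration range is truncated at 12 standard deviations on the assumption that the tails beyond it are negligible.

```python
from math import exp, pi, sqrt


def normal_pdf(x, mu, sigma):
    """Density of N(mu, sigma) as in equation (1)."""
    return exp(-((x - mu) ** 2) / (2 * sigma**2)) / (sqrt(2 * pi) * sigma)


def integrate(g, lo, hi, steps=20_000):
    """Midpoint rule for the integral of g over [lo, hi]."""
    h = (hi - lo) / steps
    return h * sum(g(lo + (i + 0.5) * h) for i in range(steps))


mu, sigma = 1.5, 0.8                       # arbitrary illustrative parameters
lo, hi = mu - 12 * sigma, mu + 12 * sigma  # truncated integration range

mean = integrate(lambda x: x * normal_pdf(x, mu, sigma), lo, hi)
var = integrate(lambda x: (x - mu) ** 2 * normal_pdf(x, mu, sigma), lo, hi)

print(mean, var)  # compare with mu and sigma**2
```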
4 Exercises 5.71-5.93
Q 5.71. Two scanners are needed for an experiment. Out of the 5 available, two have electronic defects, another one has a defect in memory, and two are in good working order. Two units are selected at random.
(a) Find the joint distribution of $X_1$ = the number with electronic defects, and $X_2$ = the number with defect in memory.
(b) Find the probability of 0 or 1 total defects among the two selected.
(c) Find the marginal probability distribution of $X_1$.
(d) Find the conditional probability distribution of $X_1$ given $X_2 = 0$.

Solution. $X_1$ can take values 0, 1, and 2. $X_2$ can take values 0 and 1.

(a) The joint distribution $f(x, y) = P(X_1 = x, X_2 = y)$ is tabulated below (rows give $x$, columns give $y$):

          y = 0    y = 1
x = 0      0.1      0.2
x = 1      0.4      0.2
x = 2      0.1      0

(b) Here we are required to find $P(X_1 + X_2 \le 1)$, which is calculated as follows:
$$P(X_1 + X_2 \le 1) = f(0,0) + f(0,1) + f(1,0) = 0.1 + 0.2 + 0.4 = 0.7.$$

(c) The marginal distribution of $X_1$ is given by $f_1(x) = \sum_{y} f(x, y)$, where $x = 0, 1, 2$. Hence $f_1(0) = 0.3$, $f_1(1) = 0.6$, and $f_1(2) = 0.1$.

(d) The conditional distribution of $X_1$ given $X_2 = 0$ is given by $f_1(x \mid 0) = \frac{f(x, 0)}{f_2(0)}$, where $f_2(0) = 0.1 + 0.4 + 0.1 = 0.6$. Hence $f_1(0 \mid 0) = \frac{1}{6}$, $f_1(1 \mid 0) = \frac{2}{3}$, and $f_1(2 \mid 0) = \frac{1}{6}$.

Q 5.72. Two random variables $X$ and $Y$ are independent, and each has a binomial distribution with success probability 0.4 and 2 trials.
(a) Find the joint probability distribution function.
(b) Find the probability that $X < Y$.

Solution. Recall that if $X$ follows a binomial distribution with parameters $n$ and $p$, then $P(X = x) = \binom{n}{x} p^x (1-p)^{n-x}$.

(a) Since $X$ and $Y$ are independent, the joint probability distribution function is the product of the individual distribution functions. Hence
$$f(x, y) = \binom{2}{x} 0.4^{x}\, 0.6^{2-x} \binom{2}{y} 0.4^{y}\, 0.6^{2-y} = \binom{2}{x}\binom{2}{y}\, 0.4^{x+y}\, 0.6^{4-x-y},$$
for all $x = 0, 1, 2$ and $y = 0, 1, 2$.

(b) The probability that $X < Y$ is given by
$$f(0,1) + f(0,2) + f(1,2) = \binom{2}{0}\binom{2}{1}(0.4)(0.6)^3 + \binom{2}{0}\binom{2}{2}(0.4)^2(0.6)^2 + \binom{2}{1}\binom{2}{2}(0.4)^3(0.6) = 0.3072.$$
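Small discrete joint distributions such as those in Q 5.71 and Q 5.72 can be obtained by direct enumeration. The sketch below is one way to do this with the Python standard library; the labels "E", "M", "G" for the scanners and all variable names are illustrative assumptions.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations
from math import comb

# Q 5.71: E = electronic defect, M = memory defect, G = good working order.
units = ["E", "E", "M", "G", "G"]
pairs = list(combinations(range(len(units)), 2))  # the 10 equally likely selections
counts = Counter(
    (sum(units[i] == "E" for i in pair), sum(units[i] == "M" for i in pair))
    for pair in pairs
)
joint_571 = {xy: Fraction(c, len(pairs)) for xy, c in counts.items()}

# Conditional distribution of X1 given X2 = 0.
f2_0 = sum(prob for (x1, x2), prob in joint_571.items() if x2 == 0)
cond_571 = {x1: prob / f2_0 for (x1, x2), prob in joint_571.items() if x2 == 0}


# Q 5.72: product of two Binomial(2, 0.4) probability functions.
def binom_pmf(k, n=2, p=0.4):
    return comb(n, k) * p**k * (1 - p) ** (n - k)


joint_572 = {(x, y): binom_pmf(x) * binom_pmf(y) for x in range(3) for y in range(3)}
p_x_less_y = sum(prob for (x, y), prob in joint_572.items() if x < y)

print(joint_571, cond_571, p_x_less_y)
```

Exact fractions are used for Q 5.71 so that the table entries can be compared without rounding.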
Q 5.73. If two random variables have the joint density
$$f(x, y) = \begin{cases} xy & \text{for } 0 < x < 2,\ 0 < y < 1 \\ 0 & \text{otherwise} \end{cases}$$
find the probabilities that
(a) both random variables will take on values less than 1;
(b) the sum of the values taken by both random variables is less than 1.

Solution. Let the two random variables be $X$ and $Y$.

(a) The probability that $X < 1$ and $Y < 1$ is given by
$$\int_{0}^{1}\int_{0}^{1} f(x, y)\,dy\,dx = \int_{0}^{1}\int_{0}^{1} xy\,dy\,dx = \int_{0}^{1} x\,dx \int_{0}^{1} y\,dy = \frac{1}{2} \times \frac{1}{2} = \frac{1}{4}.$$

(b) The probability that $X + Y < 1$ is given by
$$\int_{0}^{1}\int_{0}^{1-x} f(x, y)\,dy\,dx = \int_{0}^{1}\int_{0}^{1-x} xy\,dy\,dx = \int_{0}^{1} x \left( \int_{0}^{1-x} y\,dy \right) dx = \int_{0}^{1} x\, \frac{(x-1)^2}{2}\,dx = \frac{1}{24}.$$
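Both double integrals in Q 5.73 can be sanity-checked numerically. The following sketch applies a midpoint rule in each variable, with the inner limits allowed to depend on $x$; the function `double_midpoint` and its arguments are hypothetical names introduced for this illustration, not a library routine.

```python
def double_midpoint(f, x_lo, x_hi, y_lo, y_hi, n=400):
    """Midpoint-rule estimate of the integral of f(x, y) over
    x in [x_lo, x_hi] and y in [y_lo(x), y_hi(x)]."""
    hx = (x_hi - x_lo) / n
    total = 0.0
    for i in range(n):
        x = x_lo + (i + 0.5) * hx
        a, b = y_lo(x), y_hi(x)
        hy = (b - a) / n
        total += hx * hy * sum(f(x, a + (j + 0.5) * hy) for j in range(n))
    return total


def density(x, y):
    """Joint density of Q 5.73 inside 0 < x < 2, 0 < y < 1."""
    return x * y


# (a) P(X < 1, Y < 1): integrate over the unit square.
p_a = double_midpoint(density, 0, 1, lambda x: 0, lambda x: 1)

# (b) P(X + Y < 1): for each x in (0, 1), y runs from 0 to 1 - x.
p_b = double_midpoint(density, 0, 1, lambda x: 0, lambda x: 1 - x)

print(p_a, p_b)  # compare with 1/4 and 1/24
```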
Q 5.74. With reference to exercise 5.73, find the marginal densities of the two random variables.

Solution. (a) The marginal density of $X$ is given by
$$f_1(x) = \int_{-\infty}^{\infty} f(x, y)\,dy = \int_{0}^{1} xy\,dy = \frac{x}{2}, \quad \text{for } 0 < x < 2,$$
and $f_1(x) = 0$ for $x \notin (0, 2)$.

(b) The marginal density of $Y$ is given by
$$f_2(y) = \int_{-\infty}^{\infty} f(x, y)\,dx = \int_{0}^{2} xy\,dx = 2y, \quad \text{for } 0 < y < 1,$$
and $f_2(y) = 0$ for $y \notin (0, 1)$.
Q 5.75. With reference to exercise 5.73, find the joint cumulative distribution function of the two random variables. Also find the individual cumulative distribution functions. Are the two variables independent?

Solution. The joint cumulative distribution function is given by
$$F(x, y) = P(X \le x, Y \le y) = \int_{-\infty}^{x}\int_{-\infty}^{y} f(s, t)\,dt\,ds. \tag{2}$$
We have to evaluate the double integral in equation (2) for all possible values of $(x, y)$. In each case the region of integration is split into pieces; since $f(s, t) = 0$ unless $0 < s < 2$ and $0 < t < 1$, every piece lying outside this rectangle contributes 0. The cases are as follows:

(i) $0 < x < 2$ and $0 < y < 1$. Here we have
$$F(x, y) = 0 + 0 + 0 + \int_{0}^{x}\int_{0}^{y} st\,dt\,ds = \int_{0}^{x} s\,ds \int_{0}^{y} t\,dt = \frac{1}{4} x^2 y^2.$$

(ii) $x \ge 2$ and $0 < y < 1$. Here we have
$$F(x, y) = \int_{0}^{2}\int_{0}^{y} st\,dt\,ds = y^2.$$

(iii) $x \ge 2$ and $y \ge 1$. Here we have
$$F(x, y) = \int_{0}^{2}\int_{0}^{1} st\,dt\,ds = 1.$$

(iv) $0 < x < 2$ and $y \ge 1$. Here we have
$$F(x, y) = \int_{0}^{x}\int_{0}^{1} st\,dt\,ds = \frac{1}{4} x^2.$$

(v) $x \le 0$ or $y \le 0$. Here we have $F(x, y) = 0$.

All the above cases can be summarized as
$$F(x, y) = \begin{cases}
\frac{1}{4} x^2 y^2 & 0 < x < 2,\ 0 < y < 1 \\
y^2 & x \ge 2,\ 0 < y < 1 \\
\frac{1}{4} x^2 & 0 < x < 2,\ y \ge 1 \\
1 & x \ge 2,\ y \ge 1 \\
0 & x \le 0 \text{ or } y \le 0
\end{cases}$$

Next we calculate the individual cumulative distributions of $X$ and $Y$. If $f_1$ is the marginal density for $X$, then the cumulative distribution for $X$ is given by $F_1(x) = \int_{-\infty}^{x} f_1(s)\,ds$. Similarly we can write the cumulative distribution for $Y$. Therefore,
$$F_1(x) = \begin{cases} 0 & x \le 0 \\ \frac{1}{4} x^2 & 0 < x < 2 \\ 1 & x \ge 2 \end{cases}$$
Similarly,
$$F_2(y) = \begin{cases} 0 & y \le 0 \\ y^2 & 0 < y < 1 \\ 1 & y \ge 1 \end{cases}$$
It is easy to verify that $F(x, y) = F_1(x) F_2(y)$. Hence the two random variables are independent.

Q 5.76. If two random variables have the joint density
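$$f(x, y) = \begin{cases} \frac{6}{5}(x + y^2) & \text{for } 0 < x < 1,\ 0 < y < 1 \\ 0 & \text{otherwise} \end{cases}$$
find the probability that $0.2 < X < 0.5$ and $0.4 < Y < 0.6$.

Solution. The probability that $0.2 < X < 0.5$ and $0.4 < Y < 0.6$ is given by
$$\begin{aligned}
\int_{0.2}^{0.5}\int_{0.4}^{0.6} f(x, y)\,dy\,dx &= \int_{0.2}^{0.5}\int_{0.4}^{0.6} \frac{6}{5}(x + y^2)\,dy\,dx \\
&= \frac{6}{5} \int_{0.2}^{0.5} \left( \frac{x}{5} + \frac{19}{375} \right) dx \\
&= \frac{543}{12500}.
\end{aligned}$$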
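Q 5.77. With reference to exercise 5.76, find the joint distribution function of the two random variables and use it to verify the value obtained for the probability.

Solution. Proceeding along similar lines as exercise 5.75, we obtain
$$F(x, y) = \begin{cases}
\frac{3}{5} x^2 y + \frac{2}{5} x y^3 & 0 < x < 1,\ 0 < y < 1 \\
\frac{3}{5} y + \frac{2}{5} y^3 & x \ge 1,\ 0 < y < 1 \\
\frac{3}{5} x^2 + \frac{2}{5} x & 0 < x < 1,\ y \ge 1 \\
1 & x \ge 1,\ y \ge 1 \\
0 & x \le 0 \text{ or } y \le 0
\end{cases}$$
Note that
$$\int_{a}^{b}\int_{c}^{d} f(x, y)\,dy\,dx = F(b, d) - F(a, d) - F(b, c) + F(a, c).$$
With $a = 0.2$, $b = 0.5$, $c = 0.4$, and $d = 0.6$ this gives
$$F(0.5, 0.6) - F(0.2, 0.6) - F(0.5, 0.4) + F(0.2, 0.4) = 0.1332 - 0.03168 - 0.0728 + 0.01472 = 0.04344 = \frac{543}{12500},$$
which agrees with the value obtained in exercise 5.76.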
Q 5.78. With reference to exercise 5.76, find both marginal densities and use them to find the probabilities that
(a) $X > 0.8$;
(b) $Y < 0.5$.

Solution. The marginal densities for $X$ and $Y$ will be denoted by $f_1$ and $f_2$ respectively. These are computed as follows:
$$f_1(x) = \int_{-\infty}^{\infty} f(x, y)\,dy = \int_{0}^{1} f(x, y)\,dy = \begin{cases} \int_{0}^{1} \frac{6}{5}(x + y^2)\,dy = \frac{2}{5}(3x + 1) & \text{for } 0 < x < 1 \\ 0 & \text{otherwise} \end{cases}$$
and
$$f_2(y) = \int_{-\infty}^{\infty} f(x, y)\,dx = \int_{0}^{1} f(x, y)\,dx = \begin{cases} \int_{0}^{1} \frac{6}{5}(x + y^2)\,dx = \frac{3}{5}(1 + 2y^2) & \text{for } 0 < y < 1 \\ 0 & \text{otherwise.} \end{cases}$$

(a) $$P(X > 0.8) = \int_{0.8}^{\infty} f_1(x)\,dx = \int_{0.8}^{1} \frac{2}{5}(3x + 1)\,dx = 0.296.$$

(b) $$P(Y < 0.5) = \int_{-\infty}^{0.5} f_2(y)\,dy = \int_{0}^{0.5} \frac{3}{5}(1 + 2y^2)\,dy = 0.35.$$
Q 5.79. With reference to exercise 5.76, find
(a) an expression for $f_c(x \mid y)$ for $0 < y < 1$;
(b) an expression for $f_c(x \mid 0.5)$;
(c) the mean of the conditional density of the first random variable when the second takes on the value 0.5.

Solution. (a) For $0 < y < 1$, we have the expression:
$$f_c(x \mid y) = \frac{f(x, y)}{f_2(y)} = \frac{\frac{6}{5}(x + y^2)}{\frac{3}{5}(1 + 2y^2)} = \frac{2(x + y^2)}{1 + 2y^2}, \quad \text{for } 0 < x < 1,$$
and $f_c(x \mid y) = 0$ when $x \notin (0, 1)$.

(b) $$f_c(x \mid 0.5) = \frac{2(x + (0.5)^2)}{1 + 2(0.5)^2} = \frac{4}{3} x + \frac{1}{3}, \quad \text{for } 0 < x < 1,$$
and $f_c(x \mid 0.5) = 0$ when $x \notin (0, 1)$.

(c) The mean of a random variable having the probability density $f_c(x \mid 0.5)$ is given by
$$\int_{0}^{1} \left( \frac{4}{3} x^2 + \frac{1}{3} x \right) dx = \frac{11}{18}.$$
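Q 5.81. If three random variables have the joint density
$$f(x, y, z) = \begin{cases} k(x + y)e^{-z} & \text{for } 0 < x < 1,\ 0 < y < 2,\ z > 0 \\ 0 & \text{otherwise} \end{cases}$$
find
(a) the value of $k$;
(b) the probability that $X < Y$ and $Z > 1$.

Solution. (a) Since $f(x, y, z)$ is a probability density,
$$\begin{aligned}
1 &= \int_{-\infty}^{\infty}\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} f(x, y, z)\,dx\,dy\,dz \\
&= k \int_{0}^{1}\int_{0}^{2}\int_{0}^{\infty} (x + y)e^{-z}\,dz\,dy\,dx \\
&= k \int_{0}^{1}\int_{0}^{2} (x + y)\,dy\,dx \int_{0}^{\infty} e^{-z}\,dz \\
&= k \int_{0}^{1} (2x + 2)\,dx \cdot 1 \\
&= 3k.
\end{aligned}$$
Therefore, $k = \frac{1}{3}$.

(b) $$P(X < Y \text{ and } Z > 1) = k \int_{0}^{1}\int_{x}^{2} (x + y)\,dy\,dx \int_{1}^{\infty} e^{-z}\,dz = \frac{1}{3e} \int_{0}^{1}\int_{x}^{2} (x + y)\,dy\,dx = \frac{5}{6e}.$$

Q 5.83. A pair of random variables has the circular normal distribution if their joint density is given by
$$f(x, y) = \frac{1}{2\pi\sigma^2}\, e^{-[(x-\mu_1)^2 + (y-\mu_2)^2]/2\sigma^2} \quad \text{for } x, y \in \mathbb{R}.$$
(a) If $\mu_1 = 2$, $\mu_2 = -2$, and $\sigma = 10$, find the probability that $-8 < X < 14$ and $-9 < Y < 3$.
(b) If $\mu_1 = 0$, $\mu_2 = 0$, and $\sigma = 3$, find the probability that $(X, Y)$ is in the region between the two circles $x^2 + y^2 = 9$ and $x^2 + y^2 = 36$.

Solution. (a)
$$\begin{aligned}
P(-8 < X < 14,\ -9 < Y < 3) &= \frac{1}{200\pi} \int_{-8}^{14} e^{-(x-2)^2/200}\,dx \int_{-9}^{3} e^{-(y+2)^2/200}\,dy \\
&= \left( \frac{1}{\sqrt{2\pi}\,\sigma} \int_{-8}^{14} e^{-(x-2)^2/200}\,dx \right) \left( \frac{1}{\sqrt{2\pi}\,\sigma} \int_{-9}^{3} e^{-(y+2)^2/200}\,dy \right) \\
&= \left[ F\!\left(\frac{14 - 2}{10}\right) - F\!\left(\frac{-8 - 2}{10}\right) \right] \left[ F\!\left(\frac{3 + 2}{10}\right) - F\!\left(\frac{-9 + 2}{10}\right) \right] \\
&= (F(1.2) - F(-1))(F(0.5) - F(-0.7)) \\
&= (0.7263)(0.4495) = 0.3265.
\end{aligned}$$

(b) You may choose to skip this part as it involves change of variables in a double integral. However, curious readers may want to read it. For the sake of convenience, we represent the two circles in polar coordinates: $x = r\cos\theta$, $y = r\sin\theta$, so that the region between $x^2 + y^2 = 9$ and $x^2 + y^2 = 36$ corresponds to $3 < r < 6$, $0 < \theta < 2\pi$. Let us denote this region by $A$. Now,
$$\begin{aligned}
\iint_{A} \frac{1}{2\pi\sigma^2}\, e^{-(x^2+y^2)/2\sigma^2}\,dx\,dy &= \frac{1}{2\pi\sigma^2} \int_{3}^{6}\int_{0}^{2\pi} e^{-r^2/2\sigma^2}\, r\,d\theta\,dr \\
&= \frac{1}{2\pi\sigma^2} \int_{0}^{2\pi} d\theta \int_{3}^{6} e^{-r^2/2\sigma^2}\, r\,dr \\
&= \frac{1}{\sqrt{e}} - \frac{1}{e^2}.
\end{aligned}$$
Here $x$ and $y$ are changed to $r$ and $\theta$ respectively, so $dx\,dy$ has to be replaced by $|J|\,dr\,d\theta$, where $J$ is the determinant of the Jacobian matrix
$$\begin{pmatrix} \frac{\partial x}{\partial r} & \frac{\partial x}{\partial \theta} \\ \frac{\partial y}{\partial r} & \frac{\partial y}{\partial \theta} \end{pmatrix}.$$
In this case, the matrix is
$$\begin{pmatrix} \cos\theta & -r\sin\theta \\ \sin\theta & r\cos\theta \end{pmatrix}$$
and one can easily see that its determinant is $r$.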
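Both parts of Q 5.83 can be evaluated numerically as a check. The sketch below assumes `statistics.NormalDist` for the one-dimensional normal probabilities in part (a); for part (b) it uses the closed form of the radial integral above, $P(R \le r) = 1 - e^{-r^2/2\sigma^2}$, wrapped in a hypothetical helper `radius_cdf`.

```python
from math import exp
from statistics import NormalDist

F = NormalDist().cdf

# Part (a): the density factorizes, so the probability is a product of two
# one-dimensional normal probabilities.
mu1, mu2, sigma = 2, -2, 10
p_a = (F((14 - mu1) / sigma) - F((-8 - mu1) / sigma)) * (
    F((3 - mu2) / sigma) - F((-9 - mu2) / sigma)
)


# Part (b): with mu1 = mu2 = 0 the radius R = sqrt(X^2 + Y^2) satisfies
# P(R <= r) = 1 - exp(-r^2 / (2 sigma^2)), obtained from the r-integral.
def radius_cdf(r, sigma):
    return 1 - exp(-(r**2) / (2 * sigma**2))


p_b = radius_cdf(6, 3) - radius_cdf(3, 3)  # equals e^(-1/2) - e^(-2)

print(round(p_a, 4), round(p_b, 4))
```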
Q 5.88. Establish a relationship between $f_1(x_1 \mid x_2)$, $f_2(x_2 \mid x_1)$, $f_1(x_1)$, and $f_2(x_2)$.

Solution. By the definition of conditional density:
$$f_1(x_1 \mid x_2) = \frac{f(x_1, x_2)}{f_2(x_2)} \quad \text{and} \quad f_2(x_2 \mid x_1) = \frac{f(x_1, x_2)}{f_1(x_1)}.$$
Therefore we have the following relationship:
$$f_1(x_1 \mid x_2) = \frac{f_2(x_2 \mid x_1)\, f_1(x_1)}{f_2(x_2)}.$$
Q 5.89. If $X_1$ has mean 1 and variance 5 while $X_2$ has mean −1 and variance 5, and the two are independent, find
(a) $E(X_1 + X_2)$;
(b) $Var(X_1 + X_2)$.

Solution. (a) $E(X_1 + X_2) = E(X_1) + E(X_2) = 0$.

(b) By independence, $Var(X_1 + X_2) = Var(X_1) + Var(X_2) = 10$.
Q 5.90. If $X_1$ has mean $-3$ and variance 2 while $X_2$ has mean 5 and variance 4, and the two are independent, find
(a) $E(X_1 - X_2)$;
(b) $Var(X_1 - X_2)$.
Solution. (a) $E(X_1 - X_2) = E(X_1) - E(X_2) = -8$.
(b) By independence, $Var(X_1 - X_2) = 1^2\, Var(X_1) + (-1)^2\, Var(X_2) = 6$.
Q 5.91. If $X_1$ has mean 1 and variance 3 while $X_2$ has mean $-2$ and variance 5, and the two are independent, find
(a) $E(X_1 + 2X_2 - 3)$;
(b) $Var(X_1 + 2X_2 - 3)$.
Solution. (a) $E(X_1 + 2X_2 - 3) = E(X_1) + 2E(X_2) - 3 = 1 - 4 - 3 = -6$.
(b)
\[
\begin{aligned}
Var(X_1 + 2X_2 - 3) &= Var(X_1 + 2X_2) \\
&= 1^2\, Var(X_1) + 2^2\, Var(X_2) \qquad \text{(by independence)} \\
&= 23.
\end{aligned}
\]
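Exercises 5.89 to 5.91 all use the same two rules: expectation is linear, and for independent variables the variance of $a_1X_1 + a_2X_2 + c$ is $a_1^2\,Var(X_1) + a_2^2\,Var(X_2)$. A minimal sketch of those rules follows; the function name `linear_combination_moments` is hypothetical and chosen only for illustration.

```python
def linear_combination_moments(coeffs, means, variances, constant=0.0):
    """Mean and variance of constant + sum(a_i * X_i) for INDEPENDENT X_i.

    The additive constant shifts the mean but does not affect the variance.
    """
    mean = constant + sum(a * m for a, m in zip(coeffs, means))
    variance = sum(a * a * v for a, v in zip(coeffs, variances))
    return mean, variance

# Q 5.89: X1 + X2
q589 = linear_combination_moments([1, 1], [1, -1], [5, 5])
# Q 5.90: X1 - X2
q590 = linear_combination_moments([1, -1], [-3, 5], [2, 4])
# Q 5.91: X1 + 2*X2 - 3
q591 = linear_combination_moments([1, 2], [1, -2], [3, 5], constant=-3)

print(q589, q590, q591)
```

Note that the variance formula relies on independence; for dependent variables a covariance term would have to be added.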
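Q 5.92. The time $X_1$ for an older machine to complete a check on a computer chip has mean 65 ms and variance 16. The time $X_2$ for a newer machine to complete a check on a computer chip has mean 45 ms and variance 9. Find the expected time saving using the newer model when
(a) checking a single chip;
(b) checking 200 chips.
(c) Find the standard deviations in parts (a) and (b), assuming all of the checking times are independent.
Solution. The time saving on a single chip is given by the random variable $Y = X_1 - X_2$. For 200 chips, the total saving is $T = Y_1 + \cdots + Y_{200}$, where $Y_1, \ldots, Y_{200}$ are independent and each is distributed like $Y$.
(a) $E(Y) = E(X_1) - E(X_2) = 20$ ms.
(b) $E(T) = 200 \times 20 = 4000$ ms.
(c) $Var(Y) = Var(X_1) + Var(X_2) = 25$, so the standard deviation in part (a) is 5 ms. Since the 200 savings are independent, $Var(T) = 200 \times Var(Y) = 5000$, so the standard deviation in part (b) is $\sqrt{5000} \approx 70.71$ ms. The formula $Var(200Y) = (200)^2\, Var(Y)$ would apply only if a single saving $Y$ were multiplied by 200, which is not the situation here.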
Q 5.93. Let $X_1, X_2, \ldots, X_{20}$ be independent and let each have the same marginal distribution with mean 10 and variance 3. Find
(a) $E(X_1 + X_2 + \cdots + X_{20})$;
(b) $Var(X_1 + X_2 + \cdots + X_{20})$.
Solution. (a) $E(X_1 + X_2 + \cdots + X_{20}) = 20 \times 10 = 200$.
(b) By independence, $Var(X_1 + X_2 + \cdots + X_{20}) = 20 \times 3 = 60$.
Q 5.94. Let $f(x) = 0.2$ for $x = 0, 1, 2, 3, 4$.
(a) Find the moment generating function.
(b) Use the above MGF to find $E(X)$ and $E(X^2)$.
Solution. (a) Note that this is a discrete distribution, so the MGF is given by
\[
\begin{aligned}
M(t) &= \sum_{x=0}^{4} (0.2)\, e^{tx} \\
&= \frac{1}{5}\left(1 + e^{t} + e^{2t} + e^{3t} + e^{4t}\right) \\
&= \frac{1 - e^{5t}}{5(1 - e^{t})} \qquad (t \neq 0).
\end{aligned}
\]
(b)
\[
M'(t) = (0.2)\left(e^{t} + 2e^{2t} + 3e^{3t} + 4e^{4t}\right)
\]
and
\[
M''(t) = (0.2)\left(e^{t} + 4e^{2t} + 9e^{3t} + 16e^{4t}\right).
\]
Therefore, $E(X) = M'(0) = 2$, and $E(X^2) = M''(0) = 6$.
Q 5.95. Let $f(x) = 0.25\binom{2}{x}$ for $x = 0, 1, 2$.
(a) Find the moment generating function.
(b) Use the above MGF to find $E(X)$ and $E(X^2)$.
Solution. (a) Note that this is a discrete distribution. So the MGF is given by
\[
\begin{aligned}
M(t) &= \sum_{x=0}^{2} (0.25)\binom{2}{x} e^{tx} \\
&= 0.25 + 0.50\,e^{t} + 0.25\,e^{2t}.
\end{aligned}
\]
(b)
\[
M'(t) = (0.5)\left(e^{t} + e^{2t}\right)
\quad\text{and}\quad
M''(t) = (0.5)\left(e^{t} + 2e^{2t}\right).
\]
Therefore, $E(X) = M'(0) = 1$, and $E(X^2) = M''(0) = 1.5$.
Q 5.96. Let $Z \sim N(0, 1)$.
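(a) Find the moment generating function of $Z^2$.
(b) Use the above MGF to judge the distribution of $Z^2$.
Solution. (a) Note that $Z^2$ follows a continuous distribution. So, for $t < 1/2$, the MGF is given by
\[
\begin{aligned}
M_{Z^2}(t)
&= \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{tz^2} e^{-z^2/2}\,dz \\
&= \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{-(1-2t)z^2/2}\,dz \\
&= \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{-\left(\frac{\sqrt{1-2t}\,z}{\sqrt{2}}\right)^2}\,dz \\
&= \frac{1}{\sqrt{\pi}} \cdot \frac{1}{\sqrt{1-2t}} \int_{-\infty}^{\infty} e^{-y^2}\,dy
\qquad \left(\text{put } y = \frac{\sqrt{1-2t}\,z}{\sqrt{2}}\right) \\
&= \frac{1}{\sqrt{1-2t}} \qquad \text{(by Theorem 1.1)}.
\end{aligned}
\]
(b) Recall that the MGF of the Gamma distribution with parameters $\alpha$ and $\beta$ is $\dfrac{1}{(1 - \beta t)^{\alpha}}$. The MGF in part (a) corresponds to the Gamma distribution with $\alpha = 0.5$ and $\beta = 2$. This is in fact the $\chi^2$ distribution with 1 degree of freedom.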
Q 5.97. Let $X$ be a continuous random variable having probability density
\[
f(x) = \begin{cases} 2e^{-2x} & \text{for } x > 0 \\ 0 & \text{otherwise} \end{cases}
\]
(a) Find the MGF.
(b) Use the above MGF to obtain $E(X)$ and $E(X^2)$.
Solution. (a) The MGF is given by
\[
\begin{aligned}
M(t) &= 2\int_0^{\infty} e^{tx} e^{-2x}\,dx \\
&= 2\int_0^{\infty} e^{(t-2)x}\,dx \\
&= 2\left[\frac{e^{(t-2)x}}{t-2}\right]_0^{\infty} \\
&= \frac{2}{2-t} \qquad \text{if } t < 2.
\end{aligned}
\]
(b) $M'(t) = 2(2-t)^{-2}$, and $M''(t) = 4(2-t)^{-3}$. Therefore, $E(X) = M'(0) = 0.5$ and $E(X^2) = M''(0) = 0.5$.
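The moments read off from an MGF can be sanity-checked without symbolic differentiation by approximating $M'(0)$ and $M''(0)$ with central finite differences. The sketch below does this for $M(t) = 2/(2-t)$ from Q 5.97; the step size `h` and the function names are illustrative choices, not part of the original solution.

```python
def mgf(t):
    """MGF of the density f(x) = 2 e^{-2x}, x > 0; valid only for t < 2."""
    return 2.0 / (2.0 - t)

def first_two_moments(m, h=1e-4):
    """Approximate E(X) = M'(0) and E(X^2) = M''(0) by central differences."""
    d1 = (m(h) - m(-h)) / (2.0 * h)
    d2 = (m(h) - 2.0 * m(0.0) + m(-h)) / (h * h)
    return d1, d2

ex, ex2 = first_two_moments(mgf)
variance = ex2 - ex ** 2   # Var(X) = E(X^2) - E(X)^2

print(round(ex, 4), round(ex2, 4), round(variance, 4))
```

The same helper could be pointed at the discrete MGFs of Q 5.94 and Q 5.95, since it only needs a function of $t$ that is smooth near 0.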
5
Important Theorems from Chapter 6
Theorem 5.1. If a random sample of size $n$ is taken from an infinite population having mean $\mu$ and variance $\sigma^2$, then $\bar{X}$ is a random variable whose distribution has mean $\mu$ and variance $\dfrac{\sigma^2}{n}$.
Theorem 5.2 (Central Limit). If $\bar{X}$ is the mean of a random sample of size $n$ taken from a population having mean $\mu$ and variance $\sigma^2$, then
\[
Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}}
\]
is a random variable whose distribution function approaches that of the standard normal distribution as $n \to \infty$.
6
Exercises 6.9-6.19
Q 6.9. Given the infinite population whose distribution is given by

x       1      2      3      4
f(x)    0.25   0.25   0.25   0.25

list the 16 possible samples of size 2 and use this list to construct the distribution of $\bar{X}$ for random samples of size 2 from the given population. Verify that the mean and variance of this sampling distribution are identical with the corresponding values expected according to Theorem 5.1.
Solution. First, note that the given distribution has $\mu = 2.5$ and $\sigma^2 = 1.25$. The various samples of size 2 are:

(1,1)  (2,1)  (3,1)  (4,1)
(1,2)  (2,2)  (3,2)  (4,2)
(1,3)  (2,3)  (3,3)  (4,3)
(1,4)  (2,4)  (3,4)  (4,4)

and the respective means (the values of $\bar{X}$) are:

1     1.5   2     2.5
1.5   2     2.5   3
2     2.5   3     3.5
2.5   3     3.5   4

Now the distribution of $\bar{X}$ is

x̄          1      1.5    2      2.5    3      3.5    4
P(X̄ = x̄)   1/16   2/16   3/16   4/16   3/16   2/16   1/16
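The enumeration above is small enough to reproduce by brute force. The following sketch lists all 16 ordered samples, builds the distribution of the sample mean with exact fractions, and computes its mean and variance for comparison with $\mu$ and $\sigma^2/2$ from Theorem 5.1. Variable names are illustrative.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

values = [1, 2, 3, 4]
p = Fraction(1, 4)          # each population value has probability 0.25

# Distribution of the sample mean over all 16 ordered samples of size 2.
dist = Counter()
for x1, x2 in product(values, repeat=2):
    dist[Fraction(x1 + x2, 2)] += p * p

mean = sum(x * pr for x, pr in dist.items())
var = sum((x - mean) ** 2 * pr for x, pr in dist.items())

# Population moments, for comparison with Theorem 5.1.
mu = sum(v * p for v in values)
sigma2 = sum((v - mu) ** 2 * p for v in values)

print(sorted(dist.items()))
print(mean, var, mu, sigma2 / 2)
```

Using `Fraction` avoids floating-point round-off, so the comparison with $\sigma^2/n$ is exact rather than approximate.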
It can easily be calculated that the mean of this distribution is 2.5 and the variance is 0.625. Note that 0.625 is equal to $\sigma^2/2$, which agrees with Theorem 5.1.
Q 6.11. When we sample from an infinite population, what happens to the standard error of the mean if the sample size is
(a) increased from 50 to 200;
(b) increased from 400 to 900;
(c) decreased from 225 to 25;
(d) decreased from 640 to 40?
Solution. Recall that the standard error of the mean is given by $\dfrac{\sigma}{\sqrt{n}}$, where $\sigma$ is the population standard deviation and $n$ is the sample size. Let $SE_n$ denote the standard error of the mean when the sample size is $n$.
(a)
\[
\frac{SE_{200}}{SE_{50}} = \frac{\sigma/\sqrt{200}}{\sigma/\sqrt{50}} = \frac{1}{2}.
\]
Therefore, the standard error reduces to half of the original error.
(b)
\[
\frac{SE_{900}}{SE_{400}} = \frac{\sigma/\sqrt{900}}{\sigma/\sqrt{400}} = \frac{2}{3}.
\]
Therefore, the standard error reduces to two-thirds of the original error.
(c)
\[
\frac{SE_{25}}{SE_{225}} = \frac{\sigma/\sqrt{25}}{\sigma/\sqrt{225}} = 3.
\]
Therefore, the standard error increases threefold.
(d)
\[
\frac{SE_{40}}{SE_{640}} = \frac{\sigma/\sqrt{40}}{\sigma/\sqrt{640}} = 4.
\]
Therefore, the standard error increases fourfold.
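Since $\sigma$ cancels in every ratio above, the four answers depend only on the two sample sizes. A short sketch, with the hypothetical helper `se_ratio`:

```python
from math import sqrt

def se_ratio(n_old, n_new):
    """SE_new / SE_old = sqrt(n_old / n_new); sigma cancels out."""
    return sqrt(n_old / n_new)

ratios = {
    "(a) 50 -> 200": se_ratio(50, 200),
    "(b) 400 -> 900": se_ratio(400, 900),
    "(c) 225 -> 25": se_ratio(225, 25),
    "(d) 640 -> 40": se_ratio(640, 40),
}

for part, ratio in ratios.items():
    print(part, round(ratio, 4))
```

The square root is the key point: quadrupling the sample size only halves the standard error.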
Q 6.13. For large sample size $n$, verify that there is a 50-50 chance that the mean of a random sample from an infinite population with standard deviation $\sigma$ will differ from $\mu$ by less than $0.6745 \cdot \dfrac{\sigma}{\sqrt{n}}$.
Solution. We are required to show that
\[
P\left(\left|\bar{X} - \mu\right| < 0.6745 \cdot \frac{\sigma}{\sqrt{n}}\right) = 0.5.
\]
Since the sample size is large, by the Central Limit Theorem the random variable $Z = \dfrac{\bar{X} - \mu}{\sigma/\sqrt{n}}$ approximately follows the standard normal distribution. Thus,
\[
\begin{aligned}
P\left(\left|\bar{X} - \mu\right| < 0.6745 \cdot \frac{\sigma}{\sqrt{n}}\right)
&= P(|Z| < 0.6745) \\
&= P(-0.6745 < Z < 0.6745) \\
&= F(0.6745) - F(-0.6745) \\
&= 0.50.
\end{aligned}
\]
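The value 0.6745 is the upper quartile of the standard normal distribution, which is why the probability comes out to one half. This can be confirmed with the error function from the standard library; `phi` is an illustrative name for the standard normal cumulative function $F$.

```python
from math import erf, sqrt

def phi(z):
    """Standard normal CDF F(z)."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

# P(|Z| < 0.6745) = F(0.6745) - F(-0.6745)
p_within = phi(0.6745) - phi(-0.6745)

print(round(p_within, 4))
```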
Q 6.14. The mean of a random sample of size $n = 25$ is used to estimate the mean of an infinite population that has standard deviation $\sigma = 2.4$. What can we assert about the probability that the error will be less than 1.2, if we use
(a) Chebyshev's theorem;
(b) the Central Limit Theorem?
Solution. Here we are required to estimate $P(|\bar{X} - \mu| < 1.2)$ using two different methods.
(a) The standard deviation of the sample mean is $\sigma/\sqrt{n} = 2.4/5$. Choose $k$ so that
\[
1.2 = k \cdot \frac{2.4}{5}.
\]
This yields $k = 2.5$. Therefore, by Chebyshev's theorem,
\[
P\left(|\bar{X} - \mu| < 2.5 \times \frac{2.4}{5}\right) \geq 1 - \frac{1}{(2.5)^2} = 0.84.
\]
(b) If we consider $n = 25$ to be a large enough sample size, then by the Central Limit Theorem $Z = \dfrac{\bar{X} - \mu}{\sigma/\sqrt{n}}$ approximately follows the standard normal distribution. Therefore,
\[
P(|\bar{X} - \mu| < 1.2) = P\left(|Z| < 1.2 \times \frac{5}{2.4}\right) = F(2.5) - F(-2.5) = 0.9876.
\]
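The two answers to Q 6.14 differ because Chebyshev's theorem gives a lower bound valid for any distribution, while the Central Limit Theorem gives an approximation that uses the normal shape. A small sketch computing both from the same inputs (variable names are illustrative):

```python
from math import erf, sqrt

sigma, n, max_err = 2.4, 25, 1.2

se = sigma / sqrt(n)          # standard deviation of the sample mean
k = max_err / se              # number of standard errors in the tolerance

# Chebyshev: P(|Xbar - mu| < k * se) >= 1 - 1/k^2, for any population.
chebyshev_lower_bound = 1.0 - 1.0 / k ** 2

# CLT: P(|Z| < k) = 2F(k) - 1 = erf(k / sqrt(2)) for a standard normal Z.
clt_estimate = erf(k / sqrt(2.0))

print(round(k, 2), round(chebyshev_lower_bound, 4), round(clt_estimate, 4))
```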
Q 6.15. Hard disks for computers must spin evenly, and one departure from level is called roll. The roll for any disk can be modeled as a random variable having $\mu = 0.225$ mm and standard deviation 0.0042 mm. The sample mean roll $\bar{X}$ will be obtained from a random sample of 40 disks. What is the probability that $\bar{X}$ lies between 0.2245 and 0.2260 mm?
Solution. By the Central Limit Theorem, $Z = \dfrac{\bar{X} - \mu}{0.0042/\sqrt{40}}$ approximately follows the standard normal distribution. Therefore,
\[
\begin{aligned}
P(0.2245 < \bar{X} < 0.2260)
&= P\left(\frac{0.2245 - 0.2250}{0.0042/\sqrt{40}} < Z < \frac{0.2260 - 0.2250}{0.0042/\sqrt{40}}\right) \\
&= P(-0.753 < Z < 1.506) \\
&= F(1.506) - F(-0.753) \\
&\approx 0.708.
\end{aligned}
\]
Q 6.16. A wire bonding process is said to be in control if the mean pull strength is 10 pounds. It is known that the pull strength measurements are normally distributed with a standard deviation of 1.5 pounds. Periodic random samples of size 4 are taken from this process and the process is said to be "out of control" if a sample mean is less than 7.75 pounds. Comment.
Solution. Since the sample is drawn from a normal population, $\bar{X}$ is a normal variable with mean equal to the population mean of 10 and standard deviation $\dfrac{\sigma}{\sqrt{n}} = \dfrac{1.5}{2} = 0.75$. Therefore,
\[
\begin{aligned}
P(\bar{X} < 7.75)
&= P\left(\frac{\bar{X} - \mu}{0.75} < \frac{7.75 - 10}{0.75}\right) \\
&= P\left(\frac{\bar{X} - \mu}{0.75} < -3.0\right) \\
&= F(-3.0) = 0.00135.
\end{aligned}
\]
Thus, while the process is in control, the probability that a sample mean falls below 7.75 pounds is very small, so the rule will rarely declare the process "out of control" by mistake.
Q 6.17. The distribution of the weights of all men traveling by air between Dallas and El Paso has a mean of 163 pounds and a standard deviation of 18 pounds. What is the probability that the combined weight of 36 men traveling on a plane between these two cities is more than 6000 pounds?
Solution. Let $X_1, \ldots, X_{36}$ be the random sample of 36 passenger weights. We are required to find
\[
P\left(\sum_{i=1}^{36} X_i > 6000\right).
\]
By the Central Limit Theorem, $\dfrac{\bar{X} - 163}{3}$ approximately follows the standard normal distribution (here $18/\sqrt{36} = 3$). Therefore,
\[
\begin{aligned}
P\left(\sum_{i=1}^{36} X_i > 6000\right)
&= P(\bar{X} > 166.67) \\
&= P\left(\frac{\bar{X} - 163}{3} > 1.222\right) \\
&= 1 - F(1.222) = 0.1108.
\end{aligned}
\]
Q 6.20. The tensile strength of a new composite can be modeled as a normal distribution. A random sample of 25 specimens has mean 45.3 and standard deviation 7.9. Does this information tend to support or refute the claim that the mean of the population is 40.5?
Solution. We know that $\dfrac{\bar{X} - \mu}{7.9/5}$ follows a $t$ distribution with 24 degrees of freedom. For $\mu = 40.5$ we have
\[
t = \frac{45.3 - 40.5}{7.9/5} = 3.04.
\]
Thus the sample mean of 45.3 is more than 3 estimated standard errors away from the claimed mean, and the probability of this happening is very low because, for $-3 \leq t \leq 3$, the probability is 99.38%. Hence, we tend to refute the claim.
Q 6.21. The following are the times between 6 calls for an ambulance and the patient's arrival at the hospital: 27, 15, 20, 32, 18, and 26 minutes. The ambulance service claims that it takes on average 20 minutes to complete the journey. Comment on the reasonableness of this claim.
Solution. First we calculate the sample mean and standard deviation. These turn out to be $\bar{x} = 23$ and $s = 6.39$. Now $\dfrac{\bar{X} - 20}{6.39/\sqrt{6}}$ follows a $t$ distribution with 5 degrees of freedom. We thus calculate
\[
t = \frac{23 - 20}{6.39/\sqrt{6}} = 1.15.
\]
There is approximately a 70% chance for $t$ to lie within the interval from $-1.15$ to $1.15$, and approximately a 30% chance for it to lie outside. So we do not have enough evidence against the claim.
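The sample statistics and the $t$ value in Q 6.21 can be recomputed directly from the six observations. The sketch below uses `statistics.stdev`, which divides by $n - 1$ and therefore matches the sample standard deviation $s$ used in the $t$ statistic; variable names are illustrative.

```python
from math import sqrt
from statistics import mean, stdev

times = [27, 15, 20, 32, 18, 26]   # minutes, from the exercise
mu0 = 20                           # claimed mean journey time

n = len(times)
xbar = mean(times)
s = stdev(times)                   # sample standard deviation (n - 1 divisor)
t_stat = (xbar - mu0) / (s / sqrt(n))

print(xbar, round(s, 2), round(t_stat, 2), "d.f. =", n - 1)
```

The same three lines of arithmetic apply to Q 6.20 and Q 6.22, where $\bar{x}$ and $s$ are given instead of the raw data.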
Q 6.22. A process of making certain bearings is under control if the diameters of the bearings have a mean of 0.5000 cm. What can we say about the process if a sample of 10 of these bearings has a mean diameter of 0.5060 cm and a standard deviation of 0.0040 cm?
Solution. We know that $\dfrac{\bar{X} - 0.50}{0.004/\sqrt{10}}$ follows a $t$ distribution with 9 degrees of freedom. Now
\[
t = \frac{0.5060 - 0.5000}{0.004/\sqrt{10}} = 4.74.
\]
Thus the sample mean of 0.5060 is more than 4 estimated standard errors away from the target mean, and this is very unlikely if the process is under control. Hence, we conclude that the process is not under control.
Q 6.23. Hard disks for computers must spin evenly, and one departure from level is called pitch. Samples are regularly taken from production and each disk in the sample is placed in test equipment that yields a measurement of pitch. From many samples it is concluded that the population is normal. The variance is $\sigma^2 = 0.065$ when the process is in control. A sample of size 10 is collected each week. The process will be declared out of control if the sample variance exceeds 0.122. What is the probability that it will be declared out of control even though $\sigma^2 = 0.065$?
Solution. Here we are concerned with the distribution of the sample variance $S^2$. We know that the random variable $\dfrac{(n-1)S^2}{\sigma^2}$ follows a $\chi^2$ distribution with $n - 1$ degrees of freedom. We have to find the probability $P(S^2 > 0.122)$, which is done as follows:
\[
\begin{aligned}
P(S^2 > 0.122)
&= P\left(\frac{9S^2}{\sigma^2} > \frac{9 \times 0.122}{0.065}\right) \\
&= P\left(\frac{9S^2}{\sigma^2} > 16.89\right) \\
&= 0.0504,
\end{aligned}
\]
where the last value is the area to the right of 16.89 under the $\chi^2$ density with 9 degrees of freedom. Hence there is approximately a 5% chance that the process will be declared out of control given that the population variance is 0.065.
Q 6.24. A random sample of 10 observations is taken from a normal population having the variance $\sigma^2 = 42.5$. Find the approximate probability of obtaining a sample standard deviation between 3.14 and 8.94.
Solution. Again we are concerned with the distribution of the sample variance $S^2$. Since the random variable $\dfrac{9S^2}{\sigma^2}$ follows a $\chi^2$ distribution with 9 degrees of freedom, we have
\[
\begin{aligned}
P\left((3.14)^2 < S^2 < (8.94)^2\right)
&= P\left(\frac{9(3.14)^2}{42.5} < \frac{9S^2}{\sigma^2} < \frac{9(8.94)^2}{42.5}\right) \\
&= P\left(2.088 < \frac{9S^2}{\sigma^2} < 16.925\right) \\
&= P\left(\frac{9S^2}{\sigma^2} > 2.088\right) - P\left(\frac{9S^2}{\sigma^2} > 16.925\right) \qquad \text{(watch this step carefully)} \\
&= 0.99 - 0.05 \\
&= 0.94.
\end{aligned}
\]
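Q 6.27. The $\chi^2$ distribution with 4 degrees of freedom is given by
\[
f(x) = \begin{cases} \dfrac{1}{4}\, x\, e^{-x/2} & x > 0 \\ 0 & x \leq 0 \end{cases}
\]
Find the probability that the variance of a random sample of size 5 from a normal population with $\sigma = 12$ will exceed 180.
Solution. We need to find $P(S^2 > 180)$. Since $\dfrac{4S^2}{\sigma^2}$ follows a $\chi^2$ distribution with 4 degrees of freedom and $\dfrac{4 \times 180}{144} = 5$,
\[
\begin{aligned}
P(S^2 > 180)
&= P\left(\frac{4S^2}{\sigma^2} > 5\right) \\
&= \int_5^{\infty} \frac{1}{4}\, x\, e^{-x/2}\,dx \\
&= \frac{7}{2}\, e^{-5/2} \\
&= 0.2873.
\end{aligned}
\]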
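For 4 degrees of freedom the tail integral has a closed form: integrating $\frac{1}{4}xe^{-x/2}$ by parts gives $P(\chi^2_4 > c) = (1 + c/2)\,e^{-c/2}$. The sketch below evaluates that expression at $c = 5$ and cross-checks it with a plain midpoint-rule integration. The truncation point `upper` and the number of `steps` are arbitrary numerical choices, and the function names are illustrative.

```python
from math import exp

def chi2_4_tail(c):
    """P(chi-square with 4 d.f. > c), from integrating (1/4) x e^{-x/2} by parts."""
    return (1.0 + c / 2.0) * exp(-c / 2.0)

def chi2_4_tail_numeric(c, upper=200.0, steps=200_000):
    """Midpoint-rule approximation of the same integral, truncated at `upper`."""
    h = (upper - c) / steps
    total = 0.0
    for i in range(steps):
        x = c + (i + 0.5) * h
        total += 0.25 * x * exp(-x / 2.0) * h
    return total

n, sigma2, threshold = 5, 12 ** 2, 180
c = (n - 1) * threshold / sigma2     # cut-off for (n - 1) S^2 / sigma^2

closed_form = chi2_4_tail(c)
numeric = chi2_4_tail_numeric(c)

print(c, round(closed_form, 3), round(numeric, 3))
```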
Q 6.30. Let $Z_1, \ldots, Z_5$ be independent standard normal random variables.
(a) Specify the distribution of $Z_2^2 + Z_3^2 + Z_4^2 + Z_5^2$.
(b) Specify the distribution of
\[
\frac{Z_1}{\sqrt{\dfrac{Z_2^2 + Z_3^2 + Z_4^2 + Z_5^2}{4}}}.
\]
Solution. (a) We know that the sum of squares of $n$ independent standard normal variables follows a $\chi^2$ distribution with $n$ degrees of freedom. Therefore $Z_2^2 + Z_3^2 + Z_4^2 + Z_5^2$ represents a $\chi^2$ variable with 4 degrees of freedom.
(b) We know that if a standard normal variable $Z$ and $\chi^2_\nu$ (a chi square variable with $\nu$ d.f.) are independent, then $\dfrac{Z}{\sqrt{\chi^2_\nu/\nu}}$ follows a $t$ distribution with $\nu$ d.f. Since $Z_2^2 + Z_3^2 + Z_4^2 + Z_5^2$ represents a $\chi^2$ variable with 4 d.f. and is independent of $Z_1$,
\[
\frac{Z_1}{\sqrt{\dfrac{Z_2^2 + Z_3^2 + Z_4^2 + Z_5^2}{4}}}
\]
follows a $t$ distribution with 4 d.f.
Q 6.31. Let $Z_1, \ldots, Z_6$ be independent standard normal random variables. Specify the distribution of
\[
\frac{Z_1 - Z_2}{\sqrt{\dfrac{Z_3^2 + Z_4^2 + Z_5^2 + Z_6^2}{2}}}.
\]
Solution. Correct the mistake in the book: replace 8 by 2. The given expression can be rewritten as
\[
\frac{\dfrac{Z_1 - Z_2}{\sqrt{2}}}{\sqrt{\dfrac{Z_3^2 + Z_4^2 + Z_5^2 + Z_6^2}{4}}}.
\]
Since the numerator $\dfrac{Z_1 - Z_2}{\sqrt{2}}$ is a standard normal variable which is independent of the chi square random variable $Z_3^2 + Z_4^2 + Z_5^2 + Z_6^2$ having 4 d.f., we see that the above expression follows a $t$ distribution with 4 d.f.
Q 6.32. Let $Z_1, \ldots, Z_7$ be independent standard normal random variables.
(a) Specify the distribution of $Z_1^2 + Z_2^2 + Z_3^2 + Z_4^2$.
(b) Specify the distribution of $Z_5^2 + Z_6^2 + Z_7^2$.
(c) Specify the distribution of the sum of the variables in (a) and (b).
Solution. (a) $Z_1^2 + Z_2^2 + Z_3^2 + Z_4^2$ follows a $\chi^2$ distribution with 4 d.f.
(b) $Z_5^2 + Z_6^2 + Z_7^2$ follows a $\chi^2$ distribution with 3 d.f.
(c) $Z_1^2 + \cdots + Z_7^2$ follows a $\chi^2$ distribution with 7 d.f.
Q 6.33. Let the chi square random variables $\chi_1^2$, with $\nu_1$ d.f., and $\chi_2^2$, with $\nu_2$ d.f., be independent. Show that $\chi_1^2 + \chi_2^2$ is a chi square variable with $\nu_1 + \nu_2$ d.f.
Solution. We can write $\chi_1^2$ and $\chi_2^2$ as sums of squares of independent standard normal random variables:
\[
\chi_1^2 = Z_1^2 + \cdots + Z_{\nu_1}^2
\quad\text{and}\quad
\chi_2^2 = Z_{\nu_1+1}^2 + \cdots + Z_{\nu_1+\nu_2}^2.
\]
Since $\chi_1^2$ and $\chi_2^2$ are independent, the $Z_i$'s can all be taken to be independent. So $\chi_1^2 + \chi_2^2$ is a sum of squares of $\nu_1 + \nu_2$ independent standard normal variables, and hence the claim.
7
Chapter 7 Exercises
For large $n$, the random variable $\dfrac{\bar{X} - \mu}{\sigma/\sqrt{n}}$ approximately follows a standard normal distribution. So we can assert with probability $1 - \alpha$ that
\[
-z_{\alpha/2} \leq \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \leq z_{\alpha/2}.
\]
This can be rewritten as
\[
\frac{|\bar{X} - \mu|}{\sigma/\sqrt{n}} \leq z_{\alpha/2}.
\]
Here $z_{\alpha/2}$ is the cut-off point such that the area to its right is $\alpha/2$.
Definition 7.1 (Maximum Error of Estimate). The number $E = z_{\alpha/2}\,\dfrac{\sigma}{\sqrt{n}}$ is called the maximum error of estimate with probability $1 - \alpha$. In case $\sigma$ is not known, the maximum error of estimate is given by $E = t_{\alpha/2}\,\dfrac{s}{\sqrt{n}}$, where $s$ is the sample standard deviation and $t_{\alpha/2}$ is taken with $n - 1$ degrees of freedom.
Definition 7.2 (Large sample confidence interval for $\mu$). After the sample mean $\bar{x}$ has been calculated, the $(1 - \alpha) \times 100\%$ confidence interval for the true population mean $\mu$ is
\[
\left(\bar{x} - z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}},\ \bar{x} + z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}}\right).
\]
The end points are known as the confidence limits.
Definition 7.3 (Small sample confidence interval for $\mu$). After the sample mean $\bar{x}$ has been calculated, and $n < 30$, the $(1 - \alpha) \times 100\%$ confidence interval for the true population mean $\mu$ is
\[
\left(\bar{x} - t_{\alpha/2}\,\frac{\sigma}{\sqrt{n}},\ \bar{x} + t_{\alpha/2}\,\frac{\sigma}{\sqrt{n}}\right).
\]
The end points are known as the confidence limits.
Remark. When the true population standard deviation $\sigma$ is not known, we can replace it by the sample standard deviation $s$ in the above definitions.
Q 7.1. Civil engineers collected data from one area of Wisconsin on the amount of salt (in tons) used to keep highways drivable during a snowstorm. A sample pertaining to $n = 30$ storms was collected and it was found that the sample mean is $\bar{x} = 1798.4$ tons and the sample standard deviation is $s = 819.35$ tons. What can one assert with 95% confidence about the maximum error of estimation of the true population mean?
Solution. Here the population standard deviation is not known, so $\dfrac{\bar{X} - \mu}{s/\sqrt{n}}$ follows a $t$ distribution with $n - 1 = 29$ d.f. Hence the maximum error with 95% confidence is
\[
E = t_{0.025} \times \frac{819.35}{\sqrt{30}} = 2.045 \times 149.59 = 305.92.
\]
Q 7.2. With reference to the previous exercise, construct a 95% confidence interval for the true population mean amount of salt required for a snowstorm.
Solution. For the 95% confidence level, the lower confidence limit is $\bar{x} - t_{0.025}\,\dfrac{s}{\sqrt{n}} = 1492.48$, and the upper confidence limit is $\bar{x} + t_{0.025}\,\dfrac{s}{\sqrt{n}} = 2104.32$. Thus we are 95% confident that $\mu$ lies in the interval $(1492.48, 2104.32)$.
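The maximum error and the confidence limits in Q 7.1 and Q 7.2 follow one formula, $\bar{x} \pm (\text{critical value}) \cdot s/\sqrt{n}$. The sketch below is a minimal version in which the critical value is passed in by hand, because the Python standard library has no $t$ quantile function; the value 2.045 is $t_{0.025}$ with 29 d.f. as read from a $t$ table, and the function names are hypothetical.

```python
from math import sqrt

def max_error(crit, s, n):
    """Maximum error of estimate E = crit * s / sqrt(n)."""
    return crit * s / sqrt(n)

def confidence_interval(xbar, crit, s, n):
    """Two-sided interval (xbar - E, xbar + E)."""
    e = max_error(crit, s, n)
    return xbar - e, xbar + e

t_crit = 2.045                      # t_{0.025}, 29 d.f., taken from a table
E = max_error(t_crit, s=819.35, n=30)
lower, upper = confidence_interval(1798.4, t_crit, s=819.35, n=30)

print(round(E, 1), round(lower, 1), round(upper, 1))
```

Passing a $z$ value such as 1.96 instead of the $t$ value gives the large-sample interval of Definition 7.2 with the same code.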
Q 7.3. An industrial engineer collected data on the labor time required to produce an order of automobile mufflers using a heavy stamping machine. The data for $n = 52$ orders has $\bar{x} = 1.865$ hours and $s = 1.250$ hours. What can one assert with 95% confidence about the maximum error of estimation of the true population mean?
Solution. Here the population standard deviation is not known, so $\dfrac{\bar{X} - \mu}{s/\sqrt{n}}$ follows a $t$ distribution with $n - 1 = 51$ d.f. Hence the maximum error with 95% confidence is
\[
E = t_{0.025} \times \frac{1.250}{\sqrt{52}} = 2.007 \times 0.1733 = 0.348.
\]
However, since $n = 52$ is a large enough sample, we can also use the standard normal estimate instead of the $t$ estimate. This would yield $E = z_{0.025} \times \dfrac{1.250}{\sqrt{52}} = 1.96 \times 0.1733 = 0.3397$. You can see that the difference is negligible.
Q 7.4. With reference to the previous exercise, construct a 95% confidence interval for the true population mean labor time.
Solution. For the 95% confidence level, the lower confidence limit is $\bar{x} - 0.348 = 1.517$, and the upper confidence limit is $\bar{x} + 0.348 = 2.213$. Thus we are 95% confident that $\mu$ lies in the interval $(1.517, 2.213)$.
Q 7.5. The manufacture of large LCDs is difficult. Some defects are minor and
can be removed; others are unremovable. The number of unremovable defects for a sample of $n = 45$ displays has $\bar{x} = 2.667$ and $s = 3.057$. What can one assert with 98% confidence about the maximum error of estimation of the true population mean?
Solution. Here the population standard deviation is not known, so $\dfrac{\bar{X} - \mu}{s/\sqrt{n}}$ follows a $t$ distribution with $n - 1 = 44$ d.f. Hence the maximum error with 98% confidence is
\[
E = t_{0.01} \times \frac{3.057}{\sqrt{45}} = 2.41 \times 0.456 = 1.10.
\]
However, since $n = 45$ is a large enough sample, we can also use the standard normal estimate instead of the $t$ estimate. This would yield $E = z_{0.01} \times \dfrac{3.057}{\sqrt{45}} = 2.33 \times 0.456 = 1.06$. You can see that the difference is quite small.
Q 7.6. With reference to the previous exercise, construct a 98% confidence interval for the true population mean number of unremovable defects.
Solution. For the 98% confidence level, the lower confidence limit is $\bar{x} - 1.10 = 1.567$, and the upper confidence limit is $\bar{x} + 1.10 = 3.767$. Thus we are 98% confident that $\mu$ lies in the interval $(1.567, 3.767)$.
Q 7.9. In a study of automobile collision insurance costs, a random sample of
80 body repair costs for a particular type of damage had a mean of \$472.36 and a standard deviation of \$62.35. If $\bar{x} = \$472.36$ is used as a point estimate of the true average repair cost of this kind of damage, with what confidence can one assert that the error does not exceed \$10?
Solution. The maximum error of estimation of the true population mean is given by $E = z_{\alpha/2}\,\dfrac{s}{\sqrt{n}}$. We are required to find $\alpha$ so that
\[
z_{\alpha/2} \times \frac{62.35}{\sqrt{80}} = 10.
\]
That is, $z_{\alpha/2} = 1.43$. From the standard normal table we see that $F(1.43) = 0.9236$. Therefore, $\alpha = 2(1 - 0.9236) = 0.1528$. Hence we can assert with a confidence of $(1 - 0.1528) \times 100\% = 84.72\%$ that the error will not exceed \$10.
Q 7.10. If we want to determine the average mechanical aptitude of a large group of workers, how large a random sample will we need to be able to assert with probability 0.95 that the sample mean will not differ from the true mean by more than 3.0 points? (Assume that it is known from past experience that $\sigma = 20.0$.)
Solution. Here $\alpha = 0.05$. Since we are talking about a large sample, the maximum error of estimation of the true mean is given by $E = z_{0.025}\,\dfrac{\sigma}{\sqrt{n}}$. We have to find $n$ so that $E = 3.0$. Solving, we get $\sqrt{n} = 13.067$. Hence $n \approx 171$.
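Solving $E = z_{\alpha/2}\,\sigma/\sqrt{n}$ for $n$ gives $n = (z_{\alpha/2}\,\sigma/E)^2$, rounded up to the next whole number so that the error bound is actually met. A minimal sketch, with the hypothetical helper `required_sample_size` and $z_{0.025} = 1.96$ taken from the standard normal table:

```python
from math import ceil

def required_sample_size(z, sigma, max_err):
    """Smallest integer n with z * sigma / sqrt(n) <= max_err."""
    return ceil((z * sigma / max_err) ** 2)

# Q 7.10: 95% confidence, sigma = 20.0, error at most 3.0 points.
n_710 = required_sample_size(1.96, sigma=20.0, max_err=3.0)

print(n_710)
```

Rounding up rather than to the nearest integer matters when the computed value lies just above a whole number.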
Q 7.11. We want to use the mean of a random sample to estimate the average amount of time students take to get from one class to another, and we want to be able to assert with 99% confidence that the error is at most 0.25 minute. If it can be presumed from experience that $\sigma = 1.40$ minutes, how large a sample should we take?
Solution. Here we have to determine $n$ so that
\[
z_{0.005} \times \frac{1.40}{\sqrt{n}} = 0.25.
\]
From the table we see that $z_{0.005} = 2.576$. Therefore $\sqrt{n} = 14.43$, that is, $n = 208.1$, and rounding up gives $n = 209$.
Q 7.12. One novel process of making green gasoline takes biomass in the form of sucrose and converts it into gasoline using catalytic reactions. At one step in a pilot plant process, a chemical engineer measures the output of carbon chains of length three. Nine runs with the same catalyst produced the yields: 0.63, 2.64, 1.85, 1.68, 1.09, 1.67, 0.73, 1.04, 0.68. What can the chemical engineer assert with 95% confidence about the maximum error if he uses the sample mean to estimate the true mean yield?
Solution. We first calculate $\bar{x} = 1.334$ and $s = 0.674$. Here $\dfrac{\bar{X} - \mu}{s/\sqrt{n}}$ follows a $t$ distribution with 8 d.f. Therefore, the maximum error is
\[
t_{0.025} \times \frac{0.674}{3} = 2.306 \times \frac{0.674}{3} = 0.518.
\]
Q 7.13. With reference to the previous exercise, assume that the yield has a normal distribution and obtain a 95% confidence interval for the true mean yield of the pilot plant process.
Solution. Even though the yield is normally distributed, the ratio $\dfrac{\bar{X} - \mu}{s/\sqrt{n}}$ follows a $t$ distribution with 8 d.f. because we know the sample variance and not the population variance. Thus the 95% confidence interval is $\bar{x} \pm 0.518 = 1.334 \pm 0.518$.
Q 7.14. To monitor a complex chemical process, $n = 9$ measurements were made on a key performance indicator: 123, 106, 114, 128, 113, 109, 120, 102, 111. What can we assert with 95% confidence about the maximum error if we use the sample mean as an estimator of the true mean value of the indicator?
Solution. We first calculate $\bar{x} = 114$ and $s = 8.34$. Now the maximum error with 95% confidence is given by
\[
t_{0.025} \times \frac{8.34}{3} = 2.306 \times \frac{8.34}{3} = 6.41.
\]
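The summary statistics and the maximum error in Q 7.14 can be recomputed from the nine raw observations. In the sketch below the critical value 2.306 ($t_{0.025}$ with 8 d.f.) is entered by hand from a $t$ table, since the standard library does not provide $t$ quantiles; variable names are illustrative.

```python
from math import sqrt
from statistics import mean, stdev

data = [123, 106, 114, 128, 113, 109, 120, 102, 111]
t_crit = 2.306                      # t_{0.025} with 8 d.f., from a t table

n = len(data)
xbar = mean(data)
s = stdev(data)                     # sample standard deviation (n - 1 divisor)
E = t_crit * s / sqrt(n)            # maximum error with 95% confidence
interval = (xbar - E, xbar + E)     # corresponding confidence limits

print(xbar, round(s, 2), round(E, 2), tuple(round(v, 2) for v in interval))
```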
Q 7.15. With reference to the previous exercise, assume that the key performance indicator has a normal distribution and obtain a 95% confidence interval for the true mean value of the indicator.
Solution. Even though the key performance indicator is normally distributed, the ratio $\dfrac{\bar{X} - \mu}{s/\sqrt{n}}$ follows a $t$ distribution with 8 d.f. because we know the sample variance and not the population variance. Thus the 95% confidence interval is $\bar{x} \pm 6.41 = 114.0 \pm 6.41$.
Q 7.16. Material costs for rebuilding traction motors are studied for a sample of $n = 29$ motors. A computer calculation gives $\bar{x} = 1.4707$ and $s = 0.5235$ thousand dollars. Obtain a 90% confidence interval for the mean material cost to rebuild a motor.
Solution. Here $\dfrac{\bar{X} - \mu}{s/\sqrt{n}}$ follows a $t$ distribution with 28 d.f. because we know the sample variance and not the population variance. Thus the maximum error of estimation with 90% confidence is $E = t_{0.05} \times \dfrac{0.5235}{\sqrt{29}} = 1.701 \times 0.0972 = 0.1654$. Therefore the 90% confidence interval is $\bar{x} \pm 0.1654 = 1.4707 \pm 0.1654$.
Q 7.20. Ten bearings made by a certain process have a mean diameter of 0.5060 cm and a standard deviation of 0.0040 cm. Assuming that the data may be looked upon as a random sample from a normal population, construct a 95% confidence interval for the actual average diameter of bearings made by this process.
Solution. Here $\dfrac{\bar{X} - \mu}{s/\sqrt{n}}$ follows a $t$ distribution with 9 d.f. because we know the sample variance and not the population variance. Thus the maximum error of estimation with 95% confidence is $E = t_{0.025} \times \dfrac{0.004}{\sqrt{10}} = 0.00286$. Therefore the 95% confidence interval is $\bar{x} \pm 0.00286 = 0.5060 \pm 0.00286$.
Q 7.21. The freshness of produce at a mega-store is rated on a scale of 1 to 5, with 5 being very fresh. From a random sample of 36 customers, the average score was 3.5 with a standard deviation of 0.8.
(a) Obtain a 90% confidence interval for the population mean ($\mu$) score for all customers.
(b) Does $\mu$ lie in your interval obtained in part (a)?
(c) In a long series of experiments, with new random samples collected for each experiment, what proportion of the resulting confidence intervals will contain the true population mean?
Solution. (a) Here $\dfrac{\bar{X} - \mu}{s/\sqrt{n}}$ can be assumed to follow a standard normal distribution since $n > 30$. Thus the maximum error of estimation with 90% confidence is $E = z_{0.05} \times \dfrac{0.8}{6} = 0.2193$. Therefore the 90% confidence interval is $\bar{x} \pm 0.2193 = 3.5 \pm 0.2193$.
(b) $\mu$ may or may not lie in this confidence interval. We are only 90% confident that $\mu$ lies in it.
(c) Each time an experiment is performed, we get a 90% confidence interval. Since the number of experiments is large, we conclude by the law of large numbers that approximately 90% of these intervals will contain $\mu$.
Q 7.22. A copy shop records that in $n = 64$ cases, the cartridge for a copy machine lasted an average of 18,300 copies with a standard deviation of 2,800 copies.
(a) Obtain a 95% confidence interval for the population mean ($\mu$) number of copies before a new cartridge is needed for the copy machine.
(b) Does $\mu$ lie in your interval obtained in part (a)?
(c) In a long series of experiments, with new random samples collected for each experiment, what proportion of the resulting confidence intervals will contain the true population mean?
Solution. (a) Here $\dfrac{\bar{X} - \mu}{s/\sqrt{n}}$ can be assumed to follow a standard normal distribution since $n > 30$. Thus the maximum error of estimation with 95% confidence is $E = z_{0.025} \times \dfrac{2800}{8} = 685.99$. Therefore the 95% confidence interval is $\bar{x} \pm 685.99 = 18300 \pm 685.99$.
(b) $\mu$ may or may not lie in this confidence interval. We are only 95% confident that $\mu$ lies in it.
(c) Each time an experiment is performed, we get a 95% confidence interval. Since the number of experiments is large, we conclude by the law of large numbers that approximately 95% of these intervals will contain $\mu$.
Q 7.39. A computer manufacturer wants to establish that the average time to
set up a new desktop computer is less than 2 hrs.
(a) Formulate the null and alternate hypotheses.
(b) What error could be made if $\mu = 1.9$?
(c) What error could be made if $\mu = 2.0$?
Solution. (a) $H_0 : \mu = 2.0$, $H_1 : \mu < 2.0$.
(b) In case $\mu = 1.9$, the alternate hypothesis is true. So the only error that we could possibly make is to fail to reject $H_0$, that is, to accept the false $H_0$. Thus only a type 2 error is possible in this case.
(c) In case $\mu = 2.0$, $H_0$ holds. So the only error that we could possibly make is to reject $H_0$. Thus only a type 1 error is possible in this case.
Q 7.40. A manufacturer of 4 speed clutches for automobiles claims that the clutch will not fail until after 50,000 miles.
(a) Interpreting this as a statement about the mean, formulate the null and alternate hypotheses for verifying the claim.
(b) What error could be made if $\mu = 55{,}000$?
(c) What error could be made if $\mu = 50{,}000$?
Solution. (a) $H_0 : \mu = 50{,}000$, $H_1 : \mu > 50{,}000$.
(b) In case $\mu = 55{,}000$, the alternate hypothesis is true. So the only error that we could possibly make is to reject $H_1$, that is, to accept the false $H_0$. Thus only a type 2 error is possible in this case.
(c) In case $\mu = 50{,}000$, $H_0$ holds. So the only error that we could possibly make is to reject $H_0$. Thus only a type 1 error is possible in this case.
Q 7.41. An airline claims that the typical flying time between two cities is 56 minutes.
(a) Formulate a test of hypothesis with the intent of establishing that the population mean flying time is different from the published time of 56 minutes.