Chapter 9 Preliminary Concepts on Statistical Inference

In descriptive statistics, the use of single measures to describe a set of data or a distribution was introduced. These measures are the measures of central tendency and the measures of variability. The central measures include the mean, the median and the mode, while the range, quartile range, mean deviation, variance, standard deviation and the coefficient of variation are the measures of variability.

The other branch of statistics is inferential statistics. This branch enables us to make estimates of population values, called parameters, and to make statements about computed statistics that are acceptable to some degree of confidence. The statistical method concerned with making estimates of population values is called statistical inference. This method helps us determine how accurate and acceptable our generalizations are.

Statistics plays an important role in the field of applied scientific research. To get data or information about the population, the researcher may just use a portion of it in order to reduce the cost and time constraints. Statistics offers varied tools and techniques that help the researcher draw reliable and valid inferences or generalizations about the population using the sample as basis.

At this point, certain basic concepts need to be clarified in order to understand and appreciate the concept of statistical inference better. The following is a discussion of the two sub-areas of inferential statistics, namely statistical estimation and the test of hypothesis.

Statistical Estimation
In statistical estimation, we also consider a population and a sample. Recall that a population is an aggregate of persons, objects, events, places and actions to certain stimuli that have a unique pattern of qualities. Sometimes, this is referred to as the universe in statistical investigation. A sample, however, is a portion or a smaller part of the population that truly represents the unique qualities or characteristics of the population. The acceptability of the sample depends on how well the sampling technique has been selected and employed.

For example, in the study on the adjustment problems of freshman students from the Department of Liberal Arts, the researchers considered all the students coming from all the seven (7) programs. However, they opted not to use all the students on account of their big number. They used Slovin's formula to determine the appropriate sample size. Stratified proportional sampling was employed to determine the number of students per program. A test on Personal Adjustment Inventory was administered, and the researchers used the test results to report the problems encountered regarding the adjustment of the entire group of freshman students belonging to the department.
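As a minimal sketch of the sample-size step, the snippet below applies Slovin's formula, n = N / (1 + N e^2), to a population of 1,000 with a 5% margin of error, the values used in this chapter's example. The function name `slovin_sample_size` is an illustrative assumption, and rounding up to a whole respondent is a common convention rather than something the text states.

```python
import math


def slovin_sample_size(population_size: int, margin_of_error: float) -> int:
    """Slovin's formula n = N / (1 + N * e^2), rounded up to a whole respondent."""
    raw = population_size / (1 + population_size * margin_of_error ** 2)
    return math.ceil(raw)


# Population of 1,000 students, 5% margin of error (values from the chapter's example).
n = slovin_sample_size(1000, 0.05)
print(n)
```

The proportional share for each program would then be that program's fraction of the population multiplied by n.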
Inferential statistics help facilitate our work. Imagine, for instance, that the population of the department cited in the example above was 1,000 students distributed among seven programs. It would be extremely laborious for the researchers to involve all the students. They can make their work easier by drawing a representative sample for each program that is proportional to the population size of that program. The total sample size will be only 286 students, applying Slovin's formula and using a 5% margin of error. And yet, they can report the results as reflective of the adjustment of the whole group of freshman students.

Parameter and Statistics
Statistical inference also deals with the concepts of parameters and statistics, which are involved in estimation and, further, in testing hypotheses. Parameters exist whether they are computed or not. In practice, these parameters are the attributes of a population. The numerical descriptive measures of central tendency such as the mean $\mu$, median $\tilde{\mu}$ and mode $\hat{\mu}$, and the measures of variation including the variance $\sigma^2$ and the standard deviation $\sigma$, are not known unless we invoke probability sampling techniques. The Greek symbols used here are read as mu for the central measures and sigma for the measures of variation. Statistics are the computed measures about the sample. The sample mean is symbolized as $\bar{x}$ and the sample standard deviation by $s$. The term statistics is synonymous with the concept of estimates.

Sampling Methods Revisited
The degree to which a particular statistic approximates its corresponding parameter value depends upon how impartially we have drawn our sample. Sampling theory is based on the theoretical use of the word "random". Random sampling, which is the most commonly used sampling technique, has two properties. The first is equiprobability, which means that each member of the population has an equal chance of being drawn and included in the sample. If, for instance, there are 500 members in the population, the probability of each member being drawn is $\frac{1}{500}$. This is especially true when sampled cases are replaced or returned to the original pool. The second property is independence. This means that the chance of one member being drawn does not affect the chances of the other members getting chosen. For example, in a population where there are father and son members not necessarily paired, when the father is drawn, this does not mean that the son will automatically be included in the sample. Hence, the selection of the father is independent of the inclusion of the son in this sample selection.
Examples of the first type, non-probability sampling, are accidental sampling, quota sampling and purposive sampling. The convenience and economical use of this type are its advantages.
The second type is probability sampling, wherein every individual has an equal chance of becoming a part of the sample. Examples under this type include simple random sampling, stratified random sampling, cluster sampling, systematic sampling and multistage sampling. It is also noted that whenever the sampling is not carried out as in probability sampling, the result is a biased sample.

The sampling error is the difference between a particular parameter value and its corresponding statistic. Suppose that someone administered an intelligence test to a random sample of incoming freshmen in a certain college and computed the mean. The statistic $\bar{x}$ is an estimate of the parameter mean $\mu$. The sampling error, denoted as $e$, is the difference between the population mean and the sample mean; thus $\mu - \bar{x} = e$.
Point and Interval Estimation
There are two types of estimator: the point estimator and the interval estimator. The point estimator is a rule or formula that uses the gathered information to give a single value for estimating a particular parameter. An example of a point estimator is the sample mean $\bar{x}$, which estimates the value of the population mean, $\mu$. The second type, the interval estimator, is a rule or formula that gives a set of values computed from the gathered information, indicating the range of values in which the parameter to be estimated will lie.
Standard Error
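The standard error of the distribution of the means is denoted by $SE(\bar{x})$. If it were possible to draw a sample from a population 100 times and each time the mean of each sample is computed, the standard deviation of those sample means would be the standard error of the mean. From a single sample it is estimated as

\[ SE(\bar{x}) = \frac{s}{\sqrt{n}} \tag{1} \]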
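The idea of drawing a sample 100 times can be sketched with a small simulation that compares the spread of the 100 sample means with the value of s/√n computed from one sample. The population below (10,000 normally distributed scores with mean 80 and standard deviation 16) is a hypothetical assumption made only for illustration.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is repeatable

# Hypothetical population of 10,000 scores (mean 80 and SD 16 are assumptions).
population = [random.gauss(80, 16) for _ in range(10_000)]

n = 100            # size of each sample
repetitions = 100  # number of samples drawn

# Mean of each of the 100 samples.
sample_means = [
    statistics.mean(random.sample(population, n)) for _ in range(repetitions)
]

# Spread of the sample means: the empirical standard error.
empirical_se = statistics.stdev(sample_means)

# s / sqrt(n) computed from a single sample.
one_sample = random.sample(population, n)
formula_se = statistics.stdev(one_sample) / n ** 0.5

print(round(empirical_se, 2), round(formula_se, 2))
```

Both numbers should land near 16/√100 = 1.6, which suggests that a single sample is enough to approximate how much sample means vary.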
The acceptability of the representativeness of a particular sample is determined by the magnitude of the standard error. Furthermore, the formula above indicates that the magnitude of the standard error of the distribution of the means depends on two measures. The first is the standard deviation, or variability of scores around the mean, and the second is the size of the sample, or the number of cases being studied.

Hypothesis Testing
The goal of hypothesis testing is not to question the computed value of the sample statistic but to make a judgment about the difference between the sample statistic and a hypothesized population parameter. Hypothesis testing enables a researcher to generalize to a population from relatively small samples. In many instances, a researcher can only rely on the information provided by a part of the population.

A hypothesis is a tentative explanation for certain events, phenomena or behaviors. It is a statement of prediction of the relationship between or among variables. It is also the most specific statement of a problem, in which the variables are considered measurable and the statement specifies how these variables are related. Furthermore, this statement is testable, which means that the relationship between the variables can be put to the test by applying an appropriate statistical test to the data gathered about the variables.

Null and Alternative Hypothesis
There are two kinds of hypotheses, the null hypothesis and the alternative hypothesis. The null hypothesis, which is denoted as $H_0$, is the statement of equality indicating that no relationship exists between the variables under study. This statement is tested for the purpose of being accepted or rejected. Examples of a null hypothesis are given below.

Example 1. The Mathematics ability test scores of the control group do not differ from those of the experimental group.
Example 2. The job performance of a group of employees working in a class A hotel is independent of their working condition.
Example 3. The scholastic competition among freshman students has no relationship with their academic achievement.
Example 4. There is no difference in the college entrance examination scores obtained by the students in the public and private schools.
The alternative hypothesis, which is denoted as $H_a$, is also termed the research hypothesis. It is a statement of the expectation derived from the theory under study. It may specify only the existence of a difference, in which case it is termed a non-directional alternative hypothesis. Examples of a non-directional alternative hypothesis are given below.

Example 5. The Mathematics ability test scores of the control group differ from those of the experimental group.
Example 6. The job performance of a group of employees working in a class A hotel is related to their working condition.
Example 7. The scholastic competition among freshman students has a relationship with their academic achievement.
Example 8. There is a difference in the college entrance examination scores obtained by the students in the public and private schools.
It can also be a predictive hypothesis, which specifies that one group performs better than the other and is therefore termed a directional alternative hypothesis.

Example 9. The Mathematics ability test scores of the control group are lower than those of the experimental group.
Example 10. High scores in the mental ability test correspond to high scores on the self-concept test.
Example 11. Exposure to time pressure has a negative effect on students' reading comprehension skills.
Example 12. The brand of cellular phone used by college students in XYZ University has a positive effect in developing one's self-image.

Directional and Non-directional Tests of Hypothesis
The non-directional test of hypothesis is also referred to as the two-tailed test. It makes use of the two opposite sides or tails of the statistical model or distribution. This indicates that no assertion is made as to whether the difference falls within the positive or negative end of the distribution. The illustration of this test at the 5% level of significance is presented on the next page.

The directional test of hypothesis is also referred to as the one-tailed test. It makes use of only one side or tail of the statistical model or distribution, and can be a left-tailed or a right-tailed test. This indicates that an assertion is made as to whether the difference falls within the positive end of the distribution (right-tailed test) or the negative end (left-tailed test). The illustration of the right-tailed test at the 5% level of significance is presented in Figure 9.2 and the left-tailed test at the same 5% level of significance is presented in Figure 9.3.

Critical Region and Critical Value
The critical region is a set of values of the test statistic (computed from the gathered data set) that is chosen before the experiment to define the conditions under which the null hypothesis will be rejected. The critical value or values separate the critical region from the values of the test statistic that would not lead to the rejection of the null hypothesis. The critical values depend on the nature of the null hypothesis, the relevant sampling distribution and the level of significance.

In two-tailed tests, the level of significance $\alpha$ is divided equally between the two tails that constitute the critical region. For example, in a two-tailed test with a significance level of $\alpha = 5\%$, there is an area of 2.5% in each of the two tails, as shown in Figure 9.1. On the other hand, for one-tailed tests, the whole level of significance constitutes the critical region, which can be found on either the left or the right tail of the distribution.

Type I and Type II Errors
In testing the null hypothesis, our conclusion is either to reject it or to accept it. Correct decisions happen when we reject a null hypothesis that is false or when we accept a null hypothesis that is true. Otherwise, the decisions are wrong; that is, when we reject a true null hypothesis or when we accept a false null hypothesis. These two possible ways of making a wrong decision give two different types of error in statistical decision making. These errors are not miscalculations or procedural missteps. They are actual errors that can occur when a rare event happens by chance.

The first type is the type I error. This is the chance of rejecting the null hypothesis when it is true. It is also referred to as the significance level, and the Greek symbol alpha ($\alpha$) denotes the probability of a type I error. The common values for $\alpha$ are 1%, 5% and 10%. The second type is the type II error. This is the chance of failing to reject the null hypothesis when it is false. The Greek symbol beta ($\beta$) denotes the probability of a type II error.

Confidence Interval
In the previous section, we discussed the chance of committing a type I error, which is denoted as $\alpha$. This is also referred to as the level of significance. A confidence level is denoted as $1 - \alpha$, which represents the chance of accepting the null hypothesis when in fact it is true. This is usually attached to the notion of interval estimation, in which we attach a degree of certainty that the parameter we are estimating will lie in the interval between the lower and upper bounds. The common values used for the confidence level are 90%, 95% and 99%.

A confidence interval is constructed when we attach a confidence level to the interval estimate of a particular parameter. A general formula for constructing a $100 \times (1 - \alpha)\%$ confidence interval for the parameter being estimated is given by:

\[ \text{Estimator} \pm \text{S.E.}(\text{estimator}) \times \text{critical value}_{\alpha/2} \tag{2} \]
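The general interval formula can be sketched as below, assuming a Z critical value is appropriate (a t critical value would be read from a table instead). The function name `confidence_interval` and the inputs, an estimate of 85 with a standard error of 2.5, are illustrative assumptions.

```python
from statistics import NormalDist


def confidence_interval(estimate, standard_error, alpha=0.05):
    """Return (lower, upper) for a 100*(1 - alpha)% interval using a Z critical value."""
    # Critical value that leaves alpha/2 in each tail of the standard normal curve.
    critical = NormalDist().inv_cdf(1 - alpha / 2)
    margin = critical * standard_error
    return estimate - margin, estimate + margin


# Illustrative inputs: estimate 85 (percent) with standard error 2.5 (percent).
lower, upper = confidence_interval(85.0, 2.5, alpha=0.05)
print(round(lower, 1), round(upper, 1))
```

Lowering alpha to 0.01 widens the interval, since a larger critical value is needed for 99% confidence.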
Steps in Performing Hypothesis Testing

The following are the steps in performing hypothesis testing:

1. State the null and the alternative hypothesis of the given problem.
2. Determine the level of significance and the direction of the test. The direction of the test is based on whether the alternative hypothesis is stated as a left-tailed, right-tailed or two-tailed test.
3. Determine the appropriate statistical test based on the level of measurement of the gathered data.
4.

Testing Hypothesis for the Mean (Single Sample Case)
In testing the hypothesis for only a single mean, we have a hypothesized mean ($\mu_0$). Symbolically, the null hypothesis is written as:

\[ H_0: \mu = \mu_0 \tag{3} \]

stating that the mean is equal to the hypothesized mean. On the other hand, the alternative hypothesis is written symbolically as one of:

\[ H_a: \mu \neq \mu_0 \tag{4} \]
\[ H_a: \mu > \mu_0 \tag{5} \]
\[ H_a: \mu < \mu_0 \tag{6} \]

The expression in (4) is used when the alternative hypothesis is non-directional and hence calls for a two-tailed test. The remaining expressions (5) and (6) are used when the alternative hypothesis is directional and hence calls for a one-tailed test, which is a right-tailed or left-tailed test respectively.

The decision rule is stated as follows: Reject the null hypothesis if the absolute value of the test statistic exceeds the critical value. Otherwise, accept the null hypothesis.

Population Variance is Known
In order to draw inference on a mean in the one-population case, assuming that the entries are normally distributed and the variance is known, we use the Z-test. The Z-statistic, $Z_c$, is the test statistic used to decide on the rejection of the null hypothesis in favor of the alternative hypothesis. It is computed as follows:

\[ Z_c = \frac{\bar{x} - \mu_0}{\sigma / \sqrt{n}} \tag{7} \]

where $\sigma$ is the population standard deviation, which is known or given, and $n$ is the sample size. The critical value is obtained using the Z-tabular value located in Appendix 2. For a two-sided test we consider the value of $1 - \frac{\alpha}{2}$, and the critical value is symbolically written as $Z_{\alpha/2}$. Otherwise, for a one-sided test, we consider the value of $1 - \alpha$, and the critical value is symbolically represented as $Z_{\alpha}$.

Example 1
A random sample of 100 students enrolled in a Statistics course under Professor XYZ shows that the average grade in the midterm examination is 85%. Professor XYZ claims that the average grade of the students in the midterm examination is at least 80% with a standard deviation of 16%. Is there evidence to say that the claim of Professor XYZ is correct at the 5% level of significance?

Solution: The null hypothesis is stated as $H_0: \mu = 80\%$, which means that the average grade of Professor XYZ's students is 80%, and the alternative hypothesis as $H_a: \mu > 80\%$, which means that the average grade of Professor XYZ's students is greater than 80%. Since the last statement of the given problem asserts that the difference falls on the positive end of the distribution, we consider this a one-tailed test. Thus, our decision rule is to reject the null hypothesis if $Z_c > Z_{\alpha}$. Otherwise, accept the null hypothesis. The test statistic together with the Z-tabular value is computed as follows:

\[ Z_c = \frac{0.85 - 0.80}{0.16 / \sqrt{100}} = \frac{0.05}{0.016} = 3.125 \]

So that $Z_c = 3.125$ against the Z-tabular value of $Z_{0.05} = 1.645$. Our decision based on this computation is to reject the null hypothesis in favor of the alternative hypothesis. Thus, we conclude that the claim of Professor XYZ is true, since the average grade of his students in Statistics is greater than 80% at the 5% level of significance.

Example 2
A random sample of 100 recorded deaths in the U.S. during the past year showed an average life span of 71.8 years. Assuming a population standard deviation of 8.9 years, does this seem to indicate that the mean life span today is greater than 70 years? Use a 0.05 level of significance.

1.) $H_0: \mu = 70$; $H_a: \mu > 70$
2.) Z-test (right-tailed)
3.) $\alpha = 0.05$; $1 - 0.05 = 0.95$; $Z_{tab} = 1.645$
4.) Reject $H_0$ if $Z_c > Z_{tab}$
5.) \[ Z_c = \frac{(71.8 - 70)\sqrt{100}}{8.9} = 2.02 \]
    $P(z > 2.02) = 1 - P(z < 2.02) = 1 - 0.9783 = 0.0217$
    Note: The P-value is the lowest level of significance at which the observed value of the test statistic is significant.
6.) $Z_c > Z_{tab}$; reject $H_0$
7.) The mean life span today is greater than 70 years.

Example 3
A manufacturer of sports equipment has developed a new synthetic fishing line that he claims has a mean breaking strength of 8 kg with $\sigma = 0.5$ kg. Test the hypothesis that $\mu = 8$ kg against $\mu \neq 8$ kg if a random sample of 50 lines is tested and found to have a mean breaking strength of 7.8 kg. Use a 0.01 level of significance.

1.) $H_0: \mu = 8$; $H_a: \mu \neq 8$
2.) Z-test (two-tailed)
3.) $\alpha = 0.01$; $\alpha/2 = 0.005$; $Z_{tab1} = -2.575$, $Z_{tab2} = 2.575$
4.) Reject $H_0$ if $Z_c < Z_{tab1}$ or if $Z_c > Z_{tab2}$
5.) \[ Z_c = \frac{(7.8 - 8)\sqrt{50}}{0.5} = -2.828 \]
    $P(z < -2.828) = 0.0023$; $(0.0023)(2) = 0.0046$
6.) $Z_c < Z_{tab1}$; reject $H_0$
7.) The mean breaking strength is not equal to 8 kg.
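The three worked Z-test examples above can be checked with a short sketch. The helper `z_test` and its `tail` argument are illustrative names, not part of the text; critical values come from Python's standard normal distribution rather than the Z-table in the appendix, so they may differ from the tabular values in the third decimal place.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal distribution (mean 0, SD 1)


def z_test(sample_mean, mu0, sigma, n, alpha, tail):
    """One-sample Z-test; tail is 'right', 'left' or 'two'."""
    z = (sample_mean - mu0) / (sigma / sqrt(n))
    if tail == "right":
        critical = STD_NORMAL.inv_cdf(1 - alpha)
        p_value = 1 - STD_NORMAL.cdf(z)
        reject = z > critical
    elif tail == "left":
        critical = STD_NORMAL.inv_cdf(alpha)
        p_value = STD_NORMAL.cdf(z)
        reject = z < critical
    else:
        critical = STD_NORMAL.inv_cdf(1 - alpha / 2)
        p_value = 2 * (1 - STD_NORMAL.cdf(abs(z)))
        reject = abs(z) > critical
    return z, critical, p_value, reject


# Inputs taken from the three examples above.
grades = z_test(0.85, 0.80, 0.16, 100, 0.05, "right")
life_span = z_test(71.8, 70, 8.9, 100, 0.05, "right")
fishing_line = z_test(7.8, 8, 0.5, 50, 0.01, "two")

for label, (z, critical, p_value, reject) in [
    ("grades", grades),
    ("life span", life_span),
    ("fishing line", fishing_line),
]:
    print(label, round(z, 3), round(critical, 3), round(p_value, 4), reject)
```

Returning the P-value alongside the critical-value decision makes it easy to see that the two approaches agree for each example.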
TWO MEAN TEST

\[ Z_c = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}} \]
Example 1
An admission test was administered to incoming freshmen in 2 colleges. Two independent samples of 150 students each are randomly selected, and the mean scores of the given samples are $\bar{X}_1 = 88$ and $\bar{X}_2 = 85$. Assume that the variances of the test scores are 40 and 35 respectively. Is the difference between the mean scores significant, or can it be attributed to chance? Use a 0.01 level of significance.

1.) $H_0: \mu_1 = \mu_2$; $H_a: \mu_1 \neq \mu_2$
2.) Z-test (two-tailed)
3.) $\alpha = 0.01$; $Z_{tab1} = -2.58$, $Z_{tab2} = 2.58$
4.) Reject $H_0$ if $Z_c < Z_{tab1}$ or if $Z_c > Z_{tab2}$
5.) \[ Z_c = \frac{88 - 85}{\sqrt{\dfrac{40}{150} + \dfrac{35}{150}}} = 4.2426 \]
6.) $Z_c > Z_{tab2}$; reject $H_0$
7.) There is a significant difference between the mean scores; it cannot be attributed to chance.
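The two-mean Z statistic for the admission test example can be sketched as follows. The function name `two_sample_z` is an illustrative assumption, and the critical value is computed from the standard normal distribution instead of being read from the table.

```python
from math import sqrt
from statistics import NormalDist


def two_sample_z(mean1, mean2, var1, var2, n1, n2):
    """Z statistic for two independent samples with known variances."""
    return (mean1 - mean2) / sqrt(var1 / n1 + var2 / n2)


# Means 88 and 85, variances 40 and 35, 150 students per sample.
z = two_sample_z(88, 85, 40, 35, 150, 150)

# Two-tailed critical value at the 0.01 level of significance.
critical = NormalDist().inv_cdf(1 - 0.01 / 2)
reject = abs(z) > critical

print(round(z, 4), round(critical, 2), reject)
```

Because the computed statistic lies well beyond the critical value, the comparison falls in the rejection region.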
Population Variance is Unknown
In order to draw inference on a mean in the one-population case, assuming that the entries are normally distributed but the variance is unknown, we use the t-test. The test statistic used to decide on the rejection of the null hypothesis in favor of the alternative hypothesis is the t-statistic, $t_c$, which is computed as follows:

\[ t_c = \frac{\bar{x} - \mu_0}{s / \sqrt{n}} \]

where $s$ is the sample standard deviation, which is computed or given, and $n$ is the sample size. The critical value is obtained using the t-tabular value located in Appendix 5. For a two-sided test we look for the value of df, which is referred to as the degrees of freedom, and the critical value is symbolically written as $t_{\alpha/2}(n-1)$. Otherwise, for a one-sided test, we look at the column of $\alpha$ and look for the value of df, and the critical value is symbolically written as $t_{\alpha}(n-1)$.

Example 1
A random sample of 100 students enrolled in a Statistics course under Professor XYZ shows that the average grade in the midterm examination is 85% with a computed standard deviation of 25%. Professor XYZ claims that the average grade of the students in the midterm examination is at least 80%. Test the claim of the professor at the 5% level of significance.
Solution: The null hypothesis is stated as $H_0: \mu = 80\%$, which means that the average grade of Professor XYZ's students is 80%, and the alternative hypothesis as $H_a: \mu \neq 80\%$, which means that the average grade of Professor XYZ's students is not 80%. Since the last statement of the given problem does not assert whether the difference falls within the positive or negative end, we consider this a non-directional test. Thus, our decision rule is to reject the null hypothesis if $|t_c| > t_{\alpha/2}(n-1)$. Otherwise, accept the null hypothesis. The test statistic together with the t-tabular value is computed as follows:

\[ t_c = \frac{0.85 - 0.80}{0.25 / \sqrt{100}} = \frac{0.05}{0.025} = 2.000 \]

So that $t_c = 2.000$ against the t-tabular value of $t_{0.025}(99) = 1.960$. Notice that the values of n in the t-table go only up to 30, followed by INF (infinity), which is used for n greater than 30. Our decision based on this computation is to reject the null hypothesis in favor of the alternative hypothesis. Thus, we conclude that the claim of Professor XYZ is not true, since the average grade obtained by his students in Statistics is not 80%.

A $100 \times (1 - \alpha)\%$ confidence interval is constructed whenever the null hypothesis of the two-tailed test is rejected. Otherwise, this confidence interval will not be constructed. In order to determine the possible values in which the true average grade will lie, we will construct a 95% confidence interval. Using formula (2), we have the estimator for the mean, which is the average grade of the students, 85%; the tabular value of 1.960; and the standard error of the estimate, given by

\[ \frac{0.25}{\sqrt{100}} = 0.025 \]

which is equivalent to 2.5%.
Using the results on Example 8.14, the resulting confidence interval is given as 85% ± (2.5%)(1.96) = 85% ± 4.9%. That is, with an attached 95% confidence coefficient, the true average grade obtained by Professor XYZ's students lies within 80.1% to 89.9%, which is entirely higher than the hypothesized value of 80%.

Example 2
A random sample of 25 female high school students shows that their average body mass index (BMI) is about 18 points with a standard deviation of 4.5 points. Test the hypothesis that the average BMI of the female high school students is lower than 19 points at the 5% level of significance.

Solution: The null hypothesis is stated as $H_0: \mu = 19$, against the alternative hypothesis of $H_a: \mu < 19$. The last statement of the given problem asserts that the difference falls within the negative end of the distribution, so this is considered a left-tailed test. Thus, our decision rule is to reject the null hypothesis if $|t_c|$ exceeds the t-tabular value. The t-statistic together with the t-tabular value is computed as follows:

\[ |t_c| = \left| \frac{18 - 19}{4.5 / \sqrt{25}} \right| = \left| \frac{-1}{0.9} \right| = 1.111 \quad \text{versus} \quad t_{0.05}(25 - 1) = t_{0.05}(24) = 1.711 \]

Our decision based on this computation is to accept the null hypothesis. Thus, we conclude that the average BMI of female high school students is about 19 points.

Example 3
The manager of a car rental agency claims that the average mileage of cars rented is less than 8000. A sample of 5 automobiles has an average mileage of 7723, with a standard deviation of 500 miles. At $\alpha = 0.01$, is there enough evidence to reject the manager's claim?

1.) $H_0: \mu = 8000$; $H_a: \mu < 8000$
2.) t-test (left-tailed)
3.) $\alpha = 0.01$; $df = n - 1 = 5 - 1 = 4$; $t_{tab} = -3.747$
4.) Reject $H_0$ if $t_c < t_{tab}$
5.) \[ t_c = \frac{(7723 - 8000)\sqrt{5}}{500} = -1.238 \]
6.) Since $t_c > t_{tab}$, accept $H_0$
7.) The manager's claim is not supported.
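The three t-test examples above can be reproduced with the sketch below. Python's standard library has no t-distribution quantile function, so the tabular values (1.960, 1.711 and 3.747) are passed in as read from the text; the helper names `t_statistic` and `reject_null` are illustrative assumptions.

```python
from math import sqrt


def t_statistic(sample_mean, mu0, s, n):
    """t_c = (x_bar - mu0) / (s / sqrt(n)), with n - 1 degrees of freedom."""
    return (sample_mean - mu0) / (s / sqrt(n))


def reject_null(t_c, t_tab, tail):
    """t_tab is the positive tabular value; tail is 'two', 'left' or 'right'."""
    if tail == "two":
        return abs(t_c) > t_tab
    if tail == "left":
        return t_c < -t_tab
    return t_c > t_tab


# (label, computed t, tabular value from the text, direction of the test)
cases = [
    ("grades", t_statistic(0.85, 0.80, 0.25, 100), 1.960, "two"),
    ("BMI", t_statistic(18, 19, 4.5, 25), 1.711, "left"),
    ("mileage", t_statistic(7723, 8000, 500, 5), 3.747, "left"),
]

for label, t_c, t_tab, tail in cases:
    print(label, round(t_c, 3), reject_null(t_c, t_tab, tail))
```

Keeping the tabular value as an explicit input mirrors the manual procedure, where the critical value is looked up by degrees of freedom and level of significance.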
In testing two small samples:

\[ t_c = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\dfrac{n_1 S_1^2 + n_2 S_2^2}{n_1 + n_2 - 2} \cdot \dfrac{n_1 + n_2}{n_1 n_2}}} \]
Example 4
Two samples are randomly selected from two groups of students who have been taught using different teaching methods. An examination is given and the results are shown below.

Group 1              Group 2
$n_1 = 8$            $n_2 = 10$
$\bar{X}_1 = 85$     $\bar{X}_2 = 87$
$S_1^2 = 46$         $S_2^2 = 66$

Using a 0.05 level of significance, can we conclude that the two different teaching methods are equally effective?

1.) $H_0: \mu_1 = \mu_2$; $H_a: \mu_1 \neq \mu_2$
2.) t-test (two-tailed test)
3.) $\alpha = 0.05$; $df = n_1 + n_2 - 2 = 8 + 10 - 2 = 16$; $t_{tab1} = -2.120$, $t_{tab2} = 2.120$
4.) Reject $H_0$ if $t_c < t_{tab1}$ or if $t_c > t_{tab2}$
5.) \[ t_c = \frac{85 - 87}{\sqrt{\dfrac{8(46) + 10(36)}{8 + 10 - 2} \cdot \dfrac{8 + 10}{8(10)}}} \]
    $t_c = -0.651$
6.) Since $t_c > t_{tab1}$, accept $H_0$
7.) The two teaching methods are equally effective.
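A minimal sketch of the two-small-samples statistic follows, using the pooled form given above. The input values (sample sizes 12 and 15, means 50 and 47, variances 20 and 25) are hypothetical and chosen only for illustration; the tabular value 2.060 is the standard two-tailed t-table entry for 25 degrees of freedom at the 0.05 level.

```python
from math import sqrt


def pooled_t(mean1, mean2, var1, var2, n1, n2):
    """t statistic for two small independent samples with a pooled variance."""
    pooled = (n1 * var1 + n2 * var2) / (n1 + n2 - 2)  # pooled variance term
    scale = (n1 + n2) / (n1 * n2)                     # same as 1/n1 + 1/n2
    return (mean1 - mean2) / sqrt(pooled * scale)


# Hypothetical data, not taken from the chapter.
n1, n2 = 12, 15
t_c = pooled_t(50, 47, 20, 25, n1, n2)
df = n1 + n2 - 2

t_tab = 2.060  # two-tailed tabular value for df = 25 at the 0.05 level
reject = abs(t_c) > t_tab

print(round(t_c, 4), df, reject)
```

With these hypothetical inputs the statistic stays inside the two critical values, so the null hypothesis of equal means would be accepted.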
EXERCISES

1. Suppose that an allergist wishes to test the hypothesis that at least 30% of the public is allergic to some cheese products. Explain how the allergist could commit
(a) a type I error; (b) a type II error.
Answer: ((a) Conclude that fewer than 30% of the public are allergic to some cheese products when, in fact, 30% or more are allergic. (b) Conclude that at least 30% of the public are allergic to some cheese products when, in fact, fewer than 30% are allergic.)
2. A sociologist is concerned about the effectiveness of a training course designed to get more drivers to use seat belts in automobiles.
(a)
(a)
4. The proportion of adults living in a small town who are college graduates is estimated to be p = 0.6. To test this hypothesis, a random sample of 15 adults is selected. If the number of college graduates in our sample is anywhere from 6 to 12, we shall not reject the null hypothesis that p = 0.6; otherwise, we shall conclude that p ≠ 0.6.
(a) Evaluate α assuming that p = 0.6. Use the binomial distribution.
(b) Evaluate β for the alternatives p = 0.5 and p = 0.7.
(c) Is this a good test procedure?
Answer: ((a) α = P(X ≤ 5 | p = 0.6) + P(X ≥ 13 | p = 0.6) = 0.0338 + (1 − 0.9729) = 0.0609. (b) β = P(6 ≤ X ≤ 12 | p = 0.5) = 0.9963 − 0.1509 = 0.8454. β = P(6 ≤ X ≤ 12 | p = 0.7) = 0.8732 − 0.0037 = 0.8695. (c) This test procedure is not good for detecting differences of 0.1 in p.)
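The type I and type II error probabilities in the answer above can be verified with exact binomial sums, as in this sketch. The helper names `binom_cdf` and `prob_fail_to_reject` are illustrative assumptions; the non-rejection region of 6 to 12 graduates out of 15 comes from the exercise.

```python
from math import comb


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def prob_fail_to_reject(n, p, low=6, high=12):
    """P(low <= X <= high): chance the count lands in the non-rejection region."""
    return binom_cdf(high, n, p) - binom_cdf(low - 1, n, p)


# Type I error: the null value p = 0.6 is true but the count falls outside 6..12.
alpha = 1 - prob_fail_to_reject(15, 0.6)

# Type II error: an alternative is true but the count still falls inside 6..12.
beta_05 = prob_fail_to_reject(15, 0.5)
beta_07 = prob_fail_to_reject(15, 0.7)

print(round(alpha, 4), round(beta_05, 4), round(beta_07, 4))
```

The large values of β for alternatives only 0.1 away from 0.6 illustrate why the answer calls this a poor procedure for a sample of 15.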
5. Repeat Exercise 10.4 when 200 adults are selected and the fail-to-reject region is defined to be 110 ≤ x ≤ 130, where x is the number of college graduates in our sample. Use the normal approximation.
Answer: ((a) α = P(X < 110 | p = 0.6) + P(X > 130 | p = 0.6) = P(Z < −1.52) + P(Z > 1.52) = 2(0.0643) = 0.1286. (b) β = P(110 < X < 130 | p = 0.5) = P(1.34 < Z < 4.31) = 0.0901. β = P(110 < X < 130 | p = 0.7) = P(−4.71 < Z < −1.47) = 0.0708. (c) The probability of a Type I error is somewhat high for this procedure, although Type II errors are reduced dramatically.)
6. A fabric manufacturer believes that the proportion of orders for raw material arriving late is p = 0.6. If a random sample of 10 orders shows that 3 or fewer arrived late, the hypothesis that p = 0.6 should be rejected in favor of the alternative p < 0.6. Use the binomial distribution.
(a) Find the probability of committing a type I error if the true proportion is p = 0.6.
(b) Find the probability of committing a type II error for the alternatives p = 0.3, p = 0.4, and p = 0.5.
Answer: ((a) α = P(X ≤ 3 | p = 0.6) = 0.0548. (b) β = P(X > 3 | p = 0.3) = 1 − 0.6496 = 0.3504. β = P(X > 3 | p = 0.4) = 1 − 0.3823 = 0.6177. β = P(X > 3 | p = 0.5) = 1 − 0.1719 = 0.8281.)
7. Repeat Exercise 10.6 when 50 orders are selected and the critical region is defined to be x ≤ 24, where x is the number of orders in our sample that arrived late. Use the normal approximation.
Answer: ((a) α = P(X ≤ 24 | p = 0.6) = P(Z < −1.59) = 0.0559. (b) β = P(X > 24 | p = 0.3) = P(Z > 2.93) = 1 − 0.9983 = 0.0017. β = P(X > 24 | p = 0.4) = P(Z > 1.30) = 1 − 0.9032 = 0.0968. β = P(X > 24 | p = 0.5) = P(Z > −0.14) = 1 − 0.4443 = 0.5557.)

An electrical firm manufactures light bulbs that have a lifetime that is approximately normally distributed with a mean of 800 hours and a standard deviation of 40 hours. Test the hypothesis that μ = 800 hours against the alternative μ ≠ 800 hours if a random sample of 30 bulbs has an average life of 788 hours. Use a P-value in your answers.
Answer: (The hypotheses are H0: μ = 800, H1: μ ≠ 800. Now, z = (788 − 800)/(40/√30) = −1.64, and P-value = 2P(Z < −1.64) = (2)(0.0505) = 0.1010. Hence, the mean is not significantly different from 800 for α < 0.101.)

8. A random sample of 64 bags of white Cheddar popcorn weighed, on average, 5.23 ounces with a standard deviation of 0.24 ounces. Test the hypothesis that μ = 5.5 ounces against the alternative hypothesis, μ < 5.5 ounces, at the 0.05 level of significance.
Answer: (The hypotheses are H0: μ = 5.5, H1: μ < 5.5. The
Answer: (10.21 The hypotheses are H0: μ = 40 months, H1: μ < 40 months. Decision: reject H0.)

10.25 Test the hypothesis that the average content of containers of a particular lubricant is 10 liters if the contents of a random sample of 10 containers are 10.2, 9.7, 10.1, 10.3, 10.1, 9.8, 9.9, 10.4, 10.3, and 9.8 liters. Use a 0.01 level of significance and assume that the distribution of contents is normal.
Answer: (The hypotheses are H0: μ = 10, H1: μ ≠ 10. Decision: Fail to reject H0.)
10.26 According to a dietary study, a high sodium intake may be related to ulcers, stomach cancer, and migraine headaches. The human requirement for salt is only 220 milligrams per day, which is surpassed in most single servings of ready-to-eat cereals. If a random sample of 20 similar servings of a certain cereal has a mean sodium content of 244 milligrams and a standard deviation of 24.5 milligrams, does this suggest at the 0.05 level of significance that the average sodium content for a single serving of such cereal is greater than 220 milligrams? Assume the distribution of sodium contents to be normal.
Answer: (The hypotheses are H0: μ = 220 milligrams, H1: μ > 220 milligrams. Decision: Reject H0 and claim μ > 220 milligrams.)

10.27 A study at the University of Colorado at Boulder shows that running increases the percent resting metabolic rate (RMR) in older women. The average RMR of 30 elderly women runners was 34.0% higher than the average RMR of 30 sedentary elderly women, and the standard deviations were reported to be 10.5% and 10.2%, respectively.
#nswerA +The hypotheses are &1 A ?0 @ ?2, &0 A ?0 P ?2. &ence, the conclusion is that running increases the mean %B% in older women2:-23 #ccording to Chemical Engineering an important property of ber is its water absorbency. absorbency. The average percent absorbency of 25 randomly selected pieces of cotton ber was found to be 21 with a standard deviation of 0.5. # random sample of 25 pieces of acetate yielded an average percent of 02 with a standard deviation of 0.25. Is there strong evidence that the population mean percent absorbency for cotton ber is signicantly higher than the mean for acetate. #ssume that the percent absorbency is appro(imately normally distributed and that the population variances in percent absorbency for the two bers are the same. se a signicance level of 1.15.
#nswerA +The hypotheses are &1 A ?= @ ?#, &0 A ?= P ?#, The mean percent absorbency for the cotton cotton ber is signicantly higher hig her than the mean percent absorbency for acetate.01.2E /ast e(perience indicates that the time for high school seniors to complete a standardized test is a normal random variable with a mean of C5 minutes. If a
random sample of 21 high school seniors took an average of CC.0 minutes to complete this test with a standard deviation of D.C minutes, test the hypothesis at the 1.15 level of signicance that p @ C5 minutes against the alternative that p Q C5 minutes. #nswerA +The hypotheses are &1 A ? @ C5 minutes, &0 A ? Q C5 minutes. )ecisionA %e!ect & 1 and conclude that it takes less than C5 minutes, on the average, to take the test.2:-42 # manufacturer claims that the average tensile strength of thread th read # e(ceeds the average tensile strength of thread O by at least 02 kilograms. T To o test his claim, clai m, 51 pieces of each type of thread are tested under similar si milar conditions. Type Type # thread had an average tensile strength of 34. kilograms with known standard deviation of a A @ 4.23 kilograms, while type O thread had an average tensile strength of .3 kilograms with known standard deviation of an = 5.40 kilograms. Test the manufacturers claim ata @ 1.15.
#nswerA +hypotheses are &1 A ?# ` ?O @ 02 kilograms, &0 A ?# ` ?O P 02 kilograms. The average tensile strength of thread # does not e(ceed e(ceed the average tensile strength of of thread thread O by 02 kilograms.-
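The thread exercise is a two-sample z test with known standard deviations, and its conclusion can be verified numerically. The following is a sketch under stated assumptions: the sample figures are read as 86.7 and 77.8 kilograms with σA = 6.28 and σB = 5.61, and the alternative is taken to be lower-tailed (μA − μB < 12), which is the direction consistent with the stated conclusion. Variable names are illustrative only.

```python
import math
from statistics import NormalDist

# Assumed reading of the exercise data (kilograms).
mean_a, sigma_a = 86.7, 6.28   # thread A: sample mean, known sigma
mean_b, sigma_b = 77.8, 5.61   # thread B: sample mean, known sigma
n = 50                         # pieces tested of each type
d0 = 12.0                      # claimed difference under H0
alpha = 0.05

# Standard error of the difference between two independent sample means.
standard_error = math.sqrt(sigma_a ** 2 / n + sigma_b ** 2 / n)

# z statistic for H0: mu_A - mu_B = d0.
z_stat = ((mean_a - mean_b) - d0) / standard_error

# Lower-tailed test (assumption): reject H0 when z falls below the alpha quantile.
z_critical = NormalDist().inv_cdf(alpha)
p_value = NormalDist().cdf(z_stat)
reject_h0 = z_stat < z_critical

print(f"z = {z_stat:.2f}, critical value = {z_critical:.3f}, p-value = {p_value:.4f}")
print("Reject H0" if reject_h0 else "Fail to reject H0")
```

The statistic falls below the critical value, so under these assumptions the data do not support a difference of at least 12 kilograms.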
NONPARAMETRIC TESTS
Nonparametric tests are sometimes called distribution-free tests because they are based on fewer assumptions (e.g., they do not assume that the outcome is approximately normally distributed). Parametric tests involve specific probability distributions (e.g., the normal distribution) and the tests involve estimation of the key parameters of that distribution (e.g., the mean or difference in means) from the sample data. The cost of fewer assumptions is that nonparametric tests are generally less powerful than their parametric counterparts (i.e., when the alternative is true, they may be less likely to reject H0).

It can sometimes be difficult to assess whether a continuous outcome follows a normal distribution and, thus, whether a parametric or nonparametric test is appropriate. There are several statistical tests that can be used to assess whether data are likely from a normal distribution. The most popular are the Kolmogorov-Smirnov test, the Anderson-Darling test, and the Shapiro-Wilk test. Each test is essentially a goodness of fit test and compares observed data to quantiles of the normal (or other specified) distribution. The null hypothesis for each test is H0: Data follow a normal distribution versus H1: Data do not follow a normal distribution. If the test is statistically significant (e.g., p < 0.05), there is evidence that the data do not follow a normal distribution, and a nonparametric test is warranted.

It should be noted that these tests for normality can be subject to low power. Specifically, the tests may fail to reject H0: Data follow a normal distribution when in fact the data do not follow a normal distribution. Low power is a major issue when the sample size is small, which unfortunately is often when we wish to employ these tests. The most practical approach to assessing normality involves investigating the distributional form of the outcome in the sample using a histogram and augmenting that with data from other studies, if available, that may indicate the likely distribution of the outcome in the population.

There are some situations when it is clear that the outcome does not follow a normal distribution. These include situations:
• when the outcome is an ordinal variable or a rank,
• when there are definite outliers, or
• when the outcome has clear limits of detection.
Using an Ordinal Scale
Consider a clinical trial where study participants are asked to rate their symptom severity following 6 weeks on the assigned treatment. Symptom severity might be measured on a 5 point ordinal scale with response options: Symptoms got much worse, slightly worse, no change, slightly improved, or much improved. Suppose there are a total of n=20 participants in the trial, randomized to an experimental treatment or placebo, and the outcome data are distributed as shown in the figure below.

[Figure: Distribution of Symptom Severity in Total Sample]

The distribution of the outcome (symptom severity) does not appear to be normal as more participants report improvement in symptoms as opposed to worsening of symptoms.

In some studies, the outcome is a rank. For example, in obstetrical studies an APGAR score is often used to assess the health of a newborn. The score, which ranges from 0-10, is the sum of five component scores based on the infant's condition at birth. APGAR scores generally do not follow a normal distribution, since most newborns have scores of 7 or higher (normal range).

In some studies, the outcome is continuous but subject to outliers or extreme values. For example, days in the hospital following a particular surgical procedure is an outcome that is often subject to outliers. Suppose in an observational study investigators wish to assess whether there is a difference in the days patients spend in the hospital following liver transplant in for-profit versus nonprofit hospitals. Suppose we measure days in the hospital following transplant in n=100 participants, 50 from for-profit and 50 from non-profit hospitals. The number of days in the hospital are summarized by the box-whisker plot below.

[Figure: Distribution of Days in the Hospital Following Transplant]

Note that 75% of the participants stay at most 16 days in the hospital following transplant, while at least 1 stays 35 days, which would be considered an outlier. Recall from page 8 in the module on Summarizing Data that we used Q1 − 1.5(Q3 − Q1) as a lower limit and Q3 + 1.5(Q3 − Q1) as an upper limit to detect outliers. In the box-whisker plot above, Q1 = 12 and Q3 = 16, thus outliers are values below 12 − 1.5(16 − 12) = 6 or above 16 + 1.5(16 − 12) = 22.

Limits of Detection
In some studies, the outcome is a continuous variable that is measured with some imprecision (e.g., with clear limits of detection). For example, some instruments or assays cannot measure presence of specific quantities above or below certain limits. HIV viral load is a measure of the amount of virus in the body and is measured as the amount of virus per a certain volume of blood. It can range from "not detected" or "below the limit of detection" to hundreds of millions of copies. Thus, in a sample some participants may have measures like 1,254,000 or 874,050 copies and others are measured as "not detected." If a substantial number of participants have undetectable levels, the distribution of viral load is not normally distributed.
Hypothesis Testing with Nonparametric Tests
In nonparametric tests, the hypotheses are not about population parameters (e.g., μ = 50 or μ1 = μ2). Instead, the null hypothesis is more general. For example, when comparing two independent groups in terms of a continuous outcome, the null hypothesis in a parametric test is H0: μ1 = μ2. In a nonparametric test the null hypothesis is that the two populations are equal; often this is interpreted as the two populations being equal in terms of their central tendency.

Advantages of Nonparametric Tests
Nonparametric tests have some distinct advantages.

Introduction to Nonparametric Testing
This module will describe some popular nonparametric tests for continuous outcomes. Interested readers should see Conover for a more comprehensive coverage of nonparametric tests.

Key Concept: Parametric tests are generally more powerful and can test a wider range of alternative hypotheses. It is worth repeating that if data are approximately normally distributed then parametric tests (as in the modules on hypothesis testing) are more appropriate. However, there are situations in which assumptions for a parametric test are violated and a nonparametric test is more appropriate.

The techniques described here apply to outcomes that are ordinal, ranked, or continuous outcome variables that are not normally distributed. Recall that continuous outcomes are quantitative measures based on a specific measurement scale (e.g., weight in pounds, height in inches). Some investigators make the distinction between continuous, interval and ordinal scaled data. Interval data are like continuous data in that they are measured on a constant scale (i.e., there exists the same difference between adjacent scale scores across the entire spectrum of scores). Differences between interval scores are interpretable, but ratios are not. Temperature in Celsius or Fahrenheit is an example of an interval scale outcome. The difference between 30 and 40 is the same as the difference between 70 and 80, yet 80 is not twice as warm as 40. Ordinal outcomes can be less specific as the ordered categories need not be equally spaced. Symptom severity is an example of an ordinal outcome, and it is not clear whether the difference between much worse and slightly worse is the same as the difference between no change and slightly improved.

Some studies use visual scales to assess participants' self-reported signs and symptoms. Pain is often measured in this way, from 0 to 10 with 0 representing no pain and 10 representing agonizing pain. Participants are sometimes shown a visual scale such as that shown in the upper portion of the figure below and asked to choose the number that best represents their pain state. Sometimes pain scales use visual anchors as shown in the lower portion of the figure below.

[Figure: Visual Pain Scale]

In the upper portion of the figure, certainly 10 is worse than 9, which is worse than 8; however, the difference between adjacent scores may not necessarily be the same. It is important to understand how outcomes are measured to make appropriate inferences based on statistical analysis and, in particular, not to overstate precision.

Assigning Ranks
The nonparametric procedures that we describe here follow the same general procedure. The outcome variable (ordinal, interval or continuous) is ranked from lowest to highest and the analysis focuses on the ranks as opposed to the measured or raw values. For example, suppose we measure self-reported pain using a visual analog scale with anchors at 0 (no pain) and 10 (agonizing pain) and record the following in a sample of n=6 participants:

Observed Data:           7    5    9    3    0    2

The ranks, which are used to perform a nonparametric test, are assigned as follows: First, the data are ordered from smallest to largest. The lowest value is then assigned a rank of 1, the next lowest a rank of 2 and so on. The largest value is assigned a rank of n (in this example, n=6). The observed data and corresponding ranks are shown below:

Ordered Observed Data:   0    2    3    5    7    9
Ranks:                   1    2    3    4    5    6

A complicating issue that arises when assigning ranks occurs when there are ties in the sample (i.e., the same values are measured in two or more participants). For example, suppose that the following data are observed in our sample of n=6:

Observed Data:           7    7    9    3    0    2

The 4th and 5th ordered values are both equal to 7. When assigning ranks, the recommended procedure is to assign the mean rank of 4.5 to each (i.e., the mean of 4 and 5), as follows:

Ordered Observed Data:   0    2    3    7    7    9
Ranks:                   1    2    3    4.5  4.5  6

Suppose that there are three values of 7. In this case, we assign a rank of 5 (the mean of 4, 5 and 6) to the 4th, 5th and 6th values, as follows:

Ordered Observed Data:   0    2    3    7    7    7
Ranks:                   1    2    3    5    5    5

Using this approach of assigning the mean rank when there are ties ensures that the sum of the ranks is the same in each sample (for example, 1+2+3+4+5+6=21, 1+2+3+4.5+4.5+6=21 and 1+2+3+5+5+5=21). Using this approach, the sum of the ranks will always equal n(n+1)/2.
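The mean-rank rule for ties can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation: the function name `assign_ranks` is hypothetical, the samples are assumed to be the six-value pain scores discussed above (7, 5, 9, 3, 0, 2 and its tied variants), and the unordered arrangement of the three-way-tie sample is assumed.

```python
def assign_ranks(values):
    """Return 1-based ranks, giving tied values the mean of the ranks they span."""
    # Indices of the observations, ordered from smallest to largest value.
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        # Extend `end` across the run of observations tied with order[start].
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        # Positions start..end (0-based) correspond to ranks start+1..end+1.
        mean_rank = ((start + 1) + (end + 1)) / 2
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


no_ties = assign_ranks([7, 5, 9, 3, 0, 2])
two_tied = assign_ranks([7, 7, 9, 3, 0, 2])
three_tied = assign_ranks([7, 7, 7, 3, 0, 2])  # assumed arrangement of the third sample

n = 6
for ranks in (no_ties, two_tied, three_tied):
    # The rank sum equals n(n+1)/2 regardless of ties.
    print(ranks, sum(ranks) == n * (n + 1) / 2)
```

Ranks are returned in the original observation order, so each rank lines up with the participant it belongs to.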
In some tests we reject H0 if the test statistic is large; in others we reject H0 if the test statistic is small.

The chi-square statistic is χ² = Σ (O − E)² / E, where O is the observed frequency and E is the expected frequency in the corresponding category.

Chi Square Test Degrees of Freedom
Each type of two-way table has its own chi-square distribution, depending on the number of rows and columns, and each chi-square distribution is identified by its degrees of freedom. A two-way table with r rows and c columns uses a chi-square distribution with (r − 1) × (c − 1) degrees of freedom.
1. For one degree of freedom, the distribution looks like a hyperbola.
2. For more than one degree of freedom, it looks like a mound that has a long right tail.

Chi Square Test of Independence
The chi-square test of independence is applied when we have two categorical variables from a single population. It is used to determine whether there is a significant association between the two variables. This test is applicable when the observations are independent (random). The chi-square test for independence is also called a contingency table chi-square test.

Chi Square Test of Independence Example
For a given population, we consider two attributes and we may find the dependence between them. The statistic is χ² = Σ (Observed value − Expected value)² / Expected value.

Degrees of Freedom for the Chi-Square Test for Goodness of Fit
The number of degrees of freedom that we calculate for the chi-square test for goodness of fit reflects the number of categories that we are comparing minus one.
Degrees of freedom (df) = c − 1

Chi Square Difference Test
The chi-square difference test is very useful both for making simpler models more complex and for making complex models simpler. A more accurate test can be obtained by performing a chi-square difference test:
• Estimating the original model.
• Estimating the revised model in which a new path has been added.
• Calculating the difference between the two resulting chi-square values.
The resulting chi-square difference statistic also has a chi-square distribution. The degrees of freedom for the chi-square difference test are equal to the difference between the degrees of freedom associated with the two models.

For a contingency table the statistic is χ² = Σ (Oij − Eij)² / Eij, which is large when the observed frequencies differ markedly from the expected frequencies.
• For the chi-square test to be meaningful it is imperative that each person, item or entity contributes to only one cell of the contingency table.
• Both independent and dependent variables are categorical with two or more levels.
• The data consist of frequencies, not scores.
• Each randomly selected observation can be classified into only one category for the independent variable and only one category for the dependent variable.

Purpose of Chi Square Test
The chi-square test is one of the simplest and most widely used nonparametric tests. It is the most commonly used method for comparing frequencies or proportions. It is a statistical test used to compare observed data with data that would be expected according to a given hypothesis. It is popularly known as a test of goodness of fit because it enables us to ascertain how well a theoretical distribution fits the observed data.

Chi Square Test Table
[Table: critical values for the chi-square test]
Chi Square Test Example
Chi-Square Test
1.) Goodness-of-fit test
Ho: fo = fe
Ha: fo ≠ fe
df = c − 1
χ²c = Σ (fo − fe)² / fe
where fo = observed frequency and fe = expected frequency
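As a rough illustration of the goodness-of-fit statistic, the sketch below applies the formula to four observed counts of 6, 12, 14, and 8 (the installation counts of the air-conditioner example in this section) under the hypothesis of equal expected frequencies. The function name `chi_square_statistic` is hypothetical, and the critical value 7.815 (α = 0.05, df = 3) is taken from a chi-square table rather than computed.

```python
def chi_square_statistic(observed, expected):
    """Sum of (fo - fe)^2 / fe over all categories."""
    return sum((fo - fe) ** 2 / fe for fo, fe in zip(observed, expected))


observed = [6, 12, 14, 8]   # counts for sub-areas A, B, C, D (assumed reading)
categories = len(observed)

# Under H0 the installations are equally distributed across the categories.
expected = [sum(observed) / categories] * categories

chi_sq = chi_square_statistic(observed, expected)
df = categories - 1
chi_sq_critical = 7.815     # chi-square table value for alpha = 0.05, df = 3

reject_h0 = chi_sq > chi_sq_critical
print(f"chi-square = {chi_sq:.2f}, df = {df}, reject H0: {reject_h0}")
```

Passing a different `expected` list handles hypotheses in which the categories are not equally likely.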
Example: The city distributor of air conditioners in the city of Manila has divided the area into four sub-areas. A prospective buyer of the distributorship was told that the installations of the equipment are equally distributed. The prospective buyer took a random sample of 40 installations performed during the past years from the corporation file and found the following:

SUB AREA    NO. INSTALLED
A            6
B           12
C           14
D            8
TOTAL       40
Based on the information, can we say that the units are equally distributed? Use α = 0.05.
1.) Ho: fo = fe
    Ha: fo ≠ fe
2.) Chi-square test
3.) α = 0.05, df = c − 1 = 4 − 1 = 3, χ²tab = 7.815
4.) Reject Ho if χ²c > χ²tab
5.) fe = 40/4 = 10
    χ²c = [(6 − 10)² + (12 − 10)² + (14 − 10)² + (8 − 10)²] / 10 = 4
6.) Since χ²c < χ²tab, do not reject Ho
7.) The data are consistent with the units being equally distributed.

2.) Test of Independence
fe = (row total × column total) / n
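The expected-frequency formula can be sketched as follows. This is an illustrative sketch, not a prescribed method: the function names are hypothetical, and the 2×2 counts (60 and 80 for males, 50 and 10 for females) are assumed to be those of the stereo-shop survey used as the example in this section.

```python
def expected_frequencies(table):
    """Expected count for each cell: row total * column total / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


def chi_square_independence(table):
    """Return the chi-square statistic, degrees of freedom, and expected counts."""
    expected = expected_frequencies(table)
    stat = sum(
        (fo - fe) ** 2 / fe
        for obs_row, exp_row in zip(table, expected)
        for fo, fe in zip(obs_row, exp_row)
    )
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df, expected


# Rows: male, female. Columns: under 30, 30 and over (assumed reading of the table).
observed = [[60, 80], [50, 10]]
stat, df, expected = chi_square_independence(observed)

chi_sq_critical = 6.635  # chi-square table value for alpha = 0.01, df = 1
print(f"expected = {expected}")
print(f"chi-square = {stat:.2f}, df = {df}, reject H0: {stat > chi_sq_critical}")
```

The same two functions work for larger tables, since the degrees of freedom are derived from the table's shape.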
Example: A survey was conducted to determine whether gender and age are related among stereo shop customers. A total of 200 respondents was taken and the results are presented in the table. Expected frequencies (fe) are shown in parentheses.

Age            Male        Female      TOTAL
Under 30       60 (77)     50 (33)     110
30 and over    80 (63)     10 (27)      90
TOTAL          140         60          200
Conduct a test of whether gender and age of stereo shop customers are independent at the 1% level of significance.
1.) Ho: Gender and age of stereo shop customers are independent
    Ha: Gender and age of stereo shop customers are dependent
2.) Chi-square test
3.) α = 0.01, df = (2 − 1)(2 − 1) = 1, χ²tab = 6.635
4.) Reject Ho if χ²c > χ²tab
5.) χ²c = Σ (fo − fe)² / fe
        = (60 − 77)²/77 + (80 − 63)²/63 + (50 − 33)²/33 + (10 − 27)²/27
        = 27.80
6.) Since χ²c > χ²tab, reject Ho
7.) Gender and age of stereo shop customers are dependent.

Example: A certain school classified 1,725 students according to intelligence and family economic level. The results are as follows:
Expected frequencies (fe) are shown in parentheses.

Economic Level    Dull            Average         Intelligent     TOTAL
Rich               81 (128.67)    322 (347.31)    233 (160.01)     636
Middle class      141 (151.94)    457 (410.11)    153 (188.95)     751
Poor              127 (68.38)     163 (184.58)     48 (85.04)      338
TOTAL             349             942             434             1725
Using these results, can we conclude that intelligence is related to the economic level? Use the 1% level of significance.
1.) Ho: Intelligence is not related to the economic level
    Ha: Intelligence is related to the economic level
2.) Chi-square test
3.) α = 0.01, df = (3 − 1)(3 − 1) = 4, χ²tab = 13.277
4.) Reject Ho if χ²c > χ²tab
5.) χ²c = Σ (fo − fe)² / fe
        = (81 − 128.67)²/128.67 + (322 − 347.31)²/347.31 + (233 − 160.01)²/160.01
        + (141 − 151.94)²/151.94 + (457 − 410.11)²/410.11 + (153 − 188.95)²/188.95
        + (127 − 68.38)²/68.38 + (163 − 184.58)²/184.58 + (48 − 85.04)²/85.04
        ≈ 134.70
6.) Since χ²c > χ²tab, reject Ho
7.) Intelligence is related to the economic level.
Given below are some examples on the chi-square test.

Exercises
Question 1: Find the chi-square for the following data.

Color                 Blue    Black    Brown    Yellow
Observed frequency      5      15       10       20
Expected frequency     10      20        5       30

Answer:
For blue, Observed frequency − Expected frequency = 5 − 10 = −5
For black, Observed frequency − Expected frequency = 15 − 20 = −5
For brown, Observed frequency − Expected frequency = 10 − 5 = 5
For yellow, Observed frequency − Expected frequency = 20 − 30 = −10
χ² = (−5)²/10 + (−5)²/20 + 5²/5 + (−10)²/30 = 2.5 + 1.25 + 5 + 3.3333 = 12.0833
Question 2: Find the chi-square for the following data.

Color                 Blue    Black    Brown    Yellow
Observed frequency     10       5       25       35
Expected frequency     15      30       30       25

Answer: 27.3333

Question 3: Find the chi-square for the following data.

Color                 Blue    Black    Brown    Yellow
Observed frequency     23      24       32       23
Expected frequency     12      32       25       21

Answer: 14.2338

Question 4: Determine whether gender and shoe size are dependent among the students of section ChE 4102 and instructors from the Chemical Engineering Department of Batangas State University. Use 0.01 for the level of significance.

Gender     Below 7    7 and above    TOTAL
Male          2           13           15
Female       23           11           34
TOTAL        25           24           49

Answer: (Gender and shoe size are dependent.)
Question 5: 49 samples are selected from the group of male and female students of section ChE 4102 from the Chemical Engineering Department of Batangas State University with their instructors. Determine whether gender and height are independent among the students and their professor. Use the 0.01 level of significance. The data are given below.

GENDER     Below 160 cm    160 cm and above    TOTAL
Male            1                14              15
Female         16                18              34
TOTAL          17                32              49

Answer: (Gender and height are dependent.)

Question 6:
Reference:
http://sphweb.bumc.bu.edu/otlt/MPH-Modules/BS/BS704_Nonparametric/BS704_Nonparametric_print.html
http://math.tutorvista.com/statistics/chi-square-test.html