c5 Student: ___________________________________________________________________________
1. The meaning of reliability in the psychometric sense differs from the meaning of reliability in the "everyday" use of that word in that A. reliability in the "everyday" sense is usually "a good thing." B. reliability in the psychometric sense is usually "a good thing." C. reliability in the psychometric sense has greater implications. D. None of these
2. Which is TRUE about reliability in the psychometric sense? A. reliability is an all-or-none measurement B. a test may be reliable in one context and unreliable in another C. a reliability coefficient may not be derived for personality tests D. alternate forms reliability may not be derived for personality tests
3. In classical test theory, an observed score on an ability test is presumed to represent the testtaker's A. true score. B. true score less the variance. C. true score combined with extraneous factors. D. the testtaker's true score and error.
4. In an illustrative scenario described in Chapter 5 of your text, a group of 12th grade "whiz kids" in math, newly arrived to the United States from China, perform poorly on a test of 12th grade math. According to the text, what probably accounted for this? A. lower standards in China as compared to the US for measuring math ability B. higher standards in the US as compared to China for earning high grades C. the ability of the Chinese students to read what was required in English D. the reliability of the instrument used to test 12th grade math skills
5. Which is TRUE of measurement error? A. Like error in general, measurement error may be random or systematic. B. Unlike error in general, measurement error may be random or systematic. C. Measurement error is always random. D. Measurement error is always systematic.
6. This variety of error has also been referred to as "noise." It is A. systematic error. B. random error. C. measurement error. D. background error.
7. A Wall Street Securities firm that is actually located on Wall Street is testing a group of candidates for their aptitude in finance and business. As the testing begins, an unexpected "Occupy Wall Street" sit-in takes place. From a psychometric perspective in the context of this testing, the sit-in is viewed as A. systematic error. B. random error. C. test administration error. D. background error.
8. A test entails behavioral observation and rating of front desk clerks to determine whether or not they greet guests with a smile. Which type of error is this test most susceptible to? A. test administration error B. test construction error C. examiner-related error D. polling error
9. Error in the reporting of spousal abuse may result from A. one partner simply forgetting all of the details of the abuse. B. one partner misunderstanding the instructions for reporting. C. one partner being ashamed to report the abuse. D. All of these
10. Stanley (1971) wrote that in classical test theory, a so-called "true score" is "not the ultimate fact in the book of the recording angel." By this, Stanley meant that A. it would be imprudent to trust in Divine influence when estimating variance. B. the amount of test variance that is true relative to error may never be known. C. it is near impossible to separate fact from fiction with regard to "true scores." D. All of these
11. The term test heterogeneity BEST refers to the extent to which test items measure A. different factors. B. the same factor. C. a unifactorial trait. D. a nonhomogeneous trait.
12. The more homogeneous a test is, the A. less inter-item consistency it can be expected to have. B. more utility the test has for measuring multifaceted variables. C. more inter-item consistency it can be expected to have. D. None of these
13. Which would NOT be useful in estimating a test's inter-item consistency? A. Cronbach's alpha B. the Kuder-Richardson formulas C. the average proportional distance D. a coefficient of equivalence
14. Cronbach's alpha is to similarity of scores on test items as average proportional distance is to A. difference in scores on test items B. inter-item consistency C. test-retest reliability D. parallel forms reliability
15. One of the problems associated with classical test theory has to do with A. the notion that there is a "true score" on a test has great intuitive appeal. B. the fact that CTT assumptions are often characterized as "weak." C. its assumptions concerning the equivalence of all items on a test. D. its assumptions allow for its application in most situations.
16. Which of the following is NOT an alternative to classical test theory cited in your text? A. generalizability theory B. representational theory C. domain sampling theory D. latent trait theory
17. Item response theory is to latent-trait theory as observer reliability is to A. generalizability theory. B. domain sampling theory. C. odd-even reliability. D. inter-scorer reliability.
18. The multiple-choice test items on this examination are all examples of A. dichotomous test items. B. latent trait test items. C. polytomous test items. D. None of these
19. A confidence interval is a range or band of test scores that A. has proven test-retest reliability. B. is calculated using the standard error of the difference. C. is likely to contain the true score. D. None of these
20. The standard error of measurement is A. used to infer how far an observed score is from the true score. B. also known as the standard error of a score. C. used in the context of classical test theory. D. All of these
21. Reliability, in a broad statistical sense, is synonymous with A. consistently good. B. consistently bad. C. consistency. D. validity.
22. A reliability coefficient is A. an index. B. a proportion of the total variance attributed to true variance. C. unaffected by a systematic source of error. D. All of these
23. Which of the following is true of systematic error? A. It significantly lowers the reliability of a measure. B. It insignificantly lowers the reliability of a measure. C. It increases the reliability of a measure. D. It has no effect on the reliability of a measure.
24. As the degree of reliability increases, the proportion of A. total variance attributed to true variance decreases. B. total variance attributed to true variance increases. C. total variance attributed to error variance increases. D. None of these
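Study note (illustrative, not part of the original exam): the item above turns on the classical test theory idea that a reliability coefficient can be read as the share of total variance that is true variance. The sketch below assumes that total variance is simply true variance plus error variance. The function name and the example variances are hypothetical, chosen only for illustration.

```python
def reliability_coefficient(true_variance: float, error_variance: float) -> float:
    """Proportion of total variance attributed to true variance.

    Assumes total variance = true variance + error variance.
    """
    total = true_variance + error_variance
    if total <= 0:
        raise ValueError("total variance must be positive")
    return true_variance / total


# Hypothetical variances, chosen only for illustration.
low = reliability_coefficient(true_variance=6.0, error_variance=4.0)
high = reliability_coefficient(true_variance=9.0, error_variance=1.0)
print(low, high)
```

Holding total variance fixed, shrinking the error share raises the coefficient.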
25. Why might ability test scores among testtakers most typically vary? A. because of the true ability of the testtaker B. because of irrelevant, unwanted influences C. All of the above D. None of the above
26. A source of error variance may take the form of A. item sampling. B. testtakers' reactions to environment-related variables such as room temperature and lighting. C. testtaker variables such as amount of sleep the night before a test, amount of anxiety, or drug effects. D. All of the above
27. Computer-scorable items have tended to eliminate error variance due to A. item sampling. B. scorer differences. C. content sampling. D. testtakers' reactions to environmental variables.
28. Which type of reliability estimate is obtained by correlating pairs of scores from the same person (or people) on two different administrations of the same test? A. a parallel-forms estimate B. a split-half estimate C. a test-retest estimate D. an au-pair estimate
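Study note (illustrative, not part of the original exam): correlating pairs of scores from two administrations of the same test can be sketched with a plain Pearson correlation. The function name and the score lists are hypothetical. This is a minimal sketch under those assumptions, not a full reliability analysis.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists with at least 2 scores")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("scores must vary on both administrations")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical scores for the same five people on two administrations.
first_administration = [10, 12, 15, 18, 20]
second_administration = [11, 13, 14, 19, 21]
r = pearson_r(first_administration, second_administration)
print(round(r, 3))
```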
29. Which type of reliability estimate would be appropriate only when evaluating the reliability of a test that measures a trait that is presumed to be relatively stable over time? A. parallel-forms B. alternate-forms C. test-retest D. split-half
30. An estimate of test-retest reliability is often referred to as a coefficient of stability when the time interval between the test and retest is more than A. 30 days. B. 60 days. C. 3 months. D. 6 months.
31. Which of the following might lead to a decrease in test-retest reliability? A. the passage of time between the two administrations of the test. B. coaching designed to increase test scores between the two administrations of the test. C. practice with similar test materials between the two administrations of the test. D. All of these
32. Which of the following is TRUE for estimates of alternate- and parallel-forms reliability? A. Two test administrations with the same group are required. B. Test scores may be affected by factors such as motivation, fatigue, or intervening events like practice, learning, or therapy. C. Item sampling is a source of error variance. D. All of these
33. Which of the following is TRUE for parallel forms of a test? A. The means of the observed scores are equal for the two forms. B. The variances of the estimated scores are equal for the two forms. C. The means and variances of the observed scores are equal for the two forms. D. The means and variances of the estimated scores are equal for the two forms.
34. Which source of error variance affects parallel- or alternate-form reliability estimates but does not affect test-retest estimates? A. fatigue B. learning C. practice D. item sampling
35. Which of the following types of reliability estimates is the most expensive due to the costs involved in test development? A. test-retest B. parallel-form C. internal-consistency D. Spearman's rho
36. What term refers to the degree of correlation between all the items on a scale? A. inter-item homogeneity B. inter-item consistency C. inter-item heterogeneity D. parallel-form reliability
37. Test-retest estimates of reliability are referred to as measures of ________, and split-half reliability estimates are referred to as measures of ________. A. true scores; error scores B. internal consistency; stability C. interscorer reliability; consistency D. stability; internal consistency
38. Which of the following is usually minimized when using split-half estimates of reliability as compared with test-retest or parallel/alternate-form estimates of reliability? A. time and expense B. reliability and validity C. reliability only D. time spent in scoring and interpretation
39. Which of the following factors may influence a split-half reliability estimate? A. fatigue B. anxiety C. item difficulty D. All of these
40. Internal-consistency estimates of reliability are inappropriate for A. reading achievement tests. B. scholastic aptitude/intelligence tests. C. word processing tests based on speed. D. tests purporting to measure a single personality trait.
41. The Spearman-Brown formula is used for: A. correcting for one half of the test by estimating the reliability of the whole test. B. determining how many additional items are needed to increase reliability up to a certain level. C. determining how many items can be eliminated without reducing reliability below a predetermined level. D. All of these
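Study note (illustrative, not part of the original exam): the uses of the Spearman-Brown formula listed above can be sketched with its general form, r_new = n*r / (1 + (n - 1)*r), where n is the factor by which test length changes. The function names and the example reliabilities are hypothetical.

```python
def spearman_brown(r: float, n: float) -> float:
    """Estimated reliability when test length is multiplied by n.

    n = 2 steps a half-test correlation up to the whole test;
    n < 1 estimates the effect of shortening the test.
    """
    return (n * r) / (1 + (n - 1) * r)


def length_factor_needed(r_current: float, r_target: float) -> float:
    """Factor by which test length must change to reach r_target."""
    return (r_target * (1 - r_current)) / (r_current * (1 - r_target))


# Hypothetical values, for illustration only.
whole_test = spearman_brown(0.60, 2)       # half-test r of .60 stepped up
shortened = spearman_brown(0.90, 0.8)      # dropping 20% of the items
factor = length_factor_needed(0.75, 0.90)  # lengthening needed for .90
print(whole_test, round(shortened, 3), factor)
```

The same expression covers lengthening and shortening, which is why one formula serves all three purposes.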
42. For a heterogeneous test, measures of internal-consistency reliability will tend to be ________ compared with other methods of estimating reliability. A. higher B. lower C. very similar or higher D. more robust
43. Typically, adding items to a test will have what effect on the test's reliability? A. Reliability will decrease. B. Reliability will increase. C. Reliability will stay the same. D. Reliability will first increase and then decrease.
44. Error variance for measures of inter-item consistency comes from A. fatigue. B. motivation. C. a testtaker practice effect. D. heterogeneity of the content.
45. If items from a test are measuring the same trait, estimates of reliability yielded from split-half methods will typically be ________ as compared to estimates from KR-20. A. higher B. lower C. similar D. approximately the same
46. Which of the following is NOT an acceptable way to divide a test when using the split-half reliability method? A. Randomly assign items to each half of the test. B. Assign odd-numbered items to one half and even-numbered items to the other half of the test. C. Assign the first half of the items to one half of the test and the second half of the items to the other half of the test. D. Assign easy items to one half of the test and difficult items to the other half of the test.
47. If items on a test are measuring very different traits, estimates of reliability yielded from split-half methods will typically be ________ as compared with estimates from KR-20. A. higher B. lower C. similar D. approximately the same
48. KR-20 is the statistic of choice for tests with which types of items? A. multiple-choice B. true-false C. All of these D. None of these
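Study note (illustrative, not part of the original exam): KR-20 applies to items scored dichotomously (1 = correct, 0 = incorrect). The sketch below uses the usual form KR-20 = (k / (k - 1)) * (1 - sum(p*q) / total variance). Using the population variance of total scores is an assumption here, since some texts use the sample variance. The function name and response matrix are hypothetical.

```python
from statistics import pvariance


def kr20(responses):
    """KR-20 for a list of rows (people) of 0/1 item scores."""
    n_people = len(responses)
    k = len(responses[0])
    if k < 2 or n_people < 2:
        raise ValueError("need at least 2 items and 2 testtakers")
    totals = [sum(row) for row in responses]
    total_variance = pvariance(totals)  # assumption: population variance
    if total_variance == 0:
        raise ValueError("total scores must vary")
    pq_sum = 0.0
    for j in range(k):
        p = sum(row[j] for row in responses) / n_people  # proportion passing
        pq_sum += p * (1 - p)
    return (k / (k - 1)) * (1 - pq_sum / total_variance)


# Hypothetical right/wrong responses: 5 testtakers, 4 items.
data = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
estimate = kr20(data)
print(round(estimate, 3))
```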
49. The KR-21 reliability estimate was developed A. to yield greater consistency in reliability coefficients. B. to facilitate computation by hand. C. for use with less homogeneous items. D. because Kuder wanted to "one-up" Richardson's 20.
50. Which is NOT an assumption that should be met in order to use KR-21? A. Items should be dichotomous. B. Items should be of equal difficulty. C. Items should be homogeneous. D. Items should be scorable by computer.
51. Which of the following is generally the preferred statistic for obtaining a measure of internal-consistency reliability? A. KR-20 B. KR-21 C. Kendall's Tau D. coefficient alpha
52. Coefficient alpha is appropriate to use with all of the following test formats EXCEPT A. multiple-choice. B. true-false. C. short-answer for which partial credit is awarded. D. essay exam with no partial credit awarded.
53. The "20" and "21" in KR-20 and KR-21 represent A. numbers held constant in the denominator. B. numbers held constant in the numerator. C. the order in which the formulas were created. D. the age of Fred Kuder's son and nephew at the time the formulas were developed.
54. Coefficient alpha is an expression of A. the mean of split-half correlations between odd- and even-numbered items. B. the mean of split-half correlations between first- and second-half items. C. the mean of all possible split-half correlations. D. the mean of the best or "alpha" level split-half correlations.
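Study note (illustrative, not part of the original exam): coefficient alpha is commonly computed from item variances rather than by enumerating split halves. The sketch below uses the widely taught form alpha = (k / (k - 1)) * (1 - sum of item variances / variance of total scores). Population variances are an assumption here. The function name and the partial-credit scores are hypothetical.

```python
from statistics import pvariance


def cronbach_alpha(scores):
    """Coefficient alpha for a list of rows (people) of item scores."""
    k = len(scores[0])
    if k < 2 or len(scores) < 2:
        raise ValueError("need at least 2 items and 2 testtakers")
    item_variances = [pvariance([row[j] for row in scores]) for j in range(k)]
    total_variance = pvariance([sum(row) for row in scores])
    if total_variance == 0:
        raise ValueError("total scores must vary")
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


# Hypothetical partial-credit scores: 5 testtakers, 3 items.
scores = [
    [2, 3, 3],
    [4, 4, 5],
    [1, 2, 2],
    [5, 5, 4],
    [3, 3, 3],
]
alpha = cronbach_alpha(scores)
print(round(alpha, 3))
```

Unlike KR-20, this form does not require dichotomous items, so it also handles partial-credit scoring.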
55. A coefficient alpha over .9 may indicate that A. the items in the test are too dissimilar. B. the test is not reliable. C. the items in the test are redundant. D. the test is biased against low-ability individuals.
56. Which of the following is TRUE about coefficient alpha? A. Kuder thought it to be the single best measure of reliability. B. It was first conceived by Alfalfa Alpha. C. It is a characteristic of a particular set of scores, not of the test itself. D. None of these
57. A synonym for interscorer reliability is A. interjudge reliability B. observer reliability C. interrater reliability D. All of these
58. Which BEST conveys the meaning of an inter-scorer reliability estimate of .90? A. Ninety percent of the scores obtained are reliable. B. Ninety percent of the variance in the scores assigned by the scorers was attributed to true differences and 10% to error. C. Ten percent of the variance in the scores assigned by the scorers was attributed to true differences and 90% to error. D. Ten percent of the test's items are in need of revision according to the majority of the test's users.
59. When more than two scorers are used to determine inter-scorer reliability, the statistic of choice is A. Pearson r. B. Spearman's rho. C. KR-20. D. coefficient alpha.
60. For determining the reliability of tests scored using nominal scales of measurement, the statistic of choice is A. Kendall's Tau. B. the Kappa statistic. C. KR-20. D. coefficient alpha.
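Study note (illustrative, not part of the original exam): for nominal ratings from two raters, one common form of the kappa statistic is Cohen's kappa, (observed agreement - chance agreement) / (1 - chance agreement). Choosing Cohen's two-rater version is an assumption made for this sketch. The function name and ratings are hypothetical.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters assigning nominal categories."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("need two equal-length, non-empty rating lists")
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    # Chance agreement: product of each rater's marginal proportions.
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1 - expected)


# Hypothetical pass/fail ratings from two raters.
rater_a = ["pass", "pass", "fail", "fail"]
rater_b = ["pass", "fail", "fail", "fail"]
kappa = cohens_kappa(rater_a, rater_b)
print(kappa)
```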
61. If a test is homogeneous A. it is functionally uniform throughout. B. it will likely yield a high internal-consistency reliability estimate compared with a test-retest reliability estimate. C. it would be reasonable to expect a high degree of internal consistency. D. All of these
62. Which type(s) of reliability estimates would be most appropriate for a measure of heart rate? A. test-retest B. alternate-form C. parallel form D. internist consistency
63. If a time limit is long enough to allow test-takers to attempt all items, and if some items are so difficult that no test-taker is able to obtain a perfect score, then the test is referred to as a ________ test. A. speed B. power C. reliable D. valid
64. Typically, speed tests A. contain items of a uniform difficulty level. B. are completed by fewer than 1% of all test-takers. C. have low validity coefficients. D. yield high rates of false positives.
65. Which type(s) of reliability estimates would be appropriate for a speed test? A. test-retest B. alternate-form C. split-half from two independent testing sessions D. All of these
66. Which of the following would result in the LEAST appropriate estimate of reliability for a speed test? A. test-retest B. alternate-form C. split-half from a single administration of the test D. split-half from two independent testing sessions
67. A Kuder-Richardson (KR) or split-half estimate of reliability for a speed test would provide an estimate that is A. spuriously low. B. spuriously high. C. insignificant. D. equal to a test-retest method.
68. A measure of clerical speed is obtained by a test that has respondents alphabetize index cards. The manual for this test cites a split-half reliability coefficient for a single administration of the test of .95. What might you conclude? A. The test is highly reliable. B. The published reliability estimate is spuriously low and would have been higher had another estimate been used. C. The split-half estimate should not have been used in this instance. D. Clerical speed is too vague a construct to measure.
69. The Spearman-Brown formula can be used for which types of tests? A. speed and multiple-choice B. true-false and multiple-choice C. speed, true-false, and multiple-choice D. trade school and driving tests
70. An estimate of the reliability of a speed test is a measure of A. the stability of the test. B. the consistency of the response speed. C. the homogeneity of the test items. D. All of these
71. Use of the Spearman-Brown formula would be INAPPROPRIATE to A. estimate the effect on reliability of shortening a test. B. determine the number of items needed in a test to obtain the desired level of reliability. C. estimate the internal consistency of a speed test. D. All of these
72. Interpretations of criterion-referenced tests are typically made with respect to A. the total number of items the examinee responded to. B. the material that the examinee evidenced mastery of. C. a comparison of the examinee's performance with that of others who took the test. D. a formula that takes into account the total number of items for which no response was scorable.
73. Traditional measures of reliability are inappropriate for criterion-referenced tests because variability A. is maximized with criterion-referenced tests. B. is minimized with criterion-referenced tests. C. is variable with criterion-referenced tests. D. cannot be determined with criterion-referenced tests.
74. If traditional measures of reliability are applied to a criterion-referenced test, the reliability estimate will likely be A. spuriously low. B. spuriously high. C. exactly zero. D. None of these
75. The fact that the length of a test influences the size of the reliability coefficient is based on which theory of measurement? A. classical test theory (CTT) B. generalizability theory C. domain sampling theory D. item response theory (IRT)
76. Which estimate of reliability is most consistent with the domain sampling theory? A. test-retest B. alternate-form C. internal-consistency D. interscorer
77. Classical reliability theory estimates the portion of a test score that is attributed to ________, and domain sampling theory estimates ________. A. specific sources of variation; error B. error; specific sources of variation C. the skills being measured; variation D. the skills being measured; content knowledge
78. Item response theory (IRT) focuses on the A. circumstances that inspired the development of the test. B. test administration variables. C. individual items of a test. D. "how and why" of the Interborough Rapid Transit line.
79. Generalizability theory focuses on which of the following? A. the circumstances under which a test was developed B. the circumstances under which a test is administered C. the circumstances under which a test is interpreted D. All of these
80. The standard deviation of a theoretically normal distribution of test scores obtained by one person on equivalent tests is A. the standard error of the difference between means. B. the standard error of measurement. C. the standard deviation of the reliability coefficient. D. the variance.
81. Which of the following is NOT a part of the formula for the standard error of measurement for a particular test? A. the validity of the test B. the reliability of the test C. the standard deviation of the group of test scores D. Both b and c
82. "Sixty-eight percent of the scores for a particular test fall between 58 and 61" is a statement regarding A. the utility of a test. B. the reliability of a test. C. the validity of a test. D. None of these
83. The standard error of measurement of a particular test of anxiety is 8. A student earns a score of 60. What is the confidence interval for this test score at the 95% level? A. 52-68 B. 40-68 C. 44-76 D. 36-84
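Study note (illustrative, not part of the original exam): a confidence interval around an observed score is commonly built as the score plus or minus a multiple of the standard error of measurement. The sketch below assumes the rounded multiplier of 2 for the 95% level. A multiplier of 1.96 is also common and gives a slightly narrower band. The function name is hypothetical.

```python
def confidence_interval(observed_score: float, sem: float, z: float = 2.0):
    """Band of observed_score +/- z * SEM.

    z = 2.0 is an assumed rounding for the 95% level; pass z=1.96
    for the more exact normal-curve value, or z=1.0 for about 68%.
    """
    margin = z * sem
    return (observed_score - margin, observed_score + margin)


# Values from the item above: score of 60, SEM of 8.
low, high = confidence_interval(60, 8)
print(low, high)
```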
84. As the confidence interval increases, the range of scores into which a single test score falls is likely to A. decrease. B. increase. C. remain the same. D. alternately decrease and increase.
85. As the reliability of a test increases, the standard error of measurement A. increases. B. decreases. C. remains the same. D. alternately increases, then decreases.
86. If the standard deviations of two tests are identical but the reliability is lower for Test A as compared to Test B, then the standard error of measurement will be ________ for Test A as compared with Test B. A. higher B. lower C. the same D. None of these
87. Which statistic can help the test user determine how large a difference must exist for scores yielded from two different tests to be considered statistically different? A. standard error of measurement between two scores B. standard error of the difference between two scores C. observed variance minus error variance D. standard error of the difference between two means
88. The standard error of the difference between two scores is larger than the standard error of measurement for either score because the standard error of the difference between the two scores is affected by A. the true score variance of each score. B. the standard deviation of each score summed. C. the measurement error inherent in both scores. D. All of these
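Study note (illustrative, not part of the original exam): the standard error of the difference is often written as the square root of the sum of the two squared standard errors of measurement, or equivalently as SD * sqrt(2 - r1 - r2) when both tests share the same standard deviation. The sketch below assumes those two forms. The function names and numbers are hypothetical.

```python
import math


def sed_from_sems(sem_1: float, sem_2: float) -> float:
    """Standard error of the difference from two SEMs."""
    return math.sqrt(sem_1 ** 2 + sem_2 ** 2)


def sed_from_reliabilities(sd: float, r_1: float, r_2: float) -> float:
    """Equivalent form when both tests share the same standard deviation."""
    return sd * math.sqrt(2 - r_1 - r_2)


# Hypothetical tests on a common scale (SD = 10) with reliabilities
# of .91 and .84, which correspond to SEMs of 3 and 4.
via_sems = sed_from_sems(3, 4)
via_reliabilities = sed_from_reliabilities(10, 0.91, 0.84)
print(via_sems, round(via_reliabilities, 6))
```

Because the error in both scores enters the sum, the result exceeds either SEM taken alone.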
89. A guidance counselor wishes to determine if a student scored higher on a mathematics test than on a reading test. What statistic(s) would be MOST useful? A. the standard error of measurement for each test score B. the standard error of the difference between two scores C. the raw score on each test as well as the mean of each distribution D. the mean of each distribution and index of test difficulty for each test.
90. The ________ in generalizability theory is analogous to the reliability coefficient in classical test theory. A. universe coefficient B. coefficient of generalizability C. universe score D. Roulin coefficient
91. According to Cronbach et al.'s generalizability theory, "facets" include A. the number of test items. B. the amount of training the examiners received. C. the purpose of administering the test. D. All of these
93. In classical test theory, there exists only one true score. In Cronbach's generalizability theory, how many "true scores" exist? A. one B. as many as the number of times the test is administered to the same individual C. many, depending on the number of different universes D. None of these
95. If a device to measure blood pressure consistently overestimated every assessee's actual blood pressure by 10 units, which of the following would be TRUE of the reliability of this measuring device as the years passed? A. It would increase. B. It would decrease. C. It would not be affected. D. It would alternately decrease and increase.
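Study note (illustrative, not part of the original exam): a constant bias added to every reading shifts the scores but leaves their consistency across administrations unchanged. The sketch below checks this with a plain Pearson correlation. The function name and the readings are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of readings."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists with at least 2 values")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("readings must vary")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical readings for five people on two occasions.
occasion_1 = [110, 120, 125, 130, 140]
occasion_2 = [112, 118, 127, 131, 139]

# The same readings from a device that overestimates by a constant 10 units.
biased_1 = [value + 10 for value in occasion_1]
biased_2 = [value + 10 for value in occasion_2]

r_unbiased = pearson_r(occasion_1, occasion_2)
r_biased = pearson_r(biased_1, biased_2)
print(round(r_unbiased, 4), round(r_biased, 4))
```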
96. In general, which of the following is TRUE of the relationship between the magnitude of the test-retest reliability estimate and the length of the interval between test administrations? A. The longer the interval, the lower the reliability coefficient. B. The longer the interval, the higher the reliability coefficient. C. The magnitude of the reliability coefficient is typically not affected by the length of the interval between test administrations. D. The magnitude of the reliability coefficient is always affected by the length of the interval between test administrations, but one cannot predict how it is affected.
97. What is the difference between alternate forms and parallel forms of a test? A. Alternate forms do not necessarily yield test scores with equal means and variances. B. Alternate forms are designed to be equivalent only with regard to level of difficulty. C. Alternate forms are different only with respect to how they are administered. D. There are no differences between alternate and parallel forms of a test.
98. Coefficient alpha is the reliability estimate of choice for tests A. with dichotomous items and binary scoring. B. with homogeneous items. C. that can be scored along a continuum of values. D. that contain heterogeneous item content and binary scoring.
99. In which type(s) of reliability estimates would test construction NOT be a significant source of error variance? A. test-retest B. alternate-form C. split-half D. Kuder-Richardson
100. If the variance of either variable is restricted by the sampling procedures used, then the magnitude of the coefficient of reliability will be A. lowered. B. raised. C. unaffected. D. affected only in tests with a true-false format.
101. For criterion-referenced tests, which of the following reliability estimates is recommended? A. test-retest reliability estimates B. alternate-form reliability estimates C. split-half reliability estimates D. None of these
102. Which of the following is TRUE of domain sampling theory? A. It supports the existence of a "true score" when measuring psychological constructs. B. It can be used to argue against the existence of a "true score" when measuring psychological constructs. C. Neither Kuder nor Richardson found it to have any applied value. D. All of these
103. If a student received a score of 50 on a math test with a standard error of measurement of 3, which of the following statements would be TRUE of the "true score"? A. In 68% of the cases, the "true score" would be expected to be between 44 and 56. B. In 68% of the cases, the "true score" would be expected to be between 47 and 53. C. In 95% of the cases, the "true score" would be expected to be between 47 and 53. D. In 95% of the cases, the "true score" would be expected to be between 44 and 56.
104. A psychologist administers a test and the test-taker scores a 52. If the cut-off score for eligibility for a particular program is 50, what index will best help the psychologist determine how much confidence to place in the test-taker's obtained score of 52? A. the standard error of difference B. the standard error of measurement C. measures of central tendency: mean, median, or mode D. measures of variability such as the standard deviation
105. Which of the following is TRUE of both the standard error of measurement and the standard error of difference? A. Both provide confidence levels. B. Both can be used to compute confidence intervals for short answer tests. C. Both can be used to compare performance between two different tests. D. Both are abbreviated by SEM.
106. Test-retest reliability estimates of breathalyzers have A. a margin of error of approximately one-hundredth of a percentage point. B. a margin of error of one percentage point. C. a margin of error so high that they must be deemed unreliable. D. not been done in the State of Alaska.
107. A police officer administers a breathalyzer test to a suspected drunk driver, does not put on his glasses to read the meter, and as a result, mistakenly records the blood alcohol level. This is the kind of mistake that is BEST with which type of reliability estimates? A. test-retest B. interscorer C. internal-consistency D. situational
108. Which of the following statements is TRUE regarding the differences between a power test and a speed test? A. Power tests involve physical strength; speed tests do not. B. In a power test, the testtaker has time to complete all items; in a speed test, a specific time limit is imposed. C. In a power test, a broad range of knowledge is assessed; in a speed test, a narrower range of knowledge is assessed. D. Both b and c
109. The index that allows a test user to compare two people's scores on a specific test to determine if the true scores are likely to be different is A. the standard error of the mean. B. the standard error of the difference. C. the standard deviation. D. the correlation coefficient.
110. Which type of reliability is directly affected by the heterogeneity of a test? A. test-retest B. interrater C. internal-consistency D. alternate-forms or parallel-forms
111. Generalizability theory is most closely related to A. developing norms. B. item analysis. C. test reliability. D. the way things are "in general."
112. A test of attention span has a reliability coefficient of .84. The average score on the test is 10, with a standard deviation of 5. Lawrence received a score of 64 on the test. We can be 95% sure that Lawrence's "true" attention span score falls between A. 63 and 65. B. 62 and 66. C. 60 and 68. D. 54 and 74.
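Study note (illustrative, not part of the original exam): when only the standard deviation and the reliability coefficient are given, the standard error of measurement is commonly obtained as SD * sqrt(1 - reliability), and the band around the score is then the score plus or minus a multiple of that value. The sketch below assumes that formula and the rounded multiplier of 2 for the 95% level. The function name is hypothetical.

```python
import math


def standard_error_of_measurement(sd: float, reliability: float) -> float:
    """SEM = SD * sqrt(1 - reliability)."""
    if not 0 <= reliability <= 1:
        raise ValueError("reliability must be between 0 and 1")
    return sd * math.sqrt(1 - reliability)


# Values from the item above: SD of 5, reliability of .84, score of 64.
sem = standard_error_of_measurement(5, 0.84)
z = 2.0  # assumed rounding for the 95% level
low, high = 64 - z * sem, 64 + z * sem
print(round(sem, 2), round(low, 2), round(high, 2))
```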
113. By definition, estimates of reliability can range from ________ to ________. A. -3.00 to +3.00 B. 1 to 10 C. 0 to 1 D. -1 to 1
114. Using estimates of internal consistency, which of the following tests would likely yield the highest reliability coefficients? A. a test of general intelligence B. a test of achievement in a basic skill such as mathematics C. a test of reading comprehension D. a test of vocational interest
115. What type of reliability estimate is appropriate for use in a comparison of "Form A" to "Form B" of a picture vocabulary test? A. test-retest B. alternate-forms C. inter-rater D. internal-consistency
116. What index of reliability would you use to compare two evaluators' assessments of a group of job applicants? A. KR-20 B. coefficient alpha C. the Kappa statistic D. the Spearman-Brown correction
117. Which of the following is TRUE of the standard error of measurement? A. The larger the standard error of measurement, the better. B. The standard error of measurement is inversely related to the standard deviation (that is, when one goes up, the other goes down). C. The standard error of measurement is inversely related to reliability (that is, when one goes up, the other goes down). D. A low standard error of measurement is indicative of low validity.
11;. hat type of reliability estimate is obtained by correlating pairs of scores from the same person on two dierent administrations of the same test& A. parallel'forms B. split'half C. interrater D. test'retest
11<. A test containing 1== items is revised by deleting != items. hat might be e(pected to happen to the magnitude of the reliability estimate for that test& A. +t will be e(pected to increase. B. +t will be e(pected to decrease. C. +t will be e(pected to stay the same. D. +t cannot be determined based on the information provided.
120. In the formula X = T + E, T refers to A. the true score. B. the time factor. C. the average test score. D. test-retest reliability.
121. The greater the proportion of the total variance attributed to true variance, the more _____ the test. A. scientific B. variable C. reliable D. expensive
122. A score earned by a testtaker on a psychological test may BEST be viewed as equal to A. the raw score plus the observed score. B. the error score. C. the true score. D. the true score plus error.
123. Which is NOT a possible source of error variance? A. test administration B. test scoring C. test interpretation D. All are possible sources of error variance.
124. A goal of a test developer is to A. maximize error variance. B. minimize true variance. C. maximize true variance. D. minimize stress for testtakers.
125. Which of the following is TRUE about systematic and unsystematic error in the assessment of physical and psychological abuse? A. Few sources of unsystematic error exist, due to the nature of what is being assessed. B. Few sources of systematic error exist. C. Gender represents a source of systematic error. D. None of these
126. In general, approximately what percentage of scores would be expected to fall within two standard deviations above or below the standard error of measurement of the "true score" on a test? A. 85% B. 90% C. 95% D. 9
127. In Chapter 5 of your textbook, you read of the "writing surface on a school desk riddled with heart carvings, the legacy of past years' students who felt compelled to express their eternal devotion to someone now long forgotten." This imagery was designed to graphically illustrate sources of error variance during test A. development. B. administration. C. scoring. D. interpretation.
128. In the Chapter 5 Meet an Assessment Professional feature, Dr. Bryce B. Reeve noted the necessity for very brief questionnaires in his work due to the fact that many of his clients were: A. young children with very short attention spans. B. seriously ill and would find taking tests burdensome. C. visually impaired and unable to focus for an extended period of time. D. All of these
129. In the Chapter 5 Meet an Assessment Professional feature, Dr. Bryce B. Reeve cited an experience in which he learned that the "Excellent" response category on a test was best translated as meaning _____ in Chinese? A. "super bad" B. "superlative" C. "bad" D. None of these
130. The items of a personality test are characterized as heterogeneous in nature. This tells us that the test measures A. aspects of family history. B. ability to relate to the opposite sex. C. unconscious motivation. D. more than one trait.
131. "Coefficient alpha 20" is a reference to A. a variant of the Kuder-Richardson KR-20 formula. B. the 20th in a series of formulas developed by Cronbach. C. a 20th-century revision of a Galtonian expression. D. None of these
132. With regard to a value found for coefficient alpha, A. "bigger is always better." B. "smaller is always better." C. "negative is best." D. None of these
133. Most reliability coefficients, regardless of the specific type of reliability they are measuring, range in value from: A. -1 to +1 B. 0 to 100 C. 0 to 1. D. negative infinity to positive infinity
134. All indices of reliability provide an index that is a characteristic of a particular A. test. B. group of test scores. C. trait. D. approach to measurement.
135. The precise amount of error inherent in the reliability estimate published in a test manual will vary with A. the purchase price of the test (the more expensive, the less the error). B. the sample of test-takers from which the data were drawn. C. the population of test users actually using a published test. D. All of these
136. Different types of reliability coefficients A. all reflect the same sources of error variance. B. may reflect different sources of error variance. C. never reflect the same source of error variance. D. reflect on error variance during leisure activities.
137. A test of infant development contains three scales: (1) Cognitive Ability, (2) Motor Development, and (3) Behavior Rating. Because these three scales are designed to measure different characteristics (that is, they are not homogeneous), it would be inappropriate to combine the three scales in calculating estimates of the test's A. alternate-forms reliability. B. internal-consistency reliability. C. test-retest reliability. D. interrater reliability.
138. The fact that young children develop rapidly and in "growth spurts" is a problem when it comes to the estimation of which type of reliability for an infant development scale? A. internal-consistency reliability B. alternate-forms reliability C. test-retest reliability D. interrater reliability
139. In the language of psychological testing and assessment, reliability BEST refers to A. how well a test measures what it was originally designed to measure. B. the complete lack of any systematic error. C. the proportion of total variance that can be attributed to true variance. D. whether or not a test publisher consistently publishes high quality instruments.
140. Because of the unique problems in assessing very young children, which of the following would be the BEST practice when attempting to estimate the reliability of tests designed to measure cognitive and motor abilities in infants? A. Use relatively short test-retest intervals. B. Use relatively long test-retest intervals. C. Do not use the test-retest method for estimating reliability of the test. D. Use only inter-scorer reliability estimates.
141. If the variance of either variable in a correlational analysis is restricted by the sampling procedure used, then the resulting correlation coefficient tends to be A. higher. B. lower. C. unaffected. D. unstable.
142. If the variance of either variable in a correlational analysis is inflated by the sampling procedure used, then the resulting correlation coefficient tends to be A. higher. B. lower. C. unaffected. D. unstable.
143. The directions for scoring a particular motor ability test instruct the examiner to "Give credit if the child holds his hands open most of the time." Because what constitutes "most of the time" is not specifically defined, directions such as these could result in lowered reliability estimates for A. test-retest reliability. B. alternate-form reliability. C. inter-rater reliability. D. parallel forms reliability.
144. A vice president (VP) of personnel employs a "Corporate Screening Test" in the hiring process. For future testing purposes, the VP maintains records of scores achieved by _____ as opposed to _____ in order to avoid restriction of range effects. A. job applicants; hired employees B. hired employees; job applicants C. successful employees; hired employees D. successful employees; other corporate officers
145. In the Everyday Psychometrics for Chapter 5, psychometric aspects of the Breathalyzer were discussed. In one challenge to the test-retest reliability of this device, the court found A. the margin of error was not taken into account by the legislature when it originally wrote the law. B. the margin of error was taken into account by the legislature when it originally wrote the law. C. the police officer had erred by administering the test at headquarters and not the site of the infraction. D. None of these
146. The Everyday Psychometrics for Chapter 5 dealt with psychometric aspects of the Breathalyzer. We learned that in the state of New Jersey, it is legal and proper to administer a Breathalyzer test to a drunk driver A. only at the arrest scene. B. at police headquarters. C. even if the officer is intoxicated. D. while a suspect is sucking on a breath mint.
147. In the Chapter 5 Everyday Psychometrics on psychometric aspects of the Breathalyzer, we read of a police officer who intentionally recorded incorrect readings from the instrument. Such an event would most appropriately be recalled in the technical manual for this instrument under the heading A. "Test-Retest Reliability." B. "Internal Consistency Reliability." C. "Inter-Scorer Reliability." D. "How to Fake Findings with the Breathalyzer 900a."
148. According to generalizability theory, a variable such as "number of items in the test" is a description of one A. facet of the universe. B. true element of the dominion. C. dominion in the domain. D. None of these
149. Advocates of generalizability theory prefer the use of which of the following terms as an alternative to the use of the term "reliability"? A. generalizability B. universality C. regularity D. dependability
150. Which is the BEST example of a dynamic characteristic? A. the stress level of a trapeze flyer at a circus B. the intelligence of a college student during Spring Break C. the anti-authority attitude of an inmate serving a life term D. None of these
151. As used in Chapter 5 of your text, the term inflation of variance is synonymous with A. restriction of variance. B. restriction of range. C. inflation of range. D. None of these
152. In the term latent trait theory, "latent" is a synonym for A. invisible. B. state. C. undeveloped. D. dormant.
153. IRT is a term used to refer to A. a model that has many parameters. B. a parameter that has many models. C. a family of models for data analysis. D. a dysfunctional family of models.
154. A polytomous test item is a test item A. that has multiple tomous's attached B. that has varied tomous's attached C. that has multiple and varied tomous's attached D. None of these
155. The Rasch model A. was developed by a Danish mathematician named Rasch. B. is an IRT model with specific assumptions about the underlying distribution. C. was devised from generalizability D. Both a and b
156. Why isn't IRT used more by "mom-and-pop" test developers such as classroom teachers? A. most classroom teachers were trained in generalizability theory B. IRT has no application in classroom tests C. applying IRT requires statistical sophistication D. All of these
157. Who are the primary users of IRT? A. classroom teachers B. commercial test producers C. instructors at universities in Departments of Education D. Georg Rasch's twin sisters
158. Which of the following is NOT an assumption attendant to the use of IRT? A. the assumption of unidimensionality B. the assumption of heteroskedasticity C. the assumption of local independence D. the assumption of monotonicity
159. In IRT, the single, continuous latent construct being measured is often symbolized by the Greek letter: A. alpha. B. beta. C. psi. D. theta.
160. If some of the items on a test were locally dependent, it would be reasonable to expect that: A. all test items were designed for members of a specific culture. B. all test items were measuring the exact same thing. C. some test items were measuring something different than other test items. D. some test items were structured in a dichotomous format and others were structured in a polytomous format.
161. "The probability of endorsing or selecting an item response indicative of higher levels of theta should increase as the underlying level of theta increases." This quote sums up the meaning of the IRT assumption of A. unidimensionality. B. heteroskedasticity. C. local independence. D. monotonicity.
162. The probabilistic relationship between a testtaker's response to a test item and that testtaker's level on the latent construct being measured by the test is expressed in graphic form by A. an item characteristic curve. B. an item response curve. C. an item trace line. D. All of these
163. It's an IRT tool that is useful in helping test users to better understand the range over theta that an item is most useful for. It's called A. an item response curve. B. an information function. C. an item trace line. D. None of these
164. An IRT tool useful in helping test users abbreviate a "long form" of a test to a "short form" is the A. item response curve. B. information function. C. item trace line. D. None of these
165. In an IRT information curve, the term information magnitude may BEST be understood as referring to A. theta. B. the range of the underlying construct. C. precision. D. difficulty.
166. Test items with little discriminative ability prompt the test developer to consider the possibility that A. the content of the item does not match the construct measured by the other items in the scale. B. the item is poorly worded and needs to be rewritten. C. the item is too complex for the educational level of the population. D. All of these
167. According to your textbook, a test of depression that contains an abundance of items that probe the respondent's outward expression of emotion may be inappropriate for use with A. test-takers who have made suicidal gestures. B. inpatients who have been committed involuntarily. C. veterans diagnosed with PTSD. D. Ethiopians.
168. The fact that cultural factors may be operating to weaken an item's ability to discriminate between groups is evident from: A. Lord's treatise entitled Item Response Theory. B. an item characteristic curve. C. an information function. D. Georg Rasch's unauthorized biography, You Can Never Be Too Rich or Too "Rasch."
169. A difference between the use of coefficient alpha and IRT for evaluating a test's reliability is that with IRT, it is possible to learn A. how the precision of a scale varies depending on the level of the construct being measured. B. how the level of the construct being measured varies depending on variations in the item characteristic curve. C. the precise numerical value for the test's total interitem consistency. D. All of these
c5 Key
1. The meaning of reliability in the psychometric sense differs from the meaning of reliability in the "every day" use of that word in that A. reliability in the "every day sense" is usually "a good thing." B. reliability in the psychometric sense is usually "a good thing." C. reliability in the psychometric sense has greater implications. D. None of these
Cohen - Chapter 05 #1
2. Which is TRUE about reliability in the psychometric sense? A. reliability is an all-or-none measurement B. a test may be reliable in one context and unreliable in another C. a reliability coefficient may not be derived for personality tests D. alternate forms reliability may not be derived for personality tests
Cohen - Chapter 05 #2
3. In classical test theory, an observed score on an ability test is presumed to represent the testtaker's A. true score. B. true score less the variance. C. true score combined with extraneous factors. D. the testtaker's true score and error.
Cohen - Chapter 05 #3
4. In an illustrative scenario described in Chapter 5 of your text, a group of 12th grade "whiz kids" in math, newly arrived to the United States from China, perform poorly on a test of 12th grade math. According to the text, what probably accounted for this? A. lower standards in China as compared to the US for measuring math ability B. higher standards in the US as compared to China for earning high grades C. the ability of the Chinese students to read what was required in English D. the reliability of the instrument used to test 12th grade math skills
Cohen - Chapter 05 #4
5. Which is TRUE of measurement error? A. Like error in general, measurement error may be random or systematic. B. Unlike error in general, measurement error may be random or systematic. C. Measurement error is always random. D. Measurement error is always systematic.
Cohen - Chapter 05 #5
6. This variety of error has also been referred to as "noise." It is A. systematic error. B. random error. C. measurement error. D. background error.
Cohen - Chapter 05 #6
7. A Wall Street Securities firm that is actually located on Wall Street is testing a group of candidates for their aptitude in finance and business. As the testing begins, an unexpected "Occupy Wall Street" sit-in takes place. From a psychometric perspective in the context of this testing, the sit-in is viewed as A. systematic error. B. random error. C. test administration error. D. background error.
Cohen - Chapter 05 #7
8. A test entails behavioral observation and rating of front desk clerks to determine whether or not they greet guests with a smile. Which type of error is this test most susceptible to? A. test administration error B. test construction error C. examiner-related error D. polling error
Cohen - Chapter 05 #8
9. Error in the reporting of spousal abuse may result from A. one partner simply forgets all of the details of the abuse. B. one partner misunderstands the instructions for reporting. C. one partner is ashamed to report the abuse. D. All of these
Cohen - Chapter 05 #9
10. Stanley (1971) wrote that in classical test theory, a so-called "true score" is "not the ultimate fact in the book of the recording angel." By this, Stanley meant that A. it would be imprudent to trust in Divine influence when estimating variance. B. the amount of test variance that is true relative to error may never be known. C. it is near impossible to separate fact from fiction with regard to "true scores." D. All of these
Cohen - Chapter 05 #10
11. The term test heterogeneity BEST refers to the extent to which test items measure A. different factors. B. the same factor. C. a unifactorial trait. D. a nonhomogeneous trait.
Cohen - Chapter 05 #11
12. The more homogeneous a test is, the A. less inter-item consistency it can be expected to have. B. more utility the test has for measuring multifaceted variables. C. more inter-item consistency it can be expected to have. D. None of these
Cohen - Chapter 05 #12
13. Which would NOT be useful in estimating a test's inter-item consistency? A. Cronbach's alpha B. the Kuder-Richardson formulas C. the average proportional distance D. a coefficient of equivalence
Cohen - Chapter 05 #13
14. Cronbach's alpha is to similarity of scores on test items as average proportional distance is to A. difference in scores on test items B. inter-item consistency C. test-retest reliability D. parallel forms reliability
Cohen - Chapter 05 #14
15. One of the problems associated with classical test theory has to do with A. the notion that there is a "true score" on a test has great intuitive appeal. B. the fact that CTT assumptions are often characterized as "weak." C. its assumptions concerning the equivalence of all items on a test. D. its assumptions allow for its application in most situations.
Cohen - Chapter 05 #15
16. Which of the following is NOT an alternative to classical test theory cited in your text? A. generalizability theory B. representational theory C. domain sampling theory D. latent trait theory
Cohen - Chapter 05 #16
17. Item response theory is to latent-trait theory as observer reliability is to A. generalizability theory. B. domain sampling theory. C. odd-even reliability. D. inter-scorer reliability.
Cohen - Chapter 05 #17
18. The multiple-choice test items on this examination are all examples of A. dichotomous test items. B. latent trait test items. C. polytomous test items. D. None of these
Cohen - Chapter 05 #18
19. A confidence interval is a range or band of test scores that A. has proven test-retest reliability. B. is calculated using the standard error of the difference. C. is likely to contain the true score. D. None of these
Cohen - Chapter 05 #19
20. The standard error of measurement is A. used to infer how far an observed score is from the true score. B. also known as the standard error of a score. C. used in the context of classical test theory. D. All of these
Cohen - Chapter 05 #20
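Items 19 and 20 turn on how the standard error of measurement (SEM) produces a confidence band around an observed score. The following is a minimal Python sketch, assuming the usual classical test theory formula SEM = SD * sqrt(1 - reliability) and a normal-curve multiplier of 1.96 for a 95% band. The function names and the numbers (SD of 15, reliability of .91, observed score of 100) are hypothetical and are not taken from the text.

```python
import math


def standard_error_of_measurement(sd, reliability):
    """Return SEM = SD * sqrt(1 - r), the classical test theory formula."""
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must lie between 0 and 1")
    return sd * math.sqrt(1.0 - reliability)


def confidence_interval(observed, sd, reliability, z=1.96):
    """Return (low, high): the band of observed +/- z * SEM."""
    sem = standard_error_of_measurement(sd, reliability)
    return observed - z * sem, observed + z * sem


# Hypothetical illustration values (not from the text).
sem = standard_error_of_measurement(sd=15.0, reliability=0.91)
low, high = confidence_interval(observed=100.0, sd=15.0, reliability=0.91)
print(round(sem, 2), round(low, 2), round(high, 2))
```

Some textbooks round the multiplier to 2 SEM for an approximate 95% band; passing z=2.0 reproduces that convention.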
21. Reliability, in a broad statistical sense, is synonymous with A. consistently good. B. consistently bad. C. consistency. D. validity.
Cohen - Chapter 05 #21
22. A reliability coefficient is A. an index. B. a proportion of the total variance attributed to true variance. C. unaffected by a systematic source of error. D. All of these
Cohen - Chapter 05 #22
23. Which of the following is true of systematic error? A. It significantly lowers the reliability of a measure. B. It insignificantly lowers the reliability of a measure. C. It increases the reliability of a measure. D. It has no effect on the reliability of a measure.
Cohen - Chapter 05 #23
24. As the degree of reliability increases, the proportion of A. total variance attributed to true variance decreases. B. total variance attributed to true variance increases. C. total variance attributed to error variance increases. D. None of these
Cohen - Chapter 05 #24
25. Why might ability test scores among testtakers most typically vary? A. because of the true ability of the testtaker B. because of irrelevant, unwanted influences C. All of the above D. None of the above
Cohen - Chapter 05 #25
26. A source of error variance may take the form of A. item sampling. B. testtakers' reactions to environment-related variables such as room temperature and lighting. C. testtaker variables such as amount of sleep the night before a test, amount of anxiety, or drug effects. D. All of the above
Cohen - Chapter 05 #26
27. Computer-scorable items have tended to eliminate error variance due to A. item sampling. B. scorer differences. C. content sampling. D. testtakers' reactions to environmental variables.
Cohen - Chapter 05 #27
28. Which type of reliability estimate is obtained by correlating pairs of scores from the same person (or people) on two different administrations of the same test? A. a parallel-forms estimate B. a split-half estimate C. a test-retest estimate D. an au-pair estimate
Cohen - Chapter 05 #28
29. Which type of reliability estimate would be appropriate only when evaluating the reliability of a test that measures a trait that is presumed to be relatively stable over time? A. parallel-forms B. alternate-forms C. test-retest D. split-half
Cohen - Chapter 05 #29
30. An estimate of test-retest reliability is often referred to as a coefficient of stability when the time interval between the test and retest is more than A. 30 days. B. 60 days. C. 3 months. D. 6 months.
Cohen - Chapter 05 #30
31. Which of the following might lead to a decrease in test-retest reliability? A. the passage of time between the two administrations of the test. B. coaching designed to increase test scores between the two administrations of the test. C. practice with similar test materials between the two administrations of the test. D. All of these
Cohen - Chapter 05 #31
32. Which of the following is TRUE for estimates of alternate- and parallel-forms reliability? A. Two test administrations with the same group are required. B. Test scores may be affected by factors such as motivation, fatigue, or intervening events like practice, learning, or therapy. C. Item sampling is a source of error variance. D. All of these
Cohen - Chapter 05 #32
33. Which of the following is TRUE for parallel forms of a test? A. The means of the observed scores are equal for the two forms. B. The variances of the estimated scores are equal for the two forms. C. The means and variances of the observed scores are equal for the two forms. D. The means and variances of the estimated scores are equal for the two forms.
Cohen - Chapter 05 #33
34. Which source of error variance affects parallel- or alternate-form reliability estimates but does not affect test-retest estimates? A. fatigue B. learning C. practice D. item sampling
Cohen - Chapter 05 #34
35. Which of the following types of reliability estimates is the most expensive due to the costs involved in test development? A. test-retest B. parallel-form C. internal-consistency D. Spearman's rho
Cohen - Chapter 05 #35
36. What term refers to the degree of correlation between all the items on a scale? A. inter-item homogeneity B. inter-item consistency C. inter-item heterogeneity D. parallel-form reliability
Cohen - Chapter 05 #36
37. Test-retest estimates of reliability are referred to as measures of _____, and split-half reliability estimates are referred to as measures of _____. A. true scores; error scores B. internal consistency; stability C. interscorer reliability; consistency D. stability; internal consistency
Cohen - Chapter 05 #37
38. Which of the following is usually minimized when using split-half estimates of reliability as compared with test-retest or parallel/alternate-form estimates of reliability? A. time and expense B. reliability and validity C. reliability only D. time spent in scoring and interpretation
Cohen - Chapter 05 #38
39. Which of the following factors may influence a split-half reliability estimate? A. fatigue B. anxiety C. item difficulty D. All of these
Cohen - Chapter 05 #39
40. Internal-consistency estimates of reliability are inappropriate for A. reading achievement tests. B. scholastic aptitude/intelligence tests. C. word processing tests based on speed. D. tests purporting to measure a single personality trait.
Cohen - Chapter 05 #40
41. The Spearman-Brown formula is used for: A. correcting for one half of the test by estimating the reliability of the whole test. B. determining how many additional items are needed to increase reliability up to a certain level. C. determining how many items can be eliminated without reducing reliability below a predetermined level. D. All of these
Cohen - Chapter 05 #41
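The uses of the Spearman-Brown formula listed in item 41 can be illustrated with a short Python sketch. It assumes the general form r_new = n * r / (1 + (n - 1) * r), where n is the factor by which test length changes, and its algebraic rearrangement for n. The function names and the example reliabilities are hypothetical and are not drawn from the text.

```python
def spearman_brown(r, n):
    """Projected reliability when test length is multiplied by the factor n."""
    return (n * r) / (1.0 + (n - 1.0) * r)


def length_factor_needed(r, target):
    """Length factor n required to move reliability from r to target."""
    return (target * (1.0 - r)) / (r * (1.0 - target))


# Hypothetical illustration values (not from the text).
whole_test = spearman_brown(r=0.60, n=2)    # step a half-test correlation up to full length
shortened = spearman_brown(r=0.90, n=0.5)   # effect of cutting a test to half its length
factor = length_factor_needed(r=0.75, target=0.90)
print(round(whole_test, 3), round(shortened, 3), round(factor, 2))
```

A factor above 1 means items must be added; a factor below 1 means the test could be shortened and still meet the target, under the formula's assumption that the added or removed items are equivalent to the rest.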
42. For a heterogeneous test, measures of internal-consistency reliability will tend to be _____ compared with other methods of estimating reliability. A. higher B. lower C. very similar or higher D. more robust
Cohen - Chapter 05 #42
43. Typically, adding items to a test will have what effect on the test's reliability? A. Reliability will decrease. B. Reliability will increase. C. Reliability will stay the same. D. Reliability will first increase and then decrease.
Cohen - Chapter 05 #43
44. Error variance for measures of inter-item consistency comes from A. fatigue. B. motivation. C. a testtaker practice effect. D. heterogeneity of the content.
Cohen - Chapter 05 #44
45. If items from a test are measuring the same trait, estimates of reliability yielded from split-half methods will typically be _____ as compared to estimates from KR-20. A. higher B. lower C. similar D. approximately the same
Cohen - Chapter 05 #45
46. Which of the following is NOT an acceptable way to divide a test when using the split-half reliability method? A. Randomly assign items to each half of the test. B. Assign odd-numbered items to one half and even-numbered items to the other half of the test. C. Assign the first half of the items to one half of the test and the second half of the items to the other half of the test. D. Assign easy items to one half of the test and difficult items to the other half of the test.
Cohen - Chapter 05 #46
47. If items on a test are measuring very different traits, estimates of reliability yielded from split-half methods will typically be _____ as compared with estimates from KR-20. A. higher B. lower C. similar D. approximately the same
Cohen - Chapter 05 #47
48. KR-20 is the statistic of choice for tests with which types of items? A. multiple-choice B. true-false C. All of these D. None of these
Cohen - Chapter 05 #48
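For readers who want to see how KR-20 is computed on right/wrong item data, here is a small Python sketch. It assumes the standard form KR-20 = (k / (k - 1)) * (1 - sum(p * q) / total-score variance), with the population variance used for the total scores so that it is consistent with p * q as the item variance. The response matrix and the function name are hypothetical illustration data, not from the text.

```python
from statistics import pvariance


def kr20(responses):
    """KR-20 for a list of testtaker rows, each a list of 0/1 item scores."""
    n_people = len(responses)
    k = len(responses[0])
    # p = proportion passing each item; q = 1 - p
    p_values = [sum(row[i] for row in responses) / n_people for i in range(k)]
    sum_pq = sum(p * (1.0 - p) for p in p_values)
    total_variance = pvariance([sum(row) for row in responses])
    return (k / (k - 1)) * (1.0 - sum_pq / total_variance)


# Hypothetical data: 4 testtakers, 3 dichotomously scored items.
data = [
    [1, 1, 1],
    [1, 1, 0],
    [1, 0, 0],
    [0, 0, 0],
]
print(kr20(data))
```

The sketch makes no attempt to handle the degenerate case in which every testtaker earns the same total score, where the total variance is zero and the coefficient is undefined.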
49. The KR-21 reliability estimate was developed A. to yield greater consistency in reliability coefficients. B. to facilitate computation by hand. C. for use with less homogeneous items. D. because Kuder wanted to "one-up" Richardson's 20.
Cohen - Chapter 05 #49
50. Which is NOT an assumption that should be met in order to use KR-21? A. Items should be dichotomous. B. Items should be of equal difficulty. C. Items should be homogeneous. D. Items should be scorable by computer.
Cohen - Chapter 05 #50
51. Which of the following is generally the preferred statistic for obtaining a measure of internal-consistency reliability? A. KR-20 B. KR-21 C. Kendall's Tau D. coefficient alpha
Cohen - Chapter 05 #51
52. Coefficient alpha is appropriate to use with all of the following test formats EXCEPT A. multiple-choice. B. true-false. C. short-answer for which partial credit is awarded. D. essay exam with no partial credit awarded.
Cohen - Chapter 05 #52
53. The "20" and "21" in KR-20 and KR-21 represent A. numbers held constant in the denominator. B. numbers held constant in the numerator. C. the order in which the formulas were created. D. the age of Fred Kuder's son and nephew at the time the formulas were developed.
Cohen - Chapter 05 #53
54. Coefficient alpha is an expression of A. the mean of split-half correlations between odd- and even-numbered items. B. the mean of split-half correlations between first- and second-half items. C. the mean of all possible split-half correlations. D. the mean of the best or "alpha" level split-half correlations.
Cohen - Chapter 05 #54
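Coefficient alpha is usually computed from item and total-score variances rather than by averaging split-half coefficients directly. The Python sketch below assumes the common computational form alpha = (k / (k - 1)) * (1 - sum of item variances / total-score variance), using population variances throughout. The function name and both data sets are hypothetical and are not taken from the text.

```python
from statistics import pvariance


def coefficient_alpha(scores):
    """Coefficient alpha for a list of testtaker rows of item scores."""
    k = len(scores[0])
    # Variance of each item across testtakers (columns of the matrix).
    item_variances = [pvariance([row[i] for row in scores]) for i in range(k)]
    total_variance = pvariance([sum(row) for row in scores])
    return (k / (k - 1)) * (1.0 - sum(item_variances) / total_variance)


# Hypothetical partial-credit data: two items that rank testtakers identically.
consistent = [[1, 1], [2, 2], [3, 3]]

# Hypothetical right/wrong data: 4 testtakers, 3 items.
dichotomous = [[1, 1, 1], [1, 1, 0], [1, 0, 0], [0, 0, 0]]

print(coefficient_alpha(consistent), coefficient_alpha(dichotomous))
```

Because the variance of a 0/1 item equals p * q, alpha computed this way on dichotomous data gives the same value as KR-20, while the same function also accepts items scored with partial credit.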
55. A coefficient alpha over .9 may indicate that A. the items in the test are too dissimilar. B. the test is not reliable. C. the items in the test are redundant. D. the test is biased against low-ability individuals.
Cohen - Chapter 05 #55
56. Which of the following is TRUE about coefficient alpha? A. Kuder thought it to be the single best measure of reliability. B. It was first conceived by Alfalfa Alpha. C. It is a characteristic of a particular set of scores, not of the test itself. D. None of these
Cohen - Chapter 05 #56
57. A synonym for interscorer reliability is A. interjudge reliability B. observer reliability C. interrater reliability D. All of these
Cohen - Chapter 05 #57
58. Which BEST conveys the meaning of an inter-scorer reliability estimate of .90? A. Ninety percent of the scores obtained are reliable. B. Ninety percent of the variance in the scores assigned by the scorers was attributed to true differences and 10% to error. C. Ten percent of the variance in the scores assigned by the scorers was attributed to true differences and 90% to error. D. Ten percent of the test's items are in need of revision according to the majority of the test's users.
Cohen - Chapter 05 #58
59. When more than two scorers are used to determine inter-scorer reliability, the statistic of choice is A. Pearson r. B. Spearman's rho. C. KR-20. D. coefficient alpha.
Cohen - Chapter 05 #59
60. For determining the reliability of tests scored using nominal scales of measurement, the statistic of choice is A. Kendall's Tau. B. the kappa statistic. C. KR-20. D. coefficient alpha.
Cohen - Chapter 05 #60
61. If a test is homogeneous A. it is functionally uniform throughout. B. it will likely yield a high internal-consistency reliability estimate compared with a test-retest reliability estimate. C. it would be reasonable to expect a high degree of internal consistency. D. All of these
Cohen - Chapter 05 #61
62. Which type(s) of reliability estimates would be most appropriate for a measure of heart rate? A. test-retest B. alternate-form C. parallel form D. internist consistency
Cohen - Chapter 05 #62
63. If a time limit is long enough to allow test-takers to attempt all items, and if some items are so difficult that no test-taker is able to obtain a perfect score, then the test is referred to as a _____ test. A. speed B. power C. reliable D. valid
Cohen - Chapter 05 #63
64. Typically, speed tests A. contain items of a uniform difficulty level. B. are completed by fewer than 1% of all test-takers. C. have low validity coefficients. D. yield high rates of false positives.
Cohen - Chapter 05 #64
65. Which type(s) of reliability estimates would be appropriate for a speed test? A. test-retest B. alternate-form C. split-half from two independent testing sessions D. All of these
Cohen - Chapter 05 #65
66. Which of the following would result in the LEAST appropriate estimate of reliability for a speed test? A. test-retest B. alternate-form C. split-half from a single administration of the test D. split-half from two independent testing sessions
Cohen - Chapter 05 #66
67. A Kuder-Richardson (KR) or split-half estimate of reliability for a speed test would provide an estimate that is A. spuriously low. B. spuriously high. C. insignificant. D. equal to a test-retest method.
Cohen - Chapter 05 #67
68. A measure of clerical speed is obtained by a test that has respondents alphabetize index cards. The manual for this test cites a split-half reliability coefficient for a single administration of the test of .95. What might you conclude? A. The test is highly reliable. B. The published reliability estimate is spuriously low and would have been higher had another estimate been used. C. The split-half estimate should not have been used in this instance. D. Clerical speed is too vague a construct to measure.
Cohen - Chapter 05 #68
69. The Spearman-Brown formula can be used for which types of tests? A. speed and multiple-choice B. true-false and multiple-choice C. speed, true-false, and multiple-choice D. trade school and driving tests
Cohen - Chapter 05 #69
70. An estimate of the reliability of a speed test is a measure of A. the stability of the test. B. the consistency of the response speed. C. the homogeneity of the test items. D. All of these
Cohen - Chapter 05 #70
71. Use of the Spearman-Brown formula would be INAPPROPRIATE to A. estimate the effect on reliability of shortening a test. B. determine the number of items needed in a test to obtain the desired level of reliability. C. estimate the internal consistency of a speed test. D. All of these
Cohen - Chapter 05 #71
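Study note: the Spearman-Brown formula referred to in the items above can be sketched as below. The function names and the numeric values are hypothetical, and the formula assumes that any items added or removed are equivalent in content and difficulty to the existing ones.

```python
def spearman_brown(r, n):
    """Projected reliability when test length is multiplied by n.

    r is the reliability of the existing test (or the half-test
    correlation when n == 2); n < 1 models a shortened test.
    """
    return (n * r) / (1 + (n - 1) * r)


def length_factor_needed(r, target):
    """Factor by which the test must be lengthened to reach `target`."""
    return (target * (1 - r)) / (r * (1 - target))


# Hypothetical values, chosen only for illustration.
full_from_halves = spearman_brown(0.80, 2)    # step up a split-half correlation
halved_test = spearman_brown(0.90, 0.5)       # effect of cutting a test in half
factor = length_factor_needed(0.80, 0.90)     # lengthening needed to go from .80 to .90
print(round(full_from_halves, 3), round(halved_test, 3), round(factor, 2))
```

The second helper is simply the first formula solved for n, which is why the two can be checked against each other.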
72. Interpretations of criterion-referenced tests are typically made with respect to A. the total number of items the examinee responded to. B. the material that the examinee evidenced mastery of. C. a comparison of the examinee's performance with that of others who took the test. D. a formula that takes into account the total number of items for which no response was scorable.
Cohen - Chapter 05 #72
73. Traditional measures of reliability are inappropriate for criterion-referenced tests because variability A. is maximized with criterion-referenced tests. B. is minimized with criterion-referenced tests. C. is variable with criterion-referenced tests. D. cannot be determined with criterion-referenced tests.
Cohen - Chapter 05 #73
74. If traditional measures of reliability are applied to a criterion-referenced test, the reliability estimate will likely be A. spuriously low. B. spuriously high. C. exactly zero. D. None of these
Cohen - Chapter 05 #74
75. The fact that the length of a test influences the size of the reliability coefficient is based on which theory of measurement? A. classical test theory (CTT) B. generalizability theory C. domain sampling theory D. item response theory (IRT)
Cohen - Chapter 05 #75
76. Which estimate of reliability is most consistent with the domain sampling theory? A. test-retest B. alternate-form C. internal-consistency D. interscorer
Cohen - Chapter 05 #76
77. Classical reliability theory estimates the portion of a test score that is attributed to _____, and domain sampling theory estimates _____. A. specific sources of variation; error B. error; specific sources of variation C. the skills being measured; variation D. the skills being measured; content knowledge
Cohen - Chapter 05 #77
78. Item response theory (IRT) focuses on the A. circumstances that inspired the development of the test. B. test administration variables. C. individual items of a test. D. "how and why" of the Interborough Rapid Transit line.
Cohen - Chapter 05 #78
79. Generalizability theory focuses on which of the following? A. the circumstances under which a test was developed B. the circumstances under which a test is administered C. the circumstances under which a test is interpreted D. All of these
Cohen - Chapter 05 #79
80. The standard deviation of a theoretically normal distribution of test scores obtained by one person on equivalent tests is A. the standard error of the difference between means. B. the standard error of measurement. C. the standard deviation of the reliability coefficient. D. the variance.
Cohen - Chapter 05 #80
81. Which of the following is NOT a part of the formula for the standard error of measurement for a particular test? A. the validity of the test B. the reliability of the test C. the standard deviation of the group of test scores D. Both b and c
Cohen - Chapter 05 #81
82. "Sixty-eight percent of the scores for a particular test fall between 58 and 61" is a statement regarding A. the utility of a test. B. the reliability of a test. C. the validity of a test. D. None of these
Cohen - Chapter 05 #82
83. The standard error of measurement of a particular test of anxiety is 8. A student earns a score of 60. What is the confidence interval for this test score at the 95% level? A. 52-68 B. 40-68 C. 44-76 D. 36-84
Cohen - Chapter 05 #83
84. As the confidence interval increases, the range of scores into which a single test score falls is likely to A. decrease. B. increase. C. remain the same. D. alternately decrease and increase.
Cohen - Chapter 05 #84
85. As the reliability of a test increases, the standard error of measurement A. increases. B. decreases. C. remains the same. D. alternately increases, then decreases.
Cohen - Chapter 05 #85
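Study note: the items above turn on the usual formula for the standard error of measurement, SEM = SD * sqrt(1 - reliability), and on score bands built from it. The sketch below uses hypothetical numbers and function names; it adopts the common rule of thumb that about 68% of cases fall within 1 SEM and about 95% within 2 SEM (1.96 would be the exact normal-curve multiplier).

```python
import math


def standard_error_of_measurement(sd, reliability):
    """SEM = SD * sqrt(1 - r), with r the test's reliability coefficient."""
    if not 0 <= reliability <= 1:
        raise ValueError("reliability must lie between 0 and 1")
    return sd * math.sqrt(1 - reliability)


def confidence_band(score, sem, n_sem=2):
    """Band of n_sem standard errors around an obtained score."""
    return (score - n_sem * sem, score + n_sem * sem)


# Hypothetical test: SD = 15, compared at two reliability levels.
sem_high_rel = standard_error_of_measurement(15, 0.96)
sem_low_rel = standard_error_of_measurement(15, 0.84)
low, high = confidence_band(100, sem_high_rel)  # roughly 95% band, score = 100
print(round(sem_high_rel, 2), round(sem_low_rel, 2), round(low, 1), round(high, 1))
```

Holding the standard deviation fixed and changing only the reliability makes the relationship between the two quantities easy to inspect.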
86. If the standard deviations of two tests are identical but the reliability is lower for Test A as compared to Test B, then the standard error of measurement will be _____ for Test A as compared with Test B. A. higher B. lower C. the same D. None of these
Cohen - Chapter 05 #86
87. Which statistic can help the test user determine how large a difference must exist for scores yielded from two different tests to be considered statistically different? A. standard error of measurement between two scores B. standard error of the difference between two scores C. observed variance minus error variance D. standard error of the difference between two means
Cohen - Chapter 05 #87
88. The standard error of the difference between two scores is larger than the standard error of measurement for either score because the standard error of the difference between the two scores is affected by A. the true score variance of each score. B. the standard deviation of each score summed. C. the measurement error inherent in both scores. D. All of these
Cohen - Chapter 05 #88
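Study note: a minimal sketch of the standard error of the difference between two scores, using the usual formula sqrt(SEM1^2 + SEM2^2) and its equivalent SD * sqrt(2 - r1 - r2) for two tests expressed on the same scale. All names and numbers here are hypothetical.

```python
import math


def se_difference(sem_1, sem_2):
    """Standard error of the difference from the two tests' SEMs."""
    return math.sqrt(sem_1 ** 2 + sem_2 ** 2)


def se_difference_from_reliabilities(sd, r_1, r_2):
    """Same quantity when both tests share one standard deviation."""
    return sd * math.sqrt(2 - r_1 - r_2)


# Hypothetical SEMs of 3 and 4 score points.
sed = se_difference(3, 4)
# Hypothetical tests on a common scale: SD = 10, reliabilities .91 and .84.
sed_same_scale = se_difference_from_reliabilities(10, 0.91, 0.84)
print(sed, round(sed_same_scale, 2))
```

Because the two error variances are summed before the square root is taken, the result is never smaller than either SEM taken alone.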
89. A guidance counselor wishes to determine if a student scored higher on a mathematics test than on a reading test. What statistic(s) would be MOST useful? A. the standard error of measurement for each test score B. the standard error of the difference between two scores C. the raw score on each test as well as the mean of each distribution D. the mean of each distribution and index of test difficulty for each test.
Cohen - Chapter 05 #89
90. The _____ in generalizability theory is analogous to the reliability coefficient in classical test theory. A. universe coefficient B. coefficient of generalizability C. universe score D. Roulin coefficient
Cohen - Chapter 05 #90
91. According to Cronbach et al.'s generalizability theory, "facets" include A. the number of test items. B. the amount of training the examiners received. C. the purpose of administering the test. D. All of these
Cohen - Chapter 05 #91
Cohen - Chapter 05 #92
93. In classical test theory, there exists only one true score. In Cronbach's generalizability theory, how many "true scores" exist? A. one B. as many as the number of times the test is administered to the same individual C. many, depending on the number of different universes D. None of these
Cohen - Chapter 05 #93
Cohen - Chapter 05 #94
95. If a device to measure blood pressure consistently overestimated every assessee's actual blood pressure by 10 units, which of the following would be TRUE of the reliability of this measuring device as the years passed? A. It would increase. B. It would decrease. C. It would not be affected. D. It would alternately decrease and increase.
Cohen - Chapter 05 #95
96. In general, which of the following is TRUE of the relationship between the magnitude of the test-retest reliability estimate and the length of the interval between test administrations? A. The longer the interval, the lower the reliability coefficient. B. The longer the interval, the higher the reliability coefficient. C. The magnitude of the reliability coefficient is typically not affected by the length of the interval between test administrations. D. The magnitude of the reliability coefficient is always affected by the length of the interval between test administrations, but one cannot predict how it is affected.
Cohen - Chapter 05 #96
97. What is the difference between alternate forms and parallel forms of a test? A. Alternate forms do not necessarily yield test scores with equal means and variances. B. Alternate forms are designed to be equivalent only with regard to level of difficulty. C. Alternate forms are different only with respect to how they are administered. D. There are no differences between alternate and parallel forms of a test.
Cohen - Chapter 05 #97
98. Coefficient alpha is the reliability estimate of choice for tests A. with dichotomous items and binary scoring. B. with homogeneous items. C. that can be scored along a continuum of values. D. that contain heterogeneous item content and binary scoring.
Cohen - Chapter 05 #98
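Study note: a minimal sketch of coefficient alpha alongside KR-20, using the standard formulas with population variances. The data set (six hypothetical test-takers, four right/wrong items) and the function names are invented for illustration.

```python
from statistics import pvariance


def coefficient_alpha(rows):
    """Coefficient alpha; rows are test-takers, columns are item scores."""
    k = len(rows[0])
    item_var_sum = sum(pvariance(col) for col in zip(*rows))
    total_var = pvariance([sum(row) for row in rows])
    return (k / (k - 1)) * (1 - item_var_sum / total_var)


def kr20(rows):
    """KR-20 for items scored 0 or 1 (p = proportion passing each item)."""
    k = len(rows[0])
    n = len(rows)
    pq_sum = 0.0
    for col in zip(*rows):
        p = sum(col) / n
        pq_sum += p * (1 - p)
    total_var = pvariance([sum(row) for row in rows])
    return (k / (k - 1)) * (1 - pq_sum / total_var)


# Hypothetical right (1) / wrong (0) responses.
scores = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
    [1, 1, 1, 1],
]
alpha = coefficient_alpha(scores)
kr = kr20(scores)
print(round(alpha, 3), round(kr, 3))
```

For 0/1 items the variance of an item equals p * (1 - p), so the two functions return the same value on this data; alpha additionally accepts items scored along a range of values.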
99. In which type(s) of reliability estimates would test construction NOT be a significant source of error variance? A. test-retest B. alternate-form C. split-half D. Kuder-Richardson
Cohen - Chapter 05 #99
100. If the variance of either variable is restricted by the sampling procedures used, then the magnitude of the coefficient of reliability will be A. lowered. B. raised. C. unaffected. D. affected only in tests with a true-false format.
Cohen - Chapter 05 #100
101. For criterion-referenced tests, which of the following reliability estimates is recommended? A. test-retest reliability estimates B. alternate-form reliability estimates C. split-half reliability estimates D. None of these
Cohen - Chapter 05 #101
102. Which of the following is TRUE of domain sampling theory? A. It supports the existence of a "true score" when measuring psychological constructs. B. It can be used to argue against the existence of a "true score" when measuring psychological constructs. C. Neither Kuder nor Richardson found it to have any applied value. D. All of these
Cohen - Chapter 05 #102
103. If a student received a score of 50 on a math test with a standard error of measurement of 3, which of the following statements would be TRUE of the "true score"? A. In 68% of the cases, the "true score" would be expected to be between 44 and 56. B. In 68% of the cases, the "true score" would be expected to be between 47 and 53. C. In 95% of the cases, the "true score" would be expected to be between 47 and 53. D. In 95% of the cases, the "true score" would be expected to be between 44 and 56.
Cohen - Chapter 05 #103
104. A psychologist administers a test and the test-taker scores a 52. If the cut-off score for eligibility for a particular program is 50, what index will best help the psychologist determine how much confidence to place in the test-taker's obtained score of 52? A. the standard error of difference B. the standard error of measurement C. measures of central tendency: mean, median, or mode D. measures of variability such as the standard deviation
Cohen - Chapter 05 #104
105. Which of the following is TRUE of both the standard error of measurement and the standard error of difference? A. Both provide confidence levels. B. Both can be used to compute confidence intervals for short answer tests. C. Both can be used to compare performance between two different tests. D. Both are abbreviated by SEM.
Cohen - Chapter 05 #105
106. Test-retest reliability estimates of breathalyzers have A. a margin of error of approximately one-hundredth of a percentage point. B. a margin of error of one percentage point. C. a margin of error so high that they must be deemed unreliable. D. not been done in the State of Alaska.
Cohen - Chapter 05 #106
107. A police officer administers a breathalyzer test to a suspected drunk driver, does not put on his glasses to read the meter, and as a result, mistakenly records the blood alcohol level. This is the kind of mistake that is BEST associated with which type of reliability estimates? A. test-retest B. interscorer C. internal-consistency D. situational
Cohen - Chapter 05 #107
108. Which of the following statements is TRUE regarding the differences between a power test and a speed test? A. Power tests involve physical strength; speed tests do not. B. In a power test, the testtaker has time to complete all items; in a speed test, a specific time limit is imposed. C. In a power test, a broad range of knowledge is assessed; in a speed test, a narrower range of knowledge is assessed. D. Both b and c
Cohen - Chapter 05 #108
109. The index that allows a test user to compare two people's scores on a specific test to determine if the true scores are likely to be different is A. the standard error of the mean. B. the standard error of the difference. C. the standard deviation. D. the correlation coefficient.
Cohen - Chapter 05 #109
110. Which type of reliability is directly affected by the heterogeneity of a test? A. test-retest B. interrater C. internal-consistency D. alternate-forms or parallel-forms
Cohen - Chapter 05 #110
111. Generalizability theory is most closely related to A. developing norms. B. item analysis. C. test reliability. D. the way things are "in general."
Cohen - Chapter 05 #111
112. A test of attention span has a reliability coefficient of .84. The average score on the test is 10, with a standard deviation of 5. Lawrence received a score of 64 on the test. We can be 95% sure that Lawrence's "true" attention span score falls between A. 63 and 65. B. 62 and 66. C. 60 and 68. D. 54 and 74.
Cohen - Chapter 05 #112
113. By definition, estimates of reliability can range from _____ to _____. A. -3.00; +3.00 B. 1; 10 C. 0; 1 D. -1 to 1
Cohen - Chapter 05 #113
114. Using estimates of internal consistency, which of the following tests would likely yield the highest reliability coefficients? A. a test of general intelligence B. a test of achievement in a basic skill such as mathematics C. a test of reading comprehension D. a test of vocational interest
Cohen - Chapter 05 #114
115. What type of reliability estimate is appropriate for use in a comparison of "Form A" to "Form B" of a picture vocabulary test? A. test-retest B. alternate-forms C. inter-rater D. internal-consistency
Cohen - Chapter 05 #115
116. What index of reliability would you use to compare two evaluators' assessments of a group of job applicants? A. KR-20 B. coefficient alpha C. the kappa statistic D. the Spearman-Brown correction
Cohen - Chapter 05 #116
117. Which of the following is TRUE of the standard error of measurement? A. The larger the standard error of measurement, the better. B. The standard error of measurement is inversely related to the standard deviation (that is, when one goes up, the other goes down). C. The standard error of measurement is inversely related to reliability (that is, when one goes up, the other goes down). D. A low standard error of measurement is indicative of low validity.
Cohen - Chapter 05 #117
118. What type of reliability estimate is obtained by correlating pairs of scores from the same person on two different administrations of the same test? A. parallel-forms B. split-half C. interrater D. test-retest
Cohen - Chapter 05 #118
119. A test containing 100 items is revised by deleting 20 items. What might be expected to happen to the magnitude of the reliability estimate for that test? A. It will be expected to increase. B. It will be expected to decrease. C. It will be expected to stay the same. D. It cannot be determined based on the information provided.
Cohen - Chapter 05 #119
120. In the formula X = T + E, T refers to A. the true score. B. the time factor. C. the average test score. D. test-retest reliability.
Cohen - Chapter 05 #120
121. The greater the proportion of the total variance attributed to true variance, the more _____ the test. A. scientific B. variable C. reliable D. expensive
Cohen - Chapter 05 #121
122. A score earned by a testtaker on a psychological test may BEST be viewed as equal to A. the raw score plus the observed score. B. the error score. C. the true score. D. the true score plus error.
Cohen - Chapter 05 #122
123. Which is NOT a possible source of error variance? A. test administration B. test scoring C. test interpretation D. All are possible sources of error variance.
Cohen - Chapter 05 #123
124. A goal of a test developer is to A. maximize error variance. B. minimize true variance. C. maximize true variance. D. minimize stress for testtakers.
Cohen - Chapter 05 #124
125. Which of the following is TRUE about systematic and unsystematic error in the assessment of physical and psychological abuse? A. Few sources of unsystematic error exist, due to the nature of what is being assessed. B. Few sources of systematic error exist. C. Gender represents a source of systematic error. D. None of these
Cohen - Chapter 05 #125
126. In general, approximately what percentage of scores would be expected to fall within two standard deviations above or below the standard error of measurement of the "true score" on a test? A. 85% B. 90% C. 95% D. 9
Cohen - Chapter 05 #126
127. In Chapter 5 of your textbook, you read of the "writing surface on a school desk riddled with heart carvings, the legacy of past years' students who felt compelled to express their eternal devotion to someone now long forgotten." This imagery was designed to graphically illustrate sources of error variance during test A. development. B. administration. C. scoring. D. interpretation.
Cohen - Chapter 05 #127
128. In the Chapter 5 Meet an Assessment Professional feature, Dr. Bryce B. Reeve noted the necessity for very brief questionnaires in his work due to the fact that many of his clients were: A. young children with very short attention spans. B. seriously ill and would find taking tests burdensome. C. visually impaired and unable to focus for an extended period of time. D. All of these
Cohen - Chapter 05 #128
129. In the Chapter 5 Meet an Assessment Professional feature, Dr. Bryce B. Reeve cited an experience in which he learned that the "Excellent" response category on a test was best translated as meaning _____ in Chinese? A. "super bad" B. "superlative" C. "bad" D. None of these
Cohen - Chapter 05 #129
130. The items of a personality test are characterized as heterogeneous in nature. This tells us that the test measures A. aspects of family history. B. ability to relate to the opposite sex. C. unconscious motivation. D. more than one trait.
Cohen - Chapter 05 #130
131. "Coefficient alpha 20" is a reference to A. a variant of the Kuder-Richardson KR-20 formula. B. the 20th in a series of formulas developed by Cronbach. C. a 20th-century revision of a Galtonian expression. D. None of these
Cohen - Chapter 05 #131
132. With regard to a value found for coefficient alpha, A. "bigger is always better." B. "smaller is always better." C. "negative is best." D. None of these
Cohen - Chapter 05 #132
133. Most reliability coefficients, regardless of the specific type of reliability they are measuring, range in value from: A. -1 to +1 B. 0 to 100 C. 0 to 1. D. negative infinity to positive infinity
Cohen - Chapter 05 #133
134. All indices of reliability provide an index that is a characteristic of a particular A. test. B. group of test scores. C. trait. D. approach to measurement.
Cohen - Chapter 05 #134
135. The precise amount of error inherent in the reliability estimate published in a test manual will vary with A. the purchase price of the test (the more expensive, the less the error). B. the sample of test-takers from which the data were drawn. C. the population of test users actually using a published test. D. All of these
Cohen - Chapter 05 #135
136. Different types of reliability coefficients A. all reflect the same sources of error variance. B. may reflect different sources of error variance. C. never reflect the same source of error variance. D. reflect on error variance during leisure activities.
Cohen - Chapter 05 #136
137. A test of infant development contains three scales: (1) Cognitive Ability, (2) Motor Development, and (3) Behavior Rating. Because these three scales are designed to measure different characteristics (that is, they are not homogeneous), it would be inappropriate to combine the three scales in calculating estimates of the test's A. alternate-forms reliability. B. internal-consistency reliability. C. test-retest reliability. D. interrater reliability.
Cohen - Chapter 05 #137
138. The fact that young children develop rapidly and in "growth spurts" is a problem when it comes to the estimation of which type of reliability for an infant development scale? A. internal-consistency reliability B. alternate-forms reliability C. test-retest reliability D. interrater reliability
Cohen - Chapter 05 #138
139. In the language of psychological testing and assessment, reliability BEST refers to A. how well a test measures what it was originally designed to measure. B. the complete lack of any systematic error. C. the proportion of total variance that can be attributed to true variance. D. whether or not a test publisher consistently publishes high-quality instruments.
Cohen - Chapter 05 #139
140. Because of the unique problems in assessing very young children, which of the following would be the BEST practice when attempting to estimate the reliability of tests designed to measure cognitive and motor abilities in infants? A. Use relatively short test-retest intervals. B. Use relatively long test-retest intervals. C. Do not use the test-retest method for estimating reliability of the test. D. Use only inter-scorer reliability estimates.
Cohen - Chapter 05 #140
141. If the variance of either variable in a correlational analysis is restricted by the sampling procedure used, then the resulting correlation coefficient tends to be A. higher. B. lower. C. unaffected. D. unstable.
Cohen - Chapter 05 #141
142. If the variance of either variable in a correlational analysis is inflated by the sampling procedure used, then the resulting correlation coefficient tends to be A. higher. B. lower. C. unaffected. D. unstable.
Cohen - Chapter 05 #142
143. The directions for scoring a particular motor ability test instruct the examiner to "Give credit if the child holds his hands open most of the time." Because what constitutes "most of the time" is not specifically defined, directions such as these could result in lowered reliability estimates for A. test-retest reliability. B. alternate-form reliability. C. inter-rater reliability. D. parallel forms reliability.
Cohen - Chapter 05 #143
144. A vice president (VP) of personnel employs a "Corporate Screening Test" in the hiring process. For future testing purposes, the VP maintains records of scores achieved by _____ as opposed to _____ in order to avoid restriction of range effects. A. job applicants; hired employees B. hired employees; job applicants C. successful employees; hired employees D. successful employees; other corporate officers
Cohen - Chapter 05 #144
145. In the Everyday Psychometrics for Chapter 5, psychometric aspects of the Breathalyzer were discussed. In one challenge to the test-retest reliability of this device, the court found A. the margin of error was not taken into account by the legislature when it originally wrote the law. B. the margin of error was taken into account by the legislature when it originally wrote the law. C. the police officer had erred by administering the test at headquarters and not the site of the infraction. D. None of these
Cohen - Chapter 05 #145
146. The Everyday Psychometrics for Chapter 5 dealt with psychometric aspects of the Breathalyzer. We learned that in the state of New Jersey, it is legal and proper to administer a Breathalyzer test to a drunk driver A. only at the arrest scene. B. at police headquarters. C. even if the officer is intoxicated. D. while a suspect is sucking on a breath mint.
Cohen - Chapter 05 #146
147. In the Chapter 5 Everyday Psychometrics on psychometric aspects of the Breathalyzer, we read of a police officer who intentionally recorded incorrect readings from the instrument. Such an event would most appropriately be recalled in the technical manual for this instrument under the heading A. "Test-Retest Reliability." B. "Internal Consistency Reliability." C. "Inter-Scorer Reliability." D. "How to Fake Findings with the Breathalyzer 900a."
Cohen - Chapter 05 #147
148. According to generalizability theory, a variable such as "number of items in the test" is a description of one A. facet of the universe. B. true element of the dominion. C. dominion in the domain. D. None of these
Cohen - Chapter 05 #148
149. Advocates of generalizability theory prefer the use of which of the following terms as an alternative to the use of the term "reliability"? A. generalizability B. universality C. regularity D. dependability
Cohen - Chapter 05 #149
150. Which is the BEST example of a dynamic characteristic? A. the stress level of a trapeze flyer at a circus B. the intelligence of a college student during Spring Break C. the anti-authority attitude of an inmate serving a life term D. None of these
Cohen - Chapter 05 #150
151. As used in Chapter 5 of your text, the term inflation of variance is synonymous with A. restriction of variance. B. restriction of range. C. inflation of range. D. None of these
Cohen - Chapter 05 #151
152. In the term latent trait theory, "latent" is a synonym for A. invisible. B. state. C. undeveloped. D. dormant.
Cohen - Chapter 05 #152
153. IRT is a term used to refer to A. a model that has many parameters. B. a parameter that has many models. C. a family of models for data analysis. D. a dysfunctional family of models.
Cohen - Chapter 05 #153
154. A polytomous test item is a test item A. that has multiple tomous's attached B. that has varied tomous's attached C. that has multiple and varied tomous's attached D. None of these
Cohen - Chapter 05 #154
155. The Rasch model A. was developed by a Danish mathematician named Rasch. B. is an IRT model with specific assumptions about the underlying distribution. C. was devised from generalizability. D. Both a and b
Cohen - Chapter 05 #155
156. Why isn't IRT used more by "mom-and-pop" test developers such as classroom teachers? A. most classroom teachers were trained in generalizability theory B. IRT has no application in classroom tests C. applying IRT requires statistical sophistication D. All of these
Cohen - Chapter 05 #156
157. Who are the primary users of IRT? A. classroom teachers B. commercial test producers C. instructors at universities in Departments of Education D. Georg Rasch's twin sisters
Cohen - Chapter 05 #157
158. Which of the following is NOT an assumption attendant to the use of IRT? A. the assumption of unidimensionality B. the assumption of heteroskedasticity C. the assumption of local independence D. the assumption of monotonicity
Cohen - Chapter 05 #158
159. In IRT, the single, continuous latent construct being measured is often symbolized by the Greek letter: A. alpha. B. beta. C. psi. D. theta.
Cohen - Chapter 05 #159
160. If some of the items on a test were locally dependent, it would be reasonable to expect that: A. all test items were designed for members of a specific culture. B. all test items were measuring the exact same thing. C. some test items were measuring something different than other test items. D. some test items were structured in a dichotomous format and others were structured in a polytomous format.
Cohen - Chapter 05 #160
161. "The probability of endorsing or selecting an item response indicative of higher levels of theta should increase as the underlying level of theta increases." This quote sums up the meaning of the IRT assumption of A. unidimensionality. B. heteroskedasticity. C. local independence. D. monotonicity.
Cohen - Chapter 05 #161
162. The probabilistic relationship between a testtaker's response to a test item and that testtaker's level on the latent construct being measured by the test is expressed in graphic form by A. an item characteristic curve. B. an item response curve. C. an item trace line. D. All of these
Cohen - Chapter 05 #162
163. It's an IRT tool that is useful in helping test users to better understand the range over theta that an item is most useful for. It's called A. an item response curve. B. an information function. C. an item trace line. D. None of these
Cohen - Chapter 05 #163
164. An IRT tool useful in helping test users abbreviate a "long form" of a test to a "short form" is the A. item response curve. B. information function. C. item trace line. D. None of these
Cohen - Chapter 05 #164
165. In an IRT information curve, the term information magnitude may BEST be understood as referring to A. theta. B. the range of the underlying construct. C. precision. D. difficulty.
Cohen - Chapter 05 #165
166. Test items with little discriminative ability prompt the test developer to consider the possibility that A. the content of the item does not match the construct measured by the other items in the scale. B. the item is poorly worded and needs to be rewritten. C. the item is too complex for the educational level of the population. D. All of these
Cohen - Chapter 05 #166
167. According to your textbook, a test of depression that contains an abundance of items that probe the respondent's outward expression of emotion may be inappropriate for use with A. test-takers who have made suicidal gestures. B. inpatients who have been committed involuntarily. C. veterans diagnosed with PTSD. D. Ethiopians.
Cohen - Chapter 05 #167
168. The fact that cultural factors may be operating to weaken an item's ability to discriminate between groups is evident from: A. Lord's treatise entitled Item Response Theory. B. an item characteristic curve. C. an information function. D. Georg Rasch's unauthorized biography, You Can Never Be Too Rich or Too "Rasch."
Cohen - Chapter 05 #168
169. A difference between the use of coefficient alpha and IRT for evaluating a test's reliability is that with IRT, it is possible to learn A. how the precision of a scale varies depending on the level of the construct being measured. B. how the level of the construct being measured varies depending on variations in the item characteristic curve. C. the precise numerical value for the test's total interitem consistency. D. All of these
Cohen - Chapter 05 #169