Exploring Item Characteristics That Are Related to the Difficulty of TOEFL Dialogue Items

Transcript

Research Reports
RR-79
July 2004
Exploring Item Characteristics That Are Related to the Difficulty of TOEFL Dialogue Items
Irene Kostin

Exploring Item Characteristics That Are Related to the Difficulty of TOEFL Dialogue Items
Irene Kostin
ETS, Princeton, NJ
RR-04-11

ETS is an Equal Opportunity/Affirmative Action Employer.

Copyright © 2004 by ETS. All rights reserved.

No part of this report may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopy, recording, or any information storage and retrieval system, without permission in writing from the publisher. Violators will be prosecuted in accordance with both U.S. and international copyright laws.

EDUCATIONAL TESTING SERVICE, ETS, the ETS logos, Graduate Record Examinations, GRE, TOEFL, and the TOEFL logo are registered trademarks of Educational Testing Service. The Test of English as a Foreign Language is a trademark of Educational Testing Service.

College Board is a registered trademark of the College Entrance Examination Board.

Abstract

The purpose of this study is to explore the relationship between a set of item characteristics and the difficulty of TOEFL® dialogue items. Identifying characteristics that are related to item difficulty has the potential to improve the efficiency of the item-writing process. The study employed 365 TOEFL dialogue items, which were coded on 49 variables, including 5 significant variables reported in Nissan, DeVincenzi, and Tang (1996). Of the 5 significant variables in Nissan et al., 3 correlated significantly with item difficulty in this study. Another 11 met a critical probability criterion. These 11 included representatives from three broad categories of variables: 2 in the category of word-level factors, 1 in the category of discourse-level factors, and 8 in the category of task-processing factors. Multiple regression analyses indicate that the variables in this study account for about 40% of the variance in item difficulty.

Key words: English language learning, English as a second language (ESL), item difficulty, listening comprehension, test items, Test of English as a Foreign Language™ (TOEFL®), Test of English for International Communication™ (TOEIC®)

The Test of English as a Foreign Language™ (TOEFL®) was developed in 1963 by the National Council on the Testing of English as a Foreign Language. The Council was formed through the cooperative effort of more than 30 public and private organizations concerned with testing the English proficiency of nonnative speakers of the language applying for admission to institutions in the United States. In 1965, Educational Testing Service® (ETS®) and the College Board® assumed joint responsibility for the program. In 1973, a cooperative arrangement for the operation of the program was entered into by ETS, the College Board, and the Graduate Record Examinations (GRE®) Board. The membership of the College Board is composed of schools, colleges, school systems, and educational associations; GRE Board members are associated with graduate education.

ETS administers the TOEFL program under the general direction of a policy board that was established by, and is affiliated with, the sponsoring organizations. Members of the TOEFL Board (previously the Policy Council) represent the College Board, the GRE Board, and such institutions and agencies as graduate schools of business, junior and community colleges, nonprofit educational exchange agencies, and agencies of the United States government.

A continuing program of research related to the TOEFL test is carried out in consultation with the TOEFL Committee of Examiners. Its 12 members include representatives of the TOEFL Board and distinguished English as a second language specialists from the academic community. The Committee advises the TOEFL program about research needs and, through the research subcommittee, reviews and approves proposals for funding and reports for publication. Members of the Committee of Examiners serve four-year terms at the invitation of the Board; the chair of the committee serves on the Board.

Because the studies are specific to the TOEFL test and the testing program, most of the actual research is conducted by ETS staff rather than by outside researchers. Many projects require the cooperation of other institutions, however, particularly those with programs in the teaching of English as a foreign or second language and applied linguistics. Representatives of such programs who are interested in participating in or conducting TOEFL-related research are invited to contact the TOEFL program office. All TOEFL research projects must undergo appropriate ETS review to ascertain that data confidentiality will be protected.

Current (2003-2004) members of the TOEFL Committee of Examiners are:

Micheline Chalhoub-Deville, University of Iowa
Lyle Bachman, University of California, Los Angeles
Deena Boraie, The American University in Cairo
Catherine Elder, Monash University
Glenn Fulcher, University of Dundee
William Grabe, Northern Arizona University
Keiko Koda, Carnegie Mellon University
Richard Luecht, University of North Carolina at Greensboro
Tim McNamara, The University of Melbourne
James E. Purpura, Teachers College, Columbia University
Terry Santos, Humboldt State University
Richard Young, University of Wisconsin-Madison

To obtain more information about the TOEFL programs and services, use one of the following:
E-mail: toefl@ets.org
Web site: www.ets.org/toefl

Acknowledgments

I would like to thank Susan Nissan for providing valuable background information concerning TOEFL dialogue items and also for providing details concerning how the significant variables in her study of TOEFL dialogue items were coded. I would also like to thank Marc Tolo for doing the coding that was essential for determining the intercoder reliability for several of the variables in this study. Additionally, I would like to thank Fred Cline for carrying out complex statistical analyses for this study. Finally, I would also like to thank the reviewers of this report (Isaac Bejar, Neil Dorans, Dan Eignor, Catherine Elder, and Susan Nissan) for their helpful and informative comments and suggestions.

Table of Contents

Introduction
Literature Review
    Word-level Factors
    Sentence-level Factors
    Discourse-level Factors
    Task-processing Factors
Method
    Data
    Variables Assessing Item Characteristics
    The Coding
Results and Discussion
Conclusions and Implications
Future Studies
References
Appendixes
    A - Coding Instructions
    B - Instructions for Coding Lexical Overlap

List of Tables

Table 1. Intercoder Reliability Based on 60 TOEFL Dialogue Items From Two TOEFL Forms
Table 2. Correlation of Variables With Item Difficulty (Equated Delta)
Table 3. Results of Stepwise Multiple Regression, With Only Significant Variables Remaining in the Equation


Introduction

The purpose of this study is to explore the relationship between a set of item characteristics and the difficulty of TOEFL® dialogue items, an item type currently included in the Listening Comprehension Section of TOEFL. As part of this purpose, the study attempts to replicate the significant findings reported by Nissan, DeVincenzi, and Tang (1996). The study also investigates additional variables that were not included in the Nissan et al. study.

The ability to predict the difficulty of TOEFL dialogue items could improve the efficiency of the item-writing process. Statistical specifications for TOEFL dialogue items as well as for other item types call for items with a relatively wide range of difficulties. When assembling a test, occasions arise where there are shortages of items at certain difficulty levels. For example, Nissan et al. (1996) reported an occasion where there was a shortage of difficult TOEFL dialogue items in the item pool such that, if the pool were not replenished, specifications of future tests would not be met. More recently, there has been a shortage of easier TOEFL dialogue items (Marc Tolo, personal communication, 2002). A knowledge of the characteristics that are associated with harder or easier items could help item writers produce items of the desired level of difficulty.

Literature Review

The literature reviewed below will include studies not only in the area of listening comprehension but also in the area of reading comprehension. The inclusion of reading comprehension studies is based on findings in the literature of similarities between reading and listening. For example, Kintsch, Kozminsky, Streby, McKoon, and Keenan (1975) presented college students with paragraphs for reading and listening that were matched for number of propositions. The time allowed for reading was limited to that needed to present the paragraphs orally. The researchers found that the level of recall, measured by the number of propositions correctly recalled, was virtually identical for both methods of presentation. Kintsch et al. also reported that while paragraph length and number of different arguments contained in the paragraphs affected recall accuracy, these effects did not differ for reading versus listening. They concluded that the processes underlying reading and listening are probably similar. Studies by Kintsch and Kozminsky (1977) and Smiley, Oakley, Campione, and Brown (1977) also support

this conclusion. Other studies have reported high intercorrelations between reading and listening tests (see review by Sticht & James, 1984, pp. 293-317).

The Nissan et al. (1996) variables that this study is attempting to replicate are discussed in the appropriate sections below. These variables are: the presence of infrequent oral vocabulary discussed in the section on word-level factors, the presence of negatives in the dialogue discussed in the section on sentence-level factors, the sentence pattern of the utterances in the dialogue and the roles of the speakers in the dialogue discussed in the section on discourse-level factors, and the necessity of making an inference to answer the items discussed in the section on task-processing factors. In their study, Nissan et al. used equated delta as the measure of item difficulty; higher values on this measure are associated with more difficult items and lower values are associated with easier items. Also, several of the factors listed below are discussed in TOEFL 2000 Listening Framework, by Bejar, Douglas, Jamieson, Nissan, and Turner (2000).

Word-level Factors

Past research has shown that the meaning of an unfamiliar word can often be inferred from the linguistic context in which it is embedded (Miller, 1999). However, the sparse linguistic context in TOEFL dialogues (ranging from 8 to 53 words in the current study) probably makes it difficult to infer the meaning of an unknown word from context, so one might expect that vocabulary knowledge will have a significant effect on the difficulty of TOEFL dialogue items. Employing TOEFL dialogue items in their study, Nissan et al. (1996) reported findings supporting this hypothesis. Their measure of vocabulary knowledge was the presence of an infrequent vocabulary word in the dialogue. A dialogue was coded as having an infrequent word if it contained a word that was not on a word list of 100,000 common words (Berger, 1977), a list based entirely on conversations in the United States, primarily between adults and some between university students. Nissan et al. found that the presence of an infrequent word in the dialogue was positively associated with item difficulty.

The findings of a study by Kelly (1991) demonstrate the importance of vocabulary knowledge to listening comprehension in situations where the linguistic context is somewhat greater than in the case of TOEFL dialogues. Advanced English language learners in France both transcribed and translated English passages (ranging from 82 to 121 words) that they listened to. Kelly categorized their errors as perceptual, lexical, or syntactical; he also rated the errors in regard to whether they resulted in minimal

comprehension failure or severe comprehension failure. Kelly reported that lexical errors, typically in response to unfamiliar vocabulary, accounted for most of the errors where comprehension was severely impaired.

Phonological variables also may affect item difficulty. Henrichson (1984), for example, reported that the difference in listening comprehension between native speakers of English and nonnative speakers was greater when they listened to spoken English employing sandhi-variation than when they listened to spoken English without sandhi-variation. This finding supports the hypothesis that sandhi-variation makes comprehension of spoken language more difficult for nonnative speakers of English. Sandhi-variation refers to “the phonological modification of grammatical forms which have been juxtaposed” (Crystal, 1980, p. 311). Examples of sandhi-variation are “gonna” for “going to,” “wanna” for “want to,” and “hasta” for “has to.”

Sentence-level Factors

Several researchers have hypothesized that syntactic complexity affects listening comprehension such that the more complex the syntax is in a text, the more difficult it is to comprehend (Anderson & Lynch, 1988; Rost, 1990). A few findings support this hypothesis. Nissan et al. (1996) reported that the presence of more than a single negative in TOEFL dialogues was positively associated with item difficulty. In a related finding, Freedle and Kostin (1999) reported that the number of negatives present in TOEFL mini-talk passages was positively related to item difficulty. Using the number of dependent clauses in a dialogue as a measure of syntactic complexity, Buck and Kostin (1999a), in a pilot study, found that this measure was positively related to the difficulty of dialogue items in the Test of English for International Communication™ (TOEIC®).

In the area of reading, Abrahamsen and Shelton (1989) demonstrated improved comprehension of texts that were modified, in part, so that full noun phrases were substituted in place of referential expressions such as pronouns. This improvement in comprehension is hypothesized to have occurred because, in the modified condition, the test takers no longer had to figure out what the referentials were referring to. Consistent with this finding, Buck and Kostin (1999a) found that the presence of within-text referentials in TOEIC dialogues was positively related to item difficulty.

Discourse-level Factors

In the area of reading comprehension, several studies have shown that familiarity with the topic of a text facilitates text comprehension (McNamara, Kintsch, Songer, & Kintsch, 1996; Recht & Leslie, 1988; Spilich, Vesonder, Chiesi, & Voss, 1979). Using data from the TOEFL reading section, Hale (1988) reported results consistent with these findings: While the size of the effect was small, Hale found that students in two major field groups, the humanities/social sciences and the biological/physical sciences, performed better on passages related to their own groups than on other passages. Employing an immediate retrospective verbal report procedure, Yi’an (1998) investigated the comprehension processes involved when Chinese test takers, who were studying English as a foreign language, responded to multiple-choice questions about a recorded English language radio interview they had listened to; the protocols from this study showed that these test takers frequently used their background knowledge about the topic of the interview when responding to the multiple-choice questions.

Some findings regarding TOEFL listening items can be interpreted as illustrating the effect of background knowledge on comprehension. Nissan et al. (1996) reported that when the language of one of the speakers in a TOEFL dialogue was linked to a specific role the speaker played and the role was not one of a casual acquaintance or classmate, the items associated with such dialogues were significantly more difficult than items without this feature. (A more detailed description of this variable can be found in Appendix A, p. 39 of this report.) The authors hypothesized that such items may be more difficult because the test takers may be unfamiliar with the specific roles enacted in these dialogues. Freedle and Kostin (1999) reported that items associated with TOEFL mini-talks that dealt with academic subject matter such as science or the humanities were more difficult than items associated with mini-talk passages that had nonacademic subject matter. It is possible that differential familiarity with these different topics played a role here, too, in accounting for the relationship to item difficulty.

Nissan et al. (1996) reported an additional finding regarding the relationship between discourse characteristics of the text and item difficulty. They found that the utterance pattern in TOEFL dialogues was significantly related to item difficulty: For TOEFL dialogues composed of two utterances, they found that items associated with dialogues having a statement in the second utterance were significantly more difficult than items associated with dialogues having a question in the second utterance.
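As a rough illustration of the utterance-pattern distinction, the sketch below labels each of the two utterances in a dialogue script as a question or a statement and joins the labels into a pattern. It assumes, purely for illustration, that an utterance counts as a question when its script ends with a question mark; the function names are hypothetical, and the coding rules actually used by Nissan et al. may differ, especially for utterances that mix statements and questions.

```python
def utterance_type(utterance: str) -> str:
    """Label an utterance by its final punctuation (an illustrative assumption)."""
    return "question" if utterance.strip().endswith("?") else "statement"


def utterance_pattern(first: str, second: str) -> str:
    """Join the labels of the first and second utterance, e.g. 'question-statement'."""
    return f"{utterance_type(first)}-{utterance_type(second)}"


# Sample two-utterance dialogue taken from the Method section of this report.
pattern = utterance_pattern(
    "Shall I return this almanac to the reference desk?",
    "I want to check a few dates first.",
)
print(pattern)
```

A punctuation-based rule of this kind looks only at how an utterance ends, so a second utterance made up of a statement followed by a question would be labeled a question; a human coder might reasonably decide otherwise.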

Several researchers have studied the effects on listening comprehension of different kinds of redundancy in a text. For second language listeners at lower and intermediate levels of ability, redundancy in a text in the form of repeated nouns seems to be more effective in facilitating listening comprehension than other restatement devices, such as use of synonyms (Chaudron, 1995). On the other hand, in a study by Chiang and Dunkel (1992), elaboration of information, repeating segments of the text, or paraphrasing information only facilitated the comprehension of second language test takers with high listening proficiency. According to Chiang and Dunkel, the lack of adequate vocabulary prevented the lower-level test takers from taking advantage of the kinds of redundant information used in their study.

Task-processing Factors

Task-processing factors typically involve an interaction between features of the text and features of the item. One task-processing factor that has been found to influence listening item difficulty is whether or not an item requires the examinee to make an inference beyond what is explicitly stated in the text. Nissan et al. (1996) reported that TOEFL dialogue items that required an inference (i.e., items that tested implicit information) were significantly more difficult than items that tested comprehension of explicit information.

Lexical overlap between words in the text and words in an item’s options has been found to affect listening item difficulty. Freedle and Fellbaum (1987) found that the greater the amount of lexical overlap between words in the correct option and words in a single stimulus sentence (an item type in the TOEFL Listening Section prior to 1995), the easier the item. In their pilot study of TOEIC dialogue items, Buck and Kostin (1999a) similarly found that easier items were characterized by a greater amount of lexical overlap between words in the dialogue and words in the correct option. They further found that if there was a greater degree of lexical overlap between words in the dialogue and words in the incorrect options as compared to the correct option, the item tended to be more difficult.

Studies in the field of reading comprehension have found that information from the most recent clause in a sentence is more accessible than information from an earlier clause (Gernsbacher, 1990). One possible implication in regard to listening stimuli such as dialogues is that the last clause of a dialogue is the one best retained in memory. Consistent with this, Buck

and Kostin (1999a) reported that when the information directly relevant to responding correctly to an item came at the end of a TOEIC dialogue, frequently coinciding with the last clause, the item tended to be easy. Furthermore, if there was lexical overlap between a word in the correct option and a word that came at the end of a TOEIC dialogue, the item also tended to be easy.

Method

Data

The total sample consisted of 365 TOEFL dialogue items with 1 item per dialogue. Of this total, 240 items came from eight disclosed post-1995 paper-and-pencil TOEFL forms with 30 items per form. The remaining 125 items were selected from 28 disclosed pre-1995 paper-and-pencil TOEFL forms. As there has been an increased emphasis on limiting the content of the dialogues to campus-related matters, these 125 additional items were selected because they included campus-related content.

For the dialogue items employed in this study, the test taker hears a short conversation between two people, each having one turn to speak, which lasts between 5 and 20 seconds. Then a narrator asks a question about what was said. The test taker has 12 seconds to read four possible responses (options) in the test book, select the correct answer to the question, and mark it on the answer sheet. The sections below and the coding manual in Appendix A include several examples of these dialogue items.

In this section and in the sections that follow, the correct option will be referred to as the key, and the incorrect options will be referred to as the distracters.

Variables Assessing Item Characteristics

Below is a summary of the variables assessing item characteristics that were included in this study. Detailed descriptions of how these variables were coded are found in the coding manual in Appendix A. The variables include the five significant variables reported by Nissan et al. (1996) as well as other variables identified in the literature review above or by examination by the author of a sample of hard and easy dialogue items.

Several of these variables were coded separately for the first speaker and for the second speaker, as well as for the total dialogue. The reason for the separate coding of the first and the second speaker is that, in 93% of the TOEFL dialogues in this study, the narrator’s question only

refers to what the second speaker has said. Because of this, it is hypothesized that test takers will focus more on what the second speaker has said than on what the first speaker has said; as a consequence, characteristics of the second speaker’s utterance may be more closely related to item difficulty than are characteristics of the first speaker’s utterance. It should be emphasized that although the narrator’s question usually focuses on what the second speaker has said, in most cases the test taker must also comprehend what the first speaker has said in order to respond correctly to the item.

Word-level variables. Several measures of vocabulary knowledge were employed. First, the measure of vocabulary knowledge included in Nissan et al. (1996), discussed above, was coded. Their measure of difficult vocabulary was the presence of an infrequent vocabulary word in the dialogue; that is, a dialogue was coded as having an infrequent word if it contained a word that was not on a list of 100,000 common words compiled by Berger (1977). Examination of the items coded for infrequent vocabulary, using the method in Nissan et al. (1996), revealed two types of items:

1. For one type of item, knowledge of the meaning of the infrequent word was relevant to responding correctly to the item: In the example below, knowledge that the infrequent word “almanac” refers to a kind of book is relevant to identifying the key.

(man) Shall I return this almanac to the reference desk?
(woman) I want to check a few dates first.
(narrator) What does the woman mean?
(A) She needs to check her calendar.
(B) She hasn’t finished with the book.*
(C) The reference material is out-of-date.
(D) She has already returned the almanac.

2. For a second type of item, knowledge of the meaning of the infrequent word does not appear to be relevant to responding correctly to the item, as in the example below where knowledge of the meaning of the infrequent word “antique” does not appear to be needed to respond correctly:

(woman) There’s a great antique show at the Grant Auditorium. Let’s go see it this evening.
(man) I’ve worked really hard all day long. Won’t it be there for a while?
(narrator) What does the man imply?
(A) He has to work late tonight.
(B) He’d rather go at another time.*
(C) He’s already seen the show.
(D) It’ll be hard to get to the auditorium on time.

Based on the above distinction, a variant of the Nissan et al. (1996) measure of vocabulary knowledge was also included in the study; for this variant, only those items were coded where knowledge of the meaning of the infrequent vocabulary word was relevant to responding correctly to the item.

The average word length of the words in the dialogue was also used as a measure of vocabulary knowledge; there is evidence that longer words are generally more difficult than shorter words (e.g., Carver, 1976). Average word length was obtained separately for the first speaker’s utterance and for the second speaker’s utterance, as well as for the total dialogue.

Items were also coded as to whether or not comprehension of an idiom in the dialogue was relevant to responding correctly to the item. The American Heritage Dictionary (2000) defines the word “idiom” as “an expression consisting of two or more words having a meaning that cannot be deduced from the meanings of its constituent parts” (p. xxxvi). Comprehending idioms can be difficult because even high-frequency words in the context of an idiom can mean something quite different from what they commonly mean and thus have a meaning that nonnative test takers are unfamiliar with. Simply coding for infrequent words will not pick up this kind of difficulty.
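The two word-level measures that can be computed directly from a script, the infrequent-word flag and average word length, might be sketched as follows. This is a minimal sketch under stated assumptions: the tokenizer, the decision to ignore apostrophes when counting letters, and the function names are illustrative choices, and `TOY_COMMON_WORDS` is a tiny hypothetical stand-in for the Berger (1977) list, which is not reproduced here. The sample text is the almanac dialogue shown earlier in this section.

```python
import re

# Hypothetical stand-in for the Berger (1977) list of common words;
# the real list is far larger and is not reproduced here.
TOY_COMMON_WORDS = frozenset({
    "shall", "i", "return", "this", "to", "the", "reference", "desk",
    "want", "check", "a", "few", "dates", "first",
})


def tokenize(text):
    """Lowercase word tokens; internal apostrophes stay inside the word."""
    return re.findall(r"[a-z]+(?:'[a-z]+)*", text.lower())


def infrequent_words(text, common_words):
    """Words in the text that do not appear on the common-word list."""
    return [w for w in tokenize(text) if w not in common_words]


def average_word_length(text):
    """Mean number of letters per word (apostrophes are not counted)."""
    lengths = [len(w.replace("'", "")) for w in tokenize(text)]
    return sum(lengths) / len(lengths) if lengths else 0.0


first = "Shall I return this almanac to the reference desk?"
second = "I want to check a few dates first."
dialogue = first + " " + second

flagged = infrequent_words(dialogue, TOY_COMMON_WORDS)
avg_first = average_word_length(first)      # first speaker's utterance
avg_second = average_word_length(second)    # second speaker's utterance
avg_total = average_word_length(dialogue)   # total dialogue
```

Computing the average separately for `first`, `second`, and `dialogue` mirrors the separate coding of each speaker and of the total dialogue. Whether a flagged word is relevant to answering the item remains a human judgment that a sketch like this cannot make.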
An example of a dialogue coded for the idiom variable is given below; in this example, the idiomatic expression “she’s got it made,” which is relevant to responding correctly to the item, includes no infrequent words, but the meaning cannot be inferred from the meaning of the individual words.

(man) If you could, would you trade places with your sister?
(woman) Yeah, she’s got it made.

(narrator) What does the woman mean?
(A) The sisters share a lot of things.
(B) She and her sister will switch seats.
(C) Things are going well for her sister.*
(D) Her sister finished her cooking.

Another word-level code concerned whether there were instructions to include sandhi-variation in the dialogue. An example of an item that includes such instructions is given below:

(woman) You know [Y’know], some TV channels have been rerunning a lot of [lotta] comedies from the sixties. What do you think of [thinka] those old shows?
(man) Not much. But then, the new ones aren’t so great either.
(narrator) What does the man mean?
(A) He no longer watches much television.
(B) He prefers the comedies from the sixties.
(C) Television comedies haven’t improved since the sixties.*
(D) He hasn’t seen many of the old shows.

A reviewer of this report, who is familiar with the creation of TOEFL dialogue items, made the point that “often the speakers [in the dialogue] elide in the delivery, and this would not necessarily be indicated in the script” (Susan Nissan, personal communication, June 5, 2003). However, one would have to listen to the recording of the dialogue in order to code for sandhi-variation that was not indicated in the script. Although coding for sandhi-variation based on the recording of the dialogue is clearly the superior method for assessing this variable, this was not possible here, as will be explained below.

In addition to sandhi-variation, several other phonological variables unique to listening might also contribute to the difficulty of TOEFL dialogue items, such as speech rate, false start, and repetition rate (see Buck & Kostin, 1999b, for a discussion of phonological variables). However, measurement of variables such as these was not possible in the current study for the following reasons: (a) The recording of each item is embedded in a longer recording of the test in

which the item occurs, (b) to collect the recordings of each dialogue and create a master tape would require accessing excerpts from a great number of original recordings, and (c) analyzing such a tape would require expertise, processes, and equipment that were not available for the current study.

A further word-level variable included in the study was whether or not the key contained an infrequent word. Since the key is presented to test takers in printed form, this variable also taps reading comprehension skill; insofar as the construct being assessed by the dialogue items is the ability to comprehend spoken rather than written text, this variable could be considered, in part, to be a measure of one kind of construct-irrelevant variance.

Sentence-level variables. Based on Nissan et al.’s (1996) finding, dialogues were coded with regard to whether or not they contained more than one negative; utterances of the first and second speaker were also separately coded for this variable. Other measures of grammatical complexity that were coded separately for the first and second speaker as well as for the total dialogue were: (a) the number of dependent clauses and (b) the number of words in the longest T-unit, the T-unit being defined as an independent clause with any attached dependent clauses (Hatch & Lazaraton, 1994). The dialogues were also coded for the number of each of four different types of referentials.

Another sentence-level variable coded was whether the key was in the form of a suggestion or a directive. Since most of the test takers probably learned English in a classroom setting, where the instructor probably included frequent suggestions and/or directives in the course of lecturing, it is likely that test takers are very familiar with these grammatical forms, which might tend to make items using such forms easier.

Discourse-level variables. The dialogues were coded for the four different utterance patterns identified by Nissan et al. (1996): question-question, statement-question, statement-statement, and question-statement. Also, based on Nissan et al., dialogues were coded as to whether or not the language of one of the speakers in the dialogue was linked to a specific role the speaker played and the role was not one of a casual acquaintance or classmate.

Several additional codes concerned the kind of content in the dialogue. For example, a dialogue was coded as having content dealing with the academic part of campus life if it dealt with the following type of topics: registering for classes; students’ attitudes toward their course work; references to materials used for class, such as textbooks and calculators; studying;

20 interactions with professors involving course work; class attendance; academic requirements; exams; course assignments; classroom experience; and similar content.

Task-processing variables. Following Nissan et al. (1996), items were coded with regard to whether or not the item required the test taker to make an inference beyond what was explicitly stated in the dialogue.

Several variables assessing lexical overlap between words in the options and words in the dialogue were included. Some assessed the amount of lexical overlap between the words in the key and the words in the dialogue. Other variables in this category compared the amount of lexical overlap in the distracters with the amount of lexical overlap in the key; the expectation is that distracters that have a greater degree of lexical overlap than the key has would be very attractive and would tend to make an item more difficult.

Additional task-processing variables assessed the location of the lexical overlap, such as, for example, whether or not the lexical overlap involved words in the last clause of the dialogue. As noted above, research has shown that information from the most recent clause in a sentence is more accessible than information from an earlier clause. The expectation is that the relationship between lexical overlap and item difficulty would be stronger if the overlap involved words in the last clause of the dialogue than if it involved words coming earlier in the dialogue.

A further task-processing variable concerned whether there were two pieces of information in the dialogue that functioned as substitutes for each other such that each of these components, in isolation, could yield the correct response. This can be thought of as a form of redundant information in the dialogue. For example, in the following item, the second speaker’s utterance contains the following two components: “Oh, it’s not a problem anymore” and “I’ve found an ointment that works just fine.” Each of these two components, in isolation, could yield the correct response.

(woman) Have you seen the doctor about your skin condition yet?
(man) Oh, it’s not a problem anymore. I’ve found an ointment that works just fine.
(narrator) What does the man imply?
(A) The doctor was too busy to see him.
(B) He doesn’t need to see the doctor.*

21 (C) The woman should use the ointment.
(D) His skin condition has gotten worse.

Items were also coded as to whether or not test takers could respond correctly to an item solely on the basis of the second speaker’s utterance. Items associated with most TOEFL dialogues require the test taker to integrate information from the utterances of the two speakers in order to respond correctly to the item. In contrast, items coded for this variable do not require such integration; comprehension of only the second speaker’s utterance suffices to respond correctly. Insofar as most TOEFL dialogue items assess, in part, the ability to integrate information from the utterances of the two speakers, items coded for this variable can be seen as falling short in this regard. The following is an example of an item coded for this variable, where it appears possible to respond correctly to the item if one only comprehends the utterance of the second speaker.

(man) What have you heard about Professor Smith? I’m thinking of taking an advanced engineering course with him.
(woman) You really should. One of his articles just won some sort of award and I heard he’s always publishing something in the journals.
(narrator) What does the woman say about the professor?
(A) His classes are very difficult.
(B) His work is well respected.*
(C) He will publish a book soon.
(D) He is no longer teaching.

An additional code concerned whether there was an apparent inconsistency between an utterance in the dialogue and the item’s key. In the dialogue below, for example, there is an apparent inconsistency between the woman’s utterance “Then you did get my message” and the key, “Her message did not reach the man.” In items such as the following example, comprehension of the narrator’s question appears to be essential for responding correctly to the item.

(man) Thanks for letting us know you’d be late for the appointment.
(woman) Oh, good. Then you did get my message.

22 (narrator) What had the woman assumed?
(A) The man had given her the message.
(B) The man was late as well.
(C) She had plenty of time to make the appointment.
(D) Her message did not reach the man.*

In addition, this code applies to dialogues using sarcasm where there is also an apparent inconsistency between an utterance in the dialogue and the item’s key, as in the example below, where there is an apparent inconsistency between the utterance “... another one of Mike’s brilliant ideas” and the key, “He [Mike] often makes foolish suggestions.”

(man) Can you believe it? Now we’re supposed to bring a note from our instructor every single time we want to use the computer!
(woman) [sarcastically] I’ll bet that was another one of Mike’s brilliant ideas!
(narrator) What does the woman imply about Mike?
(A) He often makes foolish suggestions.*
(B) His instructor won’t give him a note.
(C) He should try using the computer himself.
(D) He is a very good instructor.

The Coding

The data analysis is based on the coding of one researcher. A second coder, an ETS staff member who writes and reviews TOEFL dialogues and dialogue items, was recruited to establish intercoder reliability for (a) those variables requiring subjective judgment and (b) the significant variables reported in the Nissan et al. (1996) study of TOEFL dialogue items. Sixty dialogue items from two TOEFL forms were used for this purpose. For variables that simply code for the presence or absence of a characteristic, the statistic used here to assess intercoder reliability is percent agreement, with an agreement of 90% or more as the desired outcome. Table 1 lists those variables that are simply coded for the presence or absence of a characteristic and the associated percent agreement between the two coders.
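As an illustration of the percent agreement statistic described above, the following minimal Python sketch compares two coders' dichotomous (1 or 0) codes. The function name and the example codes are hypothetical and are not taken from the study's data.

```python
def percent_agreement(coder_a, coder_b):
    """Percentage of items on which two coders assigned the same 0/1 code."""
    if len(coder_a) != len(coder_b) or not coder_a:
        raise ValueError("both coders must rate the same non-empty set of items")
    matches = sum(a == b for a, b in zip(coder_a, coder_b))
    return 100.0 * matches / len(coder_a)


# Hypothetical codes for 60 items; the two coders disagree on 3 of them.
coder_1 = [1] * 20 + [0] * 40
coder_2 = [1] * 17 + [0] * 43

agreement = percent_agreement(coder_1, coder_2)
meets_criterion = agreement >= 90.0  # the 90% desired outcome named in the text
print(agreement, meets_criterion)
```

With 57 matching codes out of 60, this hypothetical variable would meet the 90% criterion.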

23 Table 1
Intercoder Reliability Based on 60 TOEFL Dialogue Items From Two TOEFL Forms

Percent agreement  Variable name
              95%  V01: Infrequent word in dialogue
              92%  V02: Knowledge of infrequent word in dialogue is relevant to responding correctly.
              85%  V07: Comprehension of idiom in dialogue is relevant to responding correctly.
              97%  V11: Two or more negatives in total dialogue
             100%  V23: Utterance pattern: question-question
              95%  V24: Utterance pattern: statement-question
              98%  V25: Utterance pattern: statement-statement
              98%  V26: Utterance pattern: question-statement
             100%  V27: Speaker has specific role.
              93%  V28: Content of dialogue deals with academic campus life.
              88%  V29: Content of dialogue deals with nonacademic campus life.
              93%  V30: Content of dialogue is related to both campus and a few other domains.
              87%  V31: Campus-related terms are present in dialogue but are incidental to main focus.
              90%  V32: Content of dialogue is related to noncampus domain.
              92%  V45: An inference is required to respond correctly.
              90%  V46: More than one element in utterance of second speaker yields key.
              92%  V47: Only comprehension of utterance of second speaker is needed to respond correctly.
              98%  V49: Key seems inconsistent with content of dialogue.

Using the criterion of percent agreement, the intercoder reliability reaches or exceeds 90% agreement for 15 of the 18 variables in Table 1, and the percent agreement for the remaining variables is close to 90%. Intercoder reliability was also obtained for one of the variables in the study that assessed lexical overlap, namely, for variable V34 (number of words

24 in key that overlap with words in dialogue); unlike the variables included in Table 1, which were all coded dichotomously (i.e., either 1 or 0), this variable was coded on a continuum, allowing intercoder reliability to be assessed by the Pearson correlation coefficient. (The criteria for judging whether there is lexical overlap between words in the options and words in the dialogue are the same for all variables assessing lexical overlap.) Coding items in the same two forms that were used for coding the variables in Table 1, the correlation between the coding of the first coder and the coding of the second coder for V34 was r = .80, p = .000, indicating an acceptable level of intercoder reliability for this variable.

Dependent variable. The dependent variable in this study is equated delta, a measure of item difficulty (Petersen, Marco, & Stewart, 1982). Higher values are associated with more difficult items and lower values are associated with easier items.

Results and Discussion

Table 2 reports the Pearson correlation coefficients between equated delta and the 49 variables in this study for the data set of 365 TOEFL dialogue items. (Note that all the statistical analyses in this report were carried out using SPSS [Statistical Package for the Social Sciences] software.) In an effort to control for Type I error, the Bonferroni procedure was used to determine the critical probability. Dividing .05 by the number of tests of significance, the critical probability becomes .001. The 11 variables with correlations at this latter level of significance will be discussed below.

The first variable in Table 2 whose p value is equal to or less than the critical probability is V02 (knowledge of infrequent word in dialogue is relevant to responding correctly); the correlation indicates that items coded for V02 tend to be more difficult. This variable is a variant of the vocabulary measure used in Nissan et al. (1996), the latter simply coding for the presence of an infrequent word in the dialogue. In contrast to Nissan et al., who reported a significant relationship between this latter vocabulary measure and item difficulty, the corresponding correlation in the current study, where this vocabulary measure is referred to as V01: Infrequent word in dialogue, is not significant. The findings of the current study suggest that it is not the mere presence of a low-frequency word in the dialogue that is associated with item difficulty; rather, the critical factor seems to be whether or not knowledge of the meaning of the infrequent word is relevant to responding correctly to the item. One possible explanation for

25 the discrepancy between the result in Nissan et al. and the current result is that the Nissan et al. study included more items that required understanding infrequent words than were included in the current study.

Table 2
Correlation of Variables With Item Difficulty (Equated Delta)

     r   p^a      Variable name   (r = correlation with equated delta)
Word-level variables
  .059   .130^n   V01: Infrequent word in dialogue (N^b = 132)
  .200   .000     V02: Knowledge of infrequent word in dialogue is relevant to responding correctly. (N = 52)
  .084   .109     V03: Average word length in utterance of first speaker
  .006   .904     V04: Average word length in utterance of second speaker
  .077   .141     V05: Average word length in total dialogue
  .124   .017     V06: Instructions to include sandhi-variation in dialogue (N = 4)
  .245   .000     V07: Comprehension of idiom in dialogue is relevant to responding correctly. (N = 47)
  .139   .008     V08: Infrequent word in key (N = 9)
Sentence-level variables
  .035   .251^n   V09: Two or more negatives in utterance of first speaker (N = 3)
  .125   .008^n   V10: Two or more negatives in utterance of second speaker (N = 7)
  .114   .014^n   V11: Two or more negatives in total dialogue (N = 31)
  .064   .225     V12: Number of dependent clauses in utterance of first speaker
(Table continues)

26 Table 2 (continued)

     r   p^a      Variable name   (r = correlation with equated delta)
  .129   .014     V13: Number of dependent clauses in utterance of second speaker
  .124   .018     V14: Number of dependent clauses in total dialogue
  .012   .818     V15: Number of words in longest T-unit of first speaker
  .085   .104     V16: Number of words in longest T-unit of second speaker
  .049   .347     V17: Number of words in longest T-unit of total dialogue
  .122   .020     V18: Number of within-clause referentials in dialogue
  .021   .693     V19: Number of between-clause referentials within a turn in dialogue
  .096   .066     V20: Number of referentials in utterance of one speaker that refer to word in utterance of other speaker
 -.055   .292     V21: Number of special referentials in dialogue
  .038   .468     V22: Number of words in key
Discourse-level variables
 -.147   .002^n   V23: Utterance pattern: question-question (N = 11)
 -.080   .064^n   V24: Utterance pattern: statement-question (N = 41)
  .104   .024^n   V25: Utterance pattern: statement-statement (N = 172)
  .003   .483^n   V26: Utterance pattern: question-statement (N = 140)
 -.101   n/a^n,c  V27: Speaker has specific role. (N = 20)
(Table continues)

27 Table 2 (continued)

     r   p^a      Variable name   (r = correlation with equated delta)
  .181   .001     V28: Content of dialogue deals with academic campus life. (N = 125)
  .026   .618     V29: Content of dialogue deals with nonacademic campus life. (N = 30)
 -.069            V30: Content of dialogue is related to both campus and a few other domains. (N = 45)
 -.114   .030     V31: Campus-related terms are present but are incidental to main focus of dialogue. (N = 24)
 -.087   .098     V32: Content of dialogue is related to noncampus domain. (N = 141)
 -.018   .732     V33: Total number of words in dialogue
Task-processing variables
Lexical overlap variables
 -.149   .004     V34: Number of words in key that overlap with words in dialogue
 -.180   .001     V35: Percentage of words in key that overlap with words in dialogue
 -.135   .010     V36: Key has more words that overlap with dialogue than do three distracters. (N = 40)
 -.216   .000     V37: No distracter has more words than key overlapping with dialogue. (N = 96)
  .128   .014     V38: The key has no helpful lexical overlap with the dialogue. (N = 102)
  .107   .040     V39: All three distracters have more words than key overlapping with dialogue. (N = 53)
 -.326   .000     V40: The key has the last overlapping word with the dialogue. (N = 73)
(Table continues)

28 Table 2 (continued)

     r   p^a      Variable name   (r = correlation with equated delta)
 -.206   .000     V41: There is overlap between words in the key and words spoken by second speaker. (N = 132)
 -.207   .000     V42: There is overlap between words in the key and words in last clause of dialogue. (N = 88)
 -.084   .111     V43: Key has synonym of (but no overlapping word with) a word in last clause of dialogue. (N = 22)
  .153   .003     V44: Overlapping words of all three distracters come later in dialogue. (N = 55)
Additional task-processing variables
  .158   .001^n   V45: An inference is required to respond correctly. (N = 178)
 -.291   .000     V46: More than one element in utterance of second speaker yields key. (N = 27)
 -.163   .002     V47: Only comprehension of utterance of second speaker is needed to respond correctly. (N = 70)
 -.161   .002     V48: Key is a suggestion or directive. (N = 42)
  .238   .000     V49: Key seems inconsistent with content of dialogue. (N = 7)

^a The p values marked with the superscript n are associated with variables that were significant in the Nissan et al. (1996) study. Because there was a clear prediction regarding the direction of the correlation for these variables, the p values for them are based on a one-tailed test of significance. All other p values in the table are based on two-tailed tests of significance.
^b For variables with dichotomous coding (i.e., coded either 1 or 0), the number of items coded for the presence of the variable is given in parentheses after the variable name.
^c The correlation is not in the predicted direction, in which case a one-tailed test is not appropriate.

A second variable meeting the critical probability criterion is V07: Comprehension of idiom in dialogue is relevant to responding correctly; V07 correlates positively with item difficulty. As noted earlier, comprehending idioms can be difficult because even high-frequency

29 words in the context of an idiom can mean something quite different from what they commonly mean and thus have a meaning that nonnative test takers are unfamiliar with.

The correlation for variable V28 indicates that dialogues dealing with the academic features of campus life are more difficult than dialogues dealing with other subject matter. Some of the more difficult dialogues coded for V28 deal with academic procedures typical of American universities, such as obtaining the required number of credits to graduate, registering for classes, the need for taking basic courses in a subject before taking more advanced courses, and getting a professor’s signature to obtain special permission to take a course. It is possible that dialogues with such content are more difficult because nonnative test takers lack background knowledge about these topics.

The correlations of several variables dealing with lexical overlap meet the critical probability criterion. Variable V35 (the percentage of words in the key that overlap with words in the dialogue) was negatively related to item difficulty, indicating that items with a high percentage of lexical overlap in the key tend to be easier items. Similar findings in regard to percentage of lexical overlap in the key have been reported for TOEFL mini-talks (Freedle & Kostin, 1999) and for TOEFL reading (Freedle & Kostin, 1993).

One might be concerned that a test taker having little or no comprehension of a dialogue could nevertheless perform well on TOEFL dialogue items by simply choosing the option that had the most lexical overlap with the dialogue. Some information relevant to this concern is provided by results regarding V36 (key has more words overlapping with the dialogue than do any of the three distracters); only 40 of the 365 dialogue items in this study, about 11% of the items, were coded for this variable. Thus, using a strategy of selecting the option with the most lexical overlap would certainly fail to yield a good score on this item type. (Further examination of the TOEFL dialogue items indicates that there is no simple strategy involving lexical overlap that would yield successful performance on these items.)

A further finding suggests that item difficulty is also related to lexical overlap between words in the distracters and words in the dialogue. The correlation for variable V37 indicates that items tend to be easier when no distracter has more words that overlap with the dialogue than does the key. This suggests that if distracters had more lexical overlap with the dialogue as compared to the key, the item would be harder. Supporting this conjecture is the correlation for variable V39,

30 significant at the less stringent value of p = .040, which indicates that items tend to be harder when all three distracters have more words overlapping with the dialogue than does the key.

The correlations of some additional variables suggest that item difficulty is also related to the location of the words in the dialogue that overlap with words in the key. In general, the results suggest that the relationship between item difficulty and lexical overlap is strengthened if the lexical overlap involves words coming later in the dialogue. For example, one can consider all instances of lexical overlap between words in the dialogue and words in the options and then identify which of these overlapping words occurs last in the dialogue. The correlation for variable V40 shows that the presence of this “last” overlapping word in the key is negatively related to item difficulty; that is, it is associated with easier items. In a related finding, variable V41, which codes for the presence of lexical overlap between words spoken by the second speaker in the dialogue and words in the key, is also associated with easier items. Likewise, variable V42, which codes for lexical overlap between words in the last clause of the dialogue and words in the key, is also associated with easier items.

The correlation of item difficulty with V45 (an inference is required to respond correctly) also meets the critical probability criterion. As expected, the correlation indicates that items that require the test takers to make an inference beyond what is explicitly stated in the dialogue tend to be more difficult than items that do not require this.

Also meeting the critical probability criterion is the correlation between item difficulty and variable V46, which coded items with respect to whether or not there were two components (i.e., clauses, phrases, exclamations, or a combination of these) uttered by the second speaker in the dialogue such that each of these components, independent of the other, could yield the key. The presence of this variable was negatively associated with item difficulty (i.e., associated with easier items). The presence of two such components in the dialogue is a kind of redundancy; other kinds of redundancy have been found to facilitate listening comprehension in past research (see Chaudron, 1995; Chiang & Dunkel, 1992).

The last correlation meeting the critical probability criterion is between item difficulty and variable V49, which coded for whether or not there was an apparent inconsistency between the text of the dialogue and the key. The correlation for variable V49 indicates that items coded for this variable tend to be more difficult.
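The lexical overlap variables discussed above (number and percentage of key words that overlap with the dialogue, and whether the key contains the last overlapping word) can be sketched in Python as follows. This is only an illustration under stated assumptions: the report does not give its criteria for judging overlap in this section, so the sketch assumes exact matching of lowercased word tokens, and all function names are hypothetical. The item is the ointment example quoted earlier in the report.

```python
import re


def tokens(text):
    """Lowercase word tokens; apostrophes are kept so "doesn't" stays one word."""
    return re.findall(r"[a-z']+", text.lower())


def overlap_count(option, dialogue):
    """V34-style count: tokens of the option that also occur in the dialogue."""
    dialogue_words = set(tokens(dialogue))
    return sum(word in dialogue_words for word in tokens(option))


def overlap_percent(option, dialogue):
    """V35-style percentage of the option's tokens that occur in the dialogue."""
    return 100.0 * overlap_count(option, dialogue) / len(tokens(option))


def key_has_last_overlap(key, distracters, dialogue):
    """V40-style flag: does the key contain the overlapping word that occurs last?"""
    key_words = set(tokens(key))
    option_words = set(key_words)
    for distracter in distracters:
        option_words |= set(tokens(distracter))
    # Walk the dialogue backwards to find the last word shared with any option.
    for word in reversed(tokens(dialogue)):
        if word in option_words:
            return word in key_words
    return False  # no option shares a word with the dialogue


dialogue = ("Have you seen the doctor about your skin condition yet? "
            "Oh, it's not a problem anymore. "
            "I've found an ointment that works just fine.")
key = "He doesn't need to see the doctor."
distracters = [
    "The doctor was too busy to see him.",
    "The woman should use the ointment.",
    "His skin condition has gotten worse.",
]

print(overlap_count(key, dialogue), round(overlap_percent(key, dialogue), 1))
print(key_has_last_overlap(key, distracters, dialogue))
```

Under the exact-match assumption, two of the seven key words ("the" and "doctor") occur in the dialogue, and the last overlapping word in the dialogue ("ointment") belongs to a distracter rather than to the key. The study's actual coding may have treated function words or inflected forms differently.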

31 In Table 2, the variables in this study are grouped into four broad categories: word-level variables, sentence-level variables, discourse-level variables, and task-processing variables. The 11 variables discussed above, whose correlation with item difficulty met the critical probability criterion, include representatives from three of these four broad categories, with 2 belonging in the category of word-level variables, 1 in the category of discourse-level variables, and 8 in the category of task-processing variables. Also, some of these 11 variables were discussed in the literature review above. For those variables, the direction of their correlation with item difficulty was consistent with the findings covered in the literature review.

Regarding the magnitude of the correlations. Although statistically significant, the correlations between the 11 variables described above and item difficulty are generally small in magnitude: Only 1 exceeds a magnitude of .30, an additional 7 fall between .20 and .30, with the remaining 3 falling below .20. These results are similar to results obtained in an earlier study exploring the relationship between item characteristics and the difficulty of TOEFL mini-talk items (see Freedle & Kostin, 1999). Freedle and Kostin’s (1999) comments below regarding the small magnitudes of the significant correlations in the TOEFL mini-talk study can be seen as applying to the present results as well:

Regarding these small magnitudes, it is interesting that a parallel-processing model of language comprehension such as that proposed by Just and Carpenter (1987, pp. 279-281) is consistent with such an observation. That is, if many processes influence comprehension, and if they do operate in parallel, then no single variable is likely to dominate the comprehension process. This fact implies that the correlation of any single variable with a measure of comprehension should be small in magnitude. (The reader should note that if future studies should find large correlations between item difficulty and other variables, this may only mean that the idea of massive parallel processing might be called into question.) (p. 19)

The fact that a similar pattern of correlations has been observed for TOEFL dialogues as well as for TOEFL mini-talks can be seen as lending support to the interpretation of both sets of results in terms of a parallel-processing model of language comprehension.
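The arithmetic behind the critical probability and the magnitude summary above can be checked with a short Python sketch. The correlations are the Table 2 values for the 11 variables that met the criterion; the variable names in the code and the conversion of each correlation to a percentage of shared variance (100 × r²) are illustrative additions, not part of the report.

```python
# Bonferroni critical probability: .05 divided by the 49 tests of significance.
alpha, n_tests = 0.05, 49
critical_p = round(alpha / n_tests, 3)

# Correlations with equated delta (Table 2) for the 11 variables that met it.
correlations = {
    "V02": .200, "V07": .245, "V28": .181, "V35": -.180, "V37": -.216,
    "V40": -.326, "V41": -.206, "V42": -.207, "V45": .158, "V46": -.291,
    "V49": .238,
}

# Bin the variables by the absolute size of their correlation.
above_30 = [v for v, r in correlations.items() if abs(r) > .30]
between_20_30 = [v for v, r in correlations.items() if .20 <= abs(r) <= .30]
below_20 = [v for v, r in correlations.items() if abs(r) < .20]

# Squared correlation, as a percentage of variance shared with item difficulty.
shared_variance = {v: round(100 * r ** 2, 1) for v, r in correlations.items()}

print(critical_p)
print(len(above_30), len(between_20_30), len(below_20))
print(shared_variance["V40"])
```

The counts reproduce the 1, 7, and 3 reported in the text, and even the largest correlation (V40, r = -.326) corresponds to only about 10.6% of shared variance.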

32 Results regarding the significant variables in Nissan et al. (1996). The first variable reported as significant in Nissan et al. was infrequent vocabulary, which was measured by the presence of an infrequent word in the dialogue. In the current study, as noted above, this variable, V01, did not have a significant correlation with item difficulty (i.e., r = .059, p = .130). However, a variant of this variable, V02 (knowledge of infrequent word is relevant to responding correctly), did correlate significantly with item difficulty (i.e., r = .211, p = .000). As noted earlier, one possible reason that might account for Nissan et al.’s significant finding and the corresponding nonsignificant one in this study is that the dialogues in Nissan et al.’s study had a much higher percentage of infrequent words that were relevant to responding correctly than was the case in this study.

The second significant variable discussed in Nissan et al. (1996) was utterance pattern; items with a statement in the second utterance (i.e., statement-statement and question-statement patterns) were found to be significantly more difficult than those with a question in the second utterance (i.e., question-question and statement-question patterns). There were not enough items in the Nissan et al. study to examine separately the two patterns that had a statement for the second utterance or the two patterns that had a question for the second utterance. These separate patterns were included in the current study. Of the two patterns with a question in the second utterance, the results here suggest that the question-question pattern, V23, is more closely (and negatively) related to item difficulty than the statement-question pattern, V24 (r = -.147, p = .002 and r = -.080, p = .064, respectively). Of the two patterns with a statement in the second utterance, the results here suggest that the statement-statement pattern, V25, is more closely (and positively) related to item difficulty than the question-statement pattern, V26 (r = .104, p = .024 and r = .003, p = .479, respectively). In general, the results here replicate the results in Nissan et al. regarding utterance pattern and provide additional information regarding the contribution of the components making up the patterns.

The third significant variable in Nissan et al. (1996) was negative in stimulus; items associated with dialogues that had two or more negatives were found to be significantly more difficult than those that had fewer negatives. Consistent with this result, in the current study the correlation between item difficulty and variable V11 (two or more negatives in the dialogue) is in the expected direction (r = .114) and is significant at the level of p = .014. The results also suggest that the presence of negatives in the utterance of the second speaker may play a greater

33 role in accounting for this result than the presence of negatives in the utterance of the first speaker: The correlation between item difficulty and V09 (two or more negatives in utterance of first speaker) is r = .035, p = .251, while the correlation between item difficulty and V10 (two or more negatives in utterance of second speaker) is r = .125, p = .008.

The fourth significant variable reported in Nissan et al. (1996) is implicit versus explicit information tested. For this variable, items are coded with regard to whether an inference is needed to respond correctly to the item. As noted above, the correlation in the current study for this variable, V45, met the critical probability criterion (r = .158, p = .001).

The last variable reported as significant in Nissan et al. (1996) was role of speaker(s); the items where the language of one of the speakers in the dialogue was linked to a specific role the speaker played and the role was not one of a casual acquaintance or classmate were found to be more difficult than items not having this characteristic. In the current study, the correlation between item difficulty and this variable, V27, was not significant and also was in a direction opposite to prediction. One possible explanation for the discrepancy between the two studies is that the specific roles in the current study may have been more familiar to the test takers than were the roles in Nissan et al. Examples of some specific roles in the current study associated with easier dialogue items are: server at a restaurant, manager at a supermarket or grocery store, and sales person at a store selling luggage. It seems likely that nonnative test takers have some background knowledge concerning roles such as these and can use this knowledge to aid in comprehending the dialogues that include these roles.

Regression analyses. Multiple regression was used to estimate how much variance in item difficulty is accounted for by the 49 variables employed in this study. In the regression analysis, equated delta was the dependent variable and the 49 variables in Table 2 were entered as a set. The overall F(47, 317) = 6.369, p = .000; the multiple R = .697 with an adjusted R² of .409, suggesting that about 41% of the variance is accounted for by the variables in the study.

Stepwise regression was used to identify a more parsimonious subset of variables to predict item difficulty. As noted above, the statistical analyses in this report were carried out using SPSS software. The stepwise regression procedure used by this software, as described in the SPSS manual (SPSS, 1999), employs the forward selection procedure to start the process; that is, variables are entered into the model one by one. The variable with the strongest positive (or negative) simple correlation with the dependent variable is entered first. At subsequent steps,

34 the variable with the strongest partial correlation is entered and tested for significance. However, the stepwise selection procedure tests variables already in the model for removal at each step. (For additional information concerning these procedures, see SPSS, 1999, p. 216.) All 49 variables listed in Table 2 were available for possible selection. Each new variable that was admitted into the solution had to yield a significance level of p ≤ .05. In the final regression equation, 14 variables were left. Results are given in Table 3. In carrying out the stepwise regression, no “already entered variables” needed to be removed from the model because their significance level no longer met the established criterion. We see that the 14 variables accounted for about 40% of the variance with an F(14, 350) = 18.15, p = .000.

The correlations of item difficulty with all but one of these 14 variables were significant at p < .05 (see Table 2), the one exception being V43. Some of these 14 significant variables were discussed in the literature review above. For such variables, the direction of their beta weights is consistent with the findings covered in the literature review.

It is important to note here that the above estimate of variance accounted for by the 14 variables capitalizes to a considerable degree on chance. A jackknife procedure was used to estimate how much the variance accounted for would vary when using data sets that differ from the original 365-item data set. The jackknife procedure was carried out as follows: First, 10 samples of approximately equal size and approximately equal difficulty were created from the original 365-item data set. Next, a regression procedure was run 10 times; for each run, the 14 variables were used to predict the item difficulty of a data set comprising 9 of the 10 samples, with a different set of 9 samples used for each run. The resulting equation was then used to predict the item difficulty values in the 10th sample. The predicted difficulty values were then correlated with the observed difficulty values in this 10th sample, with the resulting R² forming a basis for estimating variance accounted for.

The results of the jackknife procedure are as follows: The correlations between predicted and observed item difficulty in the 10 runs range from .517, p < .001 to .742, p < .000, with a mean correlation of .610, p = .000; thus, the variance accounted for ranges from 26.7% to 55.1%, with a mean of 37.2%. These latter figures can be seen as estimates of variance accounted for when the 14 variables that emerged in the original stepwise regression are used to predict the difficulty of a set of TOEFL dialogue items that differs from the original set of 365 items.
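The jackknife steps described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the SPSS procedure the report used: the data are synthetic, the function names `make_folds` and `jackknife_correlations` are hypothetical, and the way the 10 samples are balanced for difficulty (sorting items and dealing them out round-robin) is a guess, since the report does not describe how that balancing was done.

```python
import numpy as np


def make_folds(difficulty, n_folds=10):
    """Split item indices into folds of similar size and similar difficulty.

    Assumption: items are sorted by difficulty and dealt out round-robin;
    the report does not say how its 10 samples were balanced.
    """
    order = np.argsort(difficulty)
    return [order[i::n_folds] for i in range(n_folds)]


def jackknife_correlations(X, y, n_folds=10):
    """Fit on 9 folds, predict the held-out fold, return one r per run."""
    design = np.column_stack([np.ones(len(y)), X])  # add intercept column
    all_items = np.arange(len(y))
    correlations = []
    for held_out in make_folds(y, n_folds):
        train = np.setdiff1d(all_items, held_out)
        # Ordinary least squares on the 9 training samples.
        coef, *_ = np.linalg.lstsq(design[train], y[train], rcond=None)
        predicted = design[held_out] @ coef
        correlations.append(np.corrcoef(predicted, y[held_out])[0, 1])
    return np.array(correlations)


# Synthetic stand-in data: 365 items, 14 predictors (not the study's data).
rng = np.random.default_rng(0)
X = rng.normal(size=(365, 14))
y = 10.5 + X @ rng.normal(scale=0.3, size=14) + rng.normal(size=365)

r = jackknife_correlations(X, y)
variance_accounted_for = r ** 2  # squared correlation for each of the 10 runs
print(f"r ranges from {r.min():.3f} to {r.max():.3f}, mean {r.mean():.3f}")
```

Squaring each run's correlation gives the per-run estimate of variance accounted for, which is how the report's figures relate to one another (for example, .517 squared is about .267).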

35 Table 3
Results of Stepwise Multiple Regression, With Only Significant Variables Remaining in the Equation

Variable | B | Std. Error | Beta | t-test | Prob.
Constant | 10.461 | .119 | | 87.661 | .000
V40: Key has last overlapping word with the dialogue. | -.750 | .158 | -.214 | -4.757 | .000
V46: More than one element in utterance of second speaker yields key. | -1.167 | .225 | -.218 | -5.182 | .000
V49: Key seems inconsistent with content of dialogue. | 1.895 | .422 | .186 | 4.493 | .000
V07: Comprehension of idiom in dialogue is relevant to responding correctly. | .927 | .174 | .222 | 5.332 | .000
V02: Comprehension of infrequent word in dialogue is relevant to responding correctly. | .667 | .166 | .167 | 4.011 | .000
V11: Two or more negatives in total dialogue | .632 | .208 | .126 | 3.045 | .003
V14: Total number of dependent clauses in dialogue | .157 | .046 | .141 | 3.402 | .001
V43: Key has synonym of a word in last clause of dialogue. | -.749 | .243 | -.127 | -3.078 | .002
V08: Infrequent word is in key. | 1.017 | .374 | .113 | 2.721 | .007
V48: Key is a suggestion or directive. | -.562 | .183 | -.128 | -3.069 | .002
V47: Only comprehension of utterance of second speaker is needed to respond correctly. | -.477 | .148 | -.134 | -3.224 | .001
V28: Content of dialogue deals with academic campus life. | .329 | .122 | .111 | 2.687 | .008
V37: No distracter has more lexical overlap with dialogue than key. | -.345 | .143 | -.109 | -2.421 | .016
V18: Number of within-clause referentials in dialogue | .618 | .273 | .093 | 2.261 | .024

Note. Multiple R = .649; R² = .421; Adjusted R² = .398; standard error of estimate = 1.088.
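As a check on the note to Table 3, the adjusted R² can be recovered from R² with the usual small-sample adjustment. The values n = 365 items and k = 14 predictors are taken from the text; the formula itself is the standard one and is not quoted in the report, so treat this as an illustrative reconstruction rather than the report's own derivation.

```latex
\[
R^2_{\text{adj}} \;=\; 1 - (1 - R^2)\,\frac{n - 1}{n - k - 1}
\;=\; 1 - (1 - .421)\,\frac{364}{350} \;\approx\; .398
\]
```

The denominator n - k - 1 = 350 matches the error degrees of freedom in the reported F(14, 350), and the multiple R of .649 squares to about .421.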

36 Conclusions and Implications

First of all, this study has replicated some of the significant findings in Nissan et al. (1996). The following variables that were significant in Nissan et al. were also significantly related to item difficulty in the current study: (a) the presence of two or more negatives in the dialogue, (b) the need to draw an inference beyond what is explicitly stated in the dialogue, and (c) the pattern of utterances in the dialogue. One can have confidence in these results not only because they have been replicated but also because the intercoder reliabilities for them are acceptable. However, these results are based on existing items; it still needs to be determined whether they can provide the basis for creating and/or for modifying items to desired levels of difficulty.

In regard to modifying items, one could follow the approach of Adams, Carson, and Cureton (1993), who revised middle-difficulty GRE® discrete items in order to produce items of higher or lower difficulty; in the case of TOEFL dialogue items, for example, one could insert two or more negatives into existing dialogues of middle difficulty that have no negatives and see whether this modification increased the difficulty of the item. However, Adams et al. only needed to change some words in a printed test form to modify these GRE items, which led them to conclude that “producing harder analogies and antonyms by revising items in this manner would be a cost-effective procedure” (see Abstract). In contrast, adding negatives to an existing TOEFL dialogue would require re-recording the dialogue, which might mean that such a procedure would not be cost-effective. Consequently, these results might best be used only as a basis for creating new items of varying levels of difficulty.

However, assuming that one has a well-replicated set of variables that predict TOEFL dialogue item difficulty, a reviewer of this report has suggested that “the process of recording dialogues for this item type could be planned in such a way as to prerecord all the variations that would be relevant for later construction [of] sets of appropriate difficulty” (I. Bejar, personal communication, December 30, 2002). Also, if the significant findings regarding lexical overlap variables are replicated, these findings could be used as a basis for modifying existing items without the need for re-recording the dialogues. In the case of lexical overlap variables, it would be possible to modify the degree of lexical overlap between the options and the dialogue by simply changing some of the words in the options, which are in printed form.

37 The correlations between item difficulty and a number of variables other than those from Nissan et al. (1996) met the critical probability criterion. At present, these findings are suitable primarily for hypothesis generation, since they still need to be replicated. However, it is appropriate to note that several of these variables did not come simply from an examination of the items themselves, but also from a survey of the research literature. The direction with which these variables correlated with item difficulty is, in all cases, consistent with the findings in the research literature. This provides evidence to suggest that the results regarding some of these variables will be successfully replicated.

Future Studies

The primary purpose of the current study was a practical one, that is, to provide test development staff with information that has the potential to help them create harder and/or easier TOEFL dialogue items. However, ideally, future studies that investigate the relationship between item characteristics and item difficulty will be more theoretically guided than the present one; the empirical results of these studies will, hopefully, also yield information about the predictive power of different theoretical orientations. Also, future studies, ideally, will attempt to confirm these predictions using methods other than the regression methods used here.

It has been noted above that the correlational results in the present study are consistent with the findings in the research literature. One can hope that it would be possible in the near future to integrate these separate findings into a more comprehensive theoretical approach to language processing.

38 References

Abrahamsen, E., & Shelton, K. (1989). Reading comprehension in adolescents with learning disabilities: Semantic and syntactic effects. Journal of Learning Disabilities, 22, 569-572.

Adams, R., Carson, J., & Cureton, K. (1993). Item difficulty adjustment study: GRE verbal discretes. (ETS RR-92-79). Princeton, NJ: ETS.

American Heritage Dictionary of the English Language (4th ed.). (2000). Boston: Houghton Mifflin Co.

Anderson, A., & Lynch, T. (1988). Listening. New York: Oxford University Press.

Bejar, I., Douglas, D., Jamieson, J., Nissan, S., & Turner, J. (2000). TOEFL listening framework: A working paper. (ETS RM-00-07). Princeton, NJ: ETS.

Berger, K. W. (1977). The most common 100,000 words used in conversations. Kent, OH: Herald Publishing House.

Breland, H., & Jenkins, L. (1997). English word frequency statistics: Analysis of a selected corpus of 14 million tokens. New York: College Entrance Examination Board.

Buck, G., & Kostin, I. (1999a). Exploring the cause of item difficulty on TOEFL CBT dialogue items. Manuscript in preparation.

Buck, G., & Kostin, I. (1999b). Developing a scheme to analyze the phonological characteristics of listening-item stimuli. Manuscript in preparation.

Carver, R. (1976). Word length, prose difficulty, and reading rate. Journal of Reading Behavior, 8, 193-203.

Chaudron, C. (1995). Academic listening. In D. Mendelsohn and J. Rubin (Eds.), A guide for the teaching of second language listening (pp. 74-96). San Diego, CA: Dominie Press, Inc.

Chiang, C., & Dunkel, P. (1992). The effect of speech modification, prior knowledge, and listening proficiency on EFL lecture learning. TESOL Quarterly, 26, 345-374.

Crystal, D. (1980). A first dictionary of linguistics and phonetics. Boulder, CO: Westview Press.

Freedle, R., & Fellbaum, C. (1987). An exploratory study of the relative difficulty of TOEFL’s listening comprehension items. In R. Freedle & R. Duran (Eds.), Cognitive and linguistic analyses of test performance (pp. 162-192). Norwood, NJ: Ablex.

Freedle, R., & Kostin, I. (1993). The prediction of TOEFL reading item difficulty: Implications for construct validity. Language Testing, 10, 133-170.

39 Freedle, R., & Kostin, I. (1999). Does the text matter in a multiple-choice test of comprehension? The case for the construct validity of TOEFL’s minitalks. Language Testing, 16, 2-32.

Gernsbacher, M. (1990). Language comprehension as structure building. Hillsdale, NJ: Erlbaum.

Hale, G. (1988). The interaction of student major-field group and text content in TOEFL reading comprehension. (TOEFL Research Rep. No. 25). Princeton, NJ: ETS.

Hatch, E., & Lazaraton, A. (1994). The research manual: Design and statistics for applied linguistics. Boston: Heinle & Heinle.

Henrichson, L. (1984). Sandhi-variation: A filter of input for learners of ESL. Language Learning, 34, 103-126.

Just, M., & Carpenter, P. (1987). The psychology of reading and language comprehension. Boston, MA: Allyn & Bacon.

Kelly, P. (1991). Lexical ignorance: the main obstacle to listening comprehension with advanced foreign language learners. International Review of Applied Linguistics in Language Teaching, 29, 135-150.

Kintsch, W., & Kozminsky, E. (1977). Summarizing stories after reading and listening. Journal of Educational Psychology, 69, 491-499.

Kintsch, W., Kozminsky, E., Streby, W., McKoon, G., & Keenan, J. (1975). Comprehension and recall of text as a function of content variables. Journal of Verbal Learning and Verbal Behavior, 14, 196-214.

McNamara, D., Kintsch, E., Songer, N., & Kintsch, W. (1996). Are good texts always better? Interactions of text coherence, background knowledge, and levels of understanding in learning from text. Cognition and Instruction, 14, 1-43.

Miller, G.A. (1999). On knowing a word. Annual Review of Psychology, 50, 1-19.

Nissan, S., DeVincenzi, F., & Tang, K. L. (1996). An analysis of factors affecting the difficulty of dialogue items in TOEFL listening comprehension. (TOEFL Research Rep. No. 51). Princeton, NJ: ETS.

Petersen, N., Marco, G., & Stewart, E. (1982). A test of the adequacy of linear score equating models. In Holland, P. & Rubin, D. Test Equating (pp. 71-136). New York: Academic Press.

Quirk, R., Greenbaum, S., Leech, G., & Svartvik, J. (1985). A comprehensive grammar of the English language. London, England: Longman.

40 Recht, D., & Leslie, L. (1988). Effect of prior knowledge on good and poor readers’ memory of text. Journal of Educational Psychology, 80, 16-20.

Rost, M. (1990). Listening in language learning. New York: Longman.

Smiley, S., Oakley, D., Campione, J., & Brown, A. (1977). Recall of thematically relevant material by adolescent good and poor readers as a function of written versus oral presentation. Journal of Educational Psychology, 69, 381-387.

Spilich, G., Vesonder, G., Chiesi, H., & Voss, J. (1979). Text processing of domain related information for individuals with high and low domain knowledge. Journal of Verbal Learning and Verbal Behavior, 18, 275-290.

SPSS, Inc. (1999). SPSS base 9.0: Application guide. Chicago: SPSS, Inc.

Sticht, T., & James, J.H. (1984). Listening and reading. In P.D. Pearson, R. Barr, M. Kamil, & P. Mosenthal (Eds.), Handbook of reading research (pp. 293-317). NY: Longman.

Yi’an, W. (1998). What do tests of listening comprehension test? A retrospection study of EFL test-takers performing a multiple-choice task. Language Testing, 15, 21-44.

41 Appendix A

Coding Instructions

Word-level Codes

V01: Infrequent Word in Dialogue

A word in the dialogue is considered to be an infrequent word if it does not appear in The Most Common 100,000 Words Used in Conversations, by Kenneth Berger (1977).

Coding instructions for V01. If there is at least one content word in the dialogue that does not appear in Berger’s word-frequency list, code 1; else 0.

Additional coding instructions for V01:

1. Words with the same root but with different endings are considered to be the same word (e.g., the word offering in a dialogue would get coded 0 if the word offered appeared on Berger’s list but the word offering did not, since both words have the same root).
2. A compound word in a dialogue would get coded 0 if (a) its component words appeared in Berger’s list and (b) the meaning of the compound word could be inferred from its components (e.g., the word weekday would get coded 0 because both week and day appear on Berger’s list).
3. To help in coding V02 below, coders should look up all the words in the dialogue that they believe might not appear in Berger’s word-frequency list and make note of all those words that don’t appear on the list.

V02: Knowledge of An Infrequent Word in the Dialogue Is Relevant to Responding Correctly to the Item.

Note. This variable is only coded for those items assigned a 1 for code V01 (infrequent word in dialogue).

1. Below is an example of an item where knowledge of the infrequent word almanac is relevant to responding correctly to the item:

(man) Shall I return this almanac to the reference desk?

42 (woman) I want to check a few dates first.
(narrator) What does the woman mean?
(A) She needs to check her calendar.
(B) She hasn’t finished with the book.*
(C) The reference material is out-of-date.
(D) She has already returned the almanac.

2. Below is an example of an item where knowledge of the infrequent word antique is NOT needed in order to respond correctly to the item:

(woman) There’s a great antique show at the Grant Auditorium. Let’s go see it this evening.
(man) I’ve worked really hard all day long. Won’t it be there for a while?
(narrator) What does the man imply?
(A) He has to work late tonight.
(B) He’d rather go at another time.*
(C) He’s already seen the show.
(D) It’ll be hard to get to the auditorium on time.

Coding instructions for V02. If knowledge of an infrequent word in the dialogue is relevant to responding correctly to the item AND if the infrequent word does not also appear in the key, code 1; else 0. (It is assumed here that knowledge of the infrequent word in the dialogue may not be needed when the infrequent word is also present in the key because, in the latter case, a simple matching strategy might yield the key.)

V03: Average Word Length in the Utterance of the First Speaker

Coding instructions. Use grammar tool in MS-Word to get the average word length in characters of the utterance for the first speaker.

V04: Average Word Length in the Utterance of the Second Speaker

Coding instructions. Use grammar tool in MS-Word to get the average word length in characters of the utterance for the second speaker.
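For readers without access to the MS-Word grammar tool, the average-word-length codes (V03 and V04 above, and the same measure for the total dialogue) can be approximated with a short script. This is a rough sketch, not the tool the coders used: the function name `average_word_length` is hypothetical, and the tokenization rule (runs of letters and apostrophes count as words, with apostrophes counted as characters) is an assumption that may not match how MS-Word counts.

```python
import re


def average_word_length(text):
    """Mean number of characters per word; 0.0 for text with no words."""
    # Assumption: a word is any run of letters and apostrophes.
    words = re.findall(r"[A-Za-z']+", text)
    if not words:
        return 0.0
    return sum(len(word) for word in words) / len(words)


# Utterances from the almanac example item in this appendix.
first = "Shall I return this almanac to the reference desk?"
second = "I want to check a few dates first."

v03 = average_word_length(first)                 # first speaker
v04 = average_word_length(second)                # second speaker
v05 = average_word_length(first + " " + second)  # total dialogue
print(round(v03, 2), round(v04, 2), round(v05, 2))
```

Note that the total-dialogue value is computed over all words pooled together, not as the mean of the two speakers' averages; the two can differ when the utterances have different lengths.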

43 V05: Average Word Length in Total Dialogue

Coding instructions. Use grammar tool in MS-Word to get the average word length in characters of the utterance for the total dialogue.

V06: Instructions to Include Sandhi-variation in the Dialogue

Below is an example of an item that includes instructions to include sandhi-variation in the dialogue:

(woman) You know [Y’know], some TV channels have been rerunning a lot of [lotta] comedies from the sixties. What do you think of [thinka] those old shows?
(man) Not much. But then, the new ones aren’t so great either.
(narrator) What does the man mean? (12 seconds)
(A) He no longer watches much television.
(B) He prefers the comedies from the sixties.
(C) Television comedies haven’t improved since the sixties.*
(D) He hasn’t seen many of the old shows.

Coding instructions. If the speakers in the dialogue are instructed to alter the pronunciation of the words that they speak, code 1; else 0.

V07: Comprehension of an Idiom Or an Idiomatic Multiword Verb Is Relevant to Responding Correctly to the Item.

The American Heritage Dictionary (2000) defines the word idiom as “an expression consisting of two or more words having a meaning that cannot be deduced from the meanings of its constituent parts” (p. xxxvi). Similarly, according to Quirk, Greenbaum, Leech, and Svartvik (1985), idiomatic multiword verbs are those whose “meaning is not predictable from the meanings of its parts” (p. 1162). Some examples of idiomatic multiword verbs given by Quirk et al. are: give in (surrender), come by (acquire), turn up (make an appearance), catch on (understand), and blow up (explode).

1. Below is an example of an item where comprehension of the idiom she’s got it made is relevant to responding correctly to the item:

44 (man) If you could, would you trade places with your sister?
(woman) Yeah, she’s got it made.
(narrator) What does the woman mean?
(A) The sisters share a lot of things.
(B) She and her sister will switch seats.
(C) Things are going well for her sister.*
(D) Her sister finished her cooking.

2. Below is an example of an item where comprehension of the multiword idiomatic verb turned down is relevant to responding correctly to the item:

(woman) But David, you mean you didn’t apply for a scholarship?
(man) I did, but I was turned down.
(narrator) What does David mean?
(A) He decided to quit school this term.
(B) He didn’t bring his application form.
(C) He made a wrong turn downtown.
(D) He didn’t receive financial aid.*

3. Below is an example of an item where comprehension of the idiom gets on my nerves does NOT appear to be needed in order to respond correctly to the item:

(man) Why did you come to the meeting late? I left a message with your roommate about the time change.
(woman) She has a very short memory, and it really gets on my nerves sometimes.
(narrator) What does the woman imply?
(A) The man shouldn’t have invited her roommate to the meeting.
(B) Her roommate was unable to attend the meeting.
(C) Her roommate is unreliable about delivering messages.*

45 (D) She forgot about the time change.

Coding instructions for V07. If comprehension of an idiom or multiword idiomatic verb is relevant to responding correctly, code 1; else 0.

V08: There Is an Infrequent Word in the Key.

Coding instructions for V08. If a word in the key has a Standard Frequency Index (SFI) of less than 40.0 in the Breland word-frequency count (Breland & Jenkins, 1997) AND if this word does not also appear in the dialogue, code 1; else 0. (It is assumed here that comprehension of the infrequent word in the key may not be needed if the infrequent word is also present in the dialogue because, in the latter case, a simple matching strategy might yield the key.)

Sentence-level Codes

V09: Two or More Negatives in Utterance of First Speaker

Negative markers (e.g., no and not) are counted, as well as negative prefixes (e.g., un- and dis-). Negative tags are also counted, even if their meaning is not negative.

Coding instructions for V09. If the number of negatives in the utterance of the first speaker is 2 or greater, code 1; else 0.

V10: Two or More Negatives in Utterance of Second Speaker

Negative markers (e.g., no and not) are counted, as well as negative prefixes (e.g., un- and dis-). Negative tags are also counted, even if their meaning is not negative.

Coding instructions for V10. If the number of negatives in the utterance of the second speaker is 2 or greater, code 1; else 0.

V11: Two or More Negatives in Total Dialogue

Negative markers (e.g., no and not) are counted, as well as negative prefixes (e.g., un- and dis-). Negative tags are also counted, even if their meaning is not negative.

Coding instructions for V11. If the number of negatives in the total dialogue is 2 or greater, code 1; else 0.

V12: Number of Dependent Clauses in Utterance of First Speaker

46 Coding instructions for V12. Code the number of dependent clauses in the utterance of the first speaker.

V13: Number of Dependent Clauses in Utterance of Second Speaker

Coding instructions for V13. Code the number of dependent clauses in the utterance of the second speaker.

V14: Number of Dependent Clauses in Total Dialogue

Coding instructions for V14. Code the number of dependent clauses in the total dialogue.

V15: Number of Words in Longest T-unit in Utterance of First Speaker

A T-unit is defined as an independent clause with any attached dependent clauses (Hatch & Lazaraton, 1994).

Coding instructions for V15. Code the number of words in the longest T-unit in the utterance of the first speaker.

V16: Number of Words in Longest T-unit in Utterance of Second Speaker

Coding instructions for V16. Code the number of words in the longest T-unit in the utterance of the second speaker.

V17: Number of Words in Longest T-unit of Total Dialogue

Coding instructions for V17. Code the number of words in the longest T-unit in the total dialogue.

V18: Number of Within-clause Referentials in the Dialogue

The line of dialogue below contains the within-clause referential his.

(man) Roy wouldn’t let me borrow his notes, even though I needed them.

Coding instructions for V18. Code the number of within-clause referentials in the dialogue.

V19: Number of Between-clause Referentials Within a Speaker’s Turn in the Dialogue

The line of dialogue below contains the between-clause referential he.

(man) Julia asked me to pick up the guest speaker, Bob Russell, at the airport this afternoon. Do you know what he looks like?

47 Coding instructions for V19. Code the number of between-clause referentials within a speaker’s turn in the dialogue.

V20: Number of Referentials in the Utterance of One Speaker That Refer to a Word in the Utterance of the Other Speaker

In the dialogue below, the pronoun they, spoken by the man, refers to the word packages, spoken by the woman.

(woman) Those packages took forever to arrive.
(man) But they did arrive, didn’t they?

Coding instructions for V20. Code the number of referentials used by one speaker that refer to a word in the utterance of the other speaker.

V21: Number of Special Referentials in the Dialogue

Special referentials are those that refer to things outside of the text. In the example below, the pronouns you and I refer to the speakers themselves rather than to words in the dialogue.

(woman) Do you have change for a fifty-dollar bill?
(man) A fifty-dollar bill! I hardly have fifty cents!

Coding instructions for V21. Code the number of special referentials in the dialogue.

V22: Number of Words in the Key

Coding instructions for V22. Code the number of words in the key.

Discourse-level Codes

Variables V23-V26

Each item needs to be coded for one of the following four variables having to do with utterance patterns.

V23: Utterance Pattern: Question-question

Coding instructions for V23. If the utterance pattern takes the form of question-question, code 1; else 0.

V24: Utterance Pattern: Statement-question

48 Coding instructions for V24. If the utterance pattern takes the form of statement-question, code 1; else 0.

V25: Utterance Pattern: Statement-statement

Coding instructions for V25. If the utterance pattern takes the form of statement-statement, code 1; else 0.

V26: Utterance Pattern: Question-statement

Coding instructions for V26. If the utterance pattern takes the form of question-statement, code 1; else 0.

Additional Coding Instructions for V23-V26

If an utterance includes two sentences, one a question and another a statement, the item’s key needs to be examined to determine whether the focus is on the question or on the statement. For example, in the dialogue below, the woman both asks a question and makes a statement. The woman’s response is coded as a statement because the key focuses on the statement part of her response.

(man) All I can turn in today is my chemistry homework.
(woman) Is everything all right? You usually have everything completed on time.
(narrator) What does the woman imply about the man?
(A) He usually turns in his assignments late.
(B) He didn’t have time to complete everything.
(C) He is usually a conscientious student.*
(D) He usually completes only his chemistry work on time.

V27: The Speaker Has a Specific Role.

For variable V27, use the following instructions from Nissan et al. (1996): judge whether the language of one of the speakers is linked to a specific role the speaker plays. For many Dialogues, the situations are somewhat similar; they tend to represent experiences common to young adults in the university setting (e.g., too much noise in the dormitory,

49 problems with a lab experiment), and the speakers take on an anonymous “every student” role. In other cases, the speakers’ exchange is of a very general nature and could be inferred to be spoken by practically anyone without misunderstanding the gist of the Dialogue or the speakers’ intentions. For some items, however, the identity of the speakers diverges from the “every student” and “any person” roles. The language of the speakers and their communicative function is directly linked to some specialized role. The following example exhibits a specialized role (and a probable location).

(man) I’m looking for a warm jacket.
(woman) We have some very nice ones marked down.
(narrator) What does the woman mean?

When processing this item, it would be helpful to assume that the woman is a sales clerk (and that the speakers are probably situated in a store that sells clothing) (pp. 9-10).

Coding instructions for V27. If the language of one of the speakers is linked to a specific role the speaker plays and the role is not that of a casual acquaintance or classmate, code 1; else 0.

Variables V28-V32

Each item needs to be coded for one of the next five variables; these concern the content of the dialogues with regard to if and/or how the content is related to campus life.

V28: The Content of the Dialogue Deals With the Academic Part of Campus Life.

The content of the dialogue is related to university academic activities. This includes content such as registering for classes; students’ attitudes toward their course work; references to materials used for class such as textbooks, calculators, and the like; studying; interactions with professors involving course work; class attendance; academic requirements; exams; homework; course assignments; classroom experience; and similar content. One example is given below:

(man) All I can turn in today is my chemistry homework.
(woman) Is everything all right? You usually have everything completed on time.
(narrator) What does the woman imply about the man?

50 (A) He usually turns in his assignments late.
(B) He didn’t have time to complete everything.
(C) He is usually a conscientious student.*
(D) He usually completes only his chemistry work on time.

Coding instructions for V28. If the content of the dialogue is related to university academic activities, code 1; else 0.

V29: The Content of the Dialogue Deals With the Nonacademic Part of Campus Life.

This includes nonacademic features such as references to life in a dormitory, student government, discounts for students, extracurricular activities, getting transportation to school, finding a place to live while at school, jobs on campus, and similar content. The following is an example:

(woman) You know, the noise in my dorm has really gotten out of control. My roommate and I can rarely get to sleep before midnight.
(man) Why don’t you take the problem up with the dorm supervisor?
(narrator) What does the man suggest the woman do?
(A) Discuss the situation with the person in charge of the dormitory.*
(B) Ask her roommate not to make so much noise.
(C) Go to bed after midnight.
(D) Send a letter to the residents.

Coding instructions for V29. If the content of the dialogue is related to nonacademic features of campus life, code 1; else 0.

V30: The Content of the Dialogue Is Related to Campus Life But Could Also Be Related to One or Two Additional Domains.

This includes references to content such as the following, where it is not clear whether the context is campus, recreation, or work related: working on a project, gyms, cafeterias, roommates, books, presentations, health clinic, library, references to equipment such as computers and photocopy machines, and similar content. In the example given below, the three projects could be conducted either at a university or in a work-related setting.

51 (woman) I’m getting really stressed out. I just don’t have the time to work on all three projects.
(man) You need to set priorities—just take the time to figure out what has to be done first.
(narrator) What does the man suggest the woman do?
(A) Calculate how much each project will cost.
(B) Take time to relax.
(C) Discuss her stress with the project leader.
(D) Decide which project is most urgent.*

Coding instructions for V30. If the content of the dialogue is related to campus life but could also be related to one or two additional domains because the context is not specified, code 1; else 0.

V31: Campus-Related Terms Are Present But Are Incidental to the Main Focus of the Dialogue.

One example is given below:

(man) You know, I’ve been watering my plants regularly, but they’re still not doing well in my new dorm room.
(woman) Maybe instead of keeping them in the corner you should put them directly in front of the window.
(narrator) What does the woman imply?
(A) The plants may need more light.*
(B) The plants should get less water.
(C) The area in front of the window is too cold for plants.
(D) Plants rarely do well in the dormitory.

Coding instructions for V31. If campus-related terminology is present but is incidental to the main focus of the dialogue, code 1; else 0.

52 V32: The Content of the Dialogue Is Either Related to a Noncampus Domain Or Is Very General.

Two examples are given below:

1. The content of the dialogue below is related to the noncampus domain of shopping.

(woman) I thought the department store was open late from Tuesday through Friday night.
(man) No, just Thursdays and Fridays.
(narrator) On what nights is the store open late?
(A) Thursdays and Fridays.*
(B) Tuesdays and Fridays.
(C) Wednesdays and Thursdays.
(D) Tuesdays, Thursdays, and Fridays.

2. The content of the dialogue below is very general and could occur in a great variety of settings.

(man) You know, every time I talk to Mary I get the feeling she’s being critical of me.
(woman) Don’t you think you’re overreacting a bit?
(narrator) What does the woman mean?
(A) She thinks Mary is too critical.
(B) She doesn’t know how to react.
(C) She thinks the man is too sensitive.*
(D) She wants to know what the man thinks.

Coding instructions for V32. If the content of the dialogue is either very general or clearly related to a noncampus domain, code 1; else 0.

V33: Total Number of Words in the Dialogue

Coding instructions for V33. Code the total number of words in the dialogue.

53 Task-processing Codes

Codes Involving Lexical Overlap

V34: Number Of Words in the Key That Overlap With Words in the Dialogue
Coding instructions for V34. Using the instructions for coding lexical overlap given in Appendix B, code the number of words in the key that overlap with words in the dialogue. Note that most of the words in the key that overlap with words in the dialogue are content words; however, in certain cases, lexical overlap is also coded for function words as described in Appendix B.

V35: Percentage of Words in the Key That Overlap With Words in the Dialogue
Coding instructions for V35. Divide the number of words coded for variable V34 by the number of words coded for variable V22.

V36: The Key Has More Words That Overlap With Words in the Dialogue Than Do Any of the Three Distracters.
Coding instructions for V36. If the key has more words that overlap with words in the dialogue than do any of the three distracters, code 1; else 0.

V37: No Distracter Has More Words Than the Key That Overlap With Words in the Dialogue.
Coding instructions for V37. If no distracter has more words that overlap with words in the dialogue than does the key, code 1; else 0. Note that all items assigned a 1 for V36 should also be assigned a 1 for V37.

V38: The Key Has No Helpful Lexical Overlap With the Dialogue.
Coding instructions for V38. If the key has no words that overlap with words in the dialogue OR if the key has lexical overlap with the dialogue that is identical to the lexical overlap of all three distracters, code 1; else 0.

V39: All Three Distracters Have More Words Than Key That Overlap With Words In the Dialogue.
Coding instructions for V39. If all three distracters have more words that overlap with words in the dialogue than does the key, code 1; else 0.
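The rules for V35 through V39 all follow from four hand-coded counts, so they can be sketched as a small function. This is only an illustrative sketch, not part of the study's procedure: the `OverlapCounts` container and the function name are hypothetical, V22 is assumed to be the number of words in the key (its definition lies outside this section), "more than any of the three distracters" is read as "strictly more than each distracter", and V38 is taken to equal "V34 is zero" on the assumption that the counts already follow Appendix B, under which overlap shared identically by all four options is not counted.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class OverlapCounts:
    """Hand-coded overlap counts for one item (hypothetical container)."""
    key_overlap: int                # V34: key words that overlap with the dialogue
    key_word_count: int             # assumed to be what V22 records
    distracter_overlaps: List[int]  # the same count for each of the three distracters


def code_overlap_variables(c: OverlapCounts) -> Dict[str, float]:
    """Derive V35-V39 from the counts, following the coding instructions."""
    if len(c.distracter_overlaps) != 3:
        raise ValueError("expected overlap counts for exactly three distracters")
    if c.key_word_count <= 0:
        raise ValueError("key_word_count must be positive")
    key, ds = c.key_overlap, c.distracter_overlaps
    return {
        "V34": key,
        # V35 is V34 divided by V22, so it is a proportion between 0 and 1.
        "V35": key / c.key_word_count,
        # V36: the key strictly exceeds every distracter (assumed reading).
        "V36": int(all(key > d for d in ds)),
        # V37: no distracter exceeds the key; implied whenever V36 is 1.
        "V37": int(all(d <= key for d in ds)),
        # V38: no helpful overlap, assuming Appendix B counting was applied.
        "V38": int(key == 0),
        # V39: every distracter strictly exceeds the key.
        "V39": int(all(d > key for d in ds)),
    }


example = code_overlap_variables(OverlapCounts(2, 5, [1, 0, 2]))
```

In this made-up case one distracter ties the key, so V36 is 0 while V37 is 1, which shows why the two variables are coded separately.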

54 V40: The Key Has the Last Overlapping Word With the Dialogue.
Coding instructions for V40. A 1 is assigned for this code if (a) only the key has the last overlapping word with the dialogue, OR (b) the key and only one distracter have the last overlapping word with the dialogue, but the key’s other overlapping words come later than those of this one distracter, OR (c) the key and only one distracter have the last overlapping word but are otherwise equal in regard to lexically overlapping words; else 0.

In the example below, only the key has the last overlapping word with the dialogue, that is, the word “tea.” No distracter has an overlapping word with the dialogue that comes later than the word “tea.”
(man) It’s really nice of you to visit me when I’m so miserable with the flu. I’m sure I’d feel much better if I just had some of my mom’s homemade chicken soup.
(woman) That will be [that’ll be] hard to come by, but a cup of hot tea might help.
(narrator) What will the woman probably do next?
(A) Make some tea for the man.*
(B) Take the man to see a doctor.
(C) Ask the man’s mother to come over.
(D) Look up a recipe for chicken soup.

V41: There Is Overlap Between Words in the Key and Words Spoken by the Second Speaker in the Dialogue.
Coding instructions for V41. If the key has a word or words that overlap with those of the second speaker in the dialogue, code 1; else 0.

V42: There Is Overlap Between Words in the Key and Words in the Last Clause of the Dialogue.
Coding instructions for V42. If the key has a word or words that overlap with those of the last clause in the dialogue, code 1; else 0.

55 V43: The Key Has a Word That Is Synonymous With a Word in the Last Clause of the Dialogue.
Coding instructions for V43. For items coded 0 for V42, if the key has a word that is synonymous with a word in the last clause of the dialogue, code 1; else 0.

V44: All Three Distracters Have Lexical Overlap With the Dialogue That Comes Later in the Dialogue Than Does Any Lexical Overlap of the Key.
Coding instructions for V44. If all three distracters have lexical overlap with the dialogue that comes later in the dialogue than does any lexical overlap of the key, code 1; else 0.

In the example below, there is overlap between the word “go” in the key and the word “go” in the dialogue. Each of the three distracters has words that overlap with words in the dialogue that come later in the dialogue than does the word “go.”
(man) Dennis would like us to go bowling with him this weekend.
(woman) I’d love to—but not until I get this project out of the way ... and that could take weeks!
(narrator) What does the woman mean?
(A) She doesn’t like bowling.
(B) She probably won’t be able to go.*
(C) She’ll go bowling with Dennis next week.
(D) She’ll help Dennis with his project this weekend.

Other Text-Processing Codes

V45: An Inference Is Required to Respond Correctly to the Item.
Variable V45 identifies items according to whether the information tested is explicitly or implicitly stated in the stimulus. The answer to an item that tests explicit information is often a paraphrase of what was stated in the stimulus. To answer an item that tests implicit information, it is often necessary to go beyond what is actually stated in the stimulus. Most of the dialogues that test inference have stems worded “What does the man/woman imply?” or “What does the man/woman imply about x?” One example is given below.
(woman) What did you think of the new doctor at the infirmary?

56 (man) You mean Dr. Randolf? He was away attending a conference.
(narrator) What does the man imply?
(A) The doctor wasn’t well.
(B) He didn’t see the new doctor.*
(C) The doctor was going to see him anyway.
(D) He went to a conference with Dr. Randolf.

Coding instructions for V45. If responding correctly to the item requires an inference, code 1; else 0.
Additional coding instructions for V45. Do NOT assign a 1 for this variable if the only inference involved is inferring the referent of one or more pronouns in the dialogue.

V46: The Utterance of the Second Speaker in the Dialogue Contains Two Sentences, Clauses, Phrases, Exclamations, or Some Combination of These Such That Each of These Sentences, Clauses, Phrases, or Exclamations, in Isolation, Can Yield the Key.
In the example below, it is possible to respond correctly to this item if one only comprehends the sentence, “Oh, it’s not a problem anymore” or if one only comprehends the sentence, “I’ve found an ointment that works just fine.” It is not necessary to comprehend both sentences to respond correctly to this item.
(woman) Have you seen the doctor about your skin condition yet?
(man) Oh, it’s not a problem anymore. I’ve found an ointment that works just fine.
(narrator) What does the man imply?
(A) The doctor was too busy to see him.
(B) He doesn’t need to see the doctor.*
(C) The woman should use the ointment.
(D) His skin condition has gotten worse.

Coding instructions for V46. If there are two sentences, clauses, phrases, exclamations, or some combination of these in the turn of the second speaker in the dialogue such that each of them, in isolation, can yield the key, code 1; else 0.

57 Additional coding instructions for V46. When coding this variable, one should assume that the test taker has correctly inferred the referents of any pronouns used by the second speaker. In the example below, one should assume that the test taker has inferred that the pronoun “it,” spoken by the man, refers to the South Dorm.
(woman) I need a place to live next semester. The ride back and forth to class this year was too much.
(man) Did you check out the South Dorm? The rooms are pretty small, but it’s close to everything.
(narrator) What does the man suggest the woman do?
(A) Move out of the South Dorm.
(B) Find a bigger room.
(C) Look for a room in the South Dorm.*
(D) Stay where she lives now.

V47: Only Comprehension of the Utterance of the Second Speaker Is Needed to Respond Correctly to the Item.
In the example below, it is only necessary to comprehend what the second speaker has to say in order to respond correctly to this item.
(man) What have you heard about Professor Smith? I’m thinking of taking an advanced engineering course with him.
(woman) You really should. One of his articles just won some sort of award—and I heard he’s always publishing something in the journals.
(narrator) What does the woman say about the professor?
(A) His classes are very difficult.
(B) His work is well respected.*
(C) He will publish a book soon.
(D) He is no longer teaching.

Coding instructions for V47. If it is not necessary to comprehend what the first speaker says in order to respond correctly to this item, code 1; else 0.

58 Additional coding instructions for V47. This code is NOT assigned to an item if the key uses any term used by the first speaker unless the term is also present in the response of the second speaker and/or in the question asked by the narrator.

V48: The Key Is a Suggestion or Directive.
Coding instructions for V48. If the key is a suggestion or directive such as including the word “should” or using the imperative form of a verb, code 1; else 0. Below are two examples of items coded for this variable.

Example 1:
(woman) How often do the buses run?
(man) Every half hour on weekdays, but I’m not sure about weekends. There’s a schedule on the corner by the bus stop.
(narrator) What does the man imply?
(A) The woman should check the bus schedule.*
(B) The buses stop running on Fridays.
(C) The bus doesn’t stop at the corner.
(D) The schedule on the corner is out-of-date.

Example 2:
(woman) I need to be in the city by 9 a.m. to get to a 9:30 [nine-thirty] doctor’s appointment... Do you think I should take the bus or the train?
(man) Let’s see ... the bus doesn’t arrive till 9:45 [nine-forty-five]... Oh! But the train gets in at quarter to nine.
(narrator) What does the man suggest the woman do?
(A) Reschedule her appointment.
(B) Travel by bus.
(C) Meet him at the bus station.
(D) Take the train to the city.*

V49: The Key Seems to Be Inconsistent With the Content of the Dialogue.
Examples of items coded for this variable are given below.

59 1. In a number of items where the narrator asks about what the second speaker assumed, the key seems to be inconsistent with what is said in the dialogue. In the example below, there is an apparent inconsistency between the key (“Someone would drive them (the cousins) home”) and “So they (the cousins) didn’t manage to get a lift after all” in the dialogue.
(man) Your cousins just called. They’re stranded at the beach.
(woman) So they didn’t manage to get a lift after all.
(narrator) What had the woman assumed about her cousins?
(A) Their friends would take them to the beach.
(B) They wouldn’t mind taking the bus.
(C) Someone would drive them home.*
(D) They wouldn’t be able to find a phone.

2. In a number of dialogues that involve sarcasm, the key seems to be inconsistent with what is said in the dialogue. In some of these cases, there is apparent praise of someone or something in the dialogue, whereas there is criticism in the key.
(man) Can you believe it? Now we’re supposed to bring a note from our instructor every single time we want to use the computer!
(woman) [sarcastically] I’ll bet that was another one of Mike’s brilliant ideas!
(narrator) What does the woman imply about Mike?
(A) He often makes foolish suggestions.*
(B) His instructor won’t give him a note.
(C) He should try using the computer himself.
(D) He is a very good instructor.

3. Another example of where the key seems to be inconsistent with what is said in the dialogue is where a seemingly negative response to a request is actually a positive one.

60 (woman) Mind if I borrow your economics notes for a while?
(man) Not at all.
(narrator) What does the man mean?
(A) He’ll only give her part of his notes.
(B) He doesn’t know anything about economics.
(C) He’s not taking an economics class.
(D) He’s happy to lend her his notes.*

Coding instructions for V49. If the key seems to be inconsistent with what is stated in the dialogue, code 1; else 0.
Additional coding instructions for V49. This code is NOT assigned if a statement in the dialogue appears to be inconsistent with a later statement in the dialogue itself, as in the example below:
(woman) A lot of people were excited about the class election.
(man) But they didn’t turn out to vote, did they?
(narrator) What does the man imply about the students?
(A) They weren’t really interested in the election.*
(B) They didn’t vote for the best people.
(C) Their votes weren’t counted.
(D) They remained enthusiastic about the candidates.

61 Appendix B

Instructions for Coding Lexical Overlap

Only words with helpful lexical overlap are coded, that is, if the key has lexical overlap with the dialogue that is identical to the lexical overlap of all three distracters, it is not coded for lexical overlap. For example, in the item below, the word “Nancy,” which appears in the dialogue, is common to all four options; this word is not coded for lexical overlap.
(man) We got a thank-you note from Nancy today. She said she’s already worn the scarf we sent.
(woman) That’s great. I wasn’t sure if she’d wear red.
(narrator) What had the woman been concerned about?
(A) Nancy wouldn’t send a thank-you note.
(B) Nancy hadn’t received the scarf.
(C) Nancy wouldn’t like the gift.*
(D) Nancy doesn’t wear scarves.

The instructions below typically refer to lexical overlap between words in the dialogue and words in the key. It should be noted that the instructions apply equally well to lexical overlap between words in the dialogue and words in the distracters.

I. For content words (i.e., nouns, main verbs, adjectives, and adverbs), use the instructions below to determine whether there is lexical overlap between a word in the key and a word in the dialogue.

1. Lexical overlap between a word in the key and a word in the dialogue is coded if the root of the words is the same; for example, “expecting” and “expected” would be coded as lexically overlapping words because both share the same root (i.e., “expect”). In the example below, lexical overlap is coded between the word “reading” in the dialogue and the word “read” in the key because both have the same root (“read”). There is also lexical overlap in this item between the word “page” in the dialogue and the identical word “page” in the key.
(man) You’ve certainly been reading that one page for a long time now.

62 (woman) Well, I’m being tested on it tomorrow.
(narrator) What does the woman imply?
(A) She’s reading a very long book.
(B) The man is mistaken.
(C) She needs to read the page carefully.*
(D) She’s working on a long assignment.

2. To code lexical overlap between a word in the key and a word in the dialogue, the words need to have the same or similar meanings; for example, the word “left,” when used to refer to a direction, would NOT be coded as having lexical overlap with the word “left” when it is the past tense of the word “leave.” In the following item, lexical overlap is NOT coded between the word “go” in the key and the word “going” in the dialogue, since these two forms of the word “go” have quite different meanings.
(woman A) That famous violinist our professor was talking about is going to be the soloist in next week’s concert!
(woman B) Great! I don’t want to miss it. Where can we get tickets?
(narrator) What will the speakers probably do next week?
(A) Find out where their professor is going to perform.
(B) Go to a concert.*
(C) Perform in a musical recital.
(D) Interview the violinist.

3. If a word appears twice in a dialogue but refers to two different things, lexical overlap is only coded between the word in the key and the word with the same referent in the dialogue. In the example below, the word “salad” refers to two different things in the dialogue. One only codes for lexical overlap between the word “salad” in the key and the word “salad” spoken by the second speaker because these two words have the same referent (i.e., tuna salad), whereas one does NOT code for lexical overlap between the word “salad” in the key and the word “salad” spoken by the first

63 speaker, since in the key the word “salad” refers to tuna salad whereas the word “salad” spoken by the first speaker refers to a different referent, namely, chicken salad.
(man) Are you sure this is what I ordered? This looks like chicken salad.
(woman) Oh, I’m sorry. You ordered the tuna salad, didn’t you? I’ll be right back with it.
(narrator) What does the woman mean?
(A) She wants to eat chicken salad.
(B) The chicken salad is gone.
(C) She dropped the man’s food.
(D) She’ll bring the tuna salad.*

4. A word in the key is coded as having lexical overlap with a word in the dialogue if the same word appears as part of a compound word in the dialogue or vice-versa. In the example below, lexical overlap is coded between the word “hall” in the key and “hall” in the compound word “hallway” in the dialogue.
(man A) I can hardly read because it’s so dark in this classroom.
(woman B) It is in the hallway, too.
(narrator) What does the woman mean?
(A) The hall is also dark.*
(B) It’s difficult to read while class is going on.
(C) The reading assignment was too long.
(D) All the classrooms are the same.

5. Lexical overlap is coded between a word that is commonly used as a substitute for a longer word of which it is a part and the longer word itself. In the example below, lexical overlap is coded between the word “dorm” in the dialogue and the word “dormitory” in the key, since “dorm” is part of the longer word “dormitory” and is frequently used instead of the longer word.

64 (woman) You know, the noise in my dorm has really gotten out of control. My roommate and I can rarely get to sleep before midnight.
(man) Why don’t you take the problem up with the dorm supervisor?
(narrator) What does the man suggest the woman do?
(A) Discuss the situation with the person in charge of the dormitory.*
(B) Ask her roommate not to make so much noise.
(C) Go to bed after midnight.
(D) Send a letter to the residents.

II. For function words (i.e., determiners, auxiliary verbs, conjunctions, prepositions, and pronouns), use the instructions below to determine whether there is lexical overlap between a word in the key and a word in the dialogue.

1. Determiners such as “a” and “the” in the key are coded as having lexical overlap with the same words in the dialogue only when they directly precede the same content word. For example, if “the dog” appears in the key and “the dog” also appears in the dialogue, both words are coded as having lexical overlap.

a) In the example below, lexical overlap is coded between the words “the party” in the key and the same words “the party” in the dialogue.
(man) My math assignment’s due tomorrow morning and I haven’t even started it yet.
(woman) I’ll miss you at the party tonight.
(narrator) What does the woman imply?
(A) The party will be crowded.
(B) The man will do his assignment before the party.
(C) She’s not going to the party.
(D) The man won’t be able to go to the party.*

b) In the example below, lexical overlap is only coded between the word “machine” in the key and the word “machine” in the dialogue. Lexical

65 overlap is NOT coded between the word “the” in the key and the word “the” in the dialogue because the word “the” in the dialogue does not directly precede the word “machine.”
(man) I can’t seem to get the copy machine to work.
(woman) Have you checked the switch?
(narrator) What does the woman imply?
(A) The machine works like that other one.
(B) The man should change machines.
(C) The machine might not be turned on.*
(D) The man might be charged for the copies.

2. Auxiliary verbs in the key are coded as having lexical overlap with the dialogue only when they have the same function in the key as in the dialogue, that is, they precede the same or similar content. In the example below, the auxiliary verb “hasn’t” precedes content in the key that is similar to the content it precedes in the dialogue.
(woman) Has Alice decided on a major yet? I know she was thinking about American history.
(man) She has so many interests—as far as I know she hasn’t been able to make up her mind.
(narrator) What does the man say about Alice?
(A) She isn’t interested in being a historian.
(B) She hasn’t chosen a course of study.*
(C) She’s studying American history.
(D) She’s a very good student.

Additional coding instructions for auxiliary verbs. The above instructions also apply to contracted auxiliary verbs (e.g., “’ll” as in “she’ll” or “I’ll”).

3. Forms of the verb “to be” in the key are coded as having lexical overlap with the dialogue only when they have the same function in the key as in the dialogue, that is, they precede the same or similar content.

66 a) In the example below, lexical overlap is coded between the verb “been” in the key and the verb “been” spoken by the second speaker in the dialogue because “been” is followed by similar content in both cases. (Lexical overlap is also coded for this item between the word “paper” in the key and the word “paper” in the dialogue.)
(woman) I haven’t seen you at the student center all week. Have you been sick?
(man) I’ve been overwhelmed with my history paper.
(narrator) What does the man mean?
(A) He decided to attend extra history classes.
(B) He hopes to meet the woman at the student center.
(C) He was too sick to work on his paper.
(D) He’s been busy working on his paper.*

b) In the example below, lexical overlap is NOT coded between the verb “is” in the key and the verb “is” spoken by the first speaker in the dialogue because the content following the verb is quite different in the two cases.
(woman) This is the car you bought? I’ve never seen such an old jalopy!
(man) It may not look like much, but it gets me where I’m going.
(narrator) What does the man mean?
(A) The car is dependable.*
(B) The car isn’t very old.
(C) This car is better than his old one.
(D) He paid too much for the car.

4. Prepositions in the key are coded as having lexical overlap with the same preposition in the dialogue when the preposition has the same function in the key as it has in the dialogue (i.e., when the preposition precedes the same word, or when it precedes a synonym of the word, or when it precedes a word that refers to the same thing in the

67 key as it does in the dialogue). In the example below, lexical overlap is coded between the preposition “with” in the key and the preposition “with” spoken by the second speaker, since both instances are followed by words that refer to the same thing. (Lexical overlap is also coded for this item between the word “ski” in the key and the word “skiing” in the dialogue.)
(woman) Can you come skiing with me this weekend, or do you have to study for your exams?
(man) I’ll come along with you, but I’m so tired from studying that I’m afraid I won’t be doing much skiing.
(narrator) What will the man probably do?
(A) Stay home and study all weekend.
(B) Stay home and rest all weekend.
(C) Go with the woman and ski all weekend.
(D) Go with the woman and rest rather than ski.*

5. Pronouns in the key are coded as having lexical overlap with the same pronouns in the dialogue when the pronoun refers to the same thing in both cases. In the example below, the word “she” in the dialogue and the word “she” in the key both refer to the same person, Laura.
(woman A) What’s Laura doing here today? I thought she was supposed to be out of the office on Mondays.
(woman B) She decided she’d rather have Fridays off instead.
(narrator) What can be inferred about Laura?
(A) She has changed her schedule.*
(B) She was sick on Friday.
(C) She works less than she used to.
(D) Her vacation started on Monday.

6. Conjunctions in the key are coded as having lexical overlap with the same conjunctions in the dialogue when the conjunction has the same function in the key

68 as it has in the dialogue, that is, when the conjunction precedes the same or similar content. In the example below, lexical overlap is coded between the word “why” in the key and the word “why” in the dialogue because the words precede similar content.
(man) Joe took a taxi home alone ten minutes ago.
(woman) I wonder why he didn’t wait for me to go with him.
(narrator) What does the woman mean?
(A) She wanted to visit Joe’s home.
(B) She doesn’t understand why Joe left without her.*
(C) Joe should take a taxi to her house.
(D) Joe didn’t want to take the taxi to his house.

7. Negative forms of verbs such as “can’t,” “doesn’t,” and “haven’t” are not coded as having lexical overlap with positive forms of these verbs; that is, lexical overlap is not coded between “can’t” and “can.”
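The same-root rule for content words (I.1) and the negative-form rule (II.7) are mechanical enough to sketch in code. The sketch below is only a rough illustration under loud assumptions: `crude_root` is a hypothetical suffix stripper invented here, not the procedure the coders used, and it only proposes candidate overlaps. The rules on meaning (I.2), referent (I.3), compounds and shortened words (I.4, I.5), and function words (II.1 to II.6) all call for human judgment and are not modeled, and the caller is expected to pass in content words only.

```python
from typing import List

# Hypothetical, deliberately tiny list of inflectional suffixes.
SUFFIXES = ("ing", "ed", "s")


def crude_root(word: str) -> str:
    """Strip punctuation and at most one common suffix (illustrative only)."""
    w = word.lower().strip(".,!?*")
    for suffix in SUFFIXES:
        # Keep at least three letters so short words are left alone.
        if w.endswith(suffix) and len(w) - len(suffix) >= 3:
            return w[: -len(suffix)]
    return w


def shares_root(key_word: str, dialogue_word: str) -> bool:
    """Rule I.1: candidate overlap when the two words have the same root.

    Contractions such as "can't" are never stripped, so a negative form
    does not match its positive form, in line with rule II.7.
    """
    return crude_root(key_word) == crude_root(dialogue_word)


def overlap_candidates(key_words: List[str], dialogue_words: List[str]) -> List[str]:
    """Return key words whose root matches the root of some dialogue word."""
    dialogue_roots = {crude_root(w) for w in dialogue_words}
    return [w for w in key_words if crude_root(w) in dialogue_roots]


# Content words picked by hand from the "reading"/"page" item in rule I.1.
key_content = ["needs", "read", "page", "carefully"]
dialogue_content = ["certainly", "reading", "page", "long", "time"]
candidates = overlap_candidates(key_content, dialogue_content)
```

For the item quoted under rule I.1, the candidates are "read" and "page," matching the overlap the instructions describe; a coder would still need to check each candidate against the meaning and referent rules before counting it.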

69 Test of English as a Foreign Language®
PO Box 6155
Princeton, NJ 08541-6155
USA

To obtain more information about TOEFL programs and services, use one of the following:
Phone: 1-877-863-3546 (US, US Territories*, and Canada)
1-609-771-7100 (all other locations)
Email: [email protected]
Web site: www.ets.org/toefl

* American Samoa, Guam, Puerto Rico, and US Virgin Islands
