A Corpus-based Evaluation of a High-stakes EFL Exam

Rafatbakhsh, Elaheh; Ahmadi, Alireza

doi:10.22034/jsllt.2024.21052.1030

Document Type : Original Article

Authors

Department of Foreign Languages and Linguistics, Shiraz University, Shiraz, Iran

Abstract

High-stakes assessments play a significant role in people’s lives, and their results largely shape individuals’ future social and financial prospects. Corpus linguistics has recently been used to inform the development and validation of such tests. This study aimed to identify the degree of typicality of the vocabulary items tested in the English proficiency subtest of the Master of Arts/Science Iranian University Entrance Exam. To this end, the vocabulary options and collocations in 20 test versions were extracted, and their frequency of occurrence in the Corpus of Contemporary American English (COCA) was examined using a purpose-written computer program. The results indicated that the options’ frequency in the academic genre was less dominant than would be expected of a test designed for academic purposes. The findings also revealed inconsistencies in option frequencies across the parallel test versions. Furthermore, some options and collocations were atypical, with zero or close to zero occurrences in the corpus. The study suggests that test developers supplement their intuition with frequency information from corpora and various wordlists for more robust vocabulary assessment.
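The frequency check described in the abstract can be illustrated with a minimal sketch. This is not the authors’ program. The table `TOY_FREQUENCIES`, the genre labels, and the functions `academic_share` and `flag_atypical` are hypothetical names introduced here. The counts are invented placeholders, not COCA figures. The sketch assumes that "dominance" of the academic genre can be read as the share of an item's occurrences that fall in that genre, and that "atypical" means a total count at or near zero.

```python
from typing import Dict, List

GenreCounts = Dict[str, int]

# Hypothetical per-genre counts: placeholder numbers, NOT real COCA data.
TOY_FREQUENCIES: Dict[str, GenreCounts] = {
    "analyze": {"academic": 120, "spoken": 30, "fiction": 10},
    "gobble": {"academic": 1, "spoken": 12, "fiction": 27},
    "make a decision": {"academic": 40, "spoken": 55, "fiction": 5},
}


def academic_share(counts: GenreCounts) -> float:
    """Return the fraction of an item's occurrences found in the academic genre."""
    total = sum(counts.values())
    return counts.get("academic", 0) / total if total else 0.0


def flag_atypical(
    items: List[str], table: Dict[str, GenreCounts], min_total: int = 1
) -> List[str]:
    """Return items whose total count across genres is below min_total.

    Items missing from the table are treated as having zero occurrences.
    """
    flagged = []
    for item in items:
        counts = table.get(item.lower(), {})
        if sum(counts.values()) < min_total:
            flagged.append(item)
    return flagged


if __name__ == "__main__":
    for item, counts in TOY_FREQUENCIES.items():
        print(f"{item}: academic share = {academic_share(counts):.2f}")
    print(flag_atypical(["analyze", "quixotize"], TOY_FREQUENCIES))
```

In practice the table would be filled from corpus queries or a licensed frequency list, and the `min_total` threshold would need to be justified against the size of the corpus.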

Main Subjects

Language Assessment and Evaluation

Journal of Studies in Language Learning and Teaching

Volume 1, Issue 2
July 2024
Pages 211-225
