-
Evaluating methodological enhancements to the Yes/No Angoff standard-setting method in language proficiency assessment Language Testing (IF 2.4) Pub Date : 2024-02-12 Tia M. Fechter, Heeyeon Yoon
This study evaluated the efficacy of two proposed methods in an operational standard-setting study conducted for a high-stakes language proficiency test of the U.S. government. The goal was to seek low-cost modifications to the existing Yes/No Angoff method to increase the validity and reliability of the recommended cut scores using a convergent mixed-methods study design. The study used the Yes/No
-
A shortened test is feasible: Evaluating a large-scale multistage adaptive English language assessment Language Testing (IF 2.4) Pub Date : 2024-02-08 Shangchao Min, Kyoungwon Bishop
This paper evaluates the multistage adaptive test (MST) design of a large-scale academic language assessment (ACCESS) for Grades 1–12, with an aim to simplify the current MST design, using both operational and simulated test data. Study 1 explored the operational population data (1,456,287 test-takers) of the listening and reading tests of MST ACCESS in the 2018–2019 school year to evaluate the MST
-
Setting standards for a diagnostic test of aviation English for student pilots Language Testing (IF 2.4) Pub Date : 2024-02-06 Maria Treadaway, John Read
Standard-setting is an essential component of test development, supporting the meaningfulness and appropriate interpretation of test scores. However, in the high-stakes testing environment of aviation, standard-setting studies are underexplored. To address this gap, we document two stages in the standard-setting procedures for the Overseas Flight Training Preparation Test (OFTPT), a diagnostic English
-
Korean Syntactic Complexity Analyzer (KOSCA): An NLP application for the analysis of syntactic complexity in second language production Language Testing (IF 2.4) Pub Date : 2024-02-06 Haerim Hwang, Hyunwoo Kim
Given the lack of computational tools available for assessing second language (L2) production in Korean, this study introduces a novel automated tool called the Korean Syntactic Complexity Analyzer (KOSCA) for measuring syntactic complexity in L2 Korean production. As an open-source graphic user interface (GUI) developed in Python, KOSCA provides seven indices of syntactic complexity, including traditional
-
The development of a Chinese vocabulary proficiency test (CVPT) for learners of Chinese as a second/foreign language Language Testing (IF 2.4) Pub Date : 2024-01-10 Haiwei Zhang, Peng Sun, Yaowaluk Bianglae, Winda Widiawati
In order to address the needs of the continually growing number of Chinese language learners, the present study developed and presented initial validation of a 100-item Chinese vocabulary proficiency test (CVPT) for learners of Chinese as a second/foreign language (CS/FL) using Item Response Theory among 170 CS/FL learners from Indonesia and 354 CS/FL learners from Thailand. Participants were required
-
Open Science should be welcomed by test providers but grounded in pragmatic caution: A response to Winke Language Testing (IF 2.4) Pub Date : 2024-01-04 Tony Clark, Emma Bruce
This article is temporarily under embargo.
-
Implementation of an accommodations policy for candidates with diverse needs in a large-scale testing system Language Testing (IF 2.4) Pub Date : 2023-05-16 Johanna Motteram, Richard Spiby, Gemma Bellhouse, Katarzyna Sroka
This article describes the implementation of a special accommodations policy for a suite of localised English language and numeracy tests, the Workplace Literacy and Numeracy (WPLN) Assessments. Th...
-
The relationship between written discourse features and integrated listening-to-write scores for adolescent English language learners Language Testing (IF 2.4) Pub Date : 2023-05-13 Ray J. T. Liao, Renka Ohta, Kwangmin Lee
As integrated writing tasks in large-scale and classroom-based writing assessments have risen in popularity, research studies have increasingly concentrated on providing validity evidence. Given th...
-
English foreign language reading and spelling diagnostic assessments informing teaching and learning of young learners Language Testing (IF 2.4) Pub Date : 2023-04-29 Janina Kahn-Horwitz, Zahava Goldstein
In order to inform English foreign language (EFL) diagnostic assessment of literacy, this study examined the extent to which 175 first-language Hebrew-speaking EFL young learners from fifth to tent...
-
Critical discursive approaches to evaluating policy-driven testing: Social impact as a target for validation Language Testing (IF 2.4) Pub Date : 2023-04-27 Dongil Shin
This paper addresses the intersection of testing and policy, situating test-driven impact and validation within the context of policy-led educational reform in Korea. I will briefly review the exis...
-
Speaking performances, stakeholder perceptions, and test scores: Extrapolating from the Duolingo English test to the university Language Testing (IF 2.4) Pub Date : 2023-04-24 Daniel R. Isbell, Dustin Crowther, Hitoshi Nishizawa
The extrapolation of test scores to a target domain—that is, association between test performances and relevant real-world outcomes—is critical to valid score interpretation and use. This study exa...
-
Establishing meaning recall and meaning recognition vocabulary knowledge as distinct psychometric constructs in relation to reading proficiency Language Testing (IF 2.4) Pub Date : 2023-04-24 Jeffrey Stewart, Henrik Gyllstad, Christopher Nicklin, Stuart McLean
The purpose of this paper is to (a) establish whether meaning recall and meaning recognition item formats test psychometrically distinct constructs of vocabulary knowledge which measure separate sk...
-
Modeling local item dependence in C-tests with the loglinear Rasch model Language Testing (IF 2.4) Pub Date : 2023-04-15 Purya Baghaei, Karl Bang Christensen
C-tests are gap-filling tests mainly used as rough and economical measures of second-language proficiency for placement and research purposes. A C-test usually consists of several short independent...
-
Examining the predictive validity of the Duolingo English Test: Evidence from a major UK university Language Testing (IF 2.4) Pub Date : 2023-04-03 Talia Isaacs, Ruolin Hu, Danijela Trenkic, Julia Varga
The COVID-19 pandemic has changed the university admissions and proficiency testing landscape. One change has been the meteoric rise in use of the fully automated Duolingo English Test (DET) for un...
-
The distribution of cognates and their impact on response accuracy in the EIKEN tests Language Testing (IF 2.4) Pub Date : 2023-03-26 David Allen, Keita Nakamura
Although there is abundant evidence for the use of first-language (L1) knowledge by bilinguals when using a second language (L2), investigation into the impact of L1 knowledge in large-scale L2 lan...
-
Measuring the development of general language skills in English as a foreign language—Longitudinal invariance of the C-test Language Testing (IF 2.4) Pub Date : 2023-03-25 Birger Schnoor, Johannes Hartig, Thorsten Klinger, Alexander Naumann, Irina Usanova
Research on assessing English as a foreign language (EFL) development has been growing recently. However, empirical evidence from longitudinal analyses based on substantial samples is still needed....
-
Operationalizing the reading-into-writing construct in analytic rating scales: Effects of different approaches on rating Language Testing (IF 2.4) Pub Date : 2023-03-20 Santi B. Lestari, Tineke Brunfaut
Assessing integrated reading-into-writing task performances is known to be challenging, and analytic rating scales have been found to better facilitate the scoring of these performances than other ...
-
Assessment of fluency in the Test of English for Educational Purposes Language Testing (IF 2.4) Pub Date : 2023-03-13 Parvaneh Tavakoli, Gill Kendon, Svetlana Mazhurnaya, Anna Ziomek
The main aim of this study was to investigate how oral fluency is assessed across different levels of proficiency in the Test of English for Educational Purposes (TEEP). Working with data from 56 t...
-
The relationship among accent familiarity, shared L1, and comprehensibility: A path analysis perspective Language Testing (IF 2.4) Pub Date : 2023-03-13 Yongzhi Miao
Scholars have argued for the inclusion of different spoken varieties of English in high-stakes listening tests to better represent the global use of English. However, doing so may introduce additio...
-
Strategy use in a spoken dialog system–delivered paired discussion task: A stimulated recall study Language Testing (IF 2.4) Pub Date : 2023-03-07 Nazlinur Gokturk, Evgeny Chukharev-Hudilainen
With recent technological advances, researchers have begun to explore the potential use of spoken dialog systems (SDSs) for L2 oral communication assessment. While several studies support the feasi...
-
Proficiency at the lexis–grammar interface: Comparing oral versus written French exam tasks Language Testing (IF 2.4) Pub Date : 2023-03-07 Nathan Vandeweerd, Alex Housen, Magali Paquot
This study investigates whether re-thinking the separation of lexis and grammar in language testing could lead to more valid inferences about proficiency across modes. As argued by Römer, typical s...
-
Investigating the impact of self-pacing on the L2 listening performance of young learner candidates with differing L1 literacy skills Language Testing (IF 2.4) Pub Date : 2023-03-02 Kathrin Eberharter, Judit Kormos, Elisa Guggenbichler, Viktoria S. Ebner, Shungo Suzuki, Doris Moser-Frötscher, Eva Konrad, Benjamin Kremmel
In online environments, listening involves being able to pause or replay the recording as needed. Previous research indicates that control over the listening input could improve the measurement acc...
-
Universal tools activation in English language proficiency assessments: A comparison of Grades 1–12 English learners with and without disabilities Language Testing (IF 2.4) Pub Date : 2023-02-02 Ahyoung Alicia Kim, Meltem Yumsek, Jason A. Kemp, Mark Chapman, H. Gary Cook
English learners (ELs) comprise approximately 10% of kindergarten to Grade 12 students in US public schools, with about 15% of ELs identified as having disabilities. English language proficiency (E...
-
L2 and L1 semantic context indices as automated measures of lexical sophistication Language Testing (IF 2.4) Pub Date : 2023-02-02 Kátia Monteiro, Scott Crossley, Robert-Mihai Botarleanu, Mihai Dascălu
Lexical frequency benchmarks have been extensively used to investigate second language (L2) lexical sophistication, especially in language assessment studies. However, indices based on semantic co-...
-
Linking scores from two written receptive English academic vocabulary tests—The VLT-Ac and the AVT Language Testing (IF 2.4) Pub Date : 2023-01-12 Marcus Warnby, Hans Malmström, Kajsa Yang Hansen
The academic section of the Vocabulary Levels Test (VLT-Ac) and the Academic Vocabulary Test (AVT) both assess meaning-recognition knowledge of written receptive academic vocabulary, deemed central...
-
Measuring bilingual language dominance: An examination of the reliability of the Bilingual Language Profile Language Testing (IF 2.4) Pub Date : 2023-01-12 Daniel J. Olson
Measuring language dominance, broadly defined as the relative strength of each of a bilingual’s two languages, remains a crucial methodological issue in bilingualism research. While various methods...
-
The vexing problem of validity and the future of second language assessment Language Testing (IF 2.4) Pub Date : 2023-01-11 Vahid Aryadoust
Construct validity and building validity arguments are some of the main challenges facing the language assessment community. The notion of construct validity and validity arguments arose from resea...
-
Epilogue—Note from an outgoing editor Language Testing (IF 2.4) Pub Date : 2023-01-11 Luke Harding
In this brief epilogue, outgoing editor Luke Harding reflects on his time as editor and considers the future of Language Testing.
-
Reframing the discourse and rhetoric of language testing and assessment for the public square Language Testing (IF 2.4) Pub Date : 2023-01-11 Lynda Taylor
As applied linguists and language testers, we are in the business of “doing language”. For many of us, language learning is a lifelong passion, and we invest similar enthusiasm in our language asse...
-
Administration, labor, and love Language Testing (IF 2.4) Pub Date : 2023-01-11 April Ginther
Great opportunities for language testing practitioners are enabled through language program administration. Local language tests lend themselves to multiple purposes—for placement and diagnosis, as...
-
Future challenges and opportunities in language testing and assessment: Basic questions and principles at the forefront Language Testing (IF 2.4) Pub Date : 2023-01-11 Tineke Brunfaut
In this invited Viewpoint on the occasion of the 40th anniversary of the journal Language Testing, I argue that at the core of future challenges and opportunities for the field—both in scholarly an...
-
Towards a new sophistication in vocabulary assessment Language Testing (IF 2.4) Pub Date : 2023-01-11 John Read
Published work on vocabulary assessment has grown substantially in the last 10 years, but it is still somewhat outside the mainstream of the field. There has been a recent call for those developing...
-
Reflections on the past and future of language testing and assessment: An emerging scholar’s perspective Language Testing (IF 2.4) Pub Date : 2023-01-11 J. Dylan Burton
In its 40th year, Language Testing journal has served as the flagship journal for scholars, researchers, and practitioners in the field of language testing and assessment. This viewpoint piece, wri...
-
Test design and validity evidence of interactive speaking assessment in the era of emerging technologies Language Testing (IF 2.4) Pub Date : 2023-01-11 Soo Jung Youn
As access to smartphones and emerging technologies has become ubiquitous in our daily lives and in language learning, technology-mediated social interaction has become common in teaching and assess...
-
Construct validity and fairness of an operational listening test with World Englishes Language Testing (IF 2.4) Pub Date : 2023-01-04 Hitoshi Nishizawa
In this study, I investigate the construct validity and fairness pertaining to the use of a variety of Englishes in listening test input. I obtained data from a post-entry English language placemen...
-
But who trains the language teacher educator who trains the language teacher? An empirical investigation of Chilean EFL teacher educators’ language assessment literacy Language Testing (IF 2.4) Pub Date : 2022-12-27 Salomé Villa Larenas, Tineke Brunfaut
Research has shown that language teachers typically feel underprepared for assessment aspects of their job. One reason may relate to how teacher education programmes prepare future teachers in this...
-
Towards more valid scoring criteria for integrated reading-writing and listening-writing summary tasks Language Testing (IF 2.4) Pub Date : 2022-12-12 Sathena Chan, Lyn May
Despite the increased use of integrated tasks in high-stakes academic writing assessment, research on rating criteria which reflect the unique construct of integrated summary writing skills is comp...
-
The typology of second language listening constructs: A systematic review Language Testing (IF 2.4) Pub Date : 2022-12-07 Vahid Aryadoust, Lan Luo
This study reviewed conceptualizations and operationalizations of second language (L2) listening constructs. A total of 157 peer-reviewed papers published in 19 journals in applied linguistics were...
-
Temporal fluency and floor/ceiling scoring of intermediate and advanced speech on the ACTFL Spanish Oral Proficiency Interview–computer Language Testing (IF 2.4) Pub Date : 2022-11-09 Troy L. Cox, Alan V. Brown, Gregory L. Thompson
The rating of proficiency tests that use the Inter-agency Roundtable (ILR) and American Council on the Teaching of Foreign Languages (ACTFL) guidelines claims that each major level is based on hier...
-
Challenges in rating signed production: A mixed-methods study of a Swiss German Sign Language form-recall vocabulary test Language Testing (IF 2.4) Pub Date : 2022-09-21 Aaron Olaf Batty, Tobias Haug, Sarah Ebling, Katja Tissi, Sandra Sidler-Miserez
Sign languages present particular challenges to language assessors in relation to variation in signs, weakly defined citation forms, and a general lack of standard-setting work even in long-establi...
-
L2 English vocabulary breadth and knowledge of derivational morphology: One or two constructs? Language Testing (IF 2.4) Pub Date : 2022-09-02 Dmitri Leontjev, Ari Huhta, Asko Tolvanen
Derivational morphology (DM) and how it can be assessed have been investigated relatively rarely in language learning and testing research. The goal of this study is to add to the understanding of ...
-
A meta-analysis on the predictive validity of English language proficiency assessments for college admissions Language Testing (IF 2.4) Pub Date : 2022-08-16 Samuel Dale Ihlenfeldt, Joseph A. Rios
For institutions where English is the primary language of instruction, English assessments for admissions such as the Test of English as a Foreign Language (TOEFL) and International English Languag...
-
Comparing holistic and analytic marking methods in assessing speech act production in L2 Chinese Language Testing (IF 2.4) Pub Date : 2022-08-09 Shuai Li, Ting Wen, Xian Li, Yali Feng, Chuan Lin
This study compared holistic and analytic marking methods for their effects on parameter estimation (of examinees, raters, and items) and rater cognition in assessing speech act production in L2 Ch...
-
Who succeeds and who fails? Exploring the role of background variables in explaining the outcomes of L2 language tests Language Testing (IF 2.4) Pub Date : 2022-07-24 Ann-Kristin Helland Gujord
This study explores whether and to what extent the background information supplied by 10,155 immigrants who took an official language test in Norwegian affected their chances of passing one, two, o...
-
Local tests, local contexts Language Testing (IF 2.4) Pub Date : 2022-07-18 Slobodanka Dimova, Xun Yan, April Ginther
We, the co-editors of this special issue, have collaborated for many years on issues related to the design, development, and implementation of local language tests. In our professional contexts, we each hold academic positions at institutions of higher education that include the standard expectations for research and teaching. In addition, we have also accepted responsibility for the development and
-
A sequential approach to detecting differential rater functioning in sparse rater-mediated assessment networks Language Testing (IF 2.4) Pub Date : 2022-05-12 Stefanie A. Wind
Researchers frequently evaluate rater judgments in performance assessments for evidence of differential rater functioning (DRF), which occurs when rater severity is systematically related to construct-irrelevant student characteristics after controlling for student achievement levels. However, researchers have observed that methods for detecting DRF may be limited in sparse rating designs, where it
-
Using instructor judgment, learner corpora, and DIF to develop a placement test for Spanish L2 and heritage learners Language Testing (IF 2.4) Pub Date : 2022-05-01 Melissa A. Bowles
This study details the development of a local test designed to place university Spanish students (n = 719) into one of the four different course levels and to distinguish between traditional L2 learners and early bilinguals on the basis of their linguistic knowledge, regardless of the variety of Spanish they were exposed to. Early bilinguals include two groups—heritage learners (HLs), who were exposed
-
Local placement test retrofit and building language assessment literacy with teacher stakeholders: A case study from Colombia Language Testing (IF 2.4) Pub Date : 2022-04-14 Gerriet Janssen
This article provides a single, common-case study of a test retrofit project at one Colombian university. It reports on how the test retrofit project was carried out and describes the different areas of language assessment literacy the project afforded local teacher stakeholders. This project was successful in that it modified the test constructs and item types, while drawing stronger connections between
-
Test Review: The International English Language Testing System (IELTS) Language Testing (IF 2.4) Pub Date : 2022-04-04 John Read
International English Language Testing System (IELTS)
-
Book Review: Another Generation of Fundamental Considerations in Language Assessment: A Festschrift in Honor of Lyle F. Bachman Language Testing (IF 2.4) Pub Date : 2022-04-01 Ying Xu, Xiaodong Li
-
Review of the Japanese-Language Proficiency Test Language Testing (IF 2.4) Pub Date : 2022-03-09 Hitoshi Nishizawa, Daniel R. Isbell, Yuichi Suzuki
-
Book Review: Scoring Second Language Spoken and Written Performance: Issues, Options and Directions Language Testing (IF 2.4) Pub Date : 2022-03-03 Santi Budi Lestari, Kathrin Eberharter
The need to assess communicative language skills has continuously been on the rise in academic and professional settings. This trend forces language teachers and test developers alike to face the challenges of rating complex language performance in a valid and reliable manner. The recent publication Scoring Second Language Spoken and Written Performance offers a concise and much needed overview of
-
National assessment of foreign languages in Sweden: A multifaceted and collaborative venture Language Testing (IF 2.4) Pub Date : 2022-03-02 Gudrun Erickson, Linda Borger, Eva Olsson
The article addresses the local system of national assessment of foreign languages in Sweden, a contextually specific, large-scale system with a summative aim, but also a system aimed to support teachers in their continuous assessment and grading of their students’ competences. In the text, the educational context and the multifaceted nature of national assessment are described and discussed. Furthermore
-
How do raters learn to rate? Many-facet Rasch modeling of rater performance over the course of a rater certification program Language Testing (IF 2.4) Pub Date : 2022-03-01 Xun Yan, Ping-Lin Chuang
This study employed a mixed-methods approach to examine how rater performance develops during a semester-long rater certification program for an English as a Second Language (ESL) writing placement test at a large US university. From 2016 to 2018, we tracked three groups of novice raters (n = 30) across four rounds in the certification program. Using many-facet Rasch modeling, rater performance was
-
Psychometric approaches to analyzing C-tests Language Testing (IF 2.4) Pub Date : 2022-02-28 David Alpizar, Tongyun Li, John M. Norris, Lixiong Gu
The C-test is a type of gap-filling test designed to efficiently measure second language proficiency. The typical C-test consists of several short paragraphs with the second half of every second word deleted. The words with deleted parts are considered as items nested within the corresponding paragraph. Given this testlet structure, it is commonly taken for granted that the C-test design may violate
-
Revisiting English language proficiency and its impact on the academic performance of domestic university students in Singapore Language Testing (IF 2.4) Pub Date : 2022-02-28 Wenjin Vikki Bo, Mingchen Fu, Wei Ying Lim
The role of international students’ English language proficiency has been extensively researched to understand its impact on academic achievement in English-medium universities, mainly because of students’ non-English-speaking backgrounds. However, the relationship between language proficiency and academic achievement among English-speaking-background students remains under-researched, especially in
-
The use of generalizability theory in investigating the score dependability of classroom-based L2 reading assessment Language Testing (IF 2.4) Pub Date : 2022-02-28 Ray J. T. Liao
Among the variety of selected response formats used in L2 reading assessment, multiple-choice (MC) is the most commonly adopted, primarily due to its efficiency and objectiveness. Given the impact of assessment results on teaching and learning, it is necessary to investigate the degree to which the MC format reliably measures learners’ L2 reading comprehension in the classroom context. While researchers
-
Application of an Automated Essay Scoring engine to English writing assessment using Many-Facet Rasch Measurement Language Testing (IF 2.4) Pub Date : 2022-02-26 Kinnie Kin Yee Chan, Trevor Bond, Zi Yan
We investigated the relationship between the scores assigned by an Automated Essay Scoring (AES) system, the Intelligent Essay Assessor (IEA), and grades allocated by trained, professional human raters to English essay writing by instigating two procedures novel to written-language assessment: the logistic transformation of AES raw scores into hierarchically ordered grades, and the co-calibration of
-
Developing a local academic English listening test using authentic unscripted audio-visual texts Language Testing (IF 2.4) Pub Date : 2022-02-24 Yena Park, Senyung Lee, Sun-Young Shin
Despite consistent calls for authentic stimuli in listening tests for better construct representation, unscripted texts have been rarely adopted in high-stakes listening tests due to perceived inefficiency. This study details how a local academic listening test was developed using authentic unscripted audio-visual texts from the local target language use (TLU) domain without compromising the reliability
-
Register variation in spoken and written language use across technology-mediated and non-technology-mediated learning environments Language Testing (IF 2.4) Pub Date : 2022-02-20 Kristopher Kyle, Masaki Eguchi, Ann Tai Choe, Geoff LaFlair
In the realm of language proficiency assessments, the domain description inference and the extrapolation inference are key components of a validity argument. Biber et al.’s description of the lexicogrammatical features of the spoken and written registers in the T2K-SWAL corpus has served as support for the TOEFL iBT test’s domain description and extrapolation inferences. In the time since the T2K-SWAL