Published by De Gruyter Mouton, June 22, 2018

Beginning and intermediate L2 writer’s use of N-grams: an association measures study

  • Jamie Garner, Scott Crossley and Kristopher Kyle

Abstract

A common approach to analyzing phraseological knowledge in first language (L1) and second language (L2) learners is to employ raw frequency data. Several studies have also analyzed n-gram use on the basis of statistical association scores. Results from n-gram studies have found significant differences between L1 and L2 writers and between intermediate and advanced L2 writers in terms of their bigram use. The current study expands on this research by investigating the connection between bigram and trigram association measures and human judgments of L2 writing quality. Using multiple statistical association indices, it examines bigram and trigram use by beginner and intermediate L1 Korean learners of English in English placement test essays. Results of a logistic regression indicated that intermediate writers employed a greater number of strongly associated academic bigrams and spoken trigrams. These findings have important implications for understanding lexical development in L2 writers and notions of writing proficiency.

Published Online: 2018-06-22
Published in Print: 2020-03-26

© 2020 Walter de Gruyter GmbH, Berlin/Boston

Downloaded on 23.4.2024 from https://www.degruyter.com/document/doi/10.1515/iral-2017-0089/html