-
Tracking and analyzing recent developments in German-language online press in the face of the coronavirus crisis International Journal of Corpus Linguistics (IF 1.139) Pub Date : 2020-10-15 Sascha Wolfer, Alexander Koplenig, Frank Michaelis, Carolin Müller-Spitzer
The coronavirus pandemic may be the largest crisis the world has had to face since World War II. It does not come as a surprise that it is also having an impact on language as our primary communication tool. We present three inter-connected resources that are designed to capture and illustrate these effects on a subset of the German language: An RSS corpus of German-language newsfeeds (with freely
-
The translation of reporting verbs in Italian International Journal of Corpus Linguistics (IF 1.139) Pub Date : 2020-10-15 Lorenzo Mastropierro
Abstract This paper reports on a study of reporting verbs in the Harry Potter series and their translation in Italian. It offers quantitative and qualitative perspectives on how the English verbs have been translated by two Italian translators, who worked on different books of the series. This study first analyses verb usage across the three protagonists of the series (Harry, Ron, and Hermione) in
-
Shifts in signed media interpreting International Journal of Corpus Linguistics (IF 1.139) Pub Date : 2020-10-15 Ella Wehrmeyer
Abstract This study offers a unique contribution through the construction of an annotated text-based sign language interpreting corpus and its application in analyzing shifts (defined as deviations from source semantic content), which in turn enables researchers to identify and categorize interpreter strategies and norms. The corpus comprises ten half-hour news broadcasts in English and their simultaneously
-
Realizing an online conference International Journal of Corpus Linguistics (IF 1.139) Pub Date : 2020-10-15 Beatrix Busse, Ingo Kleiber
Abstract This paper aims to assist future organizers of international online conferences with designing and realizing these events. On the basis of the authors’ experience of having to move a corpus linguistics conference – originally planned as a physical event – into the digital space, this paper describes the conference’s organization and management structure, outlines the software and communication
-
Too early to say: The English too ADJ to V construction and models of cross-cultural communication styles International Journal of Corpus Linguistics (IF 1.139) Pub Date : 2020-10-15 Vladan Pavlovic
Abstract This paper studies the English too ADJ to V construction. It starts with a (multiple) distinctive collexeme analysis (as one of the subtypes of collostructional analysis) of the ADJ-V pairs appearing in the given construction in three regional varieties of English (American, British and Indian English) based on the GloWbE corpus. This analysis establishes the most distinctive and most strongly
-
Friginal, E. (2018). Corpus linguistics for English teachers: New tools, online resources, and classroom activities International Journal of Corpus Linguistics (IF 1.139) Pub Date : 2020-08-28 Marcus Callies, Tugba Simsek
This article reviews Corpus Linguistics for English Teachers: New Tools, Online Resources, and Classroom Activities
-
Adverb placement in EFL academic writing International Journal of Corpus Linguistics (IF 1.139) Pub Date : 2020-08-28 Tove Larsson, Marcus Callies, Hilde Hasselgård, Natalia Judith Laso, Sanne van Vuuren, Isabel Verdaguer, Magali Paquot
Abstract The present study looks at adverb placement in expert writing and in first-language and second-language novice spoken and written production. The extent to which first-language (L1) transfer is still present in advanced learners’ written production is also investigated. The study uses data from one expert corpus (LOCRA), two native-speaker student corpora (BAWE and LOCNEC) and two learner
-
Methodological issues in contrastive lexical bundle research International Journal of Corpus Linguistics (IF 1.139) Pub Date : 2020-08-28 Fan Pan, Randi Reppen, Douglas Biber
Abstract This study explores the influence of corpus design when comparing lexical bundle use across groups, examining how the number of texts and average length of texts can impact conclusions about group differences. The study compares the use of lexical bundles by L1-English versus L2-English writers, based on analysis of two sub-corpora of academic articles that are matched for discipline, writer
-
Turn structure and inserts International Journal of Corpus Linguistics (IF 1.139) Pub Date : 2020-08-28 Christoph Rühlemann
Abstract Turns-at-talk often do not start with their main business but rather with a pre-start (Sacks et al., 1974). This paper investigates the correlation of pre-starts with inserts, one of three major word classes (Biber et al., 1999). Based on the BNC’s mark-up, I investigate how inserts are positionally distributed in large amounts of turns of varied lengths. The analysis shows that inserts are
-
Key words when text forms the unit of study International Journal of Corpus Linguistics (IF 1.139) Pub Date : 2020-08-28 Stephen Jeaco
Abstract Throughout the social sciences, there has been growing pressure to present effect sizes when publishing empirical data (see American Psychological Association, 2001; Parsons & Nelson, 2004). While it seems indisputable that for the majority of quantitative research foci, effect size is an essential element of statistical analysis, this paper argues that specifically for key word analysis in
-
Noun phrase complexity in young Spanish EFL learners’ writing International Journal of Corpus Linguistics (IF 1.139) Pub Date : 2020-04-16 María Belén Díez-Bedmar, Pascual Pérez-Paredes
Abstract The research reported in this article examines Noun Phrase (NP) syntactic complexity in the writing of Spanish EFL secondary school learners in Grades 7, 8, 11 and 12 in the International Corpus of Crosslinguistic Interlanguage. Two methods were combined: a manual parsing of NPs and an automatic analysis of NP indices using the Tool for the Automatic Analysis of Syntactic Sophistication and
-
Lexical dispersion and corpus design International Journal of Corpus Linguistics (IF 1.139) Pub Date : 2020-04-16 Jesse Egbert, Brent Burch, Douglas Biber
Abstract Lexical dispersion is typically measured across arbitrary corpus parts of equal size. In this study, we apply DA – a new dispersion index designed for unequal-sized corpus parts – to the British National Corpus (BNC) in a series of case studies to show that the dispersion of a word is strongly influenced by the corpus units or parts it is measured across. Our results show that dispersion
-
Electronic supplement analysis of multiple texts International Journal of Corpus Linguistics (IF 1.139) Pub Date : 2020-04-16 Laura Louise Paterson
Abstract This paper adapts O’Halloran’s (2010) electronic supplement analysis (ESA) to investigate debates about UK poverty in online newspaper articles and reader responses to those articles. While O’Halloran’s method was originally conceived to facilitate close reading, this paper modifies ESA for corpus-based discourse analysis by scaling it up to include multiple texts. I analyse (key-)keywords
-
How much vocabulary is needed to use a concordance? International Journal of Corpus Linguistics (IF 1.139) Pub Date : 2020-04-16 Oliver James Ballance, Averil Coxhead
Abstract Vocabulary load is a predictor of comprehension and a common concern in relation to learner use of concordances; however, vocabulary load figures for whole texts have limited relevance to learner use of concordances. This paper explores the average vocabulary load of the citations (or lines) in a concordance, reflecting how learners use concordances as reading or reference resources. Non-parametric
-
Phonological CorpusTools International Journal of Corpus Linguistics (IF 1.139) Pub Date : 2019-01-01 Kathleen Currie Hall, J. Scott Mackie, Roger Yu-Hsiang Lo
Abstract Phonological analysis increasingly involves the quantification of various lexical and/or usage statistics, such as phonotactic probabilities, the functional loads of various phonemic contrasts, or neighbourhood densities. This paper presents Phonological CorpusTools, a free, open-source software for conducting such phonological analyses on transcribed corpora. The motivations for creating
-
The indicative vs. subjunctive alternation with expressions of possibility in Spanish International Journal of Corpus Linguistics (IF 1.139) Pub Date : 2019-01-01 Sandra C. Deshors, Mark Waltermire
Abstract This study explores the indicative vs. subjunctive alternation in Spanish subordinate clauses following epistemic adverbials and expressions of possibility. Anchored in semantic-pragmatic and variationist theoretical frameworks, traditional research on mood alternation in Spanish remains largely experimental in nature. In contrast, we adopt a corpus-based multifactorial methodology to investigate
-
Variation and change in a specialized register International Journal of Corpus Linguistics (IF 1.139) Pub Date : 2019-01-01 Nicholas Smith, Cathleen Waters
Abstract Corpus-based studies of specialized registers typically sample texts using random methods as far as possible, but they disregard social characteristics of the speakers/writers. In contrast, in corpus-based studies of conversation and quantitative sociolinguistic studies, sampling is more typically designed to optimize social representation. To our knowledge, this study is the first to compare
-
Usage Fluctuation Analysis International Journal of Corpus Linguistics (IF 1.139) Pub Date : 2019-01-01 Tony McEnery, Vaclav Brezina, Helen Baker
Abstract This article introduces a methodology for the diachronic analysis of large historical corpora, Usage Fluctuation Analysis (UFA). UFA looks at the fluctuation of the usage of a word as observed through collocation. It presupposes neither a commitment to a specific semantic theory, nor that the results will focus solely on semantics. We focus, rather, upon a word’s usage. UFA considers large
-
Dimensions of variation across American television registers International Journal of Corpus Linguistics (IF 1.139) Pub Date : 2019-01-01 Tony Berber Sardinha, Marcia Veirano Pinto
Abstract The goal of this study is to identify the dimensions of variation across American television programs, following the multidimensional analysis (MD) framework introduced by Biber (1988). Although television is a major form of mass communication, there has been no previous large-scale MD study of television dialogue. A large corpus containing the key types of contemporary American television
-
Kaleidographic International Journal of Corpus Linguistics (IF 1.139) Pub Date : 2019-01-01 Helen Caple, Laurence Anthony, Monika Bednarek
Abstract Kaleidographic is a dynamic and interactive data visualization tool that allows users to observe and explore relations between any number of variables. The tool is useful for displaying the complex ways in which textual elements interact across a range of texts. Thus far, the tool has been used to display the results of corpus studies as well as corpus-assisted multimodal discourse analyses
-
An introduction to the ANAWC International Journal of Corpus Linguistics (IF 1.139) Pub Date : 2019-01-01 Lucy Pickering, Laura Di Ferrante, Carrie Bruce, Eric Friginal, Pamela Pearson, Julie Bouchard
Abstract This paper presents an overview of the Augmentative and Alternative Communication (AAC) and Non-AAC Workplace Corpus (ANAWC) (Pickering & Bruce, 2009). The corpus is the first resource of its kind that makes it possible to systematically study the typical language patterns of both AAC users and comparable non-AAC users in the workplace. We discuss the origin of the corpus and give an account
-
Lexical bundles in university course materials International Journal of Corpus Linguistics (IF 1.139) Pub Date : 2019-01-01 Tatiana Nekrasova-Beker, Anthony Becker
Abstract The present study compared 4-word lexical bundles found in a general engineering corpus (2,030,000 words) with those found in a corpus of texts collected from a Pathway engineering course for ESL (English as a Second Language) students (356,000 words) and a corpus of pedagogical materials used to teach advanced ESL students at an intensive English program (440,000 words). The results indicated
-
Patterns, constructions, and applied linguistics International Journal of Corpus Linguistics (IF 1.139) Pub Date : 2019-01-01 Susan Hunston
Abstract This paper proposes an alignment between aspects of pattern grammar (Francis, 1993; Hunston & Francis, 2000) and construction grammar (Goldberg, 2006). Pattern Grammar describes the grammatical behaviour of individual words at a specific level of generality. The paper claims that grammar patterns and the groups of words identified as occurring with them can be used to propose candidate constructions
-
15 years of collostructions International Journal of Corpus Linguistics (IF 1.139) Pub Date : 2019-01-01 Stefan Th. Gries
Abstract This paper discusses a variety of potential shortcomings of most of the most widely-used association measures as used in collocation research and collostructional analyses. To address these shortcomings, I then discuss a research program called tupleization, an approach that does away with the usual kinds of information conflation by keeping relevant corpus-linguistic dimensions of information
-
A corpus perspective on the development of verb constructions in second language learners International Journal of Corpus Linguistics (IF 1.139) Pub Date : 2019-01-01 Ute Römer
Abstract This article reports initial findings from a study that uses written data from second language (L2) learners of English at different proficiency levels (CEFR A1 to C1) in a large-scale investigation of verb-argument construction (VAC) emergence. The findings provide insights into first VACs in L2 learner production, changes in the learners’ VAC repertoire from low to high proficiency levels
-
Why very good in India might be pretty good in North America International Journal of Corpus Linguistics (IF 1.139) Pub Date : 2019-01-01 Susanne Wagner
Abstract Situated at the interface of several sub-disciplines (corpus linguistics, World Englishes, variationist sociolinguistics), this study investigates patterns of adjectival amplification (very good, so glad, pretty cool) in the Corpus of Global Web-Based English (GloWbE). It highlights regional distributions/preferences of amplifier-adjective 2-grams and the idiosyncratic status of certain bigrams
-
Investigating the additive probability of repeated language production decisions International Journal of Corpus Linguistics (IF 1.139) Pub Date : 2019-01-01 Sean Wallis
This paper introduces an experimental paradigm based on probabilistic evidence of the interaction between construction decisions in a parsed corpus. The approach is demonstrated using ICE-GB, a one million-word corpus of English. It finds an interaction between attributive adjective phrases in noun phrases with a noun head, such that the probability of adding adjective phrases falls successively. The
-
Do speech registers differ in the predictability of words? International Journal of Corpus Linguistics (IF 1.139) Pub Date : 2019-01-01 Martijn Bentum, Louis ten Bosch, Antal van den Bosch, Mirjam Ernestus
Abstract Previous research has demonstrated that language use can vary depending on the context of situation. The present paper extends this finding by comparing word predictability differences between 14 speech registers ranging from highly informal conversations to read-aloud books. We trained 14 statistical language models to compute register-specific word predictability and trained a register classifier
-
Vocabulary sophistication in First-Year Composition assignments International Journal of Corpus Linguistics (IF 1.139) Pub Date : 2019-01-01 Philip Durrant, Joseph Moxley, Lee McCallum
Abstract Recently-developed tools which quickly and reliably quantify vocabulary use on a range of measures open up new possibilities for understanding the construct of vocabulary sophistication. To take this work forward, we need to understand how these different measures relate to each other and to human readers’ perceptions of texts. This study applied 356 quantitative measures of vocabulary use
-
Constructing a corpus-informed list of Arabic formulaic sequences (ArFSs) for language pedagogy and technology International Journal of Corpus Linguistics (IF 1.139) Pub Date : 2019-01-01 Ayman Alghamdi, Eric Steven Atwell
Abstract This study aims to construct a corpus-informed list of Arabic Formulaic Sequences (ArFSs) for use in language pedagogy (LP) and Natural Language Processing (NLP) applications. A hybrid mixed methods model was adopted for extracting ArFSs from a corpus, that combined automatic and manual extracting methods, based on well-established quantitative and qualitative criteria that are relevant from
-
Harrington, K. (2018). The Role of Corpus Linguistics in the Ethnography of a Closed Community: Survival Communication International Journal of Corpus Linguistics (IF 1.139) Pub Date : 2019-01-01 Robbie Love
This article reviews The Role of Corpus Linguistics in the Ethnography of a Closed Community: Survival Communication
-
Towards an English Constructicon using patterns and frames International Journal of Corpus Linguistics (IF 1.139) Pub Date : 2019-01-01 Florent Perek, Amanda L. Patten
Abstract Recent research in construction grammar has been marked by increasing efforts to create constructicons: detailed inventories of form-meaning pairs to describe the grammar of a given language, following the principles of construction grammar. This paper describes proposals for building a new constructicon of English, based on the combination of the COBUILD Grammar Patterns and the semantic
-
Construction Grammar and the corpus-based analysis of discourses International Journal of Corpus Linguistics (IF 1.139) Pub Date : 2019-01-01 Nicholas Groom
Abstract Construction grammar (CxG) initially arose as a usage-based alternative to nativist theoretical accounts of language, and remains to this day strongly associated with cognitive linguistic theory and research. In this paper, however, I argue that CxG can be seen as offering an equally viable general framework for socially-oriented linguists whose work focuses on the corpus-based analysis of
-
Paterson, L. L., & Gregory, I. (2018). Representations of Poverty and Place: Using Geographical Text Analysis to Understand Discourse International Journal of Corpus Linguistics (IF 1.139) Pub Date : 2019-01-01 Kristin Berberich
This article reviews Representations of Poverty and Place: Using Geographical Text Analysis to Understand Discourse
-
The creative use of absences International Journal of Corpus Linguistics (IF 1.139) Pub Date : 2018-10-29 Rocío Montoro
In an article published in this journal, Partington (2014) addresses the criticism often made against corpus linguistics that it is apparently unable to cope with absences. He convincingly argues that corpus linguistics is better suited to account for absences than has been claimed. I resume the debate by discussing a type of absence not fully addressed in Partington (2014) which I have termed ‘creative
-
“What are you talking about?” International Journal of Corpus Linguistics (IF 1.139) Pub Date : 2018-10-29 Julian Northbrook, Kathy Conklin
In a communicative approach to language teaching, students are presented with “authentic” language, which is thought to allow them to produce it in a nativelike way. The current study explores whether the lexical bundles in communicative Japanese junior high school textbooks are representative of conversational English. To do this, we use a corpus-based approach that compares the most frequent lexical
-
A critical review of research and practice in data-driven learning (DDL) in the academic writing classroom International Journal of Corpus Linguistics (IF 1.139) Pub Date : 2018-10-29 Meilin Chen, John Flowerdew
Since the late 1980s, there has been a growing interest in the direct application of corpora, or data-driven learning (DDL), in language education. This relatively novel teaching approach has been particularly applied in the teaching and learning of English for Academic Purposes (EAP)/academic writing, especially since the turn of the century. This paper synthesizes and evaluates the research progress
-
A corpus-driven comparison of English and French Islamist extremist texts International Journal of Corpus Linguistics (IF 1.139) Pub Date : 2018-10-29 Paul Baker, Rachelle Vessey
Using corpus linguistics and qualitative, manual discourse analysis, this paper compares English and French extremist texts to determine how messages in different languages draw upon similar and distinct discursive themes and linguistic strategies. Findings show that both corpora focus on religion and rewards (i.e. for faith) and strongly rely on othering strategies. However, the English texts are
-
Review of Weisser, M. (2016) Practical Corpus Linguistics: An Introduction to Corpus-based Language Analysis International Journal of Corpus Linguistics (IF 1.139) Pub Date : 2018-10-05 Viola Wiegand
This article reviews Practical Corpus Linguistics: An Introduction to Corpus-based Language Analysis
-
Dimensions of variation across Internet registers International Journal of Corpus Linguistics (IF 1.139) Pub Date : 2018-10-05 Tony Berber Sardinha
This paper presents a study that sought to identify the dimensions of variation underlying a corpus of Internet texts, using Biber’s (1988) multi-dimensional (MD) analysis framework. The corpus was compiled following the method proposed by Biber (1993), according to which the size of each register subcorpus should be determined based on the linguistic variation across the texts. The corpus was tagged
-
Investigating effects of criterial consistency, the diversity dimension, and threshold variation in formulaic language research International Journal of Corpus Linguistics (IF 1.139) Pub Date : 2018-10-05 Xiaofei Lu, Olesya Kisselev, Jungwan Yoon, Michael D. Amory
O’Donnell et al. (2013) considered four measures of formulaicity and reported that they produced different results concerning the effects of expertise and first/second language status on formulaic sequence usage in academic writing. The current study explores several additional methodological issues using the same dataset from O’Donnell et al. (2013). We first motivate the need for criterial consistency
-
The academic English collocation list International Journal of Corpus Linguistics (IF 1.139) Pub Date : 2018-10-05 Lei Lei, Dilin Liu
The use of collocations plays an important role for the proficiency of ESL/EFL learners. Hence, educators and researchers have long tried to identify collocations typical of either academic or general English and the challenges involved in learning them. This paper proposes a comprehensive and type-balanced academic English collocation list (AECL). AECL is based on a large corpus of academic English
-
Multi-unit association measures International Journal of Corpus Linguistics (IF 1.139) Pub Date : 2018-10-05 Jonathan Dunn
This paper formulates and evaluates a series of multi-unit measures of directional association, building on the pairwise ΔP measure, that are able to quantify association in sequences of varying length and type of representation. Multi-unit measures face an additional segmentation problem: once the implicit length constraint of pairwise measures is abandoned, association measures must also identify
-
Dependency parsing of learner English International Journal of Corpus Linguistics (IF 1.139) Pub Date : 2018-05-31 Yan Huang, Akira Murakami, Theodora Alexopoulou, Anna Korhonen
Current syntactic annotation of large-scale learner corpora mainly resorts to “standard parsers” trained on native language data. Understanding how these parsers perform on learner data is important for downstream research and application related to learner language. This study evaluates the performance of multiple standard probabilistic parsers on learner English. Our contributions are three-fold
-
Collocation and word association International Journal of Corpus Linguistics (IF 1.139) Pub Date : 2018-05-31 Beom-Mo Kang
This paper studies the relationship between grammar and language use by comparing word association and collocation. Since word association reveals mental semantic knowledge, usage-based approaches expect word association to mirror the relation between words in use, namely collocation. The paragraph is a more apt unit for collocation than the sentence in mirroring word association. Among measures of
-
Lexical preference and variation in the complementation of provide International Journal of Corpus Linguistics (IF 1.139) Pub Date : 2018-05-31 Hans Martin Lehmann
This paper investigates grammatical variation in the complementation of the verb provide. It describes the distribution of the four possible patterns with two internal arguments and the interaction between pattern choice and lexical choice. The study finds and documents significant differences in the preferred complementation patterns for American and British English as well as for spoken and written
-
Register variation in spoken British English International Journal of Corpus Linguistics (IF 1.139) Pub Date : 2018-05-31 Jacqueline Laws, Chris Ryder
The aim of this paper is to identify the effect of register variation in spoken British English on the occurrence of the four principal verb-forming suffixes: ‑ate, ‑en, ‑ify and ‑ize, by building on the work of Biber et al. (1999), Plag et al. (1999) and Schmid (2011). Register variation effects were compared between the less formal Demographically-Sampled and the more formal Context-Governed components
-
BasiScript International Journal of Corpus Linguistics (IF 1.139) Pub Date : 2018-01-01 Agnes Tellings, Nelleke Oostdijk, Iris Monster, Franc Grootjen, Antal van den Bosch
This short paper introduces BasiScript, a 9-million-word corpus of contemporary Dutch texts written by primary school children. The data were collected over three years with 17,216 children contributing texts throughout this period. Each word token in the corpus is annotated with the correct orthographical form, the associated lemma and the part of speech. The most frequent polysemous words have been
-
General extenders and discourse variation International Journal of Corpus Linguistics (IF 1.139) Pub Date : 2018-01-01 Jūratė Ruzaitė
The present study accounts for the use of general extenders (GEs) in spoken and written registers. The repertoire and usage of GEs is analysed in Lithuanian by focusing on their distribution across different registers, their structural properties, and discourse-pragmatic functions. The study is based on a reference corpus of Lithuanian, which includes four subcorpora of written discourse and a subcorpus
-
Solving contradictions in semantic prosody analysis with prosody concord International Journal of Corpus Linguistics (IF 1.139) Pub Date : 2018-01-01 Xuri Tang, Gaixiang Liu
Using collocation-based approaches, semantic prosody analyses of lemmas like alleviate and cure yield judgments of negative prosody, which contradict common sense. This poses a challenge to the concept of semantic prosody and the principle of co-occurrence. To solve such contradictions, this paper proposes a new approach to semantic prosody analysis named ‘prosody concord’. The approach adopts collostruction
-
Academic lexical bundles International Journal of Corpus Linguistics (IF 1.139) Pub Date : 2018-01-01 Ken Hyland, Feng (Kevin) Jiang
An important component of fluent linguistic production and a key distinguishing feature of particular modes, registers and genres is the multi-word expressions referred to as ‘lexical bundles’. These are extended collocations which appear more frequently than expected by chance, helping to shape meanings and contributing to our sense of coherence and distinctiveness in a text. These strings have been
-
The textual colligation of stance phraseology in cross-disciplinary academic discourse International Journal of Corpus Linguistics (IF 1.139) Pub Date : 2018-01-01 Jihua Dong, Louisa Buckingham
This study investigates the textual colligation of stance phrases at the levels of sentence, paragraph and text in empirical research articles from agriculture and economics. We extracted the textual positions of stance phrases with the software Wordskew (Barlow, 2016) in two purpose-built corpora of around three million tokens. The results show that stance phrases display similar distribution patterns
-
The English Grammar Profile of learner competence International Journal of Corpus Linguistics (IF 1.139) Pub Date : 2017-12-01 Anne O'Keeffe, Geraldine Mark
English Profile (EP) is an ongoing empirical exploration of learner English initiated by Cambridge University Press and Cambridge English, among others. EP aims to create a set of empirically-based descriptions of language competencies for English. ‘Reference Level Descriptors’ already exist as part of the Common European Framework of Reference (CEFR) but are intuitively derived and not designed for
-
Multi-word discourse markers and their corpus-driven identification International Journal of Corpus Linguistics (IF 1.139) Pub Date : 2017-12-01 Kaja Dobrovoljc
With expanding evidence on the formulaic nature of human communication, there is a growing need to extend discourse marker research to functionally analogue multi-word expressions. In contrast to the common qualitative approaches to discourse marker identification in corpora, this paper presents a corpus-driven semi-automatic approach to identification of multi-word discourse markers (MWDMs) in the
-
Association with explanation-conveying constructions predicts verbs’ implicit causality biases International Journal of Corpus Linguistics (IF 1.139) Pub Date : 2017-12-01 Emiel van den Hoven, Evelyn C. Ferstl
Given a sentence such as Mary fascinated/admired Sue because she did great, the verb fascinated leads people to interpret she as referring to Mary, whereas admired leads people to interpret she as referring to Sue. This phenomenon is known as implicit causality (IC). Recent studies have shown that verbs’ causality biases closely correspond to the verbs’ semantic classes, as classified in VerbNet, a
-
A distributional semantic approach to the periodization of change in the productivity of constructions International Journal of Corpus Linguistics (IF 1.139) Pub Date : 2017-12-01 Florent Perek, Martin Hilpert
This paper describes a method to automatically identify stages of language change in diachronic corpus data, combining variability-based neighbour clustering, which offers objective and reproducible criteria for periodization, and distributional semantics as a representation of lexical meaning. This method partitions the history of a grammatical construction according to qualitative stages of productivity
-
Discourse markers and (dis)fluency in English and French International Journal of Corpus Linguistics (IF 1.139) Pub Date : 2017-09-22 Ludivine Crible
While discourse markers (DMs) and (dis)fluency have been extensively studied in the past as separate phenomena, corpus-based research combining large-scale yet fine-grained annotations of both categories has, however, never been carried out before. Integrating these two levels of analysis, while methodologically challenging, is not only innovative but also highly relevant to the investigation of spoken
-
Lexical bundles in spoken academic ELF International Journal of Corpus Linguistics (IF 1.139) Pub Date : 2017-09-22 Ying Wang
This corpus-based study explored the effects of two factors – genre (i.e. speech event type) and disciplinary variation – on spoken academic ELF, from the perspective of lexical bundles (i.e. recurrent word combinations). The material was drawn from a corpus of transcribed spoken academic lingua franca English (ELFA). The investigation involved a quantitative analysis of the use of four-word bundles
-
Methodological issues in the use of directional parallel corpora International Journal of Corpus Linguistics (IF 1.139) Pub Date : 2017-09-22 Maïté Dupont, Sandrine Zufferey
The recent emergence of large parallel corpora has represented a leap ahead for cross-linguistic and translation studies. However, the specificities of these corpora and their influence on the nature of observed linguistic phenomena remain underexplored, especially in the field of contrastive linguistics. In this study, we compare the translation equivalences of four concessive adverbial connectives
-
Using word n-grams to identify authors and idiolects International Journal of Corpus Linguistics (IF 1.139) Pub Date : 2017-09-22 David Wright
Forensic authorship attribution is concerned with identifying the writers of anonymous criminal documents. Over the last twenty years, computer scientists have developed a wide range of statistical procedures using a number of different linguistic features to measure similarity between texts. However, much of this work is not of practical use to forensic linguists who need to explain in reports or
Contents have been reproduced by permission of the publishers.