-
Multifractal Analysis of the Distribution of Three Grammatical Constructions in English Texts Journal of Quantitative Linguistics (IF 0.761) Pub Date : 2024-02-08 Rosmawati, Wander Lowie
Both the Menzerath-Altmann law and the Zipf-Mandelbrot law note that language is a fractal structure and, like any other fractals, follows power laws. Studies on fractal linguistics demonstrated th...
-
Word Length in Chinese: The Menzerath-Altmann Law is Valid After All Journal of Quantitative Linguistics (IF 0.761) Pub Date : 2023-12-29 Tereza Motalová, Ján Mačutek, Radek Čech
According to the Menzerath-Altmann law, longer language constructs consist, on average, of shorter constituents. It is most often studied at the level of words and syllables (the mean syllable leng...
-
Effects of Word Limit on Sentence Length and Clause Length in Academic Journal Article Abstracts: A Synergetic Linguistic Perspective Journal of Quantitative Linguistics (IF 0.761) Pub Date : 2023-12-29 Yue Li, Yuan Gao, Xiaofei Lu
Several studies have sought to characterize the syntactic features of research articles (RAs) and their part-genres. However, no study has examined the interrelation between different syntactic com...
-
Words and Numbers. In Memory of Peter Grzybek (1957-2019) Journal of Quantitative Linguistics (IF 0.761) Pub Date : 2023-12-29 Mengge Wang
Published in Journal of Quantitative Linguistics (Vol. 30, No. 3-4, 2023)
-
Structural Factor Analysis of Lexical Complexity Constructs and Measures: A Quantitative Measure-Testing Process on Specialised Academic Texts Journal of Quantitative Linguistics (IF 0.761) Pub Date : 2023-12-29 Maryam Nasseri, Philip McCarthy
This study evaluates 22 lexical complexity measures that represent the three constructs of density, diversity and sophistication. The selection of these measures stems from an extensive review of t...
-
Lexical Features and Psychological States: A Quantitative Linguistic Approach Journal of Quantitative Linguistics (IF 0.761) Pub Date : 2023-12-29 Xiaowei Du
In recent decades, there has been an increasing interest in the relation between lexical features and texts of psychological states. Previous studies demonstrated that some lexical features varied ...
-
Quantitative Approaches to Universality and Individuality in Language Journal of Quantitative Linguistics (IF 0.761) Pub Date : 2023-12-17 Wei Huang, Tenghao Ji
Published in Journal of Quantitative Linguistics (Ahead of Print, 2023)
-
The Current State and Prominent Features of Quantitative Linguistics Through the Lens of QUALICO 2023: A Conference Report Journal of Quantitative Linguistics (IF 0.761) Pub Date : 2023-11-28 Jianwei Yan
Quantitative Linguistics (QL) is an academic field that employs quantitative and statistical methods to explore language patterns and linguistic laws. From June 28th to 30th, 2023, the Internationa...
-
The Structural Complexity of Chinese Words and Its Relationship with Word Frequency Journal of Quantitative Linguistics (IF 0.761) Pub Date : 2023-07-06 Xinpei Hong, Wei Huang, Haitao Liu
The morphological synergetic model has yet to be fully tested in typical analytic languages. The quantification of Chinese morphology and its relationship with word frequency can help construct and...
-
Synergetic Properties of Lexical Structures in Chinese and English Journal of Quantitative Linguistics (IF 0.761) Pub Date : 2023-05-19 Jieqiang Zhu, Jingyang Jiang
ABSTRACT The synergetic lexical model provides a unique framework for exploring the interrelationships between the lexical properties of languages. Previous studies concerning several properties of this lexical model have yielded many successful fitting results, but very few studies have investigated synonymy, a major property of this model. The present study uses 825 Chinese and 848 English
-
A Corpus-Based Study of the Distributions of Adnominals Across Registers and Disciplines Journal of Quantitative Linguistics (IF 0.761) Pub Date : 2023-05-03 Yiyang Hu, Qingshun He
ABSTRACT Adnominals are an important resource of noun modification in written registers, especially in academic writing. This study compares the frequencies of adjectival adnominals and nominal adnominals across two registers (Fiction and Academic writing) by calculating T-values and conducting Welch’s t-tests on the adnominal subtypes. It is found that the preference for nominal adnominals exists
-
Unifying Models for Word Length Distributions Based on Types and Tokens Journal of Quantitative Linguistics (IF 0.761) Pub Date : 2023-04-24 Peter Zörnig, Thomas Berg
ABSTRACT Word length studies have been one of the central issues in Quantitative Linguistics for a long time. Most models were constructed for very specific purposes, i.e. the individual models apply only to a specific language, only to token counts or only to type counts. The present paper takes up the challenge of developing unifying models which account for both type and token frequencies of a moderately
-
Zipf’s Law for Speech Acts in Spoken English Journal of Quantitative Linguistics (IF 0.761) Pub Date : 2023-04-18 Da Qi, Hua Wang
ABSTRACT Speech acts, as basic communication units in pragmatics, are highly correlated to speakers’ communicative intentions. It is a worthwhile goal to explore whether they obey some linguistic laws that reflect people’s cognitive mechanisms, e.g. Zipf’s Law. However, few studies have examined whether the Zipf distribution can capture the frequencies of speech acts, and whether its parameters can
-
Too Noisy at the Bottom: Why Gries’ (2008, 2020) Dispersion Measures Cannot Identify Unbiased Distributions of Words Journal of Quantitative Linguistics (IF 0.761) Pub Date : 2023-02-02 Robert N. Nelson
ABSTRACT Gries (2008, 2021) defined two dispersion measures able to alert corpus analysts to words that have a problematically limited distribution. Gries (2010 ...
-
Modelling the Dynamics of Language Change: Logistic Regression, Piotrowski’s Law, and a Handful of Examples in Polish Journal of Quantitative Linguistics (IF 0.761) Pub Date : 2022-12-10 Rafał L. Górski, Maciej Eder
ABSTRACT The study discusses modelling diachronic processes by logistic regression. The phenomenon of nonlinear changes in language was first observed by Raimund Piotrowski (hence labelled as Piotrowski’s law), even if actual linguistic evidence often speaks against using the notion of a ‘law’ in this context. In our study, we apply logistic regression models to changes which occurred between 15th
-
Word Use Equivalence and Hierarchical Word Tiers Journal of Quantitative Linguistics (IF 0.761) Pub Date : 2022-10-13 Brent Burch, Jesse Egbert
ABSTRACT A ranked word list provides information about the position of each word in the list. However, retaining and employing the measure used to generate the ranked list can yield additional information about the words. If ω denotes the prevalence of a word in a corpus, then not only can the values of ω be ordered, their values can be compared to one another, and words having similar values can be
-
Stylistic Fingerprints, POS-tags, and Inflected Languages: A Case Study in Polish Journal of Quantitative Linguistics (IF 0.761) Pub Date : 2022-09-18 Maciej Eder, Rafał L. Górski
ABSTRACT In stylometric investigations, frequencies of the most frequent words (MFWs) and character n-grams outperform other style-markers, even if their performance varies significantly across languages. In inflected languages, word endings play a prominent role, and hence different word forms cannot be recognized using generic text tokenization. Countless inflected word forms make frequencies sparse
-
Unified Parametrization of Phonetic Features and Numerical Calculation of Phonetic Distances between Speech Sounds Journal of Quantitative Linguistics (IF 0.761) Pub Date : 2022-07-25 Maksym O. Vakulenko
ABSTRACT A metric method to numerically measure phonetic and phonemic distances or contrasts, between speech sounds, is put forward. The feature values of the compared phones taken from the standard IPA charts are treated as independent parameters that give rise to corresponding Euclidean distances. As an illustration, the general phone set is mapped to Ukrainian phonemes. The proposed model agrees
-
The Entropy of Morphological Systems in Natural Languages Is Modulated by Functional and Semantic Properties Journal of Quantitative Linguistics (IF 0.761) Pub Date : 2022-05-04 Francesca Franzon, Chiara Zanini
ABSTRACT In most natural languages, grammatical gender and number features encode semantic attributes concerning animacy, sex, and numerosity. Despite the likely advantage of promptly communicating about such salient attributes, inflectional systems rarely display consistently bijective correspondences between the semantic attributes and the grammatical feature values. In a study on Italian, we explored
-
Authorship Attribution via Occupancy-problem-type Indices Journal of Quantitative Linguistics (IF 0.761) Pub Date : 2022-02-14 Lukun Zheng, Huiqiang Zheng, Chandra Kundu
ABSTRACT In this paper, we propose a new methodology for authorship attribution based on a profile of indices related to the occupancy problem, called occupancy-problem indices. The occupancy problem has a long history and is an important example in standard textbooks like Feller (1971). We base our methodology on function words. We establish a testing procedure by constructing a confidence band of
-
To Move or Not to Move: An Entropy-based Approach to the Informativeness of Research Article Abstracts across Disciplines Journal of Quantitative Linguistics (IF 0.761) Pub Date : 2022-02-10 Wei Xiao, Li Li, Jin Liu
ABSTRACT Research article (RA) abstracts succinctly and skilfully epitomize the core information of the full text and have thus attracted the attention of a number of scholars. While previous studies mainly focused on the rhetorical structures, meta-discursive features and lexico-grammatical features, few have made explorations from the perspective of information theory. To bridge this gap, the present
-
Menzerath-Altmann Law in Consecutive and Simultaneous Interpreting: Insights into Varied Cognitive Processes and Load Journal of Quantitative Linguistics (IF 0.761) Pub Date : 2022-01-16 Xinlei Jiang, Yue Jiang
ABSTRACT Notwithstanding theoretical simulations of distinctive cognitive processes and load of consecutive (CI) and simultaneous interpreting (SI), quantitative linguistic inquiry into their outputs is needed for solid empirical evidence. As a fundamental law of quantitative linguistics, Menzerath–Altmann Law (MAL) mirrors the economic processing of linguistic information and complex dynamic language
-
Syntactic Complexity of Different Text Types: From the Perspective of Dependency Distance Both Linearly and Hierarchically Journal of Quantitative Linguistics (IF 0.761) Pub Date : 2021-12-09 Ruina Chen, Sirui Deng, Haitao Liu
ABSTRACT Dependency distance (DD) is a well-established measure of syntactic complexity. Previous studies largely focused on the linear dimension, mostly by means of mean dependency distance (MDD). In the present study, a new quantitative indicator, mean hierarchical dependency distance (MHDD), is proposed to discuss DD-related issues. Combining MHDD and MDD, the study investigates syntactic complexity of
-
Dependency Distance and Its Probability Distribution: Are They the Universals for Measuring Second Language Learners’ Language Proficiency? Journal of Quantitative Linguistics (IF 0.761) Pub Date : 2021-11-17 Yuxin Hao, Xuelin Wang, Yanni Lin
ABSTRACT Previous studies have shown that dependency distance and its probability distribution can be applied as syntactic indicators of English as interlanguage. However, the universal application of these indicators has not been verified from the perspective of language typology. The issues are addressed in the present study based on a treebank of Chinese interlanguage of English and Japanese native
-
A Zipfian Approach to Words in Contexts: The Cases of Modern English and Chinese Journal of Quantitative Linguistics (IF 0.761) Pub Date : 2021-05-19 Jin Cong
ABSTRACT The system-level complexity of language has been thoroughly investigated in terms of Zipf’s law, whose quantitative features have proved to reflect text/language typology. This study extends the scope of Zipf’s law from the macroscopic scale of language to specific words in contexts, with the aim of examining its potential as an indicator of word typology. The focus is confined to the high-frequency
-
The Indicative/subjunctive Mood Alternation with Adverbs of Doubt in Spanish Journal of Quantitative Linguistics (IF 0.761) Pub Date : 2021-04-27 Harunobu Hirota
ABSTRACT This study aims to analyse the indicative/subjunctive mood alternation in Spanish sentences with adverbs of doubt (acaso, posiblemente, probablemente, quizá, quizás, tal vez, seguramente, a lo mejor, igual). To this end, this study statistically analysed the linguistic and social factors conditioning the mood alternation in sentences with adverbs of doubt. A total of 1278 tokens were analysed
-
Gabriel Altmann (1931–2020) Journal of Quantitative Linguistics (IF 0.761) Pub Date : 2021-03-23 Reinhard Köhler, Emmerich Kelih, Hans Goebl
(2021). Gabriel Altmann (1931–2020) Journal of Quantitative Linguistics: Vol. 28, No. 2, pp. 187-193.
-
Correction Journal of Quantitative Linguistics (IF 0.761) Pub Date : 2021-02-09
(2021). Correction. Journal of Quantitative Linguistics: Vol. 28, No. 2, pp. I-II.
-
Derivational Suffix Productivity in Persian: A Fuzzy Analysis Journal of Quantitative Linguistics (IF 0.761) Pub Date : 2021-03-21 Seyyedeh Zohreh Aftabi, Abbas Ali Ahangar, Hassan Mishmast Nehi
ABSTRACT The main aim of this article is to introduce a new way of dealing with the vague concept of suffix productivity in Persian. This approach, that is fuzzy set theory, gives each suffix a degree of membership from [0,1] to different productivity categories. To estimate morphological productivity of Persian suffixes, first Baayen’s proposed measures, i.e. realized productivity, expanding productivity
-
Interactive Heatmaps as an Improved Means of Analysing Complex Socio-dialectal Patterns: German Loans in Silesian Journal of Quantitative Linguistics (IF 0.761) Pub Date : 2021-03-14 István Fekete, Gerd Hentschel
ABSTRACT This paper presents an application of interactive cluster heatmaps in sociolinguistics, a method hitherto scarcely employed in the field. To that end, we developed a statistical workflow to illustrate the method and analyse large-scale Silesian questionnaire data. In our quantitative-linguistic study we demonstrate how heatmaps can uncover information about complex patterns of regional variation
-
Estimating Phonetic Probability in Etymology Journal of Quantitative Linguistics (IF 0.761) Pub Date : 2021-03-02 Kamil Stachowski
ABSTRACT An etymological proposition is often said to be probable or improbable from the phonetic point of view, and it is not rare for opinions to diverge on which it is. The estimation is typically purely intuitive, based on perceived similarity and no more than a handful of analogous examples. This paper proposes a method for quantifying the phonetic probability of an etymology and comparing it
-
Quantitative Analysis of Spoken Discourse Using Memoirs of Old-time Moviegoers Journal of Quantitative Linguistics (IF 0.761) Pub Date : 2021-02-24 S. Çalışkan, F. Can, H. Akbulut, S. R. Öztürk
ABSTRACT We present the first quantitative analysis of spoken discourse for the Turkish language using memoirs of a group of old-time moviegoers of varying age groups whose birth year spreads over a period of four decades ranging from the 1930s to the 1960s. They tell their experiences by answering a set of questions. Their responses are evaluated comprehensively with the expectation that various attributes
-
Why Do Parameter Values in the Zipf-Mandelbrot Distribution Sometimes Explode? Journal of Quantitative Linguistics (IF 0.761) Pub Date : 2021-02-23 Ján Mačutek
ABSTRACT The Zipf-Mandelbrot distribution serves as a mathematical model for ranked frequencies in many areas of scientific research, including linguistics. Many linguistic units, such as words or word n-grams, follow this distribution. However, in some cases, such as for graphemes in linguistics or species abundance and diversity data in biology, the parameters of the Zipf-Mandelbrot distribution
-
Experiments in Text Classification: Analyzing the Sentiment of Electronic Product Reviews in Greek Journal of Quantitative Linguistics (IF 0.761) Pub Date : 2021-02-17 Dimitris Bilianos
ABSTRACT Sentiment analysis, which deals with people’s sentiments as they appear in the growing amount of online social data, has been on the rise in the past few years. In its simplest form, sentiment analysis deals with the polarity of a given text, i.e., whether the opinion expressed in it is positive or negative. Sentiment analysis, or opinion mining applications on websites and the social media
-
Diachronic Distribution of Elemental Ordering in English Journal of Quantitative Linguistics (IF 0.761) Pub Date : 2021-02-14 Jiangping Zhou, Yanmei Gao
ABSTRACT English elemental ordering in a non-canonical word order incorporates preposing, postposing and elemental reversal. This paper intends to explore how these types of elemental ordering are distributed during the last two centuries by employing the Corpus of Historical American English or COHA. The findings demonstrate that preposing has been increasing apparently but still in its inceptive
-
Markov Models for Multi-state Language Change Journal of Quantitative Linguistics (IF 0.761) Pub Date : 2021-02-08 Freek Van de Velde, Isabeau De Smet
ABSTRACT Historical linguistics has witnessed an upsurge in quantitative corpus studies. The bulk of these studies involve the use of regression modelling. We point out a number of potential problems with this approach, and offer an alternative. For a multi-state language change, we propose a Markov model in continuous time. The major advantage of this technique, which has been used in medical contexts
-
Book Review of Corpus Stylistics: Theory and Practice Journal of Quantitative Linguistics (IF 0.761) Pub Date : 2021-01-04 Qianqian Jiang, Yaqin Wang
(2021). Book Review of Corpus Stylistics: Theory and Practice. Journal of Quantitative Linguistics: Vol. 28, No. 3, pp. 282-287.
-
Revisiting Keyword Analysis in a Specialized Corpus: Religious Terminology Extraction Journal of Quantitative Linguistics (IF 0.761) Pub Date : 2021-01-01 Hsin-Yi Lien
ABSTRACT This study investigates keyword extraction using a compiled Buddhist corpus. It sets out the fundamental mode of generation and refinement of keywords with statistical measures and manual screening with specific criteria. The Buddhist Word List contains 1244 keywords with 375 Pali words in Buddhist literacy. We compared the results of applying occurring frequency, log-likelihood (LL), and
-
Linguistic Accommodation in Teenagers’ Social Media Writing: Convergence Patterns in Mixed-gender Conversations Journal of Quantitative Linguistics (IF 0.761) Pub Date : 2020-09-06 Lisa Hilte, Reinhild Vandekerckhove, Walter Daelemans
ABSTRACT The present study analyzes the phenomenon of linguistic accommodation, i.e. the adaptation of one’s language use to that of one’s conversation partner. In a large corpus of private social media messages, we compare Flemish teenagers’ writing in two conversational settings: same-gender (including only boys or only girls) and mixed-gender conversations (including at least one girl and one boy)
-
Optimal Coding and the Origins of Zipfian Laws Journal of Quantitative Linguistics (IF 0.761) Pub Date : 2020-07-24 Ramon Ferrer-i-Cancho, Christian Bentz, Caio Seguin
ABSTRACT The problem of compression in standard information theory consists of assigning codes as short as possible to numbers. Here we consider the problem of optimal coding – under an arbitrary coding scheme – and show that it predicts Zipf’s law of abbreviation, namely a tendency in natural languages for more frequent words to be shorter. We apply this result to investigate optimal coding also under
-
Predictive Modelling of Type Valency in Word Formation Grammar Journal of Quantitative Linguistics (IF 0.761) Pub Date : 2020-07-03 Kateryna Krykoniuk
ABSTRACT This paper explores different regression models for predicting the type valency of Persian suffixes within a usage-based approach. Usage-based models treat the type frequency of a suffix as a key predictor for its type valency revealing that an increase in the type frequency leads to a greater combining power between a construction’s paradigmatic elements. However, this effect is limited to
-
Dependency Distances and Their Frequencies in Indo-European Language Journal of Quantitative Linguistics (IF 0.761) Pub Date : 2020-06-18 Xinying Chen, Kim Gerdes
ABSTRACT The present study investigates the relationship between two features of dependencies, namely, dependency distances and dependency frequencies. The study is based on the analysis of a parallel dependency treebank that includes 10 Indo-European languages. Two corresponding random dependency treebanks are generated as baselines for comparison. After computing the values of dependency distances
-
Quantifying Perceived Political Bias of Newspapers through a Document Classification Technique Journal of Quantitative Linguistics (IF 0.761) Pub Date : 2020-06-16 Hyungsuc Kang, Janghoon Yang
ABSTRACT Even though a certain degree of political bias is unavoidable in the media, strong media bias is likely to have an impact on society, especially on the formation of public opinion. This research proposes a data-driven method for quantifying political bias of media contents. With a document classification technique called doc2vec and social data from Facebook posts, a model for analysing the
-
The Effect of Translation on Text Coherence: A Quantitative Study Journal of Quantitative Linguistics (IF 0.761) Pub Date : 2020-06-16 Elham Najafi, Alireza Valizadeh, Amir H. Darooneh
ABSTRACT Investigating the coherence of translated texts is an important issue in multilingual studies. In this paper, we aim to study text coherence in human-translated texts and its relation to the text properties by a quantitative approach. For this purpose, we assigned a word importance value to each word-type of a text and constructed the text ‘importance time series’ from the original and translated
-
Lexical Richness and Text Length: An Entropy-based Perspective Journal of Quantitative Linguistics (IF 0.761) Pub Date : 2020-06-10 Yaqian Shi, Lei Lei
ABSTRACT Text length is a major concern in the measurement of lexical richness, and how lexical richness is affected by text length still remains open. The present study aims to explore the relation between text length and lexical richness from an entropy-based perspective. Results show a non-linear growth pattern of lexical richness by increasing text length. To be specific, lexical richness increases
-
A Word Embedding Model for Analyzing Patterns and Their Distributional Semantics Journal of Quantitative Linguistics (IF 0.761) Pub Date : 2020-06-07 Rui Feng, Congcong Yang, Yunhua Qu
ABSTRACT Recent advances in natural language processing have catalysed active research in designing algorithms to generate contextual vector representations of words, or word embedding, in the machine learning and computational linguistics community. Existing works pay little attention to patterns of words, which encode rich semantic information and impose semantic constraints on a word’s context.
-
Does Menzerath–Altmann Law Hold True for Translational Language: Evidence from Translated English Literary Texts Journal of Quantitative Linguistics (IF 0.761) Pub Date : 2020-05-24 Yue Jiang, Ruimin Ma
ABSTRACT Menzerath–Altmann Law (MAL) is regarded as one of the fundamental laws of language due to its extensive validity for different languages at various linguistic levels and applicability for register differentiation. However, whether MAL holds true for translational language remains to be answered. Translational language, different from both the source language and target original (non-translated)
-
Is Queen’s English Drifting Towards Common People’s English? —Quantifying Diachronic Changes of Queen’s Christmas Messages (1952–2018) with Reference to BNC Journal of Quantitative Linguistics (IF 0.761) Pub Date : 2020-05-18 Xinlei Jiang, Yue Jiang, Cathy Ka Weng Hoi
ABSTRACT Queen's English (QE), a linguistic symbol of the royal or upper class, is a particular variety or an aristocratic form of English. However, QE has been dethroned by a surprising finding that it shifted phonologically towards common people's English (CE) between the 1950s-1980s, arousing a debate on its existence. Based upon Queen's Christmas Messages (1952-2018) and BNC, this study quantitatively
-
Probability Distribution of Dependency Distance Based on a Treebank of Japanese EFL Learners’ Interlanguage Journal of Quantitative Linguistics (IF 0.761) Pub Date : 2020-04-26 Wenping Li, Jianwei Yan
ABSTRACT Ouyang and Jiang (2018) measured the second language proficiency of English as a foreign language (EFL) learners based on the probability distribution of dependency distance. However, the typological features of the native language (Chinese) and the target language (English) they adopted are generally considered similar in word order and dependency direction. In addition, their method of classifying
-
A Multifactorial Analysis of Concessive Clause Positioning Journal of Quantitative Linguistics (IF 0.761) Pub Date : 2020-03-11 Hui Kang, Jiajin Xu
ABSTRACT Previous works have identified multiple factors and their interplay that condition the positioning of the concessive adverbial clauses. This study continues this line of research by 1) focusing exclusively on the positioning of although-led concessive adverbial clauses (although-clauses hereafter) among different concessive clause relations; 2) supplementing the factor set with more linguistic
-
Analysis of Transitional Areas in Dialectology: Approach with Fuzzy Logic Journal of Quantitative Linguistics (IF 0.761) Pub Date : 2020-03-03 Gotzon Aurrekoetxea, Aitor Iglesias, Esteve Clua, Iker Usobiaga, Miquel Salicrú
ABSTRACT Comparing the dialectal classifications into disjointed zones with the representation of populations in a geolectal continuum has emphasized the importance of transition regions. Identifying these regions has been the subject of study in the scientific literature, although research has not been conducted in a reliable manner. Based on the Basque ‘Bourciez’ Corpus, we have highlighted the limitations
-
A Methodology to Measure the Diachronic Language Distance between Three Languages Based on Perplexity Journal of Quantitative Linguistics (IF 0.761) Pub Date : 2020-03-01 José Ramom Pichel, Pablo Gamallo, Iñaki Alegria, Marco Neves
ABSTRACT The aim of this paper is to apply a corpus-based methodology, based on the measure of perplexity, to automatically calculate the cross-lingual language distance between historical periods of three languages. The three historical corpora have been constructed and collected with the closest spelling to the original on a balanced basis of fiction and non-fiction. This methodology has been applied
-
Numerical Assessment of Orthographic Neighbourhood Size Fluctuation in Writing Using Fractal Dimension Analysis Journal of Quantitative Linguistics (IF 0.761) Pub Date : 2019-11-25 Rex Taibu, Eric Cheung, Weier Ye, Sunil Dehipawala, Vazgen Shekoyan, George Tremberger Jr, Tak Cheung
ABSTRACT The orthographic size of a targeted word, the number of new words that can be generated from a targeted word by exchanging a single letter, offers a research window where words can be transformed into numerical values. The CLEARPOND technology from Northwestern University was used for the transformation. A writing can then be modelled as a time series where the fluctuation can be further described
-
Word Length Distribution in Zhuang Language Journal of Quantitative Linguistics (IF 0.761) Pub Date : 2019-10-30 Aiyun Wei, Qian Lu, Haitao Liu
ABSTRACT The present study focuses on the word length distribution (WLD) of Zhuang language. The results show that the WLDs of all texts investigated can be described by the Positive Cohen-Poisson model when the word length is measured by the syllable numbers. However, when the word length is measured by the letter numbers, they do not follow any model from the Poisson or Binomial distribution families
-
Calculation of Phonetic Distances between Speech Sounds Journal of Quantitative Linguistics (IF 0.761) Pub Date : 2019-10-23 Maksym Vakulenko
ABSTRACT A new formalism to numerically measure phonetic differences between speech sounds treating feature values of the compared phones as independent parameters that give rise to corresponding Euclidean distances is put forward. The articulatory and acoustic methods within this formalism were compared, where the corresponding results display good agreement. The more reliable and more universal character
-
The Discriminativeness of Internal Syntactic Representations in Automatic Genre Classification Journal of Quantitative Linguistics (IF 0.761) Pub Date : 2019-09-26 Mingyu Wan, Alex Chengyu Fang, Chu-Ren Huang
ABSTRACT Genre characterizes a document differently from a subject that has been the focus of most document retrieval and classification applications. This work hypothesizes a close interaction between syntactic variation and genre differentiation by introspecting stylistic cues in functional and structural aspects beyond word level. It has engineered 14 syntactic feature sets of internal representations
-
Word Length Distribution in German Texts during the 17th-19th Century Journal of Quantitative Linguistics (IF 0.761) Pub Date : 2019-09-15 Fei Lian, Yuan Li
ABSTRACT Word length in German texts has been a frequently discussed issue in the field of quantitative linguistics. Taking an overall view of the existing research data, however, most of the research focuses on literary texts and private letters and the size of data corpus for each research is relatively small. This paper provides a time- and genre-based analysis of word length distribution in German
-
Peter Grzybek (1957 – 2019) Journal of Quantitative Linguistics (IF 0.761) Pub Date : 2019-09-04
(2019). Peter Grzybek (1957 – 2019) Journal of Quantitative Linguistics: Vol. 26, No. 4, pp. 356-357.
-
Towards a Fractal Analysis of the Sign Language Journal of Quantitative Linguistics (IF 0.761) Pub Date : 2019-09-01 Jan Andres, Martina Benešová, Jiří Langer
ABSTRACT This paper is the first attempt to discuss and explore a possible extension of Hřebíček’s conjecture about a fractal structure of language from a natural language to a sign language. We will show that the approach by means of four versions of a formula for the Menzerath-Altmann law sensitively depends especially on a suitable segmentation of a given sign language structure and the way
-
Anti Dependency Distance Minimization in Short Sequences. A Graph Theoretic Approach Journal of Quantitative Linguistics (IF 0.761) Pub Date : 2019-08-22 Ramon Ferrer-i-Cancho, Carlos Gómez-Rodríguez
ABSTRACT Dependency distance minimization (DDm) is a word order principle favouring the placement of syntactically related words close to each other in sentences. Massive evidence of the principle has been reported for more than a decade with the help of syntactic dependency treebanks where long sentences abound. However, it has been predicted theoretically that the principle is more likely to be beaten