Abstract
Shanghainese is an extremely topic-prominent language with many topic markers in competition with one another, often without any obvious basis for the selection of one topic marker over another. We explore the influence of five variables on the five most frequent topic markers in a corpus of spoken Shanghainese: topic length, syntactic category of the topic, function of the topic, comment type, and genre. We carry out a multivariate statistical analysis of the data, relying on a polytomous logistic regression model. Our approach quantifies the role of each factor in the choice of topic marker and yields estimated probabilities for each marker given particular combinations of factors. This study serves simultaneously as an introduction to the polytomous package (Arppe 2013) for the statistical software environment R.
About the authors
Weifeng Han is a Lecturer in the Department of English, Donghua University, Shanghai. His research interests include cognitive linguistics, syntactic typology, linguistic philosophy and second language acquisition.
Antti Arppe is an Assistant Professor in Quantitative Linguistics in the Department of Linguistics, University of Alberta. His research interests include corpus linguistics (specifically exploiting and developing statistical methods), multimethodological, empirical research strategies in linguistics more generally, and the study of various sorts of linguistic alternations.
John Newman is a Professor in the Department of Linguistics, University of Alberta. His research interests include corpus linguistics, cognitive linguistics, and field work (in Papua New Guinea). He is the Director of ICE-CANADA, the Canadian component of the International Corpus of English.
© 2017 Walter de Gruyter GmbH, Berlin/Boston