BY 4.0 license Open Access Published by De Gruyter Mouton August 17, 2020

Words, constructions and corpora: Network representations of constructional semantics for Mandarin space particles

  • Alvin Cheng-Hsien Chen

Abstract

In this study, we aim to demonstrate the effectiveness of network science in exploring the emergence of constructional semantics from the connectedness and relationships between linguistic units. With Mandarin locative constructions (MLCs) as a case study, we extracted constructional tokens from a representative corpus, including their respective space particles (SPs) and the head nouns of the landmarks (LMs), which constitute the nodes of the network. We computed edges based on the lexical similarities of word embeddings learned from large text corpora and the SP-LM contingency from collostructional analysis. We address three issues: (1) For each LM, how prototypical is it of the meaning of the SP? (2) For each SP, how semantically cohesive are its LM exemplars? (3) What are the emerging semantic fields from the constructional network of MLCs? We address these questions by examining the quantitative properties of the network at three levels: microscopic (i.e., node centrality and local clustering coefficient), mesoscopic (i.e., community) and macroscopic properties (i.e., small-worldness and scale-free). Our network analyses bring to the foreground the importance of repeated language experiences in the shaping and entrenchment of linguistic knowledge.

1 Introduction

Meaning in language is a complex subject. It is usually difficult to find one semantic theory that accounts for the meanings of all kinds of linguistic units (Riemer 2010). For example, meaning by reference cannot explain how abstract words develop senses; meaning by contrast often ends up with circular definitions; and meaning by features may raise further issues connected to categorization and feature universality. All of these approaches to meaning may face even more challenges if they share one assumption: meaning is fixed and stable. In recent years, a usage-based approach to language has found strong evidence against this assumption (Beckner et al. 2009; Bybee 2002, 2006; Diessel 2019). Most importantly, distributional semantics has become a central focus in the usage-based research paradigm, and this puts forth the idea that meaning is in flux and consistently distributed in the co-occurring contexts of the linguistic unit (Bybee 2007). The distributional properties of the linguistic unit may be taken as more realistic estimates of their emerging semantics. A systematic investigation of how a linguistic unit is connected to the other items often uncovers high degrees of semantic coherence underlying these co-occurrence patterns. This uncovered semantic coherence shows how different aspects of linguistic meaning are interactionally or cognitively motivated by “domain-general processes” (Diessel 2019: 5), i.e., factors that are not unique to language processing, such as processes of social interaction (e.g., Clark 1996; Tao 2003), conceptualization (e.g., frame-semantic knowledge, semantic prototype and categorization, image-schemas discussed in Stefanowitsch and Gries [2005]) and memory (e.g., automation in Bybee [2002]; priming in Pickering and Ferreira [2008]). Therefore, this study takes a dynamic view of meaning – linguistic meaning can be observed from the emerging semantic coherence of linguistic co-occurrence patterns, which in turn can be cognitively and/or interactionally grounded.

One repercussion of distributional semantics is the reliance on massive word distribution data from large and representative corpora for semantic analysis. Many corpus-based analyses have suggested that word co-occurrences contribute greatly to our understanding of lexical meanings. For example, analyzing the sentential contexts of pairs of nouns varying in their semantic similarity, Miller and Charles (1991) observed that the human ratings of semantic similarity significantly predict whether the pairs of words could be substituted into the same linguistic context. Collocation data have also been widely used in the study of lexical relations, such as (near) synonymy (high vs tall in Taylor [2003]), polysemy (senses of run in Gries [2006] and Glynn [2014]) and antonymy (Justeson and Katz 1991). This strong relationship between the meaning of a word and its collocates is often attributed to the Firthian hypothesis – “You shall know a word by the company it keeps” (Firth 1957: 11). This lexical approach of distributional semantics has been extended to include not only collocation patterns of words but also their morphosyntactic and/or discourse contexts, thus leading to a comprehensive behavioral profile of words in lexical semantic analyses (Divjak and Gries 2006; Hanks 1996; Liu 2010).

In the present study, we are particularly interested in semantics of constructions. Observations from the usage-based approach to grammar have also shown that not only lexical items but also grammatical patterns or constructions may take on their own semantic coherence (Goldberg 2006; Stefanowitsch and Gries 2003). In combination with the distributional semantics hypothesis, the semantics of a construction may be inferred by its co-occurring words in the open slots of the construction. The co-occurrence between lexical items and (schematic) grammatical construction is now referred to as collostruction (Stefanowitsch and Gries 2003). Words that often co-occur with a particular construction are referred to as collexemes of the construction. In collostruction analysis, constructional semantics is studied via analyzing the semantic relationships of the collexemes (Stefanowitsch and Gries 2003) or lexical pairs covarying in the open slots of the construction (Stefanowitsch and Gries 2005). This corpus method has been proven effective in identifying the semantics of not only morphosyntactic patterns (e.g., English un-participle in Schönefeld [2015] and go-V vs go and V in Wulff [2006]) but also abstract argument structures (e.g., various types of English causative constructions in Gilquin [2006]). This method can also be used to discover additional usage patterns of a construction that are often absent on the radar of an intuition-based analysis. For example, analyzing the complex transitive constructions – that is, transitive constructions with an additional complement (e.g., push air out of the way, riddle them clean) – in the International Corpus of English (ICE-GB), Hampe (2011) was able to uncover a distinctive semantic pattern when the construction takes a noun–phrase complement (e.g., appointed me a part-time Special Advisor).[1]

Studies on the distributions of collocation and collostruction have pointed to similar conclusions: (a) words and constructions may not differ much because both structures show consistent form-meaning pairings, and (b) their co-occurring contexts correlate with their semantics. Therefore, under this usage-based framework, the traditional modular boundary between lexicon and syntax may not necessarily be a clearly drawn line. Rather, the blurry lexicon–syntax boundary indicates that “grammatical knowledge represents a continuum on two dimensions, from the substantive to the schematic and from the atomic to the complex” (Croft and Cruse 2004: 255–256). This is now often referred to as the syntax–lexicon continuum.

Although distributional usage patterns have been highlighted as one of the main contributors to our understanding of the meanings of words and constructions, there are still a few important considerations with respect to how we can represent this syntax–lexicon network. First, in most of the previous studies, the discussion of the co-occurring words was often limited to a small set of high-ranking collocates or collexemes. Although these top collexemes may reveal a lot about the semantics of the construction, analysts often run into the question of how to arrive at a more holistic generalization from the collexeme list (i.e., the semantic fields underlying these top-ranking collexemes). Second, it is difficult to see the whole construction network by examining only the ranked list of collexemes. It should be noted that underlying this distributional approach lies the hypothesis that these co-occurring collexemes and constructions form a network in our mental grammar. Previous methods and analyses may not present the co-occurrence patterns observed in the corpus as intuitively as a network representation does.

Thus, our objective is to bridge these gaps by examining the constructional semantics not in terms of ranking lists of collexemes but deciphering the emerging semantic fields via establishing a constructional network. This study is built upon a few pioneering works, which have started to employ similar quantitative methods to explore the semantic coherence of the collexemes, including cluster analysis (Gries and Stefanowitsch 2010) and, more relevant to our current study, network analysis (Dekalo and Hampe 2017; Ellis et al. 2014). In particular, this usage-based theorizing of grammar as network has also been further developed by Diessel (2019), where he summarizes six types of relations for this “nested network model of grammar” (Diessel 2019: 11). At the level of linguistic signs, speakers’ knowledge of linguistic signs (i.e., lexemes and constructions) involves three types of relations that highlight particular aspects of their meaning and usage: symbolic (the form-meaning pairing of the sign), sequential (the syntagmatic relation of the sign), and taxonomic (the paradigmatic relation of the sign) relations. At the cross-sign level, linguistic signs are connected to each other forming a complex adaptive network. Of particular relevance to this study are the three types of links identified by Diessel at this higher level of the grammar network: lexical (links connecting lexemes at particular semantic fields), constructional (links connecting constructions of similar structural/functional correlates), and filler-slot (links connecting lexemes with particular slots of constructions) relations. As will be seen in Section 3, these three associative connections form the bases for the constructional network. This study brings the theory to practice by establishing the constructional network via the distributional patterns of constructions observed in the corpus.

In this study, we aim to demonstrate the network scientific method (Barabási 2016; Newman 2010) as a quantitative framework for the study of the triangular relationship among words, constructions and meanings. We present a case study on the functional variations of Mandarin locative constructions (MLCs; i.e., [zai NP li/nei/zhong/wai/shang/xia/qian/hou]), which encode various spatial relations, and we create its constructional network based on its distributional properties in the corpus. This study will show that a comprehensive analysis of the multilayered structural characteristics of the network advances our understanding of the constructional semantics, shedding further light on its methodological implications for usage-based grammar. Our network analyses also bring to the foreground the importance of repeated language experiences in the shaping and entrenchment of linguistic knowledge.

2 Network science

Networks are omnipresent in our daily lives, from the transportation networks in the real world to the friends we interact with in social space and the webpages we navigate in cyberspace. Each type of network consists of multiple components that interact with each other; the interrelationships between these individual components form their own unique patterns of complex behaviors. A network is not only a phenomenon but also a graph-theoretic object with sophisticated mathematical foundations. Building on mathematical graph theory, network science has developed as a quantitative method to study complex systems across diverse fields (Barabási 2016; Siew et al. 2019).

Language, a human activity situated in highly interactional contexts, is no doubt a “complex adaptive system” (Beckner et al. 2009: 2). Although network science has been widely used in research on structures at the neural level of the brain (see a comprehensive review in Siew et al. [2019]), its application to linguistic analyses is still limited. The strength of a network is its ability to account for the complex system based on the properties and patterns presented by the whole network.

Network science formalizes a knowledge system as a network, consisting of nodes and edges describing the entities and the relationships between them. Based on the well-established measures from graph theory, network science provides a quantitative framework for analysts to extract specific information related to the connectedness and relationships between various entities (Newman 2010). All these informative generalizations from the network can be visualized as an intuitive and comprehensive representation of the knowledge system. One of the key steps in network creation is to define what each node represents and how nodes are connected by the edges. With the resulting network, the network methodologies provide quantitative metrics at three different scales for analysts to systematically inspect the structures of the network (Barabási 2016; Siew et al. 2019). The microscopic level focuses on the properties of each node and/or edge; the mesoscopic level focuses on the subgroupings of the nodes emerging from the network; and the macroscopic level focuses on the birds-eye view of the network summarizing the structural characteristics of the entire network. For a more comprehensive review of all the quantitative metrics at these three scales, please refer to Siew et al. (2019) and Barabási (2016). We will introduce the quantitative measures used in this study in Section 4.

In language studies, a classic application is the network of words. For example, comparing the macroscopic properties of the lexical networks built from children’s and caretakers’ speech data, Ke and Yao (2008) observed a growth in the size and connectivity of the two networks, indicating language development. In their analyses, they based the edges on the collocation patterns of the words, which were computed based on lexical associations from the speech corpora. Veremyev et al. (2019) examined how a lexical network based on a human-built dictionary (e.g., WordNet) may differ from one based on machine-learning resources (i.e., word embeddings) at the three scales. Their results show that the latter uncovers more fine-grained small semantic clusters, thus providing richer information on lexical semantics. In addition, Vitevitch (2008) created a lexical network whose edges were based on the phonological similarities of words. He concluded that the network of the mental lexicon shows small-world characteristics and that this organization of words in the mental lexicon is closely connected to the acquisition and retrieval of phonological word forms.

Two important macroscopic properties have been consistently observed in many networks relating to human activities: the scale-free (Barabási and Albert 1999) and small-world structures (Watts and Strogatz 1998). The small-world structure refers to a tendency in which a network usually consists of several small communities or clusters where the within-community edges are much stronger than across-community ones. Almost all real-world networks exhibit this small-world structure, including biological brains (van den Heuvel and Sporns 2013), social networks (Lewis et al. 2008) and the Internet (Albert et al. 1999). Researchers have posited that the small-world structure optimizes the structure and organization of the nodes, which in turn maximizes the efficiency of information exchange in the network (cf. Siew et al. [2019] for more case study reviews).

The scale-free structure refers to a pattern in which the degree distribution of the nodes (the degree of a node being the number of edges attached to it) follows a power law, suggesting that there are usually only a few nodes with high degrees and many more with low degrees. This distribution is reminiscent of Zipf’s law in lexical distribution. The ubiquity of the scale-free structure has led to the hypothesis that high-degree nodes may be the basis for the development of the network. Barabási and Albert (1999) formulated this idea in their proposal of preferential attachment, which states that new nodes in a network tend to attach to nodes that already have a large number of links. Steyvers and Tenenbaum (2005) further suggested that the scale-free structure may be a mechanistic basis for language learning – children may develop their vocabulary by connecting new words to words with diverse meanings (i.e., words with multiple links to other words).

Use of network science in the study of constructional semantics is still in its early stages, often limited to metrics of particular aspects of the network. For example, investigating the verbal patterns in various English verb-argument constructions (VACs), Ellis et al. (2014) examined how the verbs named in free association and verbal fluency tasks were connected to the distributional patterns of VAC usage in corpus (e.g., verb type-token frequency, verb-construction collostruction strength). Most importantly, they utilized a node-level microscopic metric in network science (i.e., the betweenness centrality) to determine the prototypicality of the verbs based on each VAC semantic network. Their analysis showed a strong connection between the centrality metrics and the frequency of verb generation in association tasks. A similar application can also be found in Dekalo and Hampe (2017), who investigated the interrelationships of the verb collexemes in two German modal constructions, vermögen and bekommen. Creating the collexeme networks of each modal construction based on lexical similarities of German WordNet, they found that collexemes of higher centrality in the network played a more salient role in determining the central or prototypical member of each verbal category. Following Ellis et al. (2014) and Dekalo and Hampe (2017), we continue to put forth the potentials of network science and, going beyond the prior works, aim to demonstrate that network science is a systematic method for the study of constructional semantics. We will closely investigate the network’s properties at all three levels (i.e., microscopic, mesoscopic, and macroscopic) and will specifically contextualize their implications for the development of constructional semantics in relation to the emerging semantics of the usage-based grammar.

3 Data and method

In this study, we analyzed MLCs as a case study to present the potential of network science in accounting for the constructional semantics in our mental representation. We defined an MLC by a constructional schema, “zai + Noun Phrase (NP) + Space Particle,” where the head nouns in the NP referred to the landmarks (LMs) of a spatial relation encoded by the space particle (SP) in the MLC. Our data came from the Sinica Corpus (Huang and Chen 2010), which is one of the largest and most representative publicly available balanced corpora of contemporary Taiwan Mandarin. The collection of texts includes linguistic productions in diverse registers and genres, amounting to about 10 million words. In particular, we focused on eight SPs that could be used in the SP slot of the construction: shang上 ‘up’, xia下 ‘down’, qian前 ‘in front of’, hou後 ‘behind’, li裡 ‘in’, nei內 ‘in’, zhong中 ‘in’ and wai外 ‘outside’.

We first extracted all MLC tokens from the Sinica Corpus using the regular expressions of the parts-of-speech tags provided by the corpus. We manually scrutinized all constructional tokens for false positives of the automatic retrievals, and, at the same time, we manually checked the correctness of the automatic retrieval of the head noun in each MLC. In total, we identified 28,287 MLC tokens. We transformed each MLC into word pairs, the head noun and the locative SPs, which we then submitted to the analysis of covarying collexemes. We excluded tokens where the LM was a proper name or pronoun, resulting in a loss of 2,100 tokens (7.4% of the original dataset). We used the coll.analysis script written by Stefan Gries (Gries 2014) to compute the collostruction strength of each covarying collexeme pair. Finally, the collostruction strengths of these covarying collexeme pairs formed our quantitative bases for the creation of the networks (i.e., the links between LMs [head nouns] and SPs).
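To make the notion of collostruction strength more concrete, the following minimal sketch computes an association score for a single SP-LM pair from a 2×2 contingency table. It assumes, for illustration only, that the strength is the negative base-10 logarithm of a one-tailed Fisher exact p-value; the function names and the cell counts are hypothetical (only the grand total of 26,187 tokens follows from the figures reported above), and the sketch is not a reimplementation of the coll.analysis script.

```python
from math import comb, log10


def fisher_right_tail(a, b, c, d):
    """One-tailed Fisher exact p-value P(X >= a) for the table [[a, b], [c, d]].

    a: tokens with this LM and this SP      b: this LM with other SPs
    c: other LMs with this SP               d: other LMs with other SPs
    """
    row1, col1, n = a + b, a + c, a + b + c + d
    upper = min(row1, col1)
    favourable = sum(
        comb(row1, x) * comb(n - row1, col1 - x) for x in range(a, upper + 1)
    )
    return favourable / comb(n, col1)


def collostruction_strength(a, b, c, d):
    """Assumed strength measure: -log10 of the one-tailed p-value."""
    p = fisher_right_tail(a, b, c, d)
    return float("inf") if p == 0 else -log10(p)


# Hypothetical counts for one LM-SP pair (26,187 tokens in total).
strength = collostruction_strength(120, 30, 4880, 21157)
print(f"collostruction strength: {strength:.1f}")
```

Under this assumed measure, a strength above 3 corresponds to p < 0.001, so larger values indicate a stronger attraction between the landmark and the particle.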

In the constructional network of MLC, there were at least two types of nodes to consider: the LMs and the SPs. Therefore, we needed to model three types of links between these nodes in the network: (I) the links between LMs and SPs, (II) the links between the LMs, and (III) the links between the SPs. These three types of links correspond to the filler-slot, lexical and constructional relations, respectively, in Diessel’s grammar-as-network model (Diessel 2019). It should now be clear that the covarying collexeme analyses have already given us a statistical measure for Type I links. In the following paragraphs, we further introduce the computation of the other two types of links.

Type II links were essentially the semantic connections between nouns or, in general, words in Mandarin. The most intuitive method to measure lexical similarity on a large scale was to consult a dictionary (e.g., Chinese WordNet) to derive the pairwise lexical semantic similarity/distance (Dekalo and Hampe 2017; Ellis et al. 2014). However, unlike the English WordNet, the current access to the Chinese WordNet is still limited. An alternative was to resort to the unsupervised machine learning of lexical relations through the co-occurrences of words in a huge collection of text data. Specifically, we used the word-embedding models, which have now been effectively used in many computational processing tasks. Word-embedding models of lexical semantics were essentially a computational version of distributional semantics built upon the assumption that “difference of meaning correlates with difference of distribution” (Harris 1970: 785). In this computational model, we mapped words to numeric vectors. These numbers were based on the co-occurrence patterns of the words and their context words within a user-defined window. Word-embedding models learned the word vectors by maximizing the likelihood of either predicting the context words when given the target word (e.g., word2vec:Skipgram) or predicting the target word when given the context words (e.g., word2vec:CBOW). We used the pretrained Chinese word-embedding model in the fastText library created by Facebook’s AI Research lab.[2] The model was trained on the Internet corpora collected in the Common Crawl project and Wikipedia. For all the LM head nouns identified in our MLC concordances, we computed their pairwise cosine similarity based on their word embeddings, which in turn served as the numeric representation of their semantic connection. These similarities provided the foundation for our Type II links.
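As a rough illustration of what "context words within a user-defined window" means, the sketch below enumerates the (target, context) pairs that a window-based embedding model would be trained on. The function name, the window size and the toy token sequence are hypothetical; the pretrained fastText model used in this study was of course trained on far larger corpora and with additional machinery (e.g., subword information) that is not shown here.

```python
def context_pairs(tokens, window=2):
    """List every (target, context) pair within `window` positions of the target."""
    pairs = []
    for i, target in enumerate(tokens):
        start = max(0, i - window)
        stop = min(len(tokens), i + window + 1)
        for j in range(start, stop):
            if j != i:
                pairs.append((target, tokens[j]))
    return pairs


# Hypothetical pre-segmented toy sequence (pinyin used for readability).
toy_tokens = ["ta", "zai", "shu", "li", "zhaodao", "daan"]
for target, context in context_pairs(toy_tokens, window=1):
    print(target, "->", context)
```

Words that repeatedly share context words end up with similar vectors, which is what licenses the use of cosine similarity between embeddings as a proxy for semantic relatedness.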

Type III links concerned the semantic connection between various locative SPs. Following the tenets of distributional semantics, we based the similarities of SPs on their co-occurrence distribution with the head nouns – if two particles co-occur often with similar sets of LMs, they are conceptually more similar. We first created a locative-by-landmark co-occurrence table, as shown in Table 1. Each particle was then transformed into a numeric vector showing its degrees of co-occurrence with each LM. We then computed the pairwise cosine similarity of these SP vectors, which in turn formed the basis of our Type III links.

Table 1: Locative-by-landmark contingency table.

SP     guocheng   qingkuang    nian    xin      huanjing       shenghuo  yan    shehui     shu     shijian
       ‘process’  ‘condition’  ‘year’  ‘heart’  ‘environment’  ‘life’    ‘eye’  ‘society’  ‘book’  ‘time’
zhong  0.63       0            0       0.27     0.18           0.47      0.45   0.31       0.57    0
xia    0          0.82         0       0        0.21           0         0      0          0       0
shang  0          0            0       0        0              0         0      0          0       0
hou    0          0            0.05    0        0              0         0      0          0       0
qian   0          0            0.34    0        0              0         0      0          0       0
li     0          0            0       0.28     0.03           0         0.14   0.16       0       0.1
nei    0          0            0.26    0        0              0         0      0          0       0.65
wai    0          0            0       0        0              0         0      0          0       0

Note. The numbers in the cells refer to the collostruction strengths of the covarying collexemes.
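The computation behind the Type III links can be sketched with the ten-landmark excerpt in Table 1. The code below treats each row of the table as the vector of an SP and computes pairwise cosine similarities. Because the excerpt covers only ten LMs, the resulting values are purely illustrative and are not the similarities used in the actual network; returning 0 for an all-zero row is a convention assumed here, not one stated in the study.

```python
from math import sqrt

# Rows of Table 1 (columns: guocheng, qingkuang, nian, xin, huanjing,
# shenghuo, yan, shehui, shu, shijian).
table1 = {
    "zhong": [0.63, 0, 0, 0.27, 0.18, 0.47, 0.45, 0.31, 0.57, 0],
    "xia":   [0, 0.82, 0, 0, 0.21, 0, 0, 0, 0, 0],
    "shang": [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    "hou":   [0, 0, 0.05, 0, 0, 0, 0, 0, 0, 0],
    "qian":  [0, 0, 0.34, 0, 0, 0, 0, 0, 0, 0],
    "li":    [0, 0, 0, 0.28, 0.03, 0, 0.14, 0.16, 0, 0.1],
    "nei":   [0, 0, 0.26, 0, 0, 0, 0, 0, 0, 0.65],
    "wai":   [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
}


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = sqrt(sum(x * x for x in u))
    norm_v = sqrt(sum(y * y for y in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # assumed convention for particles with no attested LM here
    return dot / (norm_u * norm_v)


particles = list(table1)
for i, sp_a in enumerate(particles):
    for sp_b in particles[i + 1:]:
        score = cosine(table1[sp_a], table1[sp_b])
        if score > 0:
            print(f"{sp_a}-{sp_b}: {score:.2f}")
```

The same cosine measure applies to the Type II links, with the word-embedding vectors of the LM head nouns taking the place of the table rows.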

With all nodes and links ready, we created the MLC network using the relevant packages in R (igraph and visNetwork). We determined cutoff values of collostruction strengths and cosine similarities to remove weak edges and to avoid overplotting the network. Our current analysis included all covarying collexeme pairs whose collostruction strengths were significant (p < 0.001); we set the cutoff value for pairwise cosine similarities at 0.5. Because the network contains LM-LM and SP-SP links in addition to SP-LM links, we constructed an undirected weighted graph with the similarity/association metrics as the edge weights (all associations were normalized to the same range, i.e., [0, 1], for consistent aesthetic parameters in network visualization). The following section shows our network analyses and demonstrates how the structural correlates of the network can provide novel insight into important questions related to the constructional semantics.
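The edge-filtering step can be sketched as follows. This is a simplified stand-in in plain Python rather than the igraph/visNetwork workflow used in the study; the function name, the toy edges, the use of "greater than or equal to" for the cosine cutoff, and the choice to rescale each edge type by its maximum are all assumptions made for illustration, since the exact normalization procedure is not specified.

```python
def build_network(collo_edges, sim_edges, p_cutoff=0.001, sim_cutoff=0.5):
    """Return an undirected weighted graph as {node: {neighbor: weight}}.

    collo_edges: (sp, lm, strength, p_value) tuples for Type I links.
    sim_edges:   (node_a, node_b, cosine) tuples for Type II/III links.
    """
    kept_collo = [(u, v, s) for u, v, s, p in collo_edges if p < p_cutoff]
    kept_sim = [(u, v, s) for u, v, s in sim_edges if s >= sim_cutoff]
    graph = {}
    for edges in (kept_collo, kept_sim):
        if not edges:
            continue
        top = max(w for _, _, w in edges)
        for u, v, w in edges:
            weight = w / top  # assumption: rescale each edge type into (0, 1]
            graph.setdefault(u, {})[v] = weight
            graph.setdefault(v, {})[u] = weight
    return graph


# Hypothetical edges; none of these values come from the study.
collo = [
    ("zhong", "guocheng", 40.2, 1e-41),
    ("xia", "qingkuang", 80.4, 1e-81),
    ("li", "shu", 1.2, 0.06),        # not significant, so it is dropped
]
sims = [
    ("shu", "zazhi", 0.62),
    ("shu", "shehui", 0.31),         # below the 0.5 cutoff, so it is dropped
    ("li", "nei", 0.55),
]
graph = build_network(collo, sims)
for node, neighbors in sorted(graph.items()):
    print(node, {k: round(w, 2) for k, w in neighbors.items()})
```

Storing each edge in both directions keeps the graph undirected, and the thresholds are exposed as parameters so that the effect of different cutoffs on network density can be inspected.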

4 Network analysis of MLC

Our network analysis produced a constructional network with 509 nodes and 1,456 edges, as shown in Figure 1. The yellow nodes in Figure 1 refer to the eight subtypes of SPs in MLCs; the other nodes refer to the landmark collexemes that are attracted to the SPs based on the corpus distribution. In the following sections, we will discuss the functionalities of network science by highlighting what we can infer from three perspectives: macroscopic, mesoscopic and microscopic observations. (In Supplementary Data, more comprehensive and dynamic versions of the networks are provided with the inclusion of the Chinese pinyin and English gloss for each node.)

Figure 1: Semantic network of MLCs.

4.1 Macroscopic observations

At the macroscopic level, we examined the constructional network in terms of two key features: small-world and scale-free properties. The scale-free structure of a network describes a tendency in which the probability distribution of nodes’ degrees, P(k), follows a power law, suggesting that a few nodes with high degrees often dominate the network, whereas most of the nodes’ degrees are very low. Linguistically, the scale-free structure indicates that only a few linguistic units in the system are frequently connected to other linguistic units, whereas most of the other units have only a few connections. Figure 2 shows the cumulative degree distribution of our network. According to the Kolmogorov–Smirnov test, the degree distribution did not differ significantly from a power-law distribution (α = 2.849, D = 0.088, p = 0.512), suggesting that the network exhibits a scale-free structure.

Figure 2: The cumulative degree distribution of the network. (The degree of a node refers to the number of the links a node has. This graph shows the probability distribution of these node degrees over the entire network.)
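A cumulative degree distribution of the kind plotted in Figure 2 can be derived from a list of node degrees as sketched below. The degree list is hypothetical, and the exponent estimate uses a common maximum-likelihood approximation for discrete power laws, which is an assumption here rather than the fitting procedure reported in the study; the Kolmogorov–Smirnov goodness-of-fit test is not reproduced.

```python
from collections import Counter
from math import log


def cumulative_degree_distribution(degrees):
    """Return {k: P(K >= k)} for every degree value k that occurs."""
    n = len(degrees)
    counts = Counter(degrees)
    ccdf = {}
    remaining = n
    for k in sorted(counts):
        ccdf[k] = remaining / n
        remaining -= counts[k]
    return ccdf


def estimate_alpha(degrees, k_min=1):
    """Approximate MLE of the power-law exponent for degrees >= k_min."""
    tail = [k for k in degrees if k >= k_min]
    return 1 + len(tail) / sum(log(k / (k_min - 0.5)) for k in tail)


# Hypothetical degree sequence: many low-degree nodes, one hub.
toy_degrees = [1, 1, 1, 1, 2, 2, 3, 8]
ccdf = cumulative_degree_distribution(toy_degrees)
for k, prob in ccdf.items():
    print(k, prob)
print("alpha estimate:", round(estimate_alpha(toy_degrees), 2))
```

On a log-log plot, a power-law distribution shows up as an approximately straight line in this cumulative form, which is why the cumulative distribution is typically preferred for visual inspection.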

Watts and Strogatz (1998) suggest that the small-world structure of a network can be quantitatively determined based on two macroscopic network parameters: the global clustering coefficient (<C>) and the average path length (<l>). The clustering coefficient measures to what extent a node’s neighbors are also connected to each other. For example, given a node $i$ with degree $k_i$ and $E_i$ edges among its $k_i$ neighbors, the clustering coefficient of node $i$ is $E_i$ divided by the maximal number of edges possible among these $k_i$ nodes (i.e., $k_i(k_i - 1)/2$). The clustering coefficient of the whole network is the average of all the local clustering coefficients. The path length between two nodes is defined as the minimal number of edges required to connect them, and the average path length is the mean of these shortest path lengths over all pairs of nodes in the network. A network with a small-world structure is thus expected to combine a small average path length with a large global clustering coefficient. We adopted the quantitative metric, S (small-worldness) index, proposed by Humphries and Gurney (2008), to assess the small-worldness of our construction network. The formula is given as follows:

$$S = \frac{\langle C \rangle / \langle C_{rnd} \rangle}{\langle l \rangle / \langle l_{rnd} \rangle}$$

The metric is essentially a ratio of the network’s global clustering coefficient (<C>) to its average path length (<l>), with both indices normalized by the corresponding metrics computed on a random network with the same number of nodes (i.e., <C_rnd>, <l_rnd>). A network is considered small-world if S is larger than 1 (a more conservative threshold is 3). We computed the S index based on 1,000 random networks (i.e., Erdős–Rényi networks); the mean S score was 11.03 (CI = [10.90, 11.17]), suggesting a small-world structure. The small-world and scale-free properties of the constructional network may have implications for constructional semantics in language processing.
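The S index can be illustrated on a toy graph as sketched below. The code computes the mean local clustering coefficient and the average shortest path length of an unweighted graph, repeats both on random graphs, and averages the resulting ratios. Several choices are assumptions made for this sketch: the random graphs match both the number of nodes and the number of edges, nodes with fewer than two neighbors contribute a clustering coefficient of 0, path lengths are averaged over reachable pairs only, and 20 random graphs are used instead of 1,000 to keep the example fast. The ring lattice is hypothetical toy data.

```python
import random
from collections import deque
from itertools import combinations


def clustering(graph):
    """Mean local clustering coefficient (nodes with degree < 2 count as 0)."""
    total = 0.0
    for node, nbrs in graph.items():
        k = len(nbrs)
        if k < 2:
            continue
        links = sum(1 for a, b in combinations(nbrs, 2) if b in graph[a])
        total += 2 * links / (k * (k - 1))
    return total / len(graph)


def average_path_length(graph):
    """Mean shortest-path length over reachable node pairs, via BFS."""
    total, pairs = 0, 0
    for source in graph:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in graph[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs


def random_graph(n, m, rng):
    """Erdős–Rényi-style random graph with n nodes and exactly m edges."""
    graph = {i: set() for i in range(n)}
    for a, b in rng.sample(list(combinations(range(n), 2)), m):
        graph[a].add(b)
        graph[b].add(a)
    return graph


def small_world_index(graph, n_random=20, seed=1):
    """Mean of (C / C_rnd) / (l / l_rnd) over n_random random graphs."""
    rng = random.Random(seed)
    n = len(graph)
    m = sum(len(nbrs) for nbrs in graph.values()) // 2
    c, l = clustering(graph), average_path_length(graph)
    scores = []
    for _ in range(n_random):
        rnd = random_graph(n, m, rng)
        c_rnd, l_rnd = clustering(rnd), average_path_length(rnd)
        if c_rnd > 0:  # skip random graphs without any triangle
            scores.append((c / c_rnd) / (l / l_rnd))
    return sum(scores) / len(scores)


# Toy ring lattice: 20 nodes, each linked to its two nearest neighbors per side.
n_nodes = 20
ring = {i: {(i + d) % n_nodes for d in (-2, -1, 1, 2)} for i in range(n_nodes)}
s_index = small_world_index(ring)
print("C =", clustering(ring), "l =", round(average_path_length(ring), 3))
print("S > 1:", s_index > 1)
```

Averaging S over many random graphs, rather than relying on a single one, is what makes it possible to report a confidence interval for the index.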

The usage-based approach to grammar considers meanings to be an emergent property of a linguistic unit arising from its repeated language use. Every token use of a linguistic sign in experience plays a role in its development in speaker’s mental representation. The role of frequency thus emerges as one of the primary factors for language development (Bybee 2006; Goldberg 2006). Most importantly, when speakers encounter recurrent tokens of identical (i.e., high token-frequency) or similar features (i.e., high type-frequency), they are more likely to develop a higher level of abstract representation of the linguistic structure. The joint roles of frequency and similarity provide a necessary foundation for semantic categorization (Nosofsky 1988), which in turn gives rise to exemplar-based learning of the linguistic structures and their emerging semantics (Bybee 2006; Goldberg 2006).

In connection to the macroscopic properties of the constructional network, we posit that the small-world and scale-free structures may be quantitative indicators of the emergent semantic coherence of the constructional schema in speaker’s mental representation. Following the principle of semantic coherence practiced in the collostructional analysis (Gries and Stefanowitsch 2010; Stefanowitsch and Gries 2003), we share the same assumption that collexemes in one or various slots of a construction should be semantically compatible with the semantics of their connected construction. That is, a construction with clear semantics would attract groups of words indexing its relevant semantic fields. The small-world structure of the constructional network suggests that semantically connected nodes form small clusters in the network and the scale-free structure shows that these clusters feature only a few central nodes of multiple connections to other nodes. The emergence of these small clusters may be interpreted as the emerging semantic fields of the collexemes, and the small number of high-degree nodes can be taken as important exemplars, or “cognitive reference points” (Diessel 2019: 34) for the learning of the more novel usages of the constructions (Ellis et al. 2014; Hills et al. 2009; Nosofsky 1988). The small number of these exemplar nodes may also reflect the efficiency and importance of semantic prototypes in mental representation (Ellis et al. 2014). The following analyses of the microscopic and mesoscopic properties of the network will further substantiate our claims.

4.2 Microscopic observations

Microscopic analyses of the network concern the importance of a node in the network via measures of centrality. We first analyzed the local clustering coefficient of the SP nodes in the network (i.e., the proportion of pairs of an SP’s neighbors that were themselves connected by an edge). Similar to the global clustering coefficient used in Section 4.1, the local clustering coefficient is a metric that indicates the extent to which the neighbors of a node are interconnected – namely, whether a node’s neighbors are also neighbors of each other. In our case, if the neighbors of the SPs (i.e., the LMs of the MLC) are more interconnected, they are more semantically connected; thus, they are more likely to form a more centralized semantic field. Moreover, to examine the semantics of the co-occurring LMs with each SP, we also utilized a microscopic metric in network science, closeness centrality,[3] which was computed as the inverse of the average shortest path length to all the other nodes in the network. This metric is often used to characterize the importance of a node beyond its immediate neighbors. For all the immediate nodes connected to a particular SP, those of higher centrality values may be argued to indicate more prototypical uses of the spatial relation encoded by the particle (Dekalo and Hampe 2017; Ellis et al. 2014).
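The centrality measure defined above, the inverse of the average shortest path length from a node to all other nodes (commonly known as closeness centrality), can be sketched as follows. The toy graph and its edges are hypothetical, edge weights are ignored, and only reachable nodes enter the average; these simplifications are assumptions of the sketch and may differ from the settings used in the study.

```python
from collections import deque


def closeness(graph, node):
    """Inverse of the mean shortest-path length from `node` to reachable nodes."""
    dist = {node: 0}
    queue = deque([node])
    while queue:
        u = queue.popleft()
        for v in graph[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    others = len(dist) - 1
    if others == 0:
        return 0.0  # isolated node
    return others / sum(dist.values())


# Hypothetical toy network: the SP "li" with three LM neighbors.
toy = {
    "li": {"xin", "yan", "shu"},
    "xin": {"li", "yan"},
    "yan": {"li", "xin"},
    "shu": {"li", "zazhi"},
    "zazhi": {"shu"},
}

# Rank the LMs attached to "li" by their closeness in the whole network.
ranking = sorted(toy["li"], key=lambda lm: closeness(toy, lm), reverse=True)
for lm in ranking:
    print(lm, round(closeness(toy, lm), 3))
```

In this reading, the LMs at the top of the ranking for a given SP would be the candidates for its most prototypical landmarks.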

Figure 3 shows the local clustering coefficients of the eight SP nodes using circles of different sizes (statistics are provided in Table 2). An interesting observation can be made from the network. If we separate all SPs according to their denoted spatial relations into three types, horizontal direction (qian前 and hou後), vertical direction (shang上 and xia下) and containment (nei內, li裡, zhong中, wai外), those with the highest local clustering coefficient in each type (i.e., nei內, xia下 and qian前) were the space orientations that were experientially and interactionally more prominent than their respective counterparts on the same space dimension. For example, humans interact with the internal space of a container much more frequently than its external space. Humans experience a downward movement in life much more often than an upward one given the omnipresent rule of gravity. Entities in front of the landmark are more likely to be in the speaker’s focal attention than those in the back of the landmark. These higher degrees of recurrent bodily involvement in interaction may lead to a stronger semantic coherence of the LMs referenced in the spatial relations of these dimensions.

Figure 3: 
Semantic network with SP nodes sized according to their local clustering coefficient scores.

Table 2:

Local clustering coefficients of the space particles.

Space Particle | Local Clustering Coefficient
nei | 0.0677
xia | 0.0258
qian | 0.0235
li | 0.0228
shang | 0.0224
hou | 0.0201
zhong | 0.0191
wai | 0.0095

The decreasing local clustering coefficients of the SPs from nei內 to wai外 in Table 2 indicate that their co-occurring LMs cover entities that are increasingly semantically heterogeneous. Among these particles, wai外 demonstrates the smallest clustering coefficient in our network, which is an indicator of little interconnectedness of its neighboring nodes – high semantic heterogeneity of its co-occurring LMs. Alternatively, we may interpret this low score as an indicator of the more extended (and productive) use of wai外 in Mandarin.

Figure 4 shows the sizes of the landmark nodes according to their PageRank centrality scores, which in turn help identify the prototypical LMs relating to each SP. As discussed earlier, among all SPs, nei內 is the most semantically coherent SP, attracting collexemes that are the most interconnected compared to the collexemes of the other particles. A closer look at the collexemes of nei內 indicates that most of them are connected to time-related concepts, such as time units (e.g., fenzhong分鐘 ‘minute’, nian年 ‘year’, xiaoshi小時 ‘hour’, tian天 ‘day’, miaozhong秒鐘 ‘second’, yue月 ‘month’) and durations (e.g., qijian期間 ‘period’, shijian時間 ‘time’, shiqi時期 ‘period’, xingqi星期 ‘week’). The particle xia下 ranks second in semantic coherence, attracting nouns referring to condition/control (e.g., qingkuang情況 ‘condition’, zhuangkuang狀況 ‘situation’, kaolu考慮 ‘consideration’) or influence/guidance (e.g., tuidong推動 ‘promotion’, dailing帶領 ‘leadership’, bangzhu幫助 ‘help’, xiezhu協助 ‘assistance’, yindao引導 ‘guidance’, yinling引領 ‘introduction’, yingxiang影響 ‘influence’). The third SP in the ranking of the local clustering coefficient is qian前, which attracts LMs relating to event aspects (e.g., kaishi開始 ‘start’, shishi實施 ‘implementation’, jieshu結束 ‘finish’) or time-related concepts (e.g., nian年 ‘year’, yue月 ‘month’, niandi年底 ‘year end’, yuedi月底 ‘month end’, guonian過年 ‘New Year’).
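To make the notion of PageRank-based prototypicality concrete, the sketch below ranks the landmark neighbors of one SP by a plain power-iteration PageRank. It assumes an unweighted, undirected toy network in which every node has at least one neighbor; the edges, the damping factor of 0.85 and the function name pagerank are illustrative assumptions, not the settings or the software used in the study.

```python
def pagerank(adj, damping=0.85, iterations=100):
    """Power-iteration PageRank on an undirected, unweighted graph.

    Assumes every node has at least one neighbor (no dangling nodes).
    """
    nodes = list(adj)
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(iterations):
        new_rank = {}
        for v in nodes:
            # Each neighbor u passes an equal share of its score to v.
            incoming = sum(rank[u] / len(adj[u]) for u in adj[v])
            new_rank[v] = (1 - damping) / n + damping * incoming
        rank = new_rank
    return rank

# Toy network (hypothetical edges): nian is linked to all other landmarks of nei.
edges = [
    ("nei", "nian"), ("nei", "yue"), ("nei", "tian"), ("nei", "qijian"),
    ("nian", "yue"), ("nian", "tian"), ("nian", "qijian"),
]

adj = {}
for u, v in edges:
    adj.setdefault(u, set()).add(v)
    adj.setdefault(v, set()).add(u)

rank = pagerank(adj)
# The landmark with the highest score is read as the most prototypical one.
prototypical = max(sorted(adj["nei"]), key=rank.get)
print(prototypical)
```

Under these assumptions the best-connected landmark receives the highest score among the neighbors of the SP, which is the sense in which a high-centrality LM is treated as a prototypical exemplar.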

Figure 4: 
Semantic network with LM nodes sized according to their PageRank centrality values.

The high degrees of semantic coherence in nei內 and xia下 constructions among all particles may also be connected to a crosslinguistic observation that CONTAINMENT and SUPPORT are the two cardinal space concepts on which language-particular space particles are conventionalized. Languages often encode spatial relations with a closed set of spatial terms. Both typological and developmental studies have shown that their semantics can be fractionated into a few semantic primitives,[4] such as concepts of proximity, contiguity and containment, thus being topological in nature (Bowerman and Choi 2001; Feist 2008; Jackendoff 1983; Levinson et al. 2003; Miller and Johnson-Laird 1976; Vandeloise 2003, 2010). Although views may differ as to which semantic primitives are universal and prelinguistic, previous researchers have highlighted the role of CONTAINMENT and SUPPORT in space conceptualization. For example, children demonstrated understanding of a range of extensional spatial concepts in their prelinguistic stages, such as predicting the projected trajectory of object movement (Spelke et al. 1992) and knowing that a midair-positioned object would fall or that all containers need a bottom (Baillargeon 1995). Studies on the acquisition of linguistic spatial terms have also shown a similar tendency in children’s cognitive development (Bowerman and Choi 2001; Piaget and Inhelder 1956). Children often start with functional and topological notions such as containment, contiguity, support, and occlusion and continue on with the concept of proximity (e.g., next to, beside, between), with projective relations (e.g., in front of, behind) being the most challenging concepts, acquired at a rather late stage.

In addition, typological similarities in spatial constructions also provide support for the preexistence of the nonlinguistic spatial categories. Although the language-particular mappings between space terms and the scenes may not be one-to-one, several spatial relations are consistently grammaticalized in languages. These commonalities may suggest that space conceptualization works along with similar biological and environmental constraints (e.g., upright posture, front-back asymmetry, gravity), which are the conditions and experiences shared by all human beings (Clark 1973). For example, analyzing 38 languages from 25 different families, Melissa Bowerman and her colleagues (Bowerman 1996; Bowerman and Choi 2001; Bowerman and Pederson 2003) established a typological research framework for space with the elicitation-based method using the topological relations picture series (Bowerman 1996). The series consisted of a collection of pictures describing various spatial relations, based on which the crosslinguistic elicitation of the spatial terms was conducted. One of the major contributions from Bowerman et al. is the observation of the CONTAINMENT–SUPPORT continuum, on which various spatial concepts are found to be gradient in the sense of being more containment-like or support-like (Bowerman and Choi 2001). Speakers’ built-in sensitivities to the fine-grained distinction of spatial concepts on this continuum may be in constant interaction with a variety of characteristics in language-particular inputs. For instance, McDonough et al. (2003) found that infants may have the conceptual readiness for learning either language-particular or language-general spatial concepts, but adults may become more indifferent to some spatial relations if their native language does not encode them. Their analysis shows that Korean-speaking adults still preserve the cognitive sensitivities toward the contrast between tight containment and loose containment (which are still two distinct categories, kkita and nehta, in Korean), but the English-speaking adults do not (both categories are grammaticalized as ‘in’ in English).

Both developmental and typological observations suggest that CONTAINMENT and SUPPORT may be the two most distinct space concepts in conceptualizing the real-world spatial relations. Following the usage-based hypothesis, we may posit that concepts on these two extremes should bear the most distinctive semantic coherence in a language as well. This hypothesis is consistent with our current microscopic observations of the SP nodes in the network, where nei內 (a particle of CONTAINMENT) and xia下 (a particle of SUPPORT) were the SPs of the highest local clustering coefficient scores.

4.3 Mesoscopic observations

At the mesoscopic level, network science allows us to analyze the relationships between nodes and identify the subgrouping patterns of nodes based on their edge interconnectedness. These subgroups are often referred to as communities in network science. In language networks, communities can be used to uncover semantic fields (Borge-Holthoefer and Arenas 2010; Dekalo and Hampe 2017). In our mesoscopic analysis, we examined the communities identified by network science using the spin-glass algorithm for each subnetwork of the MLC.[5] As we were concerned with the emerging semantic fields in each SP construction, we first subset the immediate neighbors of each SP and ran the community detection for each subnetwork. Figure 5 and Figure 6 show the communities identified for each SP subnetwork. We particularly highlighted communities consisting of at least four nodes in colors for semantic analysis. We identified nodes in gray as sparse or one-node communities, which may contribute little to the generalization of semantic fields. To more closely inspect the distribution of the communities in each SP network, we focused on the larger communities and identified the most prototypical member of each community according to the centrality metrics (i.e., PageRank centrality). We arranged the subnetworks in Figure 5 and Figure 6 according to the local clustering coefficients of the SPs presented in Section 4.2. (Enriched dynamic versions of the graphs are available in Supplementary Data.)
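The procedure described above (subsetting the immediate neighbors of an SP, detecting communities, keeping those with at least four nodes, and naming the most central member of each) can be sketched as follows. Note that this sketch does not implement the spin-glass algorithm: connected components are used purely as a simple stand-in for community detection. The edges, the centrality scores and all function names are hypothetical, and excluding the SP node itself from the subnetwork is also an assumption.

```python
def build_adjacency(edges):
    """Undirected adjacency sets from an edge list."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

def neighbor_subgraph(adj, sp):
    """Subgraph induced by the immediate neighbors of an SP (SP excluded)."""
    keep = adj[sp]
    return {v: adj[v] & keep for v in keep}

def components(sub):
    """Connected components: a stand-in for spin-glass communities."""
    seen, groups = set(), []
    for start in sub:
        if start in seen:
            continue
        stack, group = [start], set()
        while stack:
            v = stack.pop()
            if v not in group:
                group.add(v)
                stack.extend(sub[v] - group)
        seen |= group
        groups.append(group)
    return groups

def major_communities(groups, centrality, min_size=4):
    """Keep groups of at least min_size nodes, keyed by their top-scoring member."""
    return {max(g, key=centrality.get): g for g in groups if len(g) >= min_size}

# Toy data (hypothetical edges and centrality scores, not the study's values).
landmarks = ("nian", "yue", "tian", "xiaoshi", "jiaoshi", "xiaoyuan")
edges = [("nei", lm) for lm in landmarks]
edges += [("nian", "yue"), ("yue", "tian"), ("tian", "xiaoshi"), ("jiaoshi", "xiaoyuan")]
centrality = {"nian": 0.30, "yue": 0.20, "tian": 0.25,
              "xiaoshi": 0.10, "jiaoshi": 0.08, "xiaoyuan": 0.07}

adj = build_adjacency(edges)
sub = neighbor_subgraph(adj, "nei")
major = major_communities(components(sub), centrality)
print(major)
```

In this toy case, only the four time-related landmarks form a group large enough to count as a major community, and the two-node group would correspond to the sparse communities left in gray.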

Figure 5: 
Communities of the subnetworks for SPs nei, xia, qian, and li.

Figure 6: 
Communities of the subnetworks for SPs shang, hou, zhong, and wai.

The nei-network in Figure 5 confirms our earlier observations that the co-occurring landmark nodes of nei內 show high degrees of semantic coherence, with three major communities emerging from the subnetwork. Their respective most prototypical members include nian年 ‘year’, fenzhong分鐘 ‘minute’ and tian天 ‘day’. All these major communities are connected to the temporal domain. The next subnetwork is the xia-network in Figure 5, where we have detected 10 major communities. Their respective most prototypical members include qingkuang情況 ‘condition’, dailing帶領 ‘leadership’, waibiao外表 ‘appearance’, tuidong推動 ‘promotion’, kaolu考慮 ‘consideration’, tizhi體制 ‘regime’, bangzhu幫助 ‘help’, longzhao籠罩 ‘cover’, zhichi支持 ‘support’ and yingxiang影響 ‘influence’, featuring the semantic fields of CONTROL, CONDITION, GUIDANCE, and SUPPORT. The third subnetwork in the rank is the qian-network, where we have identified two major communities with nian年 ‘year’ and kaishi開始 ‘start’ as their respective most prototypical members. The qian-network starts to show signs of metaphorical extension in its attracted LMs from time-related concepts to more event-related aspectual ones. The li-network includes only one major community whose most prototypical member is xin心 ‘heart’. Different from the CONTAINMENT particle nei, li seems to be more connected to a bodily and concrete container, which can be further supported by the other nodes in this community, such as yan眼 ‘eye’, xinling心靈 ‘mind’, xi戲 ‘drama/play’ and guodu國度 ‘territory’.

The remaining four subnetworks in Figure 6 consistently demonstrate a wider variety in the semantics of their major communities identified by network science. The shang-network includes 12 major communities, and their respective most prototypical members are zhengzhi政治 ‘politics’, lilun理論 ‘theory’, wenti問題 ‘problem’, sheji設計 ‘design’, kexue科學 ‘science’, benzhi本質 ‘essence’, fangfa方法 ‘method’, fazhan發展 ‘development’, dianshi電視 ‘TV’, pinzhi品質 ‘quality’, taidu態度 ‘attitude’ and fen份 ‘sake’, suggesting much more semantic heterogeneity of its LMs (compared to its counterpart xia下). Similarly, the hou-network also shows a greater semantic variation in its connected LMs than its counterpart qian前. Five major communities have been identified in the hou-network, whose most prototypical members include jieshu結束 ‘finish’, tian天 ‘day’, zhiliao治療 ‘treatment’, xiake下課 ‘finish of class’ and queding確定 ‘confirmation’. It should be noted that the hou-network still bears considerable resemblance to the qian-network in that both particles are connected to time-related concepts (e.g., tian) and event-aspectual concepts (e.g., jieshu).

The final two subnetworks, the zhong- and wai-networks, both fall into the space category of CONTAINMENT. Our earlier observations show that the nei-network is predominantly connected to temporal concepts, and the li-network shows more connection with bodily and concrete entities. The zhong-network, whose particle is near-synonymous with nei and li, demonstrates a much wider semantic range in its co-occurring LMs. We have identified 10 major communities in the zhong-network, whose most prototypical members include lucheng旅程 ‘journey’, guocheng過程 ‘process’, bisai比賽 ‘competition’, diaocha調查 ‘investigation’, anli案例 ‘case’, dizheng地震 ‘earthquake’, xiaoshuo小說 ‘novel’, gesheng歌聲 ‘singing voice’, tanhua談話 ‘talking’ and jihua計畫 ‘plan’. We start to see more nodes relating to event nouns (e.g., bisai, diaocha, dizheng, tanhua). In fact, these event-like lexical items can even be used as verbs without any morphological changes in Mandarin, indicating the high dynamicity of their lexical semantics. The distribution of the communities in the zhong-network indicates that the concept of CONTAINMENT has been extended considerably in conceptualizing abstract relations involving less tangible and more dynamic LMs. Similar observations of the semantic differences among these CONTAINMENT-related particles can also be found in Su and Chen (2019).

The last subnetwork, that of wai外, a counterpart of nei內, li裡 and zhong中, shows little semantic coherence in our current community analysis. The absence of major communities in the wai-network is consistent with our earlier observation that the wai-construction has the lowest local clustering coefficient in the MLC network (cf. Section 4.2). Within the CONTAINMENT domain, the particle wai外 denotes the semantic category with less experiential and interactional involvement in life than its counterparts (li裡, nei內, zhong中).

5 Conclusion

In this study, we have demonstrated the usefulness of network representation in the study of linguistic constructions and its potential as a quantitative framework for the account of constructional semantics. Following the central tenet shared by usage-based grammarians, we subscribe to the view that linguistic meaning may be best manifested by the semantic coherence observed from the recurrent contexts of the linguistic form. With MLC as a case study, we applied the collostructional analysis to first identify the covarying pairs of LMs and SPs within this constructional schema. We continued to create a semantic network representation of the MLC with these covarying pairs as the nodes and established three types of edges highlighting lexical, constructional and filler-slot relations between the nodes in the network. We then analyzed the microscopic, mesoscopic and macroscopic characteristics of the network. These multilayered structural correlates of the network have shown consistent patterns that may be taken as indicators of emerging semantics. Table 3 summarizes the network properties examined in this study and their implications for usage-based grammar.

Table 3:

Summary of network properties and their implications for usage-based grammar.

Levels | Network Properties | Usage-based Grammar
Macroscopic | Small-worldness | Emergence of semantic fields in the network
Macroscopic | Scale-free | Emergence of exemplars in the network
Microscopic | Local Clustering Coefficient | Degrees of semantic coherence for the construction nodes
Microscopic | PageRank Centrality | Degrees of prototypicality for the lexeme nodes
Mesoscopic | Community Detection | Delineation of the emerging semantic fields from the network

At the macroscopic level, our analysis shows that the construction network exhibits small-world and scale-free characteristics, suggesting that semantically similar clusters arise from the network (i.e., small-world) and that the high-degree nodes are limited in number (i.e., scale-free). These two important characteristics are consistent with the hypothesis of frequency-based exemplar learning in usage-based grammar. At the microscopic level, the local clustering coefficients of the SPs highlight the conceptual prominence of particular spatial orientations (e.g., nei內, xia下 and qian前), which can be attributed to their interactional importance in bodily experiences. Finally, the node-wise centrality measures, along with the community detection at the mesoscopic level, further contribute to a better understanding of the nature of the emerging semantic fields in the construction network.

Although the construction networks reveal the general semantic fields of the MLC and the particular semantic preferences of each SP construction, we may still see a few cases that are difficult to interpret (i.e., communities in gray in Figure 5 and Figure 6). However, we argue that these cases should not undermine the potential of the corpus-based network representation. Several limitations need to be considered carefully before the value of this quantitative method is dismissed.

First of all, the interpretability of the current MLC network may be partly connected to the fact that we have only examined one construction in Mandarin, which theoretically should be linked to other constructional networks. The quantitative metrics examined in this study may need to be more critically assessed by further contextualizing the MLC in the bigger “nested network” (Diessel 2019: 11). For example, based on the six types of links proposed by Diessel (2019), all the LM lexemes included in our current network are also connected to many other constructions that take noun phrases via filler-slot relations; our current MLC is also connected to other constructions that are either more or less abstract via taxonomic relations; similarly, the MLC is also connected to the other constructions (e.g., Verb + MLC, MLC + Verb) in connected discourse via sequential relations. With the network science method, future study may further develop more complex networks by including constructions connected via more diverse types of relations, which may provide a more comprehensive account of the grammar network.

Second, the creation of the constructional network is sensitive to the nature of the corpus because all the links are established based on the corpus distribution of the words and constructions. The Sinica Corpus we used for the present study was constructed more than a decade ago. Some of the semantic fields identified in our network may not necessarily reflect the semantics of the construction per se but instead reflect the subject matters that have been prominent in the times when the texts were produced and collected.

Finally, network science, in combination with the corpus-based co-occurrence information, provides a flexible framework to model our linguistic knowledge as an emergent structure or, alternatively, “a biological organism” (Givón 1993: 2). Critics may be concerned with the determination of the cutoff values for links in network creation. We acknowledge the fact that various cutoff values or statistical methods for the node associations may produce a slightly different version of the network. However, we would like to stress that this flexibility could in fact be a strength of this network method. Linguistic behaviors involve idiosyncratic variations at many different levels due to individual variation in many cognitive dimensions, such as creativity, personality or working memory. We are not claiming that the network representation produced in our analysis would correspond to the mental grammar shared by all speakers. Instead, we believe that when the cutoff values are set higher (i.e., the links in the network are more strictly defined by stronger or tighter associations), we may be able to approach the core grammatical representations shared by the majority of the speakers.
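The effect of the cutoff value on the resulting network can be illustrated with a short Python sketch. The similarity scores, the two cutoff values and the function name build_edges are invented for illustration and do not correspond to the thresholds or association measures used in the study.

```python
def build_edges(similarity, cutoff):
    """Keep only the node pairs whose association score reaches the cutoff."""
    return {pair for pair, score in similarity.items() if score >= cutoff}

# Hypothetical pairwise similarity scores between landmark nouns.
similarity = {
    ("nian", "yue"): 0.82,
    ("nian", "tian"): 0.74,
    ("tian", "xiaoshi"): 0.61,
    ("xin", "yan"): 0.58,
    ("nian", "xin"): 0.35,
}

loose = build_edges(similarity, cutoff=0.5)   # more permissive network
strict = build_edges(similarity, cutoff=0.7)  # only the tightest associations

print(len(loose), len(strict))
print(strict <= loose)  # the stricter network is contained in the looser one
```

Because raising the cutoff can only remove edges, the stricter network is always a subnetwork of the looser one, which is one way to read the claim that higher cutoff values approach a shared core of associations.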

The rationale underlying the algorithm of network science bears a great resemblance to the usage-based approach to language development. Various cutoff values or association metrics in creating the network may be seen as an analogy of the unique language experiences of different speakers. Although linguistic patterns or structures of repeated use are often more likely to register in speakers’ mental grammar, speakers may vary in their cognitive capacity and sensitivity to make effective generalizations. This network scientific perspective offers a new and useful set of quantitative tools to increase our understanding of how language develops meanings from a complex network in flux.


Corresponding author: Alvin Cheng-Hsien Chen, National Taiwan Normal University, Department of English, No 162, Section 1, Heping E. Rd., Taipei 106, Taiwan, E-mail:

Funding source: Ministry of Science and Technology, Taiwan

Award Identifier / Grant number: 104-2410-H-003-134

Award Identifier / Grant number: 108-2410-H-003-023-MY2

Acknowledgement

This research was supported by Taiwan Ministry of Science and Technology (104-2410-H-003-134 and 108-2410-H-003-023-MY2).

References

Albert, Réka, Hawoong Jeong & Albert-László Barabási. 1999. Internet: Diameter of the world-wide web. Nature 401(6749). 130–131. https://doi.org/10.1038/43601.

Baillargeon, Renee. 1995. A model of physical reasoning in infancy. In Lewis P. Lipsitt & Carolyn K. Rovee-Collier (eds.), Advances in infancy research, 305–371. Norwood, NJ: Ablex Publishing.

Barabási, Albert-László & Réka Albert. 1999. Emergence of scaling in random networks. Science 286(5439). 509–512. https://doi.org/10.1126/science.286.5439.509.

Barabási, Albert-László. 2016. Network science. Cambridge: Cambridge University Press.

Beckner, Clay, Richard Blythe, Joan Bybee, Morten H. Christiansen, William Croft, Nick C. Ellis, John Holland, Jinyun Ke, Diane Larsen-Freeman & Tom Schoenemann. 2009. Language is a complex adaptive system: Position paper. Language Learning 59(s1). 1–26. https://doi.org/10.1111/j.1467-9922.2009.00534.x.

Borge-Holthoefer, Javier & Alex Arenas. 2010. Semantic networks: Structure and dynamics. Entropy 12. 1264–1302. https://doi.org/10.3390/e12051264.

Bowerman, Melissa & Soonja Choi. 2001. Shaping meanings for language: Universal and language-specific in the acquisition of spatial semantic categories. In Melissa Bowerman & Stephen C. Levinson (eds.), Language acquisition and conceptual development, 475–511. Cambridge, UK: Cambridge University Press. https://doi.org/10.1017/CBO9780511620669.018.

Bowerman, Melissa & Eric Pederson. 2003. Crosslinguistic perspectives on topological spatial relationships. Eugene and Nijmegen: University of Oregon and Max Planck Institute for Psycholinguistics.

Bowerman, Melissa. 1996. Learning how to structure space for language: A crosslinguistic perspective. In Paul Bloom, Merrill F. Garrett, Lynn Nadel & Mary A. Peterson (eds.), Language and space, 385–436. Cambridge, MA: MIT Press.

Bybee, Joan. 2002. Sequentiality as the basis of constituent structure. In Talmy Givón & Bertram F. Malle (eds.), The evolution of language out of prelanguage, 109–134. Amsterdam: John Benjamins. https://doi.org/10.1075/tsl.53.07byb.

Bybee, Joan. 2006. From usage to grammar: The mind’s response to repetition. Language 82. 711–733. https://doi.org/10.1353/lan.2006.0186.

Bybee, Joan. 2007. Frequency of use and the organization of language. Oxford, NY: Oxford University Press. https://doi.org/10.1093/acprof:oso/9780195301571.001.0001.

Clark, Herbert H. 1973. Space, time, semantics, and the child. In Timothy E. Moore (ed.), Cognitive development and the acquisition of language, 27–63. New York, NY: Academic Press. https://doi.org/10.1016/B978-0-12-505850-6.50008-6.

Clark, Herbert H. 1996. Using language. Cambridge: Cambridge University Press. https://doi.org/10.1017/CBO9780511620539.

Croft, William & D. Alan Cruse. 2004. Cognitive Linguistics. Cambridge: Cambridge University Press. https://doi.org/10.1017/CBO9780511803864.

Dekalo, Volodymyr & Beate Hampe. 2017. Networks of meanings: Complementing collostructional analysis by cluster and network analyses. Yearbook of the German Cognitive Linguistics Association 5. 143–176. https://doi.org/10.1515/gcla-2017-0011.

Diessel, Holger. 2019. The grammar network: How linguistic structure is shaped by language use. Cambridge, UK: Cambridge University Press. https://doi.org/10.1017/9781108671040.

Divjak, Dagmar & Stefan Th. Gries. 2006. Ways of trying in Russian: Clustering behavioral profiles. Corpus Linguistics and Linguistic Theory 2(1). 23–60. https://doi.org/10.1515/cllt.2006.002.

Ellis, Nick C., Matthew Brook O’Donnell & Ute Römer. 2014. The processing of verb-argument constructions is sensitive to form, function, frequency, contingency, and prototypicality. Cognitive Linguistics 25(1). 55–98. https://doi.org/10.1515/cog-2013-0031.

Feist, Michele I. 2008. Space between languages. Cognitive Science 32. 1177–1199. https://doi.org/10.1080/03640210802152335.

Firth, John Rupert. 1957. Modes of meaning. In Frank R. Palmer (ed.), Papers in linguistics 1934–1951, 190–215. Oxford: Oxford University Press.

Gilquin, Gaëtanelle. 2006. The verb slot in causative constructions: Finding the best fit. Constructions S1(3). 1–46.

Givón, Talmy. 1993. English grammar: A function-based introduction. Amsterdam: John Benjamins. https://doi.org/10.1075/z.engram2.

Glynn, Dylan. 2014. The many uses of run. In Dylan Glynn & Justyna A. Robinson (eds.), Corpus methods for semantics: Quantitative studies in polysemy and synonymy, 117–144. Amsterdam: John Benjamins. https://doi.org/10.1075/hcp.43.05gly.

Goldberg, Adele E. 2006. Constructions at work: The nature of generalization in language. Oxford: Oxford University Press. https://doi.org/10.1093/acprof:oso/9780199268511.001.0001.

Gries, Stefan Th. & Anatol Stefanowitsch. 2010. Cluster analysis and the identification of collexeme classes. In Sally Rice & John Newman (eds.), Empirical and experimental methods in cognitive/functional research, 73–90. Stanford, CA: CSLI Publications.

Gries, Stefan Th. 2006. Corpus-based methods and cognitive semantics: The many meanings of to run. In Stefan Th. Gries & Anatol Stefanowitsch (eds.), Corpora in cognitive linguistics: Corpus-based approaches to syntax and lexis, 57–99. Berlin & New York: Mouton de Gruyter. https://doi.org/10.1515/9783110197709.57.

Gries, Stefan Th. 2014. Coll.analysis 3.5. A script for R to compute perform collostructional analyses. Available at: https://www.linguistics.ucsb.edu/faculty/stgries/teaching/groningen/.

Hampe, Beate & Joseph E. Grady. 2005. From perception to meaning: Image schemas in cognitive linguistics. Berlin: Walter de Gruyter. https://doi.org/10.1515/9783110197532.

Hampe, Beate. 2011. Discovering constructions by means of collostruction analysis: The English denominative construction. Cognitive Linguistics 22(2). 211–245. https://doi.org/10.1515/cogl.2011.009.

Hanks, Patrick. 1996. Contextual dependency and lexical sets. International Journal of Corpus Linguistics 1(1). 75–98. https://doi.org/10.1075/ijcl.1.1.06han.

Harris, Zellig S. 1970. Papers in structural and transformational linguistics. Dordrecht: Reidel. https://doi.org/10.1007/978-94-017-6059-1.

Hills, Thomas T., Mounir Maouene, Josita Maouene, Adam Sheya & Linda Smith. 2009. Longitudinal analysis of early semantic networks: Preferential attachment or preferential acquisition? Psychological Science 20. 729–739. https://doi.org/10.1111/j.1467-9280.2009.02365.x.

Huang, Chu-Ren & Keh-jiann Chen. 2010. Academia sinica balanced corpus of modern Chinese 4.0. Taipei, Taiwan: Academia Sinica.

Humphries, Mark D. & Kevin Gurney. 2008. Network ‘small-world-ness’: A quantitative method for determining canonical network equivalence. PloS One 3(4). e0002051. https://doi.org/10.1371/journal.pone.0002051.

Jackendoff, R. 1983. Semantics and cognition. Cambridge, MA: MIT Press.

Justeson, John S. & Slava M. Katz. 1991. Co-occurrences of antonymous adjectives and their contexts. Computational Linguistics 17(1). 1–19.

Ke, Jinyun & Yao Yao. 2008. Analysing language development from a network approach. Journal of Quantitative Linguistics 15. 70–99. https://doi.org/10.1080/09296170701794286.

Levinson, Stephen C, Sérgio Meira & The Language and Cognition Group. 2003. ’Natural concepts’ in the spatial topologial domain – adpositional meanings in crosslinguistic perspective: An exercise in semantic typology. Language 79(3). 485–516. https://doi.org/10.1353/lan.2003.0174.10.1353/lan.2003.0174Search in Google Scholar

Lewis, Kevin, Jason Kaufman, Marco Gonzalez, Andreas Wimmer & Nicholas Christakis. 2008. Tastes, ties, and time: A new social network dataset using Facebook.com. Social Networks 30. 330–342. https://doi.org/10.1016/j.socnet.2008.07.002.10.1016/j.socnet.2008.07.002Search in Google Scholar

Liu, Dilin. 2010. Is it a chief, main, major, primary, or principal concern? A corpus-based behavioral profile study of the near-synonyms. International Journal of Corpus Linguistics 15(1). 56–87. https://doi.org/10.1075/ijcl.15.1.03liu.10.1075/ijcl.15.1.03liuSearch in Google Scholar

Mandler, Jean M. & Cristóbal Pagán Cánovas. 2014. On defining image schemas. Language and Cognition 6(4). 510–532. https://doi.org/10.1017/langcog.2014.14.10.1017/langcog.2014.14Search in Google Scholar

McDonough, Laraine, Soonja Choi & Jean M. Mandler. 2003. Understanding spatial relations: Flexible infants, lexical adults. Cognitive Psychology 46(3). 229–259. https://doi.org/10.1016/s0010-0285(02)00514-5.10.1016/S0010-0285(02)00514-5Search in Google Scholar

Miller, George A. & Walter G. Charles. 1991. Contextual correlates of semantic similarity. Language & Cognitive Processes 6(1). 1–28. https://doi.org/10.1080/01690969108406936.10.1080/01690969108406936Search in Google Scholar

Miller, George A. & Philip N. Johnson-Laird. 1976. Language and perception. Harvard, MA: Belknap Press.10.4159/harvard.9780674421288Search in Google Scholar

Newman, Mark E. J. 2010. Networks: An introduction. Oxford, UK: Oxford University Press.10.1093/acprof:oso/9780199206650.001.0001Search in Google Scholar

Nosofsky, Robert M. 1988. Similarity, frequency, and category representations. Journal of Experimental Psychology: Learning, Memory, and Cognition 14(1). 54–65. https://doi.org/10.1037/0278-7393.14.1.54.10.1037/0278-7393.14.1.54Search in Google Scholar

Piaget, Jean & Bärbel Inhelder. 1956. The child’s conception of space. London, UK: Routledge and Kegan Paul.

Pickering, Martin J. & Victor S. Ferreira. 2008. Structural priming: A critical review. Psychological Bulletin 134(3). 427–459. https://doi.org/10.1037/0033-2909.134.3.427.

Riemer, Nick. 2010. Introducing semantics. Cambridge: Cambridge University Press. https://doi.org/10.1017/CBO9780511808883.

Schönefeld, Doris. 2015. A constructional analysis of un-participle constructions. Cognitive Linguistics 26(3). 423–466. https://doi.org/10.1515/cog-2014-0017.

Siew, Cynthia S. Q., Dirk U. Wulff, Nicole M. Beckage & Yoed N. Kenett. 2019. Cognitive network science: A review of research on cognition through the lens of network representations, processes, and dynamics. Complexity 2019. 1–24. https://doi.org/10.1155/2019/2108423.

Spelke, Elizabeth S., Karen Breinlinger, Janet Macomber & Kristen Jacobson. 1992. Origins of knowledge. Psychological Review 99(4). 605–632. https://doi.org/10.1037/0033-295x.99.4.605.

Stefanowitsch, Anatol & Stefan Th. Gries. 2003. Collostructions: Investigating the interaction between words and constructions. International Journal of Corpus Linguistics 8. 209–243. https://doi.org/10.1075/ijcl.8.2.03ste.

Stefanowitsch, Anatol & Stefan Th. Gries. 2005. Covarying collexemes. Corpus Linguistics and Linguistic Theory 1. 1–43. https://doi.org/10.1515/cllt.2005.1.1.1.

Steyvers, Mark & Joshua B. Tenenbaum. 2005. The large-scale structure of semantic networks: Statistical analyses and a model of semantic growth. Cognitive Science 29. 41–78. https://doi.org/10.1207/s15516709cog2901_3.

Su, Hung-Kuan & Alvin Cheng-Hsien Chen. 2019. Conceptualization of containment in Chinese: A corpus-based study of the Chinese space particles lǐ, nèi, and zhōng. Concentric: Studies about Languages 45(2). 211–245. https://doi.org/10.1075/consl.00009.su.

Tao, Hongyin. 2003. A usage-based approach to argument structure: ‘remember’ and ‘forget’ in spoken English. International Journal of Corpus Linguistics 8(1). 75–95. https://doi.org/10.1075/ijcl.8.1.04tao.

Taylor, John R. 2003. Near synonyms as co-extensive categories: ‘High’ and ‘tall’ revisited. Language Sciences 25(3). 263–284. https://doi.org/10.1016/s0388-0001(02)00018-9.

van den Heuvel, Martijn P. & Olaf Sporns. 2013. Network hubs in the human brain. Trends in Cognitive Sciences 17. 683–696. https://doi.org/10.1016/j.tics.2013.09.012.

Vandeloise, Claude. 2003. Containment, support, and linguistic relativity. In Hubert Cuyckens, René Dirven & John R. Taylor (eds.), Cognitive approaches to lexical semantics, 393–425. Berlin & New York: Mouton de Gruyter. https://doi.org/10.1515/9783110219074.393.

Vandeloise, Claude. 2010. Genesis of spatial terms. In Vyvyan Evans & Paul Chilton (eds.), Language, cognition and space: The state of the art and new directions, 171–192. London: Equinox.

Veremyev, Alexander, Alexander Semenov, Eduardo L. Pasiliao & Vladimir Boginski. 2019. Graph-based exploration and clustering analysis of semantic spaces. Applied Network Science 4(1). 104–132. https://doi.org/10.1007/s41109-019-0228-y.

Vitevitch, Michael S. 2008. What can graph theory tell us about word learning and lexical retrieval? Journal of Speech, Language, and Hearing Research 51. 408–422. https://doi.org/10.1044/1092-4388(2008/030).

Watts, Duncan J. & Steven H. Strogatz. 1998. Collective dynamics of ‘small-world’ networks. Nature 393(6684). 440–442. https://doi.org/10.1038/30918.

Wulff, Stefanie. 2006. Go-V vs. go-and-V in English: A case of constructional synonymy? In Stefan Th. Gries & Anatol Stefanowitsch (eds.), Corpora in Cognitive Linguistics, 101–126. Berlin & New York: Mouton de Gruyter.

Yang, Zhao, René Algesheimer & Claudio J. Tessone. 2016. A comparative analysis of community detection algorithms on artificial networks. Scientific Reports 6(30750). https://doi.org/10.1038/srep30750.


Supplementary Material

The supplementary material is available online at https://web.ntnu.edu.tw/~alvinchen/data/cllt-2020-0012.


Published Online: 2020-08-17
Published in Print: 2022-05-25

© 2020 Alvin Cheng-Hsien Chen, published by De Gruyter, Berlin/Boston

This work is licensed under the Creative Commons Attribution 4.0 International License.

Downloaded on 25.4.2024 from https://www.degruyter.com/document/doi/10.1515/cllt-2020-0012/html