Lili Wu Chinese

Menghui Shi; Yiya Chen

doi:10.1017/S0025100320000092


Published online by Cambridge University Press: 29 September 2020


Menghui Shi: Leiden University Centre for Linguistics (LUCL), smhfudan@163.com
Yiya Chen: Leiden University Centre for Linguistics (LUCL) & Leiden Institute for Brain and Cognition (LIBC), yiya.chen@hum.leidenuniv.nl


Lili Wu Chinese is a Wu dialect (ISO 639-3 code: wuu) spoken by approximately 38,000 people who reside in the town of Lili, one of the ten major towns in the Wujiang district. The Wujiang district belongs to the prefecture-level municipality of Suzhou in Jiangsu province, People’s Republic of China. It is located at the juncture of the cities of Shanghai, Suzhou, and Jiaxing, as shown in Figure 1.

Type: Illustrations of the IPA
Information: Journal of the International Phonetic Association, Volume 52, Issue 1, April 2022, pp. 157–179

DOI: https://doi.org/10.1017/S0025100320000092
Copyright: © International Phonetic Association 2020

Figure 1 Map of the Wujiang dialects (modified based on the map in Ye 1983).

Lili Wu Chinese is commonly considered to belong to the Suhujia dialect cluster, which in turn is classified as a member of the Tai Lake subgroup of the Northern Wu dialect group, a Sinitic branch within the Sino-Tibetan family (Wurm et al. 1987: B–9). The dialect is well known for the so-called aspiration-induced tonal split, which refers to the lowering of f0 contours after voiceless aspirated obstruents in certain tonal contexts. Lili Wu Chinese has therefore attracted much attention over the last six decades, which has led to a handful of descriptions not only of the dialect itself but also of closely related dialects in the Wujiang area that appear to have similar aspiration-induced tonal splits. Perhaps because of this salient tonal-split feature, much less attention has been paid to the segmental properties of Lili Wu Chinese in the existing literature. This description aims to bring together existing descriptions of Lili Wu Chinese in an accessible form, and to propose a number of methodological/analytical innovations and new perspectives with regard to not only lexical tones but also segmental features. Specifically, these are: (i) an instrumental analysis of the lexical tones and a reanalysis of the co-occurrence pattern between lexical tone and onset; (ii) acoustic realizations of voiceless vs. voiced fricatives; (iii) detailed phonetic analyses of the two high front vowels /i/ and /i̟/; and (iv) the addition of two syllabic approximants /ɹ̩/ and /ɹ̹̍/ to the sound system of Lili Wu Chinese.

The description is accompanied mainly by recordings of a sixty-eight-year-old male native speaker, who was born in 1948 and raised in Lili town. All acoustic data presented in this description were elicited from this consultant. He has spent most of his life living in Lili and speaking Lili Wu Chinese, except for three years spent attending college in a nearby city. According to his self-report, he can speak (accented) Standard Chinese and limited Shanghainese when the situation requires, but he speaks only Lili Wu Chinese at home. All video recordings were elicited from another male native speaker, who was born in 1947 and was also raised in Lili town.

Lexical tones and aspiration-induced tonal split

There are eight lexical tones in Lili Wu Chinese. Plotted in Figure 2 are the f0 contours of the example morphemes, labelled as T1 to T8, respectively. Generally speaking, lexical tones marked with odd numbers start within a higher f0 range (above 160 Hz, high-register hereafter), while those marked with even numbers start within a lower range (under 160 Hz, low-register hereafter). T1 (black solid) has a level f0 contour within the high register (high–level), while T2 (dark grey solid) is a low-register rising tone (low–rising). T3 (black round dot) starts within the high register and falls (high–falling). T4 (dark grey round dot) is a low-register level tone (low–level). T5 (black square dot) has a convex contour which starts in the high register, falls, and ends with a slight rise (high–dipping). T6 (dark grey square dot) is realized with an f0 contour similar to that of T5 but starts in the low register (low–dipping). Both T7 (black dash-dotted) and T8 (dark grey dash-dotted) are associated with syllables that have a much shorter duration than the other tone-bearing syllables. T7 starts within the high register and, despite its slightly falling contour, sounds like a high-register level tone (short–high–level). T8 is a low-register level tone (short–low–level).

Figure 2 F0 contours of the lexical tones of the example words.

These eight lexical tones exhibit interesting co-occurrence patterns with both the onset and the coda. Lili Wu Chinese features a three-way laryngeal contrast in obstruents: voiceless unaspirated, voiceless aspirated, and voiced. (See the section on consonants below for more details.) Syllables with voiceless unaspirated onsets only allow high-register tones (T1, T3, T5, and T7), while voiced onsets co-occur with low-register tones (T2, T4, T6, and T8). T1 to T6 only co-occur with open syllables or syllables with a nasal coda and are therefore also known as smooth/non-checked tones (developing from the Ping ‘level’, Shang ‘rising’, and Qu ‘departing’ tonal categories of Middle Chinese), while T7 and T8 only co-occur with closed syllables with a glottal coda /ʔ/ and are known as abrupt/checked tones (developing from the Ru ‘entering’ tonal category of Middle Chinese).

In the vast majority of Northern Wu dialects such as Shanghainese (Chen & Gussenhoven 2015), both voiceless unaspirated and aspirated onsets condition high-register tones, leaving voiced onsets to co-occur with low-register tones. What makes Lili Wu Chinese interesting is the effect of obstruent aspiration on lexical tonal realization, as exemplified by /tʰʊŋ¹/ ‘unblocked’, /tʰʊŋ⁴/ ‘to unify’, /tʰʊŋ⁶/ ‘ache’, and /tʰʊʔ⁸/ ‘baldy’. Their f0 contours are plotted in Figure 3 (labelled as T1–A, T3–A, T5–A, and T7–A, where A indicates voiceless aspirated onsets), in comparison to the f0 contours of the presumably same lexical tones realized after voiceless unaspirated onsets (indicated with U). Except for T1 (i.e. T1–U vs. T1–A), we see a clear f0-lowering effect in syllables with voiceless aspirated onsets. This lowering effect, which looks like a split of one tone into two as a function of voiceless unaspirated vs. aspirated onsets, is known as aspiration-induced tonal split.

Figure 3 F0 contours of the lexical tones of the words with voiceless unaspirated onsets (U, black) and those with voiceless aspirated onsets (A, light grey).

Perhaps due to this prominent phenomenon, the tonal inventory of Lili Wu Chinese has been a point of debate in recent decades. To our knowledge, there are at least eleven descriptive works focusing on this aspiration-induced tonal-split phenomenon in Lili Wu Chinese (Chao 1928: 82; Ye 1983; Zhang & Liu 1983; F. Shi 1992; Qian 1992: 48; Shen 1994; P. Wang 2008, 2010: 26–27; Z. Xu 2009: 55; Hirayama 2010; Yanhong Xu 2013: 32). Researchers differ greatly in their treatment/interpretation of the tonal-split phenomenon. The main debate lies in the question of whether the f0 contours of lexical tones after aspirated onsets have merged with those after voiced onsets or have emerged as distinct tonal categories independent of the existing eight tonal categories. It is important to note that previous studies typically explore this phenomenon based on impressionistic descriptions (e.g. Chao 1928: 8–10), or with data from a very limited number of speakers (e.g. F. Shi 1992 for one male and one female speaker; Shen 1994 for two young speakers).

M. Shi, Chen & Mous (2016), with data from twenty native speakers (eight males and twelve females, with a mean age of 67 years and a standard deviation of six years), show comparable f0 contours after voiceless aspirated (T1–A) and voiceless unaspirated (T1–U) onsets, both of which are realized within the high register, as shown in Figure 4(a). However, the f0 contours after voiceless aspirated onsets can also pattern more like those after voiced onsets, resulting in the merger of the f0 contours of T3–A and T4, T5–A and T6, and T7–A and T8, respectively, as plotted in Figure 4(b–d). F0 contours after aspirated onsets show a trend of slightly higher f0. However, as suggested by the statistical results (growth curve analysis, GCA) in M. Shi et al. (2016), there is no significant difference between f0 contours after voiceless aspirated and voiced onsets for each pair.

Figure 4 Normalized average f0 contours of the lexical tones based on data from twenty native speakers (eight males and twelve females, with a mean age of 67 years and a standard deviation of six years). Each participant read a minimal set of 36 real monosyllabic words with a three-way laryngeal contrast in alveolar onsets (voiceless unaspirated, voiceless aspirated, and voiced) combined with three vowels (low, mid, and high). The grey areas indicate ±1 standard error of the mean. Adapted from M. Shi et al. (2016).

In summary, the lexical tonal system of Lili Wu Chinese includes two level tones (high–level T1 and low–level T4), one falling tone (high–falling T3), one rising tone (low–rising T2), and two dipping tones (high–dipping T5 and low–dipping T6). For short syllables with a glottal coda, two level tones are identified (short–high–level T7 and short–low–level T8). The numerical representations of the eight lexical tones and their co-occurrence patterns with onsets are provided in Table 1. Here, we adopted the tonal transcription system developed by Chao (1930), which divides a speaker’s pitch range into five levels, where 5 indicates the highest level and 1 the lowest.
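As a rough illustration of how a five-level transcription can be derived from f0 measurements, the sketch below maps an f0 value onto a 1–5 scale by dividing a speaker's pitch range into five equal bands on a logarithmic scale. This is one common normalisation and not necessarily the procedure behind Table 1; the function name `chao_level` and the 100–200 Hz pitch range are hypothetical.

```python
import math


def chao_level(f0_hz: float, f0_min_hz: float, f0_max_hz: float) -> int:
    """Map an f0 value (Hz) to a tone level from 1 (lowest) to 5 (highest).

    Assumption: the speaker's pitch range is split into five equal bands
    on a log-frequency scale. Values outside the range are clamped.
    """
    if not 0 < f0_min_hz < f0_max_hz:
        raise ValueError("expected 0 < f0_min_hz < f0_max_hz")
    # Relative position within the pitch range: 0.0 = floor, 1.0 = ceiling.
    position = (math.log(f0_hz) - math.log(f0_min_hz)) / (
        math.log(f0_max_hz) - math.log(f0_min_hz)
    )
    position = min(max(position, 0.0), 1.0)
    # Five equal bands; the ceiling itself is assigned to level 5.
    return min(int(position * 5) + 1, 5)


# Hypothetical pitch range of 100-200 Hz, for illustration only.
low_end = chao_level(100.0, 100.0, 200.0)
high_end = chao_level(200.0, 100.0, 200.0)
midpoint = chao_level(math.sqrt(100.0 * 200.0), 100.0, 200.0)
```

With this hypothetical range, the floor maps to level 1, the ceiling to level 5, and the geometric midpoint of the range to level 3.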

Table 1 Numerical representations of the lexical tones in Lili Wu Chinese. A single number refers to cases where the tone-carrying syllables have short duration and only co-occur with the glottal coda /ʔ/.

T1 can co-occur with both voiceless onsets (i.e. unaspirated and aspirated). T3, T5, and T7, on the other hand, can only co-occur with voiceless unaspirated onsets. The three low-register tones (T4, T6, and T8) are licensed by both voiceless aspirated and voiced onsets, while T2 is only allowed after voiced onsets. It is important to note that the co-occurrence pattern (i.e. voiceless onsets co-occurring with high-register tones, while voiced onsets with low-register ones), which is commonly observed in most Northern Wu dialects, falls apart in Lili Wu Chinese where voiceless aspirated onsets can co-occur with low-register tones.
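The co-occurrence pattern for plosive and affricate onsets described above can be written down as a small lookup table. The sketch below simply encodes the statements in the preceding paragraph; the names `ALLOWED_ONSETS` and `tones_for` are hypothetical, and sonorants and fricatives are left out.

```python
# Onset types licensed by each lexical tone, as stated in the text
# (plosives and affricates only).
ALLOWED_ONSETS = {
    "T1": {"unaspirated", "aspirated"},
    "T2": {"voiced"},
    "T3": {"unaspirated"},
    "T4": {"aspirated", "voiced"},
    "T5": {"unaspirated"},
    "T6": {"aspirated", "voiced"},
    "T7": {"unaspirated"},
    "T8": {"aspirated", "voiced"},
}


def tones_for(onset_type: str) -> list[str]:
    """Return the tones that can follow the given onset type."""
    return sorted(
        tone for tone, onsets in ALLOWED_ONSETS.items() if onset_type in onsets
    )


aspirated_tones = tones_for("aspirated")
```

Here `aspirated_tones` is `["T1", "T4", "T6", "T8"]`: the one high-register tone and the three low-register tones that aspirated onsets license.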

In addition, it is worth noting that in Lili Wu Chinese, sonorants (i.e. nasals and liquids) mainly co-occur with low-register tones and share the same tonal pattern with voiced plosives. A set of words with nasal onsets can also co-occur with high-register tones, such as /mu³/ [məʊ³] ‘bound morpheme for the literary address of mother’. With respect to fricatives, since there is only a two-way laryngeal distinction (i.e. voiceless vs. voiced), voiceless fricatives co-occur with high-register tones while their voiced counterparts co-occur with low-register ones.

Consonants

Lili Wu Chinese has 28 consonants. Corresponding key words/bound morphemes are provided below the consonant chart. Lili Wu Chinese features a three-way laryngeal contrast in obstruents: voiceless unaspirated, voiceless aspirated, and voiced (Chao 1967). This three-way contrast is a prominent feature of the Northern Wu dialects. It has, however, different phonetic manifestations in initial vs. medial position within a word. Generally speaking, in initial position, these obstruents vary in their phonation from clearly modal (voiceless unaspirated), through aspirated with breathiness (voiceless aspirated), to breathy (voiced) (M. Shi et al. 2016). In medial position, voiced obstruents are realized with noticeable voicing throughout the closure, leading to a three-way laryngeal distinction in terms of voice onset time (VOT). In Shanghai Wu Chinese, a Northern Wu dialect closely related to Lili Wu, other phonetic properties also signal the three-way laryngeal contrast in both initial and medial positions (e.g. Shen, Wooters & W. S.-Y. Wang 1987 on closure duration; Ren 1992: 95–111 on transillumination data; Gao 2015: 199–207 on motion-capture-system data; see also a review in Chen 2011). Impressionistically speaking, Lili Wu behaves similarly to Shanghainese, but more research is needed to examine whether these properties also function in Lili Wu Chinese.

Fricatives have a two-way (voiceless vs. voiced) laryngeal contrast. Similar to the plosives and affricates, in initial position their phonatory states vary from clearly modal in the voiceless ones to slightly breathy in the voiced ones. In medial position, the voiced category is realized with vigorous voicing, leading to a two-way contrast in terms of VOT. It is worth noting that the fricative voicing contrast is also signaled via durational differences, similar to what has been reported for the voicing contrast in English fricatives (Cole & Cooper 1975), as shown in the following pairs: /f/ (/fu¹/ ‘husband’) vs. /v/ (/vu²/ ‘support somebody with one’s hand’); /s/ (/sɛ¹/ ‘three’) vs. /z/ (/zɛ²/ ‘greedy’). Figure 5 illustrates the acoustic realization of /f/ in /fu¹/ (5a) and /v/ in /vu²/ (5b). Although neither is realized with regular vocal pulses (i.e. both are phonetically voiceless), the frication of /f/ (131 ms; 29% of the total duration) is about 2.3 times as long as that of /v/ (56 ms; 12% of the total duration).
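The figures reported for the /f/ vs. /v/ pair can be checked with simple arithmetic. The sketch below uses only the four values given in the text; the variable names are hypothetical, and the implied syllable durations are derived from rounded percentages, so they are approximate.

```python
# Values reported in the text for /fu/ 'husband' and /vu/ 'support ...'.
f_frication_ms, f_share = 131.0, 0.29  # frication duration, share of syllable
v_frication_ms, v_share = 56.0, 0.12

# How many times longer the voiceless frication is, in absolute terms.
absolute_ratio = f_frication_ms / v_frication_ms

# The same comparison in terms of the share of the syllable.
relative_ratio = f_share / v_share

# Syllable durations implied by the reported percentages (approximate,
# because the percentages are rounded).
f_syllable_ms = f_frication_ms / f_share
v_syllable_ms = v_frication_ms / v_share
```

The absolute ratio comes out at about 2.34 and the ratio of the shares at about 2.42, with implied syllable durations of roughly 452 ms and 467 ms. The two syllables are therefore of similar overall length, and the difference lies in the frication itself.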

Figure 5 Waveforms and spectrograms of (a) ‘husband’ and (b) ‘support somebody with one’s hand’. For each syllable, the frication duration is indicated as a percentage of the total syllable duration, with the absolute value (ms) in parentheses.

To further confirm these observations, we elicited ten minimal pairs for each voicing contrast. All stimuli were lexemes of relatively high familiarity, as confirmed by our consultant. Both the absolute duration of the frication and the percentage of the frication duration over the whole syllable duration were calculated. The fricative duration was measured from the onset of clear frication noise to the first periodic cycle of the vowel. Results in Table 2 show that the percentage of the frication duration of voiceless onsets is significantly greater than that of their voiced counterparts, as confirmed by independent-samples t-tests (one-tailed) for each pair.

Table 2 The average percentage of the frication duration and the independent samples t-test results for each pair of voiceless vs. voiced onsets. Parentheses indicate absolute values of the average duration (mean) and the standard deviation (sd).
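For readers who want to reproduce this kind of comparison, the sketch below computes an independent-samples t statistic using only the standard library. It assumes equal variances (the pooled-variance form), which the text does not specify, and the percentage values are invented purely for illustration; they are not the data behind Table 2.

```python
import math
from statistics import mean, variance


def independent_t(sample_a, sample_b):
    """Return (t, df) for Student's independent-samples t-test.

    Assumption: equal population variances (pooled-variance form).
    """
    n_a, n_b = len(sample_a), len(sample_b)
    df = n_a + n_b - 2
    pooled_var = (
        (n_a - 1) * variance(sample_a) + (n_b - 1) * variance(sample_b)
    ) / df
    standard_error = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean(sample_a) - mean(sample_b)) / standard_error, df


# Hypothetical frication percentages for ten voiceless and ten voiced tokens.
voiceless_pct = [29, 31, 28, 30, 32, 27, 29, 30, 31, 28]
voiced_pct = [12, 14, 11, 13, 15, 12, 13, 11, 14, 12]

t_value, df = independent_t(voiceless_pct, voiced_pct)

# One-tailed test at alpha = 0.05: the critical t for df = 18 is 1.734.
voiceless_is_longer = t_value > 1.734
```

A one-tailed test is appropriate here because the hypothesis is directional (voiceless frication is expected to be longer), so only the upper tail of the t distribution is consulted.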

Last but not least, /dz/ and /z/ are sometimes in free variation for the same lexical item, as exemplified in ‘groceries’ (/ʣaʔ⁸ hu⁵/ vs. /zaʔ⁸ hu⁵/). This finding may imply that in Lili Wu Chinese, the affricate /dz/ and fricative /z/ are undergoing merger at the lexical level. The glottal plosive /ʔ/ only appears in coda position as a phoneme and co-occurs with short syllables as in /paʔ⁷/ ‘hundred’. Phonetically speaking, the [ʔ] segment can be observed at the beginning of onsetless syllables with the high-register tones (i.e. T1, T3, T5, and T7) (see the section below for further details on onsetless syllables).

Sonorants

/n l/ are typically laminal alveolars. The alveolar nasal /n/ is palatalized before high front segments (i.e. /i i̟ y j/), as in /ni²/ [ɲi²] ‘year’ and /njɛ⁶/ [ɲjɛ⁶] ‘to read’. Labial and velar nasals can form syllable nuclei, as in /ŋ̍⁴/ ‘five’ and /m̩⁴/ ‘parcel of land’. These two syllabic nasals can be found in many Southern Chinese dialects (i.e. Wu, Min, Hakka, Xiang, and Yue) but are relatively rare in dialects belonging to the Mandarin family (Shen 2006). In addition, /ŋ/ occurs as a nasal coda as well, but its acoustic realization varies according to the preceding vowel. After a front vowel, the nasal coda acquires the anterior feature and sounds like [n] (as in /zɪŋ²/ [zɪn²] ‘to look for’ and /tɕʏŋ¹/ [tɕʏn¹] ‘army’), in contrast to its realization after a non-front vowel (as in /dzəŋ²/ ‘deity’ and /dʊŋ²/ ‘copper’). Following the treatment of Chen & Gussenhoven (2015) for Shanghainese, we posit an underlying /ŋ/ in coda position.

There are two glides /j/ and /w/ in Lili Wu Chinese. Glides are typically defined as vowel-like segments that function as consonants and belong to the approximant class (Ladefoged & Maddieson 1996: 322). In Lili Wu Chinese, /j/ and /w/ differ from the corresponding vowels (i.e. /i/ and /u/) in that both tend to be produced with a narrower constriction of the vocal tract, as indicated by lower F1 values. Following Maddieson & Emmorey (1985), we compared the mean F1 of the initial interval (i.e. the first 50 ms) of /j/ (/jɘ̝o¹/ ‘surname, Ou’) with that of /i/ (/i¹/ ‘smoke’), and that of /w/ (/wɛ²/ ‘to return’) with that of /u/ (/u¹/ ‘crow’). Results showed that the F1 values of /j/ (265 Hz) and /w/ (314 Hz) are lower than those of the corresponding vowels (/i/: 271 Hz; /u/: 354 Hz). Existing descriptions of Lili Wu Chinese such as P. Wang (2010: 26) have typically posited high vowels /i u/ instead of glides /j w/ in words like /jɘ̝o¹/ and /wɛ²/ (/iəu¹/ and /uᴇ²/ in P. Wang 2010: 26, respectively), despite the consensus among sinologists that they are glides. We have adopted the approximants /j w/ to transcribe the sounds. Note that before the rounded vowels /o/ and /ø/, /j/ is realized as [ɥ], as in /joʔ⁸/ [ɥoʔ⁸] ‘bath’ and /jø²/ [ɥø²] ‘rounded’. Because of this complementary distribution, [ɥ] is treated as a context-specific (i.e. before /o/ and /ø/) variant of /j/.
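The F1 comparison described above amounts to averaging a formant track over the first 50 ms of the segment. A minimal sketch follows, assuming the track is available as (time in ms, F1 in Hz) pairs; the function name `mean_initial_f1` and the sample track are hypothetical, not measurements from the recordings.

```python
def mean_initial_f1(track, window_ms=50.0):
    """Average F1 over the frames in the first `window_ms` of a track.

    `track` is a list of (time_ms, f1_hz) pairs in chronological order.
    """
    if not track:
        raise ValueError("empty formant track")
    start_ms = track[0][0]
    # Keep frames whose offset from the first frame is below the window.
    values = [f1 for time_ms, f1 in track if time_ms - start_ms < window_ms]
    return sum(values) / len(values)


# Hypothetical F1 track sampled every 10 ms, for illustration only.
example_track = [
    (0.0, 250.0), (10.0, 260.0), (20.0, 270.0),
    (30.0, 280.0), (40.0, 290.0), (50.0, 400.0),
]
example_mean = mean_initial_f1(example_track)
```

With this invented track, the five frames from 0 to 40 ms fall inside the window and the frame at 50 ms is excluded, giving a mean of 270 Hz.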

A controversial issue is whether it is necessary to posit /j/ after an alveolo-palatal affricate or fricative onset (i.e. /ʨʰ ʨ ʥ ɕ/) in Wu Chinese (see a brief discussion in Chen & Gussenhoven 2015). Historically, these alveolo-palatal onsets are commonly believed to have developed from the velar or glottal onsets (i.e. /kʰ k ɡ h/) through a palatalization process triggered by the following high front segments (e.g. L. Wang 1985: 394). Synchronically, there is no contrast between /ʨʰ ʨ ʥ ɕ/ and /ʨʰj ʨj ʥj ɕj/ in Lili Wu Chinese. More remarkably, the transition from the alveolo-palatal affricate to the following vowel is rather brief. Figure 6 illustrates the different transitional characteristics among /tɑ¹/ ‘knife’ (6a), where there is no glide, /tjɑ¹/ ‘marten’ (6b) and /tsjɑ¹/ ‘scorched’ (6c), where the presence of /j/ is commonly recognized, and /ʨɑ¹/ ‘to converge’ (6d), where we propose the absence of /j/. Adapting the method of Chitoran (2002), we marked the beginning of the transition at the start of the sonorant part (i.e. glide or vowel) and the end of the transition at the turning point from a falling F2 to an F2 steady state, that is, the point after which F2 consistently falls by less than 20 Hz. The F2 values were automatically measured in Praat (Boersma & Weenink 2020) with a window length of 5 ms. Note that we would have expected a much more stable realization of /j/, with a longer transition from /ʨ/ to /ɑ/, if we assumed the presence of a glide /j/ following /ʨ/. These observations motivated us not to posit an underlying /j/ after alveolo-palatal onsets (following the analysis of Chen & Gussenhoven 2015 for Shanghainese). We would nevertheless like to stress the importance of further experimental studies to investigate the phonological status and phonetic realization of /j/ after alveolo-palatals in Lili Wu Chinese as well as in other Chinese dialects.
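One way to operationalise the end-of-transition criterion is to look for the frame after which successive F2 measurements no longer change by 20 Hz or more. The sketch below is only one reading of that criterion, not the authors' script; the function name `transition_end` and the sample F2 values are hypothetical.

```python
def transition_end(f2_track, threshold_hz=20.0):
    """Return the index of the frame where the F2 steady state begins.

    The steady state is taken to start at the last frame reached by a
    frame-to-frame F2 change of at least `threshold_hz`; from that frame
    on, every change is smaller than the threshold.
    """
    end_index = 0
    for i in range(1, len(f2_track)):
        if abs(f2_track[i] - f2_track[i - 1]) >= threshold_hz:
            end_index = i
    return end_index


# Hypothetical F2 values (Hz) falling from a glide-like onset to a vowel.
example_f2 = [2200.0, 2100.0, 1950.0, 1800.0, 1700.0, 1690.0, 1685.0, 1680.0]
steady_from = transition_end(example_f2)
```

For this invented track the steady state starts at index 4 (1700 Hz). Multiplying the index by the analysis time step would give the transition duration, so a shorter transition shows up as a smaller index.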

Figure 6 Waveforms and spectrograms of (a) ‘knife’, (b) ‘marten’, (c) ‘scorched’, and (d) ‘to converge’. Within each syllable, the transition from the end of the preceding consonant to the time at which F2 converges toward the value of the vowel /ɑ/ is indicated.

Vowels

The traditional quadrilateral vowel plot of Lili Wu monophthongs in open syllables is as follows:

Monophthongs and diphthong in open syllables

Monophthongs in closed syllables and nasalized vowels

In open syllables, there are nine monophthongs (7a) and one diphthong (7b) in Lili Wu Chinese, as plotted in Figure 7. The nine monophthongs (/i y i̟ ɛ ø u o ɔ ɑ/) constitute a four-way distinction (i.e. close, close-mid, open-mid, and open) in height and a two-way distinction (i.e. front and back) in backness. /i y/ contrast in rounding. In addition, there is one diphthong occurring in open syllables, /ɘ̝o/, which glides towards the back. Monophthongs in closed syllables and nasalized vowels are plotted in Figure 8, where four (/ɪ ʏ ə ʊ/) occur in syllables closed by a nasal coda (8a), four (/ɪ a ʊ ʌ/) in syllables closed by a glottal coda (8b), and two (/æ̃ ɑ̃/) are nasalized vowels (8c). Compared to the vowels in open syllables, the number of vowels in closed syllables is considerably reduced and so is their acoustic vowel space. Generally speaking, vowels in closed syllables or with nasalization are more central and lower than those in open syllables. Following Chen & Gussenhoven (2015), we adopted the same set of symbols (i.e. /ɪ/ and /ʊ/) for monophthongs followed by a nasal coda and those followed by a glottal coda, although their articulations do differ. The plots of the F1–F2 values are based on the accompanying sound files produced by our consultant. The mean formant value of each vowel was calculated by averaging over ten tokens (except for /u/, which was calculated based on five tokens).

Figure 7 The F1–F2 values of monophthongs and diphthong in open syllables: (a) monophthongs; (b) diphthong. Squares represent unrounded vowels; circles represent rounded vowels. The arrow demonstrates the trajectory of the gliding.

Figure 8 The F1–F2 values of monophthongs in closed syllables and nasalized vowels: (a) vowels in syllables closed by a nasal; (b) vowels in syllables closed by a glottal coda; (c) nasalized vowels. Squares represent unrounded vowels; circles represent rounded vowels.

Lili Wu presents an interesting case of a fricative vowel, as illustrated in Figure 9, which plots the spectrograms of the minimal pair /i/ in /ti³/ ‘dot’ (9a) and /i̟/ in /ti̟³/ ‘bottom’ (9b). The F2 of /i/ (2399 Hz) is higher than the F2 of /i̟/ (2009 Hz). Perceptually, a striking difference between /i/ and /i̟/ is the frication present in /i̟/. Figure 10 shows narrow-band spectrograms of /ti³/ (10a) and /ti̟³/ (10b). Harmonics can be clearly identified in /ti³/ but not in /ti̟³/, especially in the frequency bands above 2 kHz. Furthermore, there is a substantial amount of aperiodic energy in the higher frequency region of /ti̟³/, particularly above 4 kHz, which suggests the presence of strong fricative noise. This observation is further confirmed by the HNR (Harmonics-to-Noise Ratio) results, with /i̟/ in /ti̟³/ (8.1 dB) showing more noise than /i/ in /ti³/ (9.8 dB).
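Because HNR is expressed in decibels, the size of the difference is easier to appreciate after converting back to a linear ratio of harmonic to noise energy, using HNR (dB) = 10·log10(harmonic energy / noise energy). The sketch below applies that standard definition to the two reported values; the function name `hnr_db_to_ratio` is hypothetical.

```python
def hnr_db_to_ratio(hnr_db: float) -> float:
    """Convert an HNR value in dB to a harmonic-to-noise energy ratio."""
    return 10 ** (hnr_db / 10)


# HNR values reported in the text for the two vowels.
plain_i_ratio = hnr_db_to_ratio(9.8)      # /i/ in 'dot'
fricated_i_ratio = hnr_db_to_ratio(8.1)   # fronted /i/ in 'bottom'

# Relative drop in harmonic-to-noise energy for the fricated vowel.
relative_drop = 1 - fricated_i_ratio / plain_i_ratio
```

The 1.7 dB difference corresponds to harmonic-to-noise energy ratios of about 9.5 and 6.5, that is, roughly a one-third reduction for the fricated vowel.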

Figure 9 Waveforms and spectrograms of (a) ‘dot’ and (b) ‘bottom’. F2 values are indicated.

Figure 10 Narrow band spectrograms of (a) ‘dot’ and (b) ‘bottom’.

A similar contrast has been reported in Suzhou Wu Chinese (Chao 1928: 38; P. Wang 1987; Hu 2007; Ling 2007, 2011). In order to illustrate the frication, Ling (2011) adopted the symbol /i_z/ for the phoneme and [ʒ̻̍] (i.e. the syllabic laminal postalveolar voiced fricative) for its phonetic realization. However, this treatment is problematic. First, a subscript /_z/ does not conform to the conventions for diacritics in the IPA. Second, articulatory data (i.e. palatographic, linguagraphic, and electromagnetic articulographic studies) of Suzhou Wu Chinese have shown that the constriction of /i̟/ is located at a more anterior position than that of /i/ (Ling 2007, 2011; Hu & Ling 2019). Consequently, the lengthening of the back resonating cavity lowers the F2 of /i̟/, as argued by Ling (2011) following Stevens (1989). Third, we have also noted that the frication in Lili, compared to that in Suzhou Wu Chinese, is not consistently audible in all /i̟/ words produced by our consultant and is also not as strong. For instance, there is little frication in /fi̟¹/ [fᵊɨ̟¹] ‘to fly’ (which tends to be diphthongal). Given these three reasons, we have adopted the symbol /̟/ to highlight the more anterior constriction of /i̟/ and its weaker frication. Such an articulatory gesture is also accompanied by the raising of the lower jaw in words such as /i̟¹/ ‘clothes’, which, however, is not observed in words such as /i¹/ ‘smoke’, as shown in the video recordings. The contrast between the high front vowels /i/ and /i̟/ is an areal feature of many Chinese dialects, especially in the Jianghuai Mandarin family (R. Shi 1998, Zhu 2004b, Zhao 2007). Similar contrasts have also been argued to occur in modern African languages, such as Len Mambila (Connell 2007) and Ring languages (Faytak & Merrill 2015).

Both /u/ (in /u¹/ ‘crow’) and /o/ (in /ko¹/ ‘melon’) are close/close-mid back monophthongs with compressed lip rounding. The lips are more protruded for /o/, whereas for /u/ they are less rounded and more compressed (similar to the /u o/ contrast in Shanghainese as discussed in Chen & Gussenhoven 2015). After bilabials and labio-dentals, /u/ is produced as [v̩] (i.e. the syllabic labiodental voiced fricative), as exemplified in /pu¹/ [pv̩¹] ‘wave’. After alveolar, alveolo-palatal and velar consonants, /u/ is realized with a diphthongal quality (i.e. [əʊ]), as shown in /ku¹/ [kəʊ¹] ‘song’. According to A Syllabary of the Soochow Dialect, a Suzhounese syllabary compiled by a committee of the Soochow Literary Association (1892) to help missionaries acquire Suzhounese, such a diphthongal realization of /u/ after alveolar, alveolo-palatal and velar consonants can be traced back to the end of the 19th century.

The front vowel /ø/ tends to be produced with a lower F2 in words such as /ø¹/ [ʔø̈¹] ‘in safe’ (1228 Hz) than in words such as /jø²/ [ɥø²] ‘rounded’ (1425 Hz). Both, however, are produced with a lip rounding gesture, as shown in the video recordings.

/ɘ̝o/ is a diphthong and only co-occurs with the glide /j/ (e.g. /vjɘ̝o²/ ‘to float’ and /kjɘ̝o¹/ ‘to tick off’) or alveolo-palatals (e.g., /dʑɘ̝o⁶/ ‘used’ and /ɕɘ̝o¹/ ‘to rest’).

Vowels preceding a glottal stop coda show a much shorter duration. When high vowels (i.e. /ɪ/ and /ʊ/) occur before /ʔ/, a general displacement towards an open back position often results in a brief schwa after nuclei, such as /ʨɪʔ⁷/ [ʨɪᵊʔ⁷] ‘hurry’ and /kʊʔ⁷/ [kʊᵊʔ⁷] ‘surname, Guo’.

/æ̃/ and /ɑ̃/ are two nasalized vowels, as illustrated in /tsʰæ̃⁶/ ‘unimpeded’ and /tsʰɑ̃⁶/ ‘to sing’. Both vowels are consistently nasalized without recognizable velum closure in Lili Wu Chinese, different from the case of Shanghainese where a brief velar nasal coda has been reported (Chen & Gussenhoven Reference Chen and Gussenhoven2015).

Syllabic approximants

There are two syllabic approximants in Lili Wu Chinese, which are exemplified in /sɹ̩¹/ [sɹ̪̍¹] ‘silk’ and /sɹ̹̍¹/ [sʷɹ̻̹̍¹] ‘book’. The syllabic approximant /ɹ̩/ [ɹ̪̍] in Lili Wu Chinese is similar to that in Standard Chinese.

With respect to /ɹ̹̍/, two features are to be further noted. First, the lip rounding gesture of the approximant contributes to the labialization of the preceding alveolar sibilant onset (i.e. /s/ [sʷ] before /ɹ̹̍/). Labialized alveolar sibilants are rare in the world’s languages (but see Lao, a Tai-Kadai language reported in Erickson 2001). The rounding feature is believed to have evolved from /u/ or /y/, the two rounded vowels reported to be present instead of /ɹ̹̍/ in other Wu dialects, such as /su¹/ in Danyang Wu and /ɕy¹/ in Songjiang Wu for ‘book’ (Qian 1992: 88). Second, /ɹ̹̍/ is articulated more laminally. Laminal consonants have been widely reported to exist in Australian languages (Butcher 1990, Anderson & Maddieson 1994). Such an articulatory gesture of /ɹ̹̍/ is reflected in Figure 11 as a lowered F4 (3375 Hz, compared to 4221 Hz for /ɹ̩/ in /sɹ̩¹/) and the proximity of F3 and F4. F4 lowering is generally said to be related to articulatory retraction (e.g. Fant 1960: 121; Stevens & Blumstein 1975; Vaissière 2011). The proximity of F3 and F4 is known to be a consequence of weakly coupled resonators formed by a relatively larger front cavity (Stevens 1989). For instance, a significant convergence of F3 and F4 is observed in laminal alveolar and postalveolar fricatives in English, as well as in apico-laminal alveolars in French (Dart 1991: 104). In short, compared with its counterpart /ɹ̩/, /ɹ̹̍/ is produced with a more laminal articulation combined with a lip rounding gesture. Such differences were also noticed by our consultant, who volunteered his native intuition. Given the impressionistic nature of the description, more instrumental studies (e.g. ultrasound) are needed for a precise description of their articulation and acoustic consequences.

Figure 11 Waveforms and spectrograms of (a) ‘silk’ and (b) ‘book’. F4 values are indicated.

It is worth noting that there exist different proposals to transcribe these sounds. For example, among sinologists (after Karlgren 1915: 294), /ɹ̩/ and /ɹ̹̍/ have often been transcribed as /ɿ/ and /ʮ/, respectively, and are known as ‘apical vowels’. /ɹ̩/ is sometimes treated as [z̩] (e.g. Ladefoged & Maddieson 1996: 314; Wiese 1997: 239; Duanmu 2000: 36 for Standard Chinese; Chen & Gussenhoven 2015 for Shanghainese). Such a treatment, however, has been questioned with ultrasound imaging data (Lee-Kim 2014, Faytak & Lin 2015) and acoustic analyses (Howie 1976: 10). Lee-Kim (2014) further argues that it is more appropriate to describe [z̩] as ‘dental approximant [ɹ̪̍]’. A similar treatment can also be found in Lee & Zee (2003).

Last but not least, an increasing body of literature has shown that such syllabic approximants affect diachronic changes of high vowels at different time points across a large number of Sino-Tibetan languages (e.g. Baron 1974, R. Shi 1998, Zhu 2004b, Zhao 2007, Hu & Ling 2019).

Syllable structure

Generally speaking, eight syllable combinations can be identified in Lili Wu Chinese. The canonical syllable minimally consists of an obligatory nucleus (V) and a lexical tone as in /u¹/ ‘crow’ and /ø²/ [ɦø̈²] ‘cold’. The nucleus can be either a vowel or a syllabic consonant (i.e. /ɹ̩ ɹ̹̍ m̩ ŋ̍/).⁶ It may also contain up to three optional elements in the following linear structure: (C₁)(G)V(C₂), where C₁ can be any consonant in the consonant inventory except for /ʔ/, G is either /j/, as in /kjɘ̝o¹/ ‘to tick off’, or /w/, as in /kwɛ¹/ ‘to close’; C₂ is either /ŋ/ or /ʔ/ as in /kʊŋ¹/ ‘public’ and /kaʔ⁷/ ‘to clip’. Parentheses indicate optional constituents. All combinations are demonstrated in Table 3.

Table 3 Syllabic combinations in Lili Wu Chinese.
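
The count of eight follows from the template (C₁)(G)V(C₂): each of the three optional slots is either present or absent, giving 2³ = 8 shapes. A minimal Python sketch of that enumeration is given below; the function name `syllable_templates` and the shape labels are ours, chosen only for illustration, and Table 3 remains the authority on which shapes are attested.

```python
from itertools import product


def syllable_templates():
    """Enumerate the shapes licensed by (C1)(G)V(C2).

    V is obligatory; C1, G and C2 are each optional,
    so there are 2 ** 3 = 8 shapes in total.
    """
    shapes = []
    for has_c1, has_g, has_c2 in product([False, True], repeat=3):
        shape = ""
        if has_c1:
            shape += "C"
        if has_g:
            shape += "G"
        shape += "V"
        if has_c2:
            shape += "C"
        shapes.append(shape)
    return shapes


print(syllable_templates())
```

The sketch covers only the linear template; it does not encode the segmental restrictions on each slot (for example, that C₂ is limited to /ŋ/ or /ʔ/).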

As illustrated in Table 4, co-occurrence constraints on onset and rhyme combinations can be observed. First, /i ɪŋ ɪʔ i̟/ behave similarly except that /i̟/ can appear after labio-dentals as in /fi̟¹/ ‘to fly’ and /vi̟²/ ‘fat’. /i/, on the other hand, is prohibited in this context (i.e. */fi/, */vi/). Second, /y ʏŋ/ are only allowed after alveolar sonorants and alveolo-palatals, or without an onset. Third, labio-dentals are prohibited before /ø o ɔ ɑ æ̃/ (*/fø fo fɔ fɑ fæ̃/) but possible before /ɛ u/, as in /fɛ¹/ ‘to turn over’ and /vu²/ ‘to support somebody with one’s arm’. Fourth, the two syllabic approximants /ɹ̩ ɹ̹̍/ occur only after homorganic alveolar sibilant onsets /ts tsʰ ʣ s z/. Finally, /j w/ can serve as an onset as in /jø²/ ‘rounded’ and /wɔ⁴/ ‘broken’.

The distribution of the two glides is summarized in Table 5. /j/ is allowed in the majority of cases (e.g. /pjɑ¹/ ‘watch’, /vjɘ̝o²/ ‘to float’, /tjɑ¹/ ‘marten’, /kjɘ̝o¹/ ‘to tick off’, /hjɘ̝o³/ ‘to roar’ and /jø²/ ‘rounded’) except after alveolo-palatals. /w/, however, is more constrained and only allowed after velars (e.g. /kwɛ¹/ ‘to close’), glottal fricative /h/ (e.g. /hwɛ¹/ ‘dust’), or serves as a glide onset (e.g. /wɛ²/ ‘to be back’).
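
The glide distribution described above can be restated as a small lookup. The Python sketch below is only an illustration of the prose generalization: the function name `glide_allowed` and the place labels passed to it (such as "bilabial" or "velar") are our own assumptions, and any finer gaps recorded in Table 5 are not modelled.

```python
def glide_allowed(glide, onset_class):
    """Return True if the glide may follow an onset of the given class.

    onset_class is a place/manner label such as "velar" or
    "alveolo-palatal" (labels are illustrative); None means the
    glide itself serves as the onset.
    """
    if glide == "j":
        # /j/ is allowed in the majority of cases, except after alveolo-palatals.
        return onset_class != "alveolo-palatal"
    if glide == "w":
        # /w/ is allowed only after velars, after /h/, or as a glide onset.
        return onset_class in (None, "velar", "glottal fricative")
    raise ValueError(f"unknown glide: {glide!r}")


print(glide_allowed("j", "bilabial"))   # as in /pjɑ¹/ 'watch'
print(glide_allowed("w", "velar"))      # as in /kwɛ¹/ 'to close'
print(glide_allowed("w", "alveolar"))   # not reported in the description
```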

Table 4 Observed onset–rhyme combinations.

Table 5 Observed onset–glide combinations.

Onsetless syllables

In onsetless syllables with high-register tones (i.e. T1, T3, T5, and T7), the phonetic segment [ʔ] can be observed at the onset of the tone-bearing syllable, as in /ø¹/ [ʔø̈¹] ‘in safe’ and /si̟¹ ø¹/ [si̟⁴⁴ ʔø̈⁴²] ‘a city, Xi’an’. With respect to onsetless syllables with low-register tones (i.e. T2, T4, T6, and T8), we observe a phonetic realization of [ɦ] before a non-high vowel (e.g. /ø²/ [ɦø̈²] ‘cold’, /ɔ²/ [ɦɔ²] ‘shoes’, and /aʔ⁸/ [ɦaʔ⁸] ‘box’), in contrast to the cases when there is a high vowel or glide (e.g. /i²/ [ʝi²] ‘salt’, /jø²/ [ɥø²] ‘rounded’, /u²/ ‘river’, and /wʌʔ⁸/ ‘alive’). [ɦ] disappears in non-initial position within a prosodic word, e.g. /tʰɑ⁴ ɔ²/ ‘galoshes’. The general pattern is therefore similar to Shanghainese (Chen & Gussenhoven 2015).
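
The conditioning just described can be summarized as a decision procedure. The following Python sketch is a simplification under our own assumptions: the function name `onsetless_onset` and its parameters are hypothetical, tones are given as category numbers 1 to 8, and the realizations before high vowels and glides (e.g. [ʝi²], [ɥø²]) are not modelled beyond recording that [ɦ] is absent.

```python
HIGH_REGISTER = {1, 3, 5, 7}
LOW_REGISTER = {2, 4, 6, 8}


def onsetless_onset(tone, starts_with_high_vowel_or_glide, word_initial=True):
    """Return the phonetic onset segment of an onsetless syllable, or None.

    tone: lexical tone category (1-8).
    starts_with_high_vowel_or_glide: True for syllables such as /i²/ or /jø²/.
    word_initial: False for non-initial position within a prosodic word.
    """
    if tone in HIGH_REGISTER:
        # [ʔ] can be observed, as in /ø¹/ [ʔø̈¹].
        return "ʔ"
    if tone in LOW_REGISTER:
        if starts_with_high_vowel_or_glide:
            # No [ɦ]; the actual realization (e.g. [ʝi²]) is not modelled here.
            return None
        # [ɦ] before a non-high vowel, but it disappears non-initially.
        return "ɦ" if word_initial else None
    raise ValueError(f"unknown tone category: {tone!r}")


print(onsetless_onset(1, False))                      # /ø¹/
print(onsetless_onset(2, False))                      # /ø²/
print(onsetless_onset(2, False, word_initial=False))  # /ɔ²/ in /tʰɑ⁴ ɔ²/
```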

In Lili Wu Chinese, syllables with low-register tones show relatively stronger breathiness than their high-register counterparts. As indicated by Figure 12, the Fast Fourier Transform (FFT) spectrum of /ø¹/ [ʔø̈¹] ‘in safe’ (dark) and /ø²/ [ɦø̈²] ‘cold’ (light) shows the phonation contrast in the vowel /ø/, taken within an interval of approximately 30 ms from the first regular vocal pulse of the vowel. As shown by the measurements on H1 – H2 (i.e. amplitude difference between the first and second harmonics), there is a phonatory difference between the two vowels with /ø²/ (4.5 dB) showing more breathiness than /ø¹/ (2 dB). This contrast has also been observed in other Northern Wu dialects (Cao & Maddieson 1992).

Figure 12 FFT spectrum of /ø¹/ ‘in safe’ (dark) and /ø²/ ‘cold’ (light) over an interval of approximately 30 ms from the first regular vocal pulse of the vowel. The first two harmonics (H1 and H2) of each syllable are indicated.
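
For readers unfamiliar with the H1 – H2 measure, the Python sketch below shows how such a value can be obtained in principle. It is not the procedure used for Figure 12: it assumes that f0 is already known, uses a plain rectangular 30 ms window, applies no correction for formant influence, and runs on a synthetic two-harmonic signal rather than on the recordings. All names (`harmonic_db`, `h1_minus_h2`, `signal`) are ours.

```python
import cmath
import math


def harmonic_db(samples, sample_rate, freq):
    """Amplitude in dB (arbitrary reference) of one frequency component,
    computed as a single-bin discrete Fourier transform."""
    n = len(samples)
    acc = sum(x * cmath.exp(-2j * math.pi * freq * k / sample_rate)
              for k, x in enumerate(samples))
    return 20 * math.log10(2 * abs(acc) / n)


def h1_minus_h2(samples, sample_rate, f0):
    """Amplitude difference (dB) between the first and second harmonics."""
    h1 = harmonic_db(samples, sample_rate, f0)
    h2 = harmonic_db(samples, sample_rate, 2 * f0)
    return h1 - h2


# Synthetic test signal: 30 ms at 16 kHz, f0 = 100 Hz,
# second harmonic at half the amplitude of the first.
sample_rate = 16000
f0 = 100.0
n_samples = round(0.030 * sample_rate)
signal = [
    1.0 * math.sin(2 * math.pi * f0 * k / sample_rate)
    + 0.5 * math.sin(2 * math.pi * 2 * f0 * k / sample_rate)
    for k in range(n_samples)
]

print(round(h1_minus_h2(signal, sample_rate, f0), 2))
```

In the synthetic signal the first harmonic has twice the amplitude of the second, which corresponds to a difference of 20·log₁₀(2), or about 6 dB. On this scale, a larger positive H1 – H2, as reported for /ø²/ (4.5 dB) relative to /ø¹/ (2 dB), indicates a relatively stronger first harmonic and is read as more breathiness.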

Tone sandhi: A preliminary overview

Lexical tones over monosyllabic morphemes undergo changes when they are combined into compounds or phrases. In this description, we offer some preliminary observations concerning tone sandhi variations in Lili Wu Chinese over disyllabic compounds (hereafter called the tone unit). Tonal realization is mainly contingent upon the lexical tone of the second syllable (σ₂). Broadly, two patterns are observed.

First, when σ₂ carries an abrupt tone (i.e. T7 and T8 over a glottal-coda syllable), regardless of the syllable structure of the first syllable (σ₁), only level f0 contours surface, and the specific f0 height is dependent on the lexical tone of σ₁. After a high tone, a low tone appears; while after a low tone, a high tone appears. Both patterns are illustrated in Figure 13, which shows T1 (high–level) + T7/T8 (13a /tsʰəŋ¹ tsɪʔ⁷/ ‘the Spring Festival’, 13b /tɕɪŋ¹ dʑɪʔ⁸/ ‘Peking Opera’) and T6 (low–level) + T7/T8 (13c /tʰɔ⁶ kwʌʔ⁷/ ‘Thailand’, 13d /ʨʰi̟⁶ dɪʔ⁸/ ‘steam whistle’).

Figure 13 Waveforms, f0 tracks and spectrograms of (a) ‘the Spring Festival’, (b) ‘Peking Opera’, (c) ‘Thailand’, and (d) ‘steam whistle’.

Second, when σ₂ carries a non-abrupt tone (i.e. T1 to T6 over an open syllable or a syllable with a nasal coda), the lexical tonal contour of σ₁ remains and affects the pitch realization of σ₂. The specific f0 contour of σ₂ hinges upon the lexical tonal register of σ₁. When σ₁ is produced with a high-register tone (i.e. T1, T3, T5, and T7), σ₂ is typically realized with a falling f0 contour, as shown in Figure 14 (e.g. 14a /sɪŋ¹ zəŋ⁴/ ‘new kidney’, 14b /kɛ³ zɑ⁴/ ‘to remold’, 14c /tɕɔ⁵ zɑ⁴/ ‘introduction’, and 14d /kʊʔ⁷ tʰu⁴/ ‘territory’). However, other patterns have also been observed. For example, in the combination of T7 + σ₂, when σ₂ bears T1 (e.g. /kʊʔ⁷ kʰu¹/ ‘orthopaedics’), the underlying form of T1 in /kʰu¹/ (high–level) is preserved, instead of a predictable falling contour like /tʰu⁴/ in /kʊʔ⁷ tʰu⁴/ ‘territory’.

Figure 14 Waveforms, f0 tracks and spectrograms of (a) ‘new kidney’, (b) ‘to remold’, (c) ‘introduction’, and (d) ‘territory’.

When σ₁ is pronounced with a low-register tone (i.e. T2, T4, T6, and T8), the sandhi pattern tends to be more complicated. The tonal contour of σ₂ seems to also exert influence on the overall tonal realization. For example, Figure 15 shows the contrast of /pʰɔ⁶ tɕʰi⁴/ ‘to dispatch’ (15a) vs. /ʨʰi̟⁶ pʰɑ⁶/ ‘bubble’ (15b). Here, T4 in /tɕʰi⁴/ completely loses its underlying form (low–level) and is realized with a high-falling contour, similar to Shanghainese (Chen & Gussenhoven 2015). However, the lexical tone of the preceding tone T6 in /pʰɑ⁶/ (low–dipping) is only preserved to a certain extent. The same tone (i.e. T6) is realized with an audible pitch level difference: T6 in /pʰɔ⁶ tɕʰi⁴/ is overall lower than that in /ʨʰi̟⁶ pʰɑ⁶/.

Figure 15 Waveforms, f0 tracks and spectrograms of (a) ‘to dispatch’, and (b) ‘bubble’.
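
The general tendencies described so far can be collected in a short Python sketch. It is a deliberately coarse restatement, not a sandhi grammar: the function name `sandhi_sketch` and the returned labels are ours, the reading of ‘high tone’ and ‘low tone’ on σ₁ as high and low register is our assumption, and cases that the description calls more complicated are returned as unspecified.

```python
HIGH_REGISTER = {1, 3, 5, 7}
LOW_REGISTER = {2, 4, 6, 8}
ABRUPT = {7, 8}


def sandhi_sketch(t1, t2):
    """Rough label for the surface pitch of σ₂ in a disyllabic tone unit.

    t1, t2: lexical tone categories (1-8) of σ₁ and σ₂.
    """
    all_tones = HIGH_REGISTER | LOW_REGISTER
    if t1 not in all_tones or t2 not in all_tones:
        raise ValueError("tone categories must be integers from 1 to 8")

    if t2 in ABRUPT:
        # Only level contours surface; the height depends on σ₁
        # (assumed here to mean the register of σ₁).
        return "low level" if t1 in HIGH_REGISTER else "high level"

    if t1 in HIGH_REGISTER:
        if t1 == 7 and t2 == 1:
            # Reported exception: T1 keeps its high-level form after T7.
            return "high level (T1 preserved)"
        return "falling"

    # Low-register σ₁ with a non-abrupt σ₂: pattern reported as more complicated.
    return "unspecified"


print(sandhi_sketch(1, 7))  # e.g. 'the Spring Festival'
print(sandhi_sketch(6, 8))  # e.g. 'steam whistle'
print(sandhi_sketch(3, 4))  # e.g. 'to remold'
print(sandhi_sketch(6, 4))  # e.g. 'to dispatch'
```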

In addition, it is worth noting that syllables with aspirated onsets show two different patterns of changes. They pattern either with syllables that have unaspirated onsets and carry T1, or with syllables that have voiced onsets and carry T4, T6, or T8. For example, the sandhi change of /tsʰɪŋ¹ zɹ̩⁶/ ‘in person’ patterns with that of /sɪŋ¹ zɑ̃⁶/ ‘heart’; while /tsʰʌ⁸ djɘ̝o²/ ‘to stand out’ patterns with /zʌ⁸ djɘ̝o²/ ‘tongue’.

In conclusion, even within the arguably simplest construction beyond a monosyllabic morpheme (i.e. disyllabic compounds), Lili Wu Chinese already exhibits different patterns of tonal realization from its neighboring Northern Wu dialects such as Shanghainese (Chen & Gussenhoven 2015). Tonal realization is not only subject to the influence of the preceding tone, but also seems sensitive to tonal properties of the second syllable. In this illustration, we have presented only a preliminary glimpse into the pitch contours of disyllabic compounds in Lili Wu Chinese. Needless to say, more data and further research are needed.

Transcription of recorded passage ‘North Wind and the Sun’

The passage is transcribed phonemically, using the symbols presented in the vowel and consonant charts. Tones have been transcribed phonemically before ⁻ (with pitch levels for the eight tonal categories provided in the section on lexical tones). Listeners will find significant deviation in the actual pitch contours of these lexical tones due to contextual tonal variation (see Chen 2012 for a comprehensive review on tonal variation). Given the salient feature of tone sandhi in Lili Wu Chinese (and generally speaking in Wu dialects), we have also provided a transcription of tonal contours based on perceptual impressions of the pitch levels according to Chao’s system (after ⁻). For personal pronouns and modal particles, only actual pitch contours have been transcribed. The boundaries between syllables are indicated by spaces. The boundaries of tone units are marked by parentheses, with | marking the end of major phrases and || that of utterances. Our consultant tends to produce more creakiness in running speech. For example, creaky voice can be identified at the end of /kʰø²¹³/ ‘to look’ in ‘make a comparison (a bit, as an attempt)’ with a sudden rise of f0. It is also worth noting that segmental reduction is common in running speech. For example, /ʌ/ is sometimes reduced to [ə]. Square brackets [] mark the (allophonic) cases that deviate saliently, on auditory impression, from the phonemic transcription.

Phonemic transcription

Orthographic transcription

Acknowledgements

We would like to thank our principal consultant Mr Liangquan Cheng for making this possible. We are also grateful to Mr Haimin Li for his coordination and arrangement of the three fieldwork sessions conducted in Lili. In addition, we would like to thank Maarten Mous and Ruiqing Shen for valuable comments on earlier versions of our paper; Zhongmin Chen, Hang Cheng, Maarten Kossmann, Feng Ling, Zhongwei Shen, Yimin Sheng, Rujie Shi, Huan Tao, Ping Wang, Xinyi Wen, and Dan Yuan for sharing their thoughts with us on various linguistic aspects of the language; Feifan Wang for collecting pilot recordings; and He Huang, Huaqiang Song, and Lei Wang for sharing references. Moreover, we gratefully acknowledge the anonymous reviewers of this journal for helpful comments and suggestions. The proofreading assistance from Seamus Leith and the audio editing guidance from André Radtke are much appreciated. This work is supported by China Scholarship Council (CSC) and Leiden University Centre for Linguistics (LUCL) scholarship to the first author as well as the KNAW–China Exchange grant (13CDP012) from the Netherlands Royal Academy of Sciences to the second author. Neither the individuals and institutions cited herein nor the funding agencies, however, should be held responsible for the views expressed in this paper.

Supplementary material

To view supplementary material for this article, please visit https://doi.org/10.1017/S0025100320000092.

Footnotes

¹ In the literature, the same phenomenon has been referred to as Songqifendiao [lit.: ‘aspiration divides tones’] (e.g. Ho 1989), Qiliufendiao [lit.: ‘airflow divides tones’] (Yue Xu 2006), or Ciqingfendiao [lit.: ‘secondary voiceless divides tones’] (Zhu & Yue Xu 2009). The English appellations include ‘aspiration-conditioned tone-lowering’ (Sagart 1981), ‘tone-split by aspiration’ (Ho 1989), ‘aspirated tones’ (Shen 1994), ‘tonal split based on the aspiration’ (F. Shi 1998), and ‘tonal split following voiceless aspirated stop onsets’ (Chen 2011).

² In Northern Wu Chinese, there is a group of shared words with nasal initials (‘bound morpheme for mother/bound morpheme for the literary address of mother/beautiful/very/little darling’) that can co-occur with high-register tones. Such a co-occurrence is argued to be relevant to the affective function of high tones (Zhu 2004a).

³ Sinologists frequently use to describe the phoneme between and , which however, does not exist in the IPA.

⁴ The syllable is quite limited. Only one word ‘bound morpheme for the literary address of mother’ has been found in Lili Wu Chinese. The realization of , however, patterns with that after alveolar and velar consonants.

⁵ In Lee-Kim (2014: 264), the syllabic diacritic is not used for the sake of simplicity.

⁶ In Lili Wu Chinese, are obligatory to contain an onset.

⁷ ‘manner, degree and quality proximal deixis’ is the coalescence of two syllables ‘proximal deixis’ and ‘deictic suffix’ in fast utterance.

References

A Committee of the Soochow Literary Association (ed.). 1892. A syllabary of the Soochow dialect. Shanghai: American Presbyterian Mission Press.

Anderson, Victoria B. & Maddieson, Ian. 1994. Acoustic characteristics of Tiwi coronal stops. UCLA Working Papers in Phonetics 87, 131–162.

Baron, Stephen P. 1974. On the tip of many tongues: Apical vowels across Sino-Tibetan. Presented at 7th International Conference on Sino-Tibetan Language and Linguistic Studies, Atlanta, Georgia State University. Retrieved from https://halshs.archives-ouvertes.fr/halshs-01400987/document.

Boersma, Paul & Weenink, David. 2020. Praat: Doing phonetics by computer (version 6.0.18). http://www.praat.org/ (accessed 6 February 2020).

Butcher, Andrew. 1990. “Place of articulation” in Australian languages. In Roland, Seidl (ed.), 3rd Australian International Conference on Speech Science and Technology, Melbourne, 420–425.

Cao, Jianfen & Maddieson, Ian. 1992. An exploration of phonation types in Wu dialects of Chinese. Journal of Phonetics 20(1), 77–92.

Chao, Yuen-Ren. 1928. Xiandai Wuyu de Yanjiu [Studies of the modern Wu dialects]. Beijing: Tsinghua Xuexiao Yanjiuyuan.

Chao, Yuen-Ren. 1930. ə sistim əv “toun-letəz” [A system of ‘tone-letters’]. Le Maître Phonétique 8(45) no. 30, 24–27. [Reprinted in 1980 in Fangyan [Dialects] 2, 81–83.]

Chao, Yuen-Ren. 1967. Contrastive aspects of the Wu dialects. Language 43(1), 92–101.

Chen, Yiya. 2011. How does phonology guide phonetics in segment–f0 interaction? Journal of Phonetics 39(4), 612–625.

Chen, Yiya. 2012. Tonal variation. In Cohn, Abigail, Fougeron, Cecile & Huffman, Marie (eds.), Oxford handbook of laboratory phonology, 103–114. New York: Oxford University Press.

Chen, Yiya & Gussenhoven, Carlos. 2015. Shanghai Chinese. Journal of the International Phonetic Association 45(3), 321–337.

Chitoran, Ioana. 2002. A perception-production study of Romanian diphthongs and glide–vowel sequences. Journal of the International Phonetic Association 32(2), 203–222.

Cole, Ronald A. & Cooper, William E. 1975. Perception of voicing in English affricates and fricatives. The Journal of the Acoustical Society of America 58(6), 1280–1287.

Connell, Bruce. 2007. Mambila fricative vowels and Bantu spirantisation. Africana Linguistica 13(1), 7–31.

Dart, Sarah N. 1991. Articulatory and acoustic properties of apical and laminal articulations. UCLA Working Papers in Phonetics 79, 1–155.

Duanmu, San. 2000. The phonology of Standard Mandarin. New York: Oxford University Press.

Erickson, Blaine. 2001. On the origins of labialized consonants in Lao. In Adams, Karen L. & Hudak, Thomas J. (eds.), Papers from the 6th Annual Meeting of the Southeast Asian Linguistics Society, Tempe, Arizona State University, 135–148.

Fant, Gunnar. 1960. Acoustic theory of speech production: With calculations based on X-ray studies of Russian articulations. The Hague: Mouton.

Faytak, Matthew & Lin, Susan. 2015. Articulatory variability and fricative noise in apical vowels. In The Scottish Consortium for ICPhS 2015 (ed.), 18th International Congress of Phonetic Sciences (ICPhS XVIII), Glasgow, University of Glasgow, https://www.internationalphoneticassociation.org/icphs-proceedings/ICPhS2015/Papers/ICPHS0516.pdf.

Faytak, Matthew & Merrill, John. 2015. Bantu spirantization is a reflex of vowel spirantization. Presented at Sound Change in Interacting Human Systems: 3rd Biennial Workshop on Sound Change, Berkeley, University of California, http://linguistics.berkeley.edu/SCIHS/abstracts/5_FridayAfternoon/Faytak_Merrill.pdf.

Gao, Jiayin. 2015. Interdependence between tones, segments, and phonation types in Shanghai Chinese: Acoustics, articulation, perception, and evolution. Ph.D. dissertation, Université Sorbonne Nouvelle – Paris III.

Hirayama, Hisao. 2010. Ciyindiao xingcheng de shengxue yuanyin – Yi Wujiang fangyan weili [On the phonetic cause of the evolution of the Ci-yin tones: The case of the Wujiang dialects]. Zhongguo fangyan xuebao [Journal of Chinese Dialectology] 2, 17–23.

Ho, Dah-An. 1989. Songqifendiao jiqi xiangguan wenti [The tone-split by aspiration and other related problems]. Lishi yuyan yanjiusuo jikan [Bulletin of Institute of History and Philology] 60(4), 765–778.

Howie, John M. 1976. Acoustical studies of Mandarin vowels and tones. Cambridge: Cambridge University Press.

Hu, Fang. 2007. Lun Ningbo fangyan he Suzhou fangyan qiangaoyuanyin de qubietezheng – Jiantan gaoyuanyin jixu gaohua xianxiang [On the distinctive features for the high front vowels in Ningbo and Suzhou Wu dialects with reference to sound changes of high vowels]. Zhongguo yuwen [Studies of the Chinese Language] 5, 455–465.

Hu, Fang & Ling, Feng. 2019. Fricative vowels as an intermediate stage of vowel apicalization. Language and Linguistics 20(1), 1–14.

Karlgren, Bernhard. 1915. Études sur la phonologie Chinoise I [Studies in Chinese phonology I]. Leyde: Brill & Stockholm: Norstedt.

Ladefoged, Peter & Maddieson, Ian. 1996. The sounds of the world’s languages. Oxford: Blackwell.

Lee, Wai-Sum & Zee, Eric. 2003. Standard Chinese (Beijing). Journal of the International Phonetic Association 33, 109–112.

Lee-Kim, Sang-Im. 2014. Revisiting Mandarin ‘apical vowels’: An articulatory and acoustic study. Journal of the International Phonetic Association 44, 261–282.

Ling, Feng. 2007. The articulatory and acoustic study of fricative vowels in Suzhou Chinese. In Trouvain, Jürgen & Barry, William J. (eds.), 16th International Congress of Phonetic Sciences (ICPhS XVI), Saarbrücken, Saarland University, 573–576.

Ling, Feng. 2011. Suzhouhua [i] yuanyin de yuyinxue fenxi [A phonetic study of vowel [i] in the Suzhou Wu dialect]. In Centre for Chinese Linguistics Peking University (ed.), Yuyanxue luncong [Essays on linguistics], vol. 43, 177–193. Beijing: Shangwu Yinshuguan.

Maddieson, Ian & Emmorey, Karen. 1985. Relationship between semivowels and vowels: Cross-linguistic investigations of acoustic difference and coarticulation. Phonetica 42(4), 163–174.

Qian, Nairong. 1992. Dangdai Wuyu Yanjiu [Studies of the contemporary Wu dialects]. Shanghai: Shanghai Jiaoyu Chubanshe.

Ren, Nianqi. 1992. Phonation types and stop consonant distinctions: Shanghai Chinese. Ph.D. dissertation, University of Connecticut.

Sagart, Laurent. 1981. Aspiration-conditioned tone-lowering in Chinese dialects. Presented at 14th International Conference on Sino-Tibetan Languages and Linguistics, Gainesville, University of Florida, https://www.academia.edu/38233622/Aspiration-conditioned_tone_lowering_in_Chinese_dialects.

Shen, Zhongwei. 1994. The tones in Wujiang dialect. Journal of Chinese Linguistics 22(2), 279–315.

Shen, Zhongwei. 2006. Syllabic nasals in Chinese dialects. Bulletin of Chinese Linguistics 1(1), 77–104. Leiden: Brill.

Shen, Zhongwei, Wooters, Charles & Wang, William S.-Y. 1987. Closure duration in the classification of stops: A statistical analysis. In Joseph, Brian D. & Zwicky, Arnold M. (eds.), Ohio State University Working Papers in Linguistics 35, 197–209.

Shi, Feng. 1992. Wujiang fangyan shengdiaogeju de fenxi [The analysis of the tonal space in the Wujiang dialects]. Fangyan [Dialects] 3, 189–194.

Shi, Feng. 1998. Songqi shengmu duiyu shengdiao de yingxiang [The influence of aspiration on tones]. Journal of Chinese Linguistics 26(1), 126–145.

Shi, Menghui, Chen, Yiya & Mous, Maarten. 2016. Raising or lowering: The effect of aspiration-induced f0 perturbation in Lili Wu. In Department of Chinese Literature and Language of Fudan University (ed.), 9th International Conference on Wu Dialects, Suzhou, Suzhou University of Science and Technology, 33–36.

Shi, Rujie. 1998. Hanyu fangyan zhong gaoyuanyin de qiangmoca qingxiang [The friction of the high vowel in Chinese dialects]. Yuyan yanjiu [Linguistic Studies] 1, 100–109.

Stevens, Kenneth N. 1989. On the quantal nature of speech. Journal of Phonetics 17(1), 3–45.

Stevens, Kenneth N. & Blumstein, Sheila E. 1975. Quantal aspects of consonant production and perception: A study of retroflex stop consonants. Journal of Phonetics 3(4), 215–233.

Vaissière, Jacqueline. 2011. On the acoustic and perceptual characterization of reference vowels in a cross-language perspective. In Lee, Wai-Sum & Zee, Eric (eds.), 17th International Congress of Phonetic Sciences (ICPhS XVII), Hong Kong, City University of Hong Kong, 52–59.

Wang, Li. 1985. Hanyu Yuyinshi [The history of Chinese philology]. Beijing: Zhongguo Shehui Kexue Chubanshe.

Wang, Ping. 1987. Suzhou yinxi zaifenxi [A reanalysis of phonology in the Suzhou dialect]. Yuyan yanjiu [Linguistic Studies] 1, 41–48.

Wang, Ping. 2008. Wujiang fangyan shengdiao zaitaolun [Tones in the Wujiang dialect: A revisit]. Zhongguo yuwen [Studies of the Chinese Language] 5, 455–461.

Wang, Ping. 2010. Wujiangshi Fangyanzhi [A description of the Wujiang dialects]. Shanghai: Shanghai Shehui Kexue Chubanshe.

Wiese, Richard. 1997. Underspecification and the description of Chinese vowels. In Wang, Jialing & Smith, Norval (eds.), Studies in Chinese phonology, 219–249. Berlin & New York: Mouton de Gruyter.

Wurm, Stephan A., Li, Rong, Baumann, Theo & Lee, Mei W. (eds.). 1987. Language atlas of China. Hong Kong: Longman.

Xu, Yanhong. 2013. Wujiang fangyan shengdiao shiyan yanjiu [An experimental study in Wujiang citation tones]. MA thesis, Shanghai University.

Xu, Yue. 2006. Jiangshan fangyan de Qiliufendiao [The tone-split by aspiration of the Jiashan dialect]. Yuyan yanjiu [Linguistic Studies] 3, 77–80.

Xu, Zhen. 2009. Wujiang fangyan shengdiao yanjiu [A reinvestigation of tones in the Wujiang dialects]. MA thesis, Shanghai Normal University.

Ye, Xiangling. 1983. Wujiang fangyan shengdiao zaidiaocha [A revisit to tones in the Wujiang dialects]. Fangyan [Dialects] 1, 32–35.

Zhang, Gonggui & Liu, Danqing. 1983. Wujiang fangyan shengdiao chubu diaocha [A preliminary investigation of tones in the Wujiang dialects]. Nanjing Shida xuebao (Shehui kexue ban) [Journal of Nanjing Normal University (Social Science Edition)] 3, 37–40.

Zhao, Rixin. 2007. Hanyu fangyan zhong de [i] > [ɿ] [Sound change [i] > [ɿ] in Chinese dialects]. Zhongguo yuwen [Studies of the Chinese Language] 1, 46–54.

Zhu, Xiaonong. 2004a. Qinmi yu gaodiao: Dui xiaochengdiao, nvguoyin, meimei deng yuyan xianxiang de shengwuxue jieshi [Intimacy and high pitch: A biological explanation for the use of high pitch in diminutives, ‘female Mandarin’, ‘Taiwan sisters’, etc.]. Dangdai yuyanxue [Contemporary Linguistics] 3, 193–222.

Zhu, Xiaonong. 2004b. Hanyu yuanyin de Gaodingchuwei [Sound changes of high vowels in Chinese dialects]. Zhongguo yuwen [Studies of the Chinese Language] 5, 440–451.

Zhu, Xiaonong & Xu, Yue. 2009. Chihua: Tansuo Wujiang Ciqingfendiao de yuanyin [Slack voice as the cause of tone split in Wujiang: A phonetic investigation into the pitch contour of the syllables with a voiceless aspirated onset in the Wujiang (Songling) Wu dialect]. Zhongguo yuwen [Studies of the Chinese Language] 4, 324–332.
