Phonological Identity of the Neutral-tone Syllables in Taiwan Mandarin: An Acoustic Study

Taiwan Mandarin, one of the more syllable-timed dialects of Mandarin, has fewer unstressed syllables than Standard Mandarin. Acoustic analyses show that the supposedly unstressed syllables (neutral-tone syllables) in Taiwan Mandarin behave differently from those of Standard Mandarin. Unlike their Standard Mandarin counterparts, these syllables do not raise their pitch after Tone 3. They have a distinct static mid-low pitch target, and the target is implemented with greater articulatory strength. Moreover, acoustic analyses demonstrate that not all of these “unstressed syllables” are unstressed. The phonetic evidence suggests that these neutral-tone syllables should be analyzed as unaccented rather than unstressed in Taiwan Mandarin. These unaccented syllables are only lexically marked, and their pitch is neutralized into a mid-low tone. This study sheds light on how rhythm can affect stress and accent in a lexical tone language.

Since f0 is one of the acoustic correlates of stress, one might wonder whether lexical stress is compatible with tone, especially in complex tone systems in which f0 contours are used to contrast every syllable. The answer is yes. Lexical stress can be found in complex lexical tone languages, mostly located in East Asia. Although some complex tone languages are argued to have no lexical stress, e.g. Cantonese (Bauer & Benedict, 1997) and Southern Vietnamese (Brunelle, 2017), other complex tone languages have been documented to have lexical stress. In these cases, the contrast of stress is manifested through the reduction of unstressed syllables: reduced duration and different degrees of tone neutralization are often found. For example, in Burmese, the unstressed syllables are called "minor" syllables. A minor syllable has a shorter duration and its vowel is neutralized to [ə]. The pitch of the minor syllable has been described as "variable" (Bradley, 1982) and shows no high f0 peak (Gruber, 2011). In Thai, it has been reported that the five lexical tones are neutralized into three tonal registers (low, mid, high) when they are unstressed (Potisuk, Gandour, & Harper, 1994, 1996). However, this analysis is still controversial (Gandour, Tumtavitikul, & Satthamnuwong, 1999; Moren & Zsiga, 2006). Similarly, in Nanchang Chinese, a Gan dialect, the lexically unstressed syllables also show tone neutralization, in which the five lexical tones are neutralized into two different pitch patterns due to shorter rime durations (J. Liu & Zhang, 2012).

Standard Mandarin
Standard Mandarin is perhaps the best-studied case of lexical stress in complex tone systems. The unstressed syllables are mostly lexically determined in Standard Mandarin (SM) and they are referred to as having a "neutral tone" (Chao, 1968, p. 36). An unstressed syllable is reduced in SM (Chao, 1933, 1956, 1968; Luo & Wang, 1957). The unstressed vowel centralizes towards a schwa, and its coda nasal is usually deleted while the vowel is nasalized (Duanmu, 2000, p. 256; Lin & Yan, 1990). As a result, an unstressed syllable is also shorter in duration. The mean duration of unstressed syllables is about 50%-60% of that of the stressed ones (J. Cao, 1986; Lee & Zee, 2008; Lin & Yan, 1980). As for intensity, an unstressed syllable does not necessarily have a weakened maximum intensity compared to a stressed syllable or its preceding syllable (J. Cao, 1986; Lin & Yan, 1980), and the intensity curves seem to co-vary with the pitch contours (Lee & Zee, 2008). Furthermore, there are lenition processes found only in unstressed syllables. In connected speech, an unstressed vowel can be devoiced or deleted when the vowel is high and follows a fricative, an aspirated stop, or an affricate (Duanmu, 2000, p. 257). The initial of an unstressed syllable in connected speech also often undergoes lenition (Chao, 1968, p. 38; Duanmu, 2000, pp. 255-256; S. Xu, 1980, p. 159).
Stress also affects Mandarin on the suprasegmental level. There are four lexical tones. Tone 1, Tone 2, Tone 3, and Tone 4 have high level /H/ [˥], rising /LH/ [˧˥], dipping (low falling) /L/ [˨˩], and high falling /HL/ [˥˩] pitch contours, respectively. The underlying tone of an unstressed syllable in SM is not realized, and its pitch is determined by the preceding tone (Chao, 1932; Luo & Wang, 1957). Therefore an unstressed syllable is considered toneless (Duanmu, 2007, pp. 242-243) and has what is normally termed neutral tone /Ø/. The earlier descriptions of the unstressed syllables were mainly impressionistic (Chao, 1932, 1933; Kratochvil, 1968), but Chao's description (1968, p. 27) is still widely cited: the pitch of the neutral tone is half-low after Tone 1 /H/, middle after Tone 2 /LH/, half-high after Tone 3 /L/, and low after Tone 4 /HL/. The instrumental studies showed similar results, though the findings varied slightly in terms of the contour details (Cheng, 1973; Dreher & Lee, 1968; Gao, 1980; Lin & Yan, 1980; Shen, 1990). The most recent acoustic studies showed that the pitch contours of the neutral tone were [˦˩, ˥˩, ˦/˧, ˨˩] after the four lexical tones /H, LH, L, HL/ [˥, ˧˥, ˨˩, ˥˩], respectively (Lee & Zee, 2008). The pitch contours of a neutral tone appear to be an extension of the preceding tone (Z. Li, 2003). However, Chen and Xu (2006) found that when there are consecutive unstressed syllables, the pitch contours slowly approach a mid-level target over the course of these syllables.
At least in Standard Mandarin, scholars disagree on how lexical stress (or lack of stress) in complex tone languages should be analyzed (L. Liu, 2002). Unlike nontonal languages, where the primary stress in a word is marked, it seems that the unstressed syllables are marked in these complex tone languages. Because of that, many analyses mark the unstressed syllables as having a short tone and specify the short duration as a feature of the underlying tone, since it is common in complex tone languages to have the categories of both long and short tones (e.g. Cantonese, Thai). The problem with the short tone analysis in Standard Mandarin is that some syllables appear to be de-stressed at the post-lexical level. Aside from the lexically marked neutral-tone syllables, which comprise about 15-20% of the syllables in written texts (W. Li, 1981, p. 35), SM speakers tend to de-stress the second syllable of disyllabic or trisyllabic words in colloquial speech (Chao, 1932; Shen, 1990, pp. 38-39). Although unstressed syllables can also be stressed if the word is infrequent (Chao, 1932; Jing, 2002), overall, about one-third of all SM syllables are unstressed and toneless in connected speech (Duanmu, 2000, pp. 257-258). Therefore in SM, it is better to analyze these syllables as unstressed, and to treat the neutral tone in SM as a phonetic representation of an unstressed syllable.

Taiwan Mandarin
However, not all Mandarin dialects exhibit a similar pattern. Taiwan Mandarin is a dialect spoken in Taiwan that has been influenced by Southern Min. Mandarin was brought into Taiwan as a standard language when the Nationalist government lost the civil war and retreated to Taiwan in 1949. While the Mainland immigrants spoke some variety of Mandarin, 70% of the population in Taiwan spoke Southern Min as a first language and were forced to speak Mandarin as a second language (Sandel, 2003). After decades of language shift, Mandarin has become the dominant language and has developed into a fully-fledged dialect (Her, 2010). One of the features of the Taiwan Mandarin (TM) dialect is that the differences between stressed and unstressed syllables are not very distinct perceptually: TM is often described impressionistically as being more syllable-timed than SM (Kubler, 1985).
Since the syllable canons are similar across Mandarin dialects, the more syllable-timed rhythm in TM lies in the contrast between stressed and unstressed syllables. Like SM, TM also lexically marks the four lexical tones and the neutral tone. However, lexically marked neutral-tone syllables and syllable de-stressing occur less frequently in TM than in SM (Duanmu, 2007, p. 308; Kubler, 1985, p. 161; Swihart, 2003, p. 110; Tsao, 2000). Also, Duanmu (2000, pp. 266-267) noticed that, unlike in SM, vowel devoicing and deletion are not found in unstressed syllables in TM. Consonant reduction is found in TM, but it can happen in stressed syllables too. Even consonants in word-initial position can reduce in TM; for example, /kʰə-ʂi/: [kʰɤ-sɯ] 'but' becomes [ɤ-sɯ], and /ʐan-xəu/: [zan-xəu] 'afterwards' becomes [ã-əu]. This unconditioned consonant reduction may be a manifestation of the rhythmic differences between SM and TM because the reduction is not restricted to unstressed syllables: a more syllable-timed language tends to treat stressed and unstressed syllables similarly (Dauer, 1983).
The smaller number of neutral-tone syllables can be demonstrated in the prescriptive grammar of TM. I compared the words in the List of Neutral-Tone Words for the Standard Mandarin Proficiency Test with the online Revised Mandarin Chinese Dictionary published by the Ministry of Education in Taiwan. I found that all the suffixes and the reduplicants remain prescriptively marked as having a neutral tone in TM. However, only about half of the compound words (138/272) were marked as having full-neutral disyllabic patterns in TM, while the other half (134/272) were marked with lexical tones. For example, xiàba [ɕjɑ HL pɑ Ø ] 'chin' in SM remains marked as /HL-Ø/ in TM, but yuèliang [yɛ HL ljɑŋ Ø ] 'moon' in SM is marked as /HL-HL/ in TM. I further examined these prescriptively neutral-tone syllables in TM by trying to input the syllables/characters in the computer's operating system, assuming that the input system is designed to let TM users input these syllables in the most intuitive way. I found that 116 out of 138 (84%) of the TM prescriptive neutral-tone syllables in compound words cannot be found under the neutral-tone entry. This piece of evidence shows that only 8.1% ((138-116)/272) of the SM full-neutral compound words are treated as full-neutral in TM. The percentage of full-neutral disyllabic words might be even lower. For example, although a neutral tone needs to be typed in luó-bo [lwo LH pwo Ø ] 'radishes' and yào-shi [jɑo HL ʂɨ Ø ] 'keys', they seem to be pronounced luó-bō [lwo LH pwo H ] and yào-shǐ [jɑo HL sɨ L ] (or yào-shí [jɑo HL sɨ LH ]) in TM, respectively.
As for the remaining neutral-tone syllables in TM, some of them clearly do not possess an underlying lexical tone. For example, the second syllable of dìdi [ti HL -ti Ø ] 'younger brother' seems to have a neutral tone because the second di [ti Ø ] is not pronounced with [HL]. Many reduplicated kinship terms also show a similar pattern, in which the second syllable clearly does not possess the lexical tone of the character. Others, mostly grammatical suffixes and final particles, do not have a lexical tone that is transparent to present-day speakers because they were reduced and grammaticalized diachronically. For example, durative zhe [ʈʂə Ø ] was grammaticalized from zháo [ʈʂɑo LH ] 'contact', but this origin is not obvious to the speakers.
Only a few studies have investigated the acoustic properties of these remaining neutral-tone syllables. Two preliminary studies suggest that the pitch of the TM neutral tone is not influenced by the preceding lexical tones (J. Li, 2005; Tseng, 2004). The pitch pattern of the TM neutral tone was described as having a "certain pitch target" (J. Li, 2005) or as being pronounced with a low entering tone (short tone) (Tseng, 2004). However, both preliminary studies were based on limited sets of data without controlling for the environment or systematically comparing the neutral tone to the other lexical tones. It remains unclear how the pitch patterns of the neutral tone should be characterized in relation to the other lexical tones, and whether these syllables are in fact unstressed. If the TM neutral tone has some kind of tonal target, as both previous studies suggested, does that imply that the TM neutral-tone syllable has its own tonal representation? If the TM neutral tone has an underlying pitch target that is different from the existing four lexical tones and it is not reduced, we might need to posit a new lexical tone in Taiwan Mandarin. On the other hand, it could simply be the case that the unstressed syllables in TM still reduce and undergo tone neutralization, but their pitch patterns are neutralized differently from those in SM. To the best of my knowledge, there has been no complete study that examines the acoustic features of the neutral tone in TM and further analyzes the phonological identity of the TM neutral tone.
This acoustic study aims to answer the following research questions: 1) What is the pitch target of the TM neutral tone? Specifically, how is its pitch contour different from the SM neutral tone, and how is the pitch contour different from the other TM lexical tones? 2) Is the TM neutral tone unstressed like in SM? Are the syllables reduced in duration or intensity? By answering these questions, I hope to characterize the TM neutral-tone syllables and provide a clearer picture of the phonological identity of these "unstressed" syllables in TM. As Taiwan Mandarin is perceived as more syllable-timed than SM, the treatment of the so-called "unstressed" syllables in TM offers a good opportunity to understand how rhythmic differences are manifested in a tone language. It would also further our understanding of lexical stress in complex tone languages.
In this study, I will examine the neutral-tone syllables in TM. I will focus on the grammatical morphemes and the reduplicated kinship terms because, unlike particles and compound words, these items are both frequent and relatively stable in pitch. Two production tests were conducted to characterize the neutral tone in TM: the first test, in Section 2, investigated consecutive neutral tones in TM in comparison with the previous SM results in order to uncover the pitch target of the neutral tone, while the second test, in Section 3, compared the neutral tone with the similar Tone 3 /L/ to characterize the pitch target of the neutral tone and to observe whether the neutral-tone syllables are reduced.

The pitch target of the neutral-tone syllables
In Standard Mandarin, consecutive neutral tones are influenced by the preceding lexical tones and slowly approach a mid target. Unlike a lexical tone, in which the influence of the preceding lexical tone is quickly overcome in order to implement the tonal target, consecutive neutral tones still show a substantial influence of the preceding lexical tone at the end of the third neutral tone (Chen & Xu, 2006). I hypothesize that, unlike in SM, consecutive TM neutral tones do not exhibit this tonal behavior. I expect that when producing consecutive neutral tones, the TM speakers will approximate a pitch target in each consecutive neutral tone. Therefore the carry-over effects of the preceding lexical tones are expected to be overcome more quickly than in the SM neutral tones, and neutral-tone contours with different preceding tones will share similar contour shapes and pitch registers in the second and the third neutral tone.

Design
Several prescriptive neutral-tone syllables were tested. They include four reduplicants (RED) ma, po, nai, and mei [mɑ Ø, pʰwo Ø, nɑɪ Ø, mɛ Ø], the nominalizer (NMLZ)/possessive (POSS) de [tə Ø], the plural (PL) men [mən Ø], the durative (DUR) zhe [tʃə Ø], and the perfective (PFV) le [lə Ø]. These tested neutral-tone syllables were combined into three sets of meaningful consecutive neutral-tone syllables: 1) -X-men-de [~mən Ø tə Ø] '-RED-PL-POSS', 2) -zhe-de [tʃə Ø tə Ø] '-DUR-NMLZ', and 3) -le-de [lə Ø tə Ø] '-PFV-NMLZ'. The first set of sentences had three consecutive neutral tones and the other two sets had two. Each set was put into a meaningful sentence in order to elicit more natural speech. The full list of stimulus sentences is provided in Appendix A.
Following Chen and Xu's (2006) methodology, in order to see the influence of the preceding lexical tones, the syllable before the consecutive neutral tones (X) varied across the four lexical tones. N represents the neutral-tone syllables (two or three), and the syllable following the consecutive neutral tones (Y) also varied in order to control for contextual influence. For Y, only /L/ and /HL/ were included because /L/ starts at the lowest pitch and /HL/ starts at the highest pitch. As a result, the subjects read 4 (lexical tones of X) × 3 (sets of consecutive neutral-tone syllables) × 2 (tones of Y) = 24 sentences.
Six male and six female TM speakers between 21 and 33 years old, without any speaking or hearing impairment, were recruited. The subjects all grew up in Taiwan and their parents all speak Southern Min as their native language. All twelve subjects speak Southern Min, Mandarin, and English. The subjects had not resided in any foreign country for more than three months in the preceding year, a fact that reduces the possibility of dialectal influences from other Mandarin varieties.
The subjects were asked to read these 24 sentences written in Chinese characters. These neutral-tone combinations are semantically natural and relatively frequent, and the subjects were expected to produce them without any difficulties. There was no practice session, but the subjects were encouraged to re-read a sentence if they felt they had made a mistake.
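As a quick check on the factorial design, the 24 conditions can be enumerated programmatically. The following is only an illustrative sketch: the set labels are shorthand taken from the text, and the function name `build_conditions` is hypothetical rather than part of the original materials.

```python
from itertools import product

# Labels follow the text; they are shorthand for this sketch only.
PRECEDING_TONES = ["H", "LH", "L", "HL"]        # lexical tone of syllable X
NEUTRAL_SETS = ["X-men-de", "zhe-de", "le-de"]  # sets of consecutive neutral tones
FOLLOWING_TONES = ["L", "HL"]                   # lexical tone of syllable Y


def build_conditions():
    """Return every (X tone, neutral-tone set, Y tone) combination."""
    return list(product(PRECEDING_TONES, NEUTRAL_SETS, FOLLOWING_TONES))


if __name__ == "__main__":
    conditions = build_conditions()
    print(len(conditions))  # 4 * 3 * 2 = 24
```

Each tuple corresponds to one stimulus sentence; the actual sentences are those listed in Appendix A.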

Data
The subjects were recorded with a Zoom H2 recorder in quiet offices. The speech was recorded at a sample rate of 44100 Hz and was analyzed in Praat. Only the rimes of the syllables were segmented in order to compare the f0 contours across different syllables. Otherwise, a syllable with an initial obstruent would have f0 readings in only about half of the syllable, while a syllable with an initial sonorant would have f0 readings throughout the syllable. Phonetic studies also show that the pitch contours in the initial consonants are irregular, and that the tones are not implemented until the beginning of the rimes (Howie, 1976; Y. Xu, 1999).
The rimes were hand-labeled. With the aid of a Praat script, the f0 of each rime was measured at the midpoint of each of ten equal intervals (the 5%, 15%, ..., 95% points of the rime). Measurements were then examined and adjusted for pitch halving and pitch doubling. If a tested syllable had more than 5 consecutive pitch readings missing (mostly due to creaky voice), the whole sentence was excluded.
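The sampling scheme and the exclusion rule can be sketched as follows. This is not the Praat script used in the study; it is a minimal Python illustration, and the function names, the use of `None` for a missing reading, and the example boundary times are all assumptions made for the sketch.

```python
def measurement_times(rime_start, rime_end, n_intervals=10):
    """Midpoints of n equal intervals of the rime.

    With n_intervals=10 these are the 5%, 15%, ..., 95% points.
    """
    if rime_end <= rime_start:
        raise ValueError("rime_end must be later than rime_start")
    duration = rime_end - rime_start
    return [rime_start + duration * (i + 0.5) / n_intervals
            for i in range(n_intervals)]


def has_long_gap(f0_values, max_missing=5):
    """True if more than max_missing consecutive readings are missing (None)."""
    run = 0
    for value in f0_values:
        run = run + 1 if value is None else 0
        if run > max_missing:
            return True
    return False


if __name__ == "__main__":
    # Hypothetical rime from 1.20 s to 1.35 s.
    print(measurement_times(1.20, 1.35))
    # Six consecutive missing readings would exclude the sentence.
    print(has_long_gap([None] * 6 + [180.0] * 4))
```

Sampling at interval midpoints rather than at the rime edges keeps the readings away from the segment boundaries, where f0 tracking tends to be least reliable.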
The f0 values were first converted into semitones and then normalized to z-scores by speaker in order to reflect the relative pitch height within each speaker's pitch range. The pitch data are distributed between -2 ~ 2 semitone z-score (-2, -1, 0, 1, 2), which can be interpreted as the low, mid-low, mid, mid-high, and high pitch ranges. The pitch data presented in charts were normalized in semitone z-score. However, pitch data in semitones were used in the statistical analyses.
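A minimal sketch of this two-step normalization is given below. The text does not state the reference frequency used for the semitone conversion, so the 100 Hz reference here is an assumption (it cancels out in the z-score in any case), as is the use of the sample standard deviation; the function names are hypothetical.

```python
import math
import statistics


def hz_to_semitones(f0_hz, ref_hz=100.0):
    """Convert f0 in Hz to semitones relative to ref_hz (assumed reference)."""
    return 12.0 * math.log2(f0_hz / ref_hz)


def zscore_by_speaker(f0_by_speaker, ref_hz=100.0):
    """Map each speaker's f0 readings (Hz) to z-scores of their semitone values.

    f0_by_speaker: dict of speaker id -> list of f0 readings in Hz.
    """
    normalized = {}
    for speaker, f0_values in f0_by_speaker.items():
        semitones = [hz_to_semitones(v, ref_hz) for v in f0_values]
        mean = statistics.mean(semitones)
        sd = statistics.stdev(semitones)  # sample SD; an assumption of this sketch
        normalized[speaker] = [(s - mean) / sd for s in semitones]
    return normalized


if __name__ == "__main__":
    # Hypothetical readings for two speakers with different pitch ranges.
    data = {"m1": [100.0, 120.0, 150.0], "f1": [180.0, 220.0, 260.0]}
    print(zscore_by_speaker(data))
```

Because the z-score is computed within each speaker, a value of 0 marks that speaker's own mean pitch, which is what allows male and female contours to be averaged on one scale.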

Analysis
The data were further analyzed with linear mixed models in SPSS Statistics 23. A mixed-models approach was chosen because it retains the remaining data without listwise deletion when there are missing data in repeated measures. The models first included all the tested fixed effects, random intercepts, and random slopes. The tested fixed effects were kept intact regardless of their statistical significance, but the other parameters were then reduced to reach the best-fit model. Model selection was based on the comparison of Akaike Information Criterion (AIC) scores. Insignificant interactions were first dropped from the models to see whether doing so yielded a lower AIC score; insignificant main effects were then dropped unless they contributed to a significant interaction.
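The AIC comparison itself is simple arithmetic: AIC = 2k - 2 ln L, where k is the number of estimated parameters and L is the maximized likelihood, and the model with the lower score is preferred. The sketch below illustrates only that comparison step, not the SPSS procedure; the log-likelihoods and parameter counts are made-up values, and the function names are hypothetical.

```python
def aic(log_likelihood, n_params):
    """Akaike Information Criterion: 2k - 2 ln L."""
    return 2 * n_params - 2 * log_likelihood


def pick_lower_aic(candidates):
    """Return (best model name, all scores) for a dict of
    name -> (log_likelihood, n_params)."""
    scores = {name: aic(ll, k) for name, (ll, k) in candidates.items()}
    best = min(scores, key=scores.get)
    return best, scores


if __name__ == "__main__":
    # Hypothetical fits: dropping three random-effect parameters costs
    # only 0.5 units of log-likelihood, so the reduced model wins.
    candidates = {
        "full": (-1000.0, 12),
        "reduced": (-1000.5, 9),
    }
    print(pick_lower_aic(candidates))
```

The penalty term 2k is what makes a simpler random-effects structure preferable when it fits almost as well as the full one.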
In this experiment, in order to analyze the pitch contours of the consecutive neutral tones, the dependent variables were the repeated measures of pitch readings. My main focus is on the effects of the preceding tones on the consecutive neutral tones. Therefore the fixed effects included the preceding tone X (H, LH, L, HL) as a factor, position (10 pitch points labelled 0-9, representing the 5%, 15%, 25%, ..., 95% points) as a covariate, and the X-position interaction, which captured how the pitch changes along the repeatedly measured pitch points (i.e. the pitch contour) differed by preceding tone. The random effects included both subjects (12 speakers) and items (tested sentences) as random intercepts in order to control for the variation coming from the speakers and the tested items. By-subject random slopes on position, preceding tone X, following tone Y (HL, L), the neutral tone category (-zhe-de, -le-de) in the case of two consecutive tones, and their interactions were included; by-item random slopes on position were also included.
Based on my hypothesis, I expect to see that at the second and the third neutral tone, the influence of the preceding tone is overcome, which means that the fixed effect of the preceding tone will not be significant; I also expect that the second and the third neutral tones will share a similar pitch contour regardless of their preceding lexical tones. Therefore the fixed effect of the X-position interaction is expected to be insignificant.

Two consecutive neutral tones
The average pitch contours of the two consecutive neutral tones are plotted in Figure 1. Only the rimes were measured and graphed. The figure shows the eight average normalized pitch contours of the 12 TM speakers producing two sets of two consecutive neutral tones (-zhe-de and -le-de). The eight pitch contours vary the four preceding tones (syllable X) and the two following tones (syllable Y). Two linear mixed models, one for each neutral tone, support the pattern visible in the figure. For the first neutral tone, the linear mixed model showed that there were significant effects of preceding tone [F(3, 22.266) = 23.794, p < .001], position [F(1, 37.524) = 126.291, p < .001], and the tone-position interaction [F(3, 12.187) = 19.923, p < .001]. The estimates of the fixed effects, shown in Table 1, illustrate that at 5% of the first rime, the neutral tone after /H/ is 1.719 semitones higher than that after /LH/ (the baseline) (p = .001), and the neutral tone after /L/ is 1.856 semitones lower than that after /LH/ (p < .001). The significant position effect confirms that the contours were not flat statistically. The first neutral tone after /LH/ (baseline) fell by 0.142 semitones per unit (10% of the rime), and the pitch contours after different lexical tones differed: the first neutral tone after /H/ fell by a further 0.148 semitones (p < .001), making its estimated pitch contour fall 0.290 semitones per unit; the pitch contour after /L/ fell less than that after /LH/ (p = .039), only 0.075 (= 0.142 - 0.067) semitones per unit. Overall, if the pitch at 5% of the rime was higher (such as after /H/), the pitch fall was sharper, which suggests that the pitch contours converge.
At the second neutral tone, the linear mixed model showed that while the effects of the preceding lexical tone [F(3, 38.656) = 4.465, p = .009] and position [F(1, 30.219) = 129.241, p < .001] were still significant, their interaction was not [F(3, 11.677) = 0.5, p = .689]. The estimates of fixed effects shown in Table 2 reveal that at 5% of the second neutral tone rime, the pitch estimate after /HL/ was 0.566 semitones lower than the pitch estimate after /LH/ (p = .033). The slope of the second neutral tone after /LH/ was estimated to be -0.126 semitones per unit (p < .001), and it was not significantly different from the slopes after the other lexical tones. The estimates show that while the preceding tone effect was significant, the differences between preceding tones were within 1 semitone; furthermore, the pitch contours after different lexical tones share a similar degree of pitch fall, and the estimated fall was small (around -0.13 semitones per unit), indicating a shallow pitch fall.
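The convergence implied by the first-neutral-tone estimates can be worked through numerically. The sketch below uses only the coefficients quoted in the text for /H/ and /L/ relative to the /LH/ baseline (the /HL/ estimates and the baseline intercept are not quoted, so they are left out), and it assumes the position coding 0-9 described in the Analysis section. The names are hypothetical, and the linear extrapolation is a reading of the reported fixed effects, not a re-analysis of the data.

```python
# Slope of the /LH/ baseline, in semitones per position unit (10% of the rime).
BASELINE_SLOPE = -0.142

# (intercept offset at position 0, slope offset), both relative to /LH/.
OFFSETS = {
    "H": (1.719, -0.148),
    "LH": (0.0, 0.0),
    "L": (-1.856, 0.067),
}


def slope(tone):
    """Estimated slope of the first neutral tone after the given tone."""
    return BASELINE_SLOPE + OFFSETS[tone][1]


def gap_from_baseline(tone, position):
    """Estimated pitch difference from the /LH/ contour at a position (0-9)."""
    intercept_offset, slope_offset = OFFSETS[tone]
    return intercept_offset + slope_offset * position


if __name__ == "__main__":
    for tone in ("H", "L"):
        print(tone,
              round(slope(tone), 3),
              round(gap_from_baseline(tone, 0), 3),
              round(gap_from_baseline(tone, 9), 3))
```

Under these estimates, the gap between the post-/H/ and post-/LH/ contours shrinks from 1.719 semitones at the 5% point to about 0.39 semitones at the 95% point, and the post-/L/ gap shrinks from 1.856 to about 1.25 semitones, which is the convergence described above.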

Three consecutive neutral tones
All the pitch contours of the three consecutive neutral tones were plotted and examined. Ten tokens that were obviously different from the others were excluded; they are discussed separately. The average pitch contours of the three consecutive neutral tones, varying the preceding and the following tones, are shown in Figure 2. The results show that the reduplicants, the first neutral tone among the three, behaved differently from the other neutral-tone syllables. The reduplicant seems to bear the lexical tone /H/ after /H/: māma /H-Ø/ 'mother' is produced as /H-H/, suggesting that the second syllable was produced not with a neutral tone but with the lexical tone of the character. Also, the reduplicant in the word pópo /LH-Ø/ 'mother-in-law' had a higher falling pitch contour compared to the first neutral tone after /LH/ in Figure 1, and the pitch contour is similar to that of Tone 4 (high falling): the high falling Tone 4 stays in the high register (0 ~ 2 semitone z-score) (Huang, 2013). Therefore, although the reduplicants after /HL/ and /L/ fell to the mid to mid-low range as observed in two consecutive neutral tones, the first neutral tone in this set was excluded from the analysis. As for the second neutral tone, the pitch contours all fell and merged into the same pitch range (-0.5 ~ -1 semitone z-score). The pitch contours of the third neutral tone overlapped with each other and generally fell from -0.5 semitone z-score to -1 semitone z-score.
The estimates of fixed effects shown in Table 3 reveal that at 5% of the second neutral tone rime, the pitch after /H/ was 1.046 semitones higher than the pitch after /LH/ (p = .041), and the pitch after /L/ was 1.169 semitones lower than the pitch after /LH/ (p = .028). As for the pitch contour, the contour after /H/ has a much sharper pitch fall (-0.285 semitones/unit) compared to the contour after /LH/ (-0.176 semitones/unit), and the difference (-0.110 semitones/unit) was statistically significant (p = .015). The pitch contours with varying preceding tones seem to converge, as found for the first neutral tone in two consecutive neutral tones. The estimates for the third neutral tone are given in Table 4. In summary, the preceding lexical tone (X) affected the second neutral-tone syllables, but not the third neutral-tone syllables.

Discussion
This experiment modifies Chen and Xu's (2006) study on the SM neutral tone and examines how consecutive neutral tones differ in TM. The first difference between the two dialects lies in the tonal variation of the reduplicants. The results showed that not all of the reduplicants behave like other neutral tones: the reduplicant in māma /H-Ø/ 'mother' was produced with a lexical tone /H/. Also, the reduplicant in pópo /LH-Ø/ 'mother-in-law' had a higher falling pitch. This high falling pitch contour of the reduplicant after /LH/ is similar to the SM neutral tone after /LH/. However, because this pitch contour is only found in /LH-Ø/ reduplicated kinship terms and is not observed in other /LH-Ø/ words, it is likely to be a sporadic variation. The TM speakers likely lexicalized the SM pitch contour of /LH-Ø/ as /LH-HL/ in those kinship terms. If the TM speakers had adopted the SM neutral tone with a high falling contour after /LH/, we would expect it to show up in other neutral-tone syllables after /LH/ as well. Therefore this production test was unable to elicit three consecutive neutral tones in the context of four different lexical tones.
In addition, several tokens were excluded due to the tonal variations of the reduplicants. Six out of the ten excluded pitch contours also show possible SM influences on these reduplicated kinship terms. First, one male and one female subject produced the reduplicated /H/ ma with a falling pitch contour instead of a high level pitch when X was /H/ and Y was /L/. This high-falling pitch contour was similar to the pitch pattern of /H-Ø/ in SM. However, both of the speakers only had this pitch contour in X-RED-PL-POSS before /L/, but not before /H/, which suggests that the pitch pattern was sporadic, not consistent. In addition, six tokens were removed because two other male subjects produced the word nǎi-nai /L-Ø/ 'grandmother' with a [L-H] pitch contour, and a female subject consistently produced the words nǎi-nai /L-Ø/ 'grandmother' and pó-po /LH-Ø/ 'mother-in-law' with a [L-LH] pitch contour. As discussed earlier in 1.2, it is common in colloquial TM to adopt [L-H] or [L-LH] pitch contours on reduplicated kinship terms or nicknames when the reduplicated syllables are /L/ or /LH/, in order to show endearment (Hsu, 2006). The tonal variation in the reduplicated kinship terms in the data was likely due to dialect mixture. The speakers likely borrowed a certain lexicalized pitch pattern rather than borrowing the neutral tone from a different dialect.
The main difference between TM and SM neutral tones, which is the main focus of this experiment, lies in the pitch contour of these consecutive neutral tones. Figure 3 compares the results of three consecutive neutral tones before /L/ produced by the TM subjects in (a) with the ones produced by two SM speakers in (b). Two SM speakers were recruited to complete the same task as the TM speakers. The results from the two SM speakers show pitch patterns similar to those reported by Chen and Xu (2006). Therefore they are presented for comparison, despite the small sample size. Comparing Figure 3(a) and (b), the pitch range of TM is smaller than that of SM, as reported by previous acoustic studies (Fon & Chiang, 1999; Torgerson, 2005). The pitches of TM range between 2 ~ -2 semitone z-score, while the pitches of SM range between 3 ~ -3.5 semitone z-score. More importantly, the pitch contours of TM and SM differ drastically. In TM, aside from the reduplicated /H/ and the reduplicated /LH/, all neutral tones have slightly falling pitch contours reaching -0.5 ~ -1 semitone z-score.
In contrast, as found in Chen and Xu (2006), all four SM pitch contours are distinct. The SM neutral tones after /H/ gradually move from the high to the mid-low pitch range; the neutral tones after /LH/ complete the rise at the first neutral tone, and then lower to the mid-low range in the following neutral tones; the neutral tones after /L/ gradually rise to the high register until the second neutral tone, and then slowly lower to the mid pitch range; the neutral tones after /HL/ have a fast pitch drop in the first neutral tone, and remain at mid-low pitches in the following neutral tones. The four pitch contours only start to converge to mid-level at the end of the third neutral tone in SM.
The influence of the preceding lexical tone differed in SM and TM. The pitch range of the neutral tones in SM remains 3 ~ 3.5 semitone z-score even in the third neutral tone, while the pitch range of the TM neutral tones reduced over time. In SM, the neutral tones are heavily influenced by the preceding lexical tones, and the influence of the preceding lexical tone is significant even at the end of the third neutral-tone syllable (Chen & Xu, 2006, p. 61). Furthermore, aside from the obviously different pitch pattern of the post-/L/ neutral tone contour, the influences of the preceding /H, LH, L/ were still obvious at the third neutral tone in SM. On the other hand, the four pitch contours in TM immediately started to converge in the first neutral tone. The results of the first neutral tone in 2.4.1 and the second neutral tone in 2.4.2 show that if the pitch was higher due to the influence of the preceding lexical tone, the pitch fall was also estimated to be larger. The pitch contours started to converge and reach toward a mid to mid-low pitch. Although the effect of the preceding tone was still significant at the second neutral tone in two consecutive neutral tones, the estimated effect was small, and the pitch contours in the context of varying preceding tones share a similar slope. The third neutral tone in three consecutive neutral tones further shows that the influences of the preceding tone on pitch were no longer significant, and these pitch contours share a similar flattened slope as well. Overall, compared to the SM neutral tone, the pitch target of the TM neutral tone is implemented in a faster manner. In SM, the mid pitch target claimed by Chen and Xu had still not been reached at the end of the third neutral tone. However, the results show that the TM pitch target was approximated immediately at the first neutral tone, and the preceding lexical tone effects had disappeared at the third neutral tone.
Unlike SM, the TM neutral tone seems to have a static pitch target in the mid-low to low pitch range. This experiment shows that in TM, each consecutive neutral tone aims to reach a mid-low pitch range. The pitch contours started to flatten in the following neutral tones, which indicates that the pitch contour had been approximating the pitch target. Although the estimated beginning pitches of these neutral-tone rimes were slightly higher than the end point of the preceding neutral tone, the differences were minimal (less than 1 semitone). These differences are very likely due to the pitch-raising effect of the preceding voiceless onset /t/ (Hombert, 1977; Hombert, Ohala, & Ewan, 1979; Ohala, 1972), rather than to a falling pitch target.
It should be noted that, although the TM neutral tone has a static pitch target that is approximated faster than that of the SM neutral tone, the articulatory strength with which this target is approached was still weaker than that of a lexical tone's pitch target. Previous studies on the carry-over effects of preceding lexical tones showed that the influence of the end pitch of the preceding lexical tone lasted until the end of a following /H/ and /LH/ in SM, but not until the end of a following /HL/ and /L/ (Y. Xu, 1997); and the carry-over effects of the preceding tones are significant only at the end of the /LH/ tone in TM (Huang, 2013).
To sum up, the results show that the consecutive TM neutral tone has a mid-low or low pitch target, and that this target was approached faster than in the SM neutral tone, which suggests that the TM neutral tone is more similar to a lexical tone. However, the results from this experiment alone are unable to characterize the TM neutral tone. It remains unclear whether these neutral-tone syllables fit into TM tonal phonology and, if so, how they should be characterized. The next experiment compares the TM neutral-tone syllables with low-tone syllables in order to investigate these questions.

Neutral-tone syllables vs. low-tone syllables
This experiment aims to find out whether the TM neutral tone should be treated as a lexical tone or as a neutralized tonal pattern. Although the previous experiment showed that the TM neutral tones are more lexical-tone-like than those of SM, it is also possible to analyze these TM neutral-tone syllables as unstressed, with their pitches neutralized into a mid-low tone. Furthermore, if these neutral-tone syllables had a lexical tone, how would it be distinct from the other lexical tones? In order to investigate 1) whether these neutral-tone syllables are unstressed, and 2) whether these neutral-tone syllables are distinct from the other lexical tones, this experiment compares the pitch, duration, and intensity of the TM neutral-tone syllables with the same acoustic properties of low-tone (Tone 3) syllables. Out of the four lexical tones, Tone 3 was chosen because it is the shortest tone in connected speech in Taiwan Mandarin (Deng, Shi, & Lu, 2008; Shi & Deng, 2006). Moreover, the pitch contour of Tone 3 (low-falling) is the most similar to that of the TM neutral-tone syllables.
If these neutral-tone syllables had weaker prominence, they would have a reduced duration (or even a reduced intensity) compared to the shortest lexical tone. In that case, it is best to analyse them as unstressed syllables whose tone is neutralized into a mid-low pitch. However, if the neutral-tone syllables were not reduced in length or intensity, there would be no acoustic evidence to support treating them as unstressed, and they should be analysed as having a lexical tone. In that case, if these neutral-tone syllables had a pitch target distinct from that of the low tone, they should be analysed as having a fifth tone.

Design
The same subjects as in the first experiment were asked to read a list of disyllabic words/phrases that contain the neutral-tone syllables and their corresponding low-tone syllables with the same segments or rimes. In each (near-)minimal pair, both tested syllables were the second syllable of a disyllabic word, with varying preceding lexical tones.
Table 5 lists all the tested words used to elicit the syllables. The neutral-tone -zi [tsɨ Ø] 'DIM' was compared with -zǐ [tsɨ L], and -zhe [ʧə Ø] 'DUR' has a corresponding low-tone syllable -zhě [ʧɤ L] 'person'. The neutral-tone -men [mən Ø] 'PL' was compared with -běn [pən L] 'origin' because měn does not exist in Mandarin due to a phonological gap. For the same reason, three neutral-tone syllables with an /ə/ rime (le [lə Ø] 'PFV', de [tə Ø] 'POSS; NMLZ', and ge [kə Ø] 'CLF') were compared with shě [ʃɤ L] and chě [ʧʰɤ L]. Phonetically, the rimes of chě and shě are described as [ɤ], while the rimes of the neutral-tone de, le, ge are described as [ə] because they are unstressed and reduced. However, even in SM, the quality difference between a stressed /ə/: [ɤ] and an unstressed /ə/: [ə] is not obvious (M. Lin & Yan, 1990). The reduplicated syllables were compared to corresponding syllables with a similar pitch and the same segments: shu [ʃu Ø], shen [ʃən Ø], di [ti Ø] in the kinship terms were compared to shǔ [ʃu L], shěn [ʃən L], and dǐ [ti L], respectively. The two kinship terms shú-shu [ʃu LH-ʃu Ø] and shěn-shen [ʃən L-ʃən Ø] are sometimes produced with a /L-LH/ or /L-H/ pitch contour to show endearment in colloquial TM. However, this adaptation of the pitch contours is generally less common in read speech. Therefore these words were still chosen for comparison with their low-tone counterparts. The reduplicated high-tone syllables were not included because the reduplicated mā kept the high tone, as mentioned in the previous experiment.
The tested neutral-tone syllables were elicited in four different disyllabic words with varying preceding tones, but the tested low-tone syllables were tested only in the combinations /H-L/, /LH-L/, and /HL-L/ because of Tone 3 Sandhi, by which /L/ becomes [LH] when it is before another /L/. Therefore /L-L/ was not elicited because it is phonetically the same as [LH-L] in TM (see Myers and Tsay (2003) for a review). All the disyllabic words were embedded in the frame sentence qǐngshuō X X bācì [tɕʰiŋ L ʃwo H X X pɑ H tsɨ HL] 'Please say XX eight times'. All the sentences were written in Chinese characters. The 45 tested sentences were randomly mixed with the 73 filler sentences. The subjects were asked to read every sentence just once, but they could repeat a sentence if they felt they had made a mistake. The whole procedure, including both experiments, took about 15 minutes.

Data analysis
The pitch data of this experiment were processed as described in 2.2. In addition, the duration and the average intensity were also extracted. In order to compare the pitch (contour) between /L/ and /Ø/, two sets of linear mixed models were fitted to the data. The first set analyzed the pitch contour differences between the low tone and the neutral tone in the first half and the second half of the rime. For the first half of the rime, the dependent variable is the repeatedly measured pitch in semitones at 5%, 15%, 25%, 35%, 45%; for the second half of the rime, it is the pitch at 55%, 65%, 75%, 85%, 95%. Fixed effects for the two models included the tested tone (L, Ø) as a factor, position (the 5 pitch points coded as 0, 1, 2, 3, 4) as a covariate, and their interaction. The factor tested tone captures the pitch differences at the starting point, and the tone-position interaction captures the differences in contour movement. In order to account for speaker and item variance, the random effects included intercepts for both subjects and tested items, as well as by-subject random slopes for tested tone, preceding tone (H, LH, L, HL), tested pair (zhe, zi, Cə, en, RED) and their interactions.
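As an illustration of the data layout that the first set of models implies, the following is a minimal sketch, not the processing script used in this study. The ten sampling points and the 0-4 position coding follow the description above; the function name, the dictionary keys, and the pitch values are hypothetical.

```python
# Sketch of a long-format layout for the two pitch-contour models.
# The sample points and the 0-4 position coding follow the text;
# the function name, dict keys, and example values are hypothetical.

SAMPLE_POINTS = [5, 15, 25, 35, 45, 55, 65, 75, 85, 95]  # percent of the rime


def to_long_rows(speaker, item, tone, semitones):
    """Turn one rime's ten pitch samples into one row per sample point."""
    if len(semitones) != len(SAMPLE_POINTS):
        raise ValueError("expected one pitch value per sample point")
    rows = []
    for index, (percent, pitch) in enumerate(zip(SAMPLE_POINTS, semitones)):
        rows.append({
            "speaker": speaker,
            "item": item,
            "tone": tone,                                  # "L" or "Ø"
            "half": "first" if index < 5 else "second",    # which model the row feeds
            "position": index % 5,                         # covariate coded 0..4
            "percent": percent,
            "pitch": pitch,
        })
    return rows


# Invented pitch values, for illustration only.
rows = to_long_rows("S01", "zi", "Ø",
                    [0.4, 0.2, -0.1, -0.3, -0.5, -0.6, -0.7, -0.8, -0.9, -1.0])
first_half = [r for r in rows if r["half"] == "first"]
second_half = [r for r in rows if r["half"] == "second"]
```

With position restarting at 0 in each half, the intercept of each model corresponds to the pitch at 5% and at 55% of the rime respectively, which is how the baselines are reported in the results.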
The second set of linear mixed models further investigated the effects of preceding tone and tested pair on /L/ and /Ø/ at 75% of the second rime. The dependent variable was the pitch in semitones at 75% of the second rime. Fixed effects included preceding tone to examine the carry-over effects, tested pair to examine pair variation, and tested tone (L, Ø) to examine whether the two tones are distinct. All their interactions were included as fixed effects as well. The random effects included intercepts for subjects and items and their random slopes for all the fixed effects. If these TM neutral-tone syllables had a distinct lexical tone, I expect them to have a pitch contour distinct from that of the low tone regardless of the preceding tone and tested pair. Their pitch target should also be implemented in a manner similar to that of the low tone.
As for duration, the rimes rather than the whole syllables were measured because languages generally do not count the onset in their calculation of syllable weight; only the rimes are relevant to syllable weight. The average intensity of the rimes was also extracted with a Praat script. Durational and intensity comparisons were carried out with linear mixed models with pair segments (-en, zi, -zhe, Cə, and -u, -en, -i in reduplication pairs) and the interaction between tone and pair segment as fixed effects. The random effects included random intercepts for subjects and tested words and their random slopes for all the fixed effects. If the neutral-tone syllables were unstressed, I expect the durations of the TM neutral-tone syllables to be significantly shorter than those of the low-tone syllables regardless of the pair. I do not expect the TM neutral-tone syllables to have a lower intensity, as reduced intensity was not found in SM unstressed syllables either (J. Cao, 1986; Lin & Yan, 1980).

Pitch contours
The average normalized pitch contours of the low tone and the neutral tone with varying preceding lexical tones are plotted in Figure 4. The neutral tone had four pitch contours with varying preceding tones, represented by the solid lines, but the low tone had only three average pitch contours, represented by the dotted lines, with preceding [H, LH, HL] because [L-L] does not exist in Mandarin due to Tone 3 Sandhi. For the first half of the rime, the estimates of fixed effects in Table 6 suggest that at 5% of the rime (baseline), the neutral-tone pitch estimate was not different from the low-tone one. However, the estimated pitch change for the neutral tone is -0.246 semitone per unit (10% of the rime), which is smaller than the estimated pitch change for the low tone, -0.562 (=-0.246-0.317) semitone, and the difference was statistically significant (p<.001). The pitch fall of the low tone in the first half of the rime was more substantial (2.248 semitones from 5% to 45%). For the second half of the rime, on the other hand, the best-fit model revealed significant fixed effects of both tone [F(1, 15.510)=23.264, p<.001] and position [F(1, 967.208)=68.610, p<.001], but the tone-position interaction [F(1, 966.724)=.849, p=.357] was no longer significant. The estimates of fixed effects in the right column of Table 6 show that at 55% of the rime (baseline), the low tone is 1.41 semitones lower than the neutral tone (p<.001); the pitch differences between /L/ and /Ø/ were quite substantial in the middle part of the rime. The pitch change estimate for the neutral tone was -0.117 semitone per 10% of the rime, which is not significantly different from the pitch change estimate for the low tone, -0.094 (=-0.117+0.023) semitone. This suggests that both /L/ and /Ø/ have a flattened pitch fall in the second half of the rime. The estimates of fixed effects are plotted in Figure 5 to show what the modeled pitch contours look like.
The second set of linear mixed models was carried out to investigate the effects of preceding tone and tested pair on /L/ and /Ø/ at 75% of the second rime. The best-fit model included preceding tone, tested pair, tested tone, and all their interactions as fixed effects. The random effects included an intercept for subjects and by-subject random slopes for the effects of tone and the interaction between tone and preceding tone. This model revealed that at 75% of the second rime, the pitch was significantly influenced by preceding tone [F(3, 46.962)=9.190, p<.001] and tested tone [F(1, 13.356)=51.545, p<.001]. Pair and all the two-way interactions were not statistically significant, but there was a significant three-way interaction between tone, preceding tone, and pair [F(7, 259.041)=2.929, p=.006]. The pitch of the neutral tone at this position was significantly higher than that of the low tone, and the estimated mean difference between /L/ and /Ø/ was 1.422 semitones (p<.001), similar to the estimated difference at the 55% point in the previous model. To better understand the fixed effects, the estimated marginal means with varying preceding tones and tested tones were calculated, as shown in Table 7.
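The slope arithmetic reported for the two halves of the rime can be retraced with a short sketch. It assumes, as the reported sums imply, that the neutral tone is the reference level and that the low-tone slope is the reference slope plus the tone-by-position term; the function and variable names are hypothetical. Summing the rounded coefficients for the first half gives -0.563 rather than the reported -0.562, which presumably reflects rounding of the underlying estimates.

```python
# Slope estimates as reported in the text (semitones per 10% of the rime).
# Names are hypothetical; the neutral tone is taken as the reference level.

def low_tone_slope(neutral_slope, interaction):
    """Low-tone slope = reference (neutral-tone) slope + tone-by-position term."""
    return neutral_slope + interaction


def total_change(slope, units=4):
    """Pitch change across one half of the rime (4 steps of 10%)."""
    return slope * units


first_half_low = low_tone_slope(-0.246, -0.317)   # text reports -0.562
second_half_low = low_tone_slope(-0.117, 0.023)   # text reports -0.094

low_fall = total_change(-0.562)       # low tone, 5% to 45% of the rime
neutral_fall = total_change(-0.246)   # neutral tone, same span

print(round(first_half_low, 3), round(second_half_low, 3))
print(round(low_fall, 3), round(neutral_fall, 3))
```

Under these figures the low tone falls a little over 2 semitones in the first half of the rime while the neutral tone falls about 1 semitone, which is the contrast the interaction term captures.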
Post-hoc pairwise comparisons with Bonferroni correction were carried out to 1) compare /L/ and /Ø/ after different preceding tones, and 2) compare the preceding tone effect on /L/ and /Ø/. The first comparison shows that /L/ and /Ø/ are different after /H, HL, LH/; the estimated mean differences are 1.746, 1.427, and 1.712 semitones (all p<.001). The second comparison indicates that at this position, the pitch of the neutral tone was influenced by the preceding tone: after /L/, it is lower than after /H, LH, HL/, with estimated mean differences of 0.906 (p=.001), 1.014 (p<.001), and 0.341 (p=.883) semitone respectively, so that only the differences from /H/ and /LH/ are significant. On the other hand, the pitches of the low tone were quite similar regardless of the preceding tone, and their differences were not statistically significant. To further investigate the significant interaction between tone, preceding tone, and pair, post-hoc pairwise comparisons with Bonferroni correction were carried out, and the results are shown in Table 8. This table only compares /L/ and /Ø/ when the preceding tone was /H, HL, LH/ because [L-L] does not exist due to Tone 3 Sandhi. The results show that most of the differences between /L/ and /Ø/ after different preceding tones were significant, except for the pitch differences in the en pair and the zhe pair when the preceding tone was /HL/.
In addition to the pitch data, it is worth noting that the pitch contours of many tested syllables were excluded because more than five consecutive f0 readings in their second rime were not available due to creaky voice quality, which resulted from the lowering of the pitch. The numbers of syllables excluded due to creaky voice are shown in Table 9.
Patterning with the pitch data, the low-tone syllables generally have a higher percentage of creaky voice (42.6%) than the neutral-tone syllables (21.0%). Furthermore, the preceding tone also plays a role: the neutral tone after /HL, L/ was found with more creaky-voice syllables (31.0% and 26.2%) than the neutral tone after /LH, H/ (11.9% and 13.9%). A Generalized Estimating Equations (GEE) model was further fitted to analyze the repeatedly measured binary outcomes. The best-fit model, selected by the lowest QICC, indicated significant effects of tested tone (L or Ø), preceding tone, and pair (all p<.001), as well as of the interaction between tone and preceding tone (p=.002). The details of the parameter estimates are given in Appendix B. These estimates show that the difference between /L/ and /Ø/ was statistically significant (B=1.41, Wald χ2=10.462, p=.001), which indicates that compared to /Ø/, having a /L/ increases the likelihood of creaky voice by a factor of 4.068 (exp(B)=4.068). Furthermore, the effects of preceding tone and of the tone-preceding tone interaction show that for the neutral tone, the likelihood of being creaky is influenced by the preceding tone. For example, the neutral-tone syllables after /HL/ and /L/ were more likely to be creaky than those after /LH/; the likelihood increased by factors of 3.481 and 2.723 (p=.006, p=.002) respectively. However, the effect of preceding tone was not significant for the low-tone syllables.
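The relation between a coefficient B and exp(B) can be illustrated with a small sketch. It assumes that the GEE was fitted with a logit link, so that exp(B) is an odds ratio rather than a ratio of probabilities; that assumption, the helper names, and the 20% baseline rate are all illustrative and are not taken from Appendix B. Exponentiating the rounded B=1.41 gives about 4.10, and the reported 4.068 corresponds to B of about 1.403, a gap that is presumably due to rounding.

```python
import math

# Assumes a logit link, so exp(B) is an odds ratio (an assumption, not
# stated in the text). Helper names and the baseline rate are hypothetical.

def odds_ratio(b):
    """Odds ratio implied by a logit-scale coefficient."""
    return math.exp(b)


def apply_odds_ratio(p_reference, ratio):
    """Probability obtained by multiplying the reference odds by `ratio`."""
    odds = p_reference / (1.0 - p_reference) * ratio
    return odds / (1.0 + odds)


from_rounded_b = odds_ratio(1.41)     # exp of the rounded coefficient
implied_b = math.log(4.068)           # coefficient implied by the reported exp(B)

# Hypothetical 20% baseline rate of creak: the same odds ratio raises the
# probability to about 50%, i.e. well short of "four times as probable".
p_low = apply_odds_ratio(0.20, 4.068)

print(round(from_rounded_b, 3), round(implied_b, 3), round(p_low, 3))
```

The last line is the reason for keeping odds and probabilities apart when reading the factors above: a multiplicative change in odds translates into a smaller multiplicative change in probability once the baseline is not close to zero.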
In summary, the neutral-tone syllables are less likely to have a creaky voice, and their pitch contours were generally higher than the pitch contours of the low tone. Also, the pitch height of the neutral tone was influenced by the preceding lexical tones. When the preceding lexical tone had a low offset (/L, HL/), the pitch contours of the neutral tone were closer to the low-tone pitch contours. The number of syllables with creaky voice (and thus excluded) also shows a similar trend: a neutral-tone syllable is more likely to be creaky after /L/ and /HL/.

Durations
Table 10 shows the average durations of the rime of the first syllable (S1) and the second syllable (S2) of the tested disyllabic words in milliseconds (ms). The S2/S1 ratios provide a reference for the rhythm of the disyllabic words, but no further comparison will be made because the S1 rimes were not controlled in this experiment. Two S2/S1 ratios are presented in case there was variation in speech rate: slower speech would have more influence on the (average S2)/(average S1) ratio, while the average S2/S1 ratio captures the rhythmic proportion of the disyllabic words. A statistical analysis of the durations shows that the best-fit model included pair, as well as the interaction between tone and pair, as fixed effects. For random effects, it included random intercepts for subjects and tested words, and a by-subject random slope for pair. The analysis reveals that the durations of the pairs differ [F(6, 40.858)=32.822, p<.001], as each rime has an inherently different length, but there is also a significant interaction between pair and tone [F(7, 32)=8.825, p<.001]. As shown in Table 10, the durational differences between /L/ and /Ø/ within each pair were statistically significant only for the zi, zhe, and Cə pairs, but not for the -en pairs and the reduplication pairs. The model estimates of fixed effects (shown in Appendix C) illustrate that for the Cə pair, /L/ is 29.3 ms longer than /Ø/ (t(32)=6.071, p<.001); for the zhe pair, /L/ is 37.1 ms longer (t(32)=4.089, p<.001); and for the zi pair, /L/ is 21.5 ms longer (t(32)=2.774, p=.009). To sum up, the durational evidence suggests that neutral-tone zi, zhe, de, le, ge are reduced in duration, but there is no evidence that the neutral-tone men and the reduplicants are reduced in duration.
The comparison of the average intensities of the rimes of the two tones showed no sign of intensity reduction in the neutral-tone syllables in TM, which is not surprising because intensity differences were not found even in SM. The average intensities of the neutral-tone syllables were all slightly higher than those of their low-tone counterparts, suggesting that intensity co-varies with the pitch contours, as observed by Lee and Zee (2008). However, the differences were not statistically significant.

Discussion
This experiment compares the pitch contours, durations and intensity of the neutral-tone syllables with those of the low-tone syllables with the same rimes. The results show that overall the pitch contours of the neutral tone are higher than those of the low tone. The analysis of the pitch contours shows that both /L/ and /Ø/ had a steeper pitch fall in the first half of the rime and a shallower pitch fall in the second half of the rime. According to Y. Xu and Wang (2001), the pitch target of a lexical tone is approximated in an asymptotic manner: the f0 first moves rapidly toward the pitch target, then slows down gradually over time until it reaches a steady state. The asymptotic pitch contours of both /L/ and /Ø/ suggest that each tone approached its own static pitch target. The statistical analysis further shows that the two have distinct pitch targets. At both 55% and 75% of the rime, the low tone was about 1.4 semitones lower than the neutral tone, and the differences were statistically significant. Considering the normalized pitch contours shown in Figure 2, the static pitch target of the neutral tone should be characterized as mid-low because by the third neutral tone, the pitch targets were around -1 semitone z-score, the mid-low range of a speaker's pitch, just slightly higher than the following low tone.
One might wonder whether the higher pitch contour of the neutral tone is a result of pitch "undershoot", because the duration results show that some of the neutral-tone syllables are shorter than their low-tone counterparts. With a shorter duration, the f0 movement may approach the pitch target with a dynamic pitch contour rather than an asymptotic one, and it may not reach the pitch target either (Y. Xu, 2005, p. 229). However, the comparison of each pair after different lexical tones (Table 8) shows that the pairs with significant durational differences (zi, zhe and Cə) did not necessarily have larger estimated pitch differences. Therefore the higher pitch contours of the neutral tone should not be analyzed as a consequence of a shorter duration.
As shown in Figure 4, the end pitches of the neutral-tone syllables were generally above -1 semitone z-score, while the low-tone syllables mostly end below -1 semitone z-score. A previous perception test (Huang, 2011) also suggested that an end pitch of -1 semitone z-score is the perceptual boundary. When a pitch contour fell to the lower register (-2 ~ 0 semitone z-score), the stimulus was more likely to be identified as a low tone if the end pitch was lower than -1 semitone z-score, and more likely to be identified as a neutral tone if the end pitch was higher than -1 semitone z-score. The results of the perception test showed that the acoustic difference observed here is perceptually distinct to TM listeners. Furthermore, our acoustic results showed that compared to the neutral tone, the low-tone syllables were more likely to be creaky. The low tone in Mandarin has been observed to co-occur with creaky phonation (Chao, 1968; Davison, 1991; Fon, Chiang, & Cheung, 2004; Zhu, 2012), as creaky voice is a side effect of low f0 (Kuang, 2013). Mandarin listeners were also reported to use creaky quality as a secondary cue to aid low tone identification (Belotel-Grenié & Grenié, 1997; R. X. Yang, 2011), including in the contrast with the neutral tone (Huang, 2011). Our results, along with the perception evidence, suggest that the low tone and the neutral tone have phonologically distinct pitch contours.
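The perceptual tendency described above can be summarized as a simple threshold rule. The sketch below is only a schematic restatement, not the perception model of Huang (2011): listeners' responses are probabilistic, whereas this hypothetical function returns the more likely label, and it leaves the exact boundary value undecided because the text does not specify it.

```python
# Schematic restatement of the reported tendency; the function name and
# return labels are hypothetical, and real responses are probabilistic.

def more_likely_percept(end_pitch_z, boundary=-1.0):
    """Label the more likely percept for a contour ending in the lower register.

    `end_pitch_z` is the end pitch in semitone z-score. The rule only
    covers the lower register (-2 to 0), as described in the text.
    """
    if not -2.0 <= end_pitch_z <= 0.0:
        return None                      # outside the range the text describes
    if end_pitch_z < boundary:
        return "low tone"
    if end_pitch_z > boundary:
        return "neutral tone"
    return "ambiguous"                   # exactly at the boundary


for value in (-1.6, -1.0, -0.4, 0.8):
    print(value, more_likely_percept(value))
```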
A more detailed factorial analysis was carried out at 75% of the rime. It shows that the pitch differences between /L/ and /Ø/ are mostly significant, except for two pairs after /HL/. Also, the influence of the preceding tone was still significant on the neutral tone at this point, as suggested in Section 2, but not on the low tone. However, the pairwise comparison between /L/ and /Ø/ shows that the differences after each preceding tone were still significant. The examination of the proportion of creaky syllables also shows a similar pattern. Therefore the lower pitch contours of the neutral tone after /HL/ and /L/ should be treated as phonetically driven rather than phonologically targeted, and the contrast with the low tone was still maintained.
Moreover, aside from the phonetic evidence above, the phonological process of Tone 3 Sandhi also suggests that the neutral tone is distinct from the low tone (Tone 3). Tone 3 Sandhi in Mandarin requires that Tone 3 becomes Tone 2 when it is before another Tone 3. However, when a Tone 3 syllable is before a neutral-tone syllable, Tone 3 Sandhi is not triggered. This is another piece of evidence that the neutral tone is not the same as Tone 3.
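The sandhi argument can be made concrete with a minimal sketch restricted to disyllables, which is the only configuration discussed here; longer sequences depend on prosodic structure and are not modeled. The tone labels follow the notation of this paper, and the function name is hypothetical.

```python
# Disyllabic Tone 3 Sandhi only; longer strings depend on prosodic
# structure and are outside this sketch. The function name is hypothetical.

def apply_tone3_sandhi(first, second):
    """Return the surface tones of a disyllable given its underlying tones.

    /L/ (Tone 3) surfaces as [LH] (Tone 2) only before another /L/.
    A following neutral tone ("Ø") does not trigger the change.
    """
    if first == "L" and second == "L":
        return ("LH", "L")
    return (first, second)


print(apply_tone3_sandhi("L", "L"))   # sandhi applies
print(apply_tone3_sandhi("L", "Ø"))   # neutral tone does not trigger it
```

If the neutral tone were simply an instance of /L/, the second call would be expected to pattern with the first; the fact that it does not is the phonological evidence referred to above.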
The duration results indicate that not all of the neutral-tone syllables are reduced. The durations of the neutral-tone zi, zhe and Cə were all shorter than those of their low-tone counterparts, suggesting that these neutral-tone syllables were shortened. However, the durations of men [mən Ø] and the neutral-tone reduplicants were similar to those of their low-tone counterparts. Even where the neutral-tone syllables in TM were shorter than their low-tone counterparts, the durational differences were small compared to SM. Specifically, zi [tsɨ Ø] was 28% shorter than zǐ [tsɨ L] (56.5 vs. 78.0 ms), and the durations of Cə /Ø/ and zhe [ʧə Ø] (about 80 ms) were 27% shorter than the durations of Cə /L/ and zhě [ʧə L] (about 110 ms). In contrast, in a study comparing tonal minimal pairs of full-full and full-neutral in SM, Lin and Yan (1990) found that the average duration of the neutral-tone S2 rime was 53% shorter than the average duration of the Tone 4 /HL/ S2 rime (100 ms vs. 214 ms). In terms of S2/S1 ratios, TM X-zi [tsɨ Ø], X-zhe [ʧə Ø], and X-Cə /Ø/ had smaller ratios than their low-tone counterparts, but the differences were generally around 0.1 ~ 0.2. On the other hand, the SM data showed an S2/S1 ratio of 0.89 for full-full and 0.47 for full-neutral, a difference of 0.42. Although Lin and Yan used a different lexical tone for comparison, the results are still relevant here because when a syllable is in a sentence, /HL/ is the shortest tone in SM while /L/ is the shortest tone in TM (Deng et al., 2008). Lin and Yan's study and mine both compared the neutral tone with the shortest possible lexical tone, and the results indicate that the durational and rhythmic differences in SM are much larger than in TM.
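The percentages and ratio differences cited in this comparison follow from simple arithmetic, which the sketch below retraces from the durations and ratios given in the text. The function and variable names are hypothetical, and the TM figure for zhe/Cə uses the approximate 80 ms and 110 ms values quoted above.

```python
# Retraces the duration comparison from the figures quoted in the text.
# Function and variable names are hypothetical.

def percent_shorter(short_ms, long_ms):
    """How much shorter `short_ms` is than `long_ms`, in percent."""
    return (long_ms - short_ms) / long_ms * 100.0


tm_zi = percent_shorter(56.5, 78.0)       # TM zi /Ø/ vs. zǐ /L/
tm_schwa = percent_shorter(80.0, 110.0)   # TM Cə and zhe, approximate values
sm_neutral = percent_shorter(100.0, 214.0)  # SM neutral vs. Tone 4 (Lin & Yan, 1990)

sm_ratio_gap = 0.89 - 0.47   # SM S2/S1: full-full minus full-neutral

print(round(tm_zi), round(tm_schwa), round(sm_neutral), round(sm_ratio_gap, 2))
```

On these figures the SM neutral-tone rime loses roughly twice as large a share of its duration as the shortened TM syllables do, in line with the contrast drawn above.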
The split in the duration results, with the zi, zhe and Cə pairs on the one hand and the -men and reduplicant pairs on the other, might be related to frequency and lexical class. The neutral-tone syllables with shorter durations are all high-frequency functional morphemes, which are more susceptible to phonological reduction than content words, which carry semantic meaning (Aylett & Turk, 2004; Bybee, Perkins, & Pagliuca, 1994; Selkirk, 1996; Z. Yang, Ramanarayanan, Byrd, & Narayanan, 2013). Specifically, zhe [ʧə Ø] 'DUR', de [tə Ø] 'POSS, NMLZ', and le [lə Ø] 'PFV' are productive verbal and nominal suffixes, ge [kə Ø] 'CLF' is the most productive generic classifier, and the diminutive suffix zi [tsɨ Ø] is a very frequent fossilized morpheme that no longer carries a semantic meaning. All of these morphemes are also very frequent in Mandarin speech (see Cai and Brysbaert (2010) for their frequencies). On the other hand, the plural suffix -men [mən Ø] is not obligatory and is also limited in its distribution (Norman, 1988, p. 159; Ramsey, 1987, p. 64). As there is no grammatical number in Chinese, the plural meaning carried by the suffix -men [mən Ø] is not redundant. Therefore it is more similar to content words. As for the reduplications, as discussed in 1.2, verbal reduplication in TM simply reduplicates the lexical tone, and many of the reduplicated kinship terms and nicknames adopt a /L-H/ or /L-LH/ tone regardless of their lexical tone (Hsu, 2006). Therefore it is not surprising that the tested items (all kinship terms) did not have a reduced second syllable.
Although there are multiple factors that could influence the duration of a syllable, such as focus, the inherent segment lengths of the rime, or its lexical tone, these factors were controlled in this experiment. The rimes of the compared syllables were the same and were put in an identical frame sentence, and the syllables that were compared with the neutral tone carry the shortest tone in TM, the low tone. If the plural suffix and the reduplicants were unstressed, we would have found the same shortening of their duration as was observed in the rest of the tested neutral-tone syllables (zi, zhe and Cə). The diverse results suggest that syllable reduction does not apply to all the neutral-tone syllables in TM. The TM neutral-tone syllables all share a similar tonal target, but no acoustic evidence supports the claim that all of the neutral-tone syllables are reduced (unstressed).

General discussion
The results from both experiments demonstrate that the TM neutral tone differs from the SM neutral tone in the following ways. 1) The two dialects exhibit different pitch patterns on their neutral tone. As demonstrated in the first experiment, the TM neutral-tone syllables do not show post-L rising; instead they have a static pitch target in the mid-low to low range, and the comparison with the low tone in the second experiment confirms that the TM neutral tone has a static mid-low pitch target. 2) Compared to the SM neutral tone, the TM neutral tone shows a much stronger articulatory strength. The mid-low pitch target of the TM neutral tone was approached faster than that of the SM neutral tone. As shown in the first experiment, the carry-over effect had disappeared in TM by the third consecutive neutral tone, but it remained strong in SM. 3) While tonal neutralization is tightly connected to stress in SM (a syllable is toneless because it is unstressed), this does not seem to be the case for TM. As mentioned in 1.1, the SM "neutral-tone" syllables are in fact unstressed syllables, because the reduction and the pitch patterns were found in all the unstressed syllables, whether or not they are lexically marked as unstressed/neutral tone. On the other hand, the TM neutral tone does not seem to be tightly connected to stress. It is less common and strictly lexically determined. The pitch pattern is not found in de-stressed syllables; it is only found in syllables that are specified as having a neutral tone underlyingly at the lexical level. Moreover, the durational comparison in Experiment 2 shows that not all of the neutral-tone syllables are reduced. Therefore, it is hard to argue for a correlation between stress and the process of tonal neutralization in Taiwan Mandarin.
Based on the acoustic evidence listed above, I propose that the neutral-tone syllables in TM should be analyzed as unaccented, and that the lexical tone in these syllables is neutralized into a mid-low tone. The mid-low pitch target can be interpreted as an unspecified position with a reduced pitch articulation, similar to [ə] in the vowel space. The term unaccented is used to show that pitch is the only acoustic cue marking prominence, as in pitch-accent languages like Japanese, as opposed to stress-accent languages, which might use other acoustic cues such as duration and intensity to mark prominence (van der Hulst, 2010). By analyzing these syllables as unaccented, we can capture the fact that durational cues are not consistent among the TM neutral-tone syllables: only some of them, mostly frequent function words, are prone to reduction. Therefore, describing the TM "neutral tone" syllables as "unaccented" would be a better analysis than describing all of them as "unstressed". Other Mandarin dialects have been documented as having unaccented syllables. For example, in the Xinjiang Mandarin dialects Barköl Mandarin and Ürümqi Mandarin, the neutral-tone syllables are analyzed as unaccented because they undergo tone loss but are not reduced, and the pitches of these unaccented syllables are conditioned by the preceding lexical tone (D. Cao, 1988; Wei, 2011). Similarly, the TM neutral-tone syllables undergo tone loss without being consistently reduced. The difference is that the TM unaccented syllables are neutralized into a mid-low tone.
There are three alternative ways to analyse the TM neutral tone.In the following discussion I will illustrate why they are inadequate, and thus the TM neutral tone should be analysed as unaccented.
First, the TM neutral-tone syllables can be analysed as having a mid-low lexical tone, which I term the fifth tone. This is because they have a distinct pitch target which differs from those of the other four lexical tones, and the mid-low pitch target cannot be predicted from the stress pattern and has to be specified underlyingly. Although the TM neutral tone is more lexical-tone-like than the SM neutral tone, it should be noted that its articulatory strength is still weaker than that of lexical tones in a similar pitch range. As shown in Experiment 1, although the pitch contours start to converge immediately in the first neutral tone, the carry-over effects were not overcome until the third neutral tone. Although carry-over effects can still be significant on lexical tones, they are usually observed on lexical tones with higher offsets such as /H/ and /LH/ (Huang, 2013; Y. Xu, 1997). This piece of evidence suggests that these syllables do not possess a lexical tone either. If these neutral-tone syllables possessed a mid-low lexical tone, their pitch target should be implemented faster. Furthermore, an extra mid-low level tone would be hard to fit into the tonal structure /H, LH, L, HL/, or the newly proposed tonal structure of Taiwan Mandarin /H, M, L, HM/ (Huang, 2017).
Second, one might analyze these TM neutral-tone syllables in the same way as those in SM, namely that "these syllables were neutralized because they are unstressed", with the difference between the two dialects being that in TM the unstressed syllables are neutralized into a mid-low tone. This analysis would be similar to one of the analyses of Standard Thai unstressed syllables, in which lexical tones are said to be neutralized into a mid tone (Abramson, 1962). Nonetheless, this analysis needs to be rejected because our results show that not all the neutral-tone syllables are unstressed; some do not show a reduced duration. If the neutralized tone were a result of being unstressed, shorter durations would be expected in all the neutral-tone syllables.
Another possibility, suggested by some linguists in China (see L. Liu, 2002), is to analyze these neutral-tone syllables as a result of tone sandhi, i.e., syllables that undergo this tone sandhi change their tones to an unspecified tone with a mid-low pitch target. The problem with this analysis is determining the condition of this tone sandhi process. A tone sandhi process might be triggered by adjacent tones, e.g. Mandarin Tone 3 Sandhi, or by position in a prosodic domain, e.g. Southern Min tone sandhi. However, this tone sandhi process is conditioned neither by adjacent tones nor by position in a prosodic domain. Although the neutral-tone syllables are all located at the right edge of a foot (and possibly of other prosodic units), other lexical tones might occur in these positions as well. Therefore, it is not possible to predict the condition of this tone sandhi. In order to adopt this analysis, these linguists argue that the trigger of this tone sandhi is lexically determined, which makes no generalization about the process. This analysis is like stating that every [d] in English is actually the result of voicing /t/, and that this voicing process is lexically determined. Of course, no linguist actually claims this; instead, the logical analysis is that /d/ is specified underlyingly. Similarly, rather than specifying the condition of the tone sandhi lexically, one could simply specify the unaccented syllables underlyingly.
The synchronic analysis of unaccented syllables is likely to be one of the manifestations of how the Mandarin neutral-tone syllables were adopted in Taiwan. As discussed in 1.2, the rhythmic features are manifested in the following ways. First, TM has fewer full-neutral disyllabic words. The evidence from the dictionary, computer input systems, and online typing conventions suggests that the majority of the prescriptive neutral-tone syllables in SM compound words are produced with lexical tones in TM. Second, there is evidence suggesting that the post-L rising in the SM neutral tone was reanalyzed by TM speakers as a /H/ tone. When a Tone 3 syllable is reduplicated in nicknames or kinship terms, a /L-H/ tonal pattern is often used to show endearment (Hsu, 2006). TM speakers likely produced the post-L rising without a reduced duration, and thus reanalyzed the post-L "neutral-tone syllables" as /H/. In comparison, SM speakers do not treat the post-L rising as a /H/ because it is always accompanied by a shorter duration, which serves as an additional perceptual cue. Therefore, SM listeners identify the high pitch after /L/ as a neutral tone, not a /H/ tone. This study demonstrates another manifestation of the TM rhythmic features on tones. It shows that, for the remaining prescriptive neutral-tone syllables, at least for those examined in this study, TM speakers likely reanalyzed the SM neutralized tone patterns as a mid-low tone with weak articulatory strength instead of adopting one of the existing lexical tones. This is likely because the SM neutral-tone syllables all have falling pitch contours reaching the low register, except when they occur after /L/. When TM speakers did not reduce the duration of these neutral-tone syllables, the SM neutralized pitch patterns were likely treated as a specific pitch contour rather than as a result of destressing and subsequent tone loss. Consequently, these falling pitch contours were generalized by TM speakers as an unaccented syllable with a mid-low pitch target.
It is worth mentioning that these unaccented syllables are rather peripheral and might be moving towards merging with Tone 3. Acoustic data show that their voice quality and pitch contours are sometimes similar to those of Tone 3, especially when they follow a falling tone or a low tone. Evidence in writing also suggests that TM speakers are aware of this phonetic similarity. For example, some young TM online users have recently used the character 惹 rě /ʐə L / 'to provoke' as a substitute for 了 le /lə Ø / 'PFV' when posting comments online. The character 惹 rě /ʐə L / can be used as a phonogram because the syllable lě /lə L / does not exist in Mandarin and many TM speakers have merged /ʐ/ with /l/. I have also observed online users writing 鵝紫 ézǐ /ɤ LH -tsɨ L / 'goose-purple' as a substitute for 兒子 érzi /ɤɹ LH -tsɨ Ø / 'son'. These pieces of evidence suggest that TM speakers are aware of the similarity between Tone 3 and the unaccented syllables, and are in fact willing to use Tone 3 to mark them when given the chance to create their own writing.
Furthermore, some TM speakers seem to treat the neutral-tone syllables phonologically as Tone 3. I conducted a "wug test" on the four productive neutral-tone syllables: perfective le, durative zhe, possessive de, and plural men. Twelve subjects were given obsolete characters with a made-up spelling and a made-up meaning, and they were asked to read a meaningful sentence with a tested suffix attached to each made-up character. Interestingly, one subject consistently applied Tone 3 Sandhi to the novel Tone 3 syllable before the neutral-tone syllable, suggesting that he treated the neutral tone as Tone 3 phonologically. This shows that, possibly for some TM speakers, the low pitch of the neutral tone has led them to categorize the neutral-tone syllables as low-tone syllables. Although Tone 3 Sandhi is not triggered in disyllabic L-Ø words because the pitch patterns are lexicalized, in a completely novel situation these neutral-tone syllables are treated like a low tone. Sociolinguistic research on a larger scale needs to be carried out to investigate whether this is a developing tendency and, if so, how it affects speakers' production.
Unfortunately, this study was not able to examine all the neutral-tone syllables. The ones that were not examined include: 1) disyllabic compound words that are prescriptively marked as having a neutral tone in computer input systems, 2) final particles, and 3) further reduplications, such as other kinship terms and verbal reduplications. However, despite the limited sample, the neutral-tone syllables tested in this study are among the most frequently used words and syllables in Mandarin. The pitch contours of the unaccented syllables found in this study should be representative, considering that these syllables comprise a large portion of the neutral-tone input and output in speech. Furthermore, the split durational results mean that whether or not the untested neutral-tone syllables are reduced will not affect the conclusion of this study. Nevertheless, it will still be important to examine more neutral-tone syllables in future studies to investigate how neutral-tone syllables developed and how they differ across word classes and frequencies. More work on different lexical-tone languages or other Mandarin dialects will also be useful in observing the interaction between tone and stress.

Conclusion
This study shows that, unlike in Standard Mandarin, the prescriptive neutral-tone syllables in TM either possess one of the four lexical tones or have a static mid-low pitch target that is different from the other lexical tones. Furthermore, not all of these neutral-tone syllables are reduced. In other words, the tonal neutralization is not a result of de-stressing. I propose that the TM "neutral tone" should therefore be analyzed as unaccented with a mid-low pitch target. While stress can affect the pitch patterns of lexical tones through different ways of tone neutralization, this study demonstrates that when the contrast between stressed and unstressed syllables is less pronounced, the unstressed syllables cannot be "reversed" back to their original lexical tones, and an analysis of unaccented syllables thus needs to be proposed.

Figure 1: Results of the two consecutive neutral tones

Figure 2: Results of three consecutive neutral tones (X-red-pl-poss-Y): a. results of TM speakers; b. results of SM speakers

Figure 3: Comparison of the results of X-red-pl-poss-Y by TM and SM speakers

Figure 4: Comparisons of tested neutral-tone syllables with the Tone 3 syllables by the same rimes

Figure 5: Estimated pitch contours of the rimes of /L/ and /Ø/

Table 1: Estimates of fixed effects on the first neutral tone in two consecutive neutral tones

Table 2: Estimates of fixed effects on the second neutral tone in two consecutive neutral tones

Table 3: Estimates of fixed effects on the second neutral tone in three consecutive neutral tones

Toward the third neutral tone, the effects of the preceding lexical tone on pitch range and pitch contour slopes had disappeared. The linear mixed model on the third neutral tone revealed no significant effects of preceding tone [F(3, 2.430) = 4.509, p = .154] or of the interaction [F(3, 4.075) = 2.058, p = .246]. The position effect was also not statistically significant [F(1, 8.645) = 2.997, p = .119], suggesting that the pitch contours were flattened. The estimates of fixed effects are shown in Table

Table 4: Estimates of fixed effects on the third neutral tone in three consecutive neutral tones

Table 5: Tested corresponding control syllables with the same segment

Table 6: Estimates of fixed effects on the first half and the second half of the rime

Table 7: Estimated marginal means by preceding tone and tested tone

Table 8: Pairwise comparisons between the pitch of /L/ and /Ø/ at 75% of the rime by pair and preceding tone

Table 9: Proportions of syllables excluded due to creaky voice