THE INFLUENCE OF THE ASSIMILATION OPERATOR, SPEECH RATE AND LINGUISTIC BOUNDARY ON THE PRODUCTION OF /z/ IN CROATIAN

It is widely accepted that invariant and discrete phonological units at the linguistic level are transformed into variable and continuous movements of the speech organs, which in turn produce equally continuous acoustic results. The variability of phonemic units depends on neighbouring phonetic units, but also on the various linguistic, communicational and pragmatic contexts of a particular speech act. The influence of phonetic units upon each other results in adaptations, coarticulations and assimilations. In assimilation, at least one distinctive feature of a phoneme is changed, so that the observed phoneme becomes similar to its neighbouring sound, the assimilation operator. This paper analyses the influence of speech rate on assimilation processes in the voiced fricative /z/ when it is followed by the sounds /s, z, ʃ, ʒ/ in four different types of articulatory joint: sentence, clausal, lexemic and proclitical. The articulatory joint refers to the production of two phonemes separated by different types of linguistic boundaries. Twenty female native speakers of Croatian with no history of speech or hearing impairments read a text at both natural and fast speech rates. The acoustic recording was made in a sound-treated room. The Praat software was used to analyse six variables in all occurrences of the sound /z/: duration, spectrum centre of gravity, standard deviation of the centre of gravity, spectral skewness, spectral kurtosis, and harmonic to noise ratio. The results showed that various linguistic boundaries, speech rates and sounds as assimilation operators influence the degree of assimilation of the phoneme /z/, as measured by the acoustic variables.

In each of these levels the information is coded with a special system of signs, and the whole communication chain has to meet the basic requirement of ensuring sameness/uniformity in order to achieve the final communication goal, namely that the listener receives the exact information the speaker has sent. Given that the speaker/listener is not an ideal mechanism and has a range of articulatory and acoustic limitations, the whole process of speech communication is not ideal, with interruptions in the transmission of information occurring in different parts of the communication chain. One of the ways to ensure the efficiency of spoken communication is coding it with a high degree of redundancy.
Unlike the hierarchical models of speech production, other models are based on theories of spreading activation, in which speech production is shown as an interwoven system of units from various levels of speech production, depending on their degree of activation (Stemberger 1985; Dell 1986; Erdeljac 2009). The network is pictured as a range of levels in which the units of each level are represented as nodes linked to other units of the same level by associative connections, and the unit with the highest activation spreads its activation to the next level. This demands timely activation of the needed unit, inhibition of units that have already been used or are unnecessary at that moment, and appropriate transmission of activation to the next unit. Errors in speech production are explained by inadequate activation of needed units and inadequate inhibition of realised units.

Articulatory Joints
Given that speech in its horizontal timeline can be divided into speech units (Škarić 2007; Horga 2005) that have their own correlates on the linguistic level, the question arises as to how particular units are joined together, i.e. how articulatory joints are organised. An articulatory joint is defined as the spoken realisation of two sounds where two speech units of the same hierarchical level meet, with a linguistic boundary passing between them (Horga 2005; Horga/Liker 2006). For example, in the sentence Doputovao je iz Zagreba, /z#z/ represents a proclitical articulatory joint that connects the proclitic iz and the full word Zagreb, and a proclitical linguistic boundary /#/ passes between these two /z/ phonemes. Theoretically, we can talk about the following articulatory joints: discourse (connection between two discourses: …To ti je priča o Snjeguljici.#Sada ću ti ispričati priču o Crvenkapici…), paragraphic (connection between two spoken paragraphs: …U Splitu je proveo dva mjeseca.#Nakon povratka u Zagreb nastavio je studij. Upisao je…), sentence (connection between two sentences: Čekao sam ga na kolodvoru. Vlak je kasnio.), clausal (connection between two clauses: Kad sam došao na kolodvor,# saznao sam#da vlak kasni), word phrase (connection between two word phrases: Doputovao je#brzim vlakom.), lexemic (connection between two spoken words: …brzim#vlakom…), clitical (connection between a full word and a clitic, which can be either proclitical: iz#Splita, or enclitical: doputovat#će), morphological (connection between two morphemes within a word; prefixal: iz#govor, suffixal: mlad#ost, and ending: doputova#la), syllabic (connection between two syllables: Za#greb) and sound-related (connection between two sounds in a syllable: Z#a#g#r#e#b). The boundary between the first and second members of an articulatory joint can be marked to different degrees. The clearest boundary between two members of a joint is a pause. The duration of a pause decreases from joints that connect larger units towards those that connect smaller units, which is why a pause will probably not appear in sound, syllable and clitical joints and will be regular in discourse, sentence and clausal joints, while in word phrase and lexemic joints the use of a pause could be freer (Horga/Liker 2006). From the standpoints of both speech production and speech perception, the important question is whether the boundaries between particular members of articulatory joints are encoded in the speech signal or are erased because of coarticulatory influences. The speaker tries to produce the most economical articulatory movement in connecting the articulatory gestures characteristic of the neighbouring segments, while at the same time aiming to preserve their linguistic distinctiveness so that the listener can understand what is being said (Lindblom 1979).

Coarticulation
Both in speech production and speech perception the important question is the dichotomy between the physical articulatory-acoustic level of the spoken signal and its representational level. The following fundamental communicational question can thus be asked: in what way is the sameness of the information sent by the source, and the one the listener perceives, realised in spite of such reshaping (Horga/Požgaj Hadži 2012)? On a linguistic representational level, the information is presented by a system of abstract, invariant and discrete units, i.e. phonemes. The reshaping of this abstract representation into a spoken acoustic signal is made on a performative-articulatory level, with movements of the speech organs that carry out changeable, continuous and partially simultaneous motor patterns, which result in an equally variable, continuous and non-discrete acoustic signal. The variability of an acoustic spoken signal and the inability to segment it into clearly separated discrete representational units is a consequence of, on the one hand, the specificity of human articulatory abilities, and on the other the universal speech phenomenon of coarticulation, defined as a systematic and reciprocal activity of neighbouring spoken segments during their articulation, as they are connected into larger speech sequences. This means that coarticulation assumes that every spoken segment contains the influence of neighbouring segments, but also that each segment affects its neighbours. The articulatory apparatus, as a multi-component mechanism, leaves particular articulatory organs free to make anticipatory movements for the next segment, and thus to act on the current segment in a coarticulatory way. It is also possible for an articulator that is active in the pronunciation of the neighbouring sounds to make a movement that is a compromise between the requirements of different neighbouring sounds. In this way, coarticulatory phenomena can be characterised as temporal or spatial.
Due to the coarticulatory influences of neighbouring sounds, different degrees of adaptations and segmental assimilations can be realised. Coarticulation processes can be explained in different ways (Farnetani/Recasens 2013). The glide hypothesis explains coarticulation as simply the mechanical inertia of the speech organs, which need time to shift from the pronunciation position of one sound into that needed for the next one. According to Lindblom's theory of "adaptive variability" and the theory of "hyper and hypo speech," speech variability occurs because of the constant adjustment of speech production to the requirements of the specific communicational situation. In some communicational situations it is necessary to produce a spoken signal which, from the point of view of the listener, has a maximum degree of contrast, meaning that "hyper speech" mechanisms are activated. In other situations a lesser degree of contrast between segments is allowed, and "hypo speech" mechanisms prevail in articulating the utterance. These are the reasons why pronunciation varies from hyper-correct to inaccurate and careless. Coarticulation mechanisms enable fluent speech production because, just like any other motor behaviour, they satisfy the principles of economy of motor effort. Therefore, the speaker tries to produce the most economical articulatory movement in connecting the articulatory gestures characteristic of the neighbouring segments (Lindblom 1979). According to the "distinctive features spreading" theory, coarticulation is not conditioned only by the mechanical limitations of the pronunciation apparatus, but is a constituent part of the phonological level, in which the input commands for the realisation of variable spoken segments have already been defined. Therefore, the coarticulatory influence of segments is also assumed and planned on the representational level. From the standpoints of speech production and speech perception, the important question is whether the linguistic boundaries between the members of particular joints are encoded in the speech signal or are erased because of coarticulatory influences.
All these mechanisms on the levels of speech production and perception function effectively if the coder of the speaker and the decoder of the listener are working effectively. Due to a lack of such abilities, speakers of a foreign language will be recognised as native speakers of another language because of their foreign accents. To reduce a foreign accent, speakers need to reorganise their motor articulatory programmes for the pronunciation of the foreign language: they must follow its rules for transforming the representational linguistic level into the performative articulatory level, and acquire its coarticulatory rules, which can be language-specific.

Aims of Research
The aim of this research was to analyse some characteristics of coarticulation on the example of the sound /z/ when it is followed by the sounds /s/, /z/, /š/ and /ž/ as coarticulatory operators, in a connected text read at a natural and then a fast speech rate. For the fast rate, the participants were instructed to read the text as quickly as possible but to keep intelligibility at an acceptable level. Apart from sound operators and speech rate, the influence of different types of boundaries between the first and second members of an articulatory joint on coarticulation was also examined.

Methodology and Procedure
Twenty female students of the Faculty of Humanities and Social Sciences of Zagreb University, native speakers of Croatian with a regular speech status, were involved in the research. Only female students were chosen in order to make the sample homogeneous with regard to general acoustic characteristics. The spoken material was a connected text (1 page) in which the sound /z/ was the first member of the articulatory joint, the second members were the sounds /s/, /z/, /š/ and /ž/, and the boundary between the members of the joint was sentence, clausal, lexemic or proclitical. A total of 32 articulatory joints were analysed for each participant (two joints for each lexical boundary in the natural and the fast speech rate), or a total of 640 articulatory joints for the 20 participants. The participants read the text at the natural speech rate (5.14 syl/s) first and then at the fast speech rate (6.71 syl/s), with the latter being 30.5% faster than the former. They had three minutes of reading preparation for each speech rate. The reading was recorded under laboratory conditions that enabled a quality acoustic analysis. The analysis of six acoustic variables in all occurrences of the articulatory joints, namely duration, spectrum centre of gravity, standard deviation of the centre of gravity, spectral skewness, spectral kurtosis and harmonic to noise ratio (Kent/Read 2002; Jones/Nolan 2007; Jones/McDougall 2009), was done using the Praat software (Boersma/Weenink 2015). The acoustic variables were measured after segmentation and annotation of the components of the articulatory joints (Figure 1). The spectrum centre of gravity of the noise is a measure of where the energy in the spectrum is most concentrated, and it approximately matches the central frequency of the analysed segment. Spectral skewness, as the term is used here, is the standard deviation of the centre of gravity and illustrates how the spectral energy of the sound is spread around the central frequency. Spectral kurtosis is a statistical deviation of the spectrum centre of gravity, and it shows where most of the energy lies relative to the centre of gravity: if most of the energy is below the centre of gravity, then the value is closer to zero or is negative, and if most of the energy is above the centre of gravity, then the value is higher and positive. The prominence of the main amplitude is a statistical measure of the relative intensity of the most prominent part of the spectrum in relation to the neighbouring parts of the spectrum. Finally, the harmonic to noise ratio is a measure of the sonority of sounds: the higher the coefficient, the more intensive the harmonic component of the sound.
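As a rough illustration of how spectral moments of this kind can be obtained, the following Python sketch computes a centre of gravity, a standard deviation, a skewness and a kurtosis from a power spectrum. It assumes the usual power-weighted moment definitions (weighting by the power spectrum, skewness as the third central moment divided by the 1.5 power of the second, kurtosis as the fourth central moment divided by the square of the second, minus 3), which is how Praat documents these measures. The paper does not report the exact analysis settings, so the function name `spectral_moments` and the toy spectrum are illustrative assumptions, not the authors' procedure.

```python
import math


def spectral_moments(freqs_hz, power):
    """Return (centre of gravity, standard deviation, skewness, kurtosis)
    for a spectrum given as parallel lists of bin frequencies (Hz) and
    non-negative power values. Hypothetical helper, not the authors' code."""
    total = sum(power)
    if total <= 0:
        raise ValueError("spectrum contains no energy")

    # Centre of gravity: the power-weighted mean frequency.
    cog = sum(f * p for f, p in zip(freqs_hz, power)) / total

    def central_moment(order):
        # Power-weighted central moment of the given order around the CoG.
        return sum(((f - cog) ** order) * p
                   for f, p in zip(freqs_hz, power)) / total

    mu2 = central_moment(2)
    if mu2 == 0:
        raise ValueError("all energy lies in a single frequency bin")

    sd = math.sqrt(mu2)
    skewness = central_moment(3) / mu2 ** 1.5
    kurtosis = central_moment(4) / mu2 ** 2 - 3.0
    return cog, sd, skewness, kurtosis


# Toy spectrum (invented values): energy symmetric around 5000 Hz,
# so the centre of gravity is 5000 Hz and the skewness is 0.
cog, sd, skew, kurt = spectral_moments([4000.0, 5000.0, 6000.0],
                                       [1.0, 2.0, 1.0])
```

In practice the frequency bins and power values would come from the spectrum of each annotated /z/ segment; the toy values only show the mechanics of the calculation.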

RESULTS AND DISCUSSION
The issues analysed in this paper, namely the influences of the sounds /s, š, z, ž/ as assimilation operators, of the types of lexical boundaries inside the articulatory joint and of speech rate on the acoustic characteristics of the sound /z/, fall into the general research of phonotactics, coarticulation, adaptation and assimilation that occur in connected speech. By analysing assimilation mechanisms as parametric, Škarić/Kišićek (2006) confirm in their research that "the speaker's conflicts that they investigate (i.e. the pronunciation of unpronounceable phonemes DH) are solved in a compromised manner, so that sometimes an assimilation request prevails, striving towards smooth and economical pronunciation, and sometimes the request for clear performance of phonemes, striving to articulate a hint by which the speaker would mark phonemic representation, and that would, along with redundancy, help the listener to reconstruct it." This conclusion is in accordance with Lindblom's views on the communicational conditioning of coarticulatory processes, and research has shown that the activity of assimilation operators can be different in particular speaking situations, i.e. that the represented phonemic features can be realised or that the assimilation mechanisms can efface them.
To understand connected speech the listener must build a hierarchy of language units from sounds, syllables, words, word phrases or sentences. This process of perceptive segmentation also involves uncovering the linguistic boundaries in articulatory joints. Analysis of the influence of linguistic boundaries on the pronunciation of articulatory joints with the same phonemic content, but with different positions of the lexical boundary (Horga 1996), shows that the boundary has the greatest influence on the duration of consonants (stops were analysed): in the sequence V#CV their duration is significantly longer, 83.5 ms on average, than in the sequence VC#V, where it is 69.5 ms on average. Therefore, the duration of the consonants has been shown to be a significant cue to the position of the linguistic boundary, while the duration and intensity of the vowel and the explosion of stops were not statistically significant parameters in determining this. Horga and Liker (2006) examined the influence of lexical boundaries on the pronunciation of members of articulatory joints using electropalatography. The results showed that a possible indicator of lexical boundaries in lexemic articulatory joints for the consonants /t/ and /k/ was the index of duration of tongue-palate contact. The lexical boundary was also marked by the duration of a pause. The weight indexes of tongue-palate contact were found to be an indicator of lexical boundaries, while the indexes of alveolar and velar contact and the centre of gravity of the contact were not found to be important in determining their position. This conclusion is in accordance with Lindblom's views on the communicational conditionality of coarticulation processes. The current research has shown that the influence of assimilation operators can be different in particular speaking situations, i.e. that the supposed phonemic features can be realised, or that assimilation mechanisms that diminish or even efface them can be at work. These views of coarticulation are in accordance with the features of articulatory phonology, in which phonological units are defined as planned movements and dynamic phonetic gestures with an intrinsic temporal dimension, and in which the overlapping of particular gestures is allowed (Farnetani/Recasens 2013). This can explain the even greater overlapping of articulatory gestures at the faster speech rate and the larger coarticulatory influence, which was also confirmed in this research on the basis of the observed acoustic variables.

Realisation of Particular Segments of the Articulatory Joint
The articulatory joint can be realised with three segments (first member, pause, second member), with two segments (first member, second member) or as one segment (the first and second members of the joint connected and realised as one segment). In the first case the linguistic boundary between the members of the joint is clearly expressed by a pause, while in the second case it is acoustically possible to separate the first and second segments, and the linguistic boundary is found at that division. Finally, in the third case the linguistic boundary is erased because the two segments are realised as one under the influence of the assimilation operator. The activity of the assimilation operator can be expected to grow in the same order, i.e. to be lowest when a pause is realised and highest when the articulatory joint is realised as one segment. The number of articulatory joints of each realisation type is presented in Table 1 and Figure 2, which illustrate that the influence of the assimilation operator is greater at the fast than at the natural speech rate, and that it grows from the sentence joint through the clausal and lexemic joints to the proclitical joint, in which it is the highest (χ² = 266, p = 0.00).
Table 1 and Figure 2: The number of articulatory joints realised with a pause (s1-p-s2), as two segments (s1s2) and as one segment (s1), regarding the type of the linguistic boundary: sentence (sent), clausal (clau), lexemic (lexe) or proclitical (proc) and regarding the speech rate: natural or fast.
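The three realisation categories can be made concrete with a short Python sketch that maps the annotation of one articulatory joint to a category label and tallies the results. The segment labels "s1", "p" and "s2" follow the caption above, while the function names and the list-based annotation format are hypothetical and are not taken from the authors' Praat annotation.

```python
def classify_joint(segments):
    """Map the annotated segments of one articulatory joint, given in
    temporal order, to a realisation category (hypothetical format)."""
    if segments == ["s1", "p", "s2"]:
        return "s1-p-s2"  # boundary marked by a pause
    if segments == ["s1", "s2"]:
        return "s1s2"     # two separable segments, no pause
    if segments == ["s1"]:
        return "s1"       # both members merged into one segment
    raise ValueError(f"unexpected annotation: {segments!r}")


def count_realisations(joints):
    """Count how often each realisation category occurs in a list of joints."""
    counts = {"s1-p-s2": 0, "s1s2": 0, "s1": 0}
    for segments in joints:
        counts[classify_joint(segments)] += 1
    return counts


# Invented annotations for four joints, for illustration only.
example_counts = count_realisations(
    [["s1", "p", "s2"], ["s1"], ["s1", "s2"], ["s1"]]
)
```

Counts of this kind, split by boundary type and speech rate, are what a table such as Table 1 summarises.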

The average duration of a pause in the articulatory joint, when a pause is realised, can be interpreted along the same lines as the degree of assimilation, as can be seen in Table 2 and Figure 3 (t = 7.50, p = 0.00). Namely, it can be said that the longer the pause, the lower the influence of the assimilation operator. The average duration of a pause is longer at the natural than at the fast speech rate, and the duration of a pause decreases from the sentence joint to the clausal, lexemic and proclitical joints, without any significant difference between the last two.
The influence of the assimilation operator, i.e. of the second member of the articulatory joint on the first member, is shown in Table 3 and Figure 4, which illustrate that, generally, the degree of assimilation is higher at the fast speech rate (χ² = 186.12; p = 0.00), but that at both rates the assimilation is greater for the sounds /s/ and /z/ than for the sounds /š/ and /ž/, i.e. the pronunciation of both members of the articulatory joint as one sound is more common with /s/ and /z/. Therefore, it is possible to conclude that the similarity or commonality of the place of articulation contributes to the influence of the assimilation operator on the degree of assimilation.

Table 3 and Figure 4: The number of articulatory joints realised with a pause (s1-p-s2), as two segments (s1s2) and as one segment (s1), depending on the second member of the articulatory joint: /s/, /š/, /z/ or /ž/, and regarding the speech rate: natural or fast.
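The χ² values reported for these count tables can be read as tests of independence between the realisation type and a grouping factor. The Python sketch below computes a Pearson χ² statistic for a table of counts; the function name and the counts in `example_table` are invented for illustration and are NOT the data of this study, and the paper does not state how its own tables were arranged for the test.

```python
def chi_square_independence(table):
    """Pearson chi-square statistic and degrees of freedom for a table of
    counts (a list of equally long rows). Assumes no row or column sums
    to zero, since that would give an expected count of zero."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)

    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / grand_total
            chi2 += (observed - expected) ** 2 / expected

    dof = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, dof


# Hypothetical counts, not the paper's data.
# Rows: sentence, clausal, lexemic, proclitical joints.
# Columns: s1-p-s2, s1s2, s1.
example_table = [
    [30, 8, 2],
    [20, 15, 5],
    [5, 20, 15],
    [1, 14, 25],
]
chi2, dof = chi_square_independence(example_table)
```

The statistic is then compared against the χ² distribution with `dof` degrees of freedom to obtain a p value.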

Spectrum Centre of Gravity
Table 4 and Figure 5 show that, regardless of the type of linguistic boundary, the spectrum centre of gravity at the natural speech rate is shifted towards higher frequencies in comparison to that seen at the fast speech rate (t = 6.65; p = 0.00). This can be explained by the realisation of the complete articulatory movement and of the articulatory conditions necessary for producing the intensive, high-frequency friction that is characteristic of fricatives. It can also be attributed to the greater influence of the higher-frequency fricatives /s/ and /z/ as assimilation operators (Table 1, Figure 2). When it comes to the influence of the linguistic boundary between the members of a joint, the shift of the spectrum centre of gravity towards higher frequencies is greater for sentence and clausal boundaries than for lexemic and proclitical ones. Sentence and clausal boundaries are most commonly realised with a pause, so the first member of the joint, /z/, is found in sentence-final or clause-final position, which allows the sound to become voiceless and thereby shifts the spectrum centre of gravity towards higher frequencies. By measuring acoustic parameters, Bakran (1996) concludes that in the Croatian language vowels and consonants within a syllable before a pause are significantly longer, regardless of the duration of the pause, and states that this lengthening at the end of the word is on average an extra 12% for vowels and 6-12% for consonants. Therefore, in the current research the more significant shift of the spectrum centre of gravity before sentence and clausal boundaries in articulatory joints can be attributed to the lengthening of the /z/ sound before a pause, which is more common in these types of articulatory joint than in lexemic and proclitical joints.

It is worth mentioning the factor of making the sound /z/ voiceless in sentence and clausal joints in front of a pause, which contributes to the heightening of the spectrum centre of gravity for two reasons: on the one hand, the noise component of the sound becomes higher in frequency and, on the other, with the absence of voicing that is lower in frequency, the part of the spectrum that would usually shift the spectrum centre of gravity towards areas with lower frequencies is absent.
Table 4 and Figure 5: Spectrum centre of gravity (in Hz) regarding the type of linguistic boundary: sentence (sent), clausal (clau), lexemic (lexe) or proclitical (proc), and regarding the speech rate: natural or fast.
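The comparisons between speech rates are reported as t statistics without naming the test variant. Because the same speakers read the text at both rates, one plausible reading is a paired t-test, and the Python sketch below shows how such a statistic is computed. This is an assumption about the analysis, not a description of it, and the function name and the values in the example are invented.

```python
import math


def paired_t(first, second):
    """Paired t statistic and degrees of freedom for two equally long lists
    of measurements taken from the same speakers."""
    if len(first) != len(second):
        raise ValueError("both conditions need the same number of values")
    diffs = [a - b for a, b in zip(first, second)]
    n = len(diffs)
    if n < 2:
        raise ValueError("at least two pairs are required")

    mean_diff = sum(diffs) / n
    # Sample variance of the differences (n - 1 in the denominator).
    variance = sum((d - mean_diff) ** 2 for d in diffs) / (n - 1)
    if variance == 0:
        raise ValueError("differences are constant; t is undefined")

    t_value = mean_diff / math.sqrt(variance / n)
    return t_value, n - 1


# Invented centre-of-gravity values (Hz) for three speakers:
# natural rate first, fast rate second.
t_value, dof = paired_t([5200.0, 5400.0, 5600.0], [5100.0, 5200.0, 5300.0])
```

A positive t value here would mean that the first condition has the higher mean, which matches the direction described above for the natural speech rate.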

Spectral Skewness
When it comes to spectral skewness with regard to speech rate (Table 5 and Figure 6), the difference is significant only for /z/ in the sentence joint, where it is greater at the fast speech rate than at the natural one. This can be explained by the greater variability in the articulation of the sound /z/ allowed by a pause, which is very frequent in the sentence joint. This also explains the overall reduction of spectral skewness due to stronger connections between joint members and the greater assimilation influence of the operator. In general, spectral skewness did not show a statistically significant difference between the two observed speech rates (t = 0.20; p = 0.84), which can be attributed to the weaknesses of the measure itself. Namely, this measure was evaluated as ineffective in research on the acoustic characteristics of fricatives in the speech of people wearing dental braces (Horga et al. 2013).

Spectral Kurtosis
The values of spectral kurtosis (Table 6 and Figure 7) are generally higher at the fast speech rate than at the natural one (t = 5.61; p = 0.00). Given that the spectrum centre of gravity at the natural speech rate is shifted towards higher frequencies, there is more space left at the lower frequencies for spectral kurtosis towards that lower frequency area. The different behaviour of spectral kurtosis in sentence and clausal joints, unlike that seen with lexemic and proclitical ones, is again prominent, especially at the natural speech rate. Namely, in sentence and clausal joints the kurtosis is placed towards lower frequencies, and in lexemic and proclitical ones towards the higher frequencies. To a certain extent, clausal joints at the fast speech rate are an exception.

Prominence of Amplitude
The prominence of amplitude (Table 7 and Figure 8) shows values that move in parallel with those of spectral kurtosis (t = 2.11; p = 0.04). The amplitude is greater at the fast speech rate than at the natural speech rate, and it is greater for lexemic and proclitical joints than for sentence and clausal ones, which shows a higher coarticulatory influence at the fast speech rate.
Table 7 and Figure 8: The prominence of amplitude regarding the type of linguistic boundary: sentence (sent), clausal (clau), lexemic (lexe) or proclitical (proc), and regarding the speech rate: natural or fast.

Harmonic to Noise Ratio
The harmonic to noise ratio (Table 8 and Figure 9) also shows that the coarticulation influence is higher at the fast speech rate: under the influence of the voiced assimilation operators /z/ and /ž/, the proportion of the harmonic component in the sound /z/ is higher at the fast speech rate than at the natural speech rate, while under the influence of voiceless /s/ and /š/ it is somewhat lower (t = 5.97; p = 0.00). Horga et al. (2013) apply the abovementioned set of acoustic variables to investigate the influence of dental prosthetics on the articulation of fricatives. The results of that paper can be summarised in three points: firstly, speech with braces was closer to the speech of the eugnathic participants; secondly, in speech without braces the participants had fewer possibilities for compensational articulatory mechanisms; and thirdly, speech with prosthetics differed from the speech of eugnathic speakers because prosthetics, as foreign bodies in the mouth, influence pronunciation. This research is mentioned because it showed that the characteristics of the pronunciation of fricatives can be observed with these types of acoustic variables. It is also interesting that the spectrum centre of gravity was shown to be the most useful variable in this earlier research.
Table 8 and Figure 9: Harmonic to noise ratio in the sound /z/ at the natural and fast speech rates, depending on whether the assimilation operators are the voiced sounds /z/ and /ž/ or the voiceless sounds /s/ and /š/.
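To make the scale of the harmonic to noise ratio easier to read, the Python sketch below converts the share of signal energy that is periodic into a ratio in decibels, following the way Praat documents its harmonicity measure. The function name `hnr_db` is illustrative, and the sketch assumes the values in this study are expressed in dB, which the text does not state explicitly.

```python
import math


def hnr_db(periodic_fraction):
    """Harmonic to noise ratio in dB for a signal in which the given
    fraction (strictly between 0 and 1) of the energy is periodic."""
    if not 0.0 < periodic_fraction < 1.0:
        raise ValueError("fraction must lie strictly between 0 and 1")
    noise_fraction = 1.0 - periodic_fraction
    return 10.0 * math.log10(periodic_fraction / noise_fraction)


# Equal harmonic and noise energy gives 0 dB; a larger harmonic share
# gives a higher, positive value.
balanced = hnr_db(0.5)
mostly_harmonic = hnr_db(0.99)
```

On this scale, a more strongly voiced realisation of /z/ yields a higher value and a devoiced, noisier realisation a lower one, which is the direction of the voiced versus voiceless operator contrast described above.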

CONCLUSION
The results of this research can be summarised as follows:
• the acoustic variables of spectrum centre of gravity, spectral skewness and kurtosis, as well as the prominence of amplitude, show that a faster speech rate has a higher assimilation influence on the members of the articulatory joint;
• the sentence and clausal lexical boundaries between members of articulatory joints preserve the inherent phonemic features of the sound /z/, while lexemic and proclitical boundaries allow a higher influence of the assimilation operator;
• the sounds /s/ and /z/ as assimilation operators, due to their similarity to the sound /z/ as the first member of the articulatory joint, have a greater assimilation influence than the sounds /š/ and /ž/, which differ from the sound /z/ in the place of articulation;
• the assimilation influence of the voiced sounds /z/ and /ž/ on /z/ is greater than that of the voiceless sounds /s/ and /š/ because of the shared voicing, resulting in a higher harmonic to noise ratio of /z/;
• the set of acoustic variables used in this work has proven to be a good set of measures for use in research on assimilation influences.

The next phase of this research could be an in-depth analysis of the observed articulatory joints, and a comparison with the results of the acoustic analysis, in order to get a more complete insight into the assimilation processes. Furthermore, it is possible to imagine a whole range of research that would analyse the assimilation relations of other sounds in the articulatory joint. Certainly, research into the physiological parameters of the activity of particular articulators, and a comparison with the results of acoustic and perceptive methods, would give a more thorough insight into the coarticulation phenomenon.