Pauses & Transition Features

So far, we have mainly been looking at phonetic features on the segmental level and tried to identify characteristics that help us to ‘label’ or classify individual vowels or consonants. However, a major part of phonetic analysis also deals with features that occur between individual segments or influence them in such a way that they change their qualities in order to adjust to their surroundings. Such intersegmental phenomena essentially encompass two different types, linking and non-linking features.


Pauses are perhaps the most ‘classical’ non-linking feature. They either signify very deliberate breaks in utterances that serve as (non-)cohesive or turn-signalling devices, or they result from physical constraints on naturally occurring speech, such as the need to draw an occasional breath. As we have seen before, though, especially in connection with our discussion of plosives, not every period of relative silence actually corresponds to a pause. According to Laver:

A silent pause within a speaking turn can be operationally defined as any silence which is 200 msec or more in duration. The reason for setting a minimum threshold to the duration of a silent pause is that the silence associated with the closure phase of a voiceless stop can sometimes last up to 180 msec or so, depending on the overall rate of speech. (Laver, 1994: p. 536)

Thus, we can assume that anything above 180 msec in length will usually correspond to a pause rather than to the closure phase of a plosive. Of course, a pause may also be followed directly by a plosive, in which case it isn’t actually clear where the pause itself ends and the closure phase begins. Furthermore, we usually distinguish impressionistically between long and short pauses, so we also need an operational definition of what actually constitutes each. I suggest we assume that anything above the 180 msec that Laver gives as the upper limit for a stop closure, and up to 300 msec, is a short pause and thus marks a minor tone group boundary (marked as |), whereas anything above 300 msec is a long pause and thereby marks a major tone group boundary (marked as ||).
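The operational definition above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the treatment of exactly 180 and 300 msec as belonging to the lower category, and the input format (a word paired with the silence that follows it, in msec) are all hypothetical choices made for the example and are not prescribed by the text.

```python
from typing import List, Optional, Tuple

# Thresholds taken from the text; how to treat the exact boundary
# values is an assumption made for this sketch.
CLOSURE_MAX_MS = 180      # silences up to this length may be stop closures
SHORT_PAUSE_MAX_MS = 300  # upper limit of a short pause


def classify_silence(duration_ms: float) -> Optional[str]:
    """Return '|' for a short pause, '||' for a long pause, None otherwise."""
    if duration_ms <= CLOSURE_MAX_MS:
        return None           # too short to count as a pause
    if duration_ms <= SHORT_PAUSE_MAX_MS:
        return "|"            # minor tone group boundary
    return "||"               # major tone group boundary


def mark_boundaries(tokens: List[Tuple[str, float]]) -> str:
    """Insert boundary marks after words, given (word, following silence in msec)."""
    out = []
    for word, silence_ms in tokens:
        out.append(word)
        mark = classify_silence(silence_ms)
        if mark is not None:
            out.append(mark)
    return " ".join(out)


if __name__ == "__main__":
    sample = [("well", 250), ("I", 0), ("think", 400), ("so", 0)]
    print(mark_boundaries(sample))
```

Note that a purely duration-based classification like this cannot resolve the case mentioned above, where a pause is followed by a plosive and the two silences merge into one.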

Transition Features

We’ll discuss all other features, regardless of whether they are linking or non-linking, under the heading of transitions, because we can assume that a transition from one word to the next should occur wherever a pause is not appropriate or physiologically necessary. In general, we can assume that most native speakers try to join words together unless they explicitly want to chunk information or put particular emphasis on specific words. Non-linking can therefore have a specific purpose, be a feature of a particular accent that employs a high number of glottal stops that break the flow, or be an idiosyncratic speaking habit of a particular speaker.

Linking Transitions

Linking transitions encompass all transitions where a word is explicitly linked to the following one, usually either in order to facilitate pronunciation or to group chunks of words together into intonational phrases. Amongst these transitions are all types of assimilation and elision (including h-dropping, extreme vowel reduction and cases of contraction), as well as liaison and other types of linking that serve to avoid a hiatus.

When analysing assimilation, we need to distinguish between two different types, anticipatory and perseverative. The former, where a segment at the end of a word adopts features of the segment at the beginning of the next word, is usually said to be the more common one in English, but this is probably only true of the number of potential types, rather than of actually occurring tokens. Classic examples of anticipatory assimilation are constructions such as that case or might be, where the final (voiceless) alveolar plosive in the first word changes its place of articulation in preparation for producing the following velar or bilabial plosive respectively, thus yielding [ðakkeɪs] and [maɪpbiˑ]. Perseverative assimilation involves the copying of one feature of a (usually) stronger final consonant to the next initial one, such as in is that, realised as [ɪzzat], or in the, realised as [ɪnnə], where the weaker dental fricative assimilates to the stronger alveolar fricative or the nasal (plosive) respectively, despite the fact that the latter is relatively weak itself. A special type of assimilation, where two consonants are said to influence each other, is (yod-)coalescence, which we can observe in realisations like [wʊdʒʊ] for would you. Further common examples of this are to be found in constructions like could you, did you, made you, etc.
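The three types of assimilation just described can be illustrated with a small rule table in Python. This is a minimal sketch, not a full phonological model: it covers only the segment pairs mentioned in the examples above, assumes that each word is already given as a non-empty list of transcription symbols, and uses hypothetical names (ANTICIPATORY, PERSEVERATIVE, COALESCENCE, link) invented for the illustration.

```python
from typing import Dict, List, Tuple

Pair = Tuple[str, str]  # (final segment of word 1, initial segment of word 2)

# Anticipatory: the final alveolar plosive adopts the place of the next plosive.
ANTICIPATORY: Dict[Pair, str] = {
    ("t", "k"): "k", ("t", "g"): "k",
    ("t", "p"): "p", ("t", "b"): "p",
}
# Perseverative: the initial dental fricative copies the preceding consonant.
PERSEVERATIVE: Dict[Pair, str] = {("z", "ð"): "z", ("n", "ð"): "n"}
# Coalescence: both consonants merge into a single new segment.
COALESCENCE: Dict[Pair, str] = {("d", "j"): "dʒ"}


def link(word1: List[str], word2: List[str]) -> str:
    """Join two non-empty segment lists, applying at most one rule at the boundary."""
    first, second = list(word1), list(word2)
    pair = (first[-1], second[0])
    if pair in COALESCENCE:
        first[-1] = COALESCENCE[pair]    # final consonant is replaced ...
        second = second[1:]              # ... and the initial glide disappears
    elif pair in ANTICIPATORY:
        first[-1] = ANTICIPATORY[pair]   # the change affects the first word
    elif pair in PERSEVERATIVE:
        second[0] = PERSEVERATIVE[pair]  # the change affects the second word
    return "".join(first + second)


if __name__ == "__main__":
    print(link(["ð", "a", "t"], ["k", "eɪ", "s"]))  # that case
    print(link(["ɪ", "z"], ["ð", "a", "t"]))        # is that
    print(link(["w", "ʊ", "d"], ["j", "ʊ"]))        # would you
```

Keeping the three rule sets in separate tables makes the direction of influence explicit: anticipatory rules rewrite the end of the first word, perseverative rules the beginning of the second, and coalescence both.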

Assimilation itself is often a precursor to elision and may thus facilitate the linking process by providing the basis for eliminating ‘duplicate’ consonants, as in a reduced form of the previous example of that case, where it is quite possible that this gets further reduced to [ðakeɪs], although the ‘dropped’ plosive may well be replaced by a ‘glottal onset’ to the remaining one, as in [ðaʔkeɪs]. Cases of h-dropping usually result in the insertion of linking glides, such as in [aɪjav] as a realisation for I have or, coupled with extreme vowel reduction, may provide the basis for a contraction, as in [aɪv]. Both, however, are only possible if have is either used as a full verb, rather than an auxiliary, or unstressed.

Linking glides, similar to the one replacing the glottal fricative in the example of I have above, occur in such common constructions as I am ([aɪjam]), how old ([haʊwəʊld]), etc., in places where a speaker avoids a break between two vowels (a hiatus) by ‘bridging’ them with the help of a semi-vowel. A similar thing happens in the well-known example of an intrusive r in the compound drawing room, realised as [dɹo:rɪŋru:m], where it would perhaps seem more likely, but less ‘idiomatic’, that a speaker could have used a w-glide instead.
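The bridging of a hiatus by a glide can be sketched in the same way. The examples in the text suggest that the glide matches the end point of the preceding vowel: [j] after a vowel ending in a close front quality, [w] after one ending in a close back quality. The vowel sets below are an assumption made for the illustration and are far from complete, and the function name bridge_hiatus is hypothetical. Intrusive and linking r are deliberately left out.

```python
from typing import List

# Assumed (incomplete) vowel sets; the text itself only gives the
# examples 'I am' and 'how old'.
J_TRIGGERS = {"aɪ", "eɪ", "ɔɪ", "iː"}   # end in a close front quality
W_TRIGGERS = {"aʊ", "əʊ", "uː"}         # end in a close back quality
VOWELS = J_TRIGGERS | W_TRIGGERS | {"a", "e", "ɪ", "ʊ", "ə", "ɒ", "ʌ"}


def bridge_hiatus(word1: List[str], word2: List[str]) -> str:
    """Join two non-empty segment lists, inserting a glide between adjacent vowels."""
    last, first = word1[-1], word2[0]
    glide = ""
    if first in VOWELS:          # a hiatus is only possible before a vowel
        if last in J_TRIGGERS:
            glide = "j"
        elif last in W_TRIGGERS:
            glide = "w"
    return "".join(word1) + glide + "".join(word2)


if __name__ == "__main__":
    print(bridge_hiatus(["aɪ"], ["a", "m"]))             # I am
    print(bridge_hiatus(["h", "aʊ"], ["əʊ", "l", "d"]))  # how old
```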

Two more, and somewhat similar, linking features are the use of linking rs and cases of liaison. In the latter, the coda consonant of the final syllable of the first word is resyllabified and thereby turns into the onset of the following word, which would otherwise begin with a vowel. The same happens with linking rs, the only difference being that these normally appear only in writing and are not realised when the word is spoken in isolation. However, when analysing different accents, we need to bear in mind that this concept is only applicable to non-rhotic accents, since rhotic accents realise the r in all positions anyway.

Non-Linking Transitions

For most native speaker accents of English, it is quite natural to perform linking when the necessary conditions for it are given. Whenever we choose not to perform any linking in a position where this would be possible, it may therefore be due to our wanting to achieve a certain effect. Often, this effect is to highlight or emphasise specific words in an utterance or to mark a minor or major tone group boundary. Since final consonants in English tend to be considerably weakened by default or not fully released, the final release of a consonant and the fact that we are suppressing the option for e.g. assimilation often signals that we want to emphasise a particular item. Let’s go back to our previous example of that case. If we choose not to assimilate the final consonant in that, but instead produce an alveolar plosive and also give it a full release, our interlocutor(s) will usually be able to assume that we are using that contrastively here to single out a particular instance, rather than just any other one. If we lengthen a final vowel or consonant, we tend to indicate a pause before it actually occurs, thereby potentially giving our interlocutor an early signal that we may yield the floor to them. Conversely, as in the case of the hesitation marker, we may be signalling to our interlocutor that we are still planning what we want to say, but also that we do not want to yield the floor to them yet, so sometimes it isn’t quite obvious which one of the conflicting signs we may want to give.

Glottal stops are features that, by their very nature, interrupt the flow of speech to some extent. They may thus represent a sign of what is often referred to as ‘lack of fluency’ in non-native speakers, but they are also typical of, and not necessarily disfluent in, specific accents, such as the speech of most younger British speakers. Whether we regard them as features of real disfluency is therefore often a matter of interpretation and needs to be judged relative to the overall speech rate of the particular speaker.

Sources & Further Reading:

Laver, J. 1994. Principles of Phonetics. Cambridge: CUP.

Weisser, M. 2001. A Corpus-Based Methodology for Comparing and Evaluating Native and Non-Native Speaker Accents. PhD thesis, Lancaster University.