Morphology

Morphology (from Gr. μορφή [morfɪ] = shape/form + λόγος [loɣɒs] = word/speech/account) in the linguistic sense – as opposed to e.g. in biology or other natural sciences – is the study of word forms. It tries to explain how existing forms have come into existence, or how new ones can be constructed using the different elements a language puts at our disposal in terms of its (lexico-)grammar. And although I’ve already used the term word, assuming that we all more or less have an idea what this may be, based on our training in grammar, or simply because we take for granted that we know what words are from our daily-life experience, this concept is by no means as clearly defined as we might think. Thus, it’s important to first clarify a little what exactly we may mean by a ‘word’.

What Is a Word?

As we all know that we can generally trust dictionaries, let’s begin our exploration by looking at some general dictionary definitions of the entry for ‘word’. The Longman Dictionary of Contemporary English (4th edition) really isn’t very specific, and simply defines a word as “the smallest unit of language that people can understand if it is said or written on its own”. The Collins Cobuild Learner’s Dictionary (Concise Edition) is at least a little more precise, stating that “A word is a single unit of language that can be represented in writing or speech. In English, a word has a space on either side of it when it is written.”

However, when we examine both these definitions more closely, we realise that they don’t really tell us enough, or are at the very least somewhat problematic. For instance, both definitions would also technically encompass the number 3 in its non-spelt-out form, as well as the letters A or B, for example, but would we really want to consider these words? And, regarding the Cobuild one, how would we categorise what most of us would clearly consider words if they either appear at the very beginning of a text/paragraph or at the end of every sentence or before a punctuation mark, such as the ones highlighted in the following example?

This is a short sample of words, where some fit the above definitions to some extent, but at other times, they don’t. Let’s see whether you can easily identify why...

These would clearly not fit the definition of a word having to be surrounded by spaces because the first one wouldn’t have a space preceding it, whereas the others wouldn’t have any immediately following them. Furthermore, none of these definitions tell us anything about the actual composition of words, either, i.e. that they normally consist of one or more letters of the alphabet (or characters) in writing, and one or more sounds (phonemes, or phones, to be precise) in spoken language. The issue of surrounding spaces in writing is also clearly not applicable to languages such as Chinese, which makes it much more difficult for human learners or computers to determine where the individual words in a ‘sentence’ begin or end. And in all languages, when they’re spoken, words are generally not separated clearly from each other at all, unless spoken by a robot, or whenever there is a genuine prosodic break indicating a pause or the end of a functional chunk of speech.
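To make the problem with the ‘space on either side’ criterion more tangible, here is a minimal Python sketch that applies the criterion literally to a short sentence. This is only my own illustration, not something the dictionaries propose, and the function name space_bounded_words is made up for the purpose.

```python
def space_bounded_words(text):
    """Return only the stretches of text that have a space on BOTH sides."""
    pieces = text.split(" ")
    # The first and the last piece lack a space on one side, so a literal
    # reading of the definition excludes them.
    return [piece for piece in pieces[1:-1] if piece]


sample = "This is a short sample of words, where some fit."
print(space_bounded_words(sample))
```

Under this literal reading, This and fit. are not counted at all, and the comma stays glued to words, which is presumably not what any of us would want from a definition of ‘word’.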

However, even if we add the criterion that text/paragraph-initial words can be bounded in other ways, the above definitions still make it look as if a word always has to consist of a single such unit, which, especially with regard to English, doesn’t really work. To understand this better, let’s do a little exercise.

Look at the list of expressions in the box below and try to identify their meaning.
Google for the expressions to see whether you identified the correct meaning, or to see whether there are multiple meanings possible.
Do you think the examples constitute one word or two? Justify your answers to the best of your abilities next to each word or group of words.
Next, think about whether your perception may change if you add another word after the expression. Also think about whether you’d consider the resulting expression one, two, or three words.

Apart from the above issues in deciding what is a word in the first place, many words also have a number of variant grammatical or contextual forms. Thus, for nouns, we usually make a singular/plural distinction, for verbs, we may distinguish between 1st/2nd/3rd person singular/plural or at least 3rd vs. non-third person singular/plural, etc. We’ll discuss these features and their significance in more detail when we talk about individual word classes further below.

Another example of a variant form can also be seen in the short paragraph we looked at above, where the contracted form Let’s contains a reduced (clitic) form of the pronoun us that has been ‘fused to’ the verb. Here, two words have actually been combined into what looks like one, and which many people may be tempted to count as one as well, simply because they may adopt the same criteria as our dictionaries. To be more precise in our description, we should therefore in many cases avoid talking about words only, and rather use the term word forms instead, as well as constantly remain aware of the potential issues in defining the constructs we’re discussing.

If, in contrast, we refer to a particular word as a concept, we usually use a canonical/citation form (in English, singular for nouns, infinitive for verbs) to represent it. This form is also sometimes referred to as a lemma or lexical entry, especially in the context of dictionaries. There are two linguistic conventions that allow us to indicate that we’re talking about the conceptual nature of a word, one that you’ve already seen me use above, where I’ve enclosed words in single quotes (‘...’), and another where we use (small) capital letters to represent the word, e.g. WORD. The latter is often used to represent all potential forms of a lemma, e.g. BE to represent all the variant forms associated with the citation form be, i.e. be, am, are, is, was, were, being & been. You may have noted above, too, that I’ve shown the words I’ve been discussing in italics. This is yet another of the conventions employed in linguistics that you should get used to soon, and basically marks the word as being a sample of the language discussed – in our case, English – and not part of the description itself. Having discussed and hopefully also developed a better understanding of what a word is, we can now move on to investigating the parts words may be made up of.
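The relationship between a lemma and its word forms can also be sketched as a simple lookup table. The following Python fragment is only an illustration under my own assumptions: the mini-lexicon LEXICON and the function lemma_of are hypothetical names, and the entries are limited to the examples mentioned above.

```python
# Hypothetical mini-lexicon: each lemma (written in capitals) is mapped
# to the word forms that belong to it.
LEXICON = {
    "BE": ["be", "am", "are", "is", "was", "were", "being", "been"],
    "WORD": ["word", "words"],
}


def lemma_of(word_form):
    """Return the lemma a word form belongs to, or None if it is not listed."""
    for lemma, forms in LEXICON.items():
        if word_form.lower() in forms:
            return lemma
    return None


print(lemma_of("were"))   # BE
print(lemma_of("Words"))  # WORD
```

A real dictionary would of course need far more entries, and some word forms may belong to more than one lemma, which this small sketch doesn’t attempt to handle.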

The Morpheme

As you’ll hopefully have come to realise through the exercise we did earlier and on the previous page – especially when we looked at the reverse-sorted frequency lists – and also through my explanations above, words may be made up not only of single items, but also of multiple individual word-like components or even smaller parts that somehow contribute to their overall meaning. These smallest sense-bearing units are referred to as morphemes in linguistics, and are conventionally marked up in curly brackets ({}).

To illustrate the possible relationship between words and morphemes more clearly, let’s take a look at how we can categorise single- and multi-component words according to their number of morphemes. Words that consist of a single morpheme are referred to as mono-morphemic, and are often, though definitely not exclusively, function words, i.e. words that tend to fulfil more of a grammatical than lexical function. In other words, they bear little or relatively indefinite meaning. Some examples of these would be {a}, {the}, {this}, {be}, {to}. If we do find mono-morphemic content words, i.e. words with deeper lexical meaning, these very frequently tend to express some of the more basic every-day life vocabulary and concepts, like {day}, {night}, {month}, {year}, {sun}, {moon}, {summer}, {winter}, {earth}, {sky}, {house}, {car}, {food}, {drink}, {water}, {eat}, {sleep}, {drink}, etc., all in their uninflected forms. Many of these have been a part of the English language more or less since the Old English period or maybe even before.

Words that contain more than one morpheme are called poly-morphemic, and may be composed of items that can represent individual words in their own right, as in e.g. {ice} {cream}, {light}{house}, {olive} {oil} (two morphemes, two words each), or one word form that can represent either an existing word or a slightly modified variant of it + one or more forms of inflection that may be attached to the beginning and/or end of the original word form, e.g. {eat}{s}, {high}{er}, {low}{est}, {un}{happy} (two morphemes each, but not two words), {in}{cred}{ible}, {im}{poss}{ible} (three morphemes each, but not three words). It’s very important to understand, though, that, as the last two examples have illustrated, morphemes do not need to look exactly like words, even if they’re not inflections, just as it is important not to confuse morphemes with syllables, which represent phonological units. To try and understand this distinction, let’s do another exercise. As syllables are at least as difficult to define properly as words, we’ll just use a working definition of the syllable, where it optionally consists of one or more initial consonants, followed by a vowel, plus one or more optional consonants. If you’re interested in a somewhat more in-depth discussion of syllables, you can also take a look at the syllables page of my Phonetics & Phonology course.

Analyse the words from above, trying to split them into the individual syllables by adding a dot in the appropriate places in the orthographic transcriptions of the word, which are represented in angle brackets. For comparison, the original morphemes are still indicated here, too.

Free vs. Bound Morphemes

As we’ve seen above, when we combine morphemes to produce new word forms, we can either combine existing words or elements that are not meaningful if they occur in isolation. Morphemes that can exist independently and meaningfully are referred to as free morphemes, all other morphemes as bound morphemes. The latter, just like the mono-morphemic function words introduced above, usually tend to carry a grammatical or meaning-distinguishing/diversifying function. We’ll discuss these in more detail when we talk about word classes in the next section, so for now, I’ll just present some of the more salient examples briefly.

Bound morphemes with a clearly grammatical function are for instance the inflectional morphemes {tall}{er}, {be}{tter}, or {tall}{est}, {be}{st}, which allow us to indicate degree on adjectives, i.e. to produce comparative or superlative forms, as well as the singular/plural/person/aspect markers on nouns and verbs.

Some other bound morphemes allow us to change the word class by attaching them to the end of a morpheme, or to ‘invert’ the meaning or indicate repetition by attaching them in front of it, e.g. {un}{fortun}{ate}, {im}{poss}{ible}.

A special case are the so-called cranberry morphemes, e.g. {cran}{berry}, {rasp}{berry}, {straw}{berry}, or {re}{ceive}, {per}{ceive}, {con}{ceive}. Here, we find what may superficially look like a free morpheme attached to the beginning of what clearly is a free morpheme (in the {berry} examples), or following another highly productive bound morpheme (in the {ceive} examples). Some people only refer to the most clearcut cases in the above examples as true cranberry morphemes, i.e. those cases where clearly no word such as *cran or *ceive (the * conventionally indicates unacceptable forms in linguistics) exists independently. However, even in the case of strawberry or raspberry, where we do find the independently occurring words straw and rasp in English, these words appear to have nothing in common with the meaning that they help produce in their function as cranberry morphemes, so, at the very least, we have to assume that they may have had a different meaning historically, which has since been lost and become opaque, or that we cannot trace their origin at all. In the latter cases containing the morpheme {ceive}, we can at least trace the origin back to the Latin word capere, meaning ‘catch’, although even this may not really help us much in interpreting the overall meaning of the resulting two-morpheme words.

When looking at the three categories above, we can make another important distinction, that between grammatical morphemes (the first category) and lexical morphemes (the other two categories), where the latter contribute to the lexical meaning of the words, while the former contribute to their grammatical functions.
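The two distinctions introduced in this section, free vs. bound and grammatical vs. lexical, are independent of each other, which can be made explicit in a small data structure. The following Python sketch is only one possible way of recording this. The class name Morpheme, its fields and the function describe are my own hypothetical choices, and the classifications simply follow the examples given above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Morpheme:
    form: str
    free: bool         # True if it can occur as a word on its own
    grammatical: bool  # True for inflections, False for lexical morphemes


# Segmentations and labels taken from the examples in this section.
TALLER = [Morpheme("tall", True, False), Morpheme("er", False, True)]
UNHAPPY = [Morpheme("un", False, False), Morpheme("happy", True, False)]
CRANBERRY = [Morpheme("cran", False, False), Morpheme("berry", True, False)]


def describe(morphemes):
    """Return a readable summary of each morpheme's two properties."""
    parts = []
    for m in morphemes:
        status = "free" if m.free else "bound"
        kind = "grammatical" if m.grammatical else "lexical"
        parts.append("{" + m.form + "}: " + status + ", " + kind)
    return "; ".join(parts)


print(describe(TALLER))
print(describe(CRANBERRY))
```

Keeping the two properties in separate fields reflects the fact that a bound morpheme like {un} or {cran} can still be lexical, while {er} is both bound and grammatical.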

‘Morpheme Anchors’

As we’ve already seen in some of the examples above, the free morphemes that we can attach other free or bound morphemes to don’t always need to be identical in shape to existing words, or sometimes a final word (form) may consist of multiple morphemes added to the ‘front’ or ‘back’ of a free morpheme. To be able to distinguish between the different parts of the word form and which shapes or ‘contributions’ they may make, we can distinguish between different types of what I call ‘anchors’. The most general term for this type of ‘anchor’ is base, which essentially includes anything we can add to. Two more specific terms are root, a base that cannot be further analysed and therefore represents a minimal unit, and stem, which is a base that may already have another morpheme attached to it. Examples for the latter two are:

root: {able} (in capable, notable, doable), {ible} (= able in possible, credible); {late} (in translate, relate); {fer} (in confer, refer), etc.
stem: possible (in impossible), fortunate (in unfortunate, fortunately), etc.

Please note that, in some cases, it may not always be clear which part of a word should be considered the root element, especially not when two free morphemes are combined. As a rough rule of thumb, though, you can generally assume that the root belongs to a major part of speech (PoS), such as noun, verb, or adjective, and has something attached to it at the end, and that any other morphemes usually get added to the front of the base last. We’ll soon learn about some of the reasons for this.

Analyse the following words, splitting them into their individual morphemes: inaudible, telephones, distasteful, hyperactivity, predetermined, individual. Also try to see whether you can determine the order in which the morphemes may have been joined together, whether they are grammatical/lexical, and maybe even whether you can tell which language they’ve come from. An example of how you can present this is provided for the first word.

Allomorphs

We’ve previously already noted that morphemes can take on different shapes when they combine with other morphemes. These variant shapes are called allomorphs (Gr. άλλος = other), and are actually fairly common, as you’ll be able to see from the examples below.

plural {s}: {cat}{s} vs. {watch}{es} vs. {sheep} vs. {ox}{en}
third person singular {s}: {sit}{s}, {catch}{es}
‘past participle’ {ed}, {en}: {look}{ed}, {tak}{en}
‘negation morpheme’: {in}{cap}{able}, {im}{polite}, {un}{believ}{able}

Often the shape of different allomorphs is conditioned phonologically, i.e. has (originally) been changed in order to facilitate pronunciation in different contexts. A good example of this is the word impolite above, where the original negation morpheme {in} – actually an allomorph of {un} in this case – is replaced by {im} due to the nature of the initial <p> in polite through a process called anticipatory assimilation, where we ‘adapt’ a consonant sound to the one that follows it in order to make it easier to pronounce the two sounds in sequence. Phonologically, we can represent this as: /ɪn/ + /pəlaɪt/ → [ɪmpəlaɪt]. You may also have noted that I enclosed the letter p in angle brackets; this is yet another convention, used to distinguish written forms – called graphemes – from sounds or morphemes.
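The phonological conditioning just described can be approximated by a very simple rule. The Python sketch below is a rough illustration under two assumptions of my own: it uses spelling as a stand-in for sounds, and it extends the rule from <p> to the other bilabial consonants <b> and <m> (as in imbalance or immortal), which the discussion above doesn’t cover. The function name negate_with_in is hypothetical.

```python
def negate_with_in(base):
    """Attach the negation morpheme {in}, choosing {im} before p, b or m."""
    # Assumption: the first letter stands in for the first sound; a proper
    # analysis would work on phonemes rather than on letters.
    prefix = "im" if base[0].lower() in "pbm" else "in"
    return prefix + base


print(negate_with_in("polite"))   # impolite
print(negate_with_in("capable"))  # incapable
```

Note that the sketch only chooses between the allomorphs {in} and {im}; it cannot tell us whether a given base takes {in} or {un} in the first place, as this is not predictable from the sounds alone.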

Sources & Further Reading:

Bauer, L. (1983). English Word-formation. Cambridge: CUP.

Plag, I. (2003). Word-Formation in English. Cambridge: CUP.