'As entertaining as it is engrossing' John Banville 'Enlightening, delightful' Arthur der Weduwen, author of The Library Why don't eleven and twelve end in –teen? The rest of our counting system sits in neatly arithmetical sets of ten, so why do these two rulebreakers seem so at odds with the numbers that follow them? Admittedly, that's probably a question that might never have occurred to you. But if you're even remotely interested in the origins and oddities of language, it's likely also a question you're now intrigued to know the answer to. Nor is it the only question: take a moment to think about how our language operates and even more spring to mind. Why do these letters look the way they do? Why are some uppercase and others lowercase? Why are these words in this order? How are you understanding what these seemingly arbitrary shapes and symbols mean, while doubtless hearing them read to you in a voice inside your head? And what is this question mark really doing at the end of this sentence? Books explaining the origins of our most intriguing words and phrases have long proved popular, but they often overlook the true nuts and bolts of language: the origins of our alphabet and writing system; grammatical rules and conventions; the sound structure of language; and even how our brains and bodies interpret and communicate language itself. Why Is This a Question? is a fascinating and enlightening exploration of linguistic questions you've likely never thought to ask. 'Every page will make you stop, think and wonder.' James Hawes, author of The Shortest History of England 'Enthralling, with a riveting "who knew?" moment on nearly every page.' Caroline Taggart, author of Humble Pie and Cold Turkey
Page count: 446
Year of publication: 2022
‘An enlightening, delightful book that will make you question every sentence you’ve ever read or written. Filled with colourful anecdotes, surprising explanations and arresting facts, Jones covers an admirably vast range of linguistic history and grammar. A must read for anyone interested in how we use language, and how it shapes who we are.’ Arthur der Weduwen, author of The Library: A Fragile History
‘Every page will make you stop, think, and wonder.’ James Hawes, author of The Shortest History of England
‘Enthralling, with a riveting “who knew?” moment on nearly every page.’ Caroline Taggart, author of Humble Pie and Cold Turkey
‘Deft, informative, and packed with fascinating morsels.’ Lev Parikian, author of Light Rains Sometimes Fall
For my mam and dad
Every fact of every language, in the view of the linguistic student, calls for his investigation, since only in the light of all can any be completely understood. To assemble, arrange, and explain the whole body of linguistic phenomena, so as thoroughly to comprehend them, in each separate part and under all aspects, is his endeavour.
William Dwight Whitney, Language and the Study of Language (1867)
Preface
Introduction: What Is the English Language?
Q. 1 What Is a Word?
Q. 2 What Is a Language?
Q. 3 Where Do Languages Come From?
Q. 4 Where Do Words Come From?
Q. 5 What Is the Hardest Language to Learn?
Q. 6 Why Do Languages Have Genders?
Q. 7 Where Do Our Number Names Come From?
Q. 8 Why Is the Alphabet in ABC Order?
Q. 9 Why Do We Have Vowels and Consonants?
Q. 10 Why Do We Capitalise I?
Q. 11 Why Does i Have a Dot?
Q. 12 Why Is Q Always Followed by U?
Q. 13 Why Do We Have Double Letters?
Q. 14 Does Anything Rhyme with Orange?
Q. 15 Why Do Our Words Go in the Order They Do?
Q. 16 How Do We Read?
Q. 17 How Do We Speak?
Q. 18 How Do We Understand?
Q. 19 Why Is This a Question?
Q. 20 Why Do We Use Our Hands When We Talk?
Q. 14 Puzzles: The Solutions
References
Acknowledgements
Index
Everyone is interested in language. I honestly believe that. It’s one of the few things that connects us all – every culture on Earth has a language – and it takes only a glimmer of introspection to find what we say and why we say it interesting.
I noticed this recently, sitting in a pub one Saturday afternoon with a non-linguist friend of mine. He had spotted something I had recently posted on Twitter: the letter A is a tiny millennia-old drawing of an ox’s head, snout upwards, and if you trace its history back through time, you’ll find it comes from an Egyptian hieroglyph. (The full story behind that comes on page 130.)
‘Is that really true?’ he asked. I get that a lot on Twitter – it’s almost as if there’s a lot of disinformation on there.
‘Yeah,’ I replied. ‘Snout at the top, horns at the bottom. Quite a lot of our letters come from hieroglyphs, actually.’
‘Really? That’s amazing.’
‘I know, right?’ I beamed. I was breaking through to the hardest of non-linguistic hearts. ‘Like, M looks the way it does because it comes from the hieroglyph for water, which was a sort of wavy, zigzaggy line.’ I drew a jagged line of mmmms in the air with my finger.
‘Wow.’ A pause. ‘That’s honestly amazing.’ He looked down at his phone, at the picture I’d tweeted showing the letter A’s gradual evolution from two-horned ox to two-legged triangle. Another pause. ‘It’s your round, by the way.’
Admittedly, yes, it was a short-lived moment of introspection, quickly brought to an end by an empty pint glass. But, hey, I’ll take it. Confirmation of my theory that there really is something about this that unites us all in shared curiosity.
Personally, it’s been over twenty years since I first became fascinated by the science of language. In an English class on my first day at college, I still remember the slow realisation that this was an English lesson, but not as I knew it. Shakespeare and Steinbeck remained steadfastly on their shelves, and in their place out came syntax trees, truth tables, conversational transcriptions, diagrams of the brain, Old English, Middle English and the phonetic alphabet. I was hooked. I didn’t know then that this would end up being such a big part of my life, of course, and I’d probably not have believed you if you’d told me that it would. But within a few short years I was a postgraduate student of linguistics, looking to follow that well-trodden path into research and academia.
But by the end of that course, I realised something was afoot. In short, I had hated it.
Not the subject, you understand, nor the brilliant, accomplished people I was working with. The problem was a wholesale one. It seemed to me that here was this magnificent subject – the most fascinating subject in the world – being preserved like a museum piece, behind glass and under lock and key, never to be touched or meddled with, and only ever to be shown to the people who had the time or inclination to enter the museum in the first place. I found myself wanting to tell everyone about everything, and how there’s so much more to the study of English – or, rather, to language itself – than most of us will ever learn from school. And yet the opposite seemed to be the norm. It felt secretive and cliquish, not open and collaborative, and I felt increasingly at odds with it.
On my last day, in a final meeting with one of my tutors, I took the plunge. I told her I wasn’t going to take my studies further as planned, and was going to gamble that prospective career in academia, go back to waiting tables, and try to build a career for myself doing what I had, in truth, always promised myself I would do. I was going to write. Away from classrooms and campuses, I was going to write about language, for as many people as possible.
‘A gentleman scholar?!’ she exclaimed, excellently failing to hide her horror. I guess she meant that as a slight. I think I took it as a badge of honour.
And now, here we are. It’s over a decade since I started writing purely for fun about The Most Fascinating Subject in the World™, and in that time I’ve written books and countless articles, given I don’t know how many talks and interviews, and found myself tweeting and blogging to a lively online audience as @HaggardHawks. Along the way, I like to think I’ve edged open those museum doors and made all of this a little more accessible to everyone – no matter their background or expertise, no matter how casual or professional their interest in language, and no matter their (in)ability to read Latin and Greek. This book feels a little like the culmination of that.
It’s been a long time coming; I think I’ve been mentally drafting and redrafting this for about eight years. Back then, I wrote a blog about the origin of the number eleven, and why both it and twelve aren’t listed among our teens. (A more robust version of that story is explained on page 109.) I kept thinking as I wrote it that this was one of those questions we would probably never think to ask, yet as soon as we did, we’d want to know the answer. I filed that thought away, along with the idea to answer several more ponderables like it in a book one day. Why Is This a Question? is now that book.
The chapters that follow answer twenty questions such as this, ranging from the basics of our language – defining our words and languages themselves – through to some of the more infamous quirks of the English language, and finally casting a more philosophical eye over the inner workings of language and human communication. It’s an immense topic, in retrospect, and without a doubt this has been the toughest writing challenge I’ve ever taken on. I’ve always said that distilling any academic subject for a mass audience is a little like walking a tightrope: you don’t want to talk down to people who have an understanding of it already, but you don’t want to lean too far the other way and talk over the heads of everyone who does not. Every sentence here has had to walk that line, and I can only hope my balancing act has worked. From an academic point of view, I hope too that some of the decisions I’ve made to keep technical jargon, symbols and academic conventions to a minimum are the right ones. And from an armchair linguist’s point of view, I hope that despite all the theories, models, studies and experiments, this doesn’t feel too much like a dusty old textbook. A guidebook to a dusty old museum, though – that, I would take.
Viewed freely, the English language is the accretion and growth of every dialect, race, and range of time, and is the culling and composition of all. From this point of view, it stands for Language in the largest sense, and is really the greatest of studies.
Walt Whitman, ‘Slang in America’ (1885)
A few miles from Germany’s border with Denmark lies a grassy patchwork of low-lying hills and lakes called the Angeln peninsula. That name, Angeln, is popularly said to refer to the fishhook-shaped angle at which this broad crook of land juts out from the European mainland to form the westernmost arm of the Baltic Sea. Angeln itself forms one of the northernmost tips of one of Germany’s northernmost states, Schleswig-Holstein, and to this growing list of superlatives we can add one more: this unassuming corner of Europe has inadvertently given its name to a language now spoken by one-quarter of the people on Earth.
Angeln was the homeland and namesake of the Angles, one of the ancient peoples whose arrival in Britain kick-started the development of the English language. England itself is literally ‘Angle-land’, and what you’re reading here is ‘Angle-ish’. But what’s on this page bears little resemblance to anything the Angles themselves would have known and used some fifteen centuries ago. The story of how their language became our language is the story of the English language itself.
So let’s set the scene. The Angles originally occupied much of the territory spanning the modern Danish border. To their north were the Jutes, while to the south dwelled the Saxons, the Frisians, the Franks and countless other tribal groups dotted across the western heartland of Europe. These were all descendants of an even earlier wave of migrants from the Ukrainian steppes, who began settling across Europe and Asia in the third millennium BCE. We’re so far back in history at this point that this early migration was probably sparked by the domestication of horses, a landmark achievement that allowed the burden of long-distance travel to be shared with animals for the first time. No longer bound to lands accessible only on foot, people could now journey much more widely – and, as they did so, these ancient wanderers brought with them their equally ancient language.
No record of that language survives, but just as long-dead creatures can be reassembled from their fossilised remains, historical linguists have been able to reconstruct much of it by unearthing evidence from the languages we use today. Similarities between different languages in the present often point to a common ancestor in the past, and as more of these family parallels are discovered, a more detailed ancestral picture can be drawn.
Through this kind of research, we now know with some certainty how this ancient protolanguage might have sounded and operated, what many of its words might have been, and we can even pinpoint where it first emerged: by combining linguistic evidence with more tangible evidence from archaeology and anthropology, we can retrace its speakers’ steps back across Europe to the northern and eastern shores of the Black Sea, where they and their language first emerged around 6,500 years ago. As their culture advanced and the world opened up as a consequence, these Bronze Age peoples migrated and eventually came to inhabit a vast region extending from the fringes and islands of western Europe to the Indian subcontinent. The language they spoke, ultimately, has come to be known as Proto-Indo-European.
With groups of its speakers now scattered so widely, contact between them naturally diminished. That isolation meant any quirks or local differences in the way each individual group happened to speak were not heard or adopted elsewhere. Over the next 3,000 years or so these differences gradually grew more numerous and accentuated, until the entire Proto-Indo-European language had broken up into a patchwork of regional dialects – each with its own unique local features – spoken everywhere from the beaches of Spain to the Arctic coasts of Russia and the banks of the Ganges. As they continued to diverge, these dialects became sufficiently distinct to be no longer understood by outsiders. Far from being merely different forms of the same mutual language, they had become the foundations of an entirely new set of languages.
In this way, almost every language now spoken across this vast stretch of the globe is a living descendant of this one ancestral protolanguage. Through the Indo-European family tree, English is related not only to its nearest geographical neighbours, including Welsh and Irish, but to the likes of Spanish and Italian, Polish and Albanian, Urdu and Afghan Pashto. Our linguistic ancestors wandered so far, in fact, that you could travel to the foothills of the Himalayas today and hear local Nepali speakers using such familiar-sounding words as naam (‘name’), musa (‘mouse’), patha (‘path’) and dryagana (‘dragon’).
In the area of Europe the Angles came to occupy, Proto-Indo-European initially devolved into a dialect known as Proto-Germanic. But by the first century BCE, this too had begun to break apart as scattered groups of its speakers developed increasingly distinct tongues. On the islands of Denmark and the coasts of Norway and Sweden, a new set of North Germanic dialects emerged; their descendants today include modern-day Danish, Norwegian and Swedish. In central Poland, an East Germanic branch arose, although its offspring (including the languages once spoken by the Goths and the Vandals) are now extinct. And in Germany, the Netherlands and mainland Denmark, a family of West Germanic dialects developed among the major players in our story – the Angles, Saxons and Jutes. Their descendants include German, Dutch, Flemish and Luxembourgish, and had history played out differently it’s likely these would have remained the only major West Germanic languages still in existence. But at this point in our story, all that changed.
In the mid-fifth century CE, many Angles, Saxons and Jutes started to abandon their homes on the mainland and cross the sea to Britain. Quite what compelled them to do so is unclear. The Saxons had been raiding British coasts for 200 years before they began to settle there permanently, so it’s possible their growing knowledge of the island prompted the move. Threats to agriculture, like droughts or flooding, might have proved a factor too, as medieval historians later recorded that Angeln was eventually abandoned altogether. But according to the most famous version of this story, the first Germanic settlers arrived in England for one very good and very simple reason: they were invited.
Britain at that time was only home to around 2 million people (though more conservative estimates put that figure closer to 500,000). Most of these were Celtic Britons, whose ancestors would have been among the islands’ earliest inhabitants. Further north were the Picts and Scots, and for a time Britain was home to a considerable Roman population too, following the emperor Claudius’ invasion in 43 CE. The Romans had introduced Latin, but outside the law and the military, the day-to-day language of many Britons had remained their native Celtic tongue, Common Brittonic. Had the Angles and Saxons never arrived, it’s probable this would have formed the basis of what you’re currently reading.
By the fifth century, however, the Romans were gone. With their empire dwindling and Rome besieged, the troops keeping Britain under Roman rule were now needed closer to home, leaving the cities they had founded to fend for themselves. In the face of recurrent (and, apparently, naked) raids from the north, many quickly began to struggle.
No sooner were they [the Romans] gone than the Picts and Scots . . . hastily landed . . . inspired with the same avidity for blood, and all more eager to shroud their villainous faces in bushy hair than to cover with decent clothing those parts of their body which required it.
St Gildas, On the Ruin and Conquest of Britain (c. 510)
In desperation, the de facto leader of the Britons, Vortigern, sent word to the Continent that mercenaries were needed to bolster his defences, and offered Kent’s Isle of Thanet as payment for all those who came to his assistance. Three shipfuls of fighters, led by two Jutish brothers, Hengist and Horsa, landed at nearby Ebbsfleet in 449. In the months that followed, they clashed repeatedly with the Picts and Scots, reportedly successfully defending the Britons’ interests every single time.
As ever more mercenaries arrived, however, the territory Vortigern had initially offered proved inadequate, and many new arrivals began settling much more widely across the island – with some even bringing their families and possessions with them to start new lives in Britain. This gradual encroachment proved understandably unwelcome to the Britons, and relations between the two sides soured. The situation reached a tipping point in 455, when both Horsa and one of Vortigern’s sons, Catigern, were killed in fighting near the village of Aylesford in Kent. In response, Catigern’s brother Vortimer raised an army and for a time succeeded in pushing the Anglo-Saxons back to the North Sea coast. But Hengist retaliated, boosting his own forces by inviting even more of his countrymen to come and take advantage of ‘the richness of the land’ and ‘the worthlessness of the Britons’. By the end of the decade, he had established himself as ruler of the now Jutish kingdom of Kent. Further north and west, more and more Germanic settlers were arriving on British soil. The Anglo-Saxon invasion had begun.
In a final attempt to offset further bloodshed, a summit was arranged in 472, at which it was hoped a peaceful compromise could be brokered. All those in attendance agreed to arrive unarmed as a show of good will, but midway through proceedings that promise was broken. Drawing swords from inside their robes, the Anglo-Saxons launched a surprise attack on the Britons, killing all except Vortigern and one of his earls. This single act of duplicitousness – the so-called Treachery of the Long Knives – effectively ended Celtic rule in Britain and left a vacuum of power the Anglo-Saxons were quick to fill.
Vortigern’s short-sightedness in effectively inviting a superior fighting force to the table has since become the stuff of legend, with much of this blood-spattered tale now considered a myth concocted later to make the founding of Anglo-Saxon England a more dramatic affair. Whether true or not, by the turn of the century, the Celtic hold on Britain was undoubtedly weakening. The Angles now controlled much of northern and eastern England; the Saxons ruled over the Midlands and south, and the Jutes maintained their south-east corner, alongside a second territory at Hampshire. For their part, the Britons made several attempts to reclaim their lands, but none had lasting success and many simply merged into the Anglo-Saxon way of life. Others retreated as their power and status dwindled, drifting westwards into Wales and Cornwall, northwards into Cumbria and the Borderlands, and southwards, across the Channel to Brittany. It is this retreat that remains at least partly responsible for the strongholds of Celtic language and culture that endure here today.
With the Anglo-Saxons now in charge of their ‘Angle-land’, the West Germanic language they had brought with them naturally became the principal language of ancient Britain. But with its speakers now cut off from their continental cousins, local differences again began to emerge that soon formed the foundations of an entirely new ‘Anglish’ language. The success of the Anglo-Saxon invasion therefore marks the beginning of our language’s history – but, even by this point, the language of early England would scarcely have resembled anything on this page.
For one thing, the earliest Old English texts were written in runes, not the Latin alphabet we use today. Initially, English maintained many of the complex grammatical features of its Germanic ancestor too, dividing its words into genders, and using an intricate system of word endings to flag the grammatical roles of the words in its sentences. When it comes to the development of our language, the Anglo-Saxon invasion might have placed the pieces on the board, but the game itself was just beginning.
So what happened? Where did our runic letters go? Why did the Latin alphabet replace them? And what happened to our gendered vocabulary? It is questions such as these, which are so seldom asked, that this book seeks to answer – but to do so addresses only one part of the story.
Yes, English no longer classifies its words into genders – but why do so many other languages continue to do so? Yes, our alphabet has long since replaced our Germanic ancestors’ runes – but where did these two different writing systems come from in the first place? Take a further step back from questions like these, and you might find yourself contemplating the true nuts and bolts of language. Why do different languages exist at all? Why do the letters you’re reading here look the way they do? How do they communicate what we want them to? As you read this sentence, how are you understanding what these seemingly arbitrary symbols mean? And, while we’re on the topic, what even is a question? Or a language? Or, for that matter, a word?
Our use of words is generally inaccurate and seldom completely correct, but our meaning is recognised none the less.
St. Augustine, Confessions (c. 397–8)
Have you ever been asked what a word means, but found yourself utterly unable to explain it? I can still remember the look of panic on my schoolteacher’s face when a boy in my class asked her what grace was (and why Mary was so full of it). And then there was the friend of mine whose endlessly curious three-year-old asked him what depth meant while he was filling her paddling pool during a lazy summer barbecue, and in doing so instantly bamboozled every adult in earshot. Some words, it seems, are just difficult to define. We know what they mean, and can use them without a second thought – but try putting that meaning into words and it’s hard not to resort to little more than a string of synonyms. ‘Depth? Well, it’s just depth, isn’t it? Like . . . deepness.’
When it comes to defining our indefinables, one of the great ironies of language is that the word word is one of them. It might not seem as though it should prompt the same navel-gazing as something like grace, and if someone were to ask you what a word was you’d probably be able to give them a fair idea. (‘A word? Well, it’s a word, isn’t it? Like . . . a little bit of language.’) But in practice, words are surprisingly difficult to pin down, and practically every test or definition devised to do so quickly comes unstuck.
One common explanation is that words are everything found between spaces in writing. That’s certainly how word-counting computer programs operate, and a glance over this page might make it seem a reliable yardstick. But how would you count the first word in the previous sentence, that’s? Is that one word or two?
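To make that whitespace rule concrete, here is a minimal sketch in Python of how a naive word counter might work. The function name count_words is hypothetical, and real word processors may well apply more elaborate rules; the point is only that a counter splitting on spaces has no way of seeing inside a contraction or joining up a pair of words.

```python
def count_words(text):
    """Count whitespace-separated chunks, as a naive word counter might."""
    # str.split() with no arguments splits on any run of whitespace.
    return len(text.split())


# A contraction counts once; a two-part compound counts twice.
for sample in ["that's", "ice cream", "That's certainly how it works"]:
    print(repr(sample), "->", count_words(sample))
```

Under this rule that's is one word and ice cream is two, whatever our intuitions might say.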
Another way of defining it is that when extra material is added to a sentence, additional words will always fall between, not inside, those already there. So The owl and the pussycat went to sea could become The wise owl fledglings and even the aloof pussycat quickly went back to the sea. We’d certainly never find ourselves talking about an o-wise-wl or a puss-aloof-ycat, but any definition assuming words can never be infixed like this is abso-bloody-lutely flawed.*
Broad rules of thumb such as these are clearly little use here. A much better starting point is that simple definition from earlier: a word is just a little bit of language. As throwaway as that might seem, at first glance it makes sense. We recognise at, first and glance as words, and can read them here as individual ‘bits’ of language. But it’s explaining precisely what these ‘bits’ are that proves difficult, because as it stands that definition could easily be misinterpreted. After all, individual letters are just bits of language too, as are individual sounds, punctuation marks, nonsense jumbles of characters, and even whole sentences and paragraphs. To exclude everything that isn’t a word, while including everything that is, we’re clearly going to need some firmer ground rules.
Some are certainly more difficult to explain than others, but all words have a meaning. Adding that requirement immediately cuts out a lot of this excess noise, as letters and sounds have no meaning at all on their own, and sentences and paragraphs go too far the other way – blending multiple smaller units into larger, more meaningful wholes. Calling a word a single meaningful unit of language certainly feels like a more reliable definition, but there’s still a problem: in language, not everything that has a meaning is a word.
owl         owls
pussycat    pussycats
dog         dogs
house       houses
word        words
In English, we typically add an –s onto the end of a noun to create its plural – changing one word into many words, one dog into multiple dogs, and a detached house into a row of houses. We’d scarcely think of that –s as a word in its own right, yet to be capable of creating this kind of change it must have some kind of meaning. Put another way, if a dog is a canine animal, and dogs means ‘more than one canine animal’, then surely –s must be the part that means ‘more than one’. So wouldn’t that make –s too a single, meaningful unit of language?
The problem is that –s is not a word, but a morpheme. Morphemes are the smallest possible meaning-bearing components of a language; the meaning they carry (like the ‘more than one’ meaning of –s) is called a sememe. By definition, morphemes can’t be broken down into anything smaller that likewise has any kind of meaningful content. So while the –s of dogs is a morpheme, the meaningless d– is not.
Confusingly, that definition means many words count as morphemes too. Dog can be broken down only into its individual sounds, ‘d’, ‘o’ and ‘g’, and because they have no meaning on their own, it is a morpheme as well. Dogs, on the other hand, can be split apart – into its singular root, dog (‘canine animal’), plus the plural tag –s (‘more than one’). So, while one dog is a morpheme, multiple dogs are not. That overlap can make morphemes a tricky concept to grasp, but this distinction is an important one: whatever definition of a word we end up with, it will have to include the likes of both dog and dogs, while excluding the likes of –s. Dig a little deeper, however, and we have a neat way of doing just that.
Morphemes play a hugely important role in how our language operates. As well as changing singular words into plurals (dog, dogs), we can use the likes of –ing and –ed to change the tense of verbs (giving us talking and talked from talk), and use –er and –est to expand on our adjectives (making quicker and quickest out of quick). These are known as inflectional morphemes, as they work solely to alter the grammar of whatever root they attach to. Conversely, so-called derivational morphemes work to change the meaning of their roots, and thereby create entirely new words. We can use –less to form words implying an absence of something, like faultless or timeless, or tag anti– onto a word to create its opposite, such as antihero or anticlimax.
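As a rough illustration of that split, the sketch below peels a single affix off a word and reports whether it is inflectional or derivational. Everything in it is an assumption made for the sake of the example: the tiny list of roots and affixes is taken from the words mentioned above, the function name analyse is hypothetical, and simple string matching like this would quickly come unstuck on real English spelling.

```python
# Toy inventories drawn from the examples in the text (illustrative only).
ROOTS = {"dog", "talk", "quick", "fault", "time", "hero", "climax"}
INFLECTIONAL_SUFFIXES = ["ing", "est", "ed", "er", "s"]
DERIVATIONAL_SUFFIXES = ["less"]
DERIVATIONAL_PREFIXES = ["anti"]


def analyse(word):
    """Return (root, affix, kind) for a known root plus at most one known affix."""
    if word in ROOTS:
        return (word, None, "root")
    for prefix in DERIVATIONAL_PREFIXES:
        if word.startswith(prefix) and word[len(prefix):] in ROOTS:
            return (word[len(prefix):], prefix + "-", "derivational")
    # Derivational suffixes are checked first so that '-less' is not
    # mistaken for the plural '-s'.
    for suffix in DERIVATIONAL_SUFFIXES:
        if word.endswith(suffix) and word[:-len(suffix)] in ROOTS:
            return (word[:-len(suffix)], "-" + suffix, "derivational")
    for suffix in INFLECTIONAL_SUFFIXES:
        if word.endswith(suffix) and word[:-len(suffix)] in ROOTS:
            return (word[:-len(suffix)], "-" + suffix, "inflectional")
    return None  # not analysable with this toy inventory


for w in ["dogs", "talked", "quickest", "timeless", "antihero"]:
    print(w, analyse(w))
```

The roots in this sketch can stand alone, while the affixes only ever appear attached to something else.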
Unlike the words they connect to, however, on their own the likes of –s and anti– are lost. Attached to nothing, they mean nothing. You could no more draw someone’s attention to a pack of dogs by shouting ‘–s!’ than you could label yourself ‘anti–’ without there being something to be anti– against. These are bound morphemes – fragments of language whose sememes come to the surface only when they are ‘bound’ to other things. The opposite, like dog and house, are free morphemes, which need no such support. And when it comes to defining a word, this freedom is crucial.
The linguist Leonard Bloomfield defined a word as a ‘minimum free form’ – a single unit of language, smaller than a phrase or a sentence, that is capable of maintaining its meaning on its own. The likes of dog and dogs, owl and pussycat, linguist and definition all pass that test, but on their own, –s and anti– fail it. Demanding a word not only have a meaning but be independently meaningful therefore excludes bound morphemes, like –s, while including everything we can use them to create. It’s an ingenious solution. But where does that leave these?
doghouse        coffee shop
hotplate        ice cream
greenhouse      walking stick
downstairs      toy factory
blackbird       golf ball
loudspeaker     post office
When two or more words come together like this, they form a compound. Despite their twofold structure, compound words represent single concepts; if they didn’t, there’d be no difference between a loudspeaker and a loud speaker.
But while loudspeaker and all the other words on the left here are classed as closed or ‘univerbated’ compounds, united as a single word, those on the right are open compounds, divided by a space.* These too are single concepts, as you cannot drop the golf from golf ball any more than you could claim a house and a doghouse are the same thing. Open compounds are therefore single, independently meaningful units of language built from single, independently meaningful units of language. We can’t split them up, because their meaning relies on both their halves working together. So effectively, we’re back where we started: is coffee shop one word or two? Fortunately, Bloomfield foresaw this problem and devised another ingenious solution. Unfortunately, this time he leads us into increasingly murky water.
As anyone who has ever studied poetry or Shakespeare will know, all words have a natural pattern of stressed and unstressed syllables. Pattern is stressed on its first syllable: PA-ttern. Solution on its second: so-LU-tion. Some longer words have multiple stressed syllables: IN-di-VID-u-al. Yet all words have just one primary stress – that is, one syllable accented above all others. So while both the first and third syllables of individual are stressed, IN-di-VID-u-al, only the third syllable receives the primary stress: in-di-VID-u-al.
In English, unlike other languages, this stress is unpredictable and can fall in different places in different words. But that inconsistency allows it to play a crucial role in how we interpret the words we hear. Written down, a word such as conduct, for instance, is ambiguous, yet read aloud, we can shift its stress back and forth to differentiate between a person’s behaviour (CON-duct) and the act of leading an orchestra (con-DUCT). In compounds, this disambiguating effect is even more obvious. Because greenhouse works as a single unit, it has just one primary stress: GREEN-house. Were its two halves just to happen to fall side by side in a sentence (He lives in the green house opposite the red house), they would operate independently, and so both be stressed: GREEN HOUSE. It’s a subtle difference, but in practice it’s enough for us to distinguish a glass structure for growing plants from a house that just happens to be green. It’s also why you can go downstairs without necessarily going down stairs, and why you wouldn’t need a loudspeaker to listen to a loud speaker. And it’s why only the first of these two statements is true:
A crow is a black bird.
A crow is a blackbird.
Open compounds operate as single units too, and so have just one primary stress. Alter or reassign that stress, and their pairing will no longer function as a single unit but as a two-word phrase. A ball for playing golf is a GOLF ball (open compound), but a grand party for the members of a golf club is a GOLF BALL (two-word phrase). A staff for aiding a walker is a WAL-king stick, but a WAL-king STICK is a pole that’s sprouted legs and ambled away. And while a child could play perfectly safely with a TOY FAC-tory, they would seldom be left unattended inside a TOY fac-tory. Try saying this sentence aloud too:
My French teacher is Swedish.
You probably read French teacher as an open compound there, and understood that sentence as referring to a teacher of French who just happens to come from Sweden. Stress precisely the same pair of words as a two-word phrase, however, and you’ll have a much more factually questionable statement about a teacher from France who is also somehow from Sweden.
Taking stress into account like this proves that open compounds behave just like all our other words, and have one primary stress. As a result, we need to ensure they are included under our definition – but relying on stress alone to decide what’s in and what’s out is problematic, as not all our compound words are quite so keen to follow the rules. How do you say ice cream, for instance? Some people stress its first syllable: ICE cream. Others stress its second: ice CREAM. Some stress both: ICE CREAM. Some people might even switch between the three in different contexts. Under Bloomfield’s stress rules, this would make ice cream a word to some people, a two-word phrase to others, and sometimes a word and sometimes a phrase to everybody else. That’s not a particularly satisfactory conclusion, admittedly. How can we define what a word is if we can’t all agree what is and isn’t a word in the first place?
Perhaps, then, we need to rethink our approach. As soon as we’re forced to consider the stress patterns of individual words, our focus shifts from written language to spoken language. Speech predates writing, of course, as our evolutionary ancestors were talking to one another long before they thought to put pen to paper (or stylus to clay, as the case may be). Rather than bogging ourselves down in the problems posed by letters and spaces on a page, why not take our language back to its roots and consider a word as primarily a set of sounds? The sounds ‘d’, ‘o’ and ‘g’ (or phonetically, /d/, /ɒ/ and /g/) together form the word dog, /dɒg/, which carries the meaning ‘canine animal’. String together the seven sounds in /kɒfɪ ʃɒp/, and you’ll have the single meaningful compound word coffee shop, with its one primary stress, regardless of whether it is written open or closed.
Drawing this line in the linguistic sand gives us this definition: a word is a single independently meaningful unit of language, consisting of a sound or series of sounds, that can be represented as a series of written characters. Focusing on sound first and considering spelling only as an afterthought avoids the issues presented by compound words, while leaving bound morphemes and everything else to one side. Does that solve our problem? In basic terms, yes – if you’re looking for an answer to the question in the title of this chapter, then something along these lines is probably the closest you will come to defining a word in any kind of watertight way. But in broader terms, no – even at this late stage, there are innumerable problems here.
At the beginning of this chapter, we touched on whether ‘that’s’ constitutes one word or two. We still haven’t answered that question, for the very good reason it’s all but impossible to do so conclusively. The same goes for contractions such as dontcha, shoulda, gotcha and imma, which operate as single word-like units in speech, despite being built from multiple individual words smashed together. Consider this too: now we know how bound morphemes work, would anti-dog qualify as a word? How about if someone were to comment on how coffee-shopless their neighbourhood was? You’d doubtless know what they meant, but is that a word?
Here’s a sobering thought as well: at no point here have we stepped outside the cosy confines of English to consider how this definition might fare in other languages. German is well known for its capacity to string multiple elements together to form single word-like units with multifaceted meanings. In 1999, the official Association for the German Language nominated Rindfleischetikettierungsüberwachungsaufgabenübertragungsgesetz as one of its Words of the Year.* Is that a single word, or a conjoined collection of individual words? We’ll come on to other languages in more detail later, but even without their input we’re still in a quagmire here. There is, however, some good news: in the grand scheme of things, none of this really matters.
As much as we might think of the study of language as being the study of words, when it comes to examining and comparing languages, the word word just isn’t a particularly useful one. The reason we’ve ended up discussing the likes of morphemes and sememes here is because labels like these are of much greater value to language study than anything as vacuous and temperamental as word. Sure, when we don’t need to be quite so academically rigorous, having a word like word to throw around is immensely useful. It lets us talk about the words on a page, jot down a few words, have a few words with someone, learn and translate words into a different language, and count the number of words in an assignment or essay. Up to the end of this sentence, you’ll have read 3,025 of them in this chapter, and as something with which to casually divide up our written world, word works just fine. When it comes to trying to pin down its meaning more precisely, however, perhaps we shouldn’t be too surprised that something we use so loosely defies all our attempts to define it.
_____________
* The emphatic insertion of one word inside another is called tmesis; when it happens specifically with profanity, it’s known as expletive infixation. Although it is often explained using examples such as fan-bloody-tastic or in-fucking-credible, in rhetorical terms tmesis can also refer to the emphatic division of the two halves of a pair of words, often for humour as much as emphasis, as in ‘Stop your chit and your chat!’ or ‘I want no hanky, nor panky!’
* It’s often just convention, legibility or personal preference that dictates whether a compound is open or closed. Without its space, postoffice is arguably too much of a jumble to function intelligibly as a single word – though the same was doubtless once thought of compounds such as matchbox and hairstyle, so perhaps post office will follow suit in the future. It’s by no means uncommon for words to migrate from open to closed over time, often as their familiarity increases. After all, today and tomorrow were both once two words, as were the first mailboxes and newspapers, and anyone who has lived through the internet age will have seen the first web sites, e-mails and voice mails become websites, emails and voicemails. Even today some compounds remain in a state of flux, and happily exist in both forms: you’re just as likely to see a sign for a car park as you are for a carpark.
* This sixty-three-letter behemoth refers to a law concerning the labelling of beef; when it was introduced to the state legislature of Mecklenburg–Western Pomerania, the minister responsible for it felt obliged to apologise. The occurrence led to Rindfleischetikettierungsüberwachungsaufgabenübertragungsgesetz earning a place for itself both in the Guinness Book of Records and on the shortlist for the 1999 German Word of the Year. Perhaps understandably, it lost out to das Millennium.
Language is the most massive and inclusive art we know, a mountainous and anonymous work of unconscious generations.
Edward Sapir, Language (1921)
In 2018, a team of researchers studying orangutans in the forests of Indonesia made a remarkable discovery. They had devised an experiment in which one of the team would be disguised as a threat – draping themselves with a blanket of tiger stripes, for instance – and crawl on all fours across the jungle floor, beneath where a mother orangutan and her young were sitting in the branches. Understandably, in most cases the mothers reacted swiftly, raised an alarm and retreated further into the treetops. But, sometimes, they waited. In fact, sometimes as much as twenty minutes would go by before the mother finally produced her alarm call, long after the test was over and the threat had disappeared.
At first, the researchers were baffled as to why some of the mothers appeared to be acting at such a delay, but the reason eventually became clear. Far from being lax or inattentive, they were teaching.
By waiting until it had moved on, the mothers could demonstrate to their young how to report a threat without risking attracting its attention while it was there. Effectively, their calls were not meant to mean ‘There is a danger’ but rather ‘If you ever again see what we saw earlier, this is the sound to make.’ True enough, it was found to be the mothers with the eldest offspring – who would naturally be better acclimatised to life in the jungle – that tended to delay the longest, while those with the youngest offspring seemed justifiably more concerned with escaping to safety.
It was a groundbreaking discovery. The fact that the mothers were able to assess the risk of a situation then decide its potential as a teaching opportunity was remarkable enough, but this experiment had also proved orangutans are capable of something called displaced reference – the ability to communicate about something not actually present at the moment of communication.
Displacement is a fundamental aspect of human language. Without it, we would never be able to tell stories or recall anecdotes, talk about the past or the future, and our conversations would be forever confined to a world of only immediately visible and experienceable things. In order to understand this story, for instance, you would have to have an orangutan alongside you right now as a live reference point – hardly the most practical foundation for a system of communication. Yet in nature, displaced reference is rare.* When you hear birds chirruping in the trees, it’s fair to say they’re not discussing last week’s weather, or what they hope will be on the bird table tomorrow. They’re responding purely to the here and now – establishing territories, searching for mates, seeing off rivals, forming social bonds and reporting predators. Finding evidence of displacement in any wild creature was therefore a significant discovery. Finding it in one of our evolutionary cousins, however, had profound implications on the origin and development of our language, and our understanding of language itself in the natural world.
In the 1960s the linguist Charles Hockett included displaced reference on a list of what he called the design features of language. In all, he identified more than a dozen phenomena such as this that he considered collectively unique to human communication. So as well as our ability to talk about displaced things, our language can be defined by its learnability – our capacity to acquire new languages alongside those we already know. We can pass language on to other people thanks to a feature he called traditional transmission. What you’re currently reading is an example of reflexiveness – our ability to use language to talk about language. Prevarication is what allows us to concoct stories and tell untruths. And because the production of language is deliberate, it exhibits specialisation: it is an intentional act, not merely the by-product of some other process, as when a dog’s panting just happens to communicate to its owner that it’s hot.
Perhaps the most basic feature on Hockett’s checklist was the so-called vocal–auditory channel of human language – the two-way, back-and-forth arrangement by which one person speaks and another person hears. That channel in turn exhibits interchangeability: the listener can easily swap places with the speaker. And while the speaker can project sound in any direction, the listener can identify where that sound is coming from and shift to receive it more clearly thanks to a dual feature Hockett called broadcast transmission and directional reception.
The tendency of speech to fade instantly after it is produced is its transitoriness. Its capacity to carry meaning is its semanticity. A speaker can hear themselves talk thanks to something called total feedback. Arbitrariness means there is no sensible connection between the sounds we make and the meanings we attach to them: what we call dogs, trees and walnuts could just as easily be cats, clouds and footstools, as they’re all fundamentally random labels. The fact each is comprised of individual sounds (‘d’, ‘o’, ‘g’) is called discreteness. And as those sounds have no meaning on their own, language operates around a two-tier framework called double articulation: our near endless supply of meaningful words is built from a limited supply of meaningless sounds.
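For readers who like to see the arithmetic, the two-tier idea behind double articulation can be sketched in a few lines of Python. This is only an illustration under loud assumptions: the three-sound inventory is simply borrowed from dog, the function names sequences and count_sequences are invented for the example, and every ordering of sounds is treated as possible, which no real language allows.

```python
from itertools import product

# A toy inventory of meaningless sounds, borrowed from the word 'dog'.
# (Assumption for illustration: a real language has a few dozen.)
SOUNDS = ["d", "ɒ", "g"]


def sequences(sounds, length):
    """List every possible ordering of `length` sounds from the inventory."""
    return ["".join(combo) for combo in product(sounds, repeat=length)]


def count_sequences(inventory_size, max_length):
    """Count all sound strings of length 1 up to max_length."""
    return sum(inventory_size ** k for k in range(1, max_length + 1))


three_sound_strings = sequences(SOUNDS, 3)
print(len(three_sound_strings))        # 3 * 3 * 3 orderings
print("dɒg" in three_sound_strings)    # 'dog' is one of them
print(count_sequences(len(SOUNDS), 3)) # strings of length 1, 2 or 3
```

Even with three sounds the number of strings grows quickly with length; with a realistic inventory it becomes vast, which is roughly what a limited supply of sounds supporting a near endless supply of words amounts to.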
Many of these features are found in other forms of natural communication – not just orangutan calls, but birdsong, whalesong, and the nectar dances of honeybees. (Some, like double articulation, are even true of music.) Find yourself a system that ticks all sixteen of Hockett’s boxes, however, and by definition you will have human language.
As thorough as Hockett’s checklist might be, it’s fair to say his is hardly the most succinct of definitions. Instead, we can much more concisely say that language is a structured, speech-based system of communication.
We say it is structured because rules govern how it operates, and when those rules aren’t followed, intelligible language isn’t produced. The dos and don’ts of grammar form much of this basic rulebook, as without them there’d be no standard way of saying what we want to say and we’d all end up doing our own thing, forever unable to understand one another. These rules become so ingrained that we can use and apply them without a second thought, even to words we’ve never encountered before. If you were to discover the word shring were a noun, for instance, your inbuilt rulebook would instantly deduce that if you had a shring and I had a shring, we would together have two shrings. Discover it were a verb, however, and you would instead figure out that if you were to shring and I were to shring, we would both be shringing, and tomorrow we would have shringed (or even shrung).
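That inbuilt rulebook can be loosely imitated in code. The Python below is a minimal sketch, not a model of English grammar: the function names are invented for the example, the rules are deliberately crude (they ignore consonant doubling, irregular plurals and much else), and the analogy with ring and rung is hard-wired rather than learned.

```python
from typing import Optional


def pluralise(noun: str) -> str:
    """Regular plural: -es after a sibilant spelling, otherwise -s."""
    if noun.endswith(("s", "x", "z", "ch", "sh")):
        return noun + "es"
    return noun + "s"


def present_participle(verb: str) -> str:
    """Naive -ing form; drops a final silent e, ignores consonant doubling."""
    if verb.endswith("e") and not verb.endswith("ee"):
        return verb[:-1] + "ing"
    return verb + "ing"


def regular_past(verb: str) -> str:
    """Regular past tense: -d after a final e, otherwise -ed."""
    return verb + ("d" if verb.endswith("e") else "ed")


def analogical_past(verb: str) -> Optional[str]:
    """Past form by analogy with ring/rung; None if the pattern does not fit."""
    if verb.endswith("ing"):
        return verb[:-3] + "ung"
    return None


nonce = "shring"  # the invented word from the text
print(pluralise(nonce))           # shrings
print(present_participle(nonce))  # shringing
print(regular_past(nonce))        # shringed
print(analogical_past(nonce))     # shrung
```

The point is that none of these rules mentions shring by name: each applies to the shape of a word, which is why they extend to words nobody has encountered before.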
But some rules are more subtle and are never taught to us as overtly as the rules of grammar, leaving us largely unaware we know them at all. Phonotactics is the branch of language that concerns what sounds and sound combinations are permissible in a language, and which are not. We largely piece this information together ourselves based on evidence from the so-called ambient language that surrounds us from the moment we’re born – so although shring is not an English word, you’ll know instinctively that it could be, while something like ngrish could not. You never had to be told that English doesn’t use the ‘ng’ sound at the beginning of its words (while some languages, such as Albanian and Cantonese, do) but you’ll nevertheless know that to be the case, based on a lifetime of linguistic evidence, not one word of which began ng–. Gaining an intuitive feel for what sounds right and wrong is a skill we develop astonishingly early: studies have suggested infants as young as nine months are already so attuned to the sounds of their family’s mother tongue that they can filter out unfamiliar sounds from other languages.*
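A phonotactic judgement of this kind can be caricatured as a lookup. The Python sketch below rests on several assumptions that should be stated plainly: spelling stands in for sound, the set SAMPLE_ONSETS is a small hand-picked sample rather than a real inventory of English word beginnings, and the function names are invented for the example.

```python
VOWELS = set("aeiou")

# A hand-picked sample of letter sequences that can begin an English word.
# (Assumption for illustration only; a real inventory would be far larger.)
SAMPLE_ONSETS = {"", "d", "g", "r", "s", "sh", "shr", "st", "str", "thr", "bl", "gr"}


def onset(word: str) -> str:
    """Return the letters that come before the first vowel of the word."""
    lowered = word.lower()
    for position, letter in enumerate(lowered):
        if letter in VOWELS:
            return lowered[:position]
    return lowered  # no vowel found: the whole word counts as the onset


def plausible_onset(word: str) -> bool:
    """True if the word begins with a cluster from the sample inventory."""
    return onset(word) in SAMPLE_ONSETS


print(onset("shring"), plausible_onset("shring"))  # shr True
print(onset("ngrish"), plausible_onset("ngrish"))  # ngr False
```

Neither shring nor ngrish is in any dictionary, yet the check separates them, because it consults the permitted shapes of words rather than a list of actual ones.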
We say that language is speech-based as it was from our ancestors’ vocalisations that it first developed some 100,000 years or so ago. It took written language only a mere 95,000 years to catch up, and even more recently we’ve added the likes of sign language and Braille to our inventory when our other senses are impaired. But such visual and tactile forms of language are only optional extras, not universals, and language remains a predominantly speech-driven process.
We can call language a system of communication because no matter how it is transmitted, the passing-on of information is its fundamental purpose. Language and communication are not the same, however, as language is only one form of communication alongside all the other sensory techniques living things use to send messages to one another. The territorial sprays of cats and the signal-sending clouds of pheromones released by insects can at least be likened to language, but such scent-based communication is mercifully not an inbuilt quality of language itself, leaving us perfectly capable of communicating without the need to release foul smells. At least, not intentionally.
When we talk about language as a whole, however, we mean something different from a language. That distinction was unpicked more than a century ago, by one of the founders of modern linguistics, Ferdinand de Saussure. In a revolutionary series of lectures in the early 1900s, Saussure separated language itself (which he gave the French name langage) from the individual languages we use to talk to one another (the langue) and the personal, idiosyncratic language produced by an individual (the parole, the French word for speech). According to Saussure, language as a whole, or langage, is partly characterised by our ability to invent words and give them meanings – or, in his terms, signifiers and signifieds. When enough of these signifying pairs become established among a group of people, a language, or langue, is created. But this vast mutually understood system of words and meanings is abstract, confined invisibly to the minds and voices of its speakers. Only when they talk to one another or write things down – that is, produce their parole – does the langue become anything tangible, and capable of being heard, read, circulated and studied. Neither half, therefore, can exist without the other: without the langue there would be no language to produce, yet without the parole, the langue would remain unspoken, and utterly undetectable.
We might not think along quite the same philosophical lines as Saussure, of course, but his concept of a langue nevertheless mirrors what we would call a language: a single system of communication mutually understood and maintained by a group of people. In other words, if language itself is our capacity to communicate, then a language is one of the means by which we do just that.
We can identify individual languages using three common features. Each one has a grammar, comprising the rules that hold it together, including those of its syntax, which dictates the order of the words in its sentences. Those words comprise its lexicon, or vocabulary. And that in turn is built from a set of speech sounds, or phonemes, that make up its phonological system. So just as written letters (D, O, G) form an alphabet, spoken phonemes (‘d’, ‘o’, ‘g’) form a language’s phonology.
A grammar, a lexicon and a phonology might be common to all known languages, but not every language showcases them in quite