Uh And Um Essays

Has anyone — a parent, teacher, or boss — told you to purge the words "um" and "uh" from your conversation?

When these words creep into our narrative as we tell a story at home, school, or work, it's natural to feel that we can do better with our speech fluency.

In How We Talk: The Inner Workings of Conversation, hitting shelves Tuesday, University of Sydney linguist Nick Enfield rescues those words (and everyone who uses them) from censure. In so doing, he exposes the fascinating and intricate workings of what he calls the human conversation machine: "a set of powerful social and interpretive abilities of individuals in tandem with a set of features of communicative situations — such as the unstoppable passage of time — that puts constraints on how we talk."

Using cross-cultural data, Enfield shows just how rapid turn-taking is in human conversation. Across 10 languages (from Italy, Namibia, Mexico, Laos, Denmark, Korea, the U.S., the Netherlands, Japan and Papua New Guinea) the rule is clear: Speakers offer an answer to a question posed to them within 207 milliseconds, on average. The range goes from 7 milliseconds (wow!) in Japanese to close to a half-second in Danish.

Based on speech cues, we anticipate rather than wait for the moment when it's our turn to speak. We risk losing our turn, or seeming hesitant, if we don't jump right into the flow.

What happens, though, if we're experiencing some kind of processing delay as we ready ourselves to speak? Perhaps we can't think of the right term, or we're struggling to process an unfamiliar word we just heard. After 600 milliseconds, "social attribution" kicks in — that is, the delay becomes a matter of concern for the community of speakers. We may, at this point, utter "um" or "uh" as a signal that we are working toward producing speech.

The evidence shows we also may use these words intentionally as buffers before offering what are called dispreferred responses, or answers our conversation partners may not welcome. Let's say a friend asks you to an event that you don't wish to attend, and you're about to decline. If you slightly delay that bad news by starting out with "uh" or "um," that's the conversation machine at work.

Enfield's overall point here is that these tiny words, far from just being "noise" for scholars to ignore, deserve linguistic study. "Huh?" plays a key role, too, because, judging again from cross-cultural research, it is a human linguistic universal. When we ask "Huh?" in conversation, it can be a mark of cooperation rather than confusion, a point that Enfield elaborated on via email (Email responses in this post have been edited for length.):

"It's true that 'Huh?' can be a sign of confusion. On the other hand, 'Huh?' does much more than simply signal a problem. The usual effect of 'Huh?' is to get the other person to repeat, confirm, or rephrase what they just said. This is only possible in the highly cooperative context of conversation.

When we talk, we agree to be accountable to each other for doing our respective parts in order to achieve a common goal, that of mutual understanding. Saying 'Huh?' draws attention to a possible failing in keeping up with that commitment, one which needs to be redressed on the spot, and we respond to it by helping the other, redressing the possible failing, so that we can move on."

For Enfield, cooperation in conversation reflects a kind of distributed cognition, a type of "systems" perspective that I think is spot on.

In How We Talk, Enfield aims to set apart our behavior and language from the behavior and communication of all other animals. Our social cognition is different "from that of even our closest relatives in the animal world," he writes. His overall conclusion is striking: "No animal shows the defining properties of human conversation: finely timed cooperative turn-taking, mechanisms for repair, and communicative traffic signals."

Based on my own research into the "dance" of contingent and mutually adjusted ape nonvocal communication, my skeptic's antennae went up at this statement of human uniqueness.

I sought an evaluation from Brittany Fallon, AAUW Fellow and research faculty member in the department of linguistics at the University of New Mexico, who has carried out field research on communication in wild chimpanzees. (I was Fallon's undergraduate professor at the College of William and Mary.)

On repair, for example (the checking we do as we talk to make sure we understand each other and, if necessary, to fix breakdowns in comprehension), Fallon had this to say:

"Apes have their own mechanisms of testing comprehension — namely, response waiting, persistence and elaboration. There has been a decade of research showing that apes modify their communication when their target does not comprehend their signals. For example, orangutans famously modify their gestures to solicit help completing an experimental task."

The evidence for turn-taking is less straightforward, but turn-taking does occur at least sometimes among apes, as Fallon explains:

"We know that apes can regulate vocalizations — they can suppress vocalizations like copulation calls, acoustically match others' calls, and even match human whistles. Apes also turn-take in other contexts, like a paper out this month on chimpanzees spontaneously taking turns during a number-ordering task [shows]. (The authors also lay out very clearly how this finding likely underpins conversational turn-taking.)

Modality also seems to be important for apes when it comes to turn taking. One study on wild chimpanzees found that the modality of the greeting — i.e. vocal, gestural, bimodal — impacted the likelihood of a reciprocal response, with gestural greetings more likely to receive a response than vocal or bimodal greetings. Another study found that a multi-modal communication was more likely to receive a response than a vocal-only communication."

Sophisticated aspects of ape social cognition, then, may be seen in their nonvocal more than their vocal communication.

Enfield acknowledged to me that there are many commonalities in our behaviors with apes, and emphasized that it's just good science to be skeptical of claims that rest on human exceptionalism.

In the end, though, he stands by the uniqueness theme:

"Some 7,000 languages are spoken in the world today, each a massive system made up of many thousands of sounds, words, grammatical structures and rules. Infants acquire these systems natively, without formal instruction, within the first few years of life. Animals do not have language in this sense. In linguistics, this has motivated the search to define what makes this possible across our species, and only in our species. That search does not deny important homologies (and/or analogies) with structures and functions in other species.

Language arguably supports a uniquely human form of social accountability: with language, we can name or describe a piece of behavior, drawing public attention to it, then characterizing it (as good, bad, not allowed, wrong, great, or what have you). Furthermore, language has the special property that it can be used for remarking on itself. Without this possibility in language, the phenomena of repair would not be possible."

Where does all this leave us, then?

Here's my recommendation: Enjoy How We Talk for its data-fueled, often extremely cool insights into the conversation machine we all participate in every day. Before you accept its claim that our communication system is qualitatively different from that of other animals, though, explore the evidence from multiple scientific teams about the complex communication abilities of our closest living relatives.

Barbara J. King is an anthropology professor emerita at the College of William and Mary. She often writes about the cognition, emotion and welfare of animals and about biological anthropology, human evolution and gender issues. Barbara's new book is Personalities on the Plate: The Lives and Minds of Animals We Eat. You can keep up with what she is thinking on Twitter: @bjkingape

Copyright 2018 NPR.

Psycho Babble

The Power of Um


On putting speech disfluencies to work

Marc Wathieu

By Jessica Love

June 26, 2014



Maybe you’re an ummer. Or maybe you’re an uhher. Perhaps you favor another word altogether. Or are you ambi-blunderous?

Whichever your poison, the slips and stumbles that slop up speech have always intrigued us. But these days, some researchers have stopped asking why we produce so many ums and uhs—often called disfluencies—and have started asking: Do they serve any purpose?

In 2002, researchers Herb Clark and Jean Fox Tree proposed that they do. Um and uh are simply words, the pair reasoned, planned for and produced like any others. But instead of referring to something in the world, um, uh, and their brethren have a very special discourse function: speakers use them to “announce that they are initiating what they expect to be a minor or major delay before speaking.”

Some words are more difficult to retrieve from memory than others: rare words, or words that just haven’t come up in a while, or just about any words if we’re under stress. So we use um and uh to signal to our listener that even though we’re still not quite ready to proceed, we’re working on it. Just, uh, umm … give us a sec.

A few years later, another research team led by Jennifer Arnold found some of the earliest evidence that such signals do not go unnoticed by listeners. Study participants were shown a display containing multiple pictures and told to follow a set of spoken instructions. As they listened, researchers recorded where on the display their eyes were looking. When the instructions included a disfluency (as in, Now put thee, uh …), people were likelier to turn their gaze to an object that had not yet been mentioned, and thus should be more difficult to produce, than when the instructions were fluid. In other words, the use of um or uh signaled to the listener that the speaker was struggling to retrieve a word—information then used to infer which word they were likely struggling with. A subsequent study finds that disfluencies trigger similar expectations of tough-to-name words in children as young as two or three years old.

But here the relationship between disfluencies and word difficulty gets a bit weird. Tell someone that they are listening to speech from someone with object agnosia, who is unable to recognize even everyday objects, and all of a sudden um and uh seem to lose their predictive powers. And a study out this month finds similar results for nonnative speech: Now put thee, uh … no longer directs our attention to the trickier-to-name object when it is spoken by an inexperienced foreign speaker.

In some ways, the latter finding is especially surprising. The factors that make word retrieval tough for us are generally magnified for less fluent speakers. If anything, then, we might find it useful to attend more carefully to a nonnative speaker’s disfluencies. But we don’t. When we can attribute all the umming and uhhing to a speaker’s more general difficulties with word retrieval, it seems, we no longer bother to attribute it to anything else.


Jessica Love is a contributing editor of the SCHOLAR. She holds a doctorate in cognitive psychology and edits Kellogg Insight at Northwestern University.