Have you ever understood every word on a page but felt completely lost when a native speaker said the same sentence out loud? You hear something like “Whaddaya wanna do?” and only later realise it was “What do you want to do?” The reason is connected speech: in natural, fluent English, words do not stay in their neat dictionary forms. They join together, lose sounds, and change shape.

This is not careless or lazy speech — it is exactly how every fluent speaker talks, in every accent. The good news is that connected speech follows clear, learnable patterns. Once you know them, fast English suddenly slows down, because you start to expect the changes. This guide covers the four big features — linking, intrusion, elision, and assimilation — plus weak forms, contractions, and the famous gonna / wanna / gotta of relaxed conversation.

Key Takeaways

  • Linking (catenation) joins a final consonant to a following vowel: an apple → “a napple”.
  • Intrusion adds a /r/, /w/, or /j/ sound between two vowels: go away → “gow away”.
  • Elision drops sounds for ease: next day → “nexday”; comfortable → “comfteble”.
  • Assimilation changes a sound to match its neighbour: ten boys → “tem boys”.
  • Weak forms reduce small words to a schwa /ə/ — the engine of English rhythm.
  • Studying connected speech mainly helps listening: you don’t have to use it, but you must recognise it.

What Is Connected Speech?

When we learn a word, we learn its careful citation form — the way it sounds when spoken alone and clearly. But we almost never speak one word at a time. In a flowing sentence, the sounds at the edges of words bump into each other, and English smooths those joins to keep its rhythm.

English is a stress-timed language: stressed syllables fall at roughly regular intervals, like a drumbeat, and the unstressed syllables in between get squeezed to fit. To keep that beat, speakers link words, drop sounds, and blend others. All of connected speech serves this single purpose: smooth, efficient rhythm.

Why It Matters

Connected speech is the single biggest reason learners can read fluently but struggle to follow fast speech. The words you “can’t hear” are usually there — just linked, reduced, or dropped. Recognising the pattern is the whole battle.

Linking (Catenation)

The most common feature is linking, also called catenation. When a word ends in a consonant sound and the next word begins with a vowel sound, the consonant slides onto the vowel, and the two words merge with no pause.

an apple → sounds like “a napple”

turn it off → “tur ni toff”

not at all → “no ta tall”

pick it up → “pi ki tup”

Notice that the spelling never changes — only your ear is fooled into hearing the word boundary in the wrong place. This is why an aim and a name can sound identical, and why learners often mis-hear where one word ends and the next begins.
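For readers who like to see a rule written out mechanically, linking can be sketched in a few lines of Python. This is only an illustration: the small pronunciation table `PRON`, the vowel set `VOWELS`, and the function name `link` are assumptions made for the example, and the transcriptions are simplified British-style ones.

```python
# Illustrative sketch of linking (catenation) on lists of sounds.
# PRON and VOWELS are tiny, simplified tables invented for this example.
VOWELS = {"æ", "ə", "ɪ", "ɒ", "ʌ", "ɜː"}

PRON = {
    "an": ["ə", "n"],
    "apple": ["æ", "p", "ə", "l"],
    "turn": ["t", "ɜː", "n"],
    "it": ["ɪ", "t"],
    "off": ["ɒ", "f"],
    "pick": ["p", "ɪ", "k"],
    "up": ["ʌ", "p"],
}

def link(words):
    """Move each word-final consonant onto a following vowel-initial word."""
    groups = [list(PRON[w]) for w in words]
    for current, following in zip(groups, groups[1:]):
        ends_in_consonant = bool(current) and current[-1] not in VOWELS
        starts_with_vowel = bool(following) and following[0] in VOWELS
        if ends_in_consonant and starts_with_vowel:
            following.insert(0, current.pop())
    return " ".join("".join(g) for g in groups)

print(link(["an", "apple"]))        # ə næpəl
print(link(["turn", "it", "off"]))  # tɜː nɪ tɒf
```

The sketch works on sounds rather than letters because linking depends on how a word is pronounced, not on how it is spelled.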

Intrusion (Intrusive Sounds)

When one word ends in a vowel and the next begins with a vowel, there is no consonant to do the linking. English solves this by inserting a tiny extra sound to bridge the gap. This is called intrusion, and there are three intrusive sounds.

Intrusive sound | When it appears | Example
Intrusive /r/ | After /ə/, /ɑː/, /ɔː/ vowels | law and order → “law-r-and order”
Intrusive /w/ | After /uː/, /aʊ/, /əʊ/ vowels | go away → “gow away”
Intrusive /j/ | After /iː/, /aɪ/, /eɪ/ vowels | I agree → “I-y-agree”

Intrusive /r/ is especially common in British English: the idea of it often becomes “the idea-r-of it”, and media event becomes “media-r-event”, even though there is no letter r in sight. These sounds are never written; they are simply produced to keep the vowels flowing into one another.
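The intrusion table above can also be read as a simple lookup. The Python sketch below is illustrative only: the name `intrusive_sound` and the choice to pass vowels in as IPA strings are assumptions made for the example, and the vowel sets are copied directly from the table.

```python
# Illustrative sketch: which intrusive sound bridges two vowels.
# The vowel sets come from the table above; the names are hypothetical.
INTRUSIVE = {
    "r": {"ə", "ɑː", "ɔː"},
    "w": {"uː", "aʊ", "əʊ"},
    "j": {"iː", "aɪ", "eɪ"},
}

def intrusive_sound(final_vowel):
    """Return the intrusive sound expected after final_vowel, or None."""
    for sound, vowels in INTRUSIVE.items():
        if final_vowel in vowels:
            return sound
    return None

print(intrusive_sound("ɔː"))  # r  (law + and)
print(intrusive_sound("əʊ"))  # w  (go + away)
print(intrusive_sound("aɪ"))  # j  (I + agree)
```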

Elision (Dropping Sounds)

Elision is the omission of a sound to make a word or phrase easier to say at speed. The most frequent victims are /t/ and /d/ when they sit between two consonants.

next day → “nex day” (the /t/ disappears)

old man → “ol man” (the /d/ disappears)

most common → “mos common”

sandwich → “sanwich”

Vowels are elided too, especially the weak schwa /ə/ in unstressed syllables. This is why many words are said with one syllable fewer than their spelling suggests:

  • comfortable → “comf-te-ble” (3 syllables, not 4)
  • camera → “cam-ra” (2 syllables, not 3)
  • different → “dif-rent” (2 syllables, not 3)
  • vegetable → “veg-te-ble” (3 syllables, not 4)
  • chocolate → “choc-late” (2 syllables, not 3)

Elision is completely standard and expected. Pronouncing every single sound carefully can actually make you sound less natural, not more.

Assimilation (Sounds Changing)

Assimilation happens when a sound changes to become more like a neighbouring sound — usually the sound that follows it. The mouth is preparing for the next position early, so the sound shifts.

Phrase | Becomes | What changes
ten boys | “tem boys” | /n/ → /m/ before /b/
good girl | “goog girl” | /d/ → /g/ before /g/
this shop | “thishop” | /s/ → /ʃ/ before /ʃ/
that boy | “thap boy” | /t/ → /p/ before /b/

In each case the speaker’s mouth moves towards the next sound a fraction early. The /n/ in ten turns into an /m/ because the lips are already closing for the /b/ of boys. Like elision, this is a feature of relaxed, natural speech, not an error.
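The same idea can be sketched as a small Python function. Note the assumptions: the function name `assimilate` is invented for the example, and the rule is generalised slightly beyond the four phrases in the table, on the assumption that /n/, /t/ and /d/ shift before any lip sound (/p/, /b/, /m/) and before the back-of-the-mouth sounds /k/ and /g/. Treat it as a rough model, not a complete description.

```python
# Illustrative sketch of assimilation at a word boundary.
# Generalised a little beyond the table above; names are hypothetical.
TO_BILABIAL = {"n": "m", "t": "p", "d": "b"}  # before /p/, /b/, /m/
TO_VELAR = {"n": "ŋ", "t": "k", "d": "g"}     # before /k/, /g/

def assimilate(final_sound, next_sound):
    """Return how final_sound is likely to be said before next_sound."""
    if next_sound in {"p", "b", "m"}:
        return TO_BILABIAL.get(final_sound, final_sound)
    if next_sound in {"k", "g"}:
        return TO_VELAR.get(final_sound, final_sound)
    if final_sound == "s" and next_sound == "ʃ":
        return "ʃ"
    return final_sound  # no change in any other context

print(assimilate("n", "b"))  # m  (ten boys)
print(assimilate("d", "g"))  # g  (good girl)
print(assimilate("t", "b"))  # p  (that boy)
```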

Weak Forms & the Schwa

Perhaps the most important feature for both listening and rhythm is the weak form. Small grammatical words — articles, prepositions, conjunctions, auxiliaries — are usually unstressed, and their vowel reduces to a schwa /ə/, the relaxed “uh” sound. The schwa is the most common vowel in English precisely because of this.

Word | Strong form | Weak form | Example
to | /tuː/ | /tə/ | I want tə go
and | /ænd/ | /ən/, /n/ | fish ’n’ chips
of | /ɒv/ | /əv/, /ə/ | a cup ə tea
for | /fɔː/ | /fə/ | It’s fə you
can | /kæn/ | /kən/ | I kən swim
was | /wɒz/ | /wəz/ | He wəz late

This is why fish and chips sounds like “fish ’n’ chips” and a cup of tea collapses into the famous “cuppa tea”. Crucially, the difference between can /kən/ and can’t /kɑːnt/ in British English is mostly the vowel — weak schwa for the positive, strong long vowel for the negative. Listening for this saves a lot of misunderstandings.


Contractions & Fast-Speech Reductions

Contractions are the one form of connected speech we actually write down: I’m, you’re, don’t, she’ll, we’ve, it’s. Each merges two words into one by attaching the weak form of an auxiliary verb (or of not) to the word before it.

In very informal, fast speech these reductions go further and produce the spellings you see in song lyrics and texts — though you should keep them out of formal writing and exams:

Common reductions

  • going to → gonna
  • want to → wanna
  • got to / have got to → gotta
  • kind of / sort of → kinda / sorta
  • let me → lemme

In full sentences

  • I’m going to leave → “I’m gonna leave”
  • I want to go → “I wanna go”
  • I’ve got to run → “I gotta run”
  • What do you want? → “Whaddaya want?”
  • Did you eat? → “Djeet?”

Formal Writing Warning

Never write gonna, wanna, or gotta in an essay, email, or IELTS task. They belong to casual speech only. In writing, use the full forms going to, want to, and have to.
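If you want to check a draft for these spellings, the reductions listed above can be turned back into full forms with a short Python sketch. Everything here is an assumption made for illustration: the names `FULL_FORMS` and `expand_reductions` are invented, the word list covers only the reductions in this guide, and gotta is mapped to have to, following the advice above.

```python
import re

# Illustrative sketch: expand informal spellings back to their full forms.
# Covers only the reductions listed in this guide; names are hypothetical.
FULL_FORMS = {
    "gonna": "going to",
    "wanna": "want to",
    "gotta": "have to",
    "kinda": "kind of",
    "sorta": "sort of",
    "lemme": "let me",
}

PATTERN = re.compile(r"\b(" + "|".join(FULL_FORMS) + r")\b", re.IGNORECASE)

def _replace(match):
    word = match.group(0)
    full = FULL_FORMS[word.lower()]
    # Keep a leading capital, e.g. "Lemme" becomes "Let me".
    return full.capitalize() if word[0].isupper() else full

def expand_reductions(text):
    """Replace gonna, wanna, gotta and similar spellings with full forms."""
    # "I've gotta" already contains "have", so only "got to" is restored.
    text = re.sub(r"(['’]ve )gotta\b", r"\1got to", text, flags=re.IGNORECASE)
    return PATTERN.sub(_replace, text)

print(expand_reductions("I'm gonna leave"))  # I'm going to leave
print(expand_reductions("I gotta run"))      # I have to run
```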

Should You Use It — or Just Understand It?

Learners often worry they must master every reduction to sound fluent. You don’t. Clear, well-stressed English with correct weak forms and basic linking already sounds natural and fluent. Elision and assimilation tend to appear on their own as you relax into the language.

The real reason to study connected speech is listening comprehension. Even if you never drop a single /t/ yourself, you must be able to recognise these patterns when a native speaker uses them — otherwise fluent speech will always sound like a blur. Read our guide to English listening skills and try some online listening practice to put this into action, and pair it with the English pronunciation guide for the sounds themselves.

Practice Exercises on LexFizz

Connected speech is a listening skill above all, so the best way to internalise it is through repeated exposure and active recall.

  • Flash Cards — drill weak forms and linked phrases with spaced repetition
  • Quiz — match the “fast” sound to the full written phrase
  • True or False — decide whether a transcription of connected speech is correct
  • Browse all exercises — find more listening and pronunciation games

Frequently Asked Questions

What is connected speech?

Connected speech is the way sounds change, join, and disappear when words are spoken together in natural, fluent English rather than one at a time. Instead of pronouncing every word in its careful ‘dictionary’ form, speakers link words together, drop some sounds, and change others to make speech smoother and faster. For example, an apple often sounds like “a napple”, and next day becomes “nexday”. Understanding connected speech is essential for listening comprehension because native speakers rarely speak in clearly separated words.

What is linking (catenation)?

Linking, also called catenation, happens when a word ending in a consonant sound is followed by a word beginning with a vowel sound. The final consonant joins onto the next word, so an apple is heard as “a napple”, and turn it off sounds like “tur-ni-toff”. The words do not really change spelling; the consonant simply attaches to the vowel that follows, creating a smooth, joined-up sound with no pause between the words.

What is intrusion?

Intrusion is when an extra sound is added between two vowel sounds to link them smoothly. There are three intrusive sounds in English: intrusive /r/, as in law and order becoming “law-r-and order”; intrusive /w/, as in go away becoming “gow-away”; and intrusive /j/ (a ‘y’ sound), as in I agree becoming “I-y-agree”. These sounds are not written but are inserted naturally to avoid an awkward gap between vowels.

What is elision?

Elision is the dropping or omission of a sound in fast speech to make pronunciation easier. The most common type is the loss of /t/ and /d/ between consonants, so next day becomes “nexday” and old man becomes “ol man”. Vowels can also be elided, especially the schwa: comfortable is usually said as “comfteble” (three syllables, not four), and camera as “camra”. Elision is completely normal in native speech and not considered lazy or incorrect.

What is assimilation?

Assimilation is when a sound changes to become more similar to a neighbouring sound, usually the one that follows it. For example, ten boys often becomes “tem boys” because the /n/ shifts towards the /b/; good girl can sound like “goog girl”; and this shop merges into “thishop”. Assimilation makes the transition between sounds smoother and faster, and it is a natural feature of relaxed, connected speech rather than careless pronunciation.

What are weak forms?

Weak forms are the reduced pronunciations of common grammatical words such as to, and, of, for, can, was, and them when they are unstressed. In these weak forms the vowel usually becomes a schwa /ə/, the relaxed ‘uh’ sound. So fish and chips becomes “fish ’n’ chips”, a cup of tea becomes “a cuppa tea”, and I can go has a weak can (/kən/). The schwa is the most common vowel sound in English and is the engine of natural rhythm.

Why do native speakers run words together?

Native speakers run words together because English has a stress-timed rhythm: stressed syllables occur at roughly regular intervals, and the unstressed syllables in between are squeezed and reduced to keep that beat. Linking, elision, assimilation, and weak forms all serve this rhythm, making speech faster, smoother, and less effortful. It is not careless or lazy; it is the natural, efficient way the language is spoken by fluent users in every accent.

What are gonna, wanna and gotta?

Gonna, wanna and gotta are informal spoken reductions of going to, want to and got to (or have got to). They are examples of connected speech in fast, casual conversation: I’m going to leave becomes “I’m gonna leave”, I want to go becomes “I wanna go”, and I’ve got to run becomes “I gotta run”. They are perfectly normal in informal speech and song lyrics, but you should avoid writing them in formal essays or exams.

Is connected speech the reason fast English is hard to understand?

Yes, connected speech is one of the main reasons learners struggle to understand fast native speech, even when they know all the individual words. A phrase like What are you going to do? can sound like “Whaddya gonna do?”, which is unrecognisable if you only know the careful forms. The good news is that once you learn the patterns of linking, elision, assimilation and weak forms, your listening comprehension improves dramatically because you start to expect these changes.

Should I use connected speech when I speak?

You should aim to use connected speech naturally, but it is not essential to force it. Clear, well-stressed English with correct weak forms and basic linking already sounds fluent and natural. As you become more comfortable, features like elision and assimilation will appear on their own. The most important reason to study connected speech is for listening: even if you never drop a single sound yourself, you must recognise these patterns to understand native speakers.
