Generating and understanding speech

Speech is produced when air is forced past the vocal cords, causing them to vibrate. The vibrations produce a fundamental tone, which is reinforced in the oral and nasal cavities. The more air forced past the vocal cords per unit of time, the stronger the sound; this is what determines the volume at which we speak. By placing the tongue and the lips in different positions, we form the different sounds we call letters: vowels, and voiced and unvoiced consonants. The vowels (a, e, o, etc.) are a direct extension of the fundamental tone and are relatively strong compared to the voiced consonants (b, d, m, etc.).

The vowels also lie at lower frequencies, and the consonants at higher frequencies. While the vowels provide the volume of speech, it is the consonants that carry the information. This can be demonstrated very simply: leave out the vowels when you whisper, and the information can still be heard in its entirety.

Try this in a more visual way by writing down a sentence twice, first with all the consonants removed and then with all the vowels removed instead. Which version is easier to read?
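The written exercise above can also be tried on a computer. The following is a minimal sketch in Python, not part of the original text: the function names and the sample sentence are made up for illustration, and it assumes plain English text in which only a, e, i, o and u count as vowels (so "y" is treated as a consonant).

```python
# Assumption: only a, e, i, o, u are vowels; "y" is treated as a consonant.
VOWELS = set("aeiouAEIOU")


def remove_vowels(text):
    """Drop vowel letters; keep consonants, spaces and punctuation."""
    return "".join(ch for ch in text if ch not in VOWELS)


def remove_consonants(text):
    """Drop consonant letters; keep vowels, spaces and punctuation."""
    return "".join(ch for ch in text if not ch.isalpha() or ch in VOWELS)


# Hypothetical sample sentence, chosen only for the demonstration.
sentence = "The consonants carry the information"

print(remove_vowels(sentence))      # Th cnsnnts crry th nfrmtn
print(remove_consonants(sentence))  # e ooa a e ioaio
```

The consonant-only line can usually still be guessed, while the vowel-only line gives the reader very little to work with.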


Normal hearing

The energy of the vowels primarily lies in the range 250 – 2,000 Hz and that of voiced consonants (b, d, m etc.) in the range 250 – 4,000 Hz.

Unvoiced consonants (f, s, t etc.) vary considerably in strength and lie in the frequency range 2,000 – 8,000 Hz.

To understand speech clearly, it is therefore important to have good hearing across the entire frequency range from 125 to 8,000 Hz, and especially in the range of the unvoiced consonants.
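The frequency ranges quoted above overlap, which is easier to see when they are put into a small lookup. The sketch below is an illustration only: the names `SPEECH_BANDS` and `sounds_at` are hypothetical, and it assumes that the end points of each range are included.

```python
# Frequency ranges in Hz, copied from the text above.
SPEECH_BANDS = {
    "vowels": (250, 2000),
    "voiced consonants": (250, 4000),
    "unvoiced consonants": (2000, 8000),
}


def sounds_at(frequency_hz):
    """Return the sound classes whose quoted range contains the frequency.

    Assumption: both ends of each range are treated as inclusive.
    """
    return [
        name
        for name, (low, high) in SPEECH_BANDS.items()
        if low <= frequency_hz <= high
    ]


print(sounds_at(500))   # ['vowels', 'voiced consonants']
print(sounds_at(3000))  # ['voiced consonants', 'unvoiced consonants']
print(sounds_at(6000))  # ['unvoiced consonants']
```

Above 4,000 Hz only the unvoiced consonants remain, which is why good hearing in the upper part of the range matters so much for understanding speech.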


Impaired hearing

When hearing is impaired, it is common to lose the ability to understand consonants, which often contain little sound energy and lie in the frequency range 2,000 – 8,000 Hz.