Perception lecture 10: Music and speech perception
Harmonics
● A harmonic is a sound wave whose frequency is an integer multiple of a fundamental tone (see the sketch below)
○ The fundamental frequency is the lowest frequency at which the sound is produced
● The auditory system is acutely sensitive to the natural relationships between harmonics, e.g. this is how we tell vowels apart
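A minimal numeric sketch of the integer-multiple relationship (the 220 Hz fundamental is an assumed value, not from the lecture):

```python
# Harmonics are integer multiples of the fundamental frequency (f0 assumed to be 220 Hz)
f0 = 220.0
harmonics = [n * f0 for n in range(1, 6)]
print(harmonics)  # [220.0, 440.0, 660.0, 880.0, 1100.0]
```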
How does the auditory system pick up a missing fundamental?
● Temporal code: neurons in frequency-sensitive areas of the basilar membrane fire action potentials timed to the peaks of the sound wave, so the timing of spikes carries the sound's periodicity (including a missing fundamental)
○ Important for hearing music and speech
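A rough illustration of why a temporal code can recover a missing fundamental (the 200 Hz fundamental and the sample rate are assumed values): a tone built only from harmonics 2-5 still repeats with the period of the absent fundamental.

```python
import numpy as np

# Build a "missing fundamental" stimulus: harmonics 2-5 of an assumed 200 Hz
# fundamental, with the 200 Hz component itself left out.
f0, fs = 200, 8000                      # fundamental (Hz) and sample rate (Hz), assumed
t = np.arange(0, 0.02, 1 / fs)          # 20 ms of signal
x = sum(np.sin(2 * np.pi * n * f0 * t) for n in range(2, 6))

# The waveform still repeats every 1/f0 = 5 ms, so spikes timed to the waveform
# (a temporal code) carry the 200 Hz periodicity even though 200 Hz is absent.
period = fs // f0                       # 40 samples
print(np.allclose(x[:80], x[period:period + 80]))  # True
```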
Harmonics cont’d
● Timbre: the psychological sensation by which a listener can judge that two sounds with the same loudness and pitch are dissimilar; conveyed by harmonics and other high frequencies, e.g. a piano and a saxophone playing the same note do not sound the same
○ Perception of timbre depends on the context in which the sound is heard
● Attack: part of the sound in which amplitude increases (start)
● Decay: part of the sound in which amplitude decreases (end)
Hearing in the environment
Auditory scene analysis
● Auditory scene: the entirety of sounds that is audible at a given moment and that conveys information about the events happening at that moment, e.g. how we can tell that someone is whispering
● To segregate sounds we can use:
○ Spatial cues and loudness, e.g. the frog sounds closer and to the left, the bird far away and to the right
○ Auditory stream segregation: the perceptual organization of sounds into separate auditory events
● Sounds can be grouped by:
○ Timbre (gestalt law of similarity)
○ Onset (gestalt law of common fate)
● Spectrogram: a pattern for sound analysis that provides a 3D display of intensity as a function of time and frequency
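A minimal sketch of computing such a display with SciPy (the two-tone test signal is my own toy example, not from the lecture):

```python
import numpy as np
from scipy.signal import spectrogram

fs = 8000                                   # sample rate (Hz), assumed
t = np.arange(0, 1.0, 1 / fs)
# Toy signal: a 440 Hz tone followed by an 880 Hz tone
x = np.concatenate([np.sin(2 * np.pi * 440 * t), np.sin(2 * np.pi * 880 * t)])

# freqs and times index the two axes; Sxx holds intensity at each (frequency, time)
# cell, i.e. the three dimensions described above.
freqs, times, Sxx = spectrogram(x, fs=fs)
print(Sxx.shape)                            # (n_frequency_bins, n_time_windows)
```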
● Vision can help with audition
● Ventriloquist effect: An audio-visual illusion in which sound
is misperceived as emanating from a source that can be seen to be moving appropriately
when it actually emanates from a different invisible source
● Restoration based on the gestalt law of good continuation: In spite of interruptions, one
can still “hear” sound
○ At some point, restored missing sounds are encoded in the brain as if they were
actually present, but interpretation depends on the context e.g. meal and wheel
Music
● Music is a way to express thoughts and emotions
● Musical notes are sounds that extend across a frequency range from about 25 to 4500
Hz
● Pitch: the psychological aspect of sounds related to the fundamental frequency
● Octave: the interval between two sounds having a frequency ratio of 2:1, e.g. 220 Hz and 440 Hz
● 2 dimensions of pitch
○ Tone height: a sound quality corresponding to the level of pitch; tone height is monotonically related to frequency
○ Tone chroma: a sound quality shared by tones that have the same octave interval; each note has a different chroma
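A small sketch of the two dimensions (the A0 = 27.5 Hz reference and the log2 formulation are my own illustration, not from the lecture): height grows with log frequency, while chroma repeats with every doubling of frequency.

```python
import math

def height_and_chroma(freq, ref=27.5):        # ref = A0 = 27.5 Hz, assumed reference
    octaves_above_ref = math.log2(freq / ref) # tone height: monotonic in frequency
    chroma = octaves_above_ref % 1.0          # position within the octave, repeats each doubling
    return octaves_above_ref, chroma

print(height_and_chroma(440.0))  # A4: (4.0, 0.0)
print(height_and_chroma(880.0))  # A5: one unit higher in height, same chroma
```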
● Musical instruments produce notes lower than 4000 Hz
● When listening to music, it is difficult to perceive octave relationships between tones when one of them is above 5 kHz
● Chords: Created when two or more notes are played simultaneously
● Consonant: Have simple ratios of note frequencies (3:2, 4:3)
● Dissonant: Less elegant ratios of note frequencies (16:15, 45:32)
● Some chords sound good because of harmonics or cultural differences
○ Musicians’ estimates of intervals between notes correspond to the music scale
from their culture
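Worked arithmetic for the ratios listed above (the 440 Hz base note is an assumed value):

```python
base = 440.0                       # assumed base note (A4)
perfect_fifth = base * 3 / 2       # 660.0 Hz, simple 3:2 ratio -> consonant
perfect_fourth = base * 4 / 3      # ~586.7 Hz, simple 4:3 ratio -> consonant
minor_second = base * 16 / 15      # ~469.3 Hz, complex 16:15 ratio -> dissonant
print(perfect_fifth, perfect_fourth, minor_second)
```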
● Melody: an arrangement of notes or chords in succession (defined by chroma and rhythm) forming a gestalt, e.g. "Twinkle, Twinkle, Little Star"
○ Can change octaves, keys and tempo and still be the same melody
● Fugue: a compositional technique (in classical music) in two or more voices, built on a
subject (theme) that is introduced at the beginning and then repeated at different pitches.
● People are predisposed to group sounds into rhythmic patterns
Music in the brain
● The auditory cortex is involved in music perception
● The motor cortices are involved in music production
● Emotion-related areas (e.g. the amygdala) are involved in music perception
● When listening to music we actively generate predictions about what is likely to happen
next
Speech
● Vocal tract: the airway above the larynx used for the production of speech
● Speech production involves:
○ Respiration (lungs)
○ Phonation (vocal cords)
○ Articulation (vocal tract)
● The initiation of speech starts with respiration and phonation when the diaphragm
pushes air out of the lungs through the trachea and up to the larynx
● At the larynx, air passes through the two vocal folds, which are smaller in children, giving children higher-pitched voices
● Humans change the shape of their vocal tract using the jaw, lips, tongue body, tongue tip and soft palate
○ This manipulation changes the resonance characteristics of the vocal tract
Classifying speech sounds
● Vowels are produced with a relatively open vocal tract, shaped by different tongue and lip positions
● Consonants are produced by obstructing the vocal tract
○ Place of articulation = where the obstruction occurs, e.g. at the lips
○ Manner of articulation = how much the airflow is obstructed (totally or slightly), e.g. "tha", "bah", "la"
○ Voicing = whether the vocal cords vibrate
● Coarticulation: the pronunciation of a sound in a word is affected by the sounds before and after it, e.g. in saying "freon," nasalization often occurs during the first vowel, even though it is required only for the /n/
● Categorical perception: people do not perceive continuous variation between speech sounds; they perceive sharp category boundaries
● Similar features can be used in different combinations
and we can rely on multiple cues to recognize a sound
● The sound distinctions that matter are specific to each language
How do we know where one word ends and another begins?
● Certain sounds are more likely to occur together (sound
probabilities)
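A toy sketch of how such co-occurrence statistics could mark word boundaries (the syllable stream and the transitional-probability calculation are my own illustration, not from the lecture):

```python
from collections import Counter

# Invented syllable stream ("pretty baby ... pretty doggy ... pretty baby")
stream = ["pre", "tty", "ba", "by", "pre", "tty", "do", "ggy", "pre", "tty", "ba", "by"]

pair_counts = Counter(zip(stream, stream[1:]))
first_counts = Counter(stream[:-1])

# Transitional probability P(next | current): within-word pairs such as pre->tty
# are highly predictable, while the transition after "tty" (to "ba" or "do") is
# less so, hinting at a word boundary there.
for (a, b), n in sorted(pair_counts.items()):
    print(f"{a}->{b}: {n / first_counts[a]:.2f}")
```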
Speech in the brain
● Listening to speech: Left and right superior temporal
lobes are activated more strongly in response to speech than to nonspeech sounds
● As sounds become more complex, they are processed by more anterior and ventral
regions of superior temporal cortex – in both hemispheres
● As sounds become more speech-like, there is more activation in the left hemisphere
● Research indicates that some “speech” areas become active when lip-reading