
Evaluating Spatial Sound Systems

A Conference Presentation from the Light and Sound Interactive 2019 Conference on the objective evaluation of spatial sound reproduction systems and methods.

Uploaded by

Mark Bocko

Evaluating Spatial Sound Systems
Mark F. Bocko

Audio & Music Engineering


Audio Engineers love specs …
• Predicting which speakers will sound good …

How many speakers are enough?
[Figure: NHK 22.2]
Quantitatively evaluate any spatial sound reproduction method in any space …
• Incorporate quantitative models of binaural hearing into audio system design tools
• Identify the computable quantities that correspond to what listeners report they hear (locations, spatial extent of sources, diffusiveness)
• Make the design of systems for creating spatial audio more deterministic and less trial and error
• Both for free space sound reproduction
• And for headphone based reproduction

Framework
1. Specify listening space & speaker placement
2. Specify virtual acoustic sources to be created
3. Compute signals driving each loudspeaker (Your favorite method)
4. Compute acoustic field at listener (directional IR)
5. Compute sound field-listener interaction (head model)
6. Compute percepts (binaural fusion model)
7. Infer virtual acoustic source properties
Compare & Assess
Outline
• How the ear works – very briefly
• Meddis hair cell model

• Cross-correlation model of directional hearing


• Audio coherence and spatial hearing
• Interaural time and level differences
• Spectral coloring from source elevation
• Correlograms
• Examples
Human Auditory System
[Figure: cross-section of the cochlea. Labels: Reissner Membrane, Scala Vestibuli, Tectorial Membrane, Organ of Corti, Scala Tympani, Basilar Membrane. ©2013 by American Physiological Society]
Meddis Hair Cell Model
Around 3000 inner hair cells along the length of the basilar membrane
Neuron firing is irregular and clustered near signal peaks
[Figure: three panels over 0 to 0.05 sec: Input Signal; Hair Cell Deflection ~ Firing Probability; Neuronal Pulse Stream. Horizontal axis: Time (sec).]
Meddis Hair Cell Model
Spontaneous firing rate
[Figure: three panels over 0 to 0.05 sec: Input Signal; Hair Cell Deflection ~ Firing Probability; Neuronal Pulse Stream. Horizontal axis: Time (sec).]
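The behaviour described on these two slides (irregular firing that clusters near signal peaks, plus a spontaneous rate in silence) can be illustrated with a much simpler stand-in than the Meddis model. The sketch below is not the Meddis model: it assumes the firing rate is a spontaneous rate plus a term proportional to the half-wave rectified input. The names `spike_train`, `spont_rate` and `driven_rate`, and all numeric values, are illustrative assumptions.

```python
import math
import random

def spike_train(signal, fs, spont_rate=50.0, driven_rate=2000.0, seed=0):
    """Return a list of 0/1 spikes, one per input sample.

    Simplified stand-in for a hair cell model (an assumption, not Meddis):
    firing rate = spont_rate + driven_rate * max(signal, 0), in spikes/sec.
    """
    rng = random.Random(seed)
    spikes = []
    for x in signal:
        rate = spont_rate + driven_rate * max(x, 0.0)
        p = min(rate / fs, 1.0)          # probability of a spike in this sample
        spikes.append(1 if rng.random() < p else 0)
    return spikes

fs = 20000                                # sample rate in Hz (assumed)
n = 1000                                  # 50 msec at 20 kHz, as on the time axis
tone = [math.sin(2 * math.pi * 200 * i / fs) for i in range(n)]
spikes = spike_train(tone, fs)

# Count spikes on the positive and on the non-positive half-cycles of the tone.
on_peaks = sum(s for s, x in zip(spikes, tone) if x > 0)
off_peaks = sum(s for s, x in zip(spikes, tone) if x <= 0)

# With no input, only the spontaneous firing rate remains.
silence = spike_train([0.0] * n, fs)
```

Because the spike decision is random, repeated runs with different seeds give different pulse streams for the same input.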
Binaural Fusion Model
[Figure: signals from the left ear and from the right ear, ordered from low to high frequency, meet at the site of binaural fusion, which produces the output. Paths are labelled "To left cochlea" and "To right cochlea".]
Represent as a bi-directional delay line
Binaural fusion mechanism → 2 msec windowed cross-correlation
[Figure: delay line from right ear and delay line from left ear. A 2 msec window W(T) of width TW is applied to xr(t) at times t1, t2, t3 and to xl(t) at times t1 - 𝜏, t2 - 𝜏, t3 - 𝜏.]
The lag 𝜏 where the peak in the cross-correlation appears is the Interaural Time Difference
• Jeffress, L. A. (1948). A place theory of sound localization. Journal of Comparative and Physiological Psychology, 41(1), 35.
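A minimal sketch of the windowed cross-correlation idea: split the two ear signals into 2 msec windows and, in each window, take the lag of the cross-correlation peak as the ITD. The function names, the 48 kHz sample rate, the non-overlapping rectangular windows and the 1 msec lag search range are assumptions made for illustration.

```python
import random

def xcorr_peak_lag(left, right, max_lag):
    """Lag (in samples) that maximizes sum_n left[n] * right[n + lag].

    A positive result means the right-ear signal is a delayed copy of the left.
    """
    n = len(left)
    best_lag, best_val = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        val = 0.0
        for i in range(max(0, -lag), min(n, n - lag)):
            val += left[i] * right[i + lag]
        if val > best_val:
            best_lag, best_val = lag, val
    return best_lag

def windowed_itd(left, right, fs, window_sec=0.002, max_itd_sec=0.001):
    """Estimate one ITD (in seconds) per non-overlapping window."""
    win = round(window_sec * fs)
    max_lag = round(max_itd_sec * fs)
    itds = []
    for start in range(0, len(left) - win + 1, win):
        lag = xcorr_peak_lag(left[start:start + win],
                             right[start:start + win], max_lag)
        itds.append(lag / fs)
    return itds

fs = 48000                                   # sample rate in Hz (assumed)
rng = random.Random(1)
left = [rng.gauss(0.0, 1.0) for _ in range(2400)]   # 50 msec of noise
delay = 10                                   # right ear lags by 10 samples
right = [0.0] * delay + left[:-delay]
itds = windowed_itd(left, right, fs)         # one estimate per 2 msec window
```

For this coherent test signal every window reports the same lag; with real ear signals the per-window estimates scatter.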
Interaural Time Difference and source direction (in the horizontal plane)

Perceived ITD (direction to source) is determined by location of the peak in the short-time cross-correlation function

[Figure: sources Sl, S and Sr in front of a listener with ears L and R. Sl and Sr are 30° either side of center, 𝜃 is the source angle, and the head dimensions are labelled d and 2d.]

Low frequency limit of Rayleigh diffraction around sphere

$$\mathrm{ITD} = \frac{3d}{2c}\,\sin(\theta)$$

c is the speed of sound

ITD = 0 when 𝜃 = 0
ITD = (3/2)*(d/c) when 𝜃 = 90°

Note: Factor of 3/2 is due to diffraction around the listener's head
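The relation ITD = (3/2)(d/c) sin(𝜃) and its inverse can be written out directly. In this sketch the function names and the numeric values of `HEAD_D` and `SPEED_OF_SOUND` are assumptions chosen for illustration; the slide gives no values.

```python
import math

SPEED_OF_SOUND = 343.0   # c in m/s (assumed room-temperature value)
HEAD_D = 0.0875          # the slide's d in meters (assumed value)

def itd_from_angle(theta_deg, d=HEAD_D, c=SPEED_OF_SOUND):
    """Low-frequency ITD in seconds for a source at angle theta (degrees)."""
    return 1.5 * (d / c) * math.sin(math.radians(theta_deg))

def angle_from_itd(itd, d=HEAD_D, c=SPEED_OF_SOUND):
    """Invert the relation: source angle in degrees for a given ITD."""
    s = itd * c / (1.5 * d)
    s = max(-1.0, min(1.0, s))           # guard against rounding past +/- 1
    return math.degrees(math.asin(s))

itd_center = itd_from_angle(0.0)         # zero for a source straight ahead
itd_side = itd_from_angle(90.0)          # largest value, (3/2) * (d / c)
angle_back = angle_from_itd(itd_from_angle(30.0))
```

The inverse is what turns a measured cross-correlation peak lag into a perceived direction in the horizontal plane.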
Role of coherence in binaural hearing

3 sec white noise bursts


S1 S2
• S1 alone
• S2 alone
• S1 + S2 the same
• S1 + S2 different

Demonstration of lateralization as a function of noise burst duration
• Play a series of uncorrelated stereo noise bursts of decreasing duration
(2 sec, 1 sec, 0.5 sec, 0.2 sec, 0.1 sec, 50 msec, 20 msec, 10 msec, 5 msec, 2 msec, 1 msec)

Series of uncorrelated 2 msec stereo noise bursts

• At about 2 msec and less, each burst is identified with a specific location
• The cross-correlation function always has a peak somewhere! But it is different each time.
• The auditory percept being computed by the brain is updated about every 2 milliseconds
Auditory “Sluggishness”
• How quickly can a listener follow time-varying binaural cues?
• Evidence for a 200 - 300 msec threshold
• Distribution of 2 msec window ITD’s has a “memory” of 100 - 300 msec
[Figure: Cross-correlation Function for an “L” click. Vertical axis: Norm X-Corr; horizontal axis: Lag (samples).]
Series of L, C, R located clicks: 10 msec, 50 msec, 100 msec, 250 msec, 500 msec
Your brain averages over a hundred or more 2 msec windows and constructs a histogram of interaural time differences.
[Figure: Histogram of ITD’s]
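One way to picture the histogram "memory" is a sliding window over the most recent per-window ITD estimates, with the most common value taken as the perceived ITD. This is a sketch under assumptions: the function name, the memory of 100 windows (200 msec at 2 msec per window) and the use of the histogram mode are illustrative choices, not the presenter's stated algorithm.

```python
from collections import Counter, deque

def running_itd_mode(window_itds, memory_windows=100):
    """For each new 2 msec ITD estimate, return the mode of the recent histogram.

    memory_windows=100 corresponds to 200 msec of memory (assumed value).
    """
    recent = deque(maxlen=memory_windows)    # oldest estimates fall out
    modes = []
    for itd in window_itds:
        recent.append(itd)
        histogram = Counter(recent)
        modes.append(histogram.most_common(1)[0][0])
    return modes

# ITD in samples: the source sits on one side for 200 windows, then jumps.
itd_sequence = [-10] * 200 + [10] * 200
modes = running_itd_mode(itd_sequence)
```

In this toy sequence the reported ITD keeps its old value for a while after the jump and only switches once the new value dominates the histogram, which is one reading of the sluggishness described above.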
Correlograms – Frequency dependent interaural time differences
2-D (ITD & frequency) map encodes source location
Brain decodes these maps to source locations
ITD → lateral position of source
Frequency dependence → source elevation
[Figure: correlograms with axes Frequency and Delay (ITD). Stereo speaker pair, center panning (anechoic conditions).]
Procedure
• For a given head model …
• Compute the reference correlograms for all possible sound source directions
• Specify the multi-channel reproduction system, the influence of the room, and
the signals driving each speaker (for whatever method you choose)
• Compute the resulting correlogram
• Project the computed correlogram onto the reference set to infer the direction
• One may infer a superposition of source directions
• Specific methods
• Decompose into spherical harmonics (orthogonality helps)
• Error minimization
• Machine learning
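The procedure can be sketched with the error minimization option. To keep the example short it makes strong assumptions: the head model is a pure interaural delay, there is a single broadband band (so each "correlogram" reduces to one cross-correlation function), and there is no room. All names and constants below are hypothetical.

```python
import math
import random

FS = 48000       # sample rate in Hz (assumed)
C = 343.0        # speed of sound in m/s (assumed)
D = 0.0875       # head dimension d in meters (assumed value)
MAX_LAG = 24     # lags searched, in samples

def binaural(source, theta_deg):
    """Toy head model (assumption): the ears differ only by an integer delay."""
    lag = round(1.5 * D / C * math.sin(math.radians(theta_deg)) * FS)
    k = abs(lag)
    delayed = [0.0] * k + source[:len(source) - k]
    # Positive angle = source on the right, so the left ear hears it later.
    return (delayed, source) if lag >= 0 else (source, delayed)

def correlogram(left, right):
    """Normalized cross-correlation over lags -MAX_LAG..MAX_LAG (one band)."""
    n = len(left)
    norm = math.sqrt(sum(x * x for x in left) * sum(y * y for y in right))
    values = []
    for lag in range(-MAX_LAG, MAX_LAG + 1):
        acc = 0.0
        for i in range(max(0, -lag), min(n, n - lag)):
            acc += left[i] * right[i + lag]
        values.append(acc / norm)
    return values

def reference_set(directions, probe):
    """Reference correlogram for every candidate source direction."""
    return {th: correlogram(*binaural(probe, th)) for th in directions}

def infer_direction(measured, references):
    """Pick the reference direction with the smallest squared error."""
    def error(th):
        return sum((m - r) ** 2 for m, r in zip(measured, references[th]))
    return min(references, key=error)

rng = random.Random(0)
probe = [rng.gauss(0.0, 1.0) for _ in range(2000)]
references = reference_set(range(-75, 76, 15), probe)

programme = [rng.gauss(0.0, 1.0) for _ in range(2000)]   # a different signal
measured = correlogram(*binaural(programme, 30))
estimate = infer_direction(measured, references)
```

A fuller version would compute one cross-correlation per frequency band and include the loudspeaker signals and room response before the head model, as the list above describes.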

So how does the method work? … assessing the effect of reverberation

Aula Carolina (Aachen)
Reverberation broadens the source image
[Figure: histogram, 250 Trials - Stereo Loudspeakers @ +/- 30 degrees - Delta = 0 (center pan). Series: Reverb, Anechoic. Vertical axis: Number of Trials; horizontal axis: Perceived Incident Angle of Sound Source in Degrees.]
Note: Random nature of nerve impulse stream creates a spread of image width, even in a non-reverberant space
Spatial Blur – experimental measurements
The model reproduces the observed angular acuity.

Spread arises from statistics of neuronal pulses.

Blauert, J., “Spatial Hearing: The Psychophysics of Human Sound Localization”, MIT Press 1983.
Spatial acuity with one ear!
If you don’t believe the cross-correlation model, look at this!

Blauert, J., “Spatial Hearing: The Psychophysics of Human Sound Localization”, MIT Press 1983.
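Modeling Stereo Reproduction

Frequency dependence of head diffraction

$$R_{\mathrm{tot}}(t,\tau) = R_s(t,\tau) + f^2(\omega)\,R_s(t,\tau) + f(\omega)\,R_0\,\left[\delta(\tau+\tau_d) + \delta(\tau-\tau_d)\right]$$

$\tau_d$ = left-right ear delay

$R_s(t,\tau)$ is the cross-correlation of Sl and Sr

[Figure: stereo speakers Sl and Sr in front of a listener with ears L and R, dimension d]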
Stereo Sweet Spot calculation
• Compute peak of distribution of ITD’s for a real source at the intended location
• Compute peak of distribution of ITD’s for the stereo rendered intended source
• Infer the apparent source direction from peak of ITD distribution
• This example is for coherent sources – the formalism also can be used with partially coherent sources, i.e., real signals in reverberant spaces.
[Figure: plan view with axes x and y and origin (0,0). L Speaker and R Speaker, the Intended and Apparent source positions, and a listener at (x0,y0) at distances Dl and Dr from the two speakers.]
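The three steps can be sketched for the coherent case under heavy simplifications that are not from the slides: the ears are two free-field points with no head diffraction, both speakers carry the same white signal so the interaural cross-correlation is a set of weighted peaks (one per pair of paths), and levels fall off as 1/distance. The geometry, names and constants are hypothetical.

```python
import math
from collections import defaultdict

C = 343.0          # speed of sound in m/s (assumed)
EAR_SEP = 0.175    # ear spacing in meters (assumed); ears are free-field points
FS = 96000         # lag resolution in Hz (assumed)

def itd_peak(sources, x0, y0):
    """ITD in seconds (positive = right ear leads) of the largest peak.

    sources: list of (x, y, gain), all driven by the same white signal.
    The listener at (x0, y0) faces +y, so the left ear is at smaller x.
    """
    left = (x0 - EAR_SEP / 2, y0)
    right = (x0 + EAR_SEP / 2, y0)
    max_lag = EAR_SEP / C * (1 + 1e-9)        # largest physically possible ITD
    peaks = defaultdict(float)
    for xa, ya, ga in sources:                # path into the left ear
        da = math.hypot(xa - left[0], ya - left[1])
        for xb, yb, gb in sources:            # path into the right ear
            db = math.hypot(xb - right[0], yb - right[1])
            lag = (da - db) / C               # left arrival minus right arrival
            if abs(lag) <= max_lag:
                peaks[round(lag * FS)] += (ga / da) * (gb / db)
    return max(peaks, key=peaks.get) / FS

def apparent_angle(sources, x0, y0):
    """Direction in degrees (positive = to the right) implied by the ITD peak."""
    s = itd_peak(sources, x0, y0) * C / EAR_SEP
    return math.degrees(math.asin(max(-1.0, min(1.0, s))))

speakers = [(-1.0, 1.732, 1.0), (1.0, 1.732, 1.0)]   # center-panned stereo pair
intended = [(0.0, 2.0, 1.0)]                          # real source, straight ahead

centered_stereo = apparent_angle(speakers, 0.0, 0.0)
offset_stereo = apparent_angle(speakers, 0.3, 0.0)
offset_real = apparent_angle(intended, 0.3, 0.0)
```

In this toy geometry a listener who moves 0.3 m to the right hears the real source slightly to the left, while the stereo image is pulled toward the nearer right speaker. Repeating the calculation over a grid of (x0, y0) positions would map out a sweet spot.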


Main Points
• Integrated a quantitative neurological model into a spatial audio analysis tool
• Randomness of auditory nerve firing events is important
• Predicts measured angular acuity
• Two time scales are in play
• Short ( ~ 2 msec) window for cross correlation in brainstem
• Longer ( ~ 100 msec) histogram “memory” (higher level processing)
• We can predict what a listener will tell you they hear
• Location and spread of sound source
• There’s a lot left to do …
• Integrate with room modeling software for a complete analysis package
• Create synthesis tools – find the designs and algorithms that best reproduce a desired spatial
sound effect
• Continue to refine auditory models
• Distance cues
END

Cochlea
Cross-correlation (similarity of two signals)
[x1 x2 x3] [x1 x2 x3] [x1 x2 x3] [x1 x2 x3] [x1 x2 x3]
[y1 y2 y3] [y1 y2 y3] [y1 y2 y3] [y1 y2 y3] [y1 y2 y3]
Lag -2 -1 0 1 2

[Figure: two random sequences with Delay = 0 and their cross-correlation]
[Figure: two random sequences with Delay = 30 samples and their cross-correlation]
Signals are correlated but delayed

Uncorrelated signals
[Figure: two random sequences and their cross-correlation]
No dominant peak in cross-correlation
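The two cases on this slide can be reproduced with a short normalized cross-correlation. The sequence length of 200 samples and the 30-sample delay follow the figure labels; the function name, the random seed and the normalization by signal energy are assumptions.

```python
import math
import random

def normalized_xcorr(x, y):
    """Return {lag: r} with r = sum_n x[n] * y[n + lag] / sqrt(Ex * Ey)."""
    n = len(x)
    norm = math.sqrt(sum(a * a for a in x) * sum(b * b for b in y))
    result = {}
    for lag in range(-(n - 1), n):
        acc = 0.0
        for i in range(max(0, -lag), min(n, n - lag)):
            acc += x[i] * y[i + lag]
        result[lag] = acc / norm
    return result

rng = random.Random(2)
x = [rng.gauss(0.0, 1.0) for _ in range(200)]

# Case 1: y is x delayed by 30 samples (correlated but delayed).
delayed = [0.0] * 30 + x[:-30]
r_delayed = normalized_xcorr(x, delayed)
peak_lag = max(r_delayed, key=r_delayed.get)

# Case 2: y is an independent random sequence (uncorrelated).
other = [rng.gauss(0.0, 1.0) for _ in range(200)]
r_uncorrelated = normalized_xcorr(x, other)
largest_uncorrelated = max(abs(v) for v in r_uncorrelated.values())
```

The delayed pair gives one tall peak at the delay, while the independent pair gives only small values scattered over all lags.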


Precedence effect
• Law of the first wave-front …
• Direction is inferred from 1st wave-front (up to about 30-40 msec)
• Haas effect – short delays enhance “spaciousness”

Demonstrations: 0 – 2 msec delay (in 20 steps), 0 – 40 msec delay (in 20 steps), 0 – 200 msec delay (in 20 steps)

Explained by saturation and recovery time of hair cell response.


Directional impulse responses
Track both the time of arrival and the direction of each room reflection
(Matlab Demo: Imp_Resp_w_Angle_3.m)
[Figure: Directional Impulse Response, a 3-D plot with axes x, y and z]
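A minimal sketch of a directional impulse response, not a port of the Matlab demo: it uses the direct path plus the six first-order image sources of a rectangular room, and records arrival time, direction and amplitude for each. Perfectly reflecting walls, 1/distance amplitudes and the room dimensions are assumptions, and the function name is hypothetical.

```python
import math

C = 343.0   # speed of sound in m/s (assumed)

def directional_ir(source, listener, room):
    """Direct sound plus first-order reflections in a shoebox room.

    source, listener: (x, y, z) in meters; room: (Lx, Ly, Lz) with walls at
    0 and L on each axis. Returns a time-sorted list of
    (arrival_time_sec, azimuth_deg, elevation_deg, amplitude).
    """
    images = [tuple(source)]
    for axis in range(3):
        for wall in (0.0, room[axis]):
            img = list(source)
            img[axis] = 2 * wall - source[axis]   # mirror the source in the wall
            images.append(tuple(img))
    taps = []
    for img in images:
        dx, dy, dz = (img[k] - listener[k] for k in range(3))
        dist = math.sqrt(dx * dx + dy * dy + dz * dz)
        azimuth = math.degrees(math.atan2(dy, dx))
        elevation = math.degrees(math.asin(dz / dist))
        taps.append((dist / C, azimuth, elevation, 1.0 / dist))
    return sorted(taps)

room = (5.0, 4.0, 3.0)
taps = directional_ir(source=(1.0, 2.0, 1.5), listener=(4.0, 2.0, 1.5), room=room)
direct = taps[0]                          # the direct path always arrives first
floor_bounce = [t for t in taps if t[2] < -1.0]   # arrivals from below
```

Each tuple is one arrival, so the result can feed a head model that needs to know which direction every reflection comes from.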
