LOW FREQUENCY SPATIALIZATION IN ELECTRO-
ACOUSTIC MUSIC AND PERFORMANCE:
COMPOSITION MEETS PERCEPTION
Roger T. Dean
austraLYSIS, Sydney, Australia
MARCS Institute, University of Western Sydney, Sydney, Australia
[Link]@[Link]
The article takes the perspectives of an electro-acoustic musician and an auditory psychologist to consider detection of
localization and movement of low frequency sounds in reverberant performance environments. The considerable literature
on low frequency localization perception in free field, non-reverberant environments is contrasted with the sparser work on
reverberant spaces. A difference of opinion about reverberant environments has developed between on the one hand, audio
engineers and many musicians (broadly believing that low frequency localization capacities are essentially negligible), and
on the other, psychoacousticians (broadly believing those capacities are limited but significant). An exploratory auditory
psychology experiment is presented which supports the view that detection of both localization and movement in low
frequency sounds in ecological performance studio conditions is good. This supports the growing enthusiasm of electro-
acoustic musicians for sound performance using several sub-woofers.
INTRODUCTION

Psychoacoustics now generally considers that there are three main mechanisms which allow humans to localize sounds, and detect movements of sonic sources (reviewed: [1-3]). The first two constitute the 'duplex' of inter-aural time differences (ITD) and inter-aural intensity differences (IID) distinguishing a single sound as heard at the two ears (see below for elaboration). The duplex theory was developed by Strutt (Lord Rayleigh) in the late 19th and early 20th century [4], following early observations on localizing the sound of a flute by Venturi in the 18th century. These two mechanisms are binaural: that is, they require interaction between the auditory pathways flowing from both ears, and may exploit neural coincidence detection [5]. The third mechanism is monaural: the so-called spectral notch effect, whereby head-related transformations of the spectrum of an incoming sound, due mainly to the pinnae of the ears, provide location cues. All the mechanisms are influenced by head size and related anatomical features. The IID mechanism is largely restricted to frequencies above about 1000 Hz; the monaural mechanism operates in a fairly similar range. Only the ITD mechanism is usefully functional at lower frequencies.

Most psychoacoustic studies of sound localization have used free field conditions, that is, conditions in which reflections and reverberation are virtually lacking, such as in an anechoic chamber or virtual environments in headphones. Those with primary involvement in auditory perception might note that this usage of the term free field is that of acoustics, and may not coincide with their normal usage.

None of these mechanisms of human sound localization is excellent when faced with low frequency sounds, and relatively few data specifically address frequencies below 250 Hz. Nevertheless, as I will describe, several fields of music exploit such very low frequencies substantially: for example music with or comprising drones, noise music, and electroacoustic music at large. For this oeuvre, 250 Hz is hardly conceived as 'low frequency', but rather as mid-frequency. Furthermore, musical uses of frequencies below 250 Hz require presentation in reverberant environments, such as the performance studio, dance floor, or concert hall, and in moderately reverberant environments such as recording studios; conditions which contrast with those of the vast majority of psychoacoustic studies.

It is worth pointing out the distinction between perceiving localization and lateralization. Biologically speaking, the important feature of a sound is the location of its source. However, in audio engineering, and in electro-acoustic music in particular, more important is the degree of movement of the sounds and their components, expressed as their changing lateralization, that is, their apparent positional spread in the listening space, whether virtual (headphones) or physical. One experimental approach to making this distinction is to have listeners represent the centre of a sound source on a one-dimensional scale from L to R [6]; another is to have them draw a graphical range of spread. Note the implication here that listeners may be perfectly aware of the disposition of the loudspeakers, yet perceive sounds as disposed almost anywhere in the listening space.

Thus the purpose of this article is to discuss low frequency perception in reverberant environments of ecological relevance to music, as an aspect of applied acoustics of particular importance to music composition, production and performance. I point to some of the gaps in the literature in relation to these applications, suggest and illustrate some useful analytical and experimental approaches, and also seek to provide pointers
102 - Vol. 42, No.2, August 2014 Acoustics Australia
that may eventually be useful to electroacoustic composers and improvisers in constructing the spectral design of their music.

A SYNOPTIC REVIEW OF LOW FREQUENCY LATERALIZATION AND ITS IMPACTS

Sound spatialization in music performance and recording environments

The sound of an orchestra is spatialized as a consequence of the disposition of the instruments in the performing space. For example, the low pitched instruments such as tuba (brass) and double bass (strings) are usually to the back, and at one or other side of the layout, while the high pitched flutes and oboes (woodwind) and violins (strings) are to the centre and further forwards, in relation to the position occupied by the conductor. Prior to the era of orchestral music, there were also notable compositional experiments with spatial dialogues between groups of instruments, as in the works of Gabrieli for brass chorale. Subsequent to the 19th/20th century dominance of the orchestra in western music, electroacoustic music since 1950 has enlarged this emphasis on spatialization to extremes, where hundreds of loudspeakers may be arrayed around a performing/listening space in a 3D organization, so that sounds can be projected from any point, or throughout, and can be 'moved' around [7, 8]. Composers have also sometimes had the opportunity to create grand architectural spaces (some temporary, some mobile) for such performances, for example Stockhausen and Xenakis [9], and in Australia, Worrall [10]. To electroacoustic music, even if presented in relatively humble stereo or quad, timbral flux around the listening space is a key feature.

Thus it is a matter of concern that we lack comprehensive data and mechanistic understanding of sound localization in some parts of the frequency spectrum: notably low frequencies. This is so much the case that, particularly amongst studio and sound projection audio engineers, there is a repeatedly argued view that low frequencies are poorly localized. The argument consequently suggests there is little point in having multiple subwoofers (the specialized speakers which most fully represent frequencies below about 200 Hz) in a performance space, unless the purpose is solely enhancing the loudness of those low frequencies. Furthermore, it is relevant that electroacoustic projection does not provide such detailed visual cues to sound lateralization as an orchestra provides, so the difficulty with low frequencies is not reduced by such cues. In neatly presenting the somewhat opposed views of psychoacousticians and audio engineers, a spatialization team from the University of Derby [11] present a concise review discussion. They conclude that on balance it is to be expected that in most environments low frequency localization is much better than the audio engineer community admits, yet concede there are major gaps in the data and our understanding in relation to ecologically relevant musical environments, which are mostly reverberant.

Electroacoustic composers have continued apace to exploit sound spatialization, relying mainly on their compositional intuitions and perceptual impressions as to what is or is not audible. Empirical evidence suggests that perception of low frequency lateralization may be masked to some degree by the presence of broad band energy at higher frequencies [12]. But other work emphasizes the positive impact of low frequency sound on auditory image size, and the additional benefit of stereo (or multiple) subwoofer conditions for perception of overall image size [13, 14]. Stereo subwoofers were thought to be distinct from mono pairs, with a limited number of popular music tunes, but the subjective preference experiments which followed were not statistically conclusive [15, 16]. Nevertheless, multispeaker systems, vector based amplitude panning and ambisonics have been used in attempts to enlarge the 3D impact of sound projections, and to foster impressions of envelopment [17] and engulfment, the latter seemingly related to the degree to which 3D impressions do superimpose on 2D [18, 19]. The multi-speaker system at ZKM Karlsruhe, neatly called Klangdom, has been developed alongside corresponding sound diffusion software, called Zirkonium [8]. This allows a composer or sound designer to specify the location of a sound, as a combination of outputs from three adjacent speakers, and allows a compositional dissociation between audio track and speaker. In contrast, virtually all other software assumes that an individual audio track is sent to 1 or 2 speakers (according to whether the track is mono or stereo); some software, such as Ardour, MAXMSP and ProTools, does permit a single audio stream to be sent to any combination of speakers, but does not allow specification in terms of localization.

The Zirkonium system, and a few others, can also represent well one of the most notable developments in spatializing electroacoustic music: what Canadian composer Robert Normandeau calls timbral spatialization [8]. This is effectively the systematic presentation of different frequency bands from different parts of the performance space, and their movement around the space. Convenient and accessible software, Kenaxis, based on MAXMSP, allows easy approaches to this technique even in live performance.

Such developments in electroacoustic music further emphasize the importance of understanding our localization, lateralization and movement detection abilities in relation to every band of the audible frequency spectrum.

The importance of low frequencies in electroacoustic and other music: relations to environmental and speech sounds

I mention here a few aspects of the increasing but long-standing musical importance of low frequency sound. Certain ritualistic musics, such as some of Tibet, and of the Australian didgeridoo, use repeated low frequency timbres, in a manner akin to the drone in Indian music (which is actually a wide band frequency pattern with strong bass), and to the specialized forms of shamanic [20] or trance music which use drones. A drone in this music is a long-sustained low band sound, often changing very slowly or not at all.

The evolution of popular music through jazz and rock, via amplification, has resulted since the 1960s in much higher sound levels in the bass instruments: for example, the acoustic contrabass has often been heavily amplified or replaced by the electric bass or by bass motives played on a synthesizer. The bass 'riff', or accompanying repetitive rhythmic-melodic form of much music in rock and jazz, has consequently been
able to take a much more foregrounded position in the overall sound mass, a position which has been enhanced by mixing technologies. A similar but less obvious phenomenon took place in certain forms of contemporary classical instrumental and electroacoustic music and in dance electronica, where more visceral sensations [21] were created than generally sought before (with notable exceptions such as the Baroque composer Marais). Thus it is interesting to compare Xenakis' influential orchestral piece Metastaseis [22] with, for example, an Elgar symphony in respect of the contribution of low frequencies to the overall acoustic intensity: there are sections in Metastaseis in which even the score shows clearly the dominant sweep of instrumental glissandi (pitch slides) in the lower string instruments, something rare in previous music. Similarly, Xenakis's electroacoustic piece Bohor, a work of considerable power, acoustically and affectively, has sections in which transitions in the low frequency intensities are the dominant feature, requiring high quality reproduction for full appreciation.

Electroacoustic music points to the possible relation between low frequency timbres in music, speech and environmental sound. Speech (and voice) is relevant because it is a common component of electroacoustic music, often heard both raw and digitally transformed [23-25]. One interesting aspect of this is that across speakers and conditions, the median frequency of the speech F0 (the fundamental pitch) is around 130 and 220 Hz for men and women respectively [26, 27], well below the range of most psychoacoustic experiments on other sounds, and well into the 'low range' in any conception. Environmental sound is relevant partly because of the importance of overt environmental sounds in soundscape and other aspects of electroacoustic composition, but also because some of the physical associations of low frequency sounds in the environment (with large mass, low position, slow movement) can provide important metaphorical cues in the music.

Musical tension, often created by controlling the degree to which expectation is fulfilled [28], can then exploit countermanding cues, such as rapid movement of some sonic objects at the same time as a lack of movement of low frequency objects. Conversely, electroacoustic music, and noise music in particular, often focuses on movement of low frequency sounds, again raising fundamental questions for the psychoacoustics of reverberant environments. Noise music is a large genre, springing from the work of Xenakis but extending from classical acousmatic composition to underground rock (and the edges of techno and drum 'n' bass). In the core of noise music, high intensity sounds of at most slowly varying complex timbres (high spectral flatness, poorly defined spectral centroid) are used. The timbres often start close to white (or sometimes pink) noise, and are sculpted slowly. Whereas in most previous music, melodic or more generally motivic structure has most often been delivered largely in the high frequency bands, this is no longer true in noise music, nor in many other aspects of electroacoustic music [29, 30].

All these observations on the current usage of low frequencies in music emphasise that understanding the perception of spatialization and movement of timbres comprising frequencies below 250 Hz is needed for composers to exploit them most fully and powerfully. Hence this is a worthwhile topic in applied acoustics, but as yet it has not attracted the attention it deserves. I turn next to a brief summary of what is known about this, with particular reference to environments which are ecologically apt for music: in other words, reverberant rather than free field (non-reflective) environments.

Some relevant psychology and psychoacoustics: perception of low frequencies in reverberant environments

It is worthwhile to ask what psychoacousticians treat as 'low frequencies' and why, and to contrast that with even conventional compositional perspectives, let alone electroacoustic ones. There is a fairly clear lower limit to the frequencies at which the IID aspect of the duplex theory provides significant information: around 1000 Hz. It seems this may have driven psychoacousticians (and perhaps acousticians and audio engineers) to treat frequencies below 1000 Hz as 'low', and hence rarely to venture below 250 Hz (see for example [31, 32]). In contrast, the frequency at the centre of a piano keyboard is the note called 'middle C', and it has a fundamental frequency of only about 260 Hz. This note also appears right in the middle of the two staves which notate two-part tonal music. So for a classical composer, using acoustic instruments and notation to make 'pitch-based' music, low pitches are those at least an octave below middle C, in other words below about 130 Hz: around the pitch referenced as the common male speech fundamental. Electroacoustic composers are often influenced by perceptions of, and maybe experience in playing, acoustic instruments, so they share the conception of low pitch being around 100 Hz, even when they make 'sound-based' music, in which pitch may not be apparent and is certainly of limited importance [33]. One of the clearest and fullest studies in support of the idea that localization is possible at such low pitches [6], though dealing with headphones rather than reverberant spaces, indicates sensitivity with narrow band sounds down to 31.5 Hz, but with quite lengthy static stimuli (800 ms).

I provide some pointers to frame our further comments on perception of sonic movement, and perception of both location and movement in reverberant environments. The above-mentioned early and influential experiments of Lord Rayleigh 'on the lawn' (and sometimes with the participation of 'Lady Rayleigh') involved a tuning fork of low frequency (128 Hz), and did also involve speech. They led to the duplex theory, and it is interesting that in the earliest papers (e.g. 1876) Rayleigh considered the tuning fork a 'pure' sound, but by 1907, when he reviewed [4] his overall work in this area of sound localization, he emphasized that it is a more complex sound, and brings to bear mechanisms at many frequencies. This timbre has been studied in detail subsequently, usually consisting of at least three harmonic components and several side-components, over a wide frequency range [34]. The duplex theory has largely survived empirical testing, as summarized in the two recent general review articles I reference [2, 3]. In depth discussion is provided in some of the empirical studies such as those of Middlebrooks and collaborators. They summarise the situation as follows: 'the duplex theory does serve as a useful description (if not a principled explanation) for the relative potency of ITD and ILD [which is what I term IID] cues in low- and high-frequency regimes' (p. 2233) [35].
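The relative serviceability of the ITD cue at low frequencies can be illustrated numerically. The sketch below uses the standard Woodworth spherical-head approximation of ITD; the head radius and speed of sound are typical assumed values, not figures from this article:

```python
import math

def woodworth_itd(azimuth_rad, head_radius_m=0.0875, c_m_per_s=343.0):
    """Woodworth spherical-head approximation of the interaural time
    difference (ITD) for a distant source at the given azimuth
    (0 = straight ahead, pi/2 = directly to one side)."""
    return (head_radius_m / c_m_per_s) * (azimuth_rad + math.sin(azimuth_rad))

# Largest ITD, for a source directly to one side: about 0.66 ms.
itd_max = woodworth_itd(math.pi / 2)

# Express that delay as a fraction of one waveform cycle. Below roughly
# half a cycle, the interaural phase relation is unambiguous.
fraction_60hz = itd_max * 60.0      # ~0.04 cycle at 60 Hz: unambiguous
fraction_1khz = itd_max * 1000.0    # ~0.66 cycle at 1 kHz: near ambiguity
```

On these assumed values the maximum ITD is only a few percent of a 60 Hz cycle, but more than half a cycle at 1 kHz, which is consistent with ITD being the binaural cue that remains useful at low frequencies.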
The main factors which create ITD and IID are clear: the geometry of the head in relation to the sound source. What is perhaps less obvious is why frequency should impact on the potency of the ITD and IID cues: but it seems that the diffusion of energy around the head and ears is such as to annul most low frequency IID cues, whereas the ITD remains. A given time delay represents a smaller proportion of the long cycle times of low frequencies, and this may provide more discrimination than with high frequencies. Monaural localization depends primarily on the influence of the pinnae of the outer ear on the sound transmission, creating transformed power spectra often with notches in mid to high frequencies, which can provide location cues that can be learnt even with only one ear (see [3, 5]). Neural pathways and possible mechanisms for localization have been investigated [5, 36] and computationally modelled [37].

All three localization mechanisms can be influenced by head movement, which consequently is an advantageous feature of listener behaviour [38], perhaps particularly for low frequencies [39]. Many experiments have restricted head movement, so as to control this influence. Of course, audio engineers, music creators and listeners are used to taking advantage of it, and hence experiments of ecological relevance to them do not restrict it (as in the exploratory experiment below). In favourable free-field conditions, the minimum audible angle (MAA) for localization is about 1° for broad band noise, and the minimum audible movement angle (MAMA) is much higher [2]. Data suggest that for a 500 Hz tone the MAMA is about 8° at a velocity of 90 degrees per second, and >20° at 360 degrees per second [40]. There is also evidence that a minimum stimulus duration of between 150 and 300 ms may be required for motion detection [41].

There are also possible non-auditory components of low frequency localization, involving the vestibular system, or vibrotactile information, perhaps registered on the face, nose or other body parts [21, 42]. Of ecological relevance here is that many noise artists, used to very high sound intensities, have learnt to conserve their hearing by the use of ear plugs: for them, such non-auditory components of input sounds may be of even greater importance. The author, like many in electroacoustic music or electronic dance music, has experienced disconcertingly extreme intensity sound in some underground sound clubs, and used earplugs both there and when performing noise music. At some venues, earplugs are always given out at the entry. On the lower scale of acoustic intensity, orchestral woodwind players, usually seated just in front of the brass section, commonly have protective screens behind them in rehearsal and performance: unlike the screens used in recording studios, their purpose is simply reducing the sound impact on the woodwind players. The issue of earplugs and sonic localization is worthy of in depth experimental study.

Finally, I briefly summarize some of the known impacts of reverberation (i.e. rooms, or some partially sequestered spaces such as valleys). Many of the relevant studies use simulated reverberation, and have not been fully corroborated in experiments in real environments. The importance of reverberation diminishes when listeners are less than about 1 metre from the sound source, or when they are substantially eccentric to the speakers and/or the space. While speech perception is more difficult in reverberant environments, in some respects musical listening may be enhanced, and there is an important industry concerned with the design of environments and architectures for events, concerts, and domestic and studio listening and living (e.g. [43]). Reverberation facilitates distance judgements concerning sound sources [44], but it generally diminishes other aspects of localization discrimination (reviewed [45, 46]). An important ongoing series of studies on localization of sound in rooms is being undertaken by Hartmann (e.g. [47, 48]). From this series, one salient observation is that the utility of the ITD declines below 800 Hz, though few data exist concerning frequencies below 200 Hz: sensitivity (in terms of ITD threshold time) is about twice as good at 800 Hz as at 200 Hz [49]. Some ITD-related theories of localization fail in this low frequency range [50]. Low frequency IIDs are even further reduced in reverberant environments; nevertheless, low frequencies may still be highly weighted in resultant localization judgements [46, 51].

Two illusions are particularly relevant to low frequencies and to reverberant environments. Precedence effects occur when two sound sources are separated in time, and determine whether the sounds undergo fusion: the opening sound may dominate the localization, preventing the realistic fission between the two. The duration of the sound is important in this, and in free field (particularly anechoic) environments the crucial transitions generally occur between 1 and 20 ms. However, this is much less clear for low frequency sound and for reverberant environments (see review [52]). The Franssen illusion is related: in this, even a gradual transition of acoustic energy from one loudspeaker to another can be missed, and localization be determined by the opening onset location [2, 3, 53].

EXPERIMENT

An exploratory experiment on low frequency localization and movement detection in reverberant environments

Bearing the preceding discussion in mind, I conducted an exploratory experiment on this topic. It investigated whether and how fast a low frequency sound can be perceived to move in a performance studio environment (with quite short reverberation time). It complemented this with measures of location of sounds, using a four-alternative forced choice approach. I hypothesized that at frequencies below 200 Hz, localization to L or R of a listener would remain feasible in a reverberant environment, even for filtered noise. It was expected that accuracy would be higher as the duration of the sound increased (0.2 s < 2 s < 6 s), the latter two durations chosen to correspond to feasible compositional durations for spatial movement. Similarly, I hypothesized successful detection of sonic motion at the latter two durations, but poor or non-existent detection at 0.2 s. Note that the forced choice design does not aim to distinguish between localization and lateralization, as described above.

Our experimental procedure and studio environment are described more fully in the Appendix, and the legend to Figure 1, but the essence was as follows. I used two Genelec
7060B subwoofers as sound sources, each at 45° to, equidistant from, and at the same horizontal level as our seated listeners, who were musically trained people. There were five participants. Sounds were of three kinds, each low-pass filtered to reduce the presence of higher frequencies, and synthesized within MAXMSP software: sine tones of 30 and 60 Hz, and white noise. These were presented for 0.2, 2 or 6 s, with 10 ms on/off ramps, and with the SPL (dBA) at the listeners' head set unequally on the basis of readily acceptable loudness for the three sounds. During presentation the sound originated
Table 1. Mixed effects analysis of accuracy, across all five participants.

Optimised model: accuracy ~ duration * location + duration * sound + trial + (1 | participant). This means that duration, location, sound and the interactions duration*location and duration*sound are the fixed effect predictors, and random effects for the intercepts by participant are required.

Sequential ANOVA of the model:

  Predictor                           DF   Sum of Squares   Mean Sum of Squares (by DF),
                                                            which is also the F-value
  Duration (seconds)                   1   218.51           218.51
  Location/Movement                    3   221.55            73.85
  Sound                                2    25.16            12.58
  Trial number (centred, rescaled)     1     3.11             3.11
  Duration x Location                  3    54.61            18.20
  Duration x Sound                     2    36.37            18.18

Random effects parameter, and confidence intervals for the fixed effect coefficients:

  Random effects                      S.D.
  Intercept by participant            0.57

  Fixed effect coefficients           2.5% Confidence Interval   97.5% Confidence Interval
  Intercept                            2.41                       4.36
  Duration                            -0.10                      +0.34
  Location: moving LR                 -6.16                      -4.41
  Location: moving RL                 -5.59                      -3.85
  Location: R                         -1.70                      +0.39
  Sound: 30 Hz sine                    0.87                       1.73
  Sound: 60 Hz sine                    1.40                       2.27
  Trial number                         0.0004                     0.15
  Duration x Location (moving LR)      0.51                       0.95
  Duration x Location (moving RL)      0.42                       0.85
  Duration x Location (R)             -0.10                      +0.44
  Duration x Sound (30 Hz)            -0.47                      -0.15
  Duration x Sound (60 Hz)            -0.64                      -0.32
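Because the glmer model is binomial, the coefficients and confidence intervals in Table 1 are on the log-odds (logit) scale. A minimal sketch of how to read them back (the two intervals are copied from Table 1; the inverse-logit and exponential mappings themselves are standard, not specific to this analysis):

```python
import math

def inv_logit(x):
    """Convert a log-odds value to a probability."""
    return 1.0 / (1.0 + math.exp(-x))

# A fixed-effect CI that includes 0 on the logit scale corresponds to an
# odds-ratio CI that includes 1, i.e. no clear effect: here, Duration.
duration_ci = (-0.10, 0.34)                    # from Table 1
duration_odds = tuple(math.exp(x) for x in duration_ci)

# A CI lying well below 0 marks a strong negative effect: moving L->R
# stimuli were identified far less accurately than the base (static L).
moving_lr_ci = (-6.16, -4.41)                  # from Table 1
moving_lr_odds = tuple(math.exp(x) for x in moving_lr_ci)
```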
either solely in the Left (L) or Right (R) speaker, or ‘moved’ moving sounds, increasing with their duration. It also shows
with a linear constant power transit L->R or R->L over the that participants had difficulty in locating the very short sounds,
whole sound duration. Listeners were required after the end of and this was almost entirely due to the short moving sounds
the sound (and not before) to indicate as quickly as possible (the interaction shown in the figure as Moves/0.2 seconds). The
which of the four categories of location/movement event difference in performance for the different sounds (which were
they had just heard. There were in total 3(duration) x 3(sound in any case intended as roving stimuli rather than maximally
source) x 4 (location/movement) = 36 different stimuli. These controlled, and so not shown) was very small.
were presented in randomized order, each stimulus 8 times The 0.2 s movement sounds had movement rates of
in each of two blocks, with the listener requested to take a 450 degrees/sond, beyond the rate at which MAMAs are
break between the two. Thus each participant responded to optimal in free field conditions (discussed above). The inability
576 stimuli. This achieved the commonly used ‘roving’ of to judge movement in these stimuli was thus expected.
sound sources in terms of intensity and frequency, intended to Correspondingly, Table 2 shows an aggregated contingency
minimize listeners recognition of specific colouring attached to table for the responses in relation to the stimulus location/
individual speakers or positions in the space. movement. It shows that the moving sounds created confusion,
A mixed effects analysis of the data (using glmer from the where generally the starting point of the movement was taken
lme4 package in R since the data are binomial) to model the to be its static location when the participant failed to recognize
accuracy of responses is summarized in Table 1. There were the movement. This effect is similar to the Franssen illusion
random effects for the intercept by participant; accounting already described.
for this reduces the likelihood of Type 1 errors in the analysis
(e.g. [54]). The sequential ANOVA of the glmer model Response Accuracy by Stimulus Category
suggests that the main explanatory power is provided by
duration, sound location/movement and their interaction;
80
though it is important to note that the exact values in such a
sequential ANOVA depend somewhat on the order in which 60
Accuracy, %
the parameters are entered. The confidence intervals (which
40
also reveal the mean coefficient, as the centre of the range)
20
show that the effect of Duration is largely carried in the
interaction with location (Duration itself is not effective). The
0
moving sounds are both much worse identified than the static ones, which are not different from each other (as shown by the fact that the CI on the coefficient for R, which is referenced against L as base, breaches zero). The two sine tones are both better located than the noise sound (which is the base level, not shown in the table), though this modest effect is reduced as the sound duration increases; but note that they were not matched in dB at the listening position, as I did not seek to understand the influence of timbre. There is a small improvement as the experiment proceeds (reflected in the positive coefficient on Trial number, which was centred and rescaled before analysis). The modelling was guided by Bayesian Information Criteria values, coefficient significance and parsimony, and optimized by likelihood ratio tests. The optimized model had a BIC of 1814.74. Confidence intervals for the fixed effects parameters were determined by lme4's 'confint' function, using the 'profile' technique; the more approximate 'Wald' technique gave very similar values. The random effects are modelled as a distribution whose standard deviation is measured, but are not of primary interest here. The fixed effects are expressed as coefficients and confidence intervals (which are here symmetrical), and where a predictor has several categories (sound, location) or distinct continuous values (duration), the coefficients are the difference from the 'base' level, which is the level which does not appear in the table. The model was worsened by treating the location as comprising fixed vs moving, hence this approach is not shown.
Figure 1 summarizes the salient comparisons, as judged by the mixed effects analysis, and shows that for the static sounds accuracy was extremely high, but it was much worse for the moving sounds.

[Figure 1: bar graph of percent hit rate by Stimulus Category — Static, Moves, 0.2sec, 2sec, 6sec, and the six Static/Moves x duration combinations (Static/.2s, Static/2s, Static/6s, Moves/.2s, Moves/2s, Moves/6s).]

Figure 1. Summarized accuracies in detection of localization and movement in the various conditions. The graph shows percent hit rate with 95% confidence intervals, based as conventionally on pooling all five participants' data. Thus the categories overlap in what is shown: the combinations Static/Moves, the three durations, and the 6 duration/location interactions each include all the data (2804 responses). All individual results (except Moves/.2s) are highly significant at p < 0.00001 in comparison with the chance rate (25%, shown as a horizontal line). A full analysis by mixed effects modelling is provided above. The text and Appendix describe the conditions in more detail. Confidence intervals were also determined by a more correct statistical meta-analysis of the separate confidence intervals determined on each (independent) participant for the stimulus categories shown; these were a little larger, but confirmed all the conclusions.

Table 2: Contingency table of percentages of responses to each location/movement category of stimulus. Numbers on the diagonal are the correct response percentages.

Location/    Response:
Movement     L       R       L->R    R->L
L            97.5    0.7     1.6     0.1
R            0.4     97.2    1.3     1.1
L->R         26.9    9.6     61.3    2.1
R->L         4.3     25.0    2.9     67.9
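The pooled hit-rate comparison of Figure 1 can be illustrated with a short calculation. The sketch below is not the analysis code used for the study (which was mixed effects modelling in lme4); it simply recomputes per-category accuracy from the diagonal of Table 2 with a normal-approximation 95% confidence interval, and checks it against the 25% chance rate. The equal per-category trial count (2804 pooled responses / 4 categories = 701) is an assumption made for illustration only.

```python
import math

def hit_rate_ci(hits, n, z=1.96):
    """Proportion correct with a normal-approximation 95% confidence interval."""
    p = hits / n
    se = math.sqrt(p * (1 - p) / n)
    return p, max(0.0, p - z * se), min(1.0, p + z * se)

# Diagonal of Table 2 (percent correct per stimulus category); the per-category
# trial count of 701 is an assumption, not a figure stated in the text.
diagonal = {"L": 97.5, "R": 97.2, "L->R": 61.3, "R->L": 67.9}
n_per_category = 2804 // 4
for category, percent in diagonal.items():
    hits = round(n_per_category * percent / 100)
    p, lo, hi = hit_rate_ci(hits, n_per_category)
    print(f"{category}: {p:.3f} [{lo:.3f}, {hi:.3f}] above 25% chance: {lo > 0.25}")
```

Even the worst pooled category here (L->R movement, 61.3% correct) has a lower confidence bound well above the 25% chance line, consistent with the significance levels reported in the Figure 1 caption.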
Acoustics Australia Vol. 42, No. 2, August 2014 - 107
It is important to bear in mind (as detailed in the Appendix) that while the digital signal leaving the MAXMSP synthesis in each case contained very little energy at frequencies above 200 Hz, that generated by the subwoofers contained some energy detectable above the acoustic background of the studio, in some cases at frequencies up to about 600 Hz. Correspondingly, sending high dB sine tones to the speakers produced pitch-discernible audible sound at least up to 1000 Hz. Thus even 'clean' low frequency sounds as presented by these excellent speakers will always contain higher frequencies. The same observations held for larger (much more expensive!) Meyer subwoofers.

CONCLUSIONS
Can a composer use low frequency lateralization and movement, and hope for perceptibility?
I conclude that musically-experienced listeners have good capability in relation to location/lateralization and movement perception of low frequency sounds in our reverberant studio environment, though movement accuracy is much lower than location accuracy, as normally observed. There was worse performance with very short sounds. Our listeners may have learnt much about low frequency listening from their musical experience; nevertheless, their performance improved slightly with trial as the experiment proceeded. On the other hand, I argue that most people learn from the environmental sound around them, and that there is a biological advantage in gaining the ability to localize even low frequency sounds, especially if they are moving. This remains to be more fully tested.
The results support the view that sound projection systems with multiple sub-woofers can add timbral flux and spatial control to composers' armories. It will be interesting to assess the influence of sub-woofers at different elevations [55], in 3D space, in addition to the different azimuths in 2D space studied here. This is especially so given the readily available Max patches for sound diffusion and movement, the panning software VBAP, and specialized facilities like those at ZKM introduced above, and the 22:4 system in our studio. In 3D spatialization, issues of front-back discrimination also come into play (not discussed here). These are generally far more problematic for listeners than lateral location or movement detection [3], and hence will require considerable attention. Interesting in the longer run, both psychoacoustically and musically, are questions concerning the possible competition between musical low frequency and high frequency timbre spatialization. While already in practical use, the perceptual impact of these is little understood. Experimental acoustics and psychoacoustics will clearly have more to contribute to composers and performance space design in this area.

REFERENCES
[1] J. Blauert, Spatial hearing: the psychophysics of human sound localization. MIT Press, Cambridge, MA (1997)
[2] C. Stecker and F. Gallun, "Binaural hearing, sound localization, and spatial hearing", in Translational perspectives in auditory neuroscience: normal aspects of hearing. Plural Publishing, San Diego, CA (2012)
[3] T.R. Letowski and S.T. Letowski, Auditory spatial perception: auditory localization. DTIC Document (2012)
[4] L. Rayleigh, "XII. On our perception of sound direction", The London, Edinburgh, and Dublin Philosophical Magazine and Journal of Science 13, 214-232 (1907)
[5] A.R. Palmer, "Reassessing mechanisms of low-frequency sound localisation", Current Opinion in Neurobiology 14, 457-460 (2004)
[6] M.I.J. Mohamed and D. Cabrera, "Human sensitivity to interaural phase difference for very low frequency sound", in Acoustics 2008 - Australian Acoustical Society Conference, p. [Link].
[7] P. Lennox, Spatialization and Computer Music, in The Oxford Handbook of Computer Music, ed. R.T. Dean, (Oxford University Press, New York, USA, 2009) p. 258.
[8] R. Normandeau, "Timbre Spatialisation: The medium is the space", Organised Sound 14, 277 (2009)
[9] S. Kanach, Music and Architecture by Iannis Xenakis. Pendragon Press, Hillsdale (2008)
[10] D. Worrall, "Space in sound: sound of space", Organised Sound 3, 93-99 (1998)
[11] A.J. Hill, S.P. Lewis, and M.O. Hawksford, "Towards a generalized theory of low-frequency sound source localization", Proceedings of the Institute of Acoustics 34, 138-149 (2012)
[12] P. Laine, H.-M. Lehtonen, V. Pulkki, and T. Raitio, "Detection and lateralization of sinusoidal signals in presence of dichotic pink noise", in Audio Engineering Society Convention 122. Audio Engineering Society (2007)
[13] D. Cabrera, S. Ferguson, and A. Subkey, "Localization and image size effects for low frequency sound", in Audio Engineering Society Convention 118. Audio Engineering Society (2005)
[14] W.L. Martens, "The impact of decorrelated low-frequency reproduction on auditory spatial imagery: Are two subwoofers better than one?", in Audio Engineering Society Conference: 16th International Conference: Spatial Sound Reproduction. Audio Engineering Society (1999)
[15] A.J. Hill and M.O. Hawksford, "On the perceptual advantage of stereo subwoofer systems in live sound reinforcement", in Audio Engineering Society Convention 135. Audio Engineering Society (2013)
[16] T.S. Welti, "Subjective comparison of single channel versus two channel subwoofer reproduction", in Audio Engineering Society Convention 117. Audio Engineering Society (2004)
[17] J. Braasch, W.L. Martens, and W. Woszczyk, "Identification and discrimination of listener envelopment percepts associated with multiple low-frequency signals in multichannel sound reproduction", in Audio Engineering Society Convention 117. Audio Engineering Society (2004)
[18] T. Nakayama, T. Miura, O. Kosaka, M. Okamoto, and T. Shiga, "Subjective assessment of multichannel reproduction", Journal of the Audio Engineering Society 19, 744-751 (1971)
[19] G. Paine, R. Sazdov, and K. Stevens, "Perceptual Investigation into Envelopment, Spatial Clarity, and Engulfment in Reproduced Multi-Channel Audio", in Audio Engineering Society Conference: 31st International Conference: New Directions in High Resolution Audio. Audio Engineering Society (2007)
[20] M. Tucker, Dreaming with open eyes: The shamanic spirit in twentieth century arts and culture. Aquarian, London (1992)
[21] N.P.M. Todd and F.W. Cody, "Vestibular responses to loud dance music: A physiological basis of the "rock and roll threshold"?", The Journal of the Acoustical Society of America 107, 496-500 (2000)
[22] R.T. Dean and F. Bailes, Event and process in the fabric and perception of electroacoustic music, in Proceedings of the International Symposium: Xenakis. The electroacoustic music ([Link]rencontres/intervention11_xenakis_electroacoustique.pdf), ed., (Centre de Documentation Musique Contemporaine, 2013), Intervention11, pp. 12.
[23] H. Smith, The voice in computer music and its relationship to place, identity and community, in The Oxford Handbook of Computer Music, ed. R.T. Dean, (Oxford University Press, New York, 2009) pp. 274-293.
[24] R.T. Dean and F. Bailes, "NoiseSpeech", Performance Research 11, 134-135 (2007)
[25] H.A. Smith and R.T. Dean, "Voicescapes and Sonic Structures in the Creation of Sound Technodrama", Performance Research 8, 112-123 (2003)
[26] M. Södersten, S. Ternström, and M. Bohman, "Loud speech in realistic environmental noise: phonetogram data, perceptual voice quality, subjective ratings, and gender differences in healthy speakers", Journal of Voice 19, 29-46 (2005)
[27] R.C. Smith and S.R. Price, "Modelling of human low frequency sound localization acuity demonstrates dominance of spatial variation of interaural time difference and suggests uniform just-noticeable differences in interaural time difference", PLoS One 9, e89033 (2014)
[28] D. Huron, Sweet anticipation. MIT Press, Cambridge, MA (2006)
[29] R.T. Dean, Hyperimprovisation: Computer interactive sound improvisation; with CD-Rom. A-R Editions, Madison, WI (2003)
[30] P. Hegarty, "Just what is it that makes today's noise music so different, so appealing?", Organised Sound 13, 13 (2008)
[31] D.R. Perrott and J. Tucker, "Minimum audible movement angle as a function of signal frequency and the velocity of the source", The Journal of the Acoustical Society of America 83, 1522-1527 (1988)
[32] K. Saberi and D.R. Perrott, "Lateralization thresholds obtained under conditions in which the precedence effect is assumed to operate", The Journal of the Acoustical Society of America 87, 1732-1737 (1990)
[33] L. Landy, Sound-based music 4 all, in The Oxford Handbook of Computer Music, ed. R.T. Dean, (Oxford University Press, New York, USA, 2009) pp. 518-535.
[34] T.D. Rossing, D.A. Russell, and D.E. Brown, "On the acoustics of tuning forks", American Journal of Physics 60, 620-626 (1992)
[35] E.A. Macpherson and J.C. Middlebrooks, "Listener weighting of cues for lateral angle: the duplex theory of sound localization revisited", The Journal of the Acoustical Society of America 111, 2219-2236 (2002)
[36] J. Ahveninen, N. Kopčo, and I.P. Jääskeläinen, "Psychophysics and neuronal bases of sound localization in humans", Hearing Research 307, 86-97 (2014)
[37] J. Liu, D. Perez-Gonzalez, A. Rees, H. Erwin, and S. Wermter, "A biologically inspired spiking neural network model of the auditory midbrain for sound source localisation", Neurocomputing 74, 129-139 (2010)
[38] W.O. Brimijoin, A.W. Boyd, and M.A. Akeroyd, "The contribution of head movement to the externalization and internalization of sounds", PLoS One 8, e83068 (2013)
[39] W. Martens, S. Sakamoto, L. Miranda, and D. Cabrera, "Dominance of head-motion-coupled directional cues over other cues during walking depends upon source spectrum", in Proceedings of Meetings on Acoustics. Acoustical Society of America (2013)
[40] D.R. Perrott and A. Musicant, "Minimum auditory movement angle: Binaural localization of moving sound sources", The Journal of the Acoustical Society of America 62, 1463-1466 (1977)
[41] D.W. Grantham, "Detection and discrimination of simulated motion of auditory targets in the horizontal plane", The Journal of the Acoustical Society of America 79, 1939-1949 (1986)
[42] N.P. Todd, A.C. Paillard, K. Kluk, E. Whittle, and J.G. Colebatch, "Vestibular receptors contribute to cortical auditory evoked potentials", Hearing Research 309, 63-74 (2014)
[43] B. Blesser and L.-R. Salter, Spaces speak, are you listening? MIT Press (2007)
[44] E. Larsen, N. Iyer, C.R. Lansing, and A.S. Feng, "On the minimum audible difference in direct-to-reverberant energy ratio", The Journal of the Acoustical Society of America 124, 450-461 (2008)
[45] H. Al Saleh, Effects of reverberation and amplification on sound localisation. University of Southampton (2011)
[46] A. Ihlefeld and B.G. Shinn-Cunningham, "Effect of source spectrum on sound localization in an everyday reverberant room", The Journal of the Acoustical Society of America 130, 324-333 (2011)
[47] W.M. Hartmann, "Localization of sound in rooms", The Journal of the Acoustical Society of America 74, 1380-1391 (1983)
[48] W.M. Hartmann and E.J. Macaulay, "Anatomical limits on interaural time differences: an ecological perspective", Frontiers in Neuroscience 8 (2014)
[49] A. Brughera, L. Dunai, and W.M. Hartmann, "Human interaural time difference thresholds for sine tones: the high-frequency limit", The Journal of the Acoustical Society of America 133, 2839-2855 (2013)
[50] W.M. Hartmann and A. Brughera, "Threshold interaural time differences and the centroid model of sound localization", The Journal of the Acoustical Society of America 133, 3512-3512 (2013)
[51] F.L. Wightman and D.J. Kistler, "The dominant role of low-frequency interaural time differences in sound localization", The Journal of the Acoustical Society of America 91, 1648-1661 (1992)
[52] R.Y. Litovsky, H.S. Colburn, W.A. Yost, and S.J. Guzman, "The precedence effect", The Journal of the Acoustical Society of America 106, 1633-1654 (1999)
[53] W.M. Hartmann and B. Rakerd, "Localization of sound in rooms IV: The Franssen effect", The Journal of the Acoustical Society of America 86, 1366-1373 (1989)
[54] H. Quené and H. Van den Bergh, "Examples of mixed-effects modeling with crossed random effects and with binomial data", Journal of Memory and Language 59, 413-425 (2008)
[55] V.R. Algazi, C. Avendano, and R.O. Duda, "Elevation localization and head-related transfer function analysis at low frequencies", The Journal of the Acoustical Society of America 109, 1110-1122 (2001)
[56] C. Hak, R. Wenmaekers, and L. Van Luxemburg, "Measuring Room Impulse Responses: Impact of the Decay Range on Derived Room Acoustic Parameters", Acta Acustica united with Acustica 98, 907-915 (2012)
Appendix A - The studio and experimental set up
The performance research space at MARCS Institute is roughly rectangular, 8 x 6.1 metres. It has 22 speakers in the roof area, up to 4 subwoofers on or near the floor, and two large retractable projection screens. These screens are on adjacent corrugated acoustic walls. The speakers for the experiment were 0.7 metres from these walls, and 3.3 metres from the listening position, facing inwards. The other two walls of the studio contain glass windows allowing adjacent rooms to function as control rooms. The windows have retractable curtains, and these make a large difference to the coloration of sound in the studio, so the experiments were done with them completely open. The speaker and listener positions were chosen such that there was no consistent audible coloration distinction between the speakers with a wide range of input tones.
The Harlequin brand floor has a vinyl cover on top of wooden squares, providing suitable spring for dance use, and there were some normal studio items inside the room. The reverberation time (T30) measured from impulse responses in accordance with ISO 3382 [56] was 400 ms, and at 30 Hz this was extended to about 700 ms. The sounds were intended to be roving stimuli, and so they were not exactly matched for intensity: at the position of the listeners' heads they were measured to be 40-47 dB(A), using a Bruel and Kjaer 2250 sound level monitor. The noise sound was set at the lower intensity for listener comfort. Background noise levels in the studio were c. 30 dB.
The sounds were generated in MAXMSP as white noise, and as 30 Hz and 60 Hz sine tones. Each sound was digitally filtered through a MAXMSP low pass resonator (to reduce frequencies in the tones above 100 Hz, with 24 dB per octave roll off). The differences between the three spectra were obvious, and as expected. A MOTU 896 mk. III digital interface was used (sampling rate 44.1 kHz; output level -3 dB), and the Genelec speakers were at default settings, with their nominal cut-off (equivalent to a cross-over) frequency being 120 Hz. They were visible, and the room was illuminated. It was noted that the loudness of all the sounds ramped rapidly to a maximum, but if sustained, then after 7-8 seconds it dropped to a new steady state which continued unchanged for a prolonged time. This was also observed with large Meyer subwoofers, and hence durations longer than 6 seconds were not suitable, and not used. The power spectrum of the sounds at the listeners' position was measured, and it showed the vast majority of energy to be below 200 Hz, but at higher frequencies there was slight energy above background levels, declining strongly and progressively with frequency. The 30 Hz stimulus was above background up to 400 Hz, the 60 Hz to 600 Hz (particularly during the transient ramp on), and the noise tone to 500 Hz.
The 5 participants (mean age 39.0 years, s.d. 17.2) all had musical experience; 4 also had experience of recording technology and practice; and there was one female. During an experiment, the stimuli were presented in randomized order, and in two blocks, each of 288 stimuli; participants could take a break between the blocks. They fixated on the computer screen while listening, and were asked to make their judgement of location/movement as quickly as possible after each sound had ended. Number keys were pressed to indicate the location (1 for L; 2 for R; 3 for LR; 4 for RL). Data was also recorded in the MAXMSP patch.
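For readers without MAXMSP, a 24 dB per octave low-pass stage like the one described above can be approximated by cascading four one-pole (6 dB per octave) filters. The Python sketch below is an illustrative stand-in, not the actual MAXMSP resonator used in the experiment: it filters a two-tone test signal (equal-amplitude 60 Hz and 600 Hz sines) at a 100 Hz cut-off and uses the Goertzel algorithm to measure how strongly the 600 Hz component is attenuated relative to the 60 Hz component.

```python
import math

def one_pole_lowpass(signal, cutoff_hz, fs):
    """Single-pole low-pass filter (about 6 dB/octave above cutoff)."""
    a = math.exp(-2.0 * math.pi * cutoff_hz / fs)
    y, out = 0.0, []
    for x in signal:
        y = (1.0 - a) * x + a * y
        out.append(y)
    return out

def goertzel_power(signal, freq_hz, fs):
    """Relative power of one frequency component (Goertzel algorithm)."""
    w = 2.0 * math.pi * freq_hz / fs
    coeff = 2.0 * math.cos(w)
    s1 = s2 = 0.0
    for x in signal:
        s0 = x + coeff * s1 - s2
        s2, s1 = s1, s0
    return (s1 * s1 + s2 * s2 - coeff * s1 * s2) / len(signal) ** 2

fs = 44100
test = [math.sin(2 * math.pi * 60 * i / fs) + math.sin(2 * math.pi * 600 * i / fs)
        for i in range(fs)]  # one second of equal-amplitude 60 Hz + 600 Hz
filtered = test
for _ in range(4):  # four cascaded poles: roughly 24 dB/octave overall
    filtered = one_pole_lowpass(filtered, 100, fs)
rel_db = 10 * math.log10(goertzel_power(filtered, 600, fs) /
                         goertzel_power(filtered, 60, fs))
print(f"600 Hz relative to 60 Hz after filtering: {rel_db:.1f} dB")
```

Note that such a digital filter only constrains the signal sent to the speakers; as discussed in the main text, the subwoofers themselves still radiated measurable energy above 200 Hz.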
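The two-block randomized presentation described in the Appendix can be sketched as follows. The factor structure assumed here (three sounds x four location/movement categories x three durations, 16 repetitions each) is a hypothetical reconstruction that exactly fills the stated two blocks of 288 stimuli; the actual per-condition repetition count is not given in the text.

```python
import random

# Hypothetical factor structure consistent with the Appendix: the text states
# two blocks of 288 stimuli but not the per-condition repetition count.
sounds = ["white noise", "30 Hz sine", "60 Hz sine"]
categories = ["L", "R", "LR", "RL"]   # response keys 1-4
durations_s = [0.2, 2.0, 6.0]

conditions = [(s, c, d) for s in sounds for c in categories for d in durations_s]
repetitions = (2 * 288) // len(conditions)   # 16 repetitions of 36 conditions
trials = conditions * repetitions
random.shuffle(trials)                        # randomized presentation order
block_1, block_2 = trials[:288], trials[288:]  # break allowed between blocks
print(len(conditions), repetitions, len(block_1), len(block_2))
```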