2010, EURASIP Journal on Advances in Signal Processing
work on geometric wavefield decomposition, which accounts for propagation phenomena such as diffusion and diffraction and serves as a computational engine for both wavefield rendering and binaural rendering. Still in the area of binaural rendering are two contributions to this special issue. The first, by L. Wang et al., addresses the long-debated problem of cross-talk cancellation. It is followed by the paper of M. Cobos et al., which proposes a method that avoids the use of a dummy head in binaural recording sessions.
IEEE Potentials, 2013
We all are used to perceiving sound in a three-dimensional (3-D) world. How spatial sound can be created and reproduced in an enclosed room or theater has therefore been an active research topic for decades. Spatial audio creates the illusion of sound objects positioned in 3-D space: original sound tracks are passed through a sound-rendering system and reproduced through multiple transducers distributed around the listening space. The reproduced sound field aims to convey spaciousness and a sense of the directivity of the sound objects. Ideally, such a reproduction system gives listeners an immersive 3-D sound experience. Spatial audio can primarily be divided into three types of sound reproduction techniques: loudspeaker stereophony, binaural technology, and reconstruction by synthesis of the natural wave field [which includes Ambisonics and wave field synthesis (WFS)], as shown in Fig. 1(a).

The history of spatial audio dates back to the late 1800s, the very first invention being the gramophone used in sound recording. As shown in the timeline in Fig. 1(b), there have been major advances in both technical and perceptual aspects over the last century. Spatial sound systems have evolved from two-channel stereo to multichannel surround sound. These surround systems are no longer limited to cinemas and auditoriums but are also being adopted in home entertainment systems. Conventional headphones, which employ a pair of small emitters, aim to produce high-quality sound close to the ears and, unlike loudspeakers, need not account for inaccuracies introduced by the surroundings. Nowadays, multiple emitters are embedded inside the ear cup to create a virtual surround sensation in 3-D surround headphones. Modern electroacoustic systems have improved significantly, with new functionality to adapt or correct the sound field for a given room acoustic.

Toward the end of the 20th century, new reproduction techniques like Ambisonics and WFS [see Fig. 1(b)] were introduced to overcome the limitations of stereo systems; they use the physics of sound wave propagation in air and thus aim to provide a true sound experience in any environment. Two-channel stereophony is the oldest and simplest audio technology and has been progressively extended to multichannel stereophony systems, through 5.1, 7.1, 10.2, and 22.1 surround sound. [Note that in the x.y representation, x indicates the number of full-range channels and y the number of low-frequency effects (LFE) channels.]
The available technologies for presenting audiovisual scenes to large audiences show different degrees of maturity. While high-quality, physics-based rendering of 3D scenes is found in many visual applications, presentation of the accompanying audio content is based on much simpler technologies. Multi-channel cinema sound systems are capable of delivering sound effects, but they do not faithfully reproduce an acoustic scene. New methods for acoustic rendering are required to provide a physically correct recreation of audiovisual environments. A new technology for the reproduction of sound fields is based on Huygens' principle. It utilizes a large number (tens or hundreds) of loudspeakers to recreate a 3D sound field from the physical description of the original acoustic environment. This paper describes different aspects of this technology, presents applications pursued by the European project CARROUSO, and summarizes our own contributions to the project, including the development of algorithms for loudspeaker and listening-room compensation and a real-time implementation of the rendering software on a PC.
Wave field synthesis and Ambisonics strive to reconstruct a sound field within a listening area using the interference of loudspeaker signals. Due to spatial sampling, an error-free reconstruction is not achieved within the entire listening area, and consequently the perceived quality of the reproduction may be impaired. Specifically, sound events may be localized incorrectly, and the individual loudspeaker signals may result in perceived coloration. Here, a binaural auditory model was employed to predict the localization error at several off-center listening positions and to visualize coloration artifacts. The model outputs provided a good match to perceptual data from previously conducted listening tests, verifying the applicability of the model for evaluating reconstructed sound fields.
In this paper we propose two metrics for evaluating the impact of pre-echoes and post-echoes on perceived quality in soundfield rendering applications. These metrics are derived from psychoacoustic considerations, in particular the masking effect, well known in the perceptual coding literature. The measurement is accomplished through a virtual microphone array that samples the soundfield on a circumference. The soundfield within the circle is estimated by means of the circular harmonic decomposition. As a result, space-time impulse responses of the rendering system are obtained, which are then analyzed against the masking curve to extract the pre- and post-echo metrics. A comparison between experimental and simulated results, conducted with the same setup, makes it possible to discriminate the impact of the adopted rendering engine from that of the non-idealities of the real system (environment and loudspeakers) on pre- and post-echoes.
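As an illustration of the circular harmonic step described above, the following minimal Python sketch estimates the interior soundfield at a point from single-frequency pressure samples on a circle. The function name and the equal-spacing assumption are ours, not the paper's:

```python
import numpy as np
from scipy.special import jv  # Bessel function of the first kind

def interior_field(p_mic, R, k, r, phi, max_order):
    """Estimate the interior soundfield at polar position (r, phi), r < R,
    from single-frequency pressure samples p_mic taken at M equally spaced
    virtual microphones on a circle of radius R (k = 2*pi*f/c)."""
    M = len(p_mic)
    # Angular FFT yields the circular harmonic coefficients C_m of the
    # sampled field: p(R, phi) = sum_m C_m exp(j*m*phi)
    C = np.fft.fft(p_mic) / M
    m = np.fft.fftfreq(M, d=1.0 / M).astype(int)  # orders 0..M/2-1, -M/2..-1
    keep = np.abs(m) <= max_order
    # Interior extrapolation: each order scales as J_m(kr)/J_m(kR).
    # Note: ill-conditioned near the zeros of J_m(kR); a real implementation
    # needs regularization there.
    return np.sum(C[keep] * jv(m[keep], k * r) / jv(m[keep], k * R)
                  * np.exp(1j * m[keep] * phi))
```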
2014
In contrast to two other multichannel sound-field synthesis techniques, Higher Order Ambisonics (HOA) and Wave Field Synthesis (WFS), Discrete Multichannel Simulation (DMS) has distinct advantages for displaying three-dimensional virtual acoustic sound fields to human listeners. The primary concern is the impact of listener head movements on reproduction fidelity, although the influence of static head acoustics on the displayed sound field is another critical factor. Whereas movement of the listener's head outside of the 'sweet spot' of an HOA system can result in dramatic failure in both the spatial and timbral performance of a sound-field display, a virtue of the DMS approach is that, given a large number of loudspeakers, the configuration of simulated sources and reflections stays fixed on each of the respective loudspeakers even though the absolute angles vary with head translation and rotation. Experiences in the design and evaluation of thr...
The Journal of the Acoustical Society of America, 2021
A method of binaural rendering from microphone array signals of arbitrary geometry is proposed. To reproduce binaural signals from microphone array recordings at a remote location, a spherical microphone array is generally used for capturing a soundfield. However, owing to the lack of flexibility in the microphone arrangement, a single spherical array is sometimes impractical for estimating a large region of a soundfield. We propose a method based on harmonic analysis of infinite order, which allows the use of arbitrarily placed microphones. In synthesizing the estimated soundfield, a binaural rendering based on the spherical wave decomposition is also formulated to take into account the distance at which head-related transfer functions are measured. We develop and evaluate a composite microphone array consisting of multiple small arrays. Experimental results, including those of listening tests, indicate that the proposed method is robust against changes in listening position within the recording area.
Applied Acoustics, 2016
In this manuscript we propose an analytic solution to the problem of sound field rendering, based on the plane wave decomposition, which is here derived with reference to the Herglotz density function. The plane wave decomposition encodes the directional contributions to the sound field, and allows us to describe elementary sound fields in a model-based fashion, parameterized only by the source location. We show how this representation can be exploited for rendering purposes using a wide variety of loudspeaker arrangements. We start by deriving closed-form expressions for the loudspeaker weights based on the plane wave decomposition and we validate this derivation with an analysis of the reproduction error for the case of a circular array of speakers. We then show how to extend the proposed method to non-circular geometries. We assess the performance of the proposed rendering solution for various array configurations, offering a comparison with state-of-the-art analytical rendering techniques.
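The paper derives its loudspeaker weights from the Herglotz density function; as a rough point of comparison, the textbook mode-matching solution for an equally spaced circular array can serve as a baseline. The sketch below is written under our own assumptions (2D line-source model, exp(+jwt) convention), not the paper's closed form:

```python
import numpy as np
from scipy.special import hankel2

def mode_matching_weights(theta_ls, R0, k, theta_pw, max_order):
    """Weights for L equally spaced loudspeakers at azimuths theta_ls on a
    circle of radius R0 that reproduce a unit-amplitude plane wave from
    azimuth theta_pw inside the array."""
    L = len(theta_ls)
    m = np.arange(-max_order, max_order + 1)
    # Circular-harmonic coefficients of the target plane wave
    B = (1j) ** m * np.exp(-1j * m * theta_pw)
    # Order-m radiation factor of a line source on the array circle
    G = -0.25j * hankel2(m, k * R0)
    # Sample the continuous driving function at the loudspeaker angles
    d = (B / G)[None, :] * np.exp(1j * np.outer(theta_ls, m))
    return d.sum(axis=1) / L
```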
2009
This paper considers the problem of 3-D sound rendering in the near field through a low-order HRTF model. Here we concentrate on diffraction effects caused by the human head which we model as a rigid sphere. For relatively close source distances there already exists an algorithm that gives a good approximation to analytical spherical HRTF curves; yet, due to excessive
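The abstract breaks off here, but the analytical rigid-sphere transfer function it refers to is well documented (e.g., Duda and Martens, 1998). A minimal evaluation of that series might look as follows; the parameter defaults and the exp(-jwt) time convention are our own assumptions (the magnitude response is unaffected by the convention):

```python
import numpy as np
from scipy.special import spherical_jn, spherical_yn, eval_legendre

def sphere_hrtf(theta, f, a=0.0875, r=1.0, c=343.0, n_terms=60):
    """Rigid-sphere transfer function at incidence angle theta (radians
    between the source direction and the ear position) for a source at
    distance r from a sphere of radius a, normalized to the free-field
    pressure at the sphere center."""
    mu = 2.0 * np.pi * f * a / c      # normalized frequency ka
    rho = r / a                       # normalized source distance

    def h1(n, x, d=False):            # spherical Hankel function, 1st kind
        return (spherical_jn(n, x, derivative=d)
                + 1j * spherical_yn(n, x, derivative=d))

    acc = 0j
    for n in range(n_terms):          # truncated series; use more terms
        acc += ((2 * n + 1)           # for large mu*rho (high f, far source)
                * eval_legendre(n, np.cos(theta))
                * h1(n, mu * rho) / h1(n, mu, d=True))
    return (rho / mu) * np.exp(-1j * mu * rho) * acc
```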
2003
A method of computationally efficient 3D sound reproduction via headphones is presented using a virtual Ambisonic approach. Previous studies have shown that incorporating head tracking as well as room simulation is important for improving sound source localization. Simulating a virtual acoustic space requires filtering the stimuli with head-related transfer functions (HRTFs). In time-varying systems this yields
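The key idea behind virtual Ambisonics is that head rotation is compensated in the Ambisonics domain, so the HRTF filters themselves never change. A first-order sketch (function names and the traditional 1/sqrt(2) B-format W weighting are our assumptions):

```python
import numpy as np

def encode_foa(s, az, el):
    """Encode a mono signal into first-order B-format (W, X, Y, Z)."""
    return np.stack([s / np.sqrt(2),
                     s * np.cos(az) * np.cos(el),
                     s * np.sin(az) * np.cos(el),
                     s * np.sin(el)])

def compensate_yaw(b, yaw):
    """Rotate the encoded field by -yaw to counter the tracked head
    rotation; only X and Y mix, so the HRTF filters that render the
    fixed virtual loudspeakers stay constant."""
    w, x, y, z = b
    c, s = np.cos(yaw), np.sin(yaw)
    return np.stack([w, c * x + s * y, -s * x + c * y, z])
```

The rotated B-format is then decoded to a fixed set of virtual loudspeakers, each convolved once with its static HRTF pair; only the cheap rotation above needs to run at the head-tracker update rate.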
Journal of Sound and Vibration, 1997
Audio Signal Processing for Next-Generation Multimedia Communication Systems, 2004
Conventional multichannel audio reproduction systems for entertainment or communication are not capable of immersing a large number of listeners in a well-defined sound field. A novel technique for this purpose is so-called wave field synthesis. It is based on the principles of wave physics and is suitable for implementation with current multichannel audio hardware and software components. Multiple fixed or moving sound sources from a real or virtual acoustic scene are reproduced in a listening area of arbitrary size. The listeners are not restricted in number, position, or activity and are not required to wear headphones. A successful implementation of wave field synthesis also requires addressing spatial aliasing and compensating the non-ideal properties of loudspeakers and listening rooms.
It is well known that humans perceive sound binaurally, which literally means with two ears. Knowledge acquired from investigating the auditory system is used to develop methods for analyzing and processing sound signals in many applications. One such application is "Binaural Simulation", which facilitates listening into spaces that exist only in the form of computer models. A "Virtual Auditory Environment" is created, which should be perceived by a listener as natural, or at least highly plausible. In the work presented here, binaural simulation is used for processing typical stereo audio signals. In more detail, the left and right audio signals feed two or more virtual sound sources, which can be placed anywhere in the virtual auditory environment. The result of this process is a synthesized binaural signal suitable for listening over headphones. If listening over loudspeakers is required, the binaural signal passes through a cross-talk cancellation network.
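The core of the process described above is a bank of HRTF convolutions, one HRIR pair per virtual source. A minimal sketch for the two-source stereo case (the function name and the equal-length-HRIR assumption are ours):

```python
import numpy as np
from scipy.signal import fftconvolve

def binauralize(stereo, hrirs):
    """Render a stereo signal through two virtual sources for headphones.
    stereo: (N, 2) array; hrirs[ch] is the (left-ear, right-ear) HRIR pair
    of the virtual source carrying channel ch."""
    left = sum(fftconvolve(stereo[:, ch], hrirs[ch][0]) for ch in range(2))
    right = sum(fftconvolve(stereo[:, ch], hrirs[ch][1]) for ch in range(2))
    # For loudspeaker playback, feed this output through a crosstalk
    # cancellation network instead of headphones.
    return np.stack([left, right], axis=1)
```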
IEEE Transactions on Multimedia, 2000
Immersive audio systems can be used to render virtual sound sources in three-dimensional (3-D) space around a listener. This is achieved by simulating the head-related transfer function (HRTF) amplitude and phase characteristics using digital filters. In this paper, we examine certain key signal processing considerations in spatial sound rendering over headphones and loudspeakers. We address the problem of crosstalk inherent in loudspeaker rendering and examine two methods for implementing crosstalk cancellation and loudspeaker frequency-response inversion in real time. We demonstrate that it is possible to achieve crosstalk cancellation of 30 dB with both methods, but one of the two (the fast RLS transversal filter method) offers a significant advantage in computational efficiency. Our analysis is easily extendable to nonsymmetric listening positions and moving listeners.
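The paper implements the canceller adaptively with fast RLS transversal filters; a simpler frequency-domain formulation of the same 2x2 inversion problem illustrates what is being inverted. This is a sketch, with a Tikhonov regularization constant of our own choosing:

```python
import numpy as np

def crosstalk_canceller(H, beta=1e-3):
    """Frequency-domain 2x2 crosstalk canceller.
    H: (nf, 2, 2) array with H[f, i, j] the transfer function from
    loudspeaker j to ear i at frequency bin f.
    Returns filters C with H @ C ~= I; beta limits the gain at
    frequencies where H is ill-conditioned."""
    I = np.eye(2)
    C = np.empty_like(H)
    for f in range(H.shape[0]):
        Hf = H[f]
        # Regularized pseudo-inverse: (H^H H + beta I)^{-1} H^H
        C[f] = np.linalg.solve(Hf.conj().T @ Hf + beta * I, Hf.conj().T)
    return C
```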
2000
2021
Supplementary material for: Arend, J. M., Ramírez, M., Liesefeld, H. R., & Pörschmann, C. (2021). Do near-field cues enhance the plausibility of non-individual binaural rendering in a dynamic multimodal virtual acoustic scene? Acta Acust., 5(55), 1–14. https://doi.org/10.1051/aacus/2021048. The .pdf file contains additional figures on the synthesized near-field HRTFs and the employed filters, additional results figures showing individual movement data, and information on the Matlab material and the video material. The .mp4 file provides a video illustration of the experimental procedure. The .zip file contains the Matlab script and material to synthesize the near- and far-field HRTFs, design the headphone-compensation filter and the loudspeaker filter, and generate the filtered noise and speech test signals employed in the listening experiments.
2004
High-quality virtual audio scene rendering is required for emerging virtual and augmented reality applications, perceptual user interfaces, and sonification of data. We describe algorithms for creating virtual auditory spaces by rendering the cues that arise from anatomical scattering, environmental scattering, and dynamic effects. We use a novel way of personalizing head-related transfer functions (HRTFs) from a database, based on anatomical measurements.
Sound field perception and the realization of real spaces in virtual environments using effective binaural technology were investigated. The objective of this study was to design and develop techniques and algorithms that enable us to record perceptual sound fields in reality, reconstruct them, and perform coherent emulation in virtual spaces. The employed technique involves binaural technology with adaptive beamforming and effective crosstalk cancellation using the wavelet transform. Before immersion in the virtual sound field, capturing the real sound together with the room's acoustic and psychoacoustic parameters and its spatial decomposition are essential tasks. Regeneration is addressed in two ways: 1) headphones are fixed to the user's head and move with it, so that head and body movements do not change the coupling between the transducers and the user's ears; 2) loudspeaker arrays are positioned away from the user's body, and the user moves relative to the sound sources. The goal of creating a full virtual space with properties of co-existence with remote spaces was accomplished using knowledge about human sound perception and cognition of the sound environment, while accounting for external events to achieve a natural impression.
EURASIP Journal on Advances in Signal Processing, 2007
A real-time audio rendering system is introduced which combines a full room-specific simulation, dynamic crosstalk cancellation, and multitrack binaural synthesis for virtual acoustical imaging. The system is applicable to any room shape (normal, long, flat, coupled), independent of the a priori assumption of a diffuse sound field. This makes it possible to simulate indoor or outdoor spatially distributed, freely movable sources and a moving listener in virtual environments. In addition, near-to-head sources can be simulated by using measured near-field HRTFs. The reproduction component provides headphone-free reproduction by dynamic crosstalk cancellation. The focus of the project is mainly on the integration and interaction of all involved subsystems. It is demonstrated that the system is capable of real-time room simulation and reproduction and thus can be used as a reliable platform for further research on VR applications.
6th Int. Conference on Digital Audio Effects (DAFX-03), 2003
Wave Field Synthesis is a method for 3D sound reproduction based on the precise construction of the desired wave field using an array of loudspeakers. The main purpose of this work is to present a set of software tools that gives the audio community a feasible and easy way to start working with wave field synthesis systems. First, an introduction to different 3D sound techniques and an overview of WFS theory and foundations are given. Next, a series of software tools specially developed to simulate, analyze, and implement WFS systems are presented. The first software module helps the user design the loudspeaker array to be employed in reproduction by computing the excitation signal equations for each speaker. Another tool simulates the wave field generated by the arrays and analyzes both the performance and the quality of the acoustic field. Finally, a user-friendly tool for real-time convolution, capable of producing the excitation signals for the loudspeaker array, is presented. Different experiments carried out with this software to evaluate the precision and behaviour of various WFS configurations are also presented and interpreted.
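The heart of such an excitation-signal module is a per-loudspeaker delay and gain derived from the WFS driving function. A simplified sketch follows (far-field weights only; the sqrt(jw) pre-equalization filter of full 2.5D WFS is deliberately omitted, and all names are ours):

```python
import numpy as np

def wfs_delays_gains(x_ls, n_ls, x_src, fs, c=343.0):
    """Per-loudspeaker delay (samples) and gain for a virtual point source
    behind a loudspeaker array (simplified 2.5D WFS driving function).
    x_ls: (L, 2) speaker positions, n_ls: (L, 2) unit normals pointing
    into the listening area, x_src: (2,) source position."""
    v = x_ls - x_src                              # source-to-speaker vectors
    r = np.linalg.norm(v, axis=1)
    delays = np.round(r / c * fs).astype(int)     # propagation delay
    # Weight by the cosine between propagation direction and array normal;
    # speakers the wavefront does not cross outward (cos < 0) are muted.
    cos_inc = np.einsum('ij,ij->i', v / r[:, None], n_ls)
    gains = np.maximum(cos_inc, 0.0) / np.sqrt(r)  # ~1/sqrt(r) decay
    return delays, gains
```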