2009 IEEE International Conference on Acoustics, Speech and Signal Processing, 2009
This paper proposes a new bit plane coding method for signed integer sequences. The method consists of mapping successive bit planes onto quinary symbols (+, -, 0, 1, EoP), where the symbol "EoP" stands for "End of Plane", and applying arithmetic coding. Sign bits are efficiently coded in combination with the corresponding most significant bit of non-zero integers. Moreover, bit planes are scanned and coded in a non-sequential manner to exploit the correlation between successive planes. Results for conversational transform coding of wideband speech and audio signals (sampled at 16 kHz) show that the performance/complexity of the proposed bit plane coder is nearly equivalent to that of non-embedded coding (stack-run coding), while offering additional flexibility (bitstream scalability).
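The symbol mapping described in this abstract can be illustrated with a minimal sketch. Several details here are assumptions, not the paper's method: planes are scanned sequentially from most to least significant (the paper scans non-sequentially), "EoP" is taken to replace the trailing run of zeros in each plane, no arithmetic coding is applied, and the function names `encode_bit_planes` and `decode_bit_planes` are hypothetical.

```python
def encode_bit_planes(values):
    """Map signed integers onto per-plane symbol lists over {+, -, 0, 1, EoP}.

    Assumed convention: a value's first 1 bit is emitted as '+' or '-'
    (sign coded together with the most significant bit); later bits of
    that value are emitted as '0' or '1'.
    """
    num_planes = max((abs(v).bit_length() for v in values), default=0)
    significant = [False] * len(values)
    planes = []
    for p in range(num_planes - 1, -1, -1):  # MSB plane first
        symbols = []
        for i, v in enumerate(values):
            bit = (abs(v) >> p) & 1
            if significant[i]:
                symbols.append("1" if bit else "0")
            elif bit:
                symbols.append("+" if v > 0 else "-")
                significant[i] = True
            else:
                symbols.append("0")
        # Assumption: EoP stands in for the trailing zeros of the plane.
        while symbols and symbols[-1] == "0":
            symbols.pop()
        symbols.append("EoP")
        planes.append(symbols)
    return planes


def decode_bit_planes(planes, length):
    """Rebuild the integers from the symbol lists produced above."""
    magnitudes = [0] * length
    signs = [1] * length
    for p, symbols in zip(range(len(planes) - 1, -1, -1), planes):
        for i, s in enumerate(symbols):
            if s == "EoP":
                break
            if s == "-":
                signs[i] = -1
            if s != "0":
                magnitudes[i] |= 1 << p
    return [sg * m for sg, m in zip(signs, magnitudes)]
```

Decoding only the first few planes yields a coarser approximation of every value, which is the sense in which such a bitstream is scalable.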
2008
This paper proposes a new model-based method for transform coding of audio signals. The input signal is mapped into a "perceptual" domain by a linear-predictive weighting filter followed by a modified discrete cosine transform (MDCT). To provide bitstream scalability, model-based bit plane coding is then applied with respect to the mean square error (MSE) criterion. We present methods to estimate the symbol probability in bit planes assuming a generalized Gaussian model for the distribution of MDCT coefficients. We compare the performance of the proposed bitstream scalable coder with stack-run coding and ITU-T G.722.1. Objective and subjective quality results are presented. The proposed coder is equivalent to or slightly worse than the reference coders, but has the advantage of being scalable. The performance penalty due to bitstream scalability is evident at low bitrates.
Electronics Letters, 2007
A simple and flexible bit-plane coding method is developed for scalable audio coding. It is different from the traditional bit-plane coding in that the optimal bit-plane scanning order is adapted to the scale between the energy of the residual signal and the audio mask. Both the perceptual quality and the lossless compression ratio at common lossy bitrates are greatly improved compared with the performance of state-of-the-art scalable audio.
2006
The MPEG-4 Scalable to Lossless (SLS) audio coding is currently being developed to provide a unified solution for high-compression perceptual audio coding and high-quality lossless audio coding. SLS provides efficient Fine Granular Scalable (FGS) coding from the AAC core layer to lossless, and achieves reasonable perceptual quality over its scalable coding range using a sequential bit-plane scanning method, which minimizes the audio distortion according to the spectral shape of the core layer quantization errors. In this paper, it is shown that the perceptual quality of SLS at intermediate rates can be further improved by incorporating a psychoacoustic model into the bit-plane coding process. In addition, it is found that such an improvement can be achieved by slightly modifying the original bit-plane coding process of SLS, hence preserving its desirable features such as compatibility with lossless coding and low complexity.
IEEE Transactions on Communications, 1994
This paper presents a transform coding algorithm devoted to high quality audio coding at a bit rate of 64 kbps per monophonic channel. It enables the transmission of a high quality stereo sound through the basic access (2B channels) of ISDN. Although a complete ...
2003
Lifting-scheme-based integer transforms are very powerful tools for constructing lossless coding schemes. These transforms, such as the Integer Fast Fourier Transform (IntFFT) and the Integer Modified Discrete Cosine Transform (IntMDCT), are integer approximations of the original floating-point transforms, and hence there is an approximation error in the transform domain. This paper proposes structures for improved integer transforms in terms of approximation accuracy and computational efficiency. Experimental results show that clear improvements on both points are achieved in lossless audio coding.
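The basic building block behind such integer transforms is a plane rotation factored into three lifting steps, each rounded to an integer. The sketch below shows that generic textbook construction only, not the improved structures this paper proposes; the function names are hypothetical, and the angle is assumed not to be an odd multiple of π so that tan(θ/2) is finite.

```python
import math


def _round(v):
    """Round to nearest integer; the same rule must be used in both directions."""
    return math.floor(v + 0.5)


def lifting_rotate(x, y, theta):
    """Integer approximation of rotating (x, y) by theta via three lifting steps."""
    a = -math.tan(theta / 2)  # equals (cos(theta) - 1) / sin(theta)
    s = math.sin(theta)
    x = x + _round(a * y)
    y = y + _round(s * x)
    x = x + _round(a * y)
    return x, y


def lifting_rotate_inverse(x, y, theta):
    """Exact inverse: undo the three lifting steps in reverse order."""
    a = -math.tan(theta / 2)
    s = math.sin(theta)
    x = x - _round(a * y)
    y = y - _round(s * x)
    x = x - _round(a * y)
    return x, y
```

Each step changes one component by a rounded function of the other, unchanged component, so subtracting the same rounded term restores the input exactly. The rounding is also the source of the approximation error relative to the floating-point rotation that the abstract mentions.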
2007 IEEE International Conference on Acoustics, Speech and Signal Processing - ICASSP '07, 2007
In this paper we present a new model-based method to code the transform coefficients of audio signals. The histogram of transform coefficients is approximated by a generalized Gaussian model for efficient model-based bit allocation and the spectrum is coded by scalar quantization followed by arithmetic coding. An example coder operating at 16 kHz and using predictive modified discrete cosine transform (MDCT) coding is described. We compare the performance of the proposed coder with ITU-T G.722.1. Objective and subjective quality results are presented. The proposed coder is better than ITU-T G.722.1 at 24 kbit/s and equivalent at 32 kbit/s.
Journal of the Audio …, 2007
Recently the MPEG Audio standardization group has successfully concluded the standardization process on technology for lossless coding of audio signals. A summary of the scalable lossless coding (SLS) technology as one of the results of this standardization work is given. MPEG-4 scalable lossless coding provides a fine-grain scalable lossless extension of the well-known MPEG-4 AAC perceptual audio coder up to fully lossless reconstruction at word lengths and sampling rates typically used for high-resolution audio. The underlying innovative technology is described in detail and its performance is characterized for lossless and near lossless representation, both in conjunction with an AAC coder and as a stand-alone compression engine. A number of application scenarios for the new technology are discussed.
2003
This paper presents a scalable lossless enhancement of MPEG-4 Advanced Audio Coding (AAC). Scalability is achieved in the frequency domain using the Integer Modified Discrete Cosine Transform (IntMDCT), which is an integer approximation of the MDCT providing perfect reconstruction. With this transform, and only minor extension of the bitstream syntax, the MPEG-4 AAC Scalable codec can be extended to a lossless operation. The system provides bit-exact reconstruction of the input signal independent of the implementation accuracy of the AAC core coder. Furthermore, scalability in sampling rate and reconstruction word length is supported.
In this paper a new representation or modeling method for speech signals is introduced. The proposed method is based on the generation of the so-called Predefined Signature $S=\{S_R\}$ and Envelope $E=\{E_K\}$ Vector Sets (PSEVS). These vector sets are speaker and language independent. In this method, once the speech signals are divided into frames of selected lengths, each frame signal piece $X_i$ is reconstructed by means of the mathematical form $X_i = C_i E_K S_R$. In this representation, $C_i$ is called the frame coefficient, and $S_R$ and $E_K$ are the vectors assigned from the PSEVS. It is shown that the proposed method provides fast reconstruction and a substantial compression ratio with acceptable hearing quality.
Digital Signal Processing, 2003
This paper presents a very low-complexity audio codec that provides audio playback quality similar to the MPEG-I/audio level 3 codec at 64 kbps for a monophonic-channel signal. This well-designed low-sophistication scheme uses a simple noise-masking model, a specialized nonuniform quantizer, an effective recursive refinement module, and an adaptive arithmetic coder with multiplication-free adaptation. For bit allocation, we propose an appropriate nonuniform quantizer incorporating noise-masking effects and designed for fast implementation as well as efficient acceleration of the proposed refinement process. This recursive refinement algorithm effectively improves recovered perceptual audio quality after quantization. The adaptive arithmetic coder uses two fast adaptation algorithms that do not require multiplication to quickly obtain efficient bit allocation. A Taylor series expansion is used to simplify the frequently executed functions in the masking threshold formulas. Thus, the proposed high-quality audio codec appears to be a very valuable consumer electronic approach or a software solution at low cost.
2009 IEEE International Conference on Acoustics, Speech and Signal Processing, 2009
Traditionally, speech coding and audio coding were separate worlds. Based on different technical approaches and different assumptions about the source signal, neither of the two coding schemes could efficiently represent both speech and music at low bitrates. This paper presents a unified speech and audio codec, which efficiently combines techniques from both worlds. This results in a codec that exhibits consistently high quality for speech, music and mixed audio content. The paper gives an overview of the codec architecture and presents results of formal listening tests comparing this new codec with HE-AAC(v2) and AMR-WB+. This new codec forms the basis of the reference model in the ongoing MPEG standardization activity for Unified Speech and Audio Coding.
1992
This paper presents a transform coding algorithm designed for audio coding at a bit rate of 64 kbit/s. It enables the transmission of a high quality stereo sound through the 2B channels of ISDN. Although a complete system including framing, synchronization and error correction has been developed, only the bit rate compression algorithm is described here. A detailed analysis of the signal processing techniques such as the time-frequency transformation, the pre-echo reduction by adaptive filtering, the fast algorithm computations ... is provided. The use of psychoacoustical properties is also precisely reported. Finally, some subjective evaluation results and one real-time implementation of the coder using the AT&T DSP32C digital signal processor are presented.
2003
This paper presents an embedded fine grain scalable perceptual and lossless audio coding scheme. The enabling technology for this combined perceptual and lossless audio coding approach is the Integer Modified Discrete Cosine Transform (IntMDCT), which is an integer approximation of the MDCT based on the lifting scheme. It maintains the perfect reconstruction property and therefore enables efficient lossless coding in the frequency domain. The close approximation of the MDCT also makes it possible to build a perceptual coding scheme based on the IntMDCT. In this paper a bit-sliced arithmetic coding technique is applied to the IntMDCT values. Together with the encoded shape of the masking threshold, a perceptually hierarchical bitstream is obtained, containing several stages of perceptual quality and extending to lossless operation when transmitted completely. A concept of encoding subslices is presented in order to obtain a fine adaptation to the masking threshold, especially in the range of perceptually transparent quality.
Proceedings of 1994 28th Asilomar Conference on Signals, Systems and Computers, 1994
Differential encoding is a well-known low-complexity coding technique. Its use in the coding of wideband audio is limited by its inability to follow rapid changes in the signal. This is a serious drawback when coding high fidelity audio, where this inability can seriously degrade the perceptual quality of the reconstruction. This overload problem can be remedied by using a recursively indexed quantizer. In this paper we present some empirical results for the differential coding of audio signals.
2008 IEEE International Conference on Acoustics, Speech and Signal Processing, 2008
A fully scalable audio coding structure based on a novel combination of the non-core MPEG-4 scalable lossless audio coding (SLS), the state-of-the-art psychoacoustic model, joint stereo coding and the perceptually prioritized bit-plane coding is presented in this paper. The psychoacoustic information is implicitly embedded in the scalable bitstream with a negligible amount of side information and trivial modification to the standardized SLS decoder. Results of extensive evaluation show that the subjective quality of scalable audio is improved significantly.
IEEE Transactions on Speech and Audio Processing, 1995
A new audio transform coding technique is proposed that reduces the bitrate requirements of Perceptual Transform Audio Coders by utilizing the stationarity characteristics of audio signals. The method detects the frames which have significant audible content and codes them in a way similar to conventional Perceptual Transform Coders. However, when successive data frames are found to be similar to those sections, only their audible differences are coded. An error analysis for the proposed method is presented and results from tests on different types of audio material are listed, indicating that an average of 30% in compression gain (over the conventional Perceptual Audio Coders bitrate) can be achieved, with a small deterioration in the audio quality of the coded signal. The proposed method has the advantage of easy adaptation within the Perceptual Transform Coders architecture and adds only a small computational overhead to these systems.
In recent years the introduction of digital audio as a method for storing, processing and transmitting high-fidelity acoustic signals has helped in the evolution of numerous applications in the field of consumer electronics and professional audio. It is also envisaged that, in the near future, significant new techniques will become commercially available and novel applications will emerge based on the manipulation of audio data within multimedia or audiovisual technologies. However, the feasibility of such future applications, as well as some current ones, greatly depends on the use of data compression techniques which reduce the data transmission rate and memory storage requirements. Given the existence of such techniques, terrestrial or satellite transmission channels can be economically employed for single or multi-channel audio data transmission, and data storage media can be efficiently utilized for storing lengthy segments of acoustic signals.
The storage and transmission of such high-quality audio data (taking as reference the Compact Disc format, based on a 44.1 kHz sampling rate and 16-bit resolution) results in the relatively high bit rate of 706 kbit/s per data channel. This data rate can be technically or economically prohibitive for many applications, and this necessitates the introduction of data compression, preferably by using low-complexity methods (so that real-time implementations are not impeded), and without the insertion of perceptually detectable distortions. Applications which have emerged or are expected to appear with strong dependence on such signal compression technology are in the area of high-fidelity audio for radio broadcasting (especially for the Digital Audio Broadcasting (DAB) format [2]), in the area of multichannel audio for HDTV, in storing and processing of audio signals for domestic (e.g. multimedia or home studio) and professional applications (multichannel music recording), in transmitting audio data through computer or communication networks (e.g. ISDN), etc. Coding and data compression methods for acoustic signals have been known for at least 4 decades, but until recently they were mainly concerned with speech signals [3], [4]. More recently, Transform ...
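The Compact Disc bit rate quoted in this excerpt follows directly from the sampling rate and word length. A short check, with a hypothetical helper name `pcm_bit_rate_kbps`:

```python
def pcm_bit_rate_kbps(sample_rate_hz, bits_per_sample, channels=1):
    """Uncompressed PCM bit rate in kbit/s (1 kbit = 1000 bits)."""
    return sample_rate_hz * bits_per_sample * channels / 1000


# 44.1 kHz sampling, 16-bit samples, one channel
print(pcm_bit_rate_kbps(44_100, 16))  # 705.6, i.e. about 706 kbit/s
```

Against this figure, the 64 kbit/s per-channel coders described elsewhere on this page correspond to a compression ratio of roughly 11:1.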
2007
This paper illustrates the suitability of ADSP 21160 for implementation of high performance DSP applications requiring 32 bit floating point precision. We present a real-time implementation of a multi-channel MPEG-2 AAC-LC encoder. The performance results of this implementation are also discussed in this paper.