Speech compression is a field of digital signal processing that focuses on reducing the bit rate of speech signals to improve transmission speed and reduce the storage requirements of multimedia systems. ADPCM is a waveform-based compression algorithm that works by coding the difference between two consecutive PCM samples. ADPCM retains the advantages of PCM at a reduced bit rate. ITU-T G.726 uses adaptive quantization, a quantization process in which the step size is varied to follow changes in the input signal as a means of achieving efficient compression. Sample differences may be represented with 5, 4, 3 or 2 bits, corresponding to bit rates of 40 kbit/s, 32 kbit/s, 24 kbit/s and 16 kbit/s respectively. The principle of ADPCM is to use knowledge of the signal's past to predict its future. The FPGA implementation converts 64 kbps digital streams into 40 kbps, 32 kbps, 24 kbps or 16 kbps streams using VHDL.
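The ADPCM loop described above (code the sample difference, adapt the quantizer step size) can be sketched as below. This is an illustrative toy, not the ITU-T G.726 algorithm; the step-adaptation rule and all constants are assumptions.

```python
# Minimal ADPCM-style codec sketch (illustrative, not ITU-T G.726).
# Codes the difference between each sample and a running prediction;
# the step size grows on large codes and shrinks on small ones.

def adpcm_encode(samples, bits=4):
    codes = []
    predicted, step = 0, 4
    levels = 1 << (bits - 1)              # e.g. 8 magnitude levels for 4 bits
    for s in samples:
        diff = s - predicted
        code = max(-levels, min(levels - 1, round(diff / step)))
        codes.append(code)
        predicted += code * step           # decoder mirrors this prediction
        step = min(1024, step * 2) if abs(code) >= levels // 2 else max(1, step // 2 + 1)
    return codes

def adpcm_decode(codes, bits=4):
    out = []
    predicted, step = 0, 4
    levels = 1 << (bits - 1)
    for code in codes:
        predicted += code * step
        out.append(predicted)
        step = min(1024, step * 2) if abs(code) >= levels // 2 else max(1, step // 2 + 1)
    return out
```

Because the decoder repeats the encoder's prediction and step-adaptation exactly, only the small codes need to be transmitted.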
2017
Compression is the process of reducing an input data (speech signal) bit stream to a smaller size while preserving quality. An analog signal is a continuous signal and takes more space to store in memory devices at its original size. All sensor (analog) data is stored in a computer at its original size, but compression lets us store the same data in a reduced format with the same quality; in compression the unwanted data is eliminated. The main purpose of speech compression is to reduce the number of data bits needed to transmit the original data from one place to another, and to store that data while maintaining quality equal to the original signal. Analog-to-digital conversion (ADC) plays an important role in this compression technique, because it yields the quantized sampled signal, and a high-correlation property is present between the sampled speech samples. The Adaptive Delta Pulse Code Modulation (ADP...
2019
The analog speech signal is digitized by sampling. To maintain voice quality, each sample has to be represented by 13 or 16 bits. Compression reduces the original data from a higher bit count to a lower one while keeping signal quality close to the original, and decompression reconstructs the original signal from the compressed one. Adaptive differential pulse code modulation, designed by Bell Labs in the 1970s for reducing the bit count of analog signals, is very useful and efficient for compression and decompression. ADPCM codes the difference between the next sample of the original signal and the value predicted from the previous sample. The analog speech signals are amplified by the pre-amplifier and fed to the CODEC for analog-to-digital conversion. The CODEC transmits the digitized signal to the ADSP 2105/2115 processor, which compresses the speech data using ADPCM techniques and stores it in RAM. When the processor is interrupted, it reads the compressed data fro...
Asia-pacific Journal of Multimedia Services Convergent with Art, Humanities, and Sociology, 2012
Voice-over Internet Protocol (VoIP) telephony is increasingly becoming a viable telecommunication technology that may one day surpass the old analog and digital telephone systems. The Quality-of-Service (QoS) factor is an important parameter to consider when measuring the performance of a VoIP system. Many factors may influence the QoS of a given VoIP system, among them delay, jitter and packet loss. This paper demonstrates that the algorithmic delay (latency) of the Pulse Code Modulation (PCM) speech compression algorithm, originally implemented in software, can be significantly reduced by implementation in hardware. The PCM speech compression algorithm is first implemented, verified to demonstrate equivalence, and then validated by comparing the latency of the hardware-implemented speech compression algorithm with that of an existing software implementation.
1993
Compared to most digital data types, with the exception of digital video, the data rates associated with uncompressed digital audio are substantial. Digital audio compression enables more efficient storage and transmission of audio data. The many forms of audio compression techniques offer a range of encoder and decoder complexity, compressed audio quality, and differing amounts of data compression. The μ-law transformation and ADPCM coder are simple approaches with low complexity, low compression, and medium audio quality. The MPEG/audio standard is a high-complexity, high-compression, and high-audio-quality algorithm. These techniques apply to general audio signals and are not specifically tuned for speech signals.
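The μ-law transformation mentioned above can be sketched as follows, with μ = 255 as used in North American telephony; the function names and the normalized [-1, 1] sample range are illustrative choices.

```python
import math

# mu-law companding sketch (mu = 255). compress() maps a sample in
# [-1, 1] onto [-1, 1] with more resolution near zero; expand() inverts it.

MU = 255.0

def mu_compress(x):
    return math.copysign(math.log1p(MU * abs(x)) / math.log1p(MU), x)

def mu_expand(y):
    return math.copysign((math.exp(abs(y) * math.log1p(MU)) - 1.0) / MU, y)
```

Quiet samples are stretched toward larger code values, which is why a uniform 8-bit quantizer applied after companding preserves low-level detail better than linear 8-bit PCM.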
2010
The field of speech compression has advanced rapidly, driven by cost-effective digital technology and diverse commercial applications. Voice communication must be treated as a real-time system, and it is still not possible to compress signals in real time without some loss. This paper presents a lossless digital compression scheme for saving high-quality speech signals, with emphasis on signal quality: for music listening, high quality is always needed while consuming less memory space. In this scheme an 8-bit PCM speech signal is compressed. Where sample values vary, they are kept unchanged; where they do not vary, only the count of samples holding the same value is saved. After compression the signal is still 8-bit PCM, but expansion is needed before playback. This technique may also be used in real-time systems.
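The run-length idea described above (keep varying samples, store only a count for repeated ones) can be sketched as below; the (value, run) pair framing is an assumption for illustration, since the abstract does not specify the exact byte layout.

```python
# Lossless run-length sketch for 8-bit PCM: runs of identical samples
# become (value, run_length) pairs; isolated samples become runs of 1.

def rle_encode(samples):
    encoded = []
    i = 0
    while i < len(samples):
        run = 1
        while i + run < len(samples) and samples[i + run] == samples[i]:
            run += 1
        encoded.append((samples[i], run))
        i += run
    return encoded

def rle_decode(encoded):
    out = []
    for value, run in encoded:
        out.extend([value] * run)
    return out
```

The round trip is exact, which is what makes the scheme lossless; compression is only achieved when the signal contains runs of repeated sample values (e.g. silence).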
2010
This paper presents a theory of lossless digital compression. While some loss of quality is tolerable in voice communication, high quality is always recommended for music, so emphasis is given to the quality of the signal. To save more music, it must be stored using less memory space. In the proposed compression an 8-bit PCM speech signal is compressed: where sample values vary they are kept unchanged; where they do not vary, the count of samples holding the same value is saved. After compression the signal is still 8-bit PCM. MPEG-4 ALS is then applied to this compressed PCM signal for better compression.
2004
In this paper a new speech compression method is presented. The traditional speech compression method is based on linear prediction. The method proposed in this paper is based on an orthogonal transform, the discrete cosine packets transform. This method is well suited for speech processing, given the sinusoidal model of this kind of signal and the fact that this transform converges asymptotically to the Karhunen-Loève transform. After the discrete cosine packets transform is computed, the coefficients obtained are processed with a threshold detector, which keeps only the coefficients above a given threshold. In this way the number of nonzero coefficients is reduced, achieving the compression. The next block of the compression system is the quantizer, which is built following a speech psychoacoustic model. The proposed compression method is transparent, the compression rate obtained is significant, and the number of operations and the memory volume used are not very high.
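The transform-plus-threshold stage can be illustrated with a plain DCT-II; the paper uses discrete cosine *packets*, so this self-contained sketch only shows the coefficient-thresholding idea, and the normalization is one common convention.

```python
import math

# Transform-threshold sketch: forward DCT-II of a frame, zero out
# coefficients below a threshold (the "threshold detector"), then
# reconstruct with the inverse transform (DCT-III).

def dct(x):
    N = len(x)
    return [(2.0 / N) * sum(x[n] * math.cos(math.pi * k * (2 * n + 1) / (2 * N))
                            for n in range(N)) for k in range(N)]

def idct(c):
    N = len(c)
    return [c[0] / 2.0 + sum(c[k] * math.cos(math.pi * k * (2 * n + 1) / (2 * N))
                             for k in range(1, N)) for n in range(N)]

def threshold_detector(frame, threshold):
    return [c if abs(c) >= threshold else 0.0 for c in dct(frame)]
```

Only the surviving nonzero coefficients (index plus value) need to be coded, which is where the compression comes from; raising the threshold trades quality for rate.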
Transforms, which are lossy algorithms, are most often used for speech data compression. Such algorithms are tolerable for speech data compression since the loss in quality is not perceived by the human ear. However, vector quantization (VQ) has the potential to give more data compression while maintaining the same quality. In this paper we propose a speech data compression algorithm using the vector quantization technique. We have used the VQ algorithms LBG, KPE and FCG. The results table shows the computational complexity of these three algorithms. We also introduce a new performance parameter, Average Fractional Change in Speech Sample (AFCSS). Our FCG algorithm gives far better performance in terms of mean absolute error, AFCSS and complexity compared to the others.
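A minimal sketch of the VQ encode/decode step follows; the paper trains codebooks with LBG, KPE and FCG, whereas the fixed two-entry codebook here is purely illustrative.

```python
# Vector-quantization sketch: each frame of samples is replaced by the
# index of the nearest codebook vector (squared Euclidean distance).
# The codebook is fixed here; LBG/KPE/FCG would learn it from data.

def vq_encode(frames, codebook):
    def dist(u, v):
        return sum((a - b) ** 2 for a, b in zip(u, v))
    return [min(range(len(codebook)), key=lambda i: dist(f, codebook[i]))
            for f in frames]

def vq_decode(indices, codebook):
    return [codebook[i] for i in indices]
```

Since each frame is reduced to a single index, the bit rate is log2(codebook size) bits per frame, which is where VQ's extra compression over scalar schemes comes from.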
This chapter presents an introduction to speech compression techniques, together with a detailed description of speech/audio compression standards including narrowband, wideband and fullband codecs. We will start with the fundamental concepts of speech signal digitisation, speech signal characteristics such as voiced speech and unvoiced speech and speech signal representation. We will then discuss three key speech compression techniques, namely waveform compression, parametric compression and hybrid compression methods. This is followed by a consideration of the concept of narrowband, wideband and fullband speech/audio compression. Key features of standards for narrowband, wideband and fullband codecs are then summarised.
This paper gives the details of speech compression using the discrete wavelet transform (DWT) on an FPGA. In today's world multimedia files are ubiquitous and require substantial storage space, and for sound files the ultimate solution is compression. Compression converts a high-rate input data stream into a smaller size, and it is applied to images, data and signals alike. Here speech compression is performed using the DWT, with a single-level decomposition implemented on the FPGA in VHDL. The DWT code, written in VHDL, separates the high-frequency and low-frequency components of a given input WAV file; after separation, downsampling is performed, and the compressed speech signal is obtained by keeping only the approximation part of the result. The compressed speech signal is read back after upsampling. The resulting signal contains some noise, and future work is to reduce it.
Informatica, 2010
In this paper a new semilogarithmic quantizer for the Laplacian distribution is presented. It is simpler than the classic A-law semilogarithmic quantizer since it has unit gain around zero. It also gives a 2.97 dB higher signal-to-quantization-noise ratio (SQNR) at the reference variance in relation to A-law, and is therefore more suitable for adaptation. Forward adaptation of this quantizer is done on a frame-by-frame basis. In this way the G.712 standard is satisfied with 7 bits/sample, which is not possible with the classic A-law. Inside each frame, subframes are formed and a lossless encoder is applied to the subframes. In this way double adaptation is performed: adaptation to variance within frames and adaptation to amplitude within subframes. Joint design of the quantizer and lossless encoder is carried out, which gives better performance. As a result, the G.712 standard is satisfied with only 6.43 bits/sample. Experimental results, obtained by applying this model to a speech signal, are presented; the experimental and theoretical results match very well (the difference is less than 1.5%). The models presented in this paper can be applied to speech and to any other signal with a Laplacian distribution.
2009 International Conference on Future Computer and Communication, 2009
This research work shows that Voice Signal Compression (VSC) is a technique that converts the voice signal into an encoded form when compression is required; it can then be decoded to the closest approximation of the original signal. This work presents a new algorithm to compress voice signals using an "Adaptive Wavelet Packet Decomposition and Psychoacoustic Model". The main goals of this research are, first, transparent compression (48% to 50%) of high-quality voice signals at about 45 kbps with the same file extension (i.e. .wav to .wav); second, to evaluate the compressed voice signal against the original using distortion and frequency-spectrum analysis; and third, to compute the signal-to-noise ratio (SNR) of the source file. For this, a filter bank is chosen according to the psychoacoustic model criteria and the computational complexity of the decoder. A bit allocation method is used which also takes input from the psychoacoustic model. The filter bank structure yields a subband perceptual rate computed as perceptual entropy (PE). The output achieves the best reconstruction possible given the size of the output available at the encoder. The result is a variable-rate compression scheme for high-quality voice signals, well suited to high-quality voice transfer for Internet and storage applications.
This paper presents a new, fast speech compression method based on the Discrete Cosine Packets Transform (DCPT). This transform has a great advantage: its resemblance to the sinusoidal model of speech. The new speech compression algorithm is presented and some simulation results are reported, along with the advantages of the proposed method over the speech compression method used in GSM.
The growth of cellular technology and wireless networks all over the world has increased the demand for digital information manyfold. This massive demand poses difficulties in handling the huge amounts of data that need to be stored and transferred. To overcome this problem we can compress the information by removing the redundancies present in it; redundancies are a major source of errors and noisy signals. Coding in MATLAB helps in analyzing the compression of speech signals at varying bit rates and in removing errors and noise from them. A speech signal's bit rate can also be reduced to remove errors and noise, making it suitable for remote broadcast lines, studio links, satellite transmission of high-quality audio, and voice over Internet. This paper focuses on the speech compression process and its analysis through MATLAB, by which the processed speech signal can be heard with clarity and without noise at the receiver end.
2011
Abstract: This paper deals with speech compression based on linear predictive coding, the discrete wavelet transform, and the wavelet packet transform. We used Malayalam (a south Indian language) for this experiment. We successfully compressed and reconstructed Malayalam spoken words with perfect audibility using both waveform coding and parametric coding.
IEEE Transactions on Speech and Audio Processing, 1998
This paper presents specific new techniques for coding of speech representations and a new general approach to coding for compression that directly utilizes the multidimensional nature of the input data. Many methods of speech analysis yield a two-dimensional (2-D) pattern, with time as one of the dimensions. Various such speech representations, and power spectrum sequences in particular, are shown here to be amenable to 2-D compression using specific models which take account of a large part of their structure in both dimensions. Newly developed techniques, multistep adaptive flux interpolation (MAFI) and multistep flow-based prediction (MFBP) are presented. These are able to code power spectral density (PSD) sequences of speech more completely and accurately than conventional methods. This is due to their ability to model nonstationary, but piecewise-continuous, signals, of which speech is a good example. Initially, MAFI and MFBP are applied in the time domain, then reapplied to the encoded data in the second dimension. This approach allows the coding algorithm to exploit redundancy in both dimensions, giving a significant improvement in the overall compression ratio. Furthermore, the compression may be reapplied several times. The data is further compressed with each application.
2014
Delta modulation is a waveform coding technique that reduces the data rate to a large extent in data communication. The problem encountered in delta modulation is slope-overload error, which is inherent in the system; for the signal to have good fidelity, this error needs to be as small as possible. Hence adaptive techniques must be applied to delta modulation to reduce noise. Adaptive delta modulation reduces the slope-overload error while increasing the dynamic range and tracking capability of fixed-step-size delta modulation. The adaptive algorithm adjusts the step size (from a range of step sizes) to the power level of the signal and thus enhances the dynamic range of the coding system. This paper discusses experiments with quantized delta modulation and adaptive delta modulation and compares their improvements over each other.
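The adaptive step-size rule described above can be sketched as below; the specific doubling/halving policy and the step limits are illustrative assumptions, not the paper's exact parameters.

```python
# Adaptive delta modulation sketch: one bit per sample says whether the
# staircase approximation steps up or down. The step size doubles when
# consecutive bits repeat (fighting slope overload) and halves when they
# alternate (fighting granular noise). Decoder mirrors the encoder.

def adm_encode(samples, min_step=1, max_step=64):
    bits, approx, step, prev = [], 0.0, min_step, None
    for s in samples:
        bit = 1 if s >= approx else 0
        if prev is not None:
            step = min(max_step, step * 2) if bit == prev else max(min_step, step / 2)
        approx += step if bit else -step
        bits.append(bit)
        prev = bit
    return bits

def adm_decode(bits, min_step=1, max_step=64):
    out, approx, step, prev = [], 0.0, min_step, None
    for bit in bits:
        if prev is not None:
            step = min(max_step, step * 2) if bit == prev else max(min_step, step / 2)
        approx += step if bit else -step
        out.append(approx)
        prev = bit
    return out
```

On a steep ramp the repeated 1-bits make the step grow geometrically, which is exactly the enlarged tracking capability the abstract attributes to the adaptive scheme.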
Journal of Electrical Engineering, 2000
Low Complex Forward Adaptive Loss Compression Algorithm and Its Application in Speech Coding. This paper proposes a low-complexity forward adaptive lossy compression algorithm that works on a frame-by-frame basis. In particular, the proposed algorithm performs frame-by-frame analysis of the input speech signal, then estimates and quantizes the gain within each frame to enable quantization by a forward adaptive piecewise-linear optimal compandor. In comparison to a solution designed according to the G.711 standard, our algorithm provides not only a higher average signal-to-quantization-noise ratio, but also a reduction of the PCM bit rate by about 1 bit/sample. Moreover, the proposed algorithm fully satisfies the G.712 standard, since it exceeds the curve defined by that standard over the whole variance range. Accordingly, we can reasonably believe that our algorithm will find its practical implementation in high quality c...
IEEE Signal Processing Magazine, 2001
audio coding, lossless compression, Internet audio
IEEE Transactions on Audio, Speech and Language Processing, 2007